Abstract

Are temperature proxy records linear recorders of past temperature conditions? A statistical test for linearity is applied to 15 millennial-long proxy records with an annual resolution that was shown to significantly respond to Northern Hemisphere annual mean temperature selected from a collection of 30 proxies. The test, based on generalized additive modeling, shows that most of the proxies can indeed be shown to be linear functions of annual mean temperature, but two proxy records do not appear to have a linear relationship with temperature—this supports the assumption of linearity in most climate reconstruction work. The method tests for nonlinearity in a proxy relative to the group of proxies with which it is being used together. The robustness of the results is tested, and it was found that the results are stable to the choice of proxies. The linearity-testing method is quite general and could in the future be used for larger and more extensive sets of proxies.

1. Introduction

Considering the fact that a systematic network of instrumental temperature measurements around the globe started as late as the second half of the nineteenth century, we are dependent on other types of information to understand temperature variability prior to that. Such information is essential to evaluate whether the recent global warming falls outside the range of the natural variability of the last one or two millennia in either magnitude or rate (NRC 2006; Jansen et al. 2007), as well as providing material that may be used to evaluate paleoclimate model output and the paleomodel forcings. Our knowledge of past temperature variability must be drawn from temperature-sensitive proxy data. Such data can be extracted from both historical records and from various natural recorders of climate variability, such as corals, fossil pollen, ice cores, lake and marine sediments, speleothems, and tree-ring width and density. A review of different types of temperature proxy data is given in, for example, Bradley (1999) and Jones et al. (2009), and the availability of millennia-long temperature proxy records in the Northern Hemisphere is synthesized in Ljungqvist et al. (2012).

Multiproxy temperature reconstructions are based on training a model on datasets that include both observations and proxy data. Then, using the trained model on proxies alone, a model-based value for the observational quantity is extracted. Most current reconstruction methods used are linear—a linear proportionality between a signal in the proxy and in the real world is assumed. But do we know a priori whether the relationship actually is linear? We expect that linear methods will produce better reconstructions if the proxies are linear temperature recorders. While knowledge of the biological and geological systems that lay down the proxy records helps to understand which systems are linear (Frank et al. 2010), it is of interest to have quantitative tests for linearity.

There has been an increasing awareness that climate proxy records do not always show a linear relationship to temperature (Tingley et al. 2012). This has been most evident in the field of dendroclimatology, where the so-called divergence problem, the fact that some high-latitude tree-ring records show a lessened or even negative response to higher temperatures in the late twentieth century, has been studied by numerous researchers (Andreu-Hayles et al. 2011; Briffa et al. 1998a,b; D’Arrigo et al. 2008; Loehle 2009; Wilson et al. 2007; and the references therein). The nonlinear properties of some tree-ring width records have in pioneering studies been assessed by, for example, Carrer and Urbinati (2001), Fritts (1976), Graumlich and Brubaker (1986), and D’Arrigo et al. (2004). The nonlinear response of some temperature-sensitive proxy records to temperature has also been brought to attention regarding low-resolution proxy archives such as chironomid, diatom, and pollen. Nonparametric methods have been applied for reconstructing temperature from such proxies by, among others, Birks (1995), and Birks et al. (2010), and Bayesian reconstruction techniques that feature nonlinear modeling have been used by Haslett et al. (2006), Korhola (2002), Toivonen et al. (2001), and Vasko et al. (2000).

We present here methods and tests that can reveal the nonlinear relationship between proxies and temperature, so-called nonparametric methods. These methods are well known in statistics—see, for example, Teräsvirta et al. (2010, chapter 10)—but only occasionally utilized in paleoclimatic research. By using a nonparametric method, the aim is to avoid assumptions on the parametric form of the relationship in question. We let the data “speak for themselves,” enabling us to find a function that describes the available data well. This is in contrast to parametric modeling, where a specific model with parameters is assumed to generate the data in question. One such method in climate reconstruction is the simple “direct regression” method (pioneered by Groveman and Landsberg 1979). In such a parametric model, it can be easy to do inference, and great gains in efficiency are possible, however, only if the model is (almost) true. If the assumed model is incorrect, then inferences can be useless, leading to misleading interpretations of the data. Nonparametric models provide a simple way to find structures in datasets without imposing a parametric model, and it is also possible to test whether relationships are linear.

In this paper we will examine two such models, a nonparametric additive model and a semiparametric additive model, that allow for some linear and some nonlinear regressors. For a purely illustrative purpose, we will compare these two methods with reconstructions using the simple direct regression model. The rest of the paper is organized as follows. In section 2 we give a brief overview of the nonparametric models and the corresponding estimation methodology. In section 3 we present the proxy data and temperature series used, while section 4 reports the results of the fitted models and tests of linearity. Finally, section 5 presents a summary. In the  appendix we demonstrate the additive nonparametric method in a test example to show that it works on realistic data with known properties.

2. Methods

a. Linear regression model

In the multivariate regression—used, for instance, in Groveman and Landsberg (1979)—a Northern Hemisphere (NH) mean temperature Y is expressed as a linear sum over d selected proxies X, as follows:

 
formula

“Training,” to determine the coefficients (α, βj), is performed during an interval where both Y and the X’s are available, and the determined coefficients are then used to build a model for use at all other times. Assuming data stationarity is central to this step. This method is still in some use although several alternatives exist; see Christiansen et al. (2009) for a current review. We stress that it is used here for illustrative purposes only.

b. Nonparametric regression models

The aim of nonparametric models is to relax assumptions on the form of a regression function, and to let the data search for a suitable function that describes the available data well. These approaches are powerful in exploring fine structural relationships and provide very useful diagnostic tools for parametric models.

In a multivariate regression problem, we want to study the relationship between the response variable Y and the vector of covariates X = (X1, … , Xd)T via

 
formula

It is useful to model the unknown regression function m(x) additively, that is,

 
formula

Usually an intercept term is added, that is, E(Y) = α. This gives us the following additive nonparametric model:

 
formula

where m1, … , md are unknown univariate functions, E(e) = 0, Var(e) = σ2, and e is independent of the vector of covariates X. To ensure identifiability, m1, … , md are required to satisfy

 
formula

which implies that E(Y) = α.

Estimation of the unknown functions m1, … , md is done by the backfitting algorithm, introduced by Breiman and Friedman (1985) and Buja et al. (1989). Note, first, that if the additive model, (3), is correct, then

 
formula

This relationship suggests an iterative procedure for the estimation of the unknown functions. Thus, for a known constant α and given functions mj, jk, the function mk can be estimated by a univariate regression fit based on the observations , i = 1, … , n, where is the ith observation of the kth additive variable. Denote the univariate smoother of mk by Sk. The algorithm works as follows:

  • Step 1. Initialization: for k = 1, … d.

  • Step 2. Find new transformations: For k = 1, … d: 
    formula
  • Step 3. Repeat step 2 until convergence.

The idea behind this algorithm is to carry out a fit, calculate partial residuals from that fit, and refit again, which is why the iteration scheme is called backfitting. The starting functions can be obtained in various ways, for example, from a linear regression fit of Y on the covariates Xk. The smoothing operator Sk can be other nonparametric regression estimators, such as kernel methods—see, for example, Fan and Gijbels (1996).

This way of modeling nonlinear relationships between a response and several predictors has been extensively used within the statistical community for some years. In fact, the additive model above is a version of a wider model called generalized additive model (GAM); see Hastie and Tibshirani (1990). Here the conditional mean (m(X)) of a response Y is related to an additive function of the predictors via a link function g as follows:

 
formula

See also Hastie and Tibshirani (1990) and Hastie et al. (2009) for further technical details regarding the backfitting algorithm and nonparametric regression models. An alternative to the backfitting algorithm is the marginal integration method; see Tjøstheim and Auestad (1994).

In the linear regression model, each regressor represents 1 degree of freedom (Df)—in the additive model, more degrees of freedom are used up; see Hastie and Tibshirani (1990)—it is therefore generally advantageous to consider models that contain both linear and nonlinear terms; these are called semiparametric models. The methods are implemented in R (see www.r-project.org), and the software used in this paper is available on request.

3. Data

A wide range of different kinds of proxies with annual resolution have been used in this study: one speleothem microlayer thickness record, three ice-core δ18O records (one of which is a composite of three individual cores; see below), seven varved thickness sediment records, 13 tree-ring width records, four tree-ring maximum latewood density records, one tree-ring height-increment record, and one tree-ring δ13C record. Information about each of these 30 proxy records is given in Table 1, such as the name of the proxy, latitude and longitude, type of proxy, season bias, dating uncertainty, and reference to the original publication. We only use proxies with annual resolution—this places the proxies on a comparable basis to the instrumental data they are calibrated against but excludes most presently available millennia-long proxies.

Table 1.

Proxies R1–R5 are Pearson correlation coefficients, where R1 and R2 are the correlations to NH mean T—with and without linear trends—for a.d. 1850–1969; R3 and R4 are the same but correlated against the local gridpoint T and for the years 1870–1969; and R5 is by the original author, when available. Here “—” indicates that no information is available; Δ is dating uncertainty in years. Note for Polar Urals, the R5 is from the Briffa et al. (1995) version of the Polar Urals for the May–September season a.d. 1882–1980. Note for the Southern Colorado Plateau, the authors stated that R5 was calculated from the “mean max annual T.” This is difficult to find, instead we used the mean of the monthly-mean June–August T, although these are based on daily mean values, not daily max values. Four of the high-latitude series are evaluated (R3 and R4) against averages of gridpoint T covering an area: Renland, 65°–75°N, −30°–20°W; Avam–Taimyr, 69°−73°N, −92°−102°E; Greenland composite, 60°−75°N, −55°−33°W; and Gulf of Alaska, 55°−62°N, −153°−131°W. The correlation coefficients are boldfaced when significant at the 1% level according to a two-sided t test.

Proxies R1–R5 are Pearson correlation coefficients, where R1 and R2 are the correlations to NH mean T—with and without linear trends—for a.d. 1850–1969; R3 and R4 are the same but correlated against the local gridpoint T and for the years 1870–1969; and R5 is by the original author, when available. Here “—” indicates that no information is available; Δ is dating uncertainty in years. Note for Polar Urals, the R5 is from the Briffa et al. (1995) version of the Polar Urals for the May–September season a.d. 1882–1980. Note for the Southern Colorado Plateau, the authors stated that R5 was calculated from the “mean max annual T.” This is difficult to find, instead we used the mean of the monthly-mean June–August T, although these are based on daily mean values, not daily max values. Four of the high-latitude series are evaluated (R3 and R4) against averages of gridpoint T covering an area: Renland, 65°–75°N, −30°–20°W; Avam–Taimyr, 69°−73°N, −92°−102°E; Greenland composite, 60°−75°N, −55°−33°W; and Gulf of Alaska, 55°−62°N, −153°−131°W. The correlation coefficients are boldfaced when significant at the 1% level according to a two-sided t test.
Proxies R1–R5 are Pearson correlation coefficients, where R1 and R2 are the correlations to NH mean T—with and without linear trends—for a.d. 1850–1969; R3 and R4 are the same but correlated against the local gridpoint T and for the years 1870–1969; and R5 is by the original author, when available. Here “—” indicates that no information is available; Δ is dating uncertainty in years. Note for Polar Urals, the R5 is from the Briffa et al. (1995) version of the Polar Urals for the May–September season a.d. 1882–1980. Note for the Southern Colorado Plateau, the authors stated that R5 was calculated from the “mean max annual T.” This is difficult to find, instead we used the mean of the monthly-mean June–August T, although these are based on daily mean values, not daily max values. Four of the high-latitude series are evaluated (R3 and R4) against averages of gridpoint T covering an area: Renland, 65°–75°N, −30°–20°W; Avam–Taimyr, 69°−73°N, −92°−102°E; Greenland composite, 60°−75°N, −55°−33°W; and Gulf of Alaska, 55°−62°N, −153°−131°W. The correlation coefficients are boldfaced when significant at the 1% level according to a two-sided t test.

A general problem for any reconstruction is the risk of “overfitting”—that is, that we have so many proxies for which we have to find a coefficient or, in the case of GAM, an additive smoothed model term, that we are using up all or almost all the degrees of freedom and therefore can, in the extreme case, explain all the variance in the calibration dataset and at the same time have a limited or even reduced explanatory power in independent data. As we have restricted our proxies to those with annual resolution, we have eliminated many other, long, series and thereby minimized the risk of overfitting.

We test the set of 30 proxies for relevance by screening them in terms of how well they correlate to both the Northern Hemispheric mean temperature and the local temperatures; see Table 1. As noted by Juckes et al. (2007), choosing proxies by screening their correlation to temperature is not an entirely unproblematic approach. Even when screening proxies to temperature, there still exists a risk that poor proxies are included because of insufficient instrumental temperature data for the screening process. The uncertainty estimation becomes more problematic if temperature measurements have been used already in the data selection. We found that 15 of the proxies are both significantly correlated (standard two-sided t test, p = 0.01) to global and local temperatures, and these form the set we use in the subsequent analysis. We use a two-sided test since, a priori, we do not know if the proxy responds with an increase or a decrease when the temperature rises. Most temperature proxy records have temporal and seasonal limitations (Bradley 1999; Bradley and Jones 1992; Jones et al. 2009). The proxies have different optimal season response and few proxies are likely to fully reflect annual mean temperature (Ljungqvist 2010). The response to the proxies’ optimal season can be higher than the correlation to annual mean temperature used here.

Most of the series we have here are continuous, but the series for Iceberg Lake and Jämtland have gaps, and the Greenland composite and the southern Sierra Nevada end before 1990—the missing values, in the standardized series, are replaced with 0’s. This is an unacceptable practice were the reconstructions to be used as such, but they are calculated and shown in this paper only to help illustrate the nonlinearity testing method. To train the models, we use the Northern Hemisphere annual mean temperature data from the 5° × 5° gridded Hadley Centre and Climate Research Unit surface temperature, version 3 (HadCRUT 3v) dataset from Brohan et al. (2006).

4. Results

In this section we use the nonparametric additive method to reveal whether nonlinear relationships exist between 15 selected proxies and the mean NH temperature. We also perform a multiproxy temperature reconstruction based on a multilinear regression, a nonparametric model, and a semiparametric model based on observed values of the proxies from a.d. 800. It is well known that low-frequency variability is underestimated by the multilinear regression method (Christiansen et al. 2009), so we stress that the reconstruction is performed to have a reference against which the nonlinearity test can be viewed—the direct reconstruction is not presented here for other uses.

To calibrate our models, we use annual observations of the NH mean temperature data from the 5° × 5° gridded HadCRUT3v dataset (Brohan et al. 2006) and the proxies from 1850 through 1969—that is, 120 observations—which is the calibration period. The NH temperature is centered over the calibration period. The proxies are centered and normalized with the mean and standard deviation from the calibration period. This choice of calibration period ensures that there is no missing data in any of the proxies used; that is, there is no use of zero padding in the calibration of the models.

We first fit the classical multiple linear regression model to our data, also known as the direct global method, that is, a regression with the NH temperature as the response and selected proxies as predictors. The results are given in Table 2. Here the estimated coefficients and corresponding p value for the significance test is reported.

Table 2.

Calibration results, with (first and second columns) the estimated coefficients and corresponding p value of a standard significance test for the linear model, (third column) the p value for the nonlinearity test of the proxies in the nonparametric model, and (fourth column) the p value for the nonlinearity test of the proxies in the semiparametric model.

Calibration results, with (first and second columns) the estimated coefficients and corresponding p value of a standard significance test for the linear model, (third column) the p value for the nonlinearity test of the proxies in the nonparametric model, and (fourth column) the p value for the nonlinearity test of the proxies in the semiparametric model.
Calibration results, with (first and second columns) the estimated coefficients and corresponding p value of a standard significance test for the linear model, (third column) the p value for the nonlinearity test of the proxies in the nonparametric model, and (fourth column) the p value for the nonlinearity test of the proxies in the semiparametric model.

Next, we fit a nonparametric model, that is, the model from Eq. (3), to the proxies selected above. We use the software program R, and the function gam() from the package gam, with default settings; that is, we use the identity link function and smoothing splines with 4 degrees of freedom—see, for example, Green and Silverman (1994)—as the smoothing operator. An approximate F test is used to evaluate the significance of the nonlinearity, to determine whether including the nonlinear component of each smooth term in the model resulted in a significantly better fit than a linear relationship. Although the test statistics do not have exact or even asymptotic F distributions, Hastie and Tibshirani (1990) report that simulations show them to be useful approximations. The test results for nonlinear effects are shown in Table 2 and show two significantly nonlinear proxies: the tree-ring width records Indigirka and Yamal. Figure 1 shows the fitted functions for each proxy, and some of them seem to have a nonlinear relationship with the temperature; however, only two of them are found significantly nonlinear. The dashed curves in each plot are the upper and lower pointwise twice-standard-error curves. The vertical marks along the bottom of each graph illustrate the distribution of the values of the proxies. Further details regarding the nonlinear effects test are found in Hastie and Tibshirani (1990).

Fig. 1.

The estimated functions (i.e., ) for each proxy for the nonparametric model. The dashed lines in the plots are twice the pointwise standard error bounds. The vertical marks along the bottom illustrate the distribution of the proxies.

Fig. 1.

The estimated functions (i.e., ) for each proxy for the nonparametric model. The dashed lines in the plots are twice the pointwise standard error bounds. The vertical marks along the bottom illustrate the distribution of the proxies.

We compare these two models, that is, the linear model and the nonparametric model, by a comparison test; see Table 3. The model comparison test is an approximate F test, where the test statistic is

 
formula

where RSSi is the residual sum of squares for model i and DFi is the approximate degrees of freedom for model i; see Hastie and Tibshirani (1990) for further details. We see that the reduction in RSS for the nonparametric model compared with the linear model is not significant.

Table 3.

Model comparison tests with (first column) the Df, (second column) the RSS, and (fourth column) the results from the two tests.

Model comparison tests with (first column) the Df, (second column) the RSS, and (fourth column) the results from the two tests.
Model comparison tests with (first column) the Df, (second column) the RSS, and (fourth column) the results from the two tests.

Since we have found that 2 of the 15 proxies relate nonlinearly to temperature, we fit a semiparametric model, that is, the other 13 proxies are modeled linearly. We thus also reduce problems with overfitting. The results for the semiparametric model are seen in Table 2; that is, the same nonlinear effects test as above is performed for the two proxies. Again, the tests indicate that Indigirka and Yamal relate nonlinearity to temperature. Figure 2 shows the estimated linear and nonparametric curves.

Fig. 2.

As in Fig. 1, but for each proxy for the semiparametric model.

Fig. 2.

As in Fig. 1, but for each proxy for the semiparametric model.

We then compare the explanatory power of the linear and the semiparametric model in a comparison test as above (Table 3). We find that the reduction in residual sum of squares is significant, indicating a better fit for the semiparametric model over the linear model.

Finally, we calculate the in-sample correlation (i.e., during the training interval) of the three methods and the NH temperature; see Table 4. These nonparametric methods give better in-sample fit than the linear method.

Table 4.

In-sample correlation between the reconstructions and measured NH temperature (1850–1969).

In-sample correlation between the reconstructions and measured NH temperature (1850–1969).
In-sample correlation between the reconstructions and measured NH temperature (1850–1969).

a. Robustness test

The results regarding proxy nonlinearity are interpretable in a relative sense—that is, the nonlinearity is present in a proxy with respect to the other proxies, and a natural question is whether the list of nonlinear proxies detected will be different if a different set of proxies is used. We test this in a very simple way, by removing one proxy at the time and recalculating, noting which proxies test positive for nonlinearity, then replacing the omitted proxy, and going on to the next one. Thus, 15 tests are performed. Table 5 shows the results. Evidently there are two proxies that overwhelmingly test positive for nonlinearity—Indigirka and Yamal—while a handful of others now and then also pass the test, depending on which proxy is omitted in the leave-one-out analysis. This gives a robust indication that the two proxies are special with respect to the other proxies as a group.

Table 5.

Results of testing the robustness of the nonlinearity test, based on leave-one-out sampling. As there are 15 proxies, we can choose 15 different sets of 14 proxies each and test for nonlinearity and see whether a particular proxy tests positive for nonlinearity. We count the number of times a proxy is found significantly nonlinear (at 10% level) in the 15 possible calibrated nonparametric models. For instance, Yamal was found to be nonlinear 10 out of 15 times, while Teletskoe Lake was only found nonlinear twice out of the 15 tests.

Results of testing the robustness of the nonlinearity test, based on leave-one-out sampling. As there are 15 proxies, we can choose 15 different sets of 14 proxies each and test for nonlinearity and see whether a particular proxy tests positive for nonlinearity. We count the number of times a proxy is found significantly nonlinear (at 10% level) in the 15 possible calibrated nonparametric models. For instance, Yamal was found to be nonlinear 10 out of 15 times, while Teletskoe Lake was only found nonlinear twice out of the 15 tests.
Results of testing the robustness of the nonlinearity test, based on leave-one-out sampling. As there are 15 proxies, we can choose 15 different sets of 14 proxies each and test for nonlinearity and see whether a particular proxy tests positive for nonlinearity. We count the number of times a proxy is found significantly nonlinear (at 10% level) in the 15 possible calibrated nonparametric models. For instance, Yamal was found to be nonlinear 10 out of 15 times, while Teletskoe Lake was only found nonlinear twice out of the 15 tests.

b. Reconstruction uncertainties

The reconstructions presented here are for comparative purposes only, but it is of interest to see how specific they are given the proxies used and whether they are individually different once a picture of their uncertainties are at hand. To obtain this information, we perform bootstrapping with replacement on the observations. This is a well-known procedure that is occasionally coming into use in climate reconstruction circles (Till and Guiot 1990; Guiot et al. 2005). We performed 1000 resamples, performed the training and reconstruction each time, and compiled the reconstructions. In the end we had 1000 values for each year of the reconstruction period and found the lower 2.5% and the upper 97.5% percentiles and plot these, along with the reconstructions, in Fig. 3.

Fig. 3.

(top) The reconstructed rolling 10-yr mean NH temperatures from the linear method, (middle) the nonlinear method, and (bottom) the semiparametric method. The gray curves are the 2.5% lower and upper 97.5% percentiles; see text for details on the bootstrapping performed to generate these values.

Fig. 3.

(top) The reconstructed rolling 10-yr mean NH temperatures from the linear method, (middle) the nonlinear method, and (bottom) the semiparametric method. The gray curves are the 2.5% lower and upper 97.5% percentiles; see text for details on the bootstrapping performed to generate these values.

The reconstructions are from a.d. 800 to 1990, to minimize problems with missing proxy data. However, for these reconstructions, zero padding in the standardized proxy series is applied for missing observations in the Greenland composite (1974–90), the southern Sierra Nevada (1989–90), Jämtland (888–908), and the Columbia Icefield (800–949). As already mentioned, such a replacement of missing data may affect the reconstructions. To estimate such an effect, we have excluded the Columbia Icefield tree-ring record in a separate analysis, as this is the proxy with the most missing data. That is, first, we have used the other 14 proxies and the mean NH temperature to calibrate three models: a complete linear, a nonparametric, and a semiparametric model with Indigirka and Yamal as nonlinear. Second, reconstructions are performed with these three models. A comparison between these reconstructions (with 14 proxies) and the reconstructions in Fig. 3 (with 15 proxies) indicates that the zero padding has almost no effect on the linear and semiparametric reconstructions, and just a small effect on the nonparametric reconstruction, which leads us to believe that even though zero padding has been applied, the reconstructions in this case seem relatively robust.

We note that the intervals of uncertainty in Fig. 3 are quite wide—so wide that here is hardly any difference between the reconstructions—but particularly wide in the case of the fully nonlinear method. This is consistent with the possibility of overfitting in the training interval. Note the much larger uncertainty, relatively, for the period near a.d. 1000 in the nonlinear method.

5. Summary and discussion

We have introduced testing for nonlinearity in proxy-based temperature reconstructions and a nonlinear reconstruction method. An alternative to the approximate test for nonlinear effects we have used here is the generalized likelihood ratio (GLR) test developed by Fan and Jiang (2005). By using such a test, it is possible to test a parametric null hypothesis—for example, that the mean NH temperature is a linear model of the available proxies—against an alternative, not a linear model. We note that even though the asymptotic null distribution of the GLR statistic is available, in finite samples one would resort to conditional bootstrap methods to obtain the null distributions. We therefore defer application of this test for future research.

We show that these nonparametric methods do give better in-sample fit and further, we should expect that temperature reconstructions utilizing such methods also could be more correct. Of course, we should have in mind problems with overfitting when using nonlinear models.

Furthermore, autocorrelation in the residuals can cause problems during reconstructions, such as biased level or biased uncertainty estimates. This problem may still be present, and in future research we aim to control for this fact by introducing autoregressive error terms. Also, using a better selection procedure for relevant proxies for the nonparametric additive regression model would be of interest, but we note that inference in additive models is not well developed—see, for example, Fan and Jiang (2005).

Our analysis shows that relative to the present group of proxies, two are nonlinear relative to the NH mean temperature. The series are the tree-ring width records Indigirka and Yamal. One possible physical explanation for the nonlinear behavior of the Yamal tree-ring width record (Briffa 2000; Briffa et al. 2008) may be the observed change in growth form of the trees in this region during the latter part of the twentieth century caused by the warmer and thus more favorable growth conditions (Devi et al. 2008). Concerning the Indigirka tree-ring width record (Moberg et al. 2006), only the last 600 yr of the record have actually been published in a specific article that critically evaluates the record (under the name Yakutia) (Hughes et al. 1999). It is interesting to note, however, that the correlation between the two records only amounts to 0.77 over the period a.d. 1400–1993, which they both have in common.

That only 2 out of 15 proxies tested positive for nonlinearity is support for the general assumption that temperature proxies can be used in linear reconstruction attempts. Loehle (2009) discusses how nonlinearity can come about in the context of trees’ response functions; it is evident how the response can be not only nonlinear but monotonic and also bivalued, although then with oppositely signed response rates. Our own result for Yamal seems to be of the former kind (Devi et al. 2008), while the result for Indigirka hints at the latter (or an even more complex case). We note that data nonlinearities do not just arise due to direct causes affecting trees and their environment but can also be due to mundane things like data collection issues (Esper et al. 2012). None of the nonlinearities we have detected was found in other forms of temperature proxies.

To validate the models, we would like to introduce cross validation on data not used to train the model—a common and important procedure in the reconstruction field but difficult to carry out due to the scarcity of independent data. “Cross validation,” also known as “leave one out,” validation is a possibility but is complicated by the presence of autocorrelation in the data.

Finally, testing reconstruction methods on ensembles (Christiansen et al. 2009) of pseudodata allows one to find not only the scatter of the reconstructions but also the bias. The use of the bootstrap routine is, however, a very economical way of extracting information on the scatter of a reconstruction from limited data. Further research should also include comparing nonparametric reconstruction methods with the more commonly used reconstruction methods, such as the iterative regularized expectation maximization (RegEM) method (Dempster et al. 1977; Schneider 2001; Mann and Rutherford 2002; Smerdon and Kaplan 2007), and also examining their performance in an ensemble study as in Christiansen et al. (2009).

Acknowledgments

We wish to express our gratitude to the scholars that provided us with proxy data that are not available from public databases: Christophe Corona, CNRS/Aix-Marseille Universit, and Ivan A. Kalugin, Siberian branch of the Russian Academy of Sciences. This work was supported by the Danish Climate Centre at the Danish Meteorological Institute. We thank the reviewers, who helped us improve this manuscript.

APPENDIX

An Application of Nonparametric Regression to an Artificial Case

In this appendix we document the application of the additive nonparametric model to an artificial case, that is, we specify a nonlinear model for some of the proxies and then check whether the method detects this model. For simplicity, we assume an additive model consisting of just three standardized proxies, as follows:

 
formula

where the three proxies X1, X2, and X3 are the Southern Colorado Plateau (as its coefficient is found significant in the linear model), Indigirka, and Yamal (as these are found significantly nonlinear), respectively. We further assume that the regression function for the Southern Colorado Plateau is linear (i.e., m1 = 0.06X1), the regression function for Indigirka is nonlinear [m2 = 0.1 cos2(X2)], and the regression function for Yamal is also nonlinear (). The functions are chosen to resemble the calibrated functions from the semiparametric model.

By using the calibration data from 1850 to 1969 for these proxies, we generate the artificial “temperature” Y from the model above. In addition, we add noise (e), which is from a normal distribution with zero mean and standard deviation equal to 0.15, to the artificial temperature. We then fit a nonparametric model to these data. The result from the estimation is shown in Table A1 and Fig. A1. The test correctly detects m2 and m3 to be nonlinear, and rejects m1 as nonlinear. In the figure the estimated functions are plotted as dashed curves, and the true underlying functions are plotted as solid curves. Clearly, the method is capable of detecting the true underlying functions, but not perfectly, mainly due to the added noise. In the bottom-right plot, the points are the artificial temperature Y with noise. Clearly, they deviate quite a bit from the solid curve, which is the artificial temperature Y without noise. The dashed curve is the reconstruction obtained from the calibrated nonparametric model. The reconstruction is quite close to the true curve, and we conclude that the method works.

Table A1.

Results from the approximate nonlinear test for the fitted nonparametric model for the artificial case.

Results from the approximate nonlinear test for the fitted nonparametric model for the artificial case.
Results from the approximate nonlinear test for the fitted nonparametric model for the artificial case.
Fig. A1.

(from top left to bottom left) The estimated functions (dashed curves) for the fitted nonparametric model and the true functions (solid curves) for the artificial case. (bottom right) Points are the artificial Y with noise, the solid curve is the artificial Y without noise, and the dashed curve is the reconstruction obtained from the nonparametric model.

Fig. A1.

(from top left to bottom left) The estimated functions (dashed curves) for the fitted nonparametric model and the true functions (solid curves) for the artificial case. (bottom right) Points are the artificial Y with noise, the solid curve is the artificial Y without noise, and the dashed curve is the reconstruction obtained from the nonparametric model.

REFERENCES

REFERENCES
Andreu-Hayles
,
L.
,
R.
D’Arrigo
,
K. J.
Anchukaitis
,
P. S. A.
Beck
,
D.
Frank
, and
S.
Goetz
,
2011
:
Varying boreal forest response to Arctic environmental change at the Firth River, Alaska
.
Environ. Res. Lett.
,
6
,
045503
,
doi:10.1088/1748-9326/6/4/045503
.
Bird
,
B. W.
,
M. B.
Abbott
,
B. P.
Finney
, and
B.
Kutchko
,
2009
:
A 2000 year varve-based climate record from the central Brooks Range, Alaska
.
J. Paleolimnol.
,
41
,
25
41
.
Birks
,
H. J. B.
,
1995
:
Quantitative palaeoenvironmental reconstructions. Statistical Modelling of Quaternary Science Data, D. Maddy and J. S. Brew, Eds., Technical Guide, Vol. 5, Quaternary Research Association, 161–254
.
Birks
,
H. J. B.
,
O.
Heiri
,
H.
Seppä
, and
A.
Bjune
,
2010
:
Strengths and weaknesses of quantitative climate reconstructions based on late-Quaternary biological proxies
.
Open Ecol. J.
,
3
,
68
110
.
Bradley
,
R. S.
,
1999
:
Paleoclimatology: Reconstructing Climates of the Quaternary. 2nd ed. International Geophysics Series, Vol. 68, Academic Press, 613 pp
.
Bradley
,
R. S.
, and
P. D.
Jones
,
1992
:
Climatic variations over the last 500 years. Climate since A.D. 1500, R. S. Bradley and P. D. Jones, Eds., Routledge, 649–665
.
Breiman
,
L.
, and
J.
Friedman
,
1985
:
Estimating optimal transformations for multiple regression and correlation (with discussion)
.
J. Amer. Stat. Assoc.
,
80
,
580
619
.
Briffa
,
K. R.
,
2000
:
Annual climate variability in the Holocene: Interpreting the message of ancient trees
.
Quat. Sci. Rev.
,
19
,
87
.
Briffa
,
K. R.
,
P. D.
Jones
,
F. H.
Schweingruber
,
S. G.
Shiyatov
, and
E. R.
Cook
,
1995
:
Unusual twentieth-century summer warmth in a 1,000-year temperature record from Siberia
.
Nature
,
376
,
156
159
.
Briffa
,
K. R.
,
F. H.
Schweingruber
,
P. D.
Jones
,
T. J.
Osborn
,
I. C.
Harris
,
S. G.
Shiyatov
,
E. A.
Vaganov
, and
H.
Grudd
,
1998a
:
Trees tell of past climates: But are they speaking less clearly today?
Philos. Trans. Roy. Soc. London
,
B353
,
65
73
.
Briffa
,
K. R.
,
F. H.
Schweingruber
,
P. D.
Jones
,
T. J.
Osborn
,
S. G.
Shiyatov
, and
E. A.
Vaganov
,
1998b
:
Reduced sensitivity of recent tree-growth to temperature at high northern latitudes
.
Nature
,
391
,
678
682
.
Briffa
,
K. R.
,
V. V.
Shishov
,
T. M.
Melvin
,
E. A.
Vaganov
,
H.
Grudd
,
R. M.
Hantemirov
,
M.
Eronen
, and
M. M.
Naurzbaev
,
2008
:
Trends in recent temperature and radial tree growth spanning 2000 years across northwest Eurasia
.
Philos. Trans. Roy. Soc. London
,
B363
,
2269
2282
.
Brohan
,
P.
,
J. J.
Kennedy
,
I.
Harris
,
S. F. B.
Tett
, and
P. D.
Jones
,
2006
:
Uncertainty estimates in regional and global observed temperature changes: A new dataset from 1850
.
J. Geophys. Res.
,
111
,
D12106
,
doi:10.1029/2005JD006548
.
Buja
,
A.
,
T.
Hastie
, and
R.
Tibshirani
,
1989
:
Linear smoothers and additive models
.
Ann. Stat.
,
17
,
453
510
.
Büntgen
,
U.
,
D. C.
Frank
,
D.
Nievergelt
, and
J.
Esper
,
2006
:
Summer temperature variations in the European Alps, a.d. 755–2004
.
J. Climate
,
19
,
5606
.
Büntgen
,
U.
, and
Coauthors
,
2011
:
2500 years of European climate variability and human susceptibility
.
Science
,
331
,
578
582
.
Carrer
,
M.
, and
C.
Urbinati
,
2001
:
Spatial analysis of structural and tree-ring related parameters in a timberline forest in the Italian Alps
.
J. Veg. Sci.
,
12
,
643
652
.
Christiansen
,
B.
,
T.
Schmith
, and
P.
Thejll
,
2009
:
A surrogate ensemble study of climate reconstruction methods: Stochasticity and robustness
.
J. Climate
,
22
,
951
976
.
Cook
,
T. L.
,
R. S.
Bradley
,
J. S.
Stoner
, and
P.
Francus
,
2009
:
Five thousand years of sediment transfer in a high Arctic watershed recorded in annually laminated sediments from Lower Murray Lake, Ellesmere Island, Nunavut, Canada
.
J. Paleolimnol.
,
41
,
77
94
.
Corona
,
C.
,
J.-L.
Edouard
,
G.
Guibal
,
J.
Guiot
,
S.
Bernard
,
A.
Thomas
, and
N.
Denelle
,
2011
:
Long-term summer (a.d. 751–2008) temperature fluctuation in the French Alps based on tree-ring data
.
Boreas
,
40
,
351
366
.
D’Arrigo
,
R. D.
, and
Coauthors
,
2001
:
1738 years of Mongolian temperature variability inferred from a tree-ring width chronology of Siberian pine
.
Geophys. Res. Lett.
,
28
,
543
546
.
D’Arrigo
,
R. D.
,
R. K.
Kaufmann
,
N.
Davi
,
G. C.
Jacoby
,
C.
Laskowski
,
R. B.
Myneni
, and
P.
Cherubini
,
2004
:
Thresholds for warming-induced growth decline at elevational tree line in the Yukon Territory, Canada
.
Global Biogeochem. Cycles
,
18
,
GB3021
,
doi:10.1029/2004GB002249
.
D’Arrigo
,
R. D.
,
R.
Wilson
, and
G.
Jacoby
,
2006
:
On the long-term context for late twentieth century warming
.
J. Geophys. Res.
,
111
,
D03103
,
doi:10.1029/2005JD006352
.
D’Arrigo
,
R. D.
,
R.
Wilson
,
B.
Liepert
, and
P.
Cherubini
,
2008
:
On the ‘divergence problem’ in northern forests: A review of the tree-ring evidence and possible causes
.
Global Planet. Change
,
60
,
289
305
.
Dempster
,
A. P.
,
N. M.
Laird
, and
D. B.
Rubin
,
1977
:
Maximum likelihood from incomplete data via the EM algorithm
.
J. Roy. Stat. Soc.
,
B39
,
1
38
.
Devi
,
N.
,
F.
Hagedorn
,
P.
Moiseev
,
H.
Bugmann
,
S.
Shiyatov
,
V.
Mazepa
, and
A.
Rigling
,
2008
:
Expanding forests and changing growth forms of Siberian larch at the Polar Urals treeline during the 20th century
.
Change Biol.
,
14
,
1581
1591
.
Esper
,
J.
,
E. R.
Cook
, and
F. H.
Schweingruber
,
2002
:
Low-frequency signals in long tree-ring chronologies for reconstructing past temperature variability
.
Science
,
295
,
2250
.
Esper
,
J.
,
U.
Büntgen
,
M.
Timonen
, and
D. C.
Frank
,
2012
: V
ariability and extremes of northern Scandinavian summer temperatures over the past two millennia
.
Global Planet. Change
,
88–89
,
1
9
.
Fan
,
J.
, and
I.
Gijbels
,
1996
:
Local Polynomial Modelling and Its Applications. Monogr. on Statistics and Applied Probability, No. 66, Chapman and Hall, 360 pp
.
Fan
,
J.
, and
J.
Jiang
,
2005
:
Nonparametric inferences for additive models
.
J. Amer. Stat. Assoc.
,
100
,
890
907
.
Frank
,
D.
,
J.
Esper
,
E.
Zorita
, and
R.
Wilson
,
2010
:
A noodle, hockey stick, and spaghetti plate: A perspective on high-resolution paleoclimatology
.
Wiley Interdiscip. Rev. Climate Change
,
1
,
507
516
.
Fritts
,
H. C.
,
1976
:
Tree Rings and Climate. Academic Press, 567 pp
.
Graumlich
,
L. J.
,
1993
:
A 1000-year record of temperature and precipitation in the Sierra Nevada
.
Quat. Res.
,
39
,
249
255
.
Graumlich
,
L. J.
, and
L. B.
Brubaker
,
1986
:
Reconstruction of annual temperature (1590–1979) for Longmire, Washington, derived from tree rings
.
Quat. Res.
,
25
,
223
234
.
Green
,
P. J.
, and
B. W.
Silverman
,
1994
:
Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Monogr. on Statistics and Applied Probability, No. 58, Chapman and Hall, 182 pp
.
Groveman
,
B. S.
, and
H. E.
Landsberg
,
1979
:
Simulated Northern Hemisphere temperature departures 1579–1880
.
Geophys. Res. Lett.
,
6
,
767
769
.
Grudd
,
H.
,
2008
:
Torneträsk tree-ring width and density ad 500–2004: A test of climatic sensitivity and a new 1500-year reconstruction of north Fennoscandian summers
.
Climate Dyn.
,
31
,
843
857
.
Guiot
,
J.
,
A.
Nicault
,
C.
Rathgeber
,
J.
Edouard
,
F.
Guibal
,
G.
Pichard
and
C.
Till
,
2005
:
Last-millennium summer-temperature variations in western Europe based on proxy data
.
Holocene
,
15
,
489
500
.
Haslett
,
J.
,
M.
Whiley
,
S.
Bhattacharya
,
M.
Salter-Townshend
,
S. P.
Wilson
,
J. R. M.
Allen
,
B.
Huntley
, and
F. J. G.
Mitchell
,
2006
:
Bayesian palaeoclimate reconstruction
.
J. Roy. Stat. Soc.
,
A169
,
395
438
.
Hastie
,
T. J.
, and
R. J.
Tibshirani
,
1990
:
Generalized Additive Models. Monogr. on Statistics and Applied Probability, No. 43, Chapman and Hall, 335 pp
.
Hastie
,
T. J.
,
R. J.
Tibshirani
, and
J.
Friedman
,
2009
:
The Elements of Statistical Learning: Data Mining, Inference and Prediction. 2nd ed. Springer, 745 pp
.
Helama
,
S.
,
M. M.
Fauria
,
K.
Mielikinen
,
M.
Timonen
, and
M.
Eronen
,
2010
:
Sub-Milankovitch solar forcing of past climates: Mid and late Holocene perspectives
.
Geol. Soc. Amer. Bull.
,
122
,
1981
1988
.
Hughes
,
M.
,
E.
Vaganov
,
S.
Shiyatov
,
R.
Touchan
, and
G.
Funkhouser
,
1999
:
Twentieth-century summer warmth in northern Yakutia in a 600-year context
.
Holocene
,
9
,
629
634
.
Jansen
,
E.
, and
Coauthors
,
2007
:
Paleoclimate. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 433–497
.
Jones
,
P.
, and
Coauthors
,
2009
:
High-resolution palaeoclimatology of the last millennium: A review of current status and future prospects
.
Holocene
,
19
,
3
49
.
Juckes
,
M. N.
,
M. R.
Allen
,
K. R.
Briffa
,
J.
Esper
,
G. C.
Hegerl
,
A.
Moberg
,
T. J.
Osborn
, and
S. L.
Weber
,
2007
:
Millennial temperature reconstruction intercomparison and evaluation
.
Climate Past
,
3
,
591
609
.
Kalugin
,
I. A.
,
A. V.
Daryin
, and
V. V.
Babich
,
2009
:
Reconstruction of annual air temperatures for three thousand years in Altai region by lithological and geochemical indicators in Teletskoe Lake sediments
.
Dokl. Earth Sci.
,
426
,
681
684
.
Korhola
,
A.
,
2002
:
Holocene temperature changes in northern Fennoscandia reconstructed from chironomids using Bayesian modelling
.
Quat. Sci. Rev.
,
21
,
1841
1860
.
Lamoureux
,
S. F.
, and
R. S.
Bradley
,
1996
:
A late Holocene varved sediment record of environmental change from northern Ellesmere Island, Canada
.
J. Paleolimnol.
,
16
,
239
255
.
Linderholm
,
H. W.
, and
B. E.
Gunnarson
,
2005
:
Summer temperature variability in central Scandinavia during the last 3600 years
.
Geogr. Ann.
,
87A
,
231
241
.
Lindholm
,
M.
,
R.
Jalkanen
,
H.
Salminen
,
T.
Aalto
, and
M.
Ogurtsov
,
2011
:
The height-increment record of summer temperature extended over the last millennium in Fennoscandia
.
Holocene
,
21
,
319
326
.
Ljungqvist
,
F. C.
,
2010
:
A new reconstruction of temperature variability in the extra-tropical Northern Hemisphere during the last two millennia
.
Geogr. Ann.
,
92A
,
339
351
.
Ljungqvist
,
F. C.
,
P. J.
Krusic
,
G.
Brattström
, and
H. S.
Sundqvist
,
2012
:
Northern Hemisphere temperature patterns in the last 12 centuries
.
Climate Past
,
8
,
227
249
.
Lloyd
,
A. H.
, and
L. J.
Graumlich
,
1997
:
Holocene dynamics of treeline forests in the Sierra Nevada
.
Ecology
,
78
,
1199
1210
.
Loehle
,
C.
,
2009
:
A mathematical analysis of the divergence problem in dendroclimatology
.
Climatic Change
,
94
,
233
245
.
Loso
,
M. G.
,
2009
:
Summer temperatures during the Medieval Warm Period and Little Ice Age inferred from varved proglacial lake sediments in southern Alaska
.
J. Paleolimnol.
,
41
,
117
128
.
Luckman
,
B. H.
, and
R. J. S.
Wilson
,
2005
:
Summer temperatures in the Canadian Rockies during the last millennium: A revised record
.
Climate Dyn.
,
24
,
131
144
.
Mann
,
M. E.
, and
S.
Rutherford
,
2002
:
Climate reconstruction using ‘pseudoproxies.’
Geophys. Res. Lett.
,
29
,
1501
,
doi:10.1029/2001GL014554
.
Moberg
,
A.
,
D. M.
Sonechkin
,
K.
Holmgren
,
N. M.
Datsenko
,
W.
Karlén
, and
S.-E.
Lauritzen
,
2006
:
Corrigendum: Highly variable Northern Hemisphere temperatures reconstructed from low- and high-resolution proxy data
.
Nature
,
439
,
1014
,
doi:10.1038/nature04575
.
Moore
,
J. J.
,
K. A.
Hughen
,
G. H.
Miller
, and
J. T.
Overpeck
,
2001
:
Little Ice Age recorded in summer temperature reconstruction from varved sediments of Donard Lake, Baffin Island, Canada
.
J. Paleolimnol.
,
25
,
503
517
.
NRC
,
2006
:
Surface Temperature Reconstructions for the last 2,000 Years. National Academies Press, 160 pp
.
Popa
,
I.
, and
Z.
Kern
,
2009
:
Long-term summer temperature reconstruction inferred from tree-ring records from the Eastern Carpathians
.
Climate Dyn.
,
32
,
1107
1117
.
Salzer
,
M. W.
, and
K. F.
Kipfmueller
,
2005
:
Reconstructed temperature and precipitation on a millennial timescale from tree-rings in the Southern Colorado Plateau, U.S.A
.
Climatic Change
,
70
,
465
487
.
Schneider
,
T.
,
2001
:
Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values
.
J. Climate
,
14
,
853
871
.
Smerdon
,
J. E.
, and
A.
Kaplan
,
2007
:
Comments on testing the fidelity of methods used in proxy-based reconstructions of past climate: The role of the standardization interval
.
J. Climate
,
20
,
5666
5670
.
Tan
,
M.
,
T.
Liu
,
J.
Hou
,
X.
Qin
,
H.
Zhang
, and
T.
Li
,
2003
:
Cyclic rapid warming on centennial-scale revealed by a 2650-year stalagmite record of warm season temperature
.
Geophys. Res. Lett.
,
30
,
1617
,
doi:10.1029/2003GL017352
.
Teräsvirta
,
T.
,
D.
Tjøstheim
, and
C. W. J.
Granger
,
2010
:
Modelling Nonlinear Economic Time Series. Advanced Texts in Econometrics, Oxford University Press, 557 pp
.
Thomas
,
E. K.
, and
J. P.
Briner
,
2009
:
Climate of the past millennium inferred from varved proglacial lake sediments on northeast Baffin Island, Arctic Canada
.
J. Paleolimnol.
,
41
,
209
224
.
Till
,
C.
, and
J.
Guiot
,
1990
:
Reconstruction of precipitation in Morocco since 1100 a.d. based on Cedrus atlantica tree-ring widths
.
Quat. Res.
,
33
,
337
351
.
Tingley
,
M.
,
P.
Craigmile
,
M.
Haran
,
B.
Li
,
E.
Mannshardt
, and
B.
Rajaratnam
,
2012
:
Piecing together the past: Statistical insights into paleoclimatic reconstructions
.
Quat. Sci. Rev.
,
35
,
1
22
.
Tjøstheim
,
D.
, and
B.
Auestad
,
1994
:
Nonparametric identification of nonlinear time series: Projection
.
J. Amer. Stat. Assoc.
,
89
,
1398
1409
.
Toivonen
,
H.
,
H.
Mannila
,
A.
Korhola
, and
H.
Olander
,
2001
:
Applying Bayesian statistics to organism-based environmental reconstruction
.
Ecol. Appl.
,
11
,
618
630
.
Treydte
,
K. S.
,
D. C.
Frank
,
M.
Saurer
,
G.
Helle
,
G. H.
Schleser
, and
J.
Esper
,
2009
:
Impact of climate and CO2 on a millennium-long tree-ring carbon isotope record
.
Geochim. Cosmochim. Acta
,
73
,
4635
4647
.
Vasko
,
K.
,
H. T.
Toivonen
, and
A.
Korhola
,
2000
:
A Bayesian multinomial Gaussian response model for organism-based environmental reconstruction
.
J. Paleolimnol.
,
24
,
243
250
.
Vinther
,
B. M.
, and
Coauthors
,
2008
:
Synchronizing ice cores from the Renland and Agassiz ice caps to the Greenland Ice Core Chronology
.
J. Geophys. Res.
,
113
,
D08115
,
doi:10.1029/2007JD009143
.
Vinther
,
B. M.
,
P. D.
Jones
,
K. R.
Briffa
,
H. B.
Clausen
,
K. K.
Andersen
,
D.
Dahl-Jensen
, and
S. J.
Johnsen
,
2010
:
Climatic signals in multiple highly resolved stable isotope records from Greenland
.
Quat. Sci. Rev.
,
29
,
522
538
.
Wilson
,
R.
, and
Coauthors
,
2007
:
A matter of divergence: Tracking recent warming at hemispheric scales using tree ring data
.
J. Geophys. Res.
,
112
,
D17103
,
doi:10.1029/2006JD008318
.