## 1. Introduction

The climate system is believed to be turbulent in the sense that it would fluctuate even in the absence of time-varying forcing. These fluctuations, called natural fluctuations, must be distinguished from fluctuations induced by time-varying forcing in order to establish that an observed change differs significantly from natural fluctuations, a procedure called *detection,* and that the observed change is consistent with that induced by a specific forcing, a procedure called *attribution.* If the forced and unforced fluctuations are independent additive variables with known space–time covariances, then the detection and attribution problems can be treated using standard methods in signal processing (Hasselmann 1979, 1993; Bell 1986; North et al. 1995; Mitchell et al. 2001; International Ad Hoc Detection and Attribution Group 2005).

In practice, however, neither the forced nor the unforced fluctuations are known. Consequently, most investigators estimate the statistics of these fluctuations from general circulation models (GCMs). While imperfect, GCMs provide further information not accessible from observations, such as climate realizations under different forcings, and multiple realizations under the same forcing. Unfortunately, at the present time, the typical ensemble size is too small (less than a half dozen) and the integration length too short (less than a few millennia) to estimate the forced and unforced fluctuations directly without invoking additional statistical processing methods. Accordingly, most investigators apply some type of time filtering to estimate the space-time structure of the forced response. For example, Tett et al. (1996) and Hegerl et al. (1996) used differences between 10- and 20-yr means, respectively, to represent the forced response. Tett et al. (1999) used five successive decadal means to represent the forced response in both space and time. Linear trends over 10 or more years of annual mean data also have been used to represent the observed climate change (e.g., Folland et al. 2001). In addition, most investigators apply a spatial filter to eliminate small-scale variability and to reduce the dimension of the detection and attribution problem. For example, Hegerl et al. (1996, 1997), North and Stevens (1998), and Allen and Tett (1999) represented the forced and unforced fluctuations with the first few empirical orthogonal functions (EOFs) of GCM output. Tett et al. (1999) reduced the spatial resolution of GCM output to a few spherical harmonics prior to EOF truncation.

The question arises as to why investigators apply an ad hoc space–time filter as part of an “optimal” detection and attribution procedure. Presumably, such filtering is motivated by an assumed scale separation in space and time between natural and forced fluctuations. However, the precise criterion for adopting these filters is rarely explicit, and the above filters are almost certainly suboptimal for most appropriate criteria. For instance, applying independent time filters at each point ignores spatial correlations of the forced fluctuations. Also, the dominant principal components of a low-pass-filtered time series are not necessarily the components with large *relative* power at low frequencies. To see this, consider the sum of two orthogonal spatial patterns, one of which fluctuates slowly on decadal time scales, the other of which fluctuates as independent white noise. If the variance of the white noise is sufficiently large, then the field associated with the white noise will dominate the low-pass-filtered variance, since a low-pass filter removes little power at low frequencies, and hence will be the leading principal component in the low-pass time series. It follows that projecting the associated EOF onto the *original* time series will produce white noise. In essence, a field may dominate the low-frequency variance simply because it dominates the variance at all frequencies.
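The two-pattern thought experiment above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the AR(1) coefficient, the noise amplitude, the moving-average filter, and all variable names are assumptions chosen for illustration. Because the two patterns are orthogonal, the leading principal component of the filtered data is simply the pattern whose amplitude has the larger filtered variance.

```python
import random

random.seed(0)
N, WIDTH = 5000, 10   # series length and filter width (assumed values)

def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def low_pass(x, width):
    """Moving average, a crude stand-in for a low-pass filter."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]

# Amplitude of pattern 1: slowly varying AR(1) process.
slow = [0.0]
for _ in range(N - 1):
    slow.append(0.9 * slow[-1] + random.gauss(0.0, 1.0))

# Amplitude of pattern 2: white noise with a much larger variance.
noise = [random.gauss(0.0, 20.0) for _ in range(N)]

lp_var_slow = variance(low_pass(slow, WIDTH))
lp_var_noise = variance(low_pass(noise, WIDTH))

# Fraction of each amplitude's own variance that survives the filter.
frac_slow = lp_var_slow / variance(slow)
frac_noise = lp_var_noise / variance(noise)

# Pattern that dominates the low-pass-filtered variance.
leading_filtered_pc = "noise" if lp_var_noise > lp_var_slow else "slow"
# Pattern with the larger low-frequency power relative to its total power.
best_relative_power = "slow" if frac_slow > frac_noise else "noise"
print(leading_filtered_pc, best_relative_power)
```

With these assumed settings the white-noise pattern should lead the filtered EOF analysis, while the slow pattern retains the larger fraction of its own variance, which is the distinction drawn in the text.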

The above example illustrates that an EOF analysis applied to a low-pass time series merely identifies components whose power at low frequencies dominates that of all *other components.* It does not, however, necessarily identify components that have large power at low frequencies compared to all *other frequencies.* For example, linear trends and decadal oscillations have large power at low frequency *relative to high frequencies,* whereas white noise has equal power at all frequencies. We propose, then, that a more appropriate criterion for reducing the dimension of a detection and attribution problem is that the resulting dimension should optimize the power at low frequencies relative to all frequencies. In this paper, we discuss a technique for finding time series that maximize the ratio of low-frequency variance to total variance or, what amounts to the same thing, maximize the ratio of low-frequency to high-frequency variance (where “high-frequency” refers to all frequencies outside the low-frequency spectral window).

In (1), *τ* is the time lag and *ρ*_{τ} is the autocorrelation function of the time series. The decorrelation time (1) appears frequently in turbulence studies and time series analysis as a measure of an integral time scale of a random process. Less appreciated is the fact that the sample decorrelation time also measures the strength of trends, discontinuities, or other low-frequency signals. For such nonstationary time series, the sample autocorrelation function does not vanish at long time lags and hence the integral (1) does not converge as the upper limit of integration tends to infinity. Nevertheless, one can still interpret the decorrelation time as proportional to the ratio of the estimated power at zero frequency to the total power, as discussed in DelSole (2001) and in more detail in section 2b. Thus, optimal persistence analysis (OPA) can be interpreted as a procedure that first isolates low-frequency, nonstationary signals, orders them by the ratio of the estimated zero-frequency power to total power, then isolates stationary signals and orders them by their decorrelation time. The present paper goes beyond DelSole (2001) by 1) considering more carefully the effect of finite time series on estimates of decorrelation times, 2) clarifying the connection between OPA and power spectral analysis, 3) proposing an objective procedure for deciding the number of principal components for representing the reduced subspace in which to solve the optimization problem, 4) introducing a statistical significance test for the decorrelation times, and 5) accounting for the annual cycle.

For decadal time scales, OPA appears to be closely related to the technique proposed by Schneider and Held (2001), which finds components that maximize the ratio of interdecadal variance to intradecadal variance. Neither procedure prejudices the low-frequency variability to be a prescribed function of time, in contrast to linear trend analysis, and both procedures take advantage of nonlocal covariances in space and time to improve the signal-to-noise ratio. Both procedures can be viewed as “lifting off successive decoupled layers of interdecadal variations,” in the terminology of Schneider and Held, and hence provide attractive methods for isolating interdecadal signals. The main advantages of OPA are that it does not depend strongly on arbitrarily predefined filter time scales, such as decadal time periods, and has a natural connection to power spectrum analysis.

This paper applies OPA to simulated and observed multidecadal fields with the aim of elucidating the low-frequency behavior of surface temperature. The statistical analysis is discussed in section 2, and the time series used in this study are discussed in section 3. Our results are reviewed in section 4. The concluding section summarizes our results.

## 2. Optimal persistence analysis with finite samples

### a. Review of OPA

OPA seeks a linear combination *f*^{T}**x**_{t} for a state vector **x**_{t} that yields a time series that maximizes the decorrelation time (1). DelSole (2001) showed that this optimization problem leads to the generalized eigenvalue problem (2), in which **Σ**_{τ} is the time-lagged covariance matrix of the process and superscript T denotes the transpose. The eigenvectors *f* give the weighting coefficients for the desired linear combination of state variables; that is, *f*^{T}**x**_{t} is a scalar time series that maximizes *T*_{1}. The eigenvectors *f* may be interpreted as “fingerprints” of low-frequency variability in the sense that they define a spatial filter that can be applied to the state at each time to produce a time series that optimizes (1). Time series produced by different eigenvectors are mutually uncorrelated. The eigenvalues *λ* give the corresponding value of *T*_{1}. By convention, the optimal persistence patterns (OPPs) are ordered by decreasing value of *T*_{1}, and the eigenvectors are normalized such that *f*^{T}**Σ**_{0}*f* = 1, which is equivalent to normalizing the time series to unit variance. To each eigenvector corresponds an optimal persistence pattern, given by *q* = **Σ**_{0}*f*. The fingerprints *f* typically have a much smaller scale than the OPP *q*, a common feature of fingerprint methods. The product of an OPP *q* and its associated time series *f*^{T}**x**_{t}, summed over all components, recovers the original data. The resulting superposition can be interpreted as a decomposition of the time series into an ordered set of components such that the leading component varies on the longest time scale, the second varies on the longest time scale subject to being uncorrelated with the first, and so on. This decomposition is analogous to the use of principal components (PCs) to maximize variance, except that instead of maximizing variance, the procedure maximizes the time scale of the component.

The power spectrum 𝗣(*ω*) of a stationary process is the Fourier transform of the time-lagged covariance matrix **Σ**_{τ}, as expressed by (3). Because **Σ**_{τ}^{T} = **Σ**_{−τ}, the eigenvector problem (2) can be written equivalently as (4).
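For reference, the relations that this subsection refers to as (1)–(5) can be written out as follows. This is a sketch under stated assumptions rather than a reproduction of the published equations: it assumes that the decorrelation time is the integral of the autocorrelation function over all lags, that time is continuous, and that the power spectrum carries a 1/2π normalization. The equation tags follow the in-text references.

```latex
% (1) Assumed definition of the decorrelation time of the component f^T x_t
T_1 = \int_{-\infty}^{\infty} \rho_\tau \, d\tau ,
\qquad
\rho_\tau = \frac{\mathbf{f}^T \boldsymbol{\Sigma}_\tau \mathbf{f}}
                 {\mathbf{f}^T \boldsymbol{\Sigma}_0 \mathbf{f}} .

% (2) Maximizing this quotient over f gives a generalized eigenvalue problem
\left( \int_{-\infty}^{\infty} \boldsymbol{\Sigma}_\tau \, d\tau \right) \mathbf{f}
  = \lambda \, \boldsymbol{\Sigma}_0 \mathbf{f} .

% (3) Power spectrum as the Fourier transform of the lagged covariances
\mathsf{P}(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty}
  \boldsymbol{\Sigma}_\tau \, e^{-i \omega \tau} \, d\tau .

% (4) Setting omega = 0 in (3) turns (2) into a statement about zero-frequency power
2\pi \, \mathsf{P}(0) \, \mathbf{f} = \lambda \, \boldsymbol{\Sigma}_0 \mathbf{f} .

% (5) Equivalent quotient, using Sigma_0 = \int P(omega) d omega
\lambda = \frac{2\pi \, \mathbf{f}^T \mathsf{P}(0) \, \mathbf{f}}
               {\int \mathbf{f}^T \mathsf{P}(\omega) \, \mathbf{f} \, d\omega} .
```

Under these assumptions the symmetry **Σ**_{τ}^{T} = **Σ**_{−τ} makes the integrated covariance matrix symmetric, so the eigenvalues are real and each one is proportional to the ratio of power at zero frequency to total power for its component.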

### b. OPA with finite samples

In practice, only a finite time series is available, say of *N* values. The sample analog of the power spectrum 𝗣(*ω*) is the periodogram 𝗣_{N}(*ω*_{k}), where *ω*_{k} is any nonzero frequency 2*πk*/*N* between −*π* and *π* (Brockwell and Davis 1996). The periodogram is related to the sample covariance matrix through a discrete Fourier transform over time lags.

A smoothed estimate of the power spectrum is obtained from a truncated weighted sum of the sample covariances, in which *r*_{τ} are weights, called the *lag window,* that decrease with the absolute value of time lag, and *M* is an upper bound less than *N* − 1, called the *truncation point*. This truncated weighted sum has some intuitive appeal since it gives less weight to the covariances at larger time lags, which are estimated with fewer samples. Nevertheless, the main justification for employing a truncated weighted sum is that the resulting power spectrum estimates have desirable properties in terms of bias and variance (Jenkins and Watts 1968). These considerations suggest that the appropriate finite sample approximation to the eigenvalue problems (2) and (4) replaces the integral of the time-lagged covariances with this truncated weighted sum of sample covariances.

Expressing 𝗖_{0} as a sum of the power spectrum estimates at the frequencies *ω*_{k} turns the eigenvalue into a quotient of the smoothed power at zero frequency to the total power. This quotient is the finite sample analog of (5), except that it involves the “smoothed” power spectrum rather than the periodogram 𝗣_{N}(*ω*_{k}); that is, the power at zero frequency in the numerator is a *smoothed estimate* of the periodogram near zero frequency. The parameter *M* essentially determines how many terms in the periodogram are averaged together to compute the smoothed power spectrum at zero frequency. Note, however, that the lag window and truncation point merely influence the degree of smoothing of the periodogram at zero frequency; they do not alter the fact that the estimate is always centered at zero frequency.

… *M*, then the decorrelation time is not a very meaningful concept. A more meaningful measure of “persistence” is the fraction of variance at zero frequency relative to all frequencies. If we identify the variance at low frequency as “signal,” then this fraction can be called the “signal-to-total ratio” (STR), which follows from (11).

We report the STR values as well as the decorrelation times for all results.

A major advantage of the Parzen window, as compared to the Tukey window, say, is that it cannot give negative power spectra estimates. Following Chatfield (1989), we choose *M* = *N*
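A lag-window estimate of the decorrelation time for a single time series can be sketched as follows. The Parzen weights below use the standard textbook form, and the truncation point uses the common rule of thumb *M* ≈ 2√*N*; both are assumptions here, although the rule does reproduce the value *M* = 20 quoted later for a 100-yr record. The function names and the synthetic series are hypothetical.

```python
import math
import random

def parzen(tau, M):
    """Parzen lag window r_tau with truncation point M (standard form)."""
    u = abs(tau) / M
    if u <= 0.5:
        return 1.0 - 6.0 * u ** 2 + 6.0 * u ** 3
    if u <= 1.0:
        return 2.0 * (1.0 - u) ** 3
    return 0.0

def autocov(x, tau):
    """Biased sample autocovariance at lag tau >= 0 (divides by N)."""
    n = len(x)
    m = sum(x) / n
    return sum((x[t] - m) * (x[t + tau] - m) for t in range(n - tau)) / n

def decorrelation_time(x, M):
    """Sum over |tau| <= M of r_tau * rho_tau, using symmetry in tau."""
    c0 = autocov(x, 0)
    return 1.0 + 2.0 * sum(parzen(k, M) * autocov(x, k) / c0
                           for k in range(1, M + 1))

random.seed(1)
N = 100
M = round(2 * math.sqrt(N))   # assumed rule of thumb: M = 20 for N = 100

white = [random.gauss(0.0, 1.0) for _ in range(N)]
trend = [0.05 * t + w for t, w in zip(range(N), white)]   # same noise plus a trend

T_white = decorrelation_time(white, M)
T_trend = decorrelation_time(trend, M)
print(M, round(T_white, 2), round(T_trend, 2))
```

In this sketch the series with a trend should yield a much larger estimate than white noise, in line with the interpretation of the sample decorrelation time as a measure of low-frequency signal strength.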

### c. Projection onto cyclostationary principal components

To account for the annual cycle, the data are projected onto what we call *cyclostationary principal components* (CSPCs). A cyclostationary random process (in the wide sense) is defined as a process whose first- and second-order moments are periodic functions of time (Gardner and Franks 1975). Aside from forced signals, the time series in this study are expected to be cyclostationary with an annual period. A fundamental fact, which does not appear to be widely appreciated in the climate community, is that any discrete *T*-periodic cyclostationary process can be converted into a *T*-dimensional stationary process simply by augmenting the state vector to include *T* consecutive states, a fact apparently first pointed out by Gladysev (1961) and stated explicitly in Jones and Brelsford (1967) and Troutman (1979). More precisely, if **x**_{t} is a discrete *T*-periodic cyclostationary process, then the augmented vector (17), formed by stacking *T* consecutive states, is stationary with respect to the index *r*. This fact allows cyclostationary processes to be described by the theory of multivariate stationary processes.
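The augmentation described above amounts to stacking *T* consecutive, non-overlapping states into one long vector per period. A minimal sketch follows; the function name `augment` and the toy data are hypothetical, and the ordering of elements inside the augmented vector is an assumption.

```python
def augment(series, T):
    """Stack T consecutive states into one augmented vector per period.

    series: list of state vectors (lists), one per time step; its length
            must be a multiple of the period T.
    Returns a list indexed by the period number r. Consecutive augmented
    vectors share no time steps.
    """
    if len(series) % T != 0:
        raise ValueError("series length must be a multiple of the period T")
    return [
        [value for state in series[r * T:(r + 1) * T] for value in state]
        for r in range(len(series) // T)
    ]

# Toy example: 3 years of seasonal data (T = 4) at 2 grid points.
seasonal = [[10 * year + s, 100 * year + s]
            for year in range(3) for s in range(4)]
yearly = augment(seasonal, T=4)
print(len(yearly), len(yearly[0]))   # 3 augmented vectors of length 8
```

Each augmented vector then holds all four seasons of one year, so ordinary principal component analysis of these vectors treats the seasonal cycle of a pattern as part of its structure.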

The principal components of the augmented state vector (17) will be called the cyclostationary principal components. The CSPCs should be distinguished from the cyclostationary EOFs (CSEOFs) discussed in Kim et al. (1996), which are essentially principal components of the time-lagged data, modified to account for cyclostationarity. Thus, CSEOFs may be interpreted as an extension of singular spectrum analysis to cyclostationary processes [the integral definition of cyclostationary EOFs in Kim et al. (1996) is precisely the integral definition of the Karhunen–Loève expansion in the time domain, which is the continuous generalization of singular spectrum analysis, as discussed in Ghil et al. (2002)]. CSPCs differ from extended EOFs (EEOFs) or, equivalently, multichannel singular spectrum analysis [M-SSA; see von Storch and Zwiers (1999) for discussion of these techniques] in two ways: the *T*-consecutive state vectors of CSPCs do not overlap with each other, in contrast to EEOF and M-SSA (i.e., the index *r* is an integer for CSPCs and a fraction for EEOFs), and the period *T* of CSPCs is imposed by the periodicity of the underlying process, whereas the “period” *T* in M-SSA and EEOF is chosen according to some statistical criterion. Kim and Wu (1999) compare the two methods and conclude that CSPCs, which they call periodically extended EOFs, “may be an excellent and inexpensive alternative for the CSEOF technique,” but raise questions about their sensitivity to sampling errors. We present evidence that CSPCs perform exceptionally well at capturing the seasonal cycle of low-frequency variability.

To compute CSPCs, we first compute the zero-lag covariance matrix of the augmented vector (17), with *T* = 4, corresponding to the number of seasons per year, and with *r* an integer between 1 and 100, indicating the year in the period 1899–1998 inclusive. Each element of the augmented vector is multiplied by the square root of the cosine of latitude, so that the sum of squares corresponds to an area-weighted sum of squares of the field. This weighting is inverted whenever plots of the fields are presented. The eigenvectors of the covariance matrix are then computed to give 𝗖_{0} = 𝗘**Θ**𝗘^{T}, where 𝗘 is a unitary matrix whose columns give the spatial structure associated with the CSPCs, and **Θ** is a diagonal matrix whose *k*th diagonal element gives the variance explained by the *k*th CSPC.

If more than one ensemble member is available, then the sample mean is computed by averaging over all ensemble members, and the covariance matrix is averaged over all ensemble members. Using only one ensemble member gave results virtually indistinguishable from those obtained using all available ensemble members, indicating that differences due to ensemble size are minor.

### d. Truncation criterion

The most important parameter in OPA, in terms of its influence on the results, is the number of principal components used to represent the reduced subspace in which the optimal solution will be sought. Choosing too few principal components will lead to failure to represent essential characteristics of the low-frequency variability, while choosing too many principal components will lead to overfitting—that is, to fitting variability due to sampling errors. In general, the decorrelation times are biased upward, or equivalently, the power spectra are biased toward low frequency, with the bias increasing with the number of principal components.

Consider the problem of predicting the lag-window-weighted sum of the state over an averaging period given the value of **x**_{t} at the midpoint within the averaging period, as in the regression model (18), where *e*_{t} represents the error of the prediction and 𝗔 is an unknown square matrix. Since the lag window *r*_{τ} decays with the absolute value of *τ*, the sum on the left is weighted toward values near **x**_{t}. The least squares estimate of the linear operator can be expressed in terms of the fingerprints and OPPs: if the fingerprints are collected as the columns of a matrix [*f*_{1} *f*_{2}, . . . , *f*_{K}], and similarly for the OPPs, 𝗤 = [*q*_{1} *q*_{2}, . . . , *q*_{K}], then it can be shown that the fingerprints and OPPs diagonalize this estimate.

The above considerations lead us to propose a cross-validation method for deciding the number of PCs in OPA. See von Storch and Zwiers (1999) for a discussion of the basic methodology. The only modification needed for our purposes is that instead of “leave-one-out” cross validation, we apply “leave 2*M* + 1 out” cross validation to ensure that the predictands in (18) for the training and verification datasets do not have overlapping windows. The mean-square error of all predictions will be called the cross-validated mean-square error. A plot of the cross-validated mean-square error as a function of the number of predictors for the Hadley Centre–Climatic Research Unit (HadCRUT2) time series (not shown) reveals a minimum at around 10 predictors, with the truncation point *M* = 20. For this calculation, we added a constant “predictor” to the state vector **x**_{t} to account for the intercept term; without this term the cross-validation procedure is subject to a bias that more than doubles the apparent optimum number of predictors. Precisely the same minimum was found using the truncation point *M* = 10. We conclude that the appropriate number of principal components in OPA for the HadCRUT2 time series is 10. A similar analysis for other time series revealed that the optimal number of components is almost always larger than 10. For simplicity, we have chosen to use 10 principal components for all OPA. This level of truncation captures 55% of the variance in observations and 43%–62% of the variance in all simulations.
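The “leave 2*M* + 1 out” splitting can be sketched as below. This reflects one reading of the procedure and is an assumption: for each verification time *t*, the 2*M* + 1 times within *M* steps of *t* are withheld, and only times with a complete averaging window are used. The function name and arguments are hypothetical.

```python
def leave_block_out_splits(n_times, M):
    """Yield (train_indices, test_index) pairs for blocked cross validation.

    Only times with a full window of 2M + 1 values are used as centers.
    For each verification time t, every center within M steps of t
    (the block t-M, ..., t+M) is withheld from the training set.
    """
    centers = range(M, n_times - M)
    for t in centers:
        train = [s for s in centers if abs(s - t) > M]
        yield train, t

splits = list(leave_block_out_splits(n_times=100, M=20))
train0, test0 = splits[0]
print(len(splits), test0, len(train0))   # 60 splits; first test time is 20
```

For each split, the regression would be fitted on the training times for every candidate number of principal components, and the squared prediction error at the verification time accumulated to form the cross-validated mean-square error.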

In general, the optimal persistence patterns are insensitive to the number of principal components, but the time series and associated decorrelation times are sensitive to the number of principal components. Specifically, as the number of principal components increases, the decorrelation times tend to increase and the time series become smoother and more dominated by low-frequency power. Neither the optimal persistence patterns nor the associated time series are sensitive to the choice of lag window *r*_{τ} and truncation point *M*.

The amount of variance “explained” by an optimal persistence pattern can be computed from the fact that the time series associated with OPPs are uncorrelated, and hence the total variance can be decomposed into a sum of variances contributed by the individual components. Assuming the CSPCs are normalized to unit variance, the variance explained by the *k*th OPP *q*_{k} is *q*_{k}^{T}**Θ***q*_{k}, where **Θ** is the diagonal eigenvalue matrix of 𝗖_{0} defined at the end of section 2c. The resulting variance is divided by the *total* variance (including the truncated variance) to compute the fraction of variance explained by the *k*th OPP.

### e. Statistical significance

It is useful to compare the decorrelation times from a given time series to the distribution of values that would occur under the null hypothesis of Gaussian white noise. Remarkably, the decorrelation time is invariant with respect to nonsingular, linear transformations in the truncated EOF subspace (see DelSole 2001). It follows that the sampling distribution of the decorrelation times is independent of the detailed spatial covariances of the system. In particular, we may adopt the null hypothesis that the random process is white noise in *both space and time,* and the resulting sampling distribution will be relevant to all white noise processes of the same size with arbitrary spatial correlations. Thus, the appropriate sampling distribution of decorrelation time can be computed by Monte Carlo methods simply by generating a single realization of Gaussian white noise and using these values to fill the state vectors in both space and time, and then computing the decorrelation times and archiving their distribution. We have performed 10 000 trials of this procedure for the appropriate dimensions of our problem and display the 99% confidence limit in the appropriate figures. Virtually identical confidence limits were obtained with 1000 trials, indicating convergence of the Monte Carlo estimates.
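The Monte Carlo procedure can be illustrated for the simplest case of a single time series. This sketch is a deliberate simplification and not the calculation reported here: with several principal components, each trial would instead solve the OPA eigenvalue problem and record the ordered decorrelation times. The Parzen window form, the trial count, and all names are assumptions.

```python
import random

def parzen(tau, M):
    """Parzen lag window (standard form) with truncation point M."""
    u = abs(tau) / M
    if u <= 0.5:
        return 1.0 - 6.0 * u ** 2 + 6.0 * u ** 3
    return 2.0 * (1.0 - u) ** 3 if u <= 1.0 else 0.0

def decorrelation_time(x, M):
    """Lag-window estimate of the decorrelation time of a scalar series."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    total = 1.0
    for k in range(1, M + 1):
        ck = sum(d[t] * d[t + k] for t in range(n - k))
        total += 2.0 * parzen(k, M) * ck / c0
    return total

def null_threshold(n, M, trials, level, rng):
    """Approximate empirical quantile of the estimate under white noise."""
    values = sorted(
        decorrelation_time([rng.gauss(0.0, 1.0) for _ in range(n)], M)
        for _ in range(trials)
    )
    return values[min(trials - 1, round(level * trials))]

rng = random.Random(42)
threshold = null_threshold(n=100, M=20, trials=500, level=0.99, rng=rng)

# A series with a trend should exceed the white-noise threshold.
trend = [0.05 * t + rng.gauss(0.0, 1.0) for t in range(100)]
is_significant = decorrelation_time(trend, 20) > threshold
print(round(threshold, 2), is_significant)
```

Repeating the calculation with a different number of trials gives a rough check on whether the estimated threshold has converged.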

## 3. Models and data

The models analyzed in this study are a subset of those included in the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4). The complete list of models used in this study is given in Table 1. The models were chosen based on data availability. Documentation of the various models and their forcings can be found at various Web sites related to IPCC (including http://www-pcmdi.llnl.gov/ and www.ipcc.ch), as well as the official Web site for each model’s respective institution.

This paper analyzes only the “climate of the twentieth century” scenario, designated 20c3m, for the period 1899–1998. The primary variable examined in this study is temperature of the atmosphere near the surface (TAS). In addition, only 3-month averages are considered in this study, starting from January–March (JFM) and continuing to October–December (OND) of each year. Each 3-month average will be referred to as a “season.”

The observational land surface temperature record used for verifying model simulations is the 5° × 5° gridded HadCRUT2 dataset compiled jointly by the Climatic Research Unit (CRU) and the Met Office’s Hadley Centre (Jones and Moberg 2003; available from http://www.cru.uea.ac.uk/cru/data/temperature/). Only grid points that have at least 1 month of nonmissing data per season in the entire period 1899–1998 are retained. This stringent criterion ensures a continuous seasonal anomaly record at each grid point. This criterion was adopted to avoid questions related to the influence of missing data on our results. Although this criterion leaves only 301 grid points for analysis, we present evidence in section 4 that it is adequate to extract the dominant low-frequency signals for the global climate system.

The observational sea surface temperature record used for verifying model skin temperatures (TS) over the ocean is the 1° × 1° gridded First Hadley Centre SST dataset (HadSST1), compiled at the Hadley Centre and documented in Rayner et al. (2003). Only the ocean region between 30°S and 50°N is considered, within which there are no “missing values.”

## 4. Results

The optimal decorrelation times for the IPCC simulations and observations (HadCRUT2) are shown in Fig. 1. Comparison with the 1% significance level for decorrelation time indicated by the dashed line in Fig. 1 shows that only the first two optimal decorrelation times for HadCRUT2 are statistically distinguishable from white noise. Furthermore, the two decorrelation times for HadCRUT2 tend to be larger than those for the IPCC simulations, suggesting that simulations tend to underestimate low-frequency variability in the climate system. The leading decorrelation time from the IPCC simulations is statistically significant, but the second decorrelation time is statistically significant only for a few models. Interestingly, the *trailing* optimal decorrelation times in all simulations are statistically significant, indicating that the simulations tend to be more devoid of white noise components than observations. However, these components are not physically significant owing to their low decorrelation times.

The time series for the leading OPP in observed surface temperature (HadCRUT2) is shown in Fig. 2. The time series has a clear secular trend from 1910 to 1950 and 1980 to 1998, with a leveling off, or perhaps a decrease, between 1950 and 1980. These features are robust in the sense that they do not depend strongly on the analysis parameters—the same result is found for truncation points *M* = 5 to *M* = 20, and for EOF truncations from 5 to 50. The main sensitivity is that, as the number of EOFs increases, the decorrelation time and signal-to-noise ratio increase and the time series becomes smoother.

The seasonal cycle of the leading OPP from HadCRUT2 is shown in Fig. 3. The spatial structure for JFM is similar to that shown in Schneider and Held (2001) for January, including the enhanced warming in northwest Canada and Asia, cooling in the southeast United States, warming over Japan, and cooling in the eastern Mediterranean. The main seasonal variation in the leading OPP consists of a change from a positive winter anomaly to a negative summer anomaly over eastern Europe. Otherwise, the structure of the leading OPP remains essentially the same throughout the annual period, with the amplitude peaking in winter. These results are consistent with those noted by Schneider and Held (2001) for January and July. However, Schneider and Held (2001) found a second structural change in their leading pattern over North America. This latter change is found to occur in our second OPP, as we now discuss.

The time series for the second OPP in observed surface temperature is shown in Fig. 4. We see that the time series is nonmonotonic, which is an expected consequence of the fact that the OPPs are orthogonal in time and the fact that the leading OPP is primarily monotonic. As with the leading OPP, the spatial structure and the secular oscillation in the time series are robust with respect to the analysis parameters. The seasonal cycle of the second OPP, shown in Fig. 5, undergoes significant structural change, especially in eastern Europe and western North America. In particular, the second OPP corresponds to a winter warming and summer cooling during the last half-century in Europe and western North America. This structural change is consistent with that discussed in Schneider and Held (2001) between January and July.

The third OPP was found to be sensitive to the analysis parameters, consistent with the statistically insignificant decorrelation time for this component, and hence is not reported here. Without other information, it is impossible to determine whether these two components arise from natural variability, or from anthropogenic, astronomical, or geological forcing.

To gain insight into how well the leading OPPs capture the kinds of low-frequency variability of interest to climate research, we compare the local trends computed from unfiltered observations with those from a truncated set of OPPs. The trends computed for JFM during the period 1901–98 from unfiltered observations, and from a filtered time series generated by retaining only the leading two OPPs, are shown in Fig. 6. We see that the leading two OPPs give local trends that are virtually indistinguishable from the trends computed from unfiltered observations. This consistency holds for all seasons: the pattern correlation between the two trend maps exceeds 0.98 in all seasons. We also have computed trends for the other multidecadal periods 1910–45, 1946–75, and 1976–98, as considered in Folland et al. (2001), and found that the average pattern correlations are 0.77, 0.55, and 0.60, respectively. In these latter periods, the magnitude of the trends tends to be underestimated by the leading two OPPs. The consistency between the trend maps leads us to conclude that the leading OPPs capture the essential space–time structure of low-frequency variability in observations. We draw attention to the fact that the leading two OPPs provide a much more efficient and comprehensive representation of the space–time structure of low-frequency variability than multidecadal trends or averages, which are sensitive to the choice of end points and still require dimension reduction methods. We also compared time averages over the same multidecadal periods (not shown) and found that the first two OPPs could reproduce the dominant structure and magnitudes of the time means.

We have computed the leading OPP of land surface temperature for each climate simulation. The time series of the leading OPP in each simulation is shown in Fig. 7. Different lines in the same panel correspond to different ensemble members. Most time series exhibit a clear secular trend, especially in the last half-century. About a third of the models show a slope change sometime between 1950 and 1970. Note that the time series for individual ensemble members are correlated with each other. This correlation between ensemble members is not imposed by OPA: OPA merely maximizes a ratio of power spectra, where the power spectra were computed for each ensemble member separately and then averaged. By assumption, natural variability is very unlikely to occur with the same phase in different ensemble members from the same model. Therefore, the fact that the leading optimal persistence pattern varies with the same phase in different ensemble members, and even between different model simulations, is strong evidence that the leading OPP is associated with a forced response of the model. Of course, we cannot eliminate the possibility that the correlation between ensemble members may be caused by climate drift or other issues, except by analyzing simulations under different scenarios. The spatial structure of the leading OPP in each simulation (not shown) reveals a number of similarities and differences, to be discussed below. However, all leading patterns were strictly nonnegative, implying (in conjunction with Fig. 7) that the leading OPP in simulations corresponds to global warming.

A crude indication of the similarities and differences in the spatial structure of the leading simulated OPPs can be seen in Fig. 8, which shows the mean, standard deviation, and their squared ratio at each grid point for JFM. The last quantity can be interpreted as a “signal-to-noise ratio” of the leading OPP, with large values indicating strong agreement among the simulations. The result shows that the simulated surface temperatures exhibit most agreement in northern Canada and in central Europe, where the average amplitude also is large. Mean values over Scandinavia and Iceland also are large, but the standard deviation over those locations also is relatively large, indicating that the models disagree more over those areas relative to other areas. The signal-to-noise ratios are much smaller during July–September (JAS) and April–June (AMJ). The results for OND are very similar to those for JFM, except shifted northward everywhere by about 10° latitude.

The second OPP of simulated land surface temperature was found to be sensitive to the analysis parameters, consistent with the low statistical significance of its decorrelation time. Furthermore, the correlation of the time series between different ensemble members of the same model is much smaller than that for the leading OPP. These considerations suggest that the second OPP is dominated by natural variability.

To gain insight into how the representation of data affects the results, each model simulation was interpolated onto the 5° × 5° observational grid and “missing grid points” were masked out. The OPPs were then computed from the resulting data, just as we computed them for HadCRUT2, including the computation of CSPCs from the new grid. The time series of the leading OPPs computed from the model grid and the new grid are shown in Fig. 9. In most cases the time series strongly resembles the OPP time series derived from the full model grid. For instance, the correlation coefficient exceeds 0.85 in over half the cases, with models 8 and 10 being notable exceptions. The spatial structure of the leading OPP from the observational grid (not shown) also gives a reasonable representation of the OPP from the full model grid, for the points in common between the two grids. The fact that the leading OPP is nearly the same whether it is derived from the full model grid or from the reduced observational grid suggests that the observational grid can capture a significant fraction of the dominant low-frequency variability in the global land–climate system.

If the leading OPP is due primarily to the forced signal, then it is of interest to characterize how well each of the models simulates this pattern. To do this, the fingerprint of the leading OPP derived from HadCRUT2 was projected onto each of the simulated fields (as interpolated onto the observational grid). The resulting time series are shown in Fig. 10. In all cases, the simulations tend to have stronger high-frequency variance than observations. We reiterate that OPA gives biased estimates of signal-to-noise ratios, so that the OPP time series often will have too little high-frequency variance and the difference in variances may not be real. However, this bias in the observational time series cannot explain the different noise variances in the different simulations. The correlation coefficients between the observed and simulated amplitudes are summarized in Table 2. The 1% significance level of the correlation coefficient for 100 independent samples is about 0.23. The table shows that about a third of the simulations have statistically significant correlations. The fact that the correlations for the other models are statistically marginal or insignificant suggests that most models are not able to capture the spatial structure of the observed low-frequency variability.
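The projection and the significance threshold quoted above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this study: the function names, the treatment of missing points, and the synthetic fields are hypothetical, and the threshold is computed with the Fisher z-transform approximation for a one-sided test. The one-sided choice is an assumption; it happens to give a value close to the 0.23 quoted in the text for 100 independent samples.

```python
import math
from statistics import NormalDist

import numpy as np

def project(fields, fingerprint):
    """Project each time step of `fields` (n_time, n_space) onto `fingerprint`.

    Grid points that are missing (NaN) in either input are skipped.
    """
    valid = np.isfinite(fingerprint) & np.all(np.isfinite(fields), axis=0)
    return fields[:, valid] @ fingerprint[valid]

def critical_correlation(n_samples, alpha=0.01):
    """Approximate one-sided critical correlation (Fisher z approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return math.tanh(z / math.sqrt(n_samples - 3))

r_crit = critical_correlation(100)  # approximately 0.23

# Synthetic example (assumption): a simulated field that contains the
# fingerprint modulated by a trend, plus white noise, with one missing point.
rng = np.random.default_rng(1)
n_time, n_space = 100, 50
fingerprint = rng.standard_normal(n_space)
fingerprint[0] = np.nan
observed_amplitude = np.linspace(-1.0, 1.0, n_time)
pattern = np.nan_to_num(fingerprint)
fields = np.outer(observed_amplitude, pattern)
fields += 0.5 * rng.standard_normal((n_time, n_space))

simulated_amplitude = project(fields, fingerprint)
r = np.corrcoef(observed_amplitude, simulated_amplitude)[0, 1]
is_significant = r > r_crit
```

Note that the threshold assumes independent samples; time series dominated by low-frequency variability have fewer effective degrees of freedom, which would raise the critical value.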

The correlation of the global average temperature (as represented on the observational grid, omitting points with missing data) between HadCRUT2 and each of the model simulations is given in Table 2. We see that these correlations tend to be stronger than those associated with the observed leading OPP. Also, the leading OPP in each simulation tends to be of single sign, and therefore contributes strongly to the global average signal. Indeed, the time series of the leading OPP in each model is strongly correlated with the global average time series. These results show that while each model exhibits significant low-frequency variability, and the associated time variation is similar to that of the global average temperature, the spatial structure of low-frequency variability can differ from model to model and from observations.

We repeated the above analysis for surface temperatures over the ocean. The EOF truncation criterion again suggested an optimum number of 10 principal components. However, the number of statistically significant decorrelation times increased from 2 to 5 for the observational record (HadSST1), and decorrelation times for most model simulations were statistically significant. The first 5 optimal persistence patterns were found to recover local linear trends in all multidecadal periods very well (all pattern correlations exceeded 0.72, and more than half of them exceeded 0.9). The mean, standard deviation, and squared ratio of all leading OPPs in JFM are shown in Fig. 11. The consensus pattern shows warming in the Tropics, with the signal-to-noise ratio suggesting most agreement in warming of the Indian Ocean.

## 5. Summary and discussion

Many climate detection studies employ ad hoc filtering and dimension reduction methods, often without explicitly stating reasons for adopting such methods. These methods often make use of multidecadal linear trends or means, which ignore nonlinear variations and/or spatial coherence, and which in addition must be supplemented by dimension reduction methods. This paper suggested that a technique known as optimal persistence analysis (OPA) may provide an attractive alternative to the above methods. OPA decomposes a multivariate time series into an ordered set of uncorrelated components such that the first component varies on the longest time scale, the second varies on the longest time scale subject to being uncorrelated with the first, and so on. In this procedure, “time scale” is measured by a weighted average time-lagged autocorrelation function, or equivalently, by the ratio of estimated zero-frequency power to total power. The decomposition accounts for spatial coherence in the time series and does not prejudice the time variation to be a prescribed function of time, unlike linear trend analysis. The present paper goes beyond previous work by DelSole (2001) by clarifying the connection between OPA and spectral analysis, and developing procedures for dealing with finite samples, including a method for deciding the appropriate EOF truncation prior to OPA and for testing the statistical significance of the decorrelation times. Furthermore, this paper showed that the use of cyclostationary principal components (CSPCs) allowed OPA to efficiently represent low-frequency variability in different seasons simultaneously.
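The decomposition summarized above can be sketched as an eigenvalue problem. The code below is a minimal illustration, not the implementation used in this paper. It assumes a Parzen lag window as the weighting of the autocorrelation function, a user-chosen maximum lag, and a plain EOF truncation; the function names and the synthetic data are hypothetical, and the significance tests and the cyclostationary extension are omitted. After the leading principal components are whitened, the weighted sum of lagged covariances plays the role of the estimated zero-frequency power, and its eigenvectors give uncorrelated components ordered by decorrelation time.

```python
import numpy as np

def parzen_window(max_lag):
    """Parzen lag window for lags 0..max_lag (an assumed choice of weights)."""
    u = np.arange(max_lag + 1) / max_lag
    return np.where(u <= 0.5, 1 - 6 * u**2 + 6 * u**3, 2 * (1 - u) ** 3)

def optimal_persistence(data, n_pcs, max_lag):
    """Sketch of OPA on `data` with shape (n_time, n_space).

    Returns decorrelation times (descending), the component time series,
    and the component weights expressed in whitened PC space.
    """
    anomalies = data - data.mean(axis=0)
    n_time = anomalies.shape[0]
    # EOF truncation and whitening: the retained PCs are uncorrelated
    # and have unit variance, so the lag-0 covariance is the identity.
    u, _, _ = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :n_pcs] * np.sqrt(n_time)
    # Weighted sum of symmetrized lagged covariances over lags -M..M.
    weights = parzen_window(max_lag)
    weighted_cov = np.eye(n_pcs)
    for lag in range(1, max_lag + 1):
        cov = pcs[lag:].T @ pcs[:-lag] / n_time
        weighted_cov += weights[lag] * (cov + cov.T)
    times, vecs = np.linalg.eigh(weighted_cov)
    order = np.argsort(times)[::-1]
    return times[order], pcs @ vecs[:, order], vecs[:, order]

# Synthetic example (assumption): one trending pattern buried in white noise.
rng = np.random.default_rng(2)
n_time, n_space = 200, 20
trend = 3.0 * np.linspace(-1.0, 1.0, n_time)
pattern = rng.standard_normal(n_space)
data = np.outer(trend, pattern) + rng.standard_normal((n_time, n_space))

times, components, vectors = optimal_persistence(data, n_pcs=5, max_lag=20)
leading_corr = abs(np.corrcoef(components[:, 0], trend)[0, 1])
```

Because the eigenvectors are orthonormal in the whitened space, the resulting components are mutually uncorrelated with unit variance, and the sign of each component is arbitrary. No functional form is prescribed for the leading component; in this example it recovers the trend only because the trend is the most persistent signal in the synthetic data.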

Application of OPA to the observed land surface temperature, as represented by the HadCRUT2 time series during the 100-yr period of 1899–1998, suggested that only the first two components were statistically distinguishable from white noise. The first OPP indicated enhanced warming for all seasons in northwestern Canada, western Europe, and Japan, and cooling in the southeastern United States. This component exhibits a secular trend throughout the period 1899–1998, interrupted by a trend with reversed sign in the period around 1946–75. The second OPP captured low-frequency variability in the seasonal cycle, with eastern Europe and western North America exhibiting enhanced warming during winter and enhanced cooling during summer for the second half-century. The first two OPPs also captured the time average surface temperature and the local linear trends in each season in selected subcentennial periods, demonstrating that OPPs can capture the type of climate signals examined in climate detection studies.

Application of OPA to the land surface temperature simulated by 17 climate models, each run with forcing characteristic of the twentieth century (i.e., scenario 20c3m), suggested that while most components were statistically distinguishable from white noise, only the first two, and in most cases only the first, were physically relevant in the sense of having large decorrelation times. The time series of the leading OPP in each simulation was dominated by a secular trend (with a few exceptions). The time series between different ensemble members of the same model tended to be highly correlated. This fact is significant because, by assumption, natural variability is unlikely to occur with the same phase in different ensemble members from the same model. Therefore, the fact that the leading optimal persistence pattern varies with the same phase in different ensemble members, and even between different model simulations, is strong evidence that the leading OPP is associated with a forced response of the model. Of course, to be certain of this conclusion we must compare simulations with and without certain forcings.

The leading OPP in each model was strictly nonnegative, implying that these components are associated with global warming. According to the signal-to-noise ratio, the simulations tend to agree in their predictions of warming in central Europe and western Canada during winter but tend to disagree in their predictions of warming farther north, as in Iceland and Scandinavia. Thus, *the best agreement in the prediction of warming among the models also happens to be where warming has in fact been observed to occur.*

The leading OPP in each simulation was essentially the same whether it was computed on the respective model grid or on the HadCRUT2 grid with missing data masked out. The similarity, in both space and time, between the OPPs computed from different grids suggests that the observation grid may not pose a serious barrier to capturing the dominant structure of low-frequency variability of the global land–climate system. The leading OPP also was essentially the same whether it was computed using all available ensemble members or just a single ensemble member, indicating that differences due to ensemble size are not significant. This result raises questions about the benefits gained from large ensemble sizes if OPA, or similar methods, can extract the dominant low-frequency variability from a single realization.

As a first analysis into the similarities and differences between the observed and simulated optimal persistence patterns, we projected the observed leading OPP onto the simulations. This projection gives a time series corresponding to each simulation, which can be compared to the observed time series to judge how well each simulation reproduces the time variability of the leading component of low-frequency variability in the observations. The resulting time series, shown in Fig. 10, reveal that most simulations underestimate the low-frequency variability, or equivalently, overestimate the high-frequency variability, although this difference might be due to an inherent bias in OPA. About half of the simulations generated time series that were correlated with the observed counterpart at the 1% significance level; about a third of the simulations generated time series that were negatively correlated with the observed counterpart. These results generally indicate that while each model exhibits similar temporal variation in low-frequency variability, the spatial structure of this variability differs from model to model and from observations in at least a third of the models. A similar analysis for ocean surface temperature yielded similar results.

The analysis discussed in this paper clearly can be expanded in several directions. The question of how to compare optimal persistence patterns between different datasets was only briefly touched upon here. For instance, it is unclear which subspace should be used, and whether rotations and translations that optimally match a vector with other vectors in a subspace should be explored. It would be most interesting to apply OPA to climate simulations run under different scenarios. Comparison between the results should give interesting insight into the differences between forced and unforced low-frequency variability.

## Acknowledgments

I thank J. Shukla for suggesting this project and J. Kinter for making available the necessary resources for this project. I thank Jennifer Adams, Brian Doty, Mike Fennesy, Dan Paolino, and Robert Burgman for their expert assistance in acquiring the data used in this study, and Ben Kirtman and Tapio Schneider for many stimulating discussions regarding the analysis procedure and interpretation of results. I am especially indebted to the reviewers of this paper for penetrating comments, which led to substantial clarifications, and to Jennifer Adams for helpful suggestions on the presentation. I acknowledge the international modeling groups for providing their data for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting and archiving the model data, the JSC/CLIVAR Working Group on Coupled Modelling (WGCM) and their Coupled Model Intercomparison Project (CMIP) and Climate Simulation Panel for organizing the model data analysis activity, and the IPCC WG1 TSU for technical support. The IPCC Data Archive at Lawrence Livermore National Laboratory is supported by the Office of Science, U.S. Department of Energy. This research was supported by the National Science Foundation (ATM0332910), National Aeronautics and Space Administration (NNG04GG46G), and the National Oceanic and Atmospheric Administration (NA04OAR4310034).

## REFERENCES

Allen, M. R., and S. F. B. Tett, 1999: Checking for model consistency in optimal fingerprinting. *Climate Dyn.*, **15**, 419–434.

Bell, T. L., 1986: Theory of optimal weighting of data to detect climatic change. *J. Atmos. Sci.*, **43**, 1694–1710.

Brockwell, P. J., and R. A. Davis, 1996: *Time Series: Theory and Methods*. Springer-Verlag, 577 pp.

Chatfield, C., 1989: *The Analysis of Time Series: An Introduction*. Chapman and Hall, 241 pp.

DelSole, T., 2001: Optimally persistent patterns in time-varying fields. *J. Atmos. Sci.*, **58**, 1341–1356.

Folland, C. K., and Coauthors, 2001: Observed climate variability and change. *Climate Change 2001: The Scientific Basis*, J. T. Houghton et al., Eds., Cambridge University Press, 91–181.

Gardner, W. A., and L. E. Franks, 1975: Characterization of cyclostationary random processes. *IEEE Trans. Inf. Theory*, **21**, 4–14.

Ghil, M., and Coauthors, 2002: Advanced spectral methods for climatic time series. *Rev. Geophys.*, **40**, 1003, doi:10.1029/2000RG000092.

Gladysev, E. G., 1961: Periodically correlated random processes. *Soviet Math.*, **2**, 385–388.

Hasselmann, K., 1979: On the signal-to-noise problem in atmospheric response studies. *Meteorology over the Tropical Oceans*, D. B. Shaw, Ed., Royal Meteorological Society, 251–259.

Hasselmann, K., 1993: Optimal fingerprints for the detection of time-dependent climate change. *J. Climate*, **6**, 1957–1971.

Hegerl, G. C., H. von Storch, K. Hasselmann, U. Cubasch, B. D. Santer, and P. D. Jones, 1996: Detecting anthropogenic climate change with an optimal fingerprint method. *J. Climate*, **9**, 2281–2306.

Hegerl, G. C., K. Hasselmann, U. Cubasch, J. F. B. Mitchell, E. Roeckner, R. Voss, and J. Waszkewitz, 1997: Multi-fingerprint detection and attribution of greenhouse gas- and aerosol forced climate change. *Climate Dyn.*, **13**, 613–634.

International Ad Hoc Detection and Attribution Group, 2005: Detecting and attributing external influences on the climate system: A review of recent advances. *J. Climate*, **18**, 1291–1314.

Jenkins, G. M., and D. G. Watts, 1968: *Spectral Analysis and Its Applications*. Holden-Day, 525 pp.

Jones, P. D., and A. Moberg, 2003: Hemispheric and large-scale surface air temperature variations: An extensive revision and an update to 2001. *J. Climate*, **16**, 206–223.

Jones, R. H., and W. M. Brelsford, 1967: Time series with periodic structure. *Biometrika*, **54**, 403–408.

Kim, K-Y., and Q. Wu, 1999: A comparison study of EOF techniques: Analysis of nonstationary data with periodic statistics. *J. Climate*, **12**, 185–199.

Kim, K-Y., G. R. North, and J. Huang, 1996: EOFs of one-dimensional cyclostationary time series: Computations, examples, and stochastic modeling. *J. Atmos. Sci.*, **53**, 1007–1017.

McLachlan, G. J., 1992: *Discriminant Analysis and Statistical Pattern Recognition*. Series in Probability and Mathematical Statistics, Wiley, 544 pp.

Mitchell, J. F. B., D. J. Karoly, G. C. Hegerl, F. W. Zwiers, M. R. Allen, and J. Marengo, 2001: Detection of climate change and attribution of causes. *Climate Change 2001: The Scientific Basis*, J. T. Houghton et al., Eds., Cambridge University Press, 695–738.

North, G. R., and M. Stevens, 1998: Detecting climate signals in the surface temperature record. *J. Climate*, **11**, 563–577.

North, G. R., K. Y. Kim, S. S. P. Shen, and J. W. Hardin, 1995: Detection of forced climate signals. Part I: Filter theory. *J. Climate*, **8**, 401–408.

Rayner, N. A., D. E. Parker, E. B. Horton, C. K. Folland, L. V. Alexander, D. P. Rowell, E. C. Kent, and A. Kaplan, 2003: Global analyses of SST, sea ice, and night marine air temperature since the late nineteenth century. *J. Geophys. Res.*, **108**, 4407, doi:10.1029/2002JD002670.

Schneider, T., and I. M. Held, 2001: Discriminants of twentieth-century changes in earth surface temperatures. *J. Climate*, **14**, 249–254.

Tett, S. F. B., J. F. B. Mitchell, D. E. Parker, and M. R. Allen, 1996: Human influence on the atmospheric vertical temperature structure: Detection and observations. *Science*, **274**, 1170–1173.

Tett, S. F. B., P. A. Stott, M. A. Allen, W. J. Ingram, and J. F. B. Mitchell, 1999: Causes of twentieth century temperature change. *Nature*, **399**, 569–572.

Troutman, B. M., 1979: Some results in periodic autoregression. *Biometrika*, **66**, 219–228.

von Storch, H., and F. W. Zwiers, 1999: *Statistical Analysis in Climate Research*. Cambridge University Press, 494 pp.

Table 1. List of models used in this study, their numerical identification, the associated institutes, and the countries in which the institutes are located.

Table 2. Summary of the following diagnostics: CC-TAVE: correlation between simulated and observed global average surface temperature on the 5° × 5° observational grid, with missing data mask; CC-OPP1: correlation between observed leading OPP and its projection onto simulations (the two time series are shown in Fig. 10); T1(1) and T1(2): decorrelation time (in years) of the first (1) and second (2) optimal persistence patterns; ENSM: number of ensemble members used in the OPA.