## Abstract

This paper proposes an optimal method for estimating time-dependent climate change signals from general circulation models. The basic idea is to identify vectors that maximize the mean-square detection statistic derived from optimal fingerprinting techniques. The method also provides an objective and systematic procedure for identifying the limit to which a signal can be restricted in space and time without losing detectability. As an illustration, the method is applied to the Coupled Model Intercomparison Project, phase 3 multimodel dataset to determine the continental seasonal-mean anomaly in surface air temperature and precipitation that is most detectable, on average, in these models. Anomalies in seasonal-mean surface air temperature are detectable in all seasons by almost all models on all continents but Europe; seasonal-mean anomalies over Europe are undetectable for some models, though this does not preclude other expressions of the signal, such as those that include longer time averages or time-lag information, from being detectable. Detectability in seasonal-mean temperature is found not only for multidecadal warming trends but also for cooling after major volcanic eruptions. In contrast, seasonal-mean precipitation anomalies are detectable in only a few models for averages over 5 yr or more, suggesting that the signal should include more spatiotemporal detail to be detectable across more models. Nevertheless, of the precipitation anomalies that are detectable, the signal appears to be of two characters: a systematic trend and enhanced frequency of extreme values. These results derived from twentieth-century simulations appear to be consistent with previous studies based on twenty-first-century simulations with larger signal-to-noise ratios.

## 1. Introduction

Numerous studies have concluded that most of the warming observed since the middle of the twentieth century is due to human influence and that further warming over the next century is expected (Trenberth et al. 2007; Hegerl et al. 2007; Christensen et al. 2007). Local governments and policy makers can use this information to plan for certain well-established consequences of global warming, such as rising sea levels, but other consequences, such as the alteration of the local climate, require more information beyond slow, global changes. Unfortunately, smaller spatial and temporal scales are characterized by stronger natural fluctuations that obscure the climate changes due to humans and produce larger uncertainties in future projections. The question arises as to what are the shortest space and time scales for which detection, attribution, and prediction of climate changes are statistically justified.

Typically, detection and attribution of climate change are established on the basis of optimal fingerprinting techniques. These techniques are optimal in the sense that they construct a climate change detection variable that maximizes the signal-to-noise ratio, provided the climate change signal one wishes to detect and the statistics of the background unforced variability are prespecified. The best available estimates of these latter quantities come from coupled atmosphere–ocean general circulation models. Leaving aside model errors, the statistics of unforced variability are relatively straightforward to estimate from “control” simulations run without interannual variations in climate forcing. In contrast, estimation of time-dependent climate change signals is generally a nontrivial problem, owing to the fact that climate models simulate these signals embedded in a background of internal variability. Thus, identification of time-varying climate change signals already constitutes a signal detection problem that must be solved before optimal fingerprinting techniques can be applied to observations (Hasselmann 1993). Moreover, inaccurate specification of the climate change signal generally reduces the signal-to-noise ratio, raising the concern that some climate changes may be undetectable merely because of poor estimation of the signal.

The problem of identifying climate change signals is especially relevant to “event attribution” studies that attempt to quantify the change in probability of certain events due to anthropogenic and natural forcing. In many event attribution studies, the change in probability is quantified even though the exact climate change signal to which the event is linked has not been shown to be detectable. As one typical example among many, Stott et al. (2004) concluded that anthropogenic increases in greenhouse gas concentration and other pollutants very likely doubled the risk of European heat waves during summer. However, Stott et al. (2004) did not demonstrate that warming over Europe was detectable in a single summer but rather that a particular sequence of summer-mean temperature patterns evolving over decades was detectable. Such multidecadal climate change signals are utilized because they have much larger signal-to-noise ratios than seasonal means alone, due to enhanced filtering of internal variability by longer-term averaging and enhanced spatiotemporal information. Nevertheless, detection is formally equivalent to rejecting the null hypothesis that all elements of the climate change signal vanish simultaneously. When the null hypothesis is rejected, all that can be concluded is that at least one of the elements is nonzero, but we do not know which ones. This situation is analogous to testing the significance of regression, in which significance implies that at least one of the coefficients for the predictor variables differs from zero, but further tests are required to establish that particular predictors have significant coefficients. Therefore, demonstrating that a sequence of decadal-mean, seasonal-mean patterns is detectable does not necessarily imply that the signal has nonzero amplitude at the times and places needed to influence the event.

The purpose of this paper is to derive an optimal estimate of time-dependent climate change signals from general circulation models. The basic idea is to identify vectors that maximize the mean-square detection statistic in independent forced simulations. In essence, we first apply fingerprinting techniques to derive an optimal detection statistic assuming the climate change signal is known and then determine the signal vector that maximizes the statistic over independent forced simulations. An attractive feature of the technique is its definitiveness: if the maximized detection statistic is not significant, then no linear combination of variables that were adjusted to find the maximum is significantly detectable in the models. This contrasts with standard detection analysis, in which a signal composed of *T* basis vectors may be undetectable while some other linear combination of basis vectors might be detectable. The proposed technique allows us to objectively decide when the spatial or temporal scale is too short to allow externally forced variability to be distinguished from internal variability. The procedure for finding the desired components and testing their significance is reviewed in section 2. The technique turns out to be related to a discriminant analysis technique used in DelSole et al. (2011) to identify the forced response of global sea surface temperatures. The models and data used to illustrate this framework are described in section 3. The result of applying this technique to detect a signal expressed as a seasonal-mean anomaly in a single period is discussed in section 4. We conclude with a summary and discussion.

## 2. Identifying the space of climate change signals

A basic paradigm in climate change studies is that climate variability can be partitioned into two kinds, forced and unforced. Unforced variability refers to the variability that occurs in the absence of interannual variations of natural or anthropogenic forcing. Such variability arises solely from internal dynamics of the coupled atmosphere–ocean–biosphere–cryosphere system and generally includes variability associated with weather and El Niño. In contrast, forced variability refers to variability that occurs in response to interannual variations of natural and anthropogenic forcing, including volcanic and solar forcing. To the extent that forced variability can be modeled as an independent and additive perturbation to unforced variability, the observed variability can be modeled as

$$
\mathbf{o} = \mathbf{F}\mathbf{a} + \mathbf{u}, \tag{1}
$$

where **o** is the observation vector; the columns of **F** contain the response to particular climate forcings, called climate change signals; **a** is a vector of regression coefficients; and **u** is a random vector representing unforced variability. The distribution of **u** is assumed to be normal with mean *μ*_{U} and covariance matrix **Σ**_{U}, which are both estimated from unforced climate simulations. The assumptions implicit in model (1) are not satisfied exactly in the real climate system, but studies based on less restrictive assumptions tend to confirm results from traditional detection analysis when the signal-to-noise ratio is large, as is the case with surface temperature over the past half century (Hegerl et al. 2006; Hegerl and Zwiers 2011).

It should be recognized that **o** can be chosen with considerable flexibility. In many detection and attribution studies, **o** is expressed as a sequence of temporally smoothed patterns or as a multidecadal linear trend pattern (Hegerl et al. 2000; Stott et al. 2001; Zwiers and Zhang 2003). In such cases, a single vector represents variability over most of the observational record, so that only one such vector exists. However, the observation vector also could be chosen to be, say, a 5-yr mean, in which case multiple realizations of the observation vector could exist in the same historical record (e.g., one vector for every possible 5-yr mean). The technique that follows can be applied to any of these choices of the observation vector.

A goal of climate change studies is to test whether the observed change can be explained by unforced variability. This question can be framed more precisely as testing the hypothesis **a** = 0 in (1). The best estimate of **a**, in a least squares unbiased sense, is known to be

$$
\hat{\mathbf{a}} = \left( \mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{F} \right)^{-1} \mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right). \tag{2}
$$

The generalized likelihood ratio test of the hypothesis **a** = 0 against the alternative **a** ≠ 0 yields the detection statistic

$$
\varphi^{2} = \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right)^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{F} \left( \mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{F} \right)^{-1} \mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right). \tag{3}
$$

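The null hypothesis **a** = 0 is rejected when *φ*^{2} exceeds a predefined statistical threshold based on a chi-squared distribution with *K* degrees of freedom, where *K* is the number of climate change signals (i.e., number of columns of **F**). If only one signal exists (i.e., *K* = 1), then an equivalent statistic is

$$
\varphi = \frac{\mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right)}{\sqrt{\mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{F}}}, \tag{4}
$$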

which has a normal distribution under the null hypothesis *a* = 0. The advantage of the *φ* statistic is that it can account for the sign of the climate change signal. In either case, if the hypothesis **a** = 0 is rejected, then the signal is said to have been “detected.”
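
The detection test above can be illustrated numerically. The following is a minimal sketch, assuming the standard generalized least squares form of the amplitude estimate and the associated quadratic-form statistic; the function names `detection_statistic` and `signed_statistic` and all synthetic numbers are hypothetical and introduced only for illustration.

```python
import numpy as np

def detection_statistic(o, F, mu_u, sigma_u):
    """Return (a_hat, phi2) for signal matrix F and unforced statistics."""
    F = np.atleast_2d(F.T).T                  # allow a single signal passed as a 1-D vector
    w = np.linalg.solve(sigma_u, F)           # Sigma_U^{-1} F without forming the inverse
    gram = F.T @ w                            # F^T Sigma_U^{-1} F
    a_hat = np.linalg.solve(gram, w.T @ (o - mu_u))  # generalized least squares amplitudes
    phi2 = float(a_hat @ gram @ a_hat)        # to be compared with a chi-squared threshold (K dof)
    return a_hat, phi2

def signed_statistic(o, f, mu_u, sigma_u):
    """Signed statistic for a single signal f; standard normal when the amplitude is zero."""
    w = np.linalg.solve(sigma_u, f)
    return float(w @ (o - mu_u) / np.sqrt(f @ w))

# Synthetic example (illustrative numbers only, not taken from the paper)
rng = np.random.default_rng(0)
n = 4
b = rng.standard_normal((n, n))
sigma_u = b @ b.T + np.eye(n)                 # assumed unforced covariance
mu_u = rng.standard_normal(n)                 # assumed unforced mean
f = rng.standard_normal(n)                    # assumed signal pattern
o = mu_u + 2.0 * f                            # noise-free observation with amplitude 2

a_hat, phi2 = detection_statistic(o, f, mu_u, sigma_u)
phi = signed_statistic(o, f, mu_u, sigma_u)
```

In this noise-free case the estimated amplitude equals the prescribed value, and the square of the signed statistic coincides with *φ*^{2}, which is the sense in which the two statistics are equivalent for *K* = 1.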

The above test assumes that the climate change signals are known. In practice, these signals are estimated from coupled atmosphere–ocean general circulation models, which generate their own internal variability. Thus, the desired signals generally are embedded in background internal variability and hence must be extracted by a filter. The most justifiable filter is an ensemble average; that is, an average over independent realizations of climate states generated by a model driven by the same time-varying forcing. Unfortunately, computational resources severely limit the number of ensemble members and as a result most estimates of the climate change signal are obtained by additional filtering designed to capture the largest spatial and temporal scales. However, such filtering is not guaranteed to optimize the signal-to-noise ratio, and in particular will reduce the signal-to-noise ratio if the climate change signal varies within the filter window.

In this paper, we propose an optimal method for estimating the space of time-varying climate change signals from climate model simulations. We call such simulations 20C runs, since most detection analyses focus on climate changes during the twentieth century. We propose identifying the space of signals by finding the vectors that maximize the mean-square detection statistic in independent 20C runs. That is, we select **F** to render the average *φ*^{2} as large as possible in independent 20C runs.

It turns out that solving the maximization problem for a single climate change signal yields all the information needed to solve the multisignal problem. Therefore, we assume a single signal. To avoid confusion, the symbol **p** will be used to denote the vector that maximizes detectability, to distinguish it from the true signals **F**. Similarly, the amplitude of **p** will be denoted *β*, to distinguish it from the actual amplitudes **a** of the true signals **F**. This notation implies that the least squares unbiased estimate of the amplitude of **p** is

$$
\hat{\beta} = \frac{\mathbf{p}^{T} \boldsymbol{\Sigma}_{U}^{-1} \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right)}{\mathbf{p}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p}}. \tag{5}
$$

Let **o** have a mean *μ*_{20C} and covariance matrix **Σ**_{20C}, which can be estimated from independent 20C runs. Then

$$
E\left[ \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right) \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right)^{T} \right] = \boldsymbol{\Sigma}_{20C} + \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right) \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right)^{T}. \tag{6}
$$

It follows from this and (5) that the average of *φ*^{2} is

$$
E\left( \varphi^{2} \right) = \frac{\mathbf{p}^{T} \boldsymbol{\Sigma}_{U}^{-1} \left[ \boldsymbol{\Sigma}_{20C} + \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right) \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right)^{T} \right] \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p}}{\mathbf{p}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p}}. \tag{7}
$$

Since the right-hand side of (7) is a Rayleigh quotient, we invoke the standard linear algebra result that **p** can be obtained by solving the eigenvalue problem

$$
\left[ \boldsymbol{\Sigma}_{20C} + \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right) \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right)^{T} \right] \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p} = \lambda \mathbf{p}, \tag{8}
$$

where *λ* = *E*(*φ*^{2}). Solutions to this eigenvalue problem give the vectors that maximize the mean-square detection statistic. Below we explore some consequences of this solution.
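
One way to solve such an eigenvalue problem numerically is sketched below. The sketch assumes the problem takes the form [**Σ**_{20C} + **dd**^T] **Σ**_{U}^{-1} **p** = *λ***p**, with **d** the difference between the 20C and control means, and converts it to a symmetric problem through a Cholesky factor of **Σ**_{U}. The function names and the synthetic covariance matrices are hypothetical.

```python
import numpy as np

def most_detectable(sigma_20c, mu_20c, sigma_u, mu_u):
    """Eigenpairs of [Sigma_20C + d d^T] Sigma_U^{-1} p = lambda p, by decreasing lambda."""
    d = mu_20c - mu_u
    m = sigma_20c + np.outer(d, d)            # second moment of o about the control mean
    L = np.linalg.cholesky(sigma_u)           # Sigma_U = L L^T
    s = np.linalg.solve(L, np.linalg.solve(L, m).T)   # L^{-1} M L^{-T}, symmetric
    lam, z = np.linalg.eigh((s + s.T) / 2)    # eigh returns ascending eigenvalues
    return lam[::-1], (L @ z)[:, ::-1], m     # map back with p = L z

def mean_square_statistic(p, m, sigma_u):
    """Rayleigh quotient giving the average of phi^2 for a candidate vector p."""
    w = np.linalg.solve(sigma_u, p)
    return float(w @ m @ w / (p @ w))

# Synthetic covariances and means (illustrative only)
rng = np.random.default_rng(0)
n = 5
b, c = rng.standard_normal((2, n, n))
sigma_u = b @ b.T + n * np.eye(n)
sigma_20c = sigma_u + c @ c.T
mu_u = np.zeros(n)
mu_20c = rng.standard_normal(n)

lam, p, m = most_detectable(sigma_20c, mu_20c, sigma_u, mu_u)
best = mean_square_statistic(p[:, 0], m, sigma_u)
# No randomly drawn vector should beat the leading eigenvector
trial = max(mean_square_statistic(v, m, sigma_u) for v in rng.standard_normal((200, n)))
```

The leading eigenvalue equals the Rayleigh quotient evaluated at the leading eigenvector, and randomly drawn vectors give smaller values, consistent with the interpretation of *λ* as the maximized mean-square detection statistic.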

### a. Climate change signals with time-dependent amplitudes

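In many studies, climate change signals are expressed in such a way that **a** takes on a single value in a given historical record. In this section, we consider signals with amplitudes **a** that vary in time. For instance, **F** could represent climate change over a 5-yr period, in which case the amplitude would then measure the magnitude of change in a specific 5-yr period. Let the time mean and covariance matrix of **a** be

$$
\boldsymbol{\mu}_{a} = \overline{\mathbf{a}}, \qquad \boldsymbol{\Sigma}_{a} = \overline{\left( \mathbf{a} - \boldsymbol{\mu}_{a} \right) \left( \mathbf{a} - \boldsymbol{\mu}_{a} \right)^{T}}, \tag{9}
$$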

where the bar denotes a time average. Assuming model (1) holds, the time- and ensemble-mean vector and covariance matrix of **o** are

$$
\boldsymbol{\mu}_{20C} = \mathbf{F} \boldsymbol{\mu}_{a} + \boldsymbol{\mu}_{U}, \qquad \boldsymbol{\Sigma}_{20C} = \mathbf{F} \boldsymbol{\Sigma}_{a} \mathbf{F}^{T} + \boldsymbol{\Sigma}_{U}. \tag{10}
$$

Substituting these quantities in (8) yields

$$
\mathbf{F} \left( \boldsymbol{\Sigma}_{a} + \boldsymbol{\mu}_{a} \boldsymbol{\mu}_{a}^{T} \right) \mathbf{F}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p} = \left( \lambda - 1 \right) \mathbf{p}. \tag{11}
$$

A standard theorem in linear algebra states that the column space of a product of matrices, say **AB**, is contained in the column space of **A** (Harville 1997, corollary 4.2.3). This theorem implies that the nontrivial eigenvectors of (11) must span the same space as the climate change signals **F**. If only one signal exists (i.e., **F** has one column), then only one eigenvalue *λ* differs from one and the corresponding eigenvector solution **p** is proportional to the signal **F**; that is, the eigenvector recovers the correct climate change signal. The converse is not necessarily true: if only one eigenvalue *λ* differs from one, it does not follow that only one signal exists; for instance, the signals might be linearly dependent in either space or time. If two linearly independent, uncorrelated signals exist, say **f**_{1}*a*_{1} and **f**_{2}*a*_{2}, then the above theorem implies that at most two eigenvalues differ from one. However, the two eigenvectors generally would be linear combinations of **f**_{1} and **f**_{2}; that is, the eigenvectors generally cannot be paired uniquely with the response to specific climate forcing.

The leading eigenvector of (8) will be called the “most detectable component,” since it maximizes the mean-square detection statistic over all ensembles, time periods, and models. However, other vectors might be more detectable in certain time periods or models.
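
The single-signal case can be checked with population moments. The sketch below assumes the additive model **o** = **f***a* + **u** with one signal pattern `f`, a scalar amplitude with time mean `mu_a` and variance `var_a`, and the eigenvalue problem in the form [**Σ**_{20C} + **dd**^T] **Σ**_{U}^{-1} **p** = *λ***p**; all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
a_mat = rng.standard_normal((n, n))
sigma_u = a_mat @ a_mat.T + n * np.eye(n)     # hypothetical unforced covariance
f = rng.standard_normal(n)                    # hypothetical single signal pattern
mu_a, var_a = 0.5, 2.0                        # assumed time mean and variance of the amplitude

# Population moments implied by o = f a + u
sigma_20c = var_a * np.outer(f, f) + sigma_u
delta = mu_a * f                              # difference between 20C and control means
m = sigma_20c + np.outer(delta, delta)

# Symmetric form of the eigenvalue problem via the Cholesky factor of sigma_u
L = np.linalg.cholesky(sigma_u)
s = np.linalg.solve(L, np.linalg.solve(L, m).T)
lam, z = np.linalg.eigh((s + s.T) / 2)        # ascending eigenvalues
p = L @ z[:, -1]                              # leading eigenvector mapped back to state space

# Cosine of the angle between the leading eigenvector and the prescribed signal
cosine = abs(p @ f) / (np.linalg.norm(p) * np.linalg.norm(f))
```

Under these assumptions all eigenvalues but one equal one, and the leading eigenvector is parallel to the prescribed pattern, in line with the statement that a single signal is recovered up to a multiplicative factor.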

### b. Fixed forcing

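A simple case worth considering is a 20C simulation that has no interannual variation in climate forcings. If, in addition, the variability of the forced and unforced components is unaffected by climate forcings (i.e., **Σ**_{20C} = **Σ**_{U}), then (8) can be manipulated into the form

$$
\left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right) \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right)^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p} = \left( \lambda - 1 \right) \mathbf{p}. \tag{12}
$$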

The matrix on the left-hand side of (12) is rank 1, which implies that only one nontrivial eigenvector solution exists. It is readily verified that this solution is

$$
\mathbf{p} \sim \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U}, \tag{13}
$$

where the symbol ~ means “proportional to,” since eigenvectors are unique up to multiplicative factors. Thus, the vector that maximizes the mean-square detection statistic is simply the mean difference between 20C and unforced runs. Moreover, model (1) implies *μ*_{20C} − *μ*_{U} = **F** *μ*_{a}, giving **p** ~ **F** *μ*_{a}. In the case in which only one climate change signal exists (i.e., **F** has one column), the procedure obtains the correct signal vector. This result justifies defining the difference in means as the climate change signal. If multiple signals exist (i.e., **F** has two or more columns), then the leading vector is a linear superposition of signals; in particular, the leading vector generally cannot be identified with a unique signal.
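
The rank-1 solution can be verified by direct substitution. As a short check, write **δ** = *μ*_{20C} − *μ*_{U} (a shorthand introduced here only for this verification) and assume the fixed-forcing eigenvalue problem has the rank-1 form **δδ**^T **Σ**_{U}^{-1} **p** = (*λ* − 1)**p**. Setting **p** = **δ** gives

$$
\boldsymbol{\delta} \boldsymbol{\delta}^{T} \boldsymbol{\Sigma}_{U}^{-1} \boldsymbol{\delta} = \left( \boldsymbol{\delta}^{T} \boldsymbol{\Sigma}_{U}^{-1} \boldsymbol{\delta} \right) \boldsymbol{\delta}
\quad \Longrightarrow \quad
\lambda = 1 + \boldsymbol{\delta}^{T} \boldsymbol{\Sigma}_{U}^{-1} \boldsymbol{\delta}.
$$

Under this assumption, the maximized mean-square statistic exceeds one by the squared Mahalanobis distance between the 20C and control means, measured with respect to the unforced covariance.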

### c. Relation to discriminant analysis

The above procedure is closely related to discriminant analysis. Discriminant analysis has been discussed extensively by Schneider and Held (2001), Straus et al. (2003), and DelSole et al. (2011), so only relevant details will be discussed here. Consider the problem of finding the linear combination of variables that maximizes the ratio of mean-square anomalies of the 20C and unforced runs, where “anomaly” refers to deviations from the control mean. If the weights of the linear combination are **q**, then we seek the weights that maximize the ratio

$$
\frac{\mathbf{q}^{T} \left[ \boldsymbol{\Sigma}_{20C} + \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right) \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right)^{T} \right] \mathbf{q}}{\mathbf{q}^{T} \boldsymbol{\Sigma}_{U} \mathbf{q}}. \tag{14}
$$

Maximizing the ratio in (14) leads to the eigenvalue problem

$$
\left[ \boldsymbol{\Sigma}_{20C} + \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right) \left( \boldsymbol{\mu}_{20C} - \boldsymbol{\mu}_{U} \right)^{T} \right] \mathbf{q} = \lambda \boldsymbol{\Sigma}_{U} \mathbf{q}. \tag{15}
$$

This eigenvalue problem is precisely equivalent to (8) if we make the identification

$$
\mathbf{p} = \boldsymbol{\Sigma}_{U} \mathbf{q}. \tag{16}
$$

Therefore, the weight vector that maximizes the mean-square anomalies between 20C and unforced runs also is related to the vectors that maximize the mean-square detection statistic. This result is sensible. After all, to the extent that the forced variability can be modeled as an independent and additive perturbation to unforced variability, the total variance equals the sum of the variances due to the forced and unforced components. An immediate consequence of this fact is that the variance of forced simulations ought to be larger than the variance of unforced simulations, since the forced simulations contain an extra component of variability relative to the unforced simulations. Moreover, components whose forced variance differs as much as possible from the unforced variance define components in which the forced variability is most easily distinguished from unforced variability.

The vector **q** derived from discriminant analysis is merely the fingerprint for the corresponding vector **p**. This fact can be seen by normalizing the eigenvector to satisfy

$$
\mathbf{p}^{T} \boldsymbol{\Sigma}_{U}^{-1} \mathbf{p} = \mathbf{q}^{T} \boldsymbol{\Sigma}_{U} \mathbf{q} = 1. \tag{17}
$$

In this case, the fingerprint is merely the weight vector **q** applied to the residual **o** − *μ*_{U},

$$
\varphi = \mathbf{q}^{T} \left( \mathbf{o} - \boldsymbol{\mu}_{U} \right). \tag{18}
$$

The above method is closely related to the technique proposed by DelSole et al. (2011) to identify the forced response of global sea surface temperatures. The only difference is that in DelSole et al. (2011) both the control and 20C runs were centered with respect to their own means, which effectively removes the term in (15) involving the difference in means.
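
The correspondence between the discriminant weights **q** and the vector **p** can be illustrated with a small numerical example. The sketch assumes the generalized eigenvalue problem **Mq** = *λ***Σ**_{U}**q**, where **M** is the mean-square anomaly matrix of the 20C runs about the control mean, together with the identification **p** = **Σ**_{U}**q** and the normalization **q**^T**Σ**_{U}**q** = 1; the matrices and the sample observation are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
b, c = rng.standard_normal((2, n, n))
sigma_u = b @ b.T + n * np.eye(n)             # hypothetical control covariance
sigma_20c = sigma_u + c @ c.T                 # hypothetical 20C covariance
mu_u = rng.standard_normal(n)
mu_20c = mu_u + rng.standard_normal(n)
d = mu_20c - mu_u
m = sigma_20c + np.outer(d, d)                # mean-square anomalies about the control mean

# Generalized problem M q = lambda Sigma_U q, solved via Sigma_U = L L^T
L = np.linalg.cholesky(sigma_u)
s = np.linalg.solve(L, np.linalg.solve(L, m).T)
lam, z = np.linalg.eigh((s + s.T) / 2)
q = np.linalg.solve(L.T, z[:, -1])            # q = L^{-T} z, hence q^T Sigma_U q = 1
p = sigma_u @ q                               # vector that maximizes detectability

# Fingerprint applied to one synthetic observation
o = mu_20c + L @ rng.standard_normal(n)
fingerprint = q @ (o - mu_u)
w = np.linalg.solve(sigma_u, p)
phi = w @ (o - mu_u) / np.sqrt(p @ w)         # signed detection statistic for signal p
```

With this normalization the weight vector applied to the residual reproduces the signed detection statistic computed from **p**, which is the sense in which **q** acts as the fingerprint for **p**.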

### d. Solution based on truncated principal components

In practice, the covariance matrices for the 20C and control runs are unknown and hence must be estimated from finite samples. Unfortunately, the dimension of the state vector of the climate system easily exceeds 10^{5}, while the number of simulation years rarely exceeds 10^{3}. These numbers imply that the estimated covariance matrices are singular and that the inverse in the eigenvalue problem (8) does not exist. Moreover, as with all optimization problems, the quantity being optimized tends to be biased in the given sample, with the bias increasing with the number of estimated parameters. Therefore, the number of parameters used to maximize the mean-square detection statistic should be severely restricted. These problems are hallmarks of ill-posed estimation problems, and the usual approach to solving them is to impose constraints on the unknown parameters to reduce the number of effectively independent parameters. Such methods are called regularization methods (Schneider and Held 2001).

We regularize the problem by maximizing the mean-square detection statistic only in a low-dimensional space spanned by the leading principal components (PCs) of the data. For this purpose, we use PCs of the combined 20C and control runs, expressed as anomalies relative to the control mean of the respective model. We found that PCs of the control runs alone, which optimize unforced variability, are inefficient at representing forced variability. Using PCs of 20C runs alone is not recommended because the resulting components tend to overfit variance in the 20C run, leading to a bias in the detection statistic. Using PCs of the combined 20C and control runs was considered a reasonable compromise. Our solution based on principal component regularization is summarized in the appendix.
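
A minimal sketch of this kind of principal component regularization is given below. It is not the procedure of the appendix, which is not reproduced here; it assumes that the EOFs are obtained from a singular value decomposition of the pooled anomalies (both relative to the control mean) and that the maximization is then carried out in the truncated PC space. The function name `maximize_in_pc_space` and the synthetic data are hypothetical.

```python
import numpy as np

def maximize_in_pc_space(ctl, forced, n_pc):
    """ctl, forced: (time, space) arrays. Returns eigenvalues, patterns p, and EOFs."""
    ctl_mean = ctl.mean(axis=0)
    ctl_anom, forced_anom = ctl - ctl_mean, forced - ctl_mean   # both relative to control mean
    pooled = np.vstack([ctl_anom, forced_anom])
    _, _, vt = np.linalg.svd(pooled, full_matrices=False)
    eofs = vt[:n_pc].T                                  # (space, n_pc), orthonormal columns
    x_u, x_f = ctl_anom @ eofs, forced_anom @ eofs      # PC amplitudes
    sigma_u = x_u.T @ x_u / (len(x_u) - 1)              # unforced covariance in PC space
    m = x_f.T @ x_f / len(x_f)                          # forced mean square about control mean
    L = np.linalg.cholesky(sigma_u)
    s = np.linalg.solve(L, np.linalg.solve(L, m).T)     # symmetric form of the eigenproblem
    lam, z = np.linalg.eigh((s + s.T) / 2)              # ascending eigenvalues
    p = eofs @ (L @ z)                                  # patterns mapped back to grid space
    return lam[::-1], p[:, ::-1], eofs

# Synthetic data: white-noise control run and a forced run with a uniform trend
rng = np.random.default_rng(0)
n_space, n_pc = 40, 5
ctl = rng.standard_normal((300, n_space))
trend = np.linspace(0.0, 3.0, 150)[:, None] * np.ones(n_space)
forced = rng.standard_normal((150, n_space)) + trend
lam, p, eofs = maximize_in_pc_space(ctl, forced, n_pc)
```

Because the eigenvalues returned here are maximized within the same sample used to estimate the covariances, they are biased upward; an independent sample is needed to assess significance.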

### e. Statistical significance

A key question in maximizing the mean-square detection statistic is whether the resulting value is statistically significant. We adopt the following approach to testing significance. Since the sample sizes in this study are relatively large, we split the data into two parts, a training sample and an assessment sample. The training sample is used to find the PCs and maximize detectability, while the assessment sample is used to test detectability. In particular, since the vectors to be tested are determined independently of the assessment sample, the classical detection procedure outlined in section 2 can be applied. In the case of time-varying amplitudes for the climate change signals, a detection analysis would be applied to each time step. In such cases, we summarize the results by testing the mean-square detection statistic itself, whose sampling distribution under the null hypothesis of identical normal populations can be determined by Monte Carlo methods and differs only slightly from the familiar *F* distribution (due to the fact that the numerator contains an additive term involving the difference in means between the two samples).
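
The Monte Carlo step can be sketched as follows. The sketch assumes that, for a single component, the tested quantity is the mean-square anomaly of the forced sample about the control-sample mean divided by the control-sample variance, and that the null distribution is generated by drawing both samples from the same standard normal population. The function names, sample sizes, and number of trials are hypothetical, and serial correlation is ignored.

```python
import numpy as np

def mean_square_ratio(x_forced, x_ctl):
    """Mean-square forced anomaly about the control mean, scaled by the control variance."""
    anom = x_forced - x_ctl.mean()
    return float(np.mean(anom ** 2) / x_ctl.var(ddof=1))

def monte_carlo_threshold(n_forced, n_ctl, level=0.95, n_trials=5000, seed=0):
    """Quantile of the ratio when both samples come from the same normal population."""
    rng = np.random.default_rng(seed)
    xf = rng.standard_normal((n_trials, n_forced))
    xc = rng.standard_normal((n_trials, n_ctl))
    anom = xf - xc.mean(axis=1, keepdims=True)
    ratios = np.mean(anom ** 2, axis=1) / xc.var(axis=1, ddof=1)
    return float(np.quantile(ratios, level))

# Example: a forced sample whose mean is shifted by two control standard deviations
rng = np.random.default_rng(1)
threshold = monte_carlo_threshold(n_forced=100, n_ctl=150)
ratio = mean_square_ratio(2.0 + rng.standard_normal(100), rng.standard_normal(150))
detected = ratio > threshold
```

The numerator is centered on the control-sample mean rather than on the forced-sample mean, which is the source of the small departure from the *F* distribution noted above.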

## 3. Models and data

The dataset used in this study is from the World Climate Research Programme (WCRP) Coupled Model Intercomparison Project, phase 3 (CMIP3) multimodel dataset. The statistics for unforced variability are estimated from preindustrial control runs in which anthropogenic and natural forcing agents are fixed to their preindustrial values. In contrast, the statistics for forced variability are estimated from the twentieth-century runs, which are initialized from a specific point in the preindustrial control runs and forced by historic, time-varying concentrations of well-mixed greenhouse gases and sulfate aerosols and in some models by other anthropogenic (e.g., black carbon particulate or land-use changes) and natural (solar radiation and volcanic aerosols) forcings (Biasutti et al. 2008).

The 3-month means of surface air temperature and precipitation from the twentieth-century runs and preindustrial control runs are analyzed. The 3-month means are denoted by the first letter of each respective month [e.g., January–March (JFM), April–June (AMJ), July–September (JAS), and October–December (OND)].
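
The construction of the 3-month means can be sketched as below, assuming a monthly series that starts in January and contains whole years, and ignoring differences in month length (an unweighted average of the three months). The function name `seasonal_means` is hypothetical.

```python
import numpy as np

SEASONS = ("JFM", "AMJ", "JAS", "OND")

def seasonal_means(monthly):
    """monthly: array with time (months, starting in January) on axis 0.

    Returns a dict mapping each season label to an array with one value per year.
    """
    n_years = monthly.shape[0] // 12
    # Reshape to (year, season, month-within-season, ...) and average the three months
    by_year = monthly[: n_years * 12].reshape(n_years, 4, 3, *monthly.shape[1:])
    means = by_year.mean(axis=2)
    return {name: means[:, i] for i, name in enumerate(SEASONS)}

# Two years of monthly values 0, 1, ..., 23
demo = seasonal_means(np.arange(24.0))
```

For gridded fields the same function applies unchanged, since the spatial dimensions are carried along after the time axis.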

By pooling different models together, we hope to obtain results that are robust across models. To facilitate calculation and model intercomparison, all fields were interpolated to a common 5° × 5° grid. The length of the unforced runs varies among models. We use the last 300 yr of each preindustrial control run that is at least 300 yr long in surface air temperature and precipitation. Although unforced runs are not supposed to have trends because the forcings are fixed, some runs do exhibit significant trends because of adjustment to equilibrium. Such trends artificially inflate the variance, thereby contaminating the analysis. Therefore, models with significant trends in unforced runs were omitted. Additionally, models with significantly different variances compared to other models were removed. The precise details of our selection procedure follow those of Jia and DelSole (2011) and lead to the selection of eight models, which are summarized in Table 1.

The data are split into training and assessment samples. For training, we use from each model only the first 150 yr of the control run and only one 20C run (whose length depends on the model). This gives 150 × 8 = 1200 yr for estimating unforced quantities and 1095 yr for forced quantities. The mean of the control run was subtracted from both the control and 20C run of each model. Thus, the control runs are centered whereas the forced runs are not. These samples were pooled and subjected to principal component analysis. The resulting PCs were then used to maximize detection. The amplitudes of the EOFs in the assessment data were computed by projecting the assessment data onto the pseudoinverse of the EOFs. As there is only one 20C ensemble member from the L’Institut Pierre-Simon Laplace Coupled Model, version 4 (IPSL CM4), the assessment data do not include output from this model. Thus, the assessment data consist of 7 of the 150-yr control runs and 19 of the 20C runs, totaling 2607 yr.
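
The projection of independent data onto the EOFs can be sketched as follows, assuming the EOFs are stored as columns of a matrix and the amplitudes are obtained by least squares through the Moore-Penrose pseudoinverse. The function name `eof_amplitudes` and the synthetic arrays are hypothetical.

```python
import numpy as np

def eof_amplitudes(anom, eofs):
    """anom: (time, space) anomalies; eofs: (space, n_pc). Returns (time, n_pc) amplitudes."""
    return anom @ np.linalg.pinv(eofs).T

# Synthetic check: data built from known amplitudes are recovered exactly
rng = np.random.default_rng(0)
eofs_demo = rng.standard_normal((30, 4))      # hypothetical EOFs (need not be orthonormal)
true_amps = rng.standard_normal((50, 4))      # hypothetical amplitudes
assess_anom = true_amps @ eofs_demo.T         # assessment anomalies lying in the EOF space
recovered = eof_amplitudes(assess_anom, eofs_demo)
```

When the EOFs are orthonormal, the pseudoinverse reduces to the transpose and the operation is an ordinary projection.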

## 4. Results

In this section, we discuss results of maximizing the mean-square detection statistic for a signal expressed as a seasonal-mean anomaly. This analysis should not be construed as suggesting that a seasonal-mean anomaly is sufficient to describe climate change signals. It is not. Indeed, space–time vectors, such as multidecadal trends or sequences of decadal-mean, seasonal-mean temperatures, possess much stronger signal-to-noise ratios than seasonal means alone, because of enhanced filtering of internal variability by longer-term averaging and enhanced spatiotemporal information content. Nevertheless, we depart from using space–time vectors because these have been studied extensively and shown to be adequate for detecting and attributing the observed global warming over the past 50 yr. Rather, recent investigations have raised the question of how far the space and time scale of such vectors can be reduced before detectability is lost. The main purpose of this paper is to develop a general method for optimizing detectability. Determining the specific limit of detectability is difficult because the limits derived from this method are conditioned on, among other things, how the signal is expressed (e.g., as a seasonal anomaly or as a sequence of seasonal anomalies over multidecadal time scales). Since we analyze only seasonal-mean anomalies, our conclusions should not be interpreted as defining limits for other types of signal vectors.

### a. Analysis details

As discussed in section 2d, we maximize the mean-square detection statistic in a space spanned by the leading PCs of the combined 20C and control runs. The PCs were computed for surface air temperature and precipitation separately and in each continent separately. The sensitivity of the maximized components to the number of PCs was studied extensively. After about six PCs, the mean-square detection statistic increases only gradually with the number of PCs in the vast majority of cases examined. In a few cases, the statistic increases by about 20% after including an extra PC somewhere between 10 and 30 PCs, but the leading vector **p** was nearly the same throughout this range of PC truncations. This lack of sensitivity is presumably due to the relatively large sample size (about 1200 yr for estimating forced and unforced quantities separately).

### b. Results for seasonal-mean surface air temperature

We first identify the components that maximize the mean-square detection statistic of JAS surface air temperature over all models and then project the components onto independent assessment simulations of each model. We display results using 30 PCs. Independent estimates of the mean-square detection statistic are shown in Fig. 1. Only the first 10 components are shown, as the remaining components are insignificant. In North America and Asia, the leading component is detectable in all models except one, and a second component is detectable in some models. Similarly, South America and Africa have a detectable component in all models and perhaps a second detectable component in certain models. In Europe and Australia, several 20C runs do not show a detectable component. This pattern of results was found in other seasons (not shown): namely, that the leading component dominates the mean-square detection statistic in all models and in all continents except in Europe. Importantly, the magnitude of the detection statistic is highly model dependent even within the same continent; for example, the detection statistic for the leading component over North America ranges from 1 to about 10.

The time series of the most detectable component in independent 20C assessment runs are shown in Fig. 2. The time series exhibit a clear trend, interrupted by sudden coolings coincident with major volcanic eruptions, in each continent but Europe. The relatively weaker trend in Europe is consistent with the mean-square detection statistic being closer to 1 for Europe. However, some time series for Europe do show an increasing trend in the last two decades of the twentieth century, which leads to a statistically significant detection statistic in some models. Since the detection statistic (4) is merely the time series shown in Fig. 2, detection in any given year is formally equivalent to determining whether the time series falls outside the two dashed lines in that particular year. It should be recognized that about 19 realizations of 20C simulations are displayed in the figure, so about two members are expected to exceed the significance threshold each year by random chance. (The actual number of realizations in each year differs because different models had different initialization times, but all 19 of the 20C simulations are available after 1900.) The percentage of 20C realizations that fall outside the two dashed lines over the whole period is given in each panel. Europe is a clear outlier in this measure, too. It is interesting to note that not only are the positive amplitudes at the end of the twentieth century detectable in all continents except Europe, but the negative amplitudes around 1883, coinciding with the eruption of Krakatoa, are detectable in some models, at least in most continents except Europe and Australia.

The time series for Europe also shows that a single ensemble member experiences a multidecadal cooling period between 1925 and 1990. This particular ensemble member, which happens to be from the HadCM3 model, implies a detectable cooling and hence differs from the warming characteristic of the other continents. We have examined the control and 20C runs for this model and cannot find any obvious problem with the control or 20C simulations (e.g., climate drift) that might explain this outlier.

The spatial pattern corresponding to the leading component for JAS surface air temperature is shown in Fig. 3. We see that all patterns are of single sign. Combining this fact with the increasing trend in each continent implies that the bulk of the detectability is due to twentieth-century warming on continental scales.

We also have examined results for seasonal-mean surface air temperature based on 5- and 10-yr means (not shown). In general, the detection statistic increases with longer-term averages, but so too does the significance level (owing to the smaller sample size). Although the results are model dependent, the overall conclusions are practically the same: namely, that the leading component tends to be significant in most models and continents, except for Europe, which consistently has the most 20C runs without a detectable component.

### c. Other seasons and spatial averages

The multimodel mean-square detection statistic for all four seasons and the annual mean are listed in Table 2. Annual-mean quantities are consistently more detectable than individual seasonal-mean quantities. Among the four seasons, Northern Hemisphere summer and Southern Hemisphere spring tend to be the most detectable.

The spatial structure of the most detectable component in each continent turns out to be of single sign, when the component is statistically significant (not shown). This fact raises the question of whether the climate change signal can be identified simply from the spatial average over the continent in question. To investigate this, we computed the multimodel detection statistic for continental averages in independent assessment data. The results, listed in Table 2, show that in most cases the leading component is more detectable than the spatial average, often by a large margin. The only exceptions are for Europe, where the values are not significant, and for Africa, where the differences in the statistic are small.
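One way a weighted component can outperform a plain spatial average, even when the pattern is of single sign, is that unforced variance differs from place to place. The toy calculation below is a sketch under assumed numbers, not the analysis behind Table 2: it uses a hypothetical two-point domain with unequal noise variances and a single-sign forced pattern, and compares the forced-to-control variance ratio of the spatial average with that of a noise-weighted projection.

```python
def variance_ratio(q, cov_forced, cov_control):
    """Ratio of forced to control variance for the projection q^T x."""
    def quad(m):
        n = len(q)
        return sum(q[i] * m[i][j] * q[j] for i in range(n) for j in range(n))
    return quad(cov_forced) / quad(cov_control)


# Hypothetical control covariance: the second grid point is four times noisier.
cov_control = [[1.0, 0.0], [0.0, 4.0]]
# Forced second moment = control covariance + p p^T with single-sign pattern p = (1, 1).
cov_forced = [[2.0, 1.0], [1.0, 5.0]]

spatial_average = [0.5, 0.5]
noise_weighted = [1.0, 0.25]   # proportional to inverse(cov_control) applied to p

print(variance_ratio(spatial_average, cov_forced, cov_control))  # 1.8
print(variance_ratio(noise_weighted, cov_forced, cov_control))   # 2.25
```

In this toy case the weighted projection gains by down-weighting the noisier grid point, which the uniform average cannot do.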

### d. Results for seasonal-mean precipitation

Independent estimates of the mean-square detection statistic for 5-yr-mean JAS precipitation are shown in Fig. 4. Results for 1-yr means are not shown because most values are not statistically significant. For display purposes we use six PCs; the mean-square detection statistic shows little sensitivity to the truncation beyond five PCs. Even for 5-yr means, most detection statistics are not statistically significant, and those that are significant are model dependent. This result is not surprising given that it is widely acknowledged that forced precipitation signals are much weaker and more difficult to detect than temperature signals.

Time series for the most detectable component of 5-yr-mean JAS precipitation, shown in Fig. 5, reveal that the most detectable component is characterized by a downward trend over North America and Europe and an upward trend over South America. The corresponding spatial patterns, shown in Fig. 6, reveal that the most detectable vector tends to be positive in these continents, so upward and downward trends correspond to increases and decreases of 5-yr-mean precipitation, respectively.

Over North America, some individual ensemble members experience randomly occurring 5-yr rainfall deficits during the twentieth century, whereas compensating random excesses are not found. This implies that the detectable signal is composed partly of extreme “drought” periods. The corresponding spatial pattern shown in Fig. 6 indicates that these deficit periods will be concentrated in the southern United States and Mexico, consistent with the more sophisticated analysis of drought indices by Strzepek et al. (2010). Although discriminant analysis is not necessarily optimized for detecting extreme events, such extreme events will inflate the 20C variance (regardless of sign) and hence inflate the detection statistic.

The most detectable vectors over Asia and Africa are characterized by tripole and dipole structures. Hence, computing spatial averages over these areas would significantly damp the climate change signal. Over Asia, the time series exhibits no obvious trend but does exhibit strong extreme events for some members. The corresponding pattern shows that these changes will be concentrated over India and Southeast Asia. Note, however, that these changes are detectable in only two models. The time series for South America reveals an upward trend in the ensemble mean, but some members exhibit a strong negative trend. Thus, although the detection statistic is larger for South America than for other continents, the character of the signal differs with model. The signal over Africa is detectable only in the Geophysical Fluid Dynamics Laboratory (GFDL) model, and the corresponding pattern is a zonally elongated patch south of the Sahel. The time series for GFDL (not individually shown) exhibits no obvious trend during the twentieth century. The negative trend over Europe is reproduced by several models but not necessarily by all. The pattern suggests that the signal is dominated by drying over southern Europe, consistent with twenty-first-century projections (Christensen et al. 2007), although we detected this signal based on twentieth-century simulations alone.

## 5. Summary and discussion

This paper developed a technique for identifying vectors that maximize the mean-square detection statistic for climate change. The technique may be used to identify the space of time-varying climate change signals from general circulation model simulations. The technique is based on the assumption that the forced variability is an independent and additive perturbation to unforced variability. Under this assumption, optimal fingerprint analysis generates a climate change detection statistic that maximizes the signal-to-noise ratio, provided the climate change signal and the statistics of unforced variability are prespecified. If independent simulations with interannual variations in climate forcing are available, then the proposed technique determines the vectors that maximize the detection statistic with respect to the forced simulations. The maximization problem leads to an eigenvalue problem involving the covariances of the simulations, as well as the difference in means between the simulations. The method is closely related to discriminant analysis, where the discriminant ratio is the mean-square detection statistic. It also is related to the technique proposed in DelSole et al. (2011) for identifying the forced response of global sea surface temperatures. If the climate change signal is characterized by a single vector, then the technique will find it; if the signal is characterized by multiple linearly independent and uncorrelated vectors, then the technique will find a set of vectors that span the space of signal vectors.
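The eigenvalue problem summarized above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the implementation used for the results: the function name, the Cholesky whitening route, the 1/N normalization, and the synthetic white-noise data are all choices made for the example, and the sketch omits the EOF truncation, the pooling across models, and the significance test.

```python
import numpy as np


def most_detectable_components(forced, control):
    """Maximize the forced-to-control variance ratio of the projection x @ q.

    forced, control: arrays of shape (samples, variables), both expressed as
    anomalies relative to the control mean.
    Returns the ratios in descending order and the projection vectors as columns.
    """
    # Second moments about the control mean; the forced one therefore contains
    # both the covariance and the shift in mean.
    s_forced = forced.T @ forced / forced.shape[0]
    s_control = control.T @ control / control.shape[0]

    # Whiten with the Cholesky factor L of the control covariance, turning
    # S_F q = lambda S_U q into an ordinary symmetric eigenvalue problem.
    chol = np.linalg.cholesky(s_control)
    half = np.linalg.solve(chol, s_forced)        # L^{-1} S_F
    whitened = np.linalg.solve(chol, half.T).T    # L^{-1} S_F L^{-T}
    whitened = 0.5 * (whitened + whitened.T)      # remove rounding asymmetry

    ratios, vecs = np.linalg.eigh(whitened)       # eigenvalues in ascending order
    order = np.argsort(ratios)[::-1]
    ratios, vecs = ratios[order], vecs[:, order]

    # Map back: q = L^{-T} v, which gives q^T S_U q = 1 (unit control variance).
    projections = np.linalg.solve(chol.T, vecs)
    return ratios, projections


# Synthetic stand-in data (not CMIP3): white noise plus a trend along one pattern.
rng = np.random.default_rng(0)
raw_control = rng.standard_normal((500, 4))
pattern = np.array([1.0, 0.5, 0.0, -0.5])
trend = np.linspace(0.0, 2.0, 200)
raw_forced = rng.standard_normal((200, 4)) + np.outer(trend, pattern)

control_mean = raw_control.mean(axis=0)
control = raw_control - control_mean
forced = raw_forced - control_mean

ratios, projections = most_detectable_components(forced, control)
print(ratios)  # the leading ratio stands well above the others
```

Because only one forced pattern is present in the synthetic data, a single ratio stands out, which mirrors the single-vector case described in the text.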

An attractive feature of the technique is its definitiveness: if the maximized detection statistic is not significant, then no linear combination of variables that were adjusted to find the maximum is significantly detectable in the models. Alternative approaches to defining climate change signals are suboptimal and hence may lead to poorly represented signals, raising doubts as to whether lack of detection is due to poorly specified signals. The proposed technique is optimal and therefore removes these doubts. However, the results of the technique are conditioned on the particular state vector chosen to represent the climate change signal. In general, the state vector for describing the signal can be chosen with considerable flexibility. For instance, the climate change signal might be represented by multidecadal trends, differences between two multidecadal periods, or a sequence of decadal means. Conclusions drawn for one choice of state vector may not carry over to other state vectors, even for the same dataset with the same climate change signal. It is an open question as to what is the best representation of the climate change signal. The proposed technique should be useful in exploring this question, since it provides an optimal and objective method for quantifying detection.

Recent interest has focused on event attribution analysis that attempts to quantify climate changes on seasonal and continental scales. Unfortunately, there often is a gap between the space and time scale of the event in question, such as heat waves and flooding occurring on subcontinental space scales and on monthly time scales, and the climate change signal that has been detected, often characterized by multidecadal time scales and global-to-continental space scales. The proposed technique may prove useful for bridging this gap.

To illustrate the technique, we applied it to identify the seasonal-mean anomaly in surface air temperature and precipitation that is most detectable, in the sense that the anomaly maximizes the mean-square detection statistic. It should be recognized that lack of detectability for seasonal-mean anomalies, which have relatively small signal-to-noise ratios, does not preclude detectability of signals expressed by more detailed spatiotemporal structure. We used simulations from the CMIP3 archive to perform the analysis, where forced quantities were estimated from twentieth-century runs and unforced quantities were estimated from preindustrial control runs. Simulations from eight models were pooled together to derive components that maximize the mean-square detection statistic over all models. The components were then projected onto independent assessment simulations to verify detectability model by model. Seasonal-mean surface temperature was found to be detectable in all seasons by almost all models in all continents but Europe. Seasonal temperature anomalies over Europe tend to be undetectable in multiple (but not all) models, though this does not preclude other types of spatiotemporal signals from being detectable. The most detectable component is characterized by warming over the twentieth century, with cooling episodes following major volcanic eruptions, similar to the familiar response of global-mean temperature. The seasonal-mean cooling following major volcanic eruptions also is found to be detectable, depending on the magnitude and timing of the eruption.

Most models are found to have only a single detectable seasonal-mean anomaly in each continent. For these cases, no signal in seasonal-mean anomaly is detectable after the leading discriminant component has been regressed out of the time series. This result suggests that the responses to different forcings (e.g., anthropogenic emissions of greenhouse gases and aerosols, volcanic aerosols, solar variability, black carbon, and land-use changes), when averaged over a season, are nearly linearly dependent in space and/or time, and hence climate change signals will be difficult to separate based only on a single seasonal anomaly. This result is not surprising, as it is the essential reason that detection and attribution studies usually employ vectors that capture variations in both space and time. We emphasize that when the maximization procedure identifies multiple detectable vectors, the vectors generally are linear combinations of the true climate change signals and cannot be paired with responses to particular forcings.

Seasonal-mean precipitation anomalies were found to be detectable in only a few models and only for averages of 5 yr or more, suggesting that the signal should include more spatiotemporal detail to be detectable across more models. Of the precipitation anomalies that are detectable, the character of the signal appears to be of two types: a systematic trend in certain continents and enhanced frequency of extreme values. The technique identifies detectable drying anomalies in the southern United States and Mexico and in southern Europe, all of which have been diagnosed in previous studies, although those studies used twenty-first-century runs with substantially larger signal-to-noise ratios. The most detectable component also suggests greater frequency of deficit years in North America and larger extremes (both positive and negative) in Southeast Asia. We emphasize that the latter conclusions relating to precipitation extremes apply to particular models and are not consistent across models. We also note that different models predict different precipitation trends over South America.

In many cases, we find a seasonal-mean anomaly to be detectable primarily because the anomalies are part of a multidecadal trend. Thus, detection of seasonal anomalies indirectly implies detection of the multidecadal trend on which they are superposed. However, the reverse is not so clear: a multidecadal trend might be detectable but the individual seasonal anomalies comprising it may not, simply because multidecadal trends have larger signal-to-noise ratios than seasonal means, owing to stronger temporal filtering.

In one of the few detection analyses of precipitation, Zhang et al. (2007) found a detectable response in the zonal-mean, land-mean precipitation trend. However, the modeled response pattern needed to be inflated by over a factor of 5 in order to match observed trends. Nevertheless, our results show that such a weak response still is detectable in particular models. Furthermore, Zhang et al. (2007) expressed the response as trends within certain specified zonal bands. It turns out that these bands nicely encompass the signals found in our study. In particular, the drying patterns found in this study lie primarily between the equator and about 30°N, consistent with the signal found by Zhang et al. (2007). However, our analysis shows that these trends were not simulated by the same models. We also found a large-scale drying pattern in southern Europe, which Zhang et al. (2007) identified as a band with “disagreement” between models. However, this band included much of Asia and the United States, which had no detectable precipitation response in JAS or JFM in our study. Perhaps a fingerprint that excluded Asia and the United States but included southern Europe would have revealed a stronger signal. Our main point is that some precipitation climate change signals are characterized by patterns with opposing signs, and unless the boundaries of the sign changes are known, it is likely that spatial averages will damp the signals. These considerations suggest that the technique proposed here might be helpful in choosing a representation of a climate signal that optimizes detectability.

The signals in this study have been derived solely from analysis of climate models. DelSole et al. (2011) applied a related procedure to derive the forced response of global sea surface temperatures and showed this vector was detectable in observations. Whether the seasonal-mean anomalies derived in this paper also are detectable in observations requires a separate study.

## Acknowledgments

We thank Tapio Schneider and Xuebin Zhang, acting as reviewers, and Michael Tippett for many insightful comments that led to substantial improvements in methodology and clarity of presentation. We also thank Jagadish Shukla, Myles Allen, and Prashant Sardeshmukh for influential discussions regarding this work. This research was supported by the National Science Foundation (ATM0332910, ATM0830062, and ATM0830068), National Aeronautics and Space Administration (NNG04GG46G and NNX09AN50G), and the National Oceanic and Atmospheric Administration (NA04OAR4310034, NA09OAR4310058, NA05OAR4311004, NA10OAR4310210, and NA10OAR4310249). The views expressed herein are those of the authors and do not necessarily reflect the views of these agencies.

### APPENDIX

#### Solution Based on Principal Components

This appendix summarizes our method for maximizing the mean-square detection statistic, assuming the solution can be represented as a linear combination of basis vectors **e**_{1}, **e**_{2}, …, **e**_{T}. The procedure is essentially equivalent to that found in Schneider and Held (2001), Straus et al. (2003), and DelSole et al. (2011).

Let samples from the 20C simulation be collected in an *N*_{20C} × *M* matrix and samples from the control simulation be collected in an *N*_{U} × *M* matrix. The samples are expressed as anomalies relative to the mean of the control run. The parameters *N*_{20C} and *N*_{U} identify the number of samples (e.g., number of seasonal anomalies) in each simulation, and *M* identifies the state dimension. The state dimension *M* should be equal for the two datasets, while the number of samples can differ. Let us collect the basis vectors into a single matrix

The amplitudes of the basis vectors are obtained by regression methods, which involve a pseudoinverse (denoted by a superscript *i*) with the property

In particular, the amplitudes are derived from the datasets as

In practice, the basis vectors are empirical orthogonal functions (EOFs). Consequently, we refer to the basis vectors as EOFs, and the corresponding amplitudes given by (A3) as principal components.
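A minimal NumPy sketch of this step follows, assuming the EOFs are obtained from a singular value decomposition of a single anomaly matrix. The dataset used to define the EOFs, the variable names, the random stand-in data, and the truncation `n_keep = 3` are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((50, 8))   # hypothetical (samples x state dimension) matrix
data -= data.mean(axis=0)             # express samples as anomalies

n_keep = 3                            # number of retained basis vectors, chosen arbitrarily
_, _, vt = np.linalg.svd(data, full_matrices=False)
eofs = vt[:n_keep].T                  # columns are the leading EOFs, shape (8, 3)

pinv = np.linalg.pinv(eofs)           # pseudoinverse, shape (3, 8); pinv @ eofs = identity
pcs = data @ pinv.T                   # regression amplitudes (principal components)
filtered = pcs @ eofs.T               # data rebuilt from the retained EOFs only

print(pcs.shape, filtered.shape)      # (50, 3) (50, 8)
```

For orthonormal EOFs the pseudoinverse reduces to the transpose, so `pcs` equals `data @ eofs`; the pseudoinverse form is kept because it also covers basis vectors that are not orthonormal.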

In general, the number of basis vectors *T* is a small fraction of the dimension *M*. This implies that some variability will not be captured by the basis vectors. In other words, by considering only *T* basis vectors, we have filtered out variability from the data. It is helpful to distinguish the full data from the filtered data with separate symbols. Accordingly, we use dots to indicate matrices associated with filtered or truncated datasets, in which case the filtered version of the data is

The covariance matrices for the principal components are

where tildes indicate quantities in EOF space. Note that, since both datasets are centered relative to the control run, the control matrix is a sample covariance matrix for the control run, but the 20C matrix is a covariance matrix for the 20C run plus a rank-1 matrix associated with the difference in means between the 20C and control runs.
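The rank-1 contribution can be made explicit with a standard identity. The notation here is assumed for illustration and is not the appendix's own: let $\mathbf{f}_n$ denote the vector of principal components for the $n$th 20C sample (an anomaly relative to the control mean), let $\mathbf{m}$ be the mean of these vectors, and take a $1/N$ normalization. Then

$$
\frac{1}{N}\sum_{n=1}^{N} \mathbf{f}_n \mathbf{f}_n^{T}
= \frac{1}{N}\sum_{n=1}^{N} (\mathbf{f}_n - \mathbf{m})(\mathbf{f}_n - \mathbf{m})^{T}
+ \mathbf{m}\,\mathbf{m}^{T},
\qquad
\mathbf{m} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{f}_n .
$$

The first term on the right is the covariance of the 20C run about its own mean, and $\mathbf{m}\,\mathbf{m}^{T}$ is the rank-1 term arising from the difference in means. For the control run $\mathbf{m}$ vanishes by construction, so only the covariance term remains.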

The idea now is to maximize the mean-square detection statistic not for the original data but for the filtered data or, equivalently, for the coordinates of the basis vectors **B**_{20C} and **B**_{U}. Numerically, it is preferable to solve (15) rather than (8), since the former avoids computing an inverse matrix and preserves symmetry properties of the solution. Therefore, we solve the generalized eigenvalue problem

Since the above covariance matrices are *T* × *T*, the solution yields *T* eigenvalues, which we arrange in descending order, and the corresponding eigenvectors. The corresponding vector that maximizes the mean-square detection statistic is

Projecting this vector into the original sample space gives

The amplitude of the *k*th vector is found by projecting eigenvector **q**_{k} onto the data,

These amplitudes are identical to the least squares unbiased estimates of the amplitudes of the vectors **p**_{k} given by (18). The symmetry of the matrices in (A7) implies that the amplitudes of one component are uncorrelated with those of any other. We normalize the amplitudes to have unit variance in the control runs, which implies