Anthropogenic influences on surface temperature over the second half of the twentieth century are examined using output from two general circulation models (HadCM2 and ECHAM3). Optimal detection techniques involve the comparison of observed temperature changes with those simulated by a climate model, using a control integration to test the null hypothesis that all the observed changes are due to natural variability. Two recent studies have examined the influence of greenhouse gases and the direct effect of sulfate aerosol on surface temperature using output from the same two climate models but with many differences in the methods applied. Both detected overall anthropogenic influence on climate, but results on the separate detection of greenhouse gas and sulfate aerosol influences were different. This paper concludes that the main differences between the results can be explained by the season over which temperatures were averaged, the length of the climatology from which anomalies were taken, and the use of a time-evolving signal pattern as opposed to a spatial pattern of temperature trends. This demonstration of consistency increases confidence in the equivalence of the methodologies in other respects, and helps to synthesize results from the two approaches. Including information on the temporal evolution of the response to different forcings allows sulfate aerosol influence to be detected more easily in HadCM2, whereas focusing on spatial patterns gives better detectability in ECHAM3.
General circulation models have been used to simulate the climate response to a range of forcing scenarios that include both natural and anthropogenic influences. These integrations may then be compared with observations to assess the extent to which observed changes are consistent with a range of forcing agents. This comparison may be made quantitatively using the “multifingerprinting” methodology of Hasselmann (1997). Several studies have applied these methods to the detection of anthropogenic influence on surface temperature (e.g., Hegerl et al. 1997; Tett et al. 1999; Barnett et al. 1999; Stott et al. 2001; Hegerl et al. 2000; Allen et al. 2001). The last two of these studies both consider the detection of greenhouse gas and direct sulfate aerosol influence using the second Hadley Centre coupled GCM (HadCM2; Johns et al. 1997) and the European Centre for Medium-Range Weather Forecasts–Hamburg (ECHAM3) GCM developed at the Max Planck Institute for Meteorology in Hamburg, Germany, here coupled to the Large-Scale Geostrophic ocean model (Voss et al. 1998). However, the methods applied differ in several ways. That of Allen et al. (2001; hereinafter A01) uses a time–space diagnostic based on five decadal means of surface temperature between 1946 and 1996, with a T4 spherical harmonic representation to concentrate on large spatial scales. Anomalies are taken from a 90-yr climatology of annual mean temperature, and the detection methodology described in Allen and Tett (1999) is applied, but with a total least squares estimator to take account of uncertainty in the model-predicted response patterns. (A total least squares estimator jointly minimizes errors in the dependent and independent variables, whereas an ordinary least squares estimator minimizes errors only in the dependent variable.) By contrast, Hegerl et al. (2000; hereinafter H00) use a detection diagnostic based on the spatial pattern of linear trends over 50 June–July–August (JJA) averages.
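The distinction between ordinary and total least squares can be illustrated with a minimal sketch for a single signal pattern. It is not the implementation used in either study: the function names and the synthetic numbers are hypothetical, and the sketch assumes equal noise variance in the observations and in the model-predicted pattern.

```python
import numpy as np

def ols_scaling(x, y):
    """Ordinary least squares fit of y = beta * x: errors assumed in y only."""
    return float(x @ y / (x @ x))

def tls_scaling(x, y):
    """Total least squares fit of y = beta * x: errors assumed in x and y.

    Assumption (not from the source): x and y carry equal noise variance,
    e.g. after both have been scaled by the same noise estimate.
    """
    z = np.column_stack([x, y])
    # The right singular vector with the smallest singular value is the
    # normal (v_x, v_y) to the best-fit line, so v_x * x + v_y * y = 0.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    v_x, v_y = vt[-1]
    return float(-v_x / v_y)

# Synthetic illustration only (not data from either study).
rng = np.random.default_rng(0)
clean_signal = rng.normal(size=500)
obs = 0.8 * clean_signal + rng.normal(scale=0.5, size=500)      # noisy observations
model_signal = clean_signal + rng.normal(scale=0.5, size=500)   # noisy model pattern

beta_ols = ols_scaling(model_signal, obs)
beta_tls = tls_scaling(model_signal, obs)
```

When the model pattern itself is noisy, the ordinary least squares estimate is biased toward zero, whereas the total least squares estimate is not, provided the equal-variance assumption holds.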
A signal pattern is derived by taking the first empirical orthogonal function (EOF) of surface temperature from a scenario run (an integration with time-varying anthropogenic forcing). This first EOF approximates to the linear trends in these temperatures and is dominated by the pattern of twenty-first-century temperature change, although this is spatially similar to simulated temperature changes over the observed period. A pattern amplitude is estimated by taking the scalar product of this signal pattern with linear trends in JJA temperatures over the 50-yr period of 1948–97, fitted at each grid point. The amplitude of the anthropogenic signal in observations is then compared with that simulated by the model. H00 use an ordinary least squares fit, although the difference between this and a total least squares estimate is likely to be small, because the signal pattern is derived from a longer period than that used by A01 and should contain less noise.
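The trend-and-projection step described above might be sketched as follows. This is a simplified illustration, not the H00 code: it assumes complete data coverage, unit area weights by default, and a signal pattern normalized to unit length, and all function and variable names are hypothetical.

```python
import numpy as np

def gridpoint_trends(field, years):
    """Least squares linear trend at each grid point.

    field: array of shape (n_years, n_points), e.g. JJA-mean temperature.
    Returns an array of shape (n_points,) in field units per year.
    """
    t = years - years.mean()
    centered = field - field.mean(axis=0)
    return t @ centered / (t @ t)

def pattern_amplitude(trends, signal_pattern, weights=None):
    """Scalar product of a trend pattern with a unit-normalized signal pattern.

    weights: optional area weights (an assumption; defaults to ones).
    """
    if weights is None:
        weights = np.ones_like(signal_pattern)
    norm = np.sqrt(np.sum(weights * signal_pattern**2))
    return float(np.sum(weights * trends * signal_pattern) / norm)
```

Applying `pattern_amplitude` to observed trends and to model-simulated trends with the same signal pattern gives two amplitudes that can then be compared.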
Each study applies its methodology to output from several models, and HadCM2 and ECHAM3 are considered in both cases. Each study also considers the detection of a range of anthropogenic and natural forcing factors alone and in various combinations. Both studies find that a greenhouse gas–only (G) and a combined greenhouse gas–direct sulfate aerosol (GS) signal are detectable using either model and that G alone overestimates the observed change. A two-way regression of the observations onto the greenhouse gas (G) and direct sulfate aerosol (S) signals is also considered in both studies (Table 1). There is considerable disagreement between the results of the two approaches for this two-way regression; indeed, the only case on which they agree is that the greenhouse gas response can be detected separately from the sulfate aerosol response using ECHAM3. In this note we identify the reasons for these differences.
Because there are several important differences between the methods used in the two studies, we started with the method used by A01 and made changes incrementally toward the technique used by H00. The most obvious difference between the studies is the season over which surface temperatures are averaged. Both studies use the Parker et al. (1994) surface temperature dataset over similar periods, but A01 used decadal means of annual mean temperatures, and H00 used JJA mean temperatures. JJA temperatures were used because it was found that the ECHAM3 sulfate signal was more distinguishable from the (negative) greenhouse gas signal in this season, since this is when the intensity of the solar radiation incident on the main sulfate aerosol source regions is largest. With a change to JJA mean temperatures alone, the A01 results became consistent with those of H00 in all cases listed in Table 1, except that greenhouse gas influence was still detected in HadCM2. While H00 derived their diagnostic using anomalies relative to a 50-yr climatology contemporaneous with their data, A01 used temperature anomalies calculated relative to a 90-yr climatology to retain information about century-scale warming. For the cases considered in Table 1, the use of a 50-yr climatology in the A01 methodology did not resolve any of the discrepancies.
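The preprocessing choices discussed here (season, climatology length, and decadal averaging) can be made concrete with a short sketch. The array layout, function names, and the specific calendar years are assumptions for illustration; in particular, the source gives only the lengths of the climatologies, not their endpoints.

```python
import numpy as np

def seasonal_mean(monthly, months=(6, 7, 8)):
    """Average selected calendar months; monthly has shape (n_years, 12, n_points)."""
    idx = [m - 1 for m in months]
    return monthly[:, idx, :].mean(axis=1)

def anomalies(series, years, clim_start, clim_end):
    """Anomalies relative to the mean over clim_start..clim_end inclusive."""
    in_clim = (years >= clim_start) & (years <= clim_end)
    return series - series[in_clim].mean(axis=0)

def decadal_means(series, years, first_year, n_decades=5):
    """Consecutive 10-yr means from first_year; returns (n_decades, n_points)."""
    out = []
    for k in range(n_decades):
        sel = (years >= first_year + 10 * k) & (years < first_year + 10 * (k + 1))
        out.append(series[sel].mean(axis=0))
    return np.array(out)

# Hypothetical usage with placeholder years and synthetic data.
years = np.arange(1906, 1996)                       # placeholder 90-yr span
monthly = (0.01 * (years - 1906)[:, None, None]     # slow warming
           + np.arange(12)[None, :, None]           # crude seasonal cycle
           + np.zeros((90, 12, 2)))
jja = seasonal_mean(monthly)
anom_90 = anomalies(jja, years, 1906, 1995)         # long climatology
anom_50 = anomalies(jja, years, 1946, 1995)         # contemporaneous climatology
decades = decadal_means(anom_90, years, 1946)
```

Changing the climatology shifts every anomaly at a grid point by the same constant, so it leaves gridpoint trends unchanged but does alter a diagnostic built from decadal-mean anomalies.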
Last, we concentrate on a more fundamental difference between the analysis methods used in the two studies. A01 use a diagnostic based on decadal means of a T4 spherical harmonic truncation of the global surface temperature field. This spatial truncation was used because it was found that the variability and the response to forcing are realistic in HadCM2 only on these large spatial scales (Stott and Tett 1998). Decadal means were used partly because they reduce the size of the diagnostic and partly because they average over a period comparable to the 11-yr solar cycle, thus reducing but not eliminating the influence of solar forcing on the results (e.g., Lean and Rind 1998; note that the influence of low-frequency solar forcing on this analysis is discussed by A01). By contrast, H00 project 1948–97 gridpoint trends onto the first EOF of gridpoint JJA surface temperatures from a scenario run. This EOF approximates to the gridpoint trends from the scenario integration. H00 rely on the EOF truncation used in the optimization to concentrate the analysis on the largest spatial scales. Applying the methodology of A01 but using a gridpoint trend diagnostic based on decadal means similar to that used by H00 enabled all the differences noted in Table 1 to be explained.
Having explained the differences in the detectability of the greenhouse gas and sulfate aerosol signals in the two studies, we proceeded to establish whether we could explain the differences in the amplitudes of these signals. Both methods were applied to surface temperature over the 1946–95 period, and two changes were made to the way in which the results of H00 were presented to make comparison possible. First, H00 used stepwise linear regression on orthogonalized signal patterns, whereas A01 used multiple regression with respect to total least squares. The stepwise regression coefficients of H00 were transformed into multiple regression coefficients, as used by A01, using the transformation described by Hegerl and Allen (2002, manuscript submitted to J. Climate). Second, H00 presented the ratio of the observed anthropogenic change to natural variability alongside the same ratio for the model-simulated response, whereas A01 presented the ratio of the observed response to the modeled response. By expressing the results of H00 in terms of the ratio of the observed response to that simulated by each model (β in A01), we were able to compare results quantitatively.
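The relation between stepwise regression on orthogonalized patterns and multiple regression can be illustrated for two signals under ordinary least squares. The sketch below uses the generic Gram–Schmidt identity and is not necessarily the transformation of Hegerl and Allen (2002), which may differ in detail (for example in the ordering of the patterns or the noise metric); all names are hypothetical.

```python
import numpy as np

def stepwise_coefficients(y, g, s):
    """Regress y on g, then on the part of s orthogonal to g.

    Returns (a, b, c), where y ~ a * g + b * (s - c * g).
    """
    c = (g @ s) / (g @ g)              # projection of s onto g
    s_perp = s - c * g                 # orthogonalized second pattern
    a = (g @ y) / (g @ g)
    b = (s_perp @ y) / (s_perp @ s_perp)
    return a, b, c

def stepwise_to_multiple(a, b, c):
    """Rewrite a * g + b * (s - c * g) as beta_g * g + beta_s * s."""
    return a - b * c, b

def multiple_regression(y, g, s):
    """Direct two-way ordinary least squares fit, for comparison."""
    design = np.column_stack([g, s])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta
```

Because the two orthogonalized patterns span the same space as the original pair, the fitted response is identical in both forms and only the coefficients attached to each pattern change.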
Applying the methodology of A01 but using JJA temperatures, a 50-yr climatology, a gridpoint trend diagnostic, and an ordinary least squares fit, we obtain the scaling factors shown in Fig. 1. For both models we found residual variability to be consistent with control variability using the residual test described in Allen and Tett (1999). Results of the H00 method transformed into multiple regression coefficients are plotted on the same scale for comparison. Results from the two studies are now largely consistent. We do not expect exact agreement, because there are remaining differences in the methodologies. First, H00 use the first EOF of annually sampled surface temperature from a scenario run to define the signal pattern, rather than gridpoint trends based on decadal means over a recent 50-yr period. Second, before the optimal fingerprinting methodology is applied, modeled and observed diagnostics must be transformed into a truncated space so that the autocovariance matrix of the detection diagnostic can be estimated from the limited length of control integration available. H00 use EOFs of surface temperature from scenario integrations for this truncation, rather than part of the control. This is done partly to ensure that the anthropogenic signal is captured in the truncated space. Last, H00 truncate onto EOFs of annual mean temperatures rather than 50-yr trends.
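The truncation and fitting steps can be sketched as follows for the ordinary least squares case. This is a schematic under stated assumptions rather than the code of either study: the truncation here uses EOFs of control-run diagnostics, the truncation level `k` is a free parameter, and the function names are hypothetical.

```python
import numpy as np

def prewhitening_operator(control, k):
    """Project onto the k leading control EOFs and scale to unit variance.

    control: array (n_samples, n_dims) of diagnostic vectors drawn from
    a control integration. Returns an operator of shape (k, n_dims).
    """
    centered = control - control.mean(axis=0)
    # Rows of vt are EOFs; sing**2 / (n - 1) are their sample variances.
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    var = sing[:k] ** 2 / (centered.shape[0] - 1)
    return vt[:k] / np.sqrt(var)[:, None]

def optimal_scaling(y, signals, operator):
    """Least squares fit in the truncated, prewhitened space.

    y: observed diagnostic (n_dims,); signals: (n_dims, m) model patterns.
    Returns the scaling factors and the residual sum of squares.
    """
    yw = operator @ y
    xw = operator @ signals
    beta, *_ = np.linalg.lstsq(xw, yw, rcond=None)
    resid = yw - xw @ beta
    return beta, float(resid @ resid)
```

If the residuals are consistent with control variability, the residual sum of squares should be of the order of `k` minus the number of signals; the formal test, which accounts for the finite length of the control, is the one described in Allen and Tett (1999).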
All qualitative differences between the results of A01 and H00 are explained by the use of annual mean versus JJA temperature and the use of a time–space versus a gridpoint trend diagnostic. By making these changes to the methodology of A01, along with a change to a 50-yr climatology and ordinary least squares fitting, we were able to approximate closely the results of H00. Agreement was good both in terms of the best-guess regression coefficients and the associated uncertainties. The similarity of these results despite remaining technical differences in implementation demonstrates that these details have a relatively small effect on estimated signal amplitudes, particularly when compared with the effect of using signal patterns from different models (compare Figs. 1a and 1b). This gives us more confidence in the results of both studies. Of course, this enhancement of confidence is limited to the detection methodologies themselves, since we compare results using the same GCMs.
In general, if one technique detects a given signal while another does not, it means that the latter method is using a suboptimal diagnostic, provided of course that both methods correctly estimate natural variability. Although the combined greenhouse gas and sulfate aerosol signal is detected using both techniques in both models, we find that a time–space diagnostic leads to better separability of the greenhouse gas and sulfate aerosol signals in HadCM2, whereas trends give better separability in ECHAM3. This result is consistent with what we would expect based on the correlations between greenhouse gas and sulfate aerosol signal patterns. H00 note that simulated trend patterns in response to greenhouse gas and sulfate aerosol forcing are more distinct in ECHAM3 than in HadCM2, and we find that temporal variations in these responses are more distinct in HadCM2.
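The link between pattern correlation and separability can be quantified in a simple way for a two-way ordinary least squares regression. The sketch below uses an uncentered correlation, which is an assumption, since the source does not specify how the correlations were computed; the function names are hypothetical.

```python
import numpy as np

def pattern_correlation(g, s):
    """Uncentered correlation (cosine of the angle) between two patterns."""
    return float(g @ s / np.sqrt((g @ g) * (s @ s)))

def variance_inflation(g, s):
    """Factor by which correlation between g and s inflates the variance
    of each coefficient in a two-way least squares regression."""
    r = pattern_correlation(g, s)
    return 1.0 / (1.0 - r**2)
```

As the correlation between the greenhouse gas and sulfate aerosol patterns approaches one in magnitude, the inflation factor grows without bound, so a diagnostic in which the two responses are more distinct yields tighter uncertainty ranges on the individual scaling factors.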
We thank the NOAA Office of Global Programs Climate Change Data and Detection Program Element and the DOE Office of Health and Environmental Research for funding this work. NPG was also supported by a CASE studentship from the U.K. Natural Environment Research Council, Ref GT04/98/217/AS; GCH by the National Science Foundation; MRA by an Advanced Research Fellowship from the U.K. NERC; and PAS by the U.K. Department of the Environment, Transport and the Regions under Contract PECD 7/12/37.
Corresponding author address: Nathan P. Gillett, Department of Physics, University of Oxford, Atmospheric, Oceanic, and Planetary Physics, Clarendon Laboratory, Parks Road, Oxford OX1 3PU, United Kingdom. Email: email@example.com