Statistical relationships between future and historical model runs in multimodel ensembles (MMEs) are increasingly exploited to make more constrained projections of climate change. However, such emergent constraints may be spurious and can arise because of shared (common) errors in a particular MME or because of overly influential models. This study assesses the robustness of emergent constraints used for Arctic warming by comparison of such constraints in ensembles generated by the two most recent Coupled Model Intercomparison Project (CMIP) experiments: CMIP3 and CMIP5. An ensemble regression approach is used to estimate emergent constraints in Arctic wintertime surface air temperature change over the twenty-first century under the Special Report on Emission Scenarios (SRES) A1B scenario in CMIP3 and the Representative Concentration Pathway (RCP) 4.5 scenario in CMIP5. To take account of different scenarios, this study focuses on polar amplification by using temperature responses at each grid point that are scaled by the global mean temperature response for each climate model. In most locations, the estimated emergent constraints are reassuringly similar in CMIP3 and CMIP5 and differences could have easily arisen from sampling variation. However, there is some indication that the emergent constraint and polar amplification are substantially larger in CMIP5 over the Sea of Okhotsk and the Bering Sea. Residual diagnostics identify one climate model in CMIP5 that has a notable influence on estimated emergent constraints over the Bering Sea and one in CMIP3 that has a notable influence more widely along the sea ice edge and into midlatitudes over the western North Atlantic.
The Arctic region has exhibited some of the most dramatic recent changes in climate, with in particular a rapid retreat of sea ice since at least 1979 and a loss of ice mass on Greenland since the late 1990s (Stroeve et al. 2007; Rignot et al. 2008). These changes are expected to have an important impact on sea level and ocean circulation (Lemke et al. 2007).
Climate model simulations have shown that projected lower-tropospheric warming in the Arctic is amplified compared to the projected warming in global mean temperature (e.g., Frierson 2006). However, there is substantial variation in the amplitude of polar amplification simulated by different climate models (Hawkins and Sutton 2009). Investigation of phase 3 of the Coupled Model Intercomparison Project (CMIP3) multimodel ensemble (Meehl et al. 2007) has demonstrated that the intermodel spread in projected change in surface air temperatures near the winter sea ice edge can be partially accounted for by differences in the means of the historical runs (Räisänen et al. 2010; Bracegirdle and Stephenson 2012, hereafter BS12). More specifically, near the ensemble mean sea ice edge, models that simulate colder present-day surface temperatures (associated with more sea ice) give more future warming (associated with the transition from sea ice to open ocean) (Holland and Bitz 2003). Such state dependence of the response provides an emergent constraint (Collins et al. 2012) that can be used to reduce model-related uncertainty and give more precise projections (BS12). Other emergent constraints have also been found and exploited for other variables such as Arctic sea ice extent and snow albedo (Hall and Qu 2006; Boe et al. 2009).
However, emergent constraints may also be spurious and arise by serendipity because of common structural errors in a particular ensemble of climate model experiments (Stephenson et al. 2012). Furthermore, the estimate of an emergent constraint may be overly sensitive to an individual climate model in a multimodel ensemble (MME); in other words, a particular model may be overly influential. This note explores these robustness issues by comparing emergent constraints in CMIP5 (Taylor et al. 2012) and CMIP3 experiments and by using a Cook’s distance diagnostic to identify the most influential models. Following BS12, the emergent constraint in wintertime surface air temperatures at high northern latitudes is quantified using ensemble regression. By ensemble regression we mean a regression at each grid point across an MME between the present-day mean state of a variable and the projected change in that variable. To take account of the different scenarios used in CMIP3 [Special Report on Emission Scenarios (SRES) A1B is used here] and CMIP5 [Representative Concentration Pathway (RCP) 4.5 is used here], gridpoint temperature changes are first scaled by the global mean temperature change for each model.
2. Data and methods
The analysis is based on wintertime (December–February mean) gridded surface air temperatures (CMIP variable name “tas”), obtained from CMIP3 and CMIP5 experiments. Tables 1 and 2 list the models for which available data were successfully downloaded. Data from the CMIP5 “historical” scenario runs and CMIP3 Climate of the Twentieth Century (20C3M) runs were used to define the present-day basic state. A 30-yr climatological mean over 1970–99 winters was used to define the present-day basic state in both the CMIP5 historical runs and the CMIP3 20C3M runs. Future scenarios used 2069–98 data from the CMIP5 RCP4.5 and CMIP3 SRES A1B scenario runs. For both CMIP5 and CMIP3, the future climate change response is defined as the difference between means from the future and present-day scenarios. An estimate of true present-day climate was obtained by using the observationally constrained European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim) dataset (Dee et al. 2011). Before analysis, all climate model and reanalysis datasets were bilinearly interpolated onto the same horizontal grid [the Hadley Centre Global Environmental Model 2, Earth System (HadGEM2-ES); grid: 1.25° latitude × 1.875° longitude].
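The scaling of gridpoint responses by the global mean response can be illustrated with a minimal sketch. It assumes a regular latitude-longitude grid and cos(latitude) area weights; the function name, grid, and synthetic fields are hypothetical and are not taken from the study.

```python
import numpy as np

def scaled_response(present, future, lat):
    """Gridpoint response divided by the area-weighted global mean response.

    present, future: 2-D arrays (lat, lon) of 30-yr winter-mean temperature.
    lat: 1-D array of grid latitudes in degrees.
    """
    response = future - present
    # Assumption: cos(latitude) approximates relative grid-cell area.
    weights = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones_like(response)
    global_mean = np.sum(weights * response) / np.sum(weights)
    return response / global_mean

# Synthetic example (hypothetical values): warming increases toward the poles.
lat = np.linspace(-88.75, 88.75, 72)
present = 250.0 + 40.0 * np.cos(np.deg2rad(lat))[:, np.newaxis] + np.zeros((72, 96))
future = present + 2.0 + 3.0 * (np.abs(lat)[:, np.newaxis] / 90.0)
amplification = scaled_response(present, future, lat)
```

By construction the area-weighted global mean of the scaled field is 1, so values above 1 mark regions that warm faster than the global mean.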
b. Ensemble regression framework
Following BS12, at each grid point the multimodel ensemble data are represented by the linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $x_i$ is the mean wintertime temperature simulated by model $i$ for the historical scenario and $y_i$ is the response in wintertime temperature at the same grid point divided by the global mean temperature response simulated by the same model. The residuals $\varepsilon_i$ are assumed to be identically and independently distributed with zero expectation. The response for each model is the difference between the mean of all future runs and the mean of all past runs for that model, with no account taken for the different numbers of runs (summarized in Tables 1 and 2). The framework provides a parsimonious description of the ensemble using only three parameters at each grid point, $\beta_0$, $\beta_1$, and $\sigma^2$ [the variance of the residuals $\varepsilon_i$], which are estimated here using ordinary least squares. The fraction of variance explained by the regression, $R^2$, provides a simple measure of the strength of the emergent constraint. The model is used to predict the mean response and confidence interval one would obtain for a basic state equal to that observed historically (for details see appendix). To identify data points in the regression that have the most influence on the estimated parameters, a Cook’s distance (Faraway 2005; see appendix for more details) has been calculated at each grid point for each model.
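As an illustration of the ensemble regression at a single grid point, the following sketch fits the intercept, slope, residual variance, and fraction of variance explained by ordinary least squares. The function name and the synthetic ensemble are hypothetical, and the residual variance uses the usual n - 2 degrees of freedom, which is an assumption rather than a detail stated in the text.

```python
import numpy as np

def ensemble_regression(x, y):
    """OLS fit of y_i = b0 + b1 * x_i + e_i across ensemble members.

    x: historical mean temperature of each model at one grid point.
    y: scaled temperature response of each model at the same grid point.
    Returns intercept, slope, residual variance, and fraction of variance explained.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    sigma2 = np.sum(resid ** 2) / (n - 2)  # assumed unbiased estimator
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return b0, b1, sigma2, r2

# Synthetic 20-member ensemble (hypothetical): colder models warm more.
rng = np.random.default_rng(42)
x = rng.normal(265.0, 5.0, size=20)
y = 3.0 - 0.2 * (x - 265.0) + rng.normal(0.0, 0.3, size=20)
b0, b1, sigma2, r2 = ensemble_regression(x, y)
```

In a gridded analysis the same fit would be repeated independently at every grid point, giving maps of the slope and of the fraction of variance explained.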
a. Emergent constraints in CMIP3 and CMIP5
The analysis shown here includes all models except model 3.10 in CMIP3. Model 3.10 has been removed as in previous studies (e.g., BS12) because of its known biases and unduly large influence (see next section). After removal of model 3.10, CMIP5 and CMIP3 give similar ensemble regression estimates of polar amplification (Fig. 1). The Arctic average mean response (north of 60°N) is slightly larger in the CMIP5 models (2.78°C) than in the CMIP3 models (2.49°C). Figures 1b,d show the 95% prediction interval at each grid point (see appendix for definition). The CMIP3- and CMIP5-based predictions have similar precision with only slightly larger 95% prediction intervals for CMIP5 than CMIP3. Figure 1a also shows the locations referred to below: the Labrador Sea (denoted with “L”), the Bering Sea (denoted with “B”), and the Sea of Okhotsk (denoted with “O”).
Figures 2a–c show the fraction of variance explained by the regression in CMIP5 and CMIP3. The emergent constraint is strongest along the winter boundary between sea ice and open ocean (referred to here as the ice edge). It accounts for broadly similar fractions of variance in CMIP5 and CMIP3 (Figs. 2a,b). However, the CMIP5 constraint is weaker along the sea ice edge of the North Atlantic and stronger over the Sea of Okhotsk.
Figures 2d–f show the estimated regression slopes for the CMIP5 and CMIP3 ensembles. These also are broadly similar for CMIP5 and CMIP3, with negative slopes along the ice edge. Differences between the CMIP5 and CMIP3 slope estimates, $\hat{\beta}_1^{\mathrm{CMIP5}}$ and $\hat{\beta}_1^{\mathrm{CMIP3}}$, are generally smaller than the 95% confidence intervals associated with their summed variances, except over the Sea of Okhotsk and the Bering Sea (Figs. 2g–i), where the slopes for CMIP5 are substantially more negative.
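The comparison of slope differences against a 95% interval built from summed variances can be sketched as follows. This is one plausible reading, assuming the two ensembles are treated as independent and a normal quantile of 1.96 is used; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def slope_and_variance(x, y):
    """OLS slope and its estimated sampling variance, sigma^2 / Sxx."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    sigma2 = np.sum((y - b0 - b1 * x) ** 2) / (x.size - 2)
    return b1, sigma2 / sxx

def slopes_differ(x_a, y_a, x_b, y_b, z=1.96):
    """Return slope difference, 95% half-width, and whether the difference exceeds it."""
    b_a, v_a = slope_and_variance(x_a, y_a)
    b_b, v_b = slope_and_variance(x_b, y_b)
    diff = b_a - b_b
    # Assumption: independent ensembles, so the variances of the slopes add.
    half_width = z * np.sqrt(v_a + v_b)
    return diff, half_width, bool(abs(diff) > half_width)

# Synthetic example (hypothetical): two ensembles with clearly different slopes.
x_demo = np.arange(10.0)
noise = 0.01 * (-1.0) ** np.arange(10)
y_steep = -1.0 * x_demo + noise
y_shallow = -0.1 * x_demo + noise
diff, half_width, significant = slopes_differ(x_demo, y_steep, x_demo, y_shallow)
```

Because CMIP5 models evolved from CMIP3 models, the independence assumption is optimistic, so an interval of this kind may be too narrow.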
Figure 3 shows detailed comparisons over the Labrador Sea (Figs. 3a–c), the Sea of Okhotsk (Figs. 3d–f), and the Bering Sea (Figs. 3g–i). These locations were chosen because the Labrador Sea is an example of a location of close agreement between CMIP3 and CMIP5, whereas over the Sea of Okhotsk and Bering Sea differences between the slope estimates $\hat{\beta}_1^{\mathrm{CMIP5}}$ and $\hat{\beta}_1^{\mathrm{CMIP3}}$ are large. The implications of these differences can be explored by using the ensemble regression estimates from CMIP3 to make predictions of CMIP5 model responses based on the model historical mean state. Over the Labrador Sea (Fig. 3c) the CMIP5 and CMIP3 estimated slopes are similar. However, over the Sea of Okhotsk, the relatively large differences between $\hat{\beta}_1^{\mathrm{CMIP5}}$ and $\hat{\beta}_1^{\mathrm{CMIP3}}$ mean that CMIP5 responses predicted from CMIP3 relationships are unreliable. It should be noted, however, that the CMIP5 projected change for all but four CMIP5 models is within the 95% prediction interval of the projected change estimated using CMIP3 ensemble regression. Over the Bering Sea the differences are due to an outlier with a large influence on the ensemble regression slope (model 5.8), which is clearly apparent in Fig. 3g. With model 5.8 removed, the CMIP5 and CMIP3 slopes at this location are in much closer agreement (not shown).
b. Influential models in ensemble regression
Figure 4 shows Cook’s distances averaged over the subarctic, which suggest that model 5.8 in CMIP5 and model 3.10 in CMIP3 are by far the most influential in the ensemble regressions for these experiments. This is consistent with the above finding that model 5.8 has a large influence over the Bering Sea. The spatial distribution of the Cook’s distance of model 5.8 shows that its large influence occurs almost entirely over the central and southern Bering Sea (Fig. 5a). This is in contrast with model 3.10, which shows large Cook’s distances in many regions (Fig. 5b). In their CMIP3 study, BS12 also identified model 3.10 as having large influence (based on leverage diagnostics; see appendix) and omitted it since it is known to have an unrealistically small poleward ocean heat transport at midlatitudes (Arzel et al. 2006). Model 5.8 is worthy of a more detailed investigation to find out why it was identified as influential in CMIP5. Model 5.6 exhibits the next largest subarctic Cook’s distance after model 5.8. Interestingly this model has been found to be an outlier in terms of September Arctic sea ice extent (Massonnet et al. 2012). The CMIP5 ensemble regression estimates were not found to be substantially different in sensitivity tests where model 5.6 was removed (not shown).
This intercomparison of emergent constraints in CMIP3 and CMIP5 has found that the inverse relationship of wintertime polar amplification of Arctic warming to the mean present-day temperatures simulated by models is generally robust. The significant correlations along the winter sea ice edge, identified in CMIP3 by Räisänen (2007) and Knutti et al. (2010), are also present in CMIP5. It is likely that the same physical mechanism can explain this in both cases: namely, that, near the ensemble mean ice edge, models with too much sea ice in present-day climate tend to have more warming in the future since as ice retreats the transition from sea ice to open ocean gives large surface warming (Holland and Bitz 2003). However, if the emergent constraint was caused by structural model error, a small amount of similarity might still be expected since the models in CMIP5 have evolved from those in CMIP3 and so the two ensembles are not completely independent.
The ensemble regression slopes show the same broad spatial pattern in CMIP5 and CMIP3. Over the North Atlantic and most of the Arctic the differences between CMIP5 and CMIP3 regression slopes could easily have arisen because of sampling variation. There are some significant differences over the Sea of Okhotsk and the Bering Sea. Over the Bering Sea, this is attributable to a model with a large local influence on the ensemble regression slope (discussed below). Over the Sea of Okhotsk, the reason for the differences is less clear. One possibility is that the emergent constraint is less robust in this region and prone to changes in structural errors possibly related to model improvements such as higher horizontal resolution. However, a comparison between “high” and “low” resolution subsets of the CMIP3 ensemble showed no indication of a sensitivity of regression slope to resolution (not shown). Alternatively, it is clear that over the Sea of Okhotsk the intermodel spread in historical climatology is smaller in CMIP5 than in CMIP3 (Figs. 3d,e) but with a similar range of projected changes. The consequence of this is therefore a steeper slope in CMIP5.
It is possible that significance in the differences between the parameters estimated from the CMIP3 and CMIP5 ensembles is overestimated here because of the effective number of independent models being smaller than the actual number of models (e.g., Jun et al. 2008; Masson and Knutti 2011). However, it should be noted that it is the regression residuals in model temperatures, rather than the model temperatures, that are assumed to be independently distributed. The inclusion of a dependence on the basic state helps to produce residuals that appear to be identically and independently distributed. This point is supported by a lack of obvious clustering of related models in the scatterplots of past–future relationships in gridpoint surface temperature shown in Fig. 2 of BS12.
A Cook’s distance diagnostic identified model 5.8 in CMIP5 and model 3.10 in CMIP3 as the most influential models over the Arctic. The large Cook’s distance exhibited by model 3.10 shows that the large leverage found by BS12 for this model is causing a large influence on ensemble regression slopes. The issues with model 3.10 are well documented and understood. However, it is not yet clear what the reasons are for the strong influence of model 5.8 over the Bering Sea. This is an important issue to investigate in the future but is beyond the scope of this paper.
This study is part of the British Antarctic Survey Polar Science for Planet Earth Programme funded by the Natural Environment Research Council. Three anonymous reviewers are thanked for their constructive comments, which helped to significantly improve the manuscript. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups (listed in Table 2 of this paper) for producing and making available their model output. For CMIP, the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. The European Centre for Medium-Range Weather Forecasts is thanked for providing the ERA-Interim dataset. DBS also thanks David Sexton and Mat Collins for useful comments on structural errors and emergent constraints.
Prediction Uncertainty, Leverage, and Influence
This appendix gives a brief explanation of the concepts of prediction uncertainty, leverage, and influence relevant to ensemble regression. The reader is referred to standard textbooks on linear models for more comprehensive discussion (e.g., Faraway 2005).
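From the regression model described in section 2b, the predicted mean climate change response is given by $\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$, where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the estimated intercept and slope and $x_0$ is an estimate of the present-day basic state (e.g., the climatological mean of reanalysis temperatures). The caret symbol denotes either a parameter estimate or a regression model prediction. It should be noted that only ensemble climate model data are used to estimate the regression parameters, and so observations have no effect on estimates of the slope and intercept. Uncertainty in future observations involves uncertainty in the predicted response and uncertainty caused by the natural variability of future observations. We represent it here with the 95% prediction interval with lower and upper limits,

$$\hat{y}_0 \pm 1.96\,\hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},$$

where $n$ is the number of models in the ensemble, $\bar{x}$ is the ensemble mean of the historical basic states $x_i$, and $\hat{\sigma}$ is the estimated residual standard deviation. These limits are valid for large ensemble size if the residuals are normally distributed.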
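A minimal sketch of the large-sample 95% prediction interval for simple linear regression is given here. It uses the standard textbook form with a normal quantile of 1.96; the function name and the synthetic ensemble are hypothetical, and the exact expression used in the study may differ in detail.

```python
import numpy as np

def prediction_interval(x, y, x0, z=1.96):
    """Predicted response at basic state x0 with a large-sample 95% prediction interval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    sigma2 = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    y0 = b0 + b1 * x0
    # Residual variability (1) plus parameter uncertainty (1/n + distance term).
    half = z * np.sqrt(sigma2 * (1.0 + 1.0 / n + (x0 - x.mean()) ** 2 / sxx))
    return y0, y0 - half, y0 + half

# Synthetic ensemble (hypothetical values) and two candidate basic states.
x_ens = np.array([258.0, 260.0, 262.0, 264.0, 266.0, 268.0, 270.0, 272.0])
y_ens = np.array([4.1, 3.6, 3.4, 2.9, 2.7, 2.2, 2.0, 1.5])
centre = prediction_interval(x_ens, y_ens, x_ens.mean())
edge = prediction_interval(x_ens, y_ens, 280.0)
```

The interval is narrowest when the observed basic state lies at the ensemble mean and widens as it moves toward or beyond the edge of the ensemble, which is why predictions for a basic state far from the model range are less precise.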
With regard to leverage and Cook’s distance diagnostics, it is useful to write down the regression model in matrix form as $\hat{\mathbf{y}} = \mathbf{H}\mathbf{y}$, where $\mathbf{H} = \mathbf{X}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}$ is known as the “hat matrix” and $\mathbf{X}$ is a matrix with columns incorporating the intercept and predictor $x$. The diagonal elements $h_{ii}$ of the hat matrix are referred to as “leverages.”
With one predictor variable $x$, high leverage occurs for points that have outlying values in $x$. BS12 used leverage to help identify influential climate models in ensemble regression. However, the influence of a data point on the regression estimates depends on both its leverage and its distance from the regression line. An item with a small leverage could still have a large influence if it has a large residual. Both effects are taken into account in the Cook’s distance diagnostic given by

$$D_i = \frac{\hat{\varepsilon}_i^{2}}{p\,\hat{\sigma}^{2}}\,\frac{h_{ii}}{(1 - h_{ii})^{2}},$$

where $\hat{\varepsilon}_i$ is the residual for model $i$, $h_{ii}$ is its leverage, $\hat{\sigma}^2$ is the estimated residual variance, and $p$ is the number of predictors plus 1 (Faraway 2005, p. 70).
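Leverage and Cook’s distance for a single-predictor ensemble regression can be computed as in the sketch below. It uses the standard textbook expressions in terms of the hat matrix diagonal and the ordinary residuals; the function name and synthetic ensemble are hypothetical.

```python
import numpy as np

def leverage_and_cooks_distance(x, y):
    """Hat-matrix leverages and Cook's distances for y regressed on x with intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    X = np.column_stack([np.ones(n), x])      # design matrix: intercept and predictor
    p = X.shape[1]                            # number of predictors plus 1
    H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix
    h = np.diag(H)                            # leverages
    resid = y - H @ y                         # ordinary residuals
    sigma2 = np.sum(resid ** 2) / (n - p)     # estimated residual variance
    cooks = resid ** 2 / (p * sigma2) * h / (1.0 - h) ** 2
    return h, cooks

# Synthetic ensemble (hypothetical): ten well-behaved models plus one outlier
# that has both an extreme basic state and a response far from the others' line.
x_models = np.append(np.arange(10.0), 20.0)
y_models = np.append(1.0 - 0.5 * np.arange(10.0) + 0.05 * (-1.0) ** np.arange(10), 5.0)
h, cooks = leverage_and_cooks_distance(x_models, y_models)
most_influential = int(np.argmax(cooks))
```

The leverages sum to p, so in a regression with one predictor the average leverage is 2/n; a model whose leverage or Cook’s distance is far above the rest is a candidate for the kind of sensitivity test described in the text.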