1. Introduction
The Arctic region has exhibited some of the most dramatic recent changes in climate, in particular a rapid retreat of sea ice since at least 1979 and a loss of ice mass on Greenland since the late 1990s (Stroeve et al. 2007; Rignot et al. 2008). These changes are expected to have an important impact on sea level and ocean circulation (Lemke et al. 2007).
Climate model simulations have shown that projected lower-tropospheric warming in the Arctic is amplified compared to the projected warming in global mean temperature (e.g., Frierson 2006). However, there is substantial variation in the amplitude of polar amplification simulated by different climate models (Hawkins and Sutton 2009). Investigation of phase 3 of the Coupled Model Intercomparison Project (CMIP3) multimodel ensemble (Meehl et al. 2007) has demonstrated that the intermodel spread in projected change in surface air temperatures near the winter sea ice edge can be partially accounted for by differences in the means of the historical runs (Räisänen et al. 2010; Bracegirdle and Stephenson 2012, hereafter BS12). More specifically, near the ensemble mean sea ice edge, models that simulate colder present-day surface temperatures (associated with more sea ice) give more future warming (associated with the transition from sea ice to open ocean) (Holland and Bitz 2003). Such state dependence of the response provides an emergent constraint (Collins et al. 2012) that can be used to reduce model-related uncertainty and give more precise projections (BS12). Other emergent constraints have also been found and exploited for other variables such as Arctic sea ice extent and snow albedo (Hall and Qu 2006; Boe et al. 2009).
However, emergent constraints may also be spurious and arise by serendipity because of common structural errors in a particular ensemble of climate model experiments (Stephenson et al. 2012). Furthermore, the estimate of an emergent constraint may be overly sensitive to an individual climate model in a multimodel ensemble (MME); in other words, a particular model may be overly influential. This note explores these robustness issues by comparing emergent constraints in CMIP5 (Taylor et al. 2012) and CMIP3 experiments and by using a Cook’s distance diagnostic to identify the most influential models. Following BS12, the emergent constraint in wintertime surface air temperatures at high northern latitudes is quantified using ensemble regression. By ensemble regression we mean a regression at each grid point across an MME between the present-day mean state of a variable and the projected change in that variable. To take account of the different scenarios used in CMIP3 [Special Report on Emission Scenarios (SRES) A1B is used here] and CMIP5 [Representative Concentration Pathway (RCP) 4.5 is used here], gridpoint temperature changes are first scaled by the global mean temperature change for each model.
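The ensemble regression described above can be sketched in a few lines of NumPy. This is a minimal illustration assuming an ordinary least squares fit with a single predictor at each grid point; the function name, argument names, and array layout are hypothetical choices made here, and the exact parameterization used by BS12 may differ.

```python
import numpy as np

def ensemble_regression(present, change, global_change):
    """Regress scaled change on present-day state across models, per grid point.

    present:       (n_models, n_lat, n_lon) present-day mean temperature
    change:        (n_models, n_lat, n_lon) future minus present-day mean
    global_change: (n_models,) global mean temperature change of each model
    (All names and shapes are illustrative assumptions.)
    """
    present = np.asarray(present, dtype=float)
    change = np.asarray(change, dtype=float)
    global_change = np.asarray(global_change, dtype=float)

    # Scale each model's gridpoint change by its own global mean change,
    # so that different emission scenarios become comparable.
    scaled = change / global_change[:, None, None]

    # Anomalies with respect to the ensemble mean (axis 0 is the model axis).
    x_anom = present - present.mean(axis=0)
    y_anom = scaled - scaled.mean(axis=0)

    sxx = (x_anom ** 2).sum(axis=0)
    syy = (y_anom ** 2).sum(axis=0)
    sxy = (x_anom * y_anom).sum(axis=0)

    slope = sxy / sxx                                   # least squares slope
    intercept = scaled.mean(axis=0) - slope * present.mean(axis=0)
    r_squared = sxy ** 2 / (sxx * syy)                  # fraction of variance explained
    return slope, intercept, r_squared
```

A negative slope at a grid point means that models with a colder present-day state there project more warming per degree of global mean warming.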
2. Data and methods
a. Data
The analysis is based on wintertime (December–February mean) gridded surface air temperatures (CMIP variable name “tas”), obtained from CMIP3 and CMIP5 experiments. Tables 1 and 2 list the models for which available data were successfully downloaded. The present-day basic state was defined as a 30-yr climatological mean over 1970–99 winters from the CMIP5 “historical” scenario runs and the CMIP3 Climate of the Twentieth Century (20C3M) runs. Future scenarios used 2069–98 data from the CMIP5 RCP4.5 and CMIP3 SRES A1B scenario runs. For both CMIP5 and CMIP3, the future climate change response is defined as the future mean minus the present-day mean. An estimate of true present-day climate was obtained by using the observationally constrained European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim) dataset (Dee et al. 2011). Before analysis, all climate model and reanalysis datasets were bilinearly interpolated onto the same horizontal grid [the Hadley Centre Global Environmental Model 2, Earth System (HadGEM2-ES); grid: 1.25° latitude × 1.875° longitude].
CMIP5 and CMIP3 models used in this study. The realization numbers used are also shown for CMIP5 historical and RCP4.5 runs and for CMIP3 20C3M and SRES A1B runs. Identification numbers (ID) are introduced here for the purpose of identifying CMIP models elsewhere in this paper.

b. Ensemble regression framework
Following BS12, at each grid point the multimodel ensemble data are represented by the linear regression model
3. Results
a. Emergent constraints in CMIP3 and CMIP5
The analysis shown here includes all models except model 3.10 in CMIP3. Model 3.10 has been removed as in previous studies (e.g., BS12) because of its known biases and unduly large influence (see next section). After removal of model 3.10, CMIP5 and CMIP3 give similar ensemble regression estimates of polar amplification (Fig. 1). The Arctic average mean response (north of 60°N) is slightly larger in the CMIP5 models (2.78°C) than in the CMIP3 models (2.49°C). Figures 1b,d show the 95% prediction interval at each grid point (see appendix for definition). The CMIP3- and CMIP5-based predictions have similar precision with only slightly larger 95% prediction intervals for CMIP5 than CMIP3. Figure 1a also shows the locations referred to below: the Labrador Sea (denoted with “L”), the Bering Sea (denoted with “B”), and the Sea of Okhotsk (denoted with “O”).
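Regional averages such as the Arctic mean north of 60°N can be computed with cosine-of-latitude weights on a regular latitude–longitude grid. The sketch below is one common way to do this; the function name and arguments are hypothetical, and the exact weighting used in the paper is assumed rather than documented here.

```python
import numpy as np

def area_weighted_mean(field, lat, lat_min=-90.0, lat_max=90.0):
    """Area-weighted mean of a (n_lat, n_lon) field over a latitude band.

    lat holds grid latitudes in degrees. Weights are proportional to
    cos(latitude), which approximates grid-cell area on a regular grid.
    """
    field = np.asarray(field, dtype=float)
    lat = np.asarray(lat, dtype=float)

    in_band = (lat >= lat_min) & (lat <= lat_max)
    weights = np.cos(np.deg2rad(lat[in_band]))

    # Longitudes are equally spaced, so a plain mean along each row suffices.
    zonal_mean = field[in_band].mean(axis=1)
    return float(np.sum(weights * zonal_mean) / np.sum(weights))
```

For example, `area_weighted_mean(response, lat, lat_min=60.0)` would give an Arctic average of a gridded response, where `response` and `lat` are placeholder names.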

Predicted polar amplification (warming per 1°C in global mean temperature): (a) CMIP5 mean response, (b) CMIP5 95% prediction interval, (c) CMIP3 mean response, and (d) CMIP3 95% prediction interval. In (a), the letters O, B, and L indicate the locations of the Sea of Okhotsk, the Bering Sea, and the Labrador Sea, respectively.
Citation: Journal of Climate 26, 2; 10.1175/JCLI-D-12-00537.1
Figures 2a–c show the fraction of variance explained

Summary of ensemble regression for CMIP5 and CMIP3. Fraction of variance explained
Figures 2d–f show the estimated regression slopes for the CMIP5 and CMIP3 ensembles. These also are broadly similar for CMIP5 and CMIP3, with negative slopes along the ice edge. Differences between
Figure 3 shows detailed comparisons over the Labrador Sea (Figs. 3a–c), the Sea of Okhotsk (Figs. 3d–f), and the Bering Sea (Figs. 3g–i). These locations were chosen because the Labrador Sea is an example of a location of close agreement between CMIP3 and CMIP5, whereas over the Sea of Okhotsk and Bering Sea differences between

Scatterplots comparing linear fits to intermodel relationships in surface air temperature (SAT) over (a)–(c) the Labrador Sea (60°N, 54.4°W), (d)–(f) the Sea of Okhotsk (55°N, 150°E), and (g)–(i) the Bering Sea (57.5°N, 174.4°E). These locations are marked in Fig. 1a. CMIP5 output is shown in (a),(d), and (g) and CMIP3 output is shown in (b),(e), and (h). The black solid (dashed) lines show the ensemble regression (ensemble mean) predicted mean response in scaled SAT, and the blue solid (dashed) lines show the 95% prediction interval. In (c) and (f), the change from individual CMIP5 model projections is compared with estimates of those projections based on feeding the CMIP5 historical mean state into the CMIP3 regression relationships. The vertical lines show the prediction intervals. The arrows in (g) and (i) indicate model 5.8.
b. Influential models in ensemble regression
Figure 4 shows Cook’s distances averaged over the subarctic, which suggest that model 5.8 in CMIP5 and model 3.10 in CMIP3 are by far the most influential in the ensemble regressions for these experiments. This is consistent with the above finding that model 5.8 has a large influence over the Bering Sea. The spatial distribution of the Cook’s distance of model 5.8 shows that its large influence occurs almost entirely over the central and southern Bering Sea (Fig. 5a). This is in contrast with model 3.10, which shows large Cook’s distances in many regions (Fig. 5b). In their CMIP3 study, BS12 also identified model 3.10 as having large influence (based on leverage diagnostics; see appendix) and omitted it since it is known to have an unrealistically small poleward ocean heat transport at midlatitudes (Arzel et al. 2006). Model 5.8 merits more detailed investigation to find out why it was identified as influential in CMIP5. Model 5.6 exhibits the next largest subarctic Cook’s distance after model 5.8. Interestingly, this model has been found to be an outlier in terms of September Arctic sea ice extent (Massonnet et al. 2012). The CMIP5 ensemble regression estimates were not substantially different in sensitivity tests where model 5.6 was removed (not shown).
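Cook’s distance for each model at a single grid point can be sketched as follows, using the standard textbook formula for a linear regression with an intercept and one predictor. The function and variable names are hypothetical, and this is an illustration of the diagnostic rather than the code used in the study.

```python
import numpy as np

def cooks_distance(x, y):
    """Cook's distance of each point in a simple linear regression of y on x.

    x: present-day state of each model at one grid point, shape (n,)
    y: scaled projected change of each model at that grid point, shape (n,)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Design matrix with an intercept column; p is the number of parameters.
    X = np.column_stack([np.ones(n), x])
    p = X.shape[1]

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Leverage: diagonal of the hat matrix H = X (X^T X)^{-1} X^T.
    xtx_inv = np.linalg.inv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Residual variance estimate with n - p degrees of freedom.
    s2 = resid @ resid / (n - p)

    return resid ** 2 / (p * s2) * leverage / (1.0 - leverage) ** 2
```

Applying such a function at every grid point and then averaging the result over 50°–70°N would give a per-model summary comparable in spirit to the subarctic averages discussed above.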

Subarctic averages (area weighted over 50°–70°N) of Cook’s distances for (a) the CMIP5 models and (b) the CMIP3 models.

Spatial distribution of Cook’s distance for (a) model 5.8 and (b) model 3.10.
4. Conclusions
This intercomparison of emergent constraints in CMIP3 and CMIP5 has found that the inverse relationship of wintertime polar amplification of Arctic warming to the mean present-day temperatures simulated by models is generally robust. The significant correlations along the winter sea ice edge, identified in CMIP3 by Räisänen (2007) and Knutti et al. (2010), are also present in CMIP5. It is likely that the same physical mechanism explains this in both cases: near the ensemble mean ice edge, models with too much sea ice in present-day climate tend to have more warming in the future, since the transition from sea ice to open ocean as ice retreats gives large surface warming (Holland and Bitz 2003). However, if the emergent constraint were caused by structural model error, some similarity might still be expected, since the models in CMIP5 have evolved from those in CMIP3 and so the two ensembles are not completely independent.
The ensemble regression slopes show the same broad spatial pattern in CMIP5 and CMIP3. Over the North Atlantic and most of the Arctic, the differences between CMIP5 and CMIP3 regression slopes could easily have arisen because of sampling variation. There are some significant differences over the Sea of Okhotsk and the Bering Sea. Over the Bering Sea, this is attributable to a model with a large local influence on the ensemble regression slope (discussed below). Over the Sea of Okhotsk, the reason for the differences is less clear. One possibility is that the emergent constraint is less robust in this region and prone to changes in structural errors, possibly related to model improvements such as higher horizontal resolution. However, a comparison between “high” and “low” resolution subsets of the CMIP3 ensemble showed no indication of a sensitivity of regression slope to resolution (not shown). Alternatively, over the Sea of Okhotsk the intermodel spread in historical climatology is clearly smaller in CMIP5 than in CMIP3 (Figs. 3d,e), but the range of projected changes is similar. The consequence is a steeper slope in CMIP5.
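The link between a narrower spread in the historical climatology and a steeper slope follows from the standard expression for a least squares slope. In the notation assumed here (not taken from the original), $x$ is the present-day state, $y$ the scaled projected change, $s_x$ and $s_y$ their intermodel standard deviations, and $r$ their intermodel correlation:

```latex
\hat{\beta} \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\;=\; r \, \frac{s_y}{s_x}
```

For a given correlation $r$ and spread $s_y$ in the projected changes, a smaller $s_x$ therefore gives a slope of larger magnitude.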
It is possible that significance in the differences between the parameters estimated from the CMIP3 and CMIP5 ensembles is overestimated here because of the effective number of independent models being smaller than the actual number of models (e.g., Jun et al. 2008; Masson and Knutti 2011). However, it should be noted that it is the regression residuals in model temperatures, rather than the model temperatures, that are assumed to be independently distributed. The inclusion of a dependence on the basic state helps to produce residuals that appear to be identically and independently distributed. This point is supported by a lack of obvious clustering of related models in the scatterplots of past–future relationships in gridpoint surface temperature shown in Fig. 2 of BS12.
A Cook’s distance diagnostic identified model 5.8 in CMIP5 and model 3.10 in CMIP3 as the most influential models over the Arctic. The large Cook’s distance exhibited by model 3.10 shows that the large leverage found by BS12 for this model is causing a large influence on ensemble regression slopes. The issues with model 3.10 are well documented and understood. However, it is not yet clear what the reasons are for the strong influence of model 5.8 over the Bering Sea. This is an important issue to investigate in the future but is beyond the scope of this paper.
This study is part of the British Antarctic Survey Polar Science for Planet Earth Programme funded by the Natural Environment Research Council. Three anonymous reviewers are thanked for their constructive comments, which helped to significantly improve the manuscript. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups (listed in Table 2 of this paper) for producing and making available their model output. For CMIP, the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. The European Centre for Medium-Range Weather Forecasts is thanked for providing the ERA-Interim dataset. DBS also thanks David Sexton and Mat Collins for useful comments on structural errors and emergent constraints.
APPENDIX
Prediction Uncertainty, Leverage, and Influence
This appendix gives a brief explanation of the concepts of prediction uncertainty, leverage, and influence relevant to ensemble regression. The reader is referred to standard textbooks on linear models for more comprehensive discussion (e.g., Faraway 2005).
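As a point of reference, the standard textbook 95% prediction interval for a simple linear regression can be written as below. The symbols are assumptions chosen here and may differ from the notation of BS12: $x_i$ and $y_i$ are the present-day state and scaled change of model $i$, $n$ is the number of models, $x_0$ is the present-day state at which a prediction is made (for example, an observational estimate), and $e_i$ are the regression residuals.

```latex
\hat{y}_0 \;\pm\; t_{n-2,\,0.975}\; s \,\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}},
\qquad
\hat{y}_0 = \hat{\alpha} + \hat{\beta} x_0,
\qquad
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2
```

The interval widens as $x_0$ moves away from the ensemble mean $\bar{x}$, so predictions are most precise where the reference state lies within the range of the model states.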



With regard to leverage and Cook’s distance diagnostics, it is useful to write down the regression model in matrix form as
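The standard textbook matrix form of a linear regression, together with the usual definitions of leverage and Cook’s distance (e.g., Faraway 2005), can be sketched as follows. The notation is an assumption made here: $\mathbf{y}$ is the $n$-vector of scaled changes, $\mathbf{X}$ is the $n \times p$ design matrix (a column of ones and the present-day states, so $p = 2$), $e_i$ are the residuals, and $s^2$ is the residual variance estimate.

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\hat{\mathbf{y}} = \mathbf{H}\mathbf{y},
\qquad
\mathbf{H} = \mathbf{X}\left(\mathbf{X}^{\mathsf{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathsf{T}}

h_{ii} = [\mathbf{H}]_{ii} \quad \text{(leverage of model } i\text{)},
\qquad
D_i = \frac{e_i^2}{p\, s^2}\,\frac{h_{ii}}{(1 - h_{ii})^2} \quad \text{(Cook's distance of model } i\text{)}
```

Leverage depends only on how far a model's present-day state lies from the ensemble mean, whereas Cook’s distance combines leverage with the size of the residual and so measures how much the fitted values change when that model is removed.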




REFERENCES
Arzel, O., T. Fichefet, and H. Goosse, 2006: Sea ice evolution over the 20th and 21st centuries as simulated by current AOGCMs. Ocean Modell., 12, 401–415.
Boe, J. L., A. Hall, and X. Qu, 2009: September sea-ice cover in the Arctic Ocean projected to vanish by 2100. Nat. Geosci., 2, 341–343, doi:10.1038/ngeo467.
Bracegirdle, T. J., and D. B. Stephenson, 2012: Higher precision estimates of regional polar warming by ensemble regression of climate model projections. Climate Dyn., 39, 2805–2821, doi:10.1007/s00382-012-1330-3.
Collins, M., R. E. Chandler, P. M. Cox, J. M. Huthnance, J. Rougier, and D. B. Stephenson, 2012: Quantifying future climate change. Nat. Climate Change, 2, 403–409, doi:10.1038/nclimate1414.
Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597, doi:10.1002/qj.828.
Faraway, J. J., 2005: Linear Models with R. Chapman & Hall/CRC, 229 pp.
Frierson, D. M. W., 2006: Robust increases in midlatitude static stability in simulations of global warming. Geophys. Res. Lett., 33, L24816, doi:10.1029/2006GL027504.
Hall, A., and X. Qu, 2006: Using the current seasonal cycle to constrain snow albedo feedback in future climate change. Geophys. Res. Lett., 33, L03502, doi:10.1029/2005GL025127.
Hawkins, E., and R. Sutton, 2009: The potential to narrow uncertainty in regional climate predictions. Bull. Amer. Meteor. Soc., 90, 1095–1107.
Holland, M. M., and C. M. Bitz, 2003: Polar amplification of climate change in coupled models. Climate Dyn., 21, 221–232.
Jun, M., R. Knutti, and D. Nychka, 2008: Spatial analysis to quantify numerical model bias and dependence: How many climate models are there? J. Amer. Stat. Assoc., 103, 934–947, doi:10.1198/016214507000001265.
Knutti, R., R. Furrer, C. Tebaldi, J. Cermak, and G. A. Meehl, 2010: Challenges in combining projections from multiple climate models. J. Climate, 23, 2739–2758.
Lemke, P., and Coauthors, 2007: Observations: Changes in snow, ice and frozen ground. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 337–383.
Masson, D., and R. Knutti, 2011: Climate model genealogy. Geophys. Res. Lett., 38, L08703, doi:10.1029/2011GL046864.
Massonnet, F., T. Fichefet, H. Goosse, C. M. Bitz, G. Philippon-Berthier, M. M. Holland, and P.-Y. Barriat, 2012: Constraining projections of summer Arctic sea ice. Cryosphere, 6, 1383–1394.
Meehl, G. A., C. Covey, T. Delworth, M. Latif, B. McAvaney, J. F. B. Mitchell, R. J. Stouffer, and K. E. Taylor, 2007: The WCRP CMIP3 multimodel dataset: A new era in climate change research. Bull. Amer. Meteor. Soc., 88, 1383–1394.
Räisänen, J., 2007: How reliable are climate models? Tellus, 59, 2–29.
Räisänen, J., L. Ruokolainen, and J. Ylhäisi, 2010: Weighting of model results for improving best estimates of climate change. Climate Dyn., 35, 407–422, doi:10.1007/s00382-009-0659-8.
Rignot, E., J. L. Bamber, M. R. van den Broeke, C. Davis, Y. H. Li, W. J. van de Berg, and E. van Meijgaard, 2008: Recent Antarctic ice mass loss from radar interferometry and regional climate modelling. Nat. Geosci., 1, 106–110, doi:10.1038/ngeo102.
Stephenson, D. B., M. Collins, J. C. Rougier, and R. E. Chandler, 2012: Statistical problems in the probabilistic prediction of climate change. Environmetrics, 23, 364–372, doi:10.1002/env.2153.
Stroeve, J., M. M. Holland, W. Meier, T. Scambos, and M. Serreze, 2007: Arctic sea ice decline: Faster than forecast. Geophys. Res. Lett., 34, L09501, doi:10.1029/2007GL029703.
Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. Bull. Amer. Meteor. Soc., 93, 485–498.