1. Introduction
How much will our greenhouse gas emissions warm our planet? This is a defining question of our time. The magnitude of this warming is usually characterized in terms of the equilibrium climate sensitivity (ECS), which is the global-average surface temperature response to doubling CO2 from preindustrial conditions and letting the planet return to equilibrium. Because the planetary response to future changes in atmospheric composition is difficult to determine based on observations of past and current climate (Collins et al. 2013), ECS is often estimated using global climate models (GCMs). Despite its importance, predictions of ECS from different GCMs vary by a factor of 2 (Flato et al. 2013) and intermodel spread in ECS has not decreased substantially over time (Charney et al. 1979; Knutti and Hegerl 2008; Andrews et al. 2012; Knutti et al. 2017). Unsurprisingly, this continued uncertainty has led to a desire to identify models that are more trustworthy. A natural way to do this is to assume that models that more accurately reproduce the current climate are more likely to capture its changes correctly. Unfortunately, models that perform well for some metrics may perform poorly for others (Gleckler et al. 2008), climate predictions from skillful models do not always agree (Waugh and Eyring 2008), and the ability to reproduce current climate does not necessarily imply predictive skill. Thus another popular approach (which is the focus of this paper) is to identify quantities in the current climate that have skill at predicting future changes in GCMs. The strength of correlation between predictor and predictand across an ensemble of GCMs is typically used to measure the explanatory power of a potential relationship. Observed values of current-climate predictors can then be used to choose which GCM predictions are most credible. These current-climate predictors are commonly called emergent constraints.
One problem with emergent constraints is that large intermodel correlations between current-climate and future-climate quantities are expected by chance in multimodel databases (Masson and Knutti 2013; Caldwell et al. 2014). As a result, emergent constraints without a solid physical basis should be viewed with skepticism. Unfortunately, most emergent constraints in the published literature lack a satisfying physical explanation. This is understandable because the climate system is complex and difficult to distill into simple physical relationships. Identifying these potential emergent constraints is an important and natural first step toward uncovering real constraints. Since the majority of recently proposed emergent constraints imply more severe sensitivity to greenhouse gases (Klein and Hall 2015), evaluating the credibility of predictions from emergent constraints has significant societal importance.
The goal of this paper is to evaluate the credibility and independence of previously published emergent constraints. Our sources of data are described in section 2 and the constraints we test are introduced in section 3. Section 4 provides a short primer on statistical significance of correlations before the independence of these emergent constraints is investigated in section 5. In section 6, a new method for decomposing correlation between ECS and an emergent constraint is introduced and used to understand the physical mechanisms underpinning the success of each tested constraint. A discussion and conclusions follow in section 7.
2. Data

Cloud feedbacks in the tables used for CMIP3 and CMIP5 models were computed using the adjusted cloud radiative forcing technique (Soden et al. 2004, 2008; Shell et al. 2008). Feedback terms unrelated to clouds were computed by converting the relevant physical quantities into TOA radiative perturbations using radiative kernels (Held and Soden 2000; Soden et al. 2008; Shell et al. 2008). For CMIP3, kernels were simply multiplied by the net change in the relevant physical quantity from Intergovernmental Panel on Climate Change Special Report on Emissions Scenarios (SRES) A1B simulations and normalized by global-average surface warming to obtain feedback values. For CMIP5 data, the Soden et al. (2008) kernels were used to compute radiative perturbations (with respect to contemporaneous preindustrial control climatologies) for each year of the 150-yr-long abrupt 4×CO2 simulations. These values were then linearly regressed against corresponding globally averaged changes in surface temperature
For both CMIP3 and CMIP5 ensembles, the data used in this study are computed without running experiments to equilibrium. Armour et al. (2013) and Rose et al. (2014) showed that the strength of the net feedback depends on the background climate state. In particular, ECS estimates tend to increase as model runs are extended (Williams et al. 2008; Winton et al. 2010; Andrews et al. 2012, 2015). In Fig. S1 in the online supplemental material we test the impact of temporal variation in net feedback by repeating some of our analysis using just the first 20 years of each 4×CO2 run, by using just years 21–150, and by using all years between 1 and 150. This figure shows that changes in simulation period have little effect on our results. Because net feedback is likely to continue changing beyond the 150 years evaluated here, our ECS estimates are probably best described as “effective climate sensitivities,” which are underestimates of the true ECS. In spite of the approximate nature of these values, the difference between equilibrium and effective climate sensitivity is probably a second-order effect [as suggested by Fig. 2 of Andrews et al. (2015)] and simulations that would allow us to compute something more akin to “true” ECS (e.g., coupled 2×CO2 simulations extending thousands of years) are not available for most CMIP5 models.
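As an illustration of the regression step and of the period-sensitivity test described above, the following minimal Python sketch fits a feedback to synthetic data over the three periods considered in Fig. S1. The warming trajectory, noise level, prescribed feedback strength, and the helper name `ols_slope` are all hypothetical; none are taken from a CMIP model, and this is not the code used for this study.

```python
import math
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
years = list(range(1, 151))

# Hypothetical global-mean warming (K) in a 150-yr abrupt 4xCO2 run.
dT = [4.0 * (1.0 - math.exp(-yr / 30.0)) + 0.5 * yr / 150.0 for yr in years]

# Hypothetical TOA radiative perturbation (W m-2) from one feedback process:
# a prescribed feedback of 0.5 W m-2 K-1 plus interannual noise.
TRUE_FEEDBACK = 0.5
dR = [TRUE_FEEDBACK * t + random.gauss(0.0, 0.2) for t in dT]

# Repeat the regression over the three periods compared in the text.
periods = {"years 1-20": (1, 20), "years 21-150": (21, 150), "years 1-150": (1, 150)}
feedback = {}
for label, (start, end) in periods.items():
    x = dT[start - 1:end]
    y = dR[start - 1:end]
    feedback[label] = ols_slope(x, y)
    print(f"{label}: feedback = {feedback[label]:.2f} W m-2 K-1")
```

Because the synthetic feedback is constant in time, the three fits differ only through noise; in a real model, any difference between periods would also reflect state dependence of the feedback.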
ECS values from CMIP3 simulations run to equilibrium with fixed deep-ocean heat transports and a shallow “slab” ocean layer are available from Table 8.2 of Randall et al. (2007); these slab ocean ECS values are somewhat different (the correlation between slab and SRES A1B ECS values is 0.63) but switching datasets does not change any of our conclusions. We use SRES A1B values for F and ECS to maintain consistency with process-level feedback values, which are not available for slab runs.
CMIP3 and CMIP5 data differ in several important ways. First, water vapor feedback for CMIP3 data was computed as the TOA radiative impact of change in specific humidity while for CMIP5 data water vapor feedback was computed as the TOA radiative impact of relative humidity (RH) change [as advocated by Held and Shell (2012)]. This change in definition requires compensatory changes in Planck and lapse-rate feedbacks. Using fixed-RH feedbacks has little impact on intermodel differences of the Planck feedback (which are small regardless of how they are calculated) but reduces the strong anticorrelation between water vapor and lapse rate found in earlier studies. Additionally, CMIP3 calculations are done on runs where both greenhouse gases and aerosols are varying in time, while CMIP5 simulations test only the impact of greenhouse gas changes. These differences in treatment of CMIP3 and CMIP5 model output force us to consider CMIP3 and CMIP5 models separately in our decomposition. For further details about how feedbacks, forcing, and ECS are calculated for each ensemble, consult the original data sources cited above.
3. Survey of potential emergent constraints studied
In this section we provide a short overview of each of the 19 proposed emergent constraints analyzed in this paper. For each constraint, we provide the following:
- a description of the constraint (also summarized in Table 1 for quick reference),
- the proposed explanation for why this constraint is a good predictor of ECS,
- an a priori expectation of the sign and magnitude of correlation between the predictor and ECS, and
- an initial evaluation of each constraint based on previous literature and correlations computed for this study (summarized in Table 2).
Table 1. Short description of each emergent constraint tested in this paper.

Table 2. Correlations of emergent constraints with ECS as reported in their original papers and as computed using the subsets of models for which we have constraint information as well as forcing and feedback components. Except for reported values from Covey (which used CMIP1 data), columns 2 and 3 report a single number if the study combined CMIP3 and CMIP5 models and otherwise report individual CMIP3 and CMIP5 values separated by a slash (/). Values in bold are significant at 90% confidence using a Student’s t test assuming independent models (which is an overly permissive test; see text for details). An asterisk is used where no data are available.

Most constraint data used here come directly from the studies introducing that constraint. Because not all models used in these previous studies provide information necessary for our decomposition, we also provide correlations in Table 2 computed using the subset of models that provide all data we need. Correlation with a subset of models provides a weak sense of the robustness of our conclusions; testing on new ensembles would provide a more rigorous test. Because the first five studies we consider were published before CMIP5 data were available, we are able to test them against data they were not trained on by computing these constraints ourselves. Constraints that persist across ensembles are unlikely to occur by random chance, although it is worth mentioning that models used in CMIP5 are modified versions of models used in earlier intercomparisons (Pennell and Reichler 2011; Knutti 2010; Knutti et al. 2013, and references therein), so successive CMIP ensembles are not themselves completely independent. It is also worth noting that a real constraint may be present in one ensemble but not in another if the models used in those two ensembles were structurally different. For example, an emergent constraint might be detected in CMIP5 but not CMIP3 if it resulted from a process that was added for the first time in CMIP5 models. Alternatively, a constraint might appear in CMIP3 but not CMIP5 if all developers worked to make sure their models satisfied a constraint identified in CMIP3, thus getting rid of all spread in that predictor in CMIP5. While both of these scenarios are possible in theory, it is hard to imagine how model changes between CMIP3 and CMIP5 would affect any of the 19 constraints considered. As a result, we use reproducibility of a constraint across ensembles as a measure of its credibility.
This study gathers together more previously proposed constraints than any single previous study, but it is not itself exhaustive. Other studies were omitted because we were not aware of them while writing this paper, because they have already been shown to not be robust to changes in ensemble (e.g., Klocke et al. 2011), because they propose more constraints than our analysis can handle (Huber et al. 2011), or because computing them for CMIP5 models was too technically challenging given our available time (Shukla et al. 2006; Webb et al. 2015). Our scope is also limited by our focus on ECS, which precludes studies focused on other aspects of the climate system (e.g., Hall and Qu 2006; Cox et al. 2013). Defining emergent constraints relative to specific feedbacks rather than to a more integrative quantity like ECS would perhaps be preferable because it makes articulating a clear physical explanation for emergent relationships easier (Klein and Hall 2015). Furthermore, because the climate system is so complex, it is hard to believe that a single physical mechanism exists that can explain most of the intermodel spread in climate sensitivity (and therefore have very large correlation with ECS). Nonetheless, constraints on ECS are worth pursuing because they have the most value at reducing climate change uncertainty. Constraints on an individual feedback may be easier to find, but their practical utility is limited if that feedback does not project strongly onto ECS. We include Qu et al. (2013) in our study even though it was not previously tested on ECS because its mechanism (tropical low clouds) is known to be important for ECS. We also tested the constraints proposed in Gordon and Klein (2014) and McCoy et al. (2016), which both target high-latitude clouds, but ultimately omitted them from this study because they were poorly correlated with ECS; we take this to mean that only constraints on tropical clouds have a strong impact on ECS.
a. Covey
Covey et al. (2000) and Knutti et al. (2006) suggest that the strength of the hemisphere-averaged seasonal cycle of surface temperature may be a good proxy for the sensitivity of the planet to greenhouse gas changes because both are climate responses to radiative forcing changes. Models with a larger seasonal cycle are therefore theorized to have a stronger response to CO2 increase. Because surface air temperature is controlled by many factors, some (like ocean circulation) occurring on time scales longer than a single season, this constraint is likely to be relatively weak.
Because Covey et al. (2000) used data from the CMIP1 archive and Knutti et al. (2006) used data from an ensemble of simulations using a single GCM with perturbed tuning parameters, we compute our own Covey values for CMIP3 and CMIP5 ensembles. For each model, we compute the Covey value by taking the Northern Hemisphere average of the climatological surface temperature difference between January and July minus a similar quantity defined over the Southern Hemisphere. Climatological averages are computed using all available data from 20C3M and historical simulations (for CMIP3 and CMIP5 models, respectively). As in Covey et al. (2000), no attempt was made to correct for drift. As in all computations performed for this paper, computed values are the average over all available ensemble members. Sufficient data (including information to compute surface temperature, ECS, and F and λ components for our decomposition) were available for 12 CMIP3 models and 27 CMIP5 models. Covey et al. (2000) found a correlation of +0.4 between ECS and their constraint for 17 CMIP1 models; we find correlations of −0.36 and +0.35 for CMIP3 and CMIP5 data (respectively). Lack of consistency between ensembles suggests that the Covey constraint may not be robust, but the size of each sample is small (a problem with all statistical studies based on the CMIP archive) and the correlation we are seeking is weak, so false negatives are possible. As noted in Fasullo et al. (2015), perturbed physics ensembles (which typically have many more samples) may be more appropriate for teasing out small correlations like this. Unfortunately, relationships from perturbed physics ensembles often do not generalize to other collections of models (Sanderson 2011; Klocke et al. 2011; Masson and Knutti 2013).
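To put these correlation magnitudes in context, the sketch below evaluates a two-tailed Student's t test for a Pearson correlation, assuming independent models, for the three correlation and sample-size pairs quoted above. The statistic t = r sqrt[(n − 2)/(1 − r²)] with n − 2 degrees of freedom is the standard textbook form. The function names and the numerical integration used to obtain the p value are illustrative choices and are not necessarily the procedure used to produce Table 2.

```python
import math

def t_pdf(x, dof):
    """Probability density of Student's t distribution with dof degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)

def correlation_p_value(r, n, steps=2000):
    """Two-tailed p value for a Pearson correlation r from n independent samples."""
    dof = n - 2
    t_stat = abs(r) * math.sqrt(dof / (1 - r * r))
    # Integrate the t density from 0 to |t| with Simpson's rule (steps is even).
    h = t_stat / steps
    total = t_pdf(0.0, dof) + t_pdf(t_stat, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_mass = 2 * total * h / 3  # probability that |T| < |t|
    return 1 - central_mass

# (ensemble, correlation with ECS, number of models) as quoted in the text.
cases = [
    ("CMIP1", 0.40, 17),
    ("CMIP3", -0.36, 12),
    ("CMIP5", 0.35, 27),
]
p_values = {}
for label, r, n in cases:
    p_values[label] = correlation_p_value(r, n)
    print(f"{label}: r = {r:+.2f}, n = {n}, two-tailed p = {p_values[label]:.3f}")
```

The same correlation magnitude can fall on either side of a fixed confidence threshold depending only on the number of models, which is one reason small ensembles make false negatives (and false positives) likely.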
b. Volodin
Volodin (2008) found a strong correlation in CMIP3 models between ECS and the gradient in total cloudiness between the tropics (taken to be between 28°N and 28°S latitude) and southern midlatitudes (between 36° and 56°S) for years 1980–2000. He hypothesized that cloud response to climate change may be governed by the same mechanisms that cause cloud fraction to decrease with increasing sea surface temperature (SST) as one moves equatorward. This means that models with stronger (more negative) latitudinal cloudiness gradients will have higher ECS. Volodin’s logic seems dubious because latitudinal variations in cloudiness are affected not only by local SST but also by the large-scale circulation. Nonetheless, when we compute Volodin values for the CMIP5 archive, we find that strong negative correlation is maintained (Table 2). Because the Volodin constraint was not trained on the CMIP5 dataset, this is a strong test of constraint robustness. A modern variant on the Volodin approach is described in the Siler section below.
c. Trenberth
The Southern Hemisphere averaged TOA energy balance between 1990 and 2000 was found to be correlated with ECS in CMIP3 models by Trenberth and Fasullo (2010). Their explanation is that models tend to predict increased cloudiness (negative cloud feedback) over the Southern Ocean in a warmer climate, but that is only possible because these models strongly underpredict the extremely high observed cloud fraction in this area. Models with more realistic clouds (and hence less positive TOA radiative imbalance) are expected to have less cloud increase in this area and correspondingly higher ECS. When we calculate Trenberth values for CMIP5 data and compute the resulting correlation with ECS, we get a negligibly small value. Grise et al. (2015) performed a similar calculation and arrived at the same conclusion. Upon further investigation, Grise et al. found that the correlation between Southern Hemisphere TOA radiation and ECS in CMIP3 models came as much from subtropical stratocumulus/trade cumulus areas as from the Southern Ocean. Further, connection between the Southern Ocean and ECS was found to only occur in models with excessively reflective present-day subtropical clouds (which includes most CMIP3 models but only half of the CMIP5 models). The connection between Southern Ocean and subtropical clouds seems to be an artifact of tuning (Grise et al. 2015; Kay et al. 2016; McCoy et al. 2016). Because Southern Ocean TOA radiation biases were not found to be well correlated with ECS in the full set of CMIP5 models and because the physical explanation for such a correlation is unclear, Grise et al. (2015) conclude that Southern Ocean TOA biases are not a valid emergent constraint. They conclude instead that Southern Hemisphere TOA radiation is correlated with ECS primarily through stratocumulus-to-trade-cumulus transition regions, which have greater scope for cloud reduction when they are more extensive in the current climate.
d. Fasullo M and D
In Fasullo and Trenberth (2012), the authors correlated May–August zonal-mean present-day RH from 1980–2000 against ECS for CMIP3 models and identified the two regions of largest correlation. One of these regions (denoted D) lies in the subtropical midtropospheric dry zone between approximately 20° and 8.5°S and 440 to 350 mb. The other region (denoted M) lies in the moist convective region between 1.5°S and 10°N latitude and 740 to 570 mb. The physical mechanisms governing these correlations are unclear, so it is impossible to make an a priori prediction of the sign or magnitude of these correlations. Because correlation with ECS was only computed for CMIP3 models in the Fasullo paper, we compute our own values of the Fasullo metrics for the 9 CMIP3 models and 23 CMIP5 models with sufficient data. Our correlations of M and D with ECS are very similar to Fasullo and Trenberth (2012) values for CMIP3 data but have very weak magnitude when applied to CMIP5 data. This surprising result can be confirmed and understood by comparing Fig. 3 herein and Fig. S4 from Fasullo and Trenberth (2012). These figures show the correlation between ECS-like quantities and climatological- and zonal-average RH as a function of latitude and height for CMIP3 and CMIP5 models, respectively. While it is true that the general structure of these plots looks similar, the M zone of positive correlation has completely disappeared in the CMIP5 plot and the region of negative correlation in the subtropics has shifted toward the surface and has weakened relative to Fasullo’s D region. Correlations in Fasullo and Trenberth’s (2012) Fig. S4 over the M and D boxes as defined in that paper are consistent with the values reported in our Table 2. Thus, while patterns of RH over the entire tropics [as advocated by Su et al. (2014), described later] may end up being a useful predictor of climate change, the specific regions identified by Fasullo are almost certainly spurious.
e. Qu
Bretherton and Blossey (2014) provide a physical explanation for the Qu result based on large-eddy simulations (LES): warmer temperatures increase boundary layer (BL) cloud-layer humidity fluxes for a given liquid water path, which increases cloud-top entrainment drying and hence reduces BL cloud mass and fraction. Because this mechanism operates on time scales much shorter than the variability sampled by Qu et al. (2013), short- and long-term behavior should be identical where this mechanism is dominant. Proving that the LES-based Bretherton and Blossey mechanism also explains the time scale invariance found in much coarser/cruder GCM simulations analyzed by Qu et al. (2013) is important future work.
f. Klein TCA and ctp-tau
Klein et al. (2013) provide metrics of model skill at reproducing present-day total cloud amount (TCA) and combined cloud-top pressure and optical depth (ctp-tau), which are strongly correlated with cloud feedback
g. Su
Su et al. (2014) show that changes in tropical clouds can be predicted by changes in the Hadley circulation in which they are embedded. They find that the quality of a model’s representation of the present-day Hadley circulation is a good predictor of its ECS value. While it makes sense that cloud (and hence ECS) changes would follow Hadley cell changes, the linkage between a model’s representation of the present-day Hadley circulation and its future change is unclear. In particular, Fig. 1 of Su et al. (2014) suggests that the relationship between the mean state and future changes in the Hadley circulation is complicated. This missing piece precludes an a priori prediction for the strength or sign of the Su constraint. The Su constraint is computed by calculating zonal average profiles from the surface to 100 mb of cloud fraction and RH between 45°S and 40°N for both model output and observations, then calculating measures of model quality by either taking the slope of the regression between modeled and observed profiles for each latitude and averaging over latitudes or by computing the spatial correlation between modeled and observed values. Metrics defined with respect to RH or cloud fraction and using the slope or spatial correlation to calculate error provide similar skill and emergent constraint decomposition information, so we use the regression slope of the RH metric (chosen because it has greatest skill) for the remainder of this paper. Although Su et al. (2014) report results only for CMIP5 data, we do not compute this constraint for CMIP3 data because the calculation is complicated and requires observational data that we do not have readily available.
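The regression-slope form of this metric can be sketched as follows. This is a minimal illustration under stated assumptions: the modeled profile is regressed on the observed profile (the text does not specify the direction), slopes are averaged over latitudes without area weighting, and the RH profiles, pressure levels, and function names are synthetic or hypothetical rather than taken from Su et al. (2014).

```python
import math

def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def slope_metric(model_profiles, obs_profiles):
    """Mean over latitudes of the slope of modeled vs observed vertical profiles."""
    slopes = [
        regression_slope(obs_profiles[lat], model_profiles[lat])
        for lat in sorted(obs_profiles)
    ]
    return sum(slopes) / len(slopes)

# Synthetic zonal-mean RH profiles (%) from the surface to 100 mb, 45S to 40N.
levels_mb = [1000, 850, 700, 500, 300, 200, 100]
latitudes = list(range(-45, 45, 5))  # -45, -40, ..., 40
obs_rh = {
    lat: [40.0 + 0.04 * p * math.cos(math.radians(lat)) for p in levels_mb]
    for lat in latitudes
}

# Two hypothetical models: one matching observations, one with damped structure.
perfect_model = {lat: list(profile) for lat, profile in obs_rh.items()}
damped_model = {lat: [0.8 * v + 5.0 for v in profile] for lat, profile in obs_rh.items()}

print(f"perfect model slope metric: {slope_metric(perfect_model, obs_rh):.2f}")
print(f"damped model slope metric:  {slope_metric(damped_model, obs_rh):.2f}")
```

A slope near 1 indicates that the model reproduces the amplitude of the observed vertical structure; a constant offset alone does not change the slope.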
The Su constraint is very similar to an earlier proposal in Volodin (2008), who noted that the error in zonally averaged RH over certain regions in the tropical midtroposphere and BL is well correlated with ECS in CMIP3 models. We do not analyze the Volodin RH constraint here because its methodology is unclear and it involves observational datasets we do not have available. If the region of calculation for the Volodin constraint is functionally equivalent to that used by Su and the observations used in both studies are compatible, then the Volodin and Su studies may be taken together as evidence that the Su constraint is valid in both CMIP3 and CMIP5 datasets.
h. Sherwood D, S, and LTMI
Sherwood et al. (2014) provide three indices of lower-tropospheric mixing in the current climate that are correlated with ECS. Because direct measures of lower-tropospheric mixing are not available for most models in the CMIP archive, these indices are somewhat indirect. The first index (called S) is meant to measure mixing between the BL and the lower troposphere in the convective parameterizations active in the ascending branch of the tropical overturning circulation. It is calculated as the average of the vertical gradients between 700 and 850 mb of RH and temperature (normalized to receive equal weight and signed so smaller gradients make S more positive) averaged over the west Pacific warm pool. Because temperature and moisture typically decrease with height and mixing moves heat and moisture upward in this region, S becomes more positive as mixing between the BL and lower free troposphere increases. The second index (called D) is framed in terms of differences in resolved-scale vertical velocity with height, so it captures resolved-scale mixing. It measures the fraction of BL air in ascending columns in the tropical east Pacific and tropical Atlantic that leaves the column in the midtroposphere rather than in the upper troposphere. The third index, called the lower-tropospheric mixing index (LTMI), is simply the sum of S and D.
There are several pieces to the physical explanation for correlation between ECS and S, D, or LTMI. First, global-average precipitation and evaporation must be equal on multiyear time scales (because the atmosphere’s ability to stockpile moisture is very limited) and are expected to increase by about 2% for each degree Celsius of
i. Brient cloud shallowness
Brient et al. (2016) build upon Sherwood et al. (2014) by noting that while strengthening of shallow convective drying acts to decrease BL clouds as the planet warms, reductions in BL turbulent moisture flux are also important. Brient et al. argue that intermodel spread in both of these quantities is needed to fully explain future changes in shallow convective cloudiness. They use the fraction of clouds below 850 mb that are also below 950 mb in current-climate tropical (30°S–30°N) weakly subsiding (pressure velocity between 10 and 30 mb day−1) ocean regions as a proxy for these effects. Models with higher values of this shallowness index in the current climate have stronger influence by convective drying relative to turbulent moistening and are thus expected to have larger reductions in future clouds. While Brient et al. (2016) provide a more complete explanation for cloud changes in shallow-convective areas, its correlation with ECS is only half as large as Sherwood LTMI in our study (0.38 vs 0.65; see Table 2).
j. Zhai
Zhai et al. (2015) found that the seasonal response of boundary layer cloud fraction to sea surface temperature in subsidence regions over the ocean between 20° and 40° latitude in both hemispheres is a strong predictor of ECS in a combination of CMIP3 and CMIP5 models. This constraint is very similar to that of Qu et al. (2013) but uses regions less focused on stratocumulus and generally farther poleward, targets seasonal instead of interannual variability, and does not remove the component of cloud response due to EIS changes before computing
k. Tian
Tian (2015) found the amplitude of erroneous convergence and deep convection in the southeast Pacific (the so-called double-ITCZ bias common in GCMs) to be correlated with ECS in a combination of CMIP3 and CMIP5 models. Formally, the Tian constraint is defined as the annual mean precipitation averaged over the box covering 0°–20°S, 100°–150°W. This relationship lacks a solid explanation. The author does note that Hwang and Frierson (2013) found that models with stronger Southern Ocean cloud biases tended to have a stronger double ITCZ [although Kay et al. (2016) find this relationship to only hold in models with fixed SST]; combining the Hwang and Frierson (2013) result with the Trenberth constraint, one might predict that a stronger double ITCZ and stronger cloud increases over the Southern Ocean in the future (and correspondingly weaker ECS) may both be symptoms of underprediction in Southern Hemisphere clouds. If this were the case, ECS should be negatively correlated with the strength of the double ITCZ across models. Tian also cites Hirota and Takayabu (2012) as finding that slowdown of the Hadley circulation is stronger in models with weaker double ITCZ bias. If this is the case, we might expect the Tian and Su constraints to be related. Tian data are already available for a wide variety of CMIP3 and CMIP5 models so we do not calculate our own values. Unsurprisingly, our correlations between ECS and the Tian constraint are similar to the value from his paper.
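For readers who want to reproduce a box average of this kind, a minimal sketch follows. It assumes cosine-of-latitude area weighting on a regular grid and longitudes expressed in degrees east (so 100°–150°W becomes 210°–260°E); the grid, the precipitation values, and the function name `box_mean` are synthetic or hypothetical and are not Tian's data or code.

```python
import math

def box_mean(field, lats, lons, lat_bounds, lon_bounds):
    """Area-weighted (cos-latitude) mean of field[i][j] over a lat/lon box."""
    lat_min, lat_max = lat_bounds
    lon_min, lon_max = lon_bounds
    weighted_sum = 0.0
    weight_total = 0.0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                weighted_sum += weight * field[i][j]
                weight_total += weight
    return weighted_sum / weight_total

# Box from the text: 0-20S, 100-150W, with longitude in degrees east.
SOUTHEAST_PACIFIC_BOX = ((-20.0, 0.0), (210.0, 260.0))

# Synthetic 2-degree grid of annual-mean precipitation (mm day-1):
# 3 everywhere, 5 inside the southeast Pacific box.
lats = [-89 + 2 * i for i in range(90)]
lons = [1 + 2 * j for j in range(180)]
precip = [
    [5.0 if (-20.0 <= lat <= 0.0 and 210.0 <= lon <= 260.0) else 3.0 for lon in lons]
    for lat in lats
]

tian_like_index = box_mean(precip, lats, lons, *SOUTHEAST_PACIFIC_BOX)
global_mean = box_mean(precip, lats, lons, (-90.0, 90.0), (0.0, 360.0))
print(f"box-mean precipitation: {tian_like_index:.2f} mm day-1")
print(f"global-mean precipitation: {global_mean:.2f} mm day-1")
```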
l. Brient cloud albedo
Brient and Schneider (2016) find that deseasonalized current-climate shortwave cloud albedo response to SST variations in tropical oceanic low-cloud regions (defined as the 25% of ocean grid cells between 30°N and 30°S with driest 500-mb relative humidity) is negatively correlated with ECS in CMIP5 models. This is essentially a variant on the Qu et al. (2013) mechanism using a different region and measure of cloudiness, so we expect it to be correlated with Qu and Zhai constraints. That correlation is shown in section 5 to be strong.
m. Lipat
Lipat et al. (2017) find that the present-day latitude of the southern edge of the Hadley cell in austral summer is a good predictor of ECS in CMIP5 models. Their argument is based on shortwave cloud radiative effect changes in the lower midlatitudes (roughly between 28° and 48°S latitude). Models whose Hadley cell does not extend far into this region experience a large decrease in shortwave cloud radiative effect as the Hadley cell expands, replacing very cloudy midlatitude conditions with a less cloudy subtropical regime. Models whose Hadley cell already extends far into the lower midlatitudes see less change because most of the radiatively sensitive area is already filled with subtropics-type clouds. As a result, we expect Hadley cell edge latitude (signed so that farther south is more positive) to be negatively correlated with ECS. This is borne out in Table 2. Because the Lipat and Su constraints are both related to Hadley cell representation in models, one might expect them to be related.
n. Siler
Siler et al. (2018) generalize upon the finding of Volodin (2008) that intermodel differences in ECS are well predicted by the latitudinal gradient of present-day cloudiness by showing that
Siler et al. (2018) distill their geographic pattern of correlations into a single number for each model by taking the magnitude of the projection of that model’s cloud albedo map onto the map of multimodel correlation between cloud albedo and
Because almost all models used in Siler et al. (2018) have the output needed for our study, our correlations in Table 2 are almost identical to Siler’s. Interestingly, even though the Siler constraint is more sophisticated than Volodin, it does not produce stronger correlation.
o. Cox
Cox et al. (2018) use a simple differential equation for surface temperature response to white-noise radiative forcing in the presence of climate feedbacks to motivate an emergent constraint related to the strength and autocorrelation of globally averaged surface temperature variations. Models with larger temperature variations and stronger year-to-year autocorrelation tend to have larger ECS. Unlike other constraints, application of the historical temperature record to the Cox constraint implies ECS values that are somewhat weaker than the CMIP5 multimodel mean.
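The qualitative behavior described above can be illustrated with a one-box stochastic energy balance model, C dT/dt = −λT + Q(t), where Q is white noise. This specific form, the parameter values, the use of ECS = F2x/λ with F2x = 3.7 W m−2, and the function names below are assumptions made for illustration and are not taken from Cox et al. (2018). Here λ is defined as a positive damping strength.

```python
import math
import random

def simulate_temperature(damping, heat_capacity, noise_std, n_years, seed=0):
    """Annual samples of C dT/dt = -damping*T + Q(t), with Q white noise.

    Uses the exact one-year discretization of this (Ornstein-Uhlenbeck) process.
    """
    rng = random.Random(seed)
    a = math.exp(-damping / heat_capacity)  # theoretical lag-1 autocorrelation
    stationary_std = noise_std / math.sqrt(2.0 * damping * heat_capacity)
    innovation_std = stationary_std * math.sqrt(1.0 - a * a)
    temps = [rng.gauss(0.0, stationary_std)]
    for _ in range(n_years - 1):
        temps.append(a * temps[-1] + rng.gauss(0.0, innovation_std))
    return temps

def std_and_lag1(series):
    """Sample standard deviation and lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)) / n
    return math.sqrt(var), cov / var

HEAT_CAPACITY = 8.0  # W yr m-2 K-1 (hypothetical)
NOISE_STD = 1.0      # W m-2 yr^0.5 (hypothetical)
F2X = 3.7            # W m-2, assumed forcing from doubled CO2

stats = {}
for damping in (0.8, 1.6):  # W m-2 K-1 (hypothetical weak and strong damping)
    series = simulate_temperature(damping, HEAT_CAPACITY, NOISE_STD, n_years=20000)
    stats[damping] = std_and_lag1(series)
    print(f"ECS = {F2X / damping:.1f} K: std = {stats[damping][0]:.3f} K, "
          f"lag-1 autocorrelation = {stats[damping][1]:.3f}")
```

In this idealized setting, weaker damping (higher ECS) produces both larger temperature variance and stronger year-to-year autocorrelation, which is the sense of the relationship described in the text.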
The Cox constraint is an interesting fit for our study because its proposed mechanism is related to fluctuation dissipation rather than a particular feedback process. As a result, our decomposition cannot be used to assess the validity of the Cox constraint. We include the Cox study in our analysis because it is currently the subject of great community interest and because our decomposition illuminates the physical mechanisms controlling the temperature response investigated by Cox et al. (2018).
4. Statistical significance
Most of the potential constraints described above provide some mention that their correlations are significant but provide few details about how this was tested. Significance of correlations can be easily tested either by noting that
Low sample size makes it very difficult to say anything definitive at all about relationships in the CMIP archives. One manifestation of this is the likelihood that some previously proposed constraints are spurious. Identifying such constraints is the main goal of this paper. Unfortunately, small sample size also works against the goal of identifying bad constraints in the sense that a constraint may fail the tests in this paper not because it is incorrect, but instead because of unlucky alignment of available models. An anecdote puts this danger in context. Initially we followed Caldwell et al. (2016) by only using CMIP5 models that had less than 15% error in their clear-sky radiative kernel calculations. Eventually we decided to include all models in our analysis because the increase in sample size was deemed worth the potential for increased sampling error, particularly because cloud feedbacks are the dominant source of correlation with ECS and their calculation is relatively accurate and only weakly affected by kernel errors. In 17 of the 19 constraints tested here, this change in ensemble composition had little effect. For the Qu constraint, however, correlation dropped from −0.63 to −0.29 when all models were used. Using all models had the opposite effect on Brient cloud shallowness (herein Brient Shal): its correlation grew from 0.05 to 0.38. Scatterplots for each of these relationships are presented in Fig. 1. In both cases, correlation changed because models that failed the clear-sky linearity test had systematically different behavior than the rest of the ensemble. Does this mean that Qu is more credible than indicated by the rest of this study? Is Brient Shal less credible? We interpret these findings as an indication of the uncertainty in any correlation obtained from CMIP data. If our results are any indication, results are robust 17/19

Scatterplot of constraint value vs ECS for CMIP5 models passing the clear-sky linearity test for radiative kernel decomposition at the 15% level (blue) and for CMIP5 models failing this test (red). Qu is tested in the left panel and Brient Shal in the right.
Citation: Journal of Climate 31, 10; 10.1175/JCLI-D-17-0631.1
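The sensitivity of a correlation to ensemble size can be illustrated with a textbook confidence interval. The following is a minimal sketch, not part of the analysis in this paper: it uses the Fisher z-transform, which assumes the models are independent draws from a bivariate normal population (an assumption CMIP ensembles may well violate), and the ensemble sizes are hypothetical. The function name `correlation_interval` is introduced here for illustration only.

```python
import math
from statistics import NormalDist


def correlation_interval(r, n, level=0.90):
    """Approximate confidence interval for a Pearson correlation r from n samples.

    Uses the Fisher z-transform; assumes independent, bivariate-normal samples.
    """
    if n <= 3:
        raise ValueError("the Fisher z interval needs n > 3")
    z = math.atanh(r)                      # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - q * se), math.tanh(z + q * se)


# Hypothetical ensemble sizes; -0.63 is the Qu correlation quoted in the text.
for n in (10, 20, 40):
    lo, hi = correlation_interval(-0.63, n)
    print(f"n={n:2d}: 90% interval for r=-0.63 is [{lo:.2f}, {hi:.2f}]")
```

Under these assumptions the interval narrows only slowly as models are added, which is one way to see why adding or removing a handful of models can change a correlation substantially.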
Another issue is that the search for emergent constraints naturally lends itself to trying relationships until a strong correlation is found. This is problematic because if one tries n relationships for significance at the S% level there is a
5. Are emergent constraints independent?
In this section we focus on the question of whether previously proposed constraints are truly independent, or whether they are merely different manifestations of the same underlying phenomenon. This is important because as identifying emergent constraints becomes more popular, researchers need to be careful that new constraints are not merely repackaged versions of older constraints. We are in a unique position to answer this question by virtue of the large collection of previously proposed constraints we have gathered.
Figure 2 shows correlation coefficients for all combinations of emergent constraints considered. Diagonal values are always 1 because a constraint is perfectly correlated with itself. Cells above the diagonal are redundant because

Correlation between pairs of emergent constraints. Boxes with correlations significant at 90% using a two-tailed t test are colored, with insignificant correlations in gray. Darker shades indicate larger correlation. Positive correlations are reddish and negative correlations are blueish. In each cell, the first number is the correlation between quantities listed on the x and y axes. The number in parentheses is the number of models used in this calculation. Dark boxes (high correlation) have white text and light boxes (low correlation) have black text. The sign of emergent constraints expected to be negatively correlated with ECS has been reversed so positive values in this plot indicate that both constraints have the same effect on ECS. Each correlation is calculated using data from all available CMIP3 and CMIP5 models. Colored lines and accompanying numbers reflect groups of constraints that are discussed in the text.
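The two-tailed t test named in the caption can be sketched as follows. This is a generic textbook version, not necessarily the exact procedure used to produce Fig. 2: it assumes the models are independent samples, converts the correlation r from n models to a t statistic with n − 2 degrees of freedom, and integrates the t density numerically so that only the standard library is needed. The function names are hypothetical.

```python
import math


def t_pdf(t, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    log_c = math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
    return (math.exp(log_c) / math.sqrt(dof * math.pi)
            * (1 + t * t / dof) ** (-(dof + 1) / 2))


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p value for a Pearson correlation r (|r| < 1) from n models."""
    dof = n - 2
    t_stat = abs(r) * math.sqrt(dof / (1 - r * r))
    # Simpson's rule for the probability mass between 0 and t_stat
    # (steps must be even).
    h = t_stat / steps
    total = t_pdf(0.0, dof) + t_pdf(t_stat, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 1.0 - 2.0 * (total * h / 3.0)


def is_significant(r, n, level=0.90):
    """True if r passes a two-tailed t test at the given confidence level."""
    return two_tailed_p(r, n) < 1.0 - level
```

Because the number of models differs from cell to cell, the same correlation value can be significant in one cell and insignificant in another.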
Correlations that are significant at 90% using a two-tailed t test are shown in color, with darker colors indicating stronger correlations. As noted in section 4, the probability that at least one of the 171 correlations below the diagonal of Fig. 2 passes our significance test by chance is

(left) The number of constraints with which a given constraint is significantly correlated (y axis) as a function of that constraint’s correlation with ECS (x axis). (right) Average correlation with ECS of all constraints significantly correlated with a given constraint (y axis) as a function of that constraint’s correlation with ECS (x axis). The Covey constraint was omitted from both plots because it was an outlier.
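The multiple-testing concern raised above can be made concrete with the standard family-wise calculation. The sketch below assumes the tests are independent and that no true relationship exists, which is an idealization because constraints share models and physics. The function name and the test counts are hypothetical.

```python
def chance_of_false_positive(n_tests, level=0.90):
    """Probability that at least one of n_tests independent null tests passes.

    Each test passes by chance with probability 1 - level, so all of them
    fail with probability level ** n_tests.
    """
    return 1.0 - level ** n_tests


# Hypothetical numbers of relationships tried.
for n_tests in (1, 5, 20, 150):
    p = chance_of_false_positive(n_tests)
    print(f"{n_tests:3d} tests at 90%: P(at least one spurious pass) = {p:.3f}")
```

Under these assumptions, a spurious pass becomes close to certain once a few dozen relationships have been tried.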
Overlapping groups of constraints for which we might expect a relationship based on physical grounds are indicated by colored lines and corresponding numbers in Fig. 2. The first grouping involves Lipat, Trenberth, and Volodin, which are all related to present-day Southern Hemisphere cloudiness. Siler is also included in this group because its definition is so similar to Volodin’s; Fig. 2 shows that Volodin and Siler are correlated at 0.8. All constraints in this group are correlated at
The constraints in group 2 are related to mean-state clouds and related indicators over geographically broad areas. Surprisingly, while Siler is well correlated with all constraints in this group, the other constraints are not that well correlated with each other. It seems natural that if Siler is similar to two other constraints, those constraints should be similar to each other. Such behavior is known in math as the triangle inequality, and obeying this constraint is a requirement for all measures of distance. Our correlation matrix does not satisfy the triangle inequality because each correlation is based on a different ensemble of models. Using a single set of models for all constraints would solve this problem but is untenable here because we would be left with seven models. There are also real reasons to expect mean-state constraints to be uncorrelated. Klein TCA and ctp-tau focus on cloud fraction and optical depth, for example, and these two quantities can change independently.
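The consistency that a single common ensemble would impose can be written down explicitly. The sketch below is illustrative rather than part of this study's method. When three correlations are computed from the same samples, the 3 × 3 correlation matrix must be positive semidefinite, which bounds the third correlation given the other two. Correlations computed from different model subsets are not bound in this way. The function names and the input values are hypothetical and are not taken from Fig. 2.

```python
import math


def third_correlation_bounds(r_ab, r_bc):
    """Range of r_ac allowed when all three correlations share the same samples.

    Follows from requiring the 3x3 correlation matrix to have a
    nonnegative determinant.
    """
    spread = math.sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))
    return r_ab * r_bc - spread, r_ab * r_bc + spread


def is_consistent(r_ab, r_bc, r_ac, tol=1e-12):
    """True if the three correlations could come from one common sample."""
    lo, hi = third_correlation_bounds(r_ab, r_bc)
    return lo - tol <= r_ac <= hi + tol


# Hypothetical values: two strong correlations force a fairly strong third one.
print(third_correlation_bounds(0.9, 0.9))
print(is_consistent(0.9, 0.9, 0.3))
```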
Group 3 contains constraints based on mean-state RH. If the locations picked by Fasullo are particularly important, one may expect them to show up in the Su constraint. This does not seem to be the case. Group 4 consists of Tian and Fasullo D, which both target convection-related variables in largely overlapping regions. Unsurprisingly, they are correlated at 0.6. Group 5 consists of constraints based on the ability of convection to remove moisture from the tropical boundary layer. Sherwood D and S are uncorrelated, which explains why LTMI = D + S explains a much larger fraction of ECS than D or S in isolation. Brient Shal, which was based on Sherwood’s concepts, seems to be an unrelated constraint.
The last group focuses on current-climate response of low clouds to variations in SST. Zhai and Brient cloud albedo (herein Brient Alb) do seem to be related to each other, but they are only weakly related to Qu. This could be due to unlucky sampling, but it could also be due to differences in constraint design, including differences in geographical region, sampling time periods, or the fact that Qu removes the component of cloud change coincident with EIS. It is also interesting to note that Volodin and Siler are strongly correlated with Brient Alb at 0.7. Volodin/Siler and Zhai/Brient Alb are similar in that both assume cloud changes track SST in a climate-invariant way, so perhaps this is unsurprising. The fact that Zhai and Brient use temporal variations as their present-day measure while Volodin and Siler use geographic variations raises interesting questions about cloud feedback.
Because Tian cites Trenberth and Fasullo (2010) for support, it is worth noting that the correlation between Tian and Trenberth is 0.5. Tian also makes reference to the strength of the Hadley circulation. Su and Lipat both measure aspects of the Hadley cell, but Su and Lipat are not significantly correlated with Tian or with each other.
Perhaps more interesting than the blocks of expected relationships in Fig. 2 is the region of unexpected correlations. With the exception of the Covey constraint (which is poorly correlated not only with most other constraints, but also with ECS), these unexpected correlations have similar magnitude to those found in the expected-relationship blocks. Several unexpected correlations are over 0.7 in magnitude! Of particular interest is the Cox constraint. The Cox constraint is designed to measure the response properties of global-average surface temperature to forcing, but the feedback process governing that response is unclear. Cox is very strongly correlated with Zhai and Klein TCA, and is significantly correlated with Brient Alb, Sherwood LTMI, and Tian. All of these constraints are related to clouds, suggesting that clouds may be the main mechanism controlling surface temperature variations. This hypothesis will be further explored in section 6.
Given the number of unexpected yet apparently significant connections between constraints, an empirical method for identifying groups of related constraints seems warranted. We tried a variety of clustering algorithms, but failure of the triangle inequality makes the results difficult to interpret and very sensitive to the definition of distance between clusters. For example, should Volodin and Klein TCA be considered synonymous with Siler because both are correlated with Siler at 0.8? Or is Klein TCA a separate mechanism because its correlation with Volodin is only 0.3? Larger model ensembles and/or mechanistic understandings of potential relationships are needed to decide. In the meantime, we simply state that pairs of constraints with large correlation in Fig. 2 are probably related, and that understanding why is important future work. Central to this goal is the need to understand why each constraint has skill in predicting ECS. Doing so is the focus of the remainder of this paper.
6. Decomposing correlations
So far we have described each of the 19 constraints and tested them against new data where possible. We have also looked for relationships between emergent constraints. In this section we describe a method for decomposing correlation between ECS and an emergent constraint into components associated with individual feedback and forcing processes and into contributions from different geographical regions. In some sense the decomposition described here provides a bridge between predictors of ECS and predictors of individual climate processes by identifying the processes and regions that contribute to correlation with ECS and by clarifying how correlation with a particular process contributes (or is unimportant for) correlation with ECS.
a. Global-average decomposition
The first step in this decomposition is to write the net feedback λ as a sum of individual feedback terms






















Decomposition of the correlation between ECS and each emergent constraint following Eq. (5) is provided in Fig. 4. As noted in section 2, CMIP3 data (Fig. 4a) and CMIP5 data (Fig. 4b) differ in the following ways:
- Only net Cld is shown in Fig. 4a because separate SW and LW Cld components are not available for CMIP3.
- For CMIP3, the cloud feedback is computed as the residual between the net feedback and the sum of noncloud feedbacks. As a result, any residual is absorbed into the cloud term in Fig. 4a.
- Pl, WV, and LR are computed relative to fixed specific humidity in Fig. 4a, whereas Fig. 4b uses fixed RH.



Decomposition of correlation between the emergent constraints listed on the y axis and ECS into components due to forcing and feedback terms (identified in the legend). Constraints negatively correlated with ECS in their original paper are multiplied by −1 for easy comparison with other constraints. The correlation with ECS is the sum of positive and negative terms and is indicated for each emergent constraint as a white dot.
Cloud feedback is the main source of strong correlation with ECS for most emergent constraints, particularly in the CMIP3 ensemble. This can be understood by noting that the correlation of a particular
Values of

One odd feature of Table 3 is that the Cld term has weight greater than 1. This seems to imply that strong correlations with
This anticorrelation between
Influences from multiple sources seem to be more pronounced in the CMIP5 ensemble. Some of this comes from the fact that SW and LW Cld components are included separately for CMIP5 data but not for CMIP3 (because of lack of available data), but CMIP5 data are more complex even when SW and LW cloud feedback are combined. One reason for this is probably increasing model complexity with time. The fact that CMIP3 values are computed from runs that include transient aerosol changes while CMIP5 data do not also complicates interpretation. Similarity between decompositions computed using independent ensembles would be a useful indicator of the credibility of an emergent constraint, but this cannot be evaluated in this study because of differences in the experimental design of CMIP3 and CMIP5. One aspect of the CMIP5 results that is simpler is the partitioning between LR and WV: in CMIP3 these quantities oppose each other and are of roughly equal size. When computed relative to fixed RH in CMIP5, however, the importance of WV fades and LR is shown to be the dominant source of correlation with ECS.
Figure 4 can be used to test whether a constraint’s correlation with ECS is due to its proposed physical explanation. This is only possible for constraints with well-defined physical mechanisms; constraints without an explanation cannot be tested and therefore cannot move beyond potential-constraint status. Sherwood D and LTMI and Brient Shal and Alb pass this test for both CMIP3 and CMIP5 data: they are proposed to operate through changes in low clouds and their correlation with ECS comes primarily through shortwave cloud feedback. Sherwood S and Qu are also meant to operate through shortwave cloud feedback but gain correlation mainly through other terms for CMIP5 data. Nonrobustness between ensembles suggests these constraints may be spurious. Further decomposition of the SW cloud feedback term into amount and scattering components (not shown) reveals that Qu, which was originally framed in terms of low cloud amount changes, is operating as intended in the sense that SW cloud amount feedback in stratocumulus regions does contribute to negative correlation with ECS, but its effect is canceled out by opposing contributions from SW cloud scattering feedback.
b. Regional decomposition




This geographical decomposition is applied to selected constraints in Fig. 5. These constraints were chosen because they are relatively independent of each other, they target different regions, and they are of contemporary interest. Similar figures including SW and LW cloud feedbacks separately are available in the supporting material for all constraints. In these plots, ECS is broken into terms due to net Cld and F, with all other terms combined into a single plot because their spatial variations are unimportant.

Decomposition of selected emergent constraints (columns) into dominant terms (rows). Titles in bold at the top of each column list the constraint tested and the correlation of that constraint with ECS computed as the sum of all panels in that column. Sums in the title for each panel give the global sum of the geographic decomposition of that term following Eq. (6), which is comparable to the global-average contribution to that term as plotted in Fig. 4.
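A geographic decomposition of this kind can be sketched under simple assumptions. The code below is not Eq. (6); it assumes the global value of a term is an area-weighted mean of local values, so that the covariance with the predictor splits into one contribution per region and the contributions sum to the global correlation. Latitude bands with cos(latitude) weights stand in for a full grid, and the function name and data are hypothetical.

```python
import math


def regional_contributions(x, field, lats_deg):
    """Contribution of each latitude band to corr(x, area-weighted mean of field).

    x:        predictor, one value per model
    field:    field[m][j] is the value for model m in latitude band j
    lats_deg: band-center latitudes, used for cos(latitude) area weights
    """
    raw = [math.cos(math.radians(lat)) for lat in lats_deg]
    weight_sum = sum(raw)
    w = [r / weight_sum for r in raw]
    global_mean = [sum(wj * fj for wj, fj in zip(w, row)) for row in field]

    def centered(v):
        m = sum(v) / len(v)
        return [a - m for a in v]

    xc, gc = centered(x), centered(global_mean)
    norm = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in gc))
    contributions = []
    for j, wj in enumerate(w):
        band = centered([row[j] for row in field])
        contributions.append(wj * sum(a * b for a, b in zip(xc, band)) / norm)
    corr = sum(a * b for a, b in zip(xc, gc)) / norm
    return contributions, corr


# Hypothetical four-model ensemble on three latitude bands.
x = [0.1, 0.4, 0.5, 0.8]
field = [[1.0, 2.0, 0.5], [1.2, 2.5, 0.4], [0.9, 2.9, 0.7], [1.5, 3.4, 0.6]]
parts, r = regional_contributions(x, field, [-45.0, 0.0, 45.0])
print(parts, sum(parts), r)  # sum(parts) matches r
```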
One striking feature of Fig. 5 is that the correlation contributions for a particular

Weighting function
Another interesting feature of Fig. 5 is the fact that the spatial distribution of F contribution for each constraint is almost perfectly opposed in the tropics by the net cloud contribution. Anticorrelation between F and λ for simulations without aerosol changes was previously noted for global averages by Ringer et al. (2014). This relationship may be an artifact of the fact that we follow Gregory et al. (2004) in computing feedback and forcing as the slope and y intercept of the same data. This hypothesis could be tested by getting F from runs with 4×CO2 and present-day SST, but such analysis is outside the scope of this paper.
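The possible artifact can be illustrated with an ordinary least squares fit. The sketch below assumes the sign convention N = F + λΔT, where N is the top-of-atmosphere net flux anomaly, so that forcing is the y intercept and feedback is the slope. The function names and the synthetic data are hypothetical. The second function gives the standard regression result for the correlation between sampling errors in intercept and slope, which is negative whenever the mean ΔT is positive.

```python
import math


def gregory_fit(delta_t, net_flux):
    """Least squares fit of N = F + lambda * dT; returns (F, lambda)."""
    n = len(delta_t)
    t_bar = sum(delta_t) / n
    n_bar = sum(net_flux) / n
    s_tt = sum((t - t_bar) ** 2 for t in delta_t)
    s_tn = sum((t - t_bar) * (f - n_bar) for t, f in zip(delta_t, net_flux))
    lam = s_tn / s_tt                  # slope: net feedback
    forcing = n_bar - lam * t_bar      # y intercept: forcing
    return forcing, lam


def intercept_slope_error_correlation(delta_t):
    """Correlation between sampling errors in the fitted intercept and slope.

    Standard result for simple linear regression with independent noise.
    """
    n = len(delta_t)
    t_bar = sum(delta_t) / n
    mean_sq = sum(t * t for t in delta_t) / n
    return -t_bar / math.sqrt(mean_sq)


# Synthetic, noise-free example with hypothetical values F = 7.0, lambda = -1.2.
d_t = [1.0, 2.0, 3.0, 4.0, 5.0]
flux = [7.0 - 1.2 * t for t in d_t]
print(gregory_fit(d_t, flux))
print(intercept_slope_error_correlation(d_t))
```

Because warming experiments have positive ΔT throughout, noise that steepens the fitted line also raises its intercept, so errors in F and λ obtained this way are anticorrelated by construction.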
While similarities in geographic structure between constraints are interesting, the main goal of Fig. 5 is to test proposed mechanisms. The Lipat constraint is related to cloud changes in the Southern Hemisphere at the border between the subtropics and midlatitudes. Geographic decomposition of the cloud contribution from Lipat does show more amplitude in this latitude band than other constraints, but this region is still not the main source of correlation with ECS. Without understanding how Hadley cell extent could affect future changes in tropical cloudiness, this constraint remains unconfirmed. Sherwood D predicts tropical low cloud changes due to BL drying by convection. It is computed using data from the tropical Atlantic and east Pacific, though it is unclear whether this is the region where cloud changes are expected. It does have large correlation in the tropical east Pacific and Atlantic, but its correlation with cloud feedback in the west Pacific is even larger. Brient Alb is related to low clouds in subsiding (eastern subtropical) oceanic regions. The magnitude of correlation in the subtropical eastern oceans is greater than for other constraints, but equatorial ascent-region clouds again play an unexpectedly large role. Geographic decomposition is also illuminating for the other constraints, but is relegated to the supplemental material because the validity of the corresponding constraint can already be assessed from the other information in this paper.
7. Discussion and conclusions
This study provides several methods for evaluating the credibility of a proposed emergent constraint. We hope this work triggers an effort to evaluate new and existing emergent constraints, discarding unreliable constraints and developing consensus and trust around confirmed predictors. To that end, we ask which of the 19 emergent constraints tested here are trustworthy. Our assessment is provided in Table 4. Six constraints (Covey, Trenberth, Fasullo D and M, Sherwood S, and Sherwood LTMI) do not appear to be credible because they are either not robust to change of ensemble or their correlation with ECS is not due to their proposed physical mechanism. The credibility of three constraints (Lipat, Qu, and Cox) is ambiguous. Lipat gains correlation with ECS from the expected region and mechanism but gains more correlation from unexpected sources. Similarly, Qu also gains correlation from the expected mechanism and region but fails to have a large correlation with ECS for the models used in this study because of unexpected compensation from other terms. Additionally, while Qu fails to be robust to all changes in ensemble, it does have a large correlation with ECS in CMIP3 and in a subset of CMIP5 models and it is conceptually related to the Zhai and Brient Alb constraints, which do seem to be robust. The Cox constraint has a physical explanation that is unrelated to particular feedbacks and regions and hence cannot be tested in our framework. An additional six constraints (Volodin, Siler, Klein ctp-tau and TCA, Su, and Tian) cannot be tested because they lack clear physical mechanisms. These constraints should not be considered credible until they are fully understood. Decomposition of these constraints’ correlation with ECS may prove useful in uncovering the physical explanations for their skill (if any exist). In this context it is interesting to note that Klein TCA is predominantly related to tropical LW cloud feedback, while Klein ctp-tau and Tian are related to SW cloud feedback over a broad variety of regions and Su is tied to cloud feedback mainly over the tropics (see Figs. S3 and S4).
Assessment of proposed constraints.

The remaining four constraints (Sherwood D, Brient Shal, Zhai, and Brient Alb) pass all tests in this study and thus seem credible. Worryingly, all of the studies introducing these constraints note that their constraint implies higher climate sensitivity than predicted by giving each CMIP5 model equal weight. The Sherwood D constraint in particular is only satisfied by models with ECS greater than 3.4 K, while the Sherwood S and LTMI metrics, which themselves predict relatively high climate sensitivity, are much closer to the centroid of CMIP model values. While the tendency for emergent constraints to predict higher climate sensitivity has been noted in the past (e.g., Tian 2015; Qu et al. 2018), the validity of this finding has been unclear because it was based on potential rather than credible constraints.
So what does it mean that four credible emergent constraint studies all predict warming at the upper end of community expectation? One interpretation is that these studies reinforce each other’s conclusions: if all agree, they must be right. This is an appropriate interpretation if all constraints are flawed samples of the same underlying distribution/physical process. In this case, the more we sample the underlying distribution, the better we will understand it. If each constraint is instead targeting a different physical process that contributes to ECS, the constraints will contribute additively toward determining ECS. In this latter case, having one constraint predict high sensitivity and another predict low sensitivity does not invalidate the constraints; they may simply constrain different drivers of climate sensitivity. The credible constraints identified in this study are all related to tropical low clouds and all except Brient Shal are shown in Fig. 2 to be significantly correlated with each other. Zhai and Brient Alb even share a common physical mechanism. Thus it is tempting to view all constraints as reinforcing each other. It is, however, unsurprising that the best emergent constraints would be related to tropical low clouds because (as noted above)
Section 5 provides a first step toward identifying related constraints. By comparing constraint definitions and explanations as well as correlations between pairs of constraints, we conclude that Siler and Volodin describe the same physical mechanism, as do Zhai and Brient Alb. Beyond these pairs, we had trouble identifying groups of similar constraints because one constraint would frequently be correlated with two others that were not themselves correlated with one another. This breakdown of the triangle inequality results from the fact that the models available for each constraint differ coupled with the extremely small sample size of the CMIP archives. While the 19 constraints considered here are definitely much more similar to each other than expected by chance, lack of empirical methods for grouping forces us to fall back on physical reasoning to identify related constraints. This is difficult when the mechanisms responsible for potential constraints are not well understood.
It is important to stress that all conclusions in Table 4 should be considered tentative because the number of models used in each correlation calculation is so small. As discussed in section 4, insufficient sample size is underscored by the fact that correlation for 2 of the 19 constraints changed radically when we switched from using only models that passed the clear-sky linearity test to using all models. This is a problem not just with our methodology, but with all studies attempting to identify emergent constraints from the relatively small ensembles available from CMIP. This conclusion is supported by the fact that strong correlation with ECS disappeared in four of the five constraints in this study confronted with new ensembles. It is interesting to note, however, that several of these failing constraints are strongly correlated with newer constraints that do show strong correlations with CMIP5 data. Perhaps these original studies do have some value, but were overtuned to their training dataset. It is also worth noting that while our criteria of robustness across successive CMIP generations and correlation coming mainly from a single feedback mechanism seem like reasonable rules of thumb, there may be situations where real constraints do not satisfy these criteria. In these cases, the need for an exception should be obvious from the purported physical mechanism. Grounds for such an exception are not clear for any of the constraints evaluated here. Another important caveat to this study is that it focuses entirely on correlations that only capture linear relationships, whereas climate response may be nonlinearly related to a present-day predictor [see appendix 2 of Covey et al. (2000) for an example]. Compositing a predictor into an average over models with low ECS and a separate average over models with high ECS [as done by Su et al. (2014) and Brient et al. (2016)] may be better for identifying nonlinear emergent constraints but is not conducive to our decomposition approach.
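The compositing alternative mentioned above can be sketched in a few lines. This is a minimal illustration, not the procedure of the cited studies: splitting the ensemble at the median ECS is an assumption made here, and the function name and data are hypothetical.

```python
def composite_by_ecs(predictor, ecs):
    """Mean predictor over the low-ECS and high-ECS halves of an ensemble.

    Models are ranked by ECS; with an odd number of models the middle one
    is left out of both composites.
    """
    order = sorted(range(len(ecs)), key=lambda i: ecs[i])
    half = len(order) // 2
    low = [predictor[i] for i in order[:half]]
    high = [predictor[i] for i in order[-half:]]
    return sum(low) / len(low), sum(high) / len(high)


# Hypothetical five-model ensemble.
predictor = [0.2, 0.9, 0.4, 1.1, 0.3]
ecs = [2.1, 4.0, 2.8, 4.5, 3.2]
low_mean, high_mean = composite_by_ecs(predictor, ecs)
print(low_mean, high_mean)
```

A difference between the two composites does not require the relationship to be linear, but it cannot be split into additive feedback contributions in the way a correlation can.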
We would like to acknowledge the modeling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the World Climate Research Program’s Working Group on Coupled Modeling, for their roles in making available the CMIP3 and CMIP5 multimodel datasets. Support of these datasets is provided by the U.S. Department of Energy (DOE) Office of Science. We would also like to thank the authors of the emergent constraint studies evaluated here for sharing their data. Finally, a huge thanks to Paul Durack for maintaining the CMIP archive for internal use at LLNL. This work was supported by the Office of Science Biological and Environmental Research (BER) program at Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. All authors were supported by BER’s Regional and Global Climate Modeling (RGCM) Program.
REFERENCES
Andrews, T., J. M. Gregory, M. J. Webb, and K. E. Taylor, 2012: Forcing, feedbacks and climate sensitivity in CMIP5 coupled atmosphere–ocean climate models. Geophys. Res. Lett., 39, L09712, https://doi.org/10.1029/2012GL051607.
Andrews, T., J. M. Gregory, and M. J. Webb, 2015: The dependence of radiative forcing and feedback on evolving patterns of surface temperature change in climate models. J. Climate, 28, 1630–1648, https://doi.org/10.1175/JCLI-D-14-00545.1.
Armour, K. C., C. M. Bitz, and G. H. Roe, 2013: Time-varying climate sensitivity from regional feedbacks. J. Climate, 26, 4518–4534, https://doi.org/10.1175/JCLI-D-12-00544.1.
Bony, S., and J.-L. Dufresne, 2005: Marine boundary layer clouds at the heart of tropical cloud feedback uncertainties in climate models. Geophys. Res. Lett., 32, L20806, https://doi.org/10.1029/2005GL023851.
Bony, S., and Coauthors, 2006: How well do we understand and evaluate climate change feedback processes? J. Climate, 19, 3445–3482, https://doi.org/10.1175/JCLI3819.1.
Bretherton, C. S., and P. N. Blossey, 2014: Low cloud reduction in a greenhouse-warmed climate: Results from Lagrangian LES of a subtropical marine cloudiness transition. J. Adv. Model. Earth Syst., 6, 91–114, https://doi.org/10.1002/2013MS000250.
Brient, F., and T. Schneider, 2016: Constraints on climate sensitivity from space-based measurements of low-cloud reflection. J. Climate, 29, 5821–5835, https://doi.org/10.1175/JCLI-D-15-0897.1.
Brient, F., T. Schneider, Z. Tan, S. Bony, X. Qu, and A. Hall, 2016: Shallowness of tropical low clouds as a predictor of climate models’ response to warming. Climate Dyn., 47, 433–449, https://doi.org/10.1007/s00382-015-2846-0.
Caldwell, P. M., C. S. Bretherton, M. D. Zelinka, S. A. Klein, B. D. Santer, and B. M. Sanderson, 2014: Statistical significance of climate sensitivity predictors obtained by data mining. Geophys. Res. Lett., 41, 1803–1808, https://doi.org/10.1002/2014GL059205.
Caldwell, P. M., M. D. Zelinka, K. E. Taylor, and K. Marvel, 2016: Quantifying the sources of intermodel spread in equilibrium climate sensitivity. J. Climate, 29, 513–524, https://doi.org/10.1175/JCLI-D-15-0352.1.
Charney, J. G., and Coauthors, 1979: Carbon Dioxide and Climate: A Scientific Assessment. National Academies Press, 34 pp., https://doi.org/10.17226/12181.
Collins, M., and Coauthors, 2013: Long-term climate change: Projections, commitments and irreversibility. Climate Change 2013: The Physical Science Basis, T. F. Stocker et al., Eds., Cambridge University Press, 1029–1136.
Covey, C., and Coauthors, 2000: The seasonal cycle in coupled ocean–atmosphere general circulation models. Climate Dyn., 16, 775–787, https://doi.org/10.1007/s003820000081.
Cox, P. M., D. Pearson, B. B. Booth, P. Friedlingstein, C. Huntingford, C. D. Jones, and C. M. Luke, 2013: Sensitivity of tropical carbon to climate change constrained by carbon dioxide variability. Nature, 494, 341–344, https://doi.org/10.1038/nature11882.
Cox, P. M., C. Huntingford, and M. S. Williamson, 2018: Emergent constraint on equilibrium climate sensitivity from global temperature variability. Nature, 553, 319–322, https://doi.org/10.1038/nature25450.
Dufresne, J.-L., and S. Bony, 2008: An assessment of the primary sources of spread of global warming estimates from coupled atmosphere–ocean models. J. Climate, 21, 5135–5144, https://doi.org/10.1175/2008JCLI2239.1.
Fasullo, J. T., and K. E. Trenberth, 2012: A less cloudy future: The role of subtropical subsidence in climate sensitivity. Science, 338, 792–794, https://doi.org/10.1126/science.1227465.
Fasullo, J. T., B. M. Sanderson, and K. E. Trenberth, 2015: Recent progress in constraining climate sensitivity with model ensembles. Curr. Climate Change Rep., 1, 268–275, https://doi.org/10.1007/s40641-015-0021-7.
Flato, G., and Coauthors, 2013: Evaluation of climate models. Climate Change 2013: The Physical Science Basis, T. F. Stocker et al., Eds., Cambridge University Press, 741–866.
Forster, P. M., and K. E. Taylor, 2006: Climate forcings and climate sensitivities diagnosed from coupled climate model integrations. J. Climate, 19, 6181–6194, https://doi.org/10.1175/JCLI3974.1.
Gleckler, P. J., K. E. Taylor, and C. Doutriaux, 2008: Performance metrics for climate models. J. Geophys. Res., 113, D06104, https://doi.org/10.1029/2007JD008972.
Gordon, N. D., and S. A. Klein, 2014: Low-cloud optical depth feedback in climate models. J. Geophys. Res. Atmos., 119, 6052–6065, https://doi.org/10.1002/2013JD021052.
Gregory, J., and Coauthors, 2004: A new method for diagnosing radiative forcing and climate sensitivity. Geophys. Res. Lett., 31, L03205, https://doi.org/10.1029/2003GL018747.
Grise, K. M., L. M. Polvani, and J. T. Fasullo, 2015: Reexamining the relationship between climate sensitivity and the Southern Hemisphere radiation budget in CMIP models. J. Climate, 28, 9298–9312, https://doi.org/10.1175/JCLI-D-15-0031.1.
Hall, A., and X. Qu, 2006: Using the current seasonal cycle to constrain snow albedo feedback in future climate change. Geophys. Res. Lett., 33, L03502, https://doi.org/10.1029/2005GL025127.
Held, I. M., and B. Soden, 2000: Water vapor feedback and global warming. Annu. Rev. Energy Environ., 25, 441–475, https://doi.org/10.1146/annurev.energy.25.1.441.
Held, I. M., and B. Soden, 2006: Robust responses of the hydrological cycle to global warming. J. Climate, 19, 5686–5699, https://doi.org/10.1175/JCLI3990.1.
Held, I. M., and K. M. Shell, 2012: Using relative humidity as a state variable in climate feedback analysis. J. Climate, 25, 2578–2582, https://doi.org/10.1175/JCLI-D-11-00721.1.
Hirota, N., and Y. N. Takayabu, 2012: Inter-model differences of future precipitation changes in CMIP3 and MIROC5 climate models. J. Meteor. Soc. Japan, 90A, 307–316, https://doi.org/10.2151/jmsj.2012-A16.
Huber, M., I. Mahlstein, M. Wild, J. Fasullo, and R. Knutti, 2011: Constraints on climate sensitivity from radiation patterns in climate models. J. Climate, 24, 1034–1052, https://doi.org/10.1175/2010JCLI3403.1.
Hwang, Y.-T., and D. M. W. Frierson, 2013: Link between the double-intertropical convergence zone problem and cloud biases over the Southern Ocean. Proc. Natl. Acad. Sci. USA, 110, 4935–4940, https://doi.org/10.1073/pnas.1213302110.
Kamae, Y., H. Shiogama, M. Watanabe, T. Ogura, T. Yokohata, and M. Kimoto, 2016: Lower-tropospheric mixing as a constraint on cloud feedback in a multiparameter multiphysics ensemble. J. Climate, 29, 6259–6275, https://doi.org/10.1175/JCLI-D-16-0042.1.
Kay, J. E., C. Wall, V. Yettella, B. Medeiros, C. Hannay, P. Caldwell, and C. Bitz, 2016: Global climate impacts of fixing the Southern Ocean shortwave radiation bias in the Community Earth System Model (CESM). J. Climate, 29, 4617–4636, https://doi.org/10.1175/JCLI-D-15-0358.1.
Klein, S. A., and A. Hall, 2015: Emergent constraints for cloud feedbacks. Curr. Climate Change Rep., 1, 276–287, https://doi.org/10.1007/s40641-015-0027-1.
Klein, S. A., Y. Zhang, M. D. Zelinka, R. Pincus, J. Boyle, and P. J. Gleckler, 2013: Are climate model simulations of clouds improving? An evaluation using the ISCCP simulator. J. Geophys. Res. Atmos., 118, 1329–1342, https://doi.org/10.1002/jgrd.50141.
Klocke, D., R. Pincus, and J. Quaas, 2011: On constraining estimates of climate sensitivity with present-day observations through model weighting. J. Climate, 24, 6092–6099, https://doi.org/10.1175/2011JCLI4193.1.
Knutti, R., 2010: The end of model democracy? Climatic Change, 102, 395–404, https://doi.org/10.1007/s10584-010-9800-2.
Knutti, R., and G. C. Hegerl, 2008: The equilibrium sensitivity of the Earth’s temperature to radiation changes. Nat. Geosci., 1, 735–743, https://doi.org/10.1038/ngeo337.
Knutti, R., G. A. Meehl, M. R. Allen, and D. A. Stainforth, 2006: Constraining climate sensitivity from the seasonal cycle in surface temperature. J. Climate, 19, 4224–4233, https://doi.org/10.1175/JCLI3865.1.
Knutti, R., D. Masson, and A. Gettelman, 2013: Climate model genealogy: Generation CMIP5 and how we got there. Geophys. Res. Lett., 40, 1194–1199, https://doi.org/10.1002/grl.50256.
Knutti, R., M. A. A. Rugenstein, and G. C. Hegerl, 2017: Beyond equilibrium climate sensitivity. Nat. Geosci., 10, 727–736, https://doi.org/10.1038/ngeo3017.
Lipat, B. R., G. Tselioudis, K. M. Grise, and L. M. Polvani, 2017: CMIP5 models’ shortwave cloud radiative response and climate sensitivity linked to the climatological Hadley cell extent. Geophys. Res. Lett., 44, 5739–5748, https://doi.org/10.1002/2017GL073151.
Masson, D., and R. Knutti, 2013: Predictor screening, calibration, and observational constraints in climate model ensembles: An illustration using climate sensitivity. J. Climate, 26, 887–898, https://doi.org/10.1175/JCLI-D-11-00540.1.
McCoy, D. T., D. L. Hartmann, M. D. Zelinka, P. Ceppi, and D. P. Grosvenor, 2015: Mixed-phase cloud physics and Southern Ocean cloud feedback in climate models. J. Geophys. Res. Atmos., 120, 9539–9554, https://doi.org/10.1002/2015JD023603.
McCoy, D. T., I. Tan, D. L. Hartmann, M. D. Zelinka, and T. Storelvmo, 2016: On the relationships among cloud cover, mixed-phase partitioning, and planetary albedo in GCMs. J. Adv. Model. Earth Syst., 8, 650–668, https://doi.org/10.1002/2015MS000589.
Meehl, G. A., C. Covey, T. Delworth, M. Latif, B. McAvaney, J. F. B. Mitchell, R. J. Stouffer, and K. E. Taylor, 2007: The WCRP CMIP3 multi-model dataset: A new era in climate change research. Bull. Amer. Meteor. Soc., 88, 1383–1394, https://doi.org/10.1175/BAMS-88-9-1383.
Pennell, C., and T. Reichler, 2011: On the effective number of climate models. J. Climate, 24, 2358–2367, https://doi.org/10.1175/2010JCLI3814.1.
Qu, X., A. Hall, S. A. Klein, and P. M. Caldwell, 2013: On the spread of changes in marine low cloud cover in climate model simulations of the 21st century. Climate Dyn., 42, 2603–2626, https://doi.org/10.1007/s00382-013-1945-z.
Qu, X., A. Hall, A. M. DeAngelis, M. D. Zelinka, S. A. Klein, H. Su, B. Tian, and C. Zhai, 2018: On the emergent constraints of climate sensitivity. J. Climate, 31, 863–875, https://doi.org/10.1175/JCLI-D-17-0482.1.
Randall, D., and Coauthors, 2007: Climate models and their evaluation. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 589–662.
Ringer, M. A., T. Andrews, and M. J. Webb, 2014: Global-mean radiative feedbacks and forcing in atmosphere-only and coupled atmosphere–ocean climate change experiments. Geophys. Res. Lett., 41, 4035–4042, https://doi.org/10.1002/2014GL060347.
Rose, B. E. J., K. C. Armour, D. S. Battisti, N. Feldl, and D. D. B. Koll, 2014: The dependence of transient climate sensitivity and radiative feedbacks on the spatial pattern of ocean heat uptake. Geophys. Res. Lett., 41, 1071–1078, https://doi.org/10.1002/2013GL058955.
Sanderson, B. M., 2011: A multimodel study of parametric uncertainty in predictions of climate response to rising greenhouse gas concentrations. J. Climate, 24, 1362–1377, https://doi.org/10.1175/2010JCLI3498.1.
Sanderson, B. M., R. Knutti, and P. Caldwell, 2015: A representative democracy to reduce interdependency in a multimodel ensemble. J. Climate, 28, 5171–5194, https://doi.org/10.1175/JCLI-D-14-00362.1.
Shell, K. M., J. T. Kiehl, and C. A. Shields, 2008: Using the radiative kernel technique to calculate climate feedbacks in NCAR’s Community Atmospheric Model. J. Climate, 21, 2269–2282, https://doi.org/10.1175/2007JCLI2044.1.
Sherwood, S. C., S. Bony, and J.-L. Dufresne, 2014: Spread in model climate sensitivity traced to atmospheric convective mixing. Nature, 505, 37–42, https://doi.org/10.1038/nature12829.
Shukla, J., T. DelSole, M. Fennessy, J. Kinter, and D. Paolino, 2006: Climate model fidelity and projections of climate change. Geophys. Res. Lett., 33, L07702, https://doi.org/10.1029/2005GL025579.
Siler, N., S. Po-Chedley, and C. S. Bretherton, 2018: Variability in modeled cloud feedback tied to differences in the climatological spatial pattern of clouds. Climate Dyn., 50, 1209–1220, https://doi.org/10.1007/s00382-017-3673-2.
Soden, B. J., and I. M. Held, 2006: An assessment of climate feedbacks in coupled ocean–atmosphere models. J. Climate, 19, 3354–3360, https://doi.org/10.1175/JCLI3799.1.
Soden, B. J., A. J. Broccoli, and R. S. Hemler, 2004: On the use of cloud forcing to estimate cloud feedback. J. Climate, 17, 3661–3665, https://doi.org/10.1175/1520-0442(2004)017<3661:OTUOCF>2.0.CO;2.
Soden, B. J., I. M. Held, R. Colman, K. M. Shell, J. T. Kiehl, and C. A. Shields, 2008: Quantifying climate feedbacks using radiative kernels. J. Climate, 21, 3504–3520, https://doi.org/10.1175/2007JCLI2110.1.
Su, H., J. H. Jiang, C. Zhai, T. J. Shen, J. D. Neelin, G. L. Stephens, and Y. L. Yung, 2014: Weakening and strengthening structures in the Hadley circulation change under global warming and implications for cloud response and climate sensitivity. J. Geophys. Res. Atmos., 119, 5787–5805, https://doi.org/10.1002/2014JD021642.
Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. Bull. Amer. Meteor. Soc., 93, 485–498, https://doi.org/10.1175/BAMS-D-11-00094.1.
Tian, B., 2015: Spread of model climate sensitivity linked to double-intertropical convergence zone bias. Geophys. Res. Lett., 42, 4133–4141, https://doi.org/10.1002/2015GL064119.
Trenberth, K. E., and J. T. Fasullo, 2010: Simulation of present-day and twenty-first-century energy budgets of the southern oceans. J. Climate, 23, 440–454, https://doi.org/10.1175/2009JCLI3152.1.
Volodin, E. M., 2008: Relation between temperature sensitivity to doubled carbon dioxide and the distribution of clouds in current climate models. Izv. Atmos. Ocean. Phys., 44, 288–299, https://doi.org/10.1134/S0001433808030043.
Waugh, D. W., and V. Eyring, 2008: Quantitative performance metrics for stratospheric-resolving chemistry–climate models. Atmos. Chem. Phys., 8, 5699–5713, https://doi.org/10.5194/acp-8-5699-2008.
Webb, M. J., and Coauthors, 2015: The impact of parametrized convection on cloud feedback. Philos. Trans. Roy. Soc. London, 373A, 20140414, https://doi.org/10.1098/rsta.2014.0414.
Williams, K. D., W. J. Ingram, and J. M. Gregory, 2008: Time variation of effective climate sensitivity in GCMs. J. Climate, 21, 5076–5090, https://doi.org/10.1175/2008JCLI2371.1.
Winton, M., K. Takahashi, and I. M. Held, 2010: Importance of ocean heat uptake efficacy to transient climate change. J. Climate, 23, 2333–2344, https://doi.org/10.1175/2009JCLI3139.1.
Zhai, C., J. H. Jiang, and H. Su, 2015: Long-term cloud change imprinted in seasonal cloud variation: More evidence of high climate sensitivity. Geophys. Res. Lett., 42, 8729–8737, https://doi.org/10.1002/2015GL065911.
Zhao, M., 2014: An investigation of the connections among convection, clouds, and climate sensitivity in a global climate model. J. Climate, 27, 1845–1862, https://doi.org/10.1175/JCLI-D-13-00145.1.