1. Introduction
The sensitivity of Earth’s globally averaged surface temperature to perturbations in atmospheric CO2 concentration varies between climate models, causing their projections of future climate change for identical emissions scenarios to diverge. Equilibrium climate sensitivity (ECS)—the change in Earth’s surface temperature for a doubling of atmospheric CO2—is a model prediction that is used as a proxy of that sensitivity. As a model construct and a prediction of a distant future state, ECS cannot be validated by observational data (Knutti and Rugenstein 2015). To constrain predictions of ECS, “emergent constraints” have been hypothesized to relate the present-day or recent state of climate models—which can be tested against observations—to their predictions of ECS. In theory, climate models with superior observed fidelity for emergent constraint metrics would be more likely to make accurate predictions of ECS, which could help quantify and constrain uncertainty in ECS. Dozens of emergent constraints have been hypothesized, but little consensus exists on how to determine whether hypothesized emergent constraints are credible.
Demonstrating an empirical correlation between a metric and ECS in the Coupled Model Intercomparison Project (CMIP) ensembles is the de facto prerequisite to consideration as an emergent constraint (Klein and Hall 2015). But while the CMIP ensembles are the largest collection of diverse models available, CMIP samples are interdependent and selected arbitrarily, that is, they are neither randomly nor systematically drawn from a larger population of climate models (Knutti 2010) that would, theoretically, represent the uncertainty in modeling and predicting climate. These sampling concerns complicate quantitative analysis of emergent constraints’ credibility.
Single-model perturbed-physics ensembles (SMEs) are collections of model configurations that differ for well-understood reasons within the structure of a single climate model. These ensembles are commonly used to estimate parametric sensitivity and uncertainty in climate models (e.g., Murphy et al. 2004; Sanderson et al. 2008; Klocke et al. 2011; Shiogama et al. 2012; Gettelman et al. 2012) or to optimize parameters according to objective functions (e.g., Yang et al. 2013; Tett et al. 2017). The model configurations within any of these ensembles exist according to decisions made about which uncertain model parameters to perturb, the selection of perturbation values, and whether to weight models according to performance. SMEs have been shown to be insufficient for developing emergent constraints that are robust across multimodel ensembles (MMEs; Klocke et al. 2011), but carefully constructed SMEs can be used to examine emergent constraints from MMEs (Masson and Knutti 2013; Klein and Hall 2015). Such tests are not definitive: despite their capacity for generating large sample sizes, SME samples could all be similarly flawed and unable to simulate the empirical evidence for or against the emergent constraints, regardless of whether the constraint is grounded in a real physical process. However, the same disclaimer must be made of some or all members of multimodel ensembles. The usefulness and interpretability of any SME should be determined on a case-by-case basis, depending on the scientific question and the details of the SME design.
To test an emergent constraint, we argue that a SME should be calibrated such that its model parameters are neither uniformly sampled across a range, which could admit model configurations that fail to uphold a true emergent constraint because of inadequacy, nor restricted to optimized model configurations, which could contain correlations to ECS because of a lack of parametric diversity (Masson and Knutti 2013). Optimized models could also fail to generate correlations between any metric and ECS because of a lack of diversity in the metric, known as underdispersion (Yokohata et al. 2012), and they could overfit the observations. In this context, we describe a SME as “calibrated” if its samples of model configurations approximate the posterior probability density (ppd) of model configurations according to the likelihood that they could have simulated the observed climate. To the extent that a calibrated SME is “weighted,” it is so by having more model configurations (samples) from parts of the parameter space that are more likely according to observations and fewer from parts of the parameter space that are less likely but cannot be ruled out. Consider two subspaces A and B of the parameter space. If the climate observations d are 10 times more likely to have been simulated by models from subspace A than from subspace B, the calibrated ensemble will contain 10 times more models with parameters from subspace A than from subspace B (without any duplicate model configurations). For hypothesis testing, calibration allows the ensemble to be interpreted through the “one model one vote” (Knutti 2010) paradigm.
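The proportionality between sample density and likelihood described above can be illustrated with a toy sampler. The following is a minimal sketch, not the calibration procedure used for the CAM3 SME: it assumes a single hypothetical scalar parameter `theta`, a uniform prior on [0, 2], and a made-up `likelihood` that is 10 times larger in subspace A than in subspace B. It uses a plain Metropolis chain, which repeats a state whenever a proposal is rejected, whereas the calibrated ensemble described in the text contains no duplicate configurations.

```python
import random


def likelihood(theta):
    """Hypothetical likelihood of the observations d given parameter theta.

    Subspace A is 0 <= theta < 1 and subspace B is 1 <= theta <= 2.
    Configurations in A are 10 times more likely than those in B.
    """
    if 0.0 <= theta < 1.0:
        return 10.0
    if 1.0 <= theta <= 2.0:
        return 1.0
    return 0.0  # outside the assumed prior range


def metropolis(n_steps, step=0.5, seed=0):
    """Random-walk Metropolis sampler with a symmetric uniform proposal."""
    rng = random.Random(seed)
    theta = 0.5  # start inside subspace A, where the likelihood is nonzero
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.uniform(-step, step)
        # Accept with probability min(1, L(proposal) / L(theta)).
        if rng.random() < likelihood(proposal) / likelihood(theta):
            theta = proposal
        samples.append(theta)
    return samples


samples = metropolis(200_000)
n_a = sum(1 for t in samples if t < 1.0)
n_b = len(samples) - n_a
ratio_ab = n_a / n_b
print(f"samples in A: {n_a}, samples in B: {n_b}, ratio A/B: {ratio_ab:.1f}")
```

With a long enough chain, the ratio of samples in A to samples in B approaches the likelihood ratio of 10, which is the sense in which a calibrated ensemble is "weighted" by sample density rather than by explicit weights.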
In this paper, hypothesized emergent constraints from Fasullo and Trenberth (2012) and Sherwood et al. (2014) are tested using a SME derived from the National Center for Atmospheric Research (NCAR) Community Atmosphere Model, version 3.1 (CAM3.1), global climate model calibrated according to Bayesian inference (herein the CAM3 SME; Jackson et al. 2004, 2008; Jackson and Huerta 2016). CAM3 SME feedbacks are analyzed in relation to the hypothesized emergent constraints, as is the evidence for their hypothesized physical mechanisms of influence on ECS and their ability to constrain probabilistic estimates of ECS. These hypothesized emergent constraints were chosen as examples because they are simple to calculate and pertain to cloud feedback, but the process for selecting them is admittedly somewhat arbitrary, and they are not representative of the full spectrum of the growing field of hypothesized emergent constraints—for a more exhaustive review of emergent constraints, see Klein and Hall (2015), Knutti et al. (2017), and Caldwell et al. (2018).
Introduction to emergent constraints tested
Both the relative humidity (RH; Fasullo and Trenberth 2012) and lower-tropospheric mixing (LTMI; Sherwood et al. 2014) emergent constraints attempt to reduce the intermodel spread in ECS by connecting the cloud response to a warming climate to present-day climate model biases. The RH emergent constraint targets the intermodel spread in the reduction of the upward shortwave radiative flux at the top of the atmosphere (TOA) from the subtropical dry zones poleward to 50°. The differential reductions in albedo are linked to differential reductions in relative humidity and cloud amount. Model responses are hypothesized to relate to the present-day relative humidity biases in the dry, descending branch of the Hadley cell (RHD) and moist, ascending branch of the Hadley cell (RHM). Models that best resolve the stark differences in relative humidity between these branches are hypothesized to best predict the future drying attributable to the poleward expansion of the descending Hadley cell branch (Fasullo and Trenberth 2012).
The LTMI constraint relates the projected decrease in tropical low-cloud amount to the tendency for models to dry the boundary layer through turbulent mixing with the free troposphere in the present-day climate. Models that mix the most at present are hypothesized to dry the boundary layer more in a warming climate because the vertical gradient of specific humidity will steepen because of the Clausius–Clapeyron effect. The tendency to mix across the boundary layer in the present-day climate is decomposed into indices of small-scale, parameterized mixing (LTMIS) and large-scale, resolved mixing (LTMID). The indices are uncorrelated in CMIP and additive into a single LTMI index. Models with high LTMI indices at present are hypothesized to lose more low clouds and have more positive low-cloud shortwave cloud feedbacks in the tropics, subtropics, and potentially midlatitudes (Sherwood et al. 2014). For the details of calculating these emergent constraints from the present-day climate, see appendix A.
2. Methods
a. Calibrating the CAM3 SME using Bayesian inference
Markov chain Monte Carlo (MCMC) sampling approaches can solve Eq. (1) by iteratively stepping through parameter space and accepting or rejecting model configurations based on their relative likelihood (e.g., Metropolis et al. 1953; Hastings 1970). There are significant challenges to using MCMC to solve inverse problems for computationally expensive models like GCMs (Qian et al. 2016). The standard approach is to develop a computationally cheap surrogate model and use MCMC sampling on the surrogate until the sampler converges to the
The calibrated CAM3 SME consists of 165 CAM3.1 model configurations, which are realizations of M drawn from the approximate
b. Testing emergent constraints
The efficacy of an emergent constraint is judged by its correlation to ECS in the CAM3 SME models, by the apparent relevance of its hypothesized physical mechanism in the CAM3 SME, and by the constraint it provides on the posterior density of ECS in the CAM3 SME. Computing the correlation between the indices and ECS is straightforward and is the status quo for demonstrating the efficacy of an emergent constraint hypothesis on CMIP models. This study explicitly computes cloud and other climate feedbacks in the CAM3 SME using the Shell et al. (2008) radiative kernel and examines correlations between each feedback and the emergent constraints for insight into the physical mechanisms that may connect the emergent constraints to ECS. Additionally, CAM3 SME models are composited by their emergent constraint indices, and the spatial patterns of the cloud feedbacks for the composites are compared to reveal where emergent constraints relate to the intraensemble spread in cloud feedback. Finally, the impact of updating estimates of ECS using evidence from each emergent constraint is tested as described below. For brevity we refer to hypothesized emergent constraints as emergent constraints despite the fact that they are not “confirmed” (Klein and Hall 2015).
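The first of these tests, the correlation between an index and ECS, can be sketched as follows. This is an illustrative example only: the data are synthetic, the names `index` and `ecs` are hypothetical, and the permutation test is one possible way to assess significance at the 95% level; the text does not state which significance test was used.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Two-sided p value for r, from shuffling y relative to x."""
    rng = random.Random(seed)
    r_obs = abs(pearson_r(x, y))
    y_shuffled = list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= r_obs:
            count += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (count + 1) / (n_perm + 1)


# Synthetic stand-in for an ensemble: one index value and one ECS value (K)
# per model configuration. These numbers are invented for illustration.
rng = random.Random(1)
index = [rng.gauss(0.0, 1.0) for _ in range(165)]
ecs = [3.0 + 0.3 * v + rng.gauss(0.0, 0.5) for v in index]

r = pearson_r(index, ecs)
p = permutation_p_value(index, ecs)
significant = p < 0.05
print(f"r = {r:.2f}, p = {p:.3f}, significant at 95%: {significant}")
```

A permutation test avoids distributional assumptions about the index, but it still treats ensemble members as exchangeable, which is itself an assumption for samples drawn by a Markov chain.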
Constraining estimates of ECS
3. Results
a. Characterizing the feedbacks and ECS in the CAM3 SME
Global-mean shortwave (SW) and longwave (LW) cloud feedback (
Correlation coefficients between climate feedback parameters and ECS in the CAM3 SME (including the “outlier model configuration”).
In the CAM3 SME ensemble mean,
The intraensemble variance in
The spatial pattern of the difference in
b. Testing emergent constraints: Correlation to ECS
Both the RH and LTMI constraints are correlated to ECS in the CAM3 SME, but with weaker correlations than in CMIP (Table 2). Each hypothesis is composed of two indices, and for both hypotheses, only one index is found to be significantly correlated to CAM3 SME ECS. We also recompute the emergent constraints over a subset of CMIP3 and CMIP5 models and generally replicate the published findings (Tables 2 and A1).
Table 2. Correlation coefficients between hypothesized emergent constraints and ECS. Bold font indicates statistical significance at the 95% level. Subscript p indicates a published value, and subscript c indicates a calculation from this paper in an effort to replicate published results.
1) Correlation: RH constraint
The correlations between RH and ECS in the dry zone
Both RHD and RHM correlations could have been improved in the SME if the domains of their calculations were shifted northward about 7° (Fig. 6a); however, this would displace each domain north of the peak dry and humid zones. All CAM3 SME members have a wet bias in RHD, the descending branch of the circulation. This is likely due to underdispersion of the CAM3 SME (Yokohata et al. 2012), but may also relate to the dry zone being defined higher in the atmosphere than the peak dryness in the CAM3 SME—in the CAM3 SME’s peak dry zone, 103 of the 165 model configurations are within the observed dry zone relative humidity, albeit over a smaller spatial domain. The RH constraint failed conclusively across a subset of CMIP5 piControl models (n = 18), with weak correlations opposite the hypothesized sign (Fig. 6c). This result is consistent with the findings from the supplementary materials in Fasullo and Trenberth (2012), which showed the RH constraint’s inability to predict twenty-first century warming under the RCP8.5 scenario.
2) Correlation: LTMI constraint
The small-scale, parameterized index of LTMI (LTMIS) is correlated to ECS in the CAM3 SME (
c. Testing emergent constraints: Physical mechanisms
1) Physical mechanisms: RH constraint
While the hypothesized physical mechanism for the RH constraint is not fully described, Fasullo and Trenberth (2012) speculate that RHM and RHD relate to future cloud loss in the midlatitudes because of the poleward expansion of the Hadley cell. Hadley cell expansion is a likely explanation for the positive CAM3 SME ensemble mean
Cloud feedback is defined by the change in radiative flux due to changes in clouds normalized by the change in global-mean temperature. The CAM3 SME midlatitude change in adjusted shortwave cloud forcing (change in cloud forcing adjusted for changes in clear sky radiative forcing) is correlated to ECS. But the midlatitude cloud feedback is relatively similar for all CAM3 SME ensemble members (the zonal mean midlatitude
Is the RH constraint related to that signal? To test that hypothesis, we test whether the RH constraint is associated with the change in midlatitude-adjusted shortwave cloud forcing, without normalizing by global-mean temperature change. Because of bias and underdispersion in the CAM3 SME for the RHD metric, we focus on RHM.
2) Physical mechanisms: LTMI constraint
For LTMIS the relationship with
For LTMID the relationship to ECS is complex, as there are two groups of models that have high ECS and
High- (low) LTMID models tend to have high (low) parameter values for tau, alfa, and icritc (see Table B1), fewer (more) low clouds along the flanks of the ITCZ, and more (less) shallow convective precipitation. Neither group is more credible than the other as indicated by the model skill scores for the present-day climates. But one line of evidence suggesting the
A final comment on the physical mechanism for LTMI relates to the suggestion of Sherwood et al. (2014) and Klein and Hall (2015) that the hypothesized physical mechanism for LTMI may be incomplete, given that the correlation between LTMI and ECS is stronger than the correlation between LTMI and a subset of CMIP models providing shortwave cloud radiative effect (Sherwood et al. 2014). This finding is replicated in the CAM3 SME: LTMI is more correlated to ECS
Correlation coefficients between hypothesized emergent constraints and ECS or selected global-average climate feedback parameters λ. Variables Q and α are water vapor and surface albedo, respectively, and SW and LW are the shortwave and longwave components of climate feedbacks. Bold font indicates statistical significance at the 95% level.
d. Testing emergent constraints: Posterior estimates of ECS
The posterior probability mass function for ECS, computed using Eqs. (4) and (5) and generating weighted histograms for each emergent constraint, is narrowed and shifted upward for RHD and LTMI (Fig. 11). Any metric with a relationship to ECS and sufficient observational certainty will change the posterior probability, but the credibility of the shift should be considered a function of the credibility of the constraint itself. The posterior estimate for RHD is dubious because the CAM3 SME fails to span the observed range (Fig. 5)—each model configuration’s likelihood is exceedingly low, and the posterior ECS peaks at just above 3 K because a handful of models with an ECS of approximately 3 K are incrementally closer to the observed range than the other model configurations. Given such a large RHD bias, it is doubtful that those incrementally improved models are actually better estimators of ECS. The shift and narrowing of the posterior mass for LTMI may be credible. Its credibility hinges on our acceptance that the LTMID metric, nonlinear as it is, is a reliable predictor of ECS. We must also accept the given bounds of observational uncertainty in LTMID from Sherwood et al. (2014). However, interannual and interdataset variability across reanalysis products for LTMID indicates a larger observational uncertainty may be more accurate (C. Schumacher 2018, personal communication).
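The weighted-histogram construction described above can be sketched as follows. Because Eqs. (4) and (5) are not reproduced here, this example assumes a Gaussian likelihood centered on the observed index value with standard deviation equal to the observational uncertainty; the function names and all numbers are hypothetical and serve only to show how member weights reshape the ECS histogram.

```python
import math


def gaussian_likelihood(x, obs, sigma):
    """Assumed likelihood of a member's index x given an observed value."""
    return math.exp(-0.5 * ((x - obs) / sigma) ** 2)


def posterior_pmf(ecs, index, obs, sigma, bin_edges):
    """Weighted histogram of ECS, normalized to sum to 1.

    Each ensemble member is weighted by the likelihood of its emergent
    constraint index. Members outside the bin range are ignored, and the
    last bin includes its upper edge.
    """
    weights = [gaussian_likelihood(v, obs, sigma) for v in index]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all member likelihoods underflowed to zero")
    n_bins = len(bin_edges) - 1
    pmf = [0.0] * n_bins
    for e, w in zip(ecs, weights):
        for k in range(n_bins):
            in_bin = bin_edges[k] <= e < bin_edges[k + 1]
            on_top_edge = k == n_bins - 1 and e == bin_edges[-1]
            if in_bin or on_top_edge:
                pmf[k] += w / total
                break
    return pmf


# Invented five-member ensemble: ECS (K) and an emergent constraint index.
ecs = [2.2, 2.8, 3.1, 3.4, 3.9]
index = [0.2, 0.4, 0.5, 0.6, 0.9]
bin_edges = [2.0, 3.0, 4.0]

# A very large sigma gives nearly equal weights (one model, one vote).
prior = posterior_pmf(ecs, index, obs=0.6, sigma=1e6, bin_edges=bin_edges)
# A tight observational constraint favors members with index near 0.6.
posterior = posterior_pmf(ecs, index, obs=0.6, sigma=0.1, bin_edges=bin_edges)
print("prior:", prior)
print("posterior:", posterior)
```

In this toy case the constraint shifts probability mass toward the higher ECS bin. The same arithmetic also shows the failure mode discussed for RHD: if every member lies far from the observed value, the normalized weights are dominated by whichever members are incrementally closer, however poor their absolute fit.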
4. Discussion and conclusions
We find RHM and LTMIS to be correlated to ECS through tropical
For both the RH and LTMI constraints, we find reasons to suspect the hypothesized physical mechanisms, even when the index is correlated to ECS. For RHM, there is no evident relationship to the poleward expansion of the Hadley cell in the midlatitudes. Perhaps the lack of a well-understood connection to the midlatitudes helps explain why this constraint failed in the CMIP5 models (Figs. 5 and 6c). LTMIS is correlated to reductions in midlevel cloud fraction (700–400 hPa) and weakly correlated to gains in low-cloud fraction, which is not what one would expect from the hypothesized physical mechanism of mixing-induced boundary layer drying. LTMIS is more strongly correlated to
Kamae et al. (2016) found that the LTMI emergent constraint is effective in SMEs derived from the MIROC3 and MIROC5 GCMs but only when using the older of two choices for convection schemes. Although that result applies to different sets of SMEs, we find it comparable to our finding that LTMID only has a positive correlation to ECS and
An ensemble for testing emergent constraints is only as useful as its models are both credible and diverse. A lack of diversity is a significant limitation in the interpretation of emergent constraints with SMEs. For example, it may be possible that RHM is correlated to intraensemble scatter in midlatitude cloud feedback given an ensemble with more diversity in that quantity. Gettelman et al. (2012) found that, in a SME representing a one-step-at-a-time evolution from CAM4 to CAM5, the storm-track feedbacks contribute the most warming, and Gettelman et al. (2013) found that these feedbacks have very high regression slopes against ECS. The difference in which cloud feedbacks are important to ECS can be explained by the Gettelman et al. (2012) ensemble using two different shallow cumulus schemes: Hack (1994) and Bretherton and Park (2009). Storm-track feedbacks are sensitive to the choice of scheme, so, as the CAM3 SME contains only ensemble members with the Hack shallow cumulus convection scheme, it undersamples diversity in storm-track cloud feedback relative to Gettelman et al. (2012). Similarly, the CAM3.1 low-cloud fraction is diagnosed using, among other factors, an empirical term that increases the low-cloud fraction for increased lower-tropospheric stability (Medeiros et al. 2008). This empirical relationship was not perturbed in generating the CAM3 SME, which may limit the low-cloud fraction variability relative to CMIP models in which low-cloud fraction is prognostic or is not constrained by the empirical relationship between lower-tropospheric stability and low-cloud fraction.
Part of the motivation for using a calibrated SME to test emergent constraints is that the diversity within the ensemble is mathematically defensible. However, that diversity still arises from choices in ensemble design, some of them arbitrary. Among the factors that can affect the diversity exhibited by a single-model ensemble are the sampling technique, the parameters perturbed, and the structure of the base model being perturbed. The calibrated CAM3 SME does not represent the final word on uncertainty quantification or on the merits of these constraints.
Despite the limitations of SMEs, the interpretability of a calibrated SME is a major advantage over noncalibrated MMEs and SMEs. For example, although a scientist must set bounds on parameter ranges and select an appropriate likelihood function [a challenging problem (Nosedal-Sanchez et al. 2016)], the calibration process determines how broad or narrow a region of parameter space is included in the ensemble, and the density of samples from each part of parameter space is proportional to its likelihood. The CAM3 SME’s structural uniformity may be advantageous for emergent constraint hypothesis testing, as models differ for a limited number of understandable reasons that are tied to known processes (e.g., clouds and convection as in the CAM3 SME) and traceable to a few lines of code. An analogy can be made to testing hypotheses against intermediate complexity models in that one can rule out certain hypotheses or processes that are not included in the model/ensemble. For example, because the RHM constraint is correlated to ECS in the CAM3 SME even though the ensemble lacks variability in cloud feedback because of Hadley cell expansion, we conclude that Hadley cell expansion is not a necessary component of the physical mechanism of RHM. A similar argument can be made for using CAM3.1 to generate a SME despite the fact that there are more recent versions of the CAM model. Older releases of climate models tend to be cheaper to run, which is crucial to generating a calibrated ensemble using a sampling technique like MVFSA. There is no clear reason to exclude older models from multimodel ensembles (Rauser et al. 2015), and, similarly, there is no reason not to use them for SMEs, so long as one is aware of their limitations.
Neither MMEs nor SMEs represent all of the uncertainties in climate model base state and prediction, but combining the two types of ensembles maximizes both the sample size and the amount of diversity. In the absence of an ensemble construction technique that captures both structural and parametric uncertainty, combining separate MMEs and SMEs to challenge emergent constraints is a pragmatic solution.
As computing power grows and ensemble calibration techniques mature, the posterior density of SMEs, or collections of SMEs and MMEs, could potentially gain more acceptance as actual measures of uncertainty in climate prediction. Within that framework, the potential use for emergent constraints grows: should they be found robust for reasons we understand, they could help move the interpretation of scatter in model projections from qualitative to quantitative. Our results suggest that the tested emergent constraints are not yet sufficiently reliable or well understood to be used for that purpose.
Acknowledgments
We thank Dr. Steven Sherwood for making his lower-tropospheric mixing code public and for his guidance in computing the LTMI constraint. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model output. For CMIP, the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. We thank Dr. Pedro DeNezio for his assistance in processing CMIP data and Dr. Karen Shell for publishing her radiative kernels. This work was supported by the U.S. Department of Energy Office of Science, Biological and Environmental Research Regional and Global Climate Modeling Program under Award Number DE-SC0006985. The calibration of the CAM3.1 SME was supported by allocation award ATM100049 from the Extreme Science and Engineering Discovery Environment (XSEDE) program, which is supported by the National Science Foundation. We thank three anonymous reviewers for their thoughtful and helpful advice on how to improve this paper.
APPENDIX A
Calculating Emergent Constraints and Cloud Feedbacks
a. Calculating emergent constraints
1) RH constraint
The domains used by these authors to calculate the RH constraint are
Latitude: RHD (from −21° to −9.5°), RHM (from −1° to 10°)
Pressure: RHD (from 480 to 310 hPa), RHM (from 800 to 600 hPa).
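As an illustration of how an index could be computed over these domains, the sketch below averages a relative humidity field over each latitude and pressure box. It assumes a zonal-mean RH field on discrete pressure levels, cosine-of-latitude weighting, and equal weighting of pressure levels; these choices, the function name `domain_mean_rh`, and the sample values are assumptions for illustration and are not taken from the published method.

```python
import math

# Domain bounds as listed above (latitude in degrees, pressure in hPa).
RH_DOMAINS = {
    "RHD": {"lat": (-21.0, -9.5), "plev": (310.0, 480.0)},
    "RHM": {"lat": (-1.0, 10.0), "plev": (600.0, 800.0)},
}


def domain_mean_rh(rh, lats, plevs, domain):
    """Area-weighted mean RH (%) inside one latitude and pressure box.

    rh[i][j] is the zonal-mean RH at pressure plevs[i] and latitude lats[j].
    Latitude weights are cos(latitude); pressure levels are weighted equally.
    """
    lat_min, lat_max = domain["lat"]
    p_min, p_max = domain["plev"]
    weighted_sum = 0.0
    weight_total = 0.0
    for i, p in enumerate(plevs):
        if not p_min <= p <= p_max:
            continue
        for j, lat in enumerate(lats):
            if lat_min <= lat <= lat_max:
                w = math.cos(math.radians(lat))
                weighted_sum += w * rh[i][j]
                weight_total += w
    if weight_total == 0.0:
        raise ValueError("no grid points fall inside the domain")
    return weighted_sum / weight_total


# Invented coarse grid: three pressure levels and six latitudes.
lats = [-20.0, -15.0, -10.0, 0.0, 5.0, 10.0]
plevs = [350.0, 450.0, 700.0]
rh = [
    [20.0, 20.0, 20.0, 60.0, 60.0, 60.0],  # 350 hPa
    [20.0, 20.0, 20.0, 60.0, 60.0, 60.0],  # 450 hPa
    [50.0, 50.0, 50.0, 80.0, 80.0, 80.0],  # 700 hPa
]

rhd = domain_mean_rh(rh, lats, plevs, RH_DOMAINS["RHD"])
rhm = domain_mean_rh(rh, lats, plevs, RH_DOMAINS["RHM"])
print(f"RHD = {rhd:.1f}%, RHM = {rhm:.1f}%")
```

On real model output, the grid would be much finer and the choice of vertical weighting (equal, or by layer thickness) could matter; that choice should follow the original authors' method where it is documented.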
2) LTMI constraint
- LTMIS (S) is a calculation of the difference between the annual average of monthly mean area-weighted RH and T vertical gradients over tropical oceans where the mean annual ascent at 500 hPa is in the top quartile. The RH and T gradients are normalized by the difference between their dry and saturated values
Annual-mean climate data are used to calculate S because it is sufficient to replicate the results of Sherwood et al. (2014) for the CMIP models and observations tested (Fig. A1). However, Sherwood et al. (2014) calculate an S value for each month using monthly means and then calculate its annual average.
- LTMID (D) is an area-weighted calculation of the annual mean of monthly means of the ratio between shallow and deep ascent. Divergence in the lower troposphere Δ is defined as
, where low-level ascent is and midlevel ascent is and the calculation is made at each grid cell over oceans from −30° to 30° latitude and −160° to 45° longitude. LTMID is
where the Heaviside function H requires both divergence and rising air at midlevels. We also require rising air at low levels, which is likely if not guaranteed by having rising and diverging air at midlevels. Eq. (A4) appears consistent with the code made publicly available by S. Sherwood (https://github.com/scs46/LTMI-mixing), which requires ascent at low and midlevels. The code differs slightly from the equation for D in Sherwood et al. (2014), which would admit into the domain those grid cells in which air is sinking and diverging at midlevels:
See Fig. A1 for LTMID results using Eqs. (A4) and (A5).
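The difference between the two masking choices for D described above can be made concrete with a small sketch. The functional form here is an assumption, since the equations are not reproduced in this text: pressure velocities are taken to be negative for ascent, Δ is taken as midlevel minus low-level pressure velocity, the numerator sums Δ over the admitted grid cells, and the denominator sums midlevel ascent. The names `ltmi_d`, `omega_low`, `omega_mid`, and `require_mid_ascent` are hypothetical. Restricting cells to the ocean domain and averaging in time are left to the caller.

```python
def heaviside(x):
    """Heaviside step: 1 for positive x, otherwise 0."""
    return 1.0 if x > 0.0 else 0.0


def ltmi_d(omega_low, omega_mid, area, require_mid_ascent=True):
    """Area-weighted ratio of shallow overturning to deep ascent (assumed form).

    omega_low, omega_mid: pressure velocities at low and midlevels for each
        grid cell (negative values mean rising air).
    area: grid-cell area weights.
    require_mid_ascent: if True, a cell enters the numerator only when air
        is rising at both low and midlevels; if False, cells with sinking
        air at midlevels are admitted as long as low-level air is rising.
    """
    numerator = 0.0
    denominator = 0.0
    for w_low, w_mid, a in zip(omega_low, omega_mid, area):
        delta = w_mid - w_low  # positive when low-level ascent is stronger
        mask = heaviside(delta) * heaviside(-w_low)
        if require_mid_ascent:
            mask *= heaviside(-w_mid)
        numerator += a * delta * mask
        denominator += a * (-w_mid) * heaviside(-w_mid)
    if denominator == 0.0:
        raise ValueError("no midlevel ascent in the domain")
    return numerator / denominator


# Three invented grid cells with equal area (arbitrary units):
#   cell 1: rising at both levels, stronger at low levels
#   cell 2: rising at low levels, sinking at midlevels
#   cell 3: rising at both levels, stronger at midlevels
omega_low = [-40.0, -10.0, -10.0]
omega_mid = [-20.0, 5.0, -30.0]
area = [1.0, 1.0, 1.0]

d_strict = ltmi_d(omega_low, omega_mid, area, require_mid_ascent=True)
d_loose = ltmi_d(omega_low, omega_mid, area, require_mid_ascent=False)
print(f"D requiring midlevel ascent: {d_strict:.2f}")
print(f"D admitting midlevel descent: {d_loose:.2f}")
```

In this toy case only the second cell distinguishes the two variants, which mirrors the point made in the text: the stricter mask excludes cells in which air is sinking and diverging at midlevels.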
3) Replicating the published emergent constraint results in CMIP
Each constraint was computed across a subset of models from CMIP3 and CMIP5 and correlations with ECS were compared to their authors’ published results (Table 2). A complete validation of the metric calculation, and of the authors’ published results, was not possible because of differences between CMIP model configurations used in published results and the configurations tested in this study (Table A1). However, for both the RH and LTMI constraints, the metric calculations are, to the best of our knowledge, calculated similarly or exactly the same way as by Fasullo and Trenberth (2012) and Sherwood et al. (2014).
Table A1. Model details for published and replicated calculations of emergent constraint metrics on the CMIP ensemble.
b. Cloud feedback calculation
Cloud feedbacks in the SME are calculated using the Shell et al. (2008) radiative kernel. The Planck longwave feedback (Pl) in Fig. 2 is the sum of the surface temperature, atmospheric temperature, and lapse rate feedbacks. The feedback analysis does not separate rapid tropospheric adjustments to CO2 from globally averaged surface temperature–modulated changes.
APPENDIX B
Single-Model Ensemble Perturbed Parameters
The CAM3 SME was constructed by perturbing parameters in Table B1 using MVFSA to select parameter values within the specified ranges.
Table B1. List of cloud and convection parameters perturbed in the CAM3 SME.
REFERENCES
Bretherton, C. S., and S. Park, 2009: A new moist turbulence parameterization in the Community Atmosphere Model. J. Climate, 22, 3422–3448, https://doi.org/10.1175/2008JCLI2556.1.
Caldwell, P. M., M. D. Zelinka, and S. A. Klein, 2018: Evaluating emergent constraints on equilibrium climate sensitivity. J. Climate, 31, 3921–3942, https://doi.org/10.1175/JCLI-D-17-0631.1.
Chiang, J. C. H., and A. R. Friedman, 2012: Extratropical cooling, interhemispheric thermal gradients, and tropical climate change. Annu. Rev. Earth Planet. Sci., 40, 383–412, https://doi.org/10.1146/annurev-earth-042711-105545.
Chung, E.-S., and B. J. Soden, 2015: An assessment of direct radiative forcing, radiative adjustments, and radiative feedbacks in coupled ocean–atmosphere models. J. Climate, 28, 4152–4170, https://doi.org/10.1175/JCLI-D-14-00436.1.
Fasullo, J. T., and K. E. Trenberth, 2012: A less cloudy future: The role of subtropical subsidence in climate sensitivity. Science, 338, 792–794, https://doi.org/10.1126/science.1227465.
Gettelman, A., J. E. Kay, and K. M. Shell, 2012: The evolution of climate sensitivity and climate feedbacks in the Community Atmosphere Model. J. Climate, 25, 1453–1469, https://doi.org/10.1175/JCLI-D-11-00197.1.
Gettelman, A., J. E. Kay, and J. T. Fasullo, 2013: Spatial decomposition of climate feedbacks in the Community Earth System Model. J. Climate, 26, 3544–3561, https://doi.org/10.1175/JCLI-D-12-00497.1.
Hack, J. J., 1994: Parameterization of moist convection in the National Center for Atmospheric Research Community Climate Model (CCM2). J. Geophys. Res., 99, 5551–5568, https://doi.org/10.1029/93JD03478.
Hastings, W. K., 1970: Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97–109, https://doi.org/10.1093/biomet/57.1.97.
Ingber, L., 1989: Very fast simulated re-annealing. Math. Comput. Modell., 12, 967–973, https://doi.org/10.1016/0895-7177(89)90202-1.
Jackson, C., and G. Huerta, 2016: Empirical Bayes approach to climate model calibration. Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2016-20.
Jackson, C., M. K. Sen, and P. Stoffa, 2004: An efficient stochastic Bayesian approach to optimal parameter and uncertainty estimation for climate model predictions. J. Climate, 17, 2828–2841, https://doi.org/10.1175/1520-0442(2004)017<2828:AESBAT>2.0.CO;2.
Jackson, C., M. K. Sen, G. Huerta, Y. Deng, and K. P. Bowman, 2008: Error reduction and convergence in climate prediction. J. Climate, 21, 6698–6709, https://doi.org/10.1175/2008JCLI2112.1.
Kamae, Y., H. Shiogama, M. Watanabe, T. Ogura, T. Yokohata, and M. Kimoto, 2016: Lower-tropospheric mixing as a constraint on cloud feedback in a multiparameter multiphysics ensemble. J. Climate, 29, 6259–6275, https://doi.org/10.1175/JCLI-D-16-0042.1.
Klein, S. A., and A. Hall, 2015: Emergent constraints for cloud feedbacks. Curr. Climate Change Rep., 1, 276–287, https://doi.org/10.1007/s40641-015-0027-1.
Klocke, D., R. Pincus, and J. Quaas, 2011: On constraining estimates of climate sensitivity with present-day observations through model weighting. J. Climate, 24, 6092–6099, https://doi.org/10.1175/2011JCLI4193.1.
Knutti, R., 2010: The end of model democracy? Climatic Change, 102, 395–404, https://doi.org/10.1007/s10584-010-9800-2.
Knutti, R., and M. A. A. Rugenstein, 2015: Feedbacks, climate sensitivity and the limits of linear models. Philos. Trans. Roy. Soc., 373A, 20150146, https://doi.org/10.1098/rsta.2015.0146.
Knutti, R., M. A. A. Rugenstein, and G. C. Hegerl, 2017: Beyond equilibrium climate sensitivity. Nat. Geosci., 10, 727–736, https://doi.org/10.1038/ngeo3017.
Masson, D., and R. Knutti, 2013: Predictor screening, calibration, and observational constraints in climate model ensembles: An illustration using climate sensitivity. J. Climate, 26, 887–898, https://doi.org/10.1175/JCLI-D-11-00540.1.
Medeiros, B., B. Stevens, I. M. Held, M. Zhao, D. L. Williamson, J. G. Olson, and C. S. Bretherton, 2008: Aquaplanets, climate sensitivity, and low clouds. J. Climate, 21, 4974–4991, https://doi.org/10.1175/2008JCLI1995.1.
Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, 1953: Equation of state calculations by fast computing machines. J. Chem. Phys., 21, 1087–1092, https://doi.org/10.1063/1.1699114.
Murphy, J. M., D. M. H. Sexton, D. N. Barnett, G. S. Jones, M. J. Webb, M. Collins, and D. A. Stainforth, 2004: Quantification of modelling uncertainties in a large ensemble of climate change simulations. Nature, 430, 768–772, https://doi.org/10.1038/nature02771.
Nosedal-Sanchez, A., C. Jackson, and G. Huerta, 2016: A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields. Geosci. Model Dev., 9, 2407–2414, https://doi.org/10.5194/gmd-9-2407-2016.
Qian, Y., and Coauthors, 2016: Uncertainty quantification in climate modeling and projection. Bull. Amer. Meteor. Soc., 97, 821–824, https://doi.org/10.1175/BAMS-D-15-00297.1.
Randall, D. A., and Coauthors, 2007: Climate models and their evaluation. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 591–662, http://doi.org/10.1016/j.cub.2007.06.045.
Rauser, F., P. Gleckler, and J. Marotzke, 2015: Rethinking the default construction of multimodel climate ensembles. Bull. Amer. Meteor. Soc., 96, 911–920, https://doi.org/10.1175/BAMS-D-13-00181.1.
Sanderson, B. M., C. Piani, W. J. Ingram, D. A. Stone, and M. R. Allen, 2008: Towards constraining climate sensitivity by linear analysis of feedback patterns in thousands of perturbed-physics GCM simulations. Climate Dyn., 30, 175–190, https://doi.org/10.1007/s00382-007-0280-7.
Sen, M. K., and P. Stoffa, 1996: Bayesian inference, Gibbs’ sampler and uncertainty estimation in geophysical inversion. Geophys. Prospect., 44, 313–350, https://doi.org/10.1111/j.1365-2478.1996.tb00152.x.
Shell, K. M., J. T. Kiehl, and C. A. Shields, 2008: Using the radiative kernel technique to calculate climate feedbacks in NCAR’s Community Atmospheric Model. J. Climate, 21, 2269–2282, https://doi.org/10.1175/2007JCLI2044.1.
Sherwood, S. C., S. Bony, and J.-L. Dufresne, 2014: Spread in model climate sensitivity traced to atmospheric convective mixing. Nature, 505, 37–42, https://doi.org/10.1038/nature12829.
Shiogama, H., and Coauthors, 2012: Perturbed physics ensemble using the MIROC5 coupled atmosphere–ocean GCM without flux corrections: Experimental design and results—Parametric uncertainty of climate sensitivity. Climate Dyn., 39, 3041–3056, https://doi.org/10.1007/s00382-012-1441-x.
Soden, B. J., and G. A. Vecchi, 2011: The vertical distribution of cloud feedback in coupled ocean-atmosphere models. Geophys. Res. Lett., 38, L12704, https://doi.org/10.1029/2011GL047632.
Tett, S. F. B., K. Yamazaki, M. J. Mineter, C. Cartis, and N. Eizenberg, 2017: Calibrating climate models using inverse methods: Case studies with HadAM3, HadAM3P and HadCM3. Geosci. Model Dev., 10, 3567–3589, https://doi.org/10.5194/gmd-10-3567-2017.
Villagran, A., G. Huerta, C. Jackson, and M. K. Sen, 2008: Computational methods for parameter estimation in climate models. Bayesian Anal., 3, 823–850, https://doi.org/10.1214/08-BA331.
Yang, B., and Coauthors, 2013: Uncertainty quantification and parameter tuning in the CAM5 Zhang-McFarlane convection scheme and impact of improved convection on the global circulation and climate. J. Geophys. Res. Atmos., 118, 395–415, https://doi.org/10.1029/2012JD018213.
Yokohata, T., J. D. Annan, M. Collins, C. Jackson, M. Tobis, M. J. Webb, and J. C. Hargreaves, 2012: Reliability of multi-model and structurally different single-model ensembles. Climate Dyn., 39, 599–616, https://doi.org/10.1007/s00382-011-1203-1.