1. Introduction
While multimodel mean global warming projections for comparable scenarios are similar in CMIP6 and CMIP5, some climate models in the current-generation CMIP6 ensemble project large twenty-first-century warming and exhibit high climate sensitivity (Gettelman et al. 2019; Sellar et al. 2019; Swart et al. 2019; Voldoire et al. 2019) outside the range of comparable CMIP5 projections (Meehl et al. 2020). Because of this emergence of models with higher climate sensitivity, the CMIP6 multimodel ensemble simulates a larger spread of projected warming than CMIP5 (Nijsse et al. 2020; Tokarska et al. 2020). High levels of projected twenty-first-century warming in CMIP6 models are associated with high equilibrium climate sensitivity (ECS), although there is some scatter in the relationship (Tokarska et al. 2020). Model differences in simulated changes of shortwave reflection by low-level clouds (particularly in the tropics and midlatitudes) in response to climate change have been found to dominate the uncertainties in global warming projections in past intercomparisons (Brient and Schneider 2016; Vial et al. 2013). For example, Brient and Schneider (2016) point out that most of the variance of CMIP5 models’ ECS can be explained by model differences in low cloud–induced shortwave reflection over tropical oceans. Recent studies provide evidence that changes to the representation of cloud processes in CMIP6 models result in simulations that agree better with satellite datasets in the midlatitudes than prior generations of models (Myers et al. 2021). However, there is a broad spread in cloud feedbacks in CMIP6, resulting in a broad range of climate sensitivity (Schlund et al. 2020; Zelinka et al. 2020).
Statistical methods combining observations and model simulations are an effective approach to constraining projected warming and climate sensitivity. For example, Sherwood et al. (2020) consider multiple lines of evidence, including historical warming and the paleoclimate record as well as process understanding of feedbacks, to estimate the effective climate sensitivity using a Bayesian framework. Ribes et al. (2021) use climate models to provide a prior on the forced response and use the observational record to obtain its posterior distribution. Hattab et al. (2019) use a principal component regression method and discuss the selection of a robust set of observable predictors to estimate climate sensitivity.
The application of emergent constraints is an effective way to narrow the projected warming spread (Bretherton and Caldwell 2020; Nijsse et al. 2020; Tokarska et al. 2020). An emergent relationship between an observable quantity in the historical or present period and a quantity related to the future climate (e.g., twenty-first-century warming) can serve as the basis to constrain projections. The emergent relationship is usually motivated by physical understanding of a process driving climate feedbacks, and observational estimates must exist to distinguish models with a realistic representation of the process from those with a less realistic representation (Bretherton and Caldwell 2020; Meehl et al. 2020; Tokarska et al. 2020). For an emergent constraint to be robust, a clear physical mechanism is required: statistical analyses in Caldwell et al. (2014) show that large correlations across models between past climate variables and future projections can occur by chance. Considering 11 metrics to constrain ECS from both CMIP5 and CMIP6 simulations, Schlund et al. (2020) find that most cloud metrics first identified using CMIP5 data show a weaker emergent constraint in CMIP6. Other lines of evidence further demonstrate that not all of these metrics are robust. In particular, Caldwell et al. (2018) use a feedback decomposition analysis to evaluate the performance of several emergent constraints on ECS and show that only 4 of 19 cloud constraints on ECS proposed in the literature can be considered credible.
Emergent constraints using certain cloud diagnostics have resulted in higher climate sensitivity than those obtained from unconstrained ensembles. Zhai et al. (2015) constrain ECS using the sensitivity of extratropical low cloud fraction to the seasonal cycle of sea surface temperature (SST), showing that the relatively high climate sensitivity models in CMIP3 and CMIP5 are more consistent with observed values of this metric than models with relatively low climate sensitivity. By combining information from several cloud-related metrics, Bretherton and Caldwell (2020) predict a larger constrained mean of ECS in CMIP5 than that of the raw ensemble.
Although higher or medium climate sensitivity models in CMIP5 and CMIP6 are generally found to be in better agreement with observational constraints using cloud-related metrics, recent analyses using the historical global mean surface air temperature warming trend as a constraint find the high-sensitivity models to be less consistent with observations and suggest that they might overestimate the future warming trend (Liang et al. 2020; Nijsse et al. 2020; Tokarska et al. 2020). The preferential weighting of low warming projections when using the recent warming trend as a constraint is at least partially a result of the so-called sea surface temperature pattern effect (associated with warming in the western equatorial Pacific Ocean and cooling in the eastern equatorial Pacific Ocean since about 1980), which has been shown to have a strong effect on the observed global mean near-surface air temperature (GSAT) trend in recent decades (Andrews et al. 2018; Dong et al. 2020; Gregory et al. 2020; Zhou et al. 2016, 2021). An increase in the zonal gradient of SST across the low latitudes of the Pacific Ocean has resulted in more low cloud coverage over the eastern part of the basin and a global-scale cooling effect due to the increased reflection of incoming shortwave radiation. As this strengthening of the east–west tropical Pacific SST gradient is more likely a result of internal variability than a long-term warming response, the associated cooling is not expected to persist (Forster et al. 2021; Watanabe et al. 2021). The recent Assessment Report of the Intergovernmental Panel on Climate Change assigns medium confidence to the observed changes in the pattern effect being a result of internal variability (Forster et al. 2021). Regarding future changes, there is medium confidence that the observed strengthening of the east–west SST gradient is temporary and will transition to a weakening of the SST gradient on centennial time scales. This SST pattern is not captured well by the CMIP5 or CMIP6 ensembles (Olonscheck et al. 2020). However, Olonscheck et al. (2020) found that the much larger initial condition ensembles available for CMIP6 models, compared with CMIP5, do capture the observed cooling pattern, indicating that the absence of the pattern effect in the CMIP5 and CMIP6 ensembles can be partly interpreted as a sampling bias. Therefore, the relatively low observed trend, potentially induced by internal variability, will favor low climate sensitivity models and may result in spuriously low warming projections when the observed GSAT trend is applied as a constraint.
While past studies have focused on the use of cloud metrics to constrain ECS (Bretherton and Caldwell 2020; Brient and Schneider 2016; Brient et al. 2016; Caldwell et al. 2018; Qu et al. 2014; Zhai et al. 2015), it is unclear how effectively cloud metrics constrain the transient projected climate warming, which is most relevant for adaptation and mitigation planning. This fact motivates us to investigate the use of cloud metrics to constrain CMIP6 projections of twenty-first-century warming under various SSP scenarios.
A number of issues regarding the development and implementation of emergent constraints remain outstanding. Although previous studies provide evidence of clear relationships between physically based cloud metrics and projected warming, the performance of cloud metrics as constraints relative to historical warming as a constraint is unclear. In particular, the relative impacts of internal variability on the uncertainty range of constrained projections are unclear for different categories of metrics. Furthermore, as models in the CMIP6 archive share components, the simulations they produce may not be statistically independent. The impact of this potential dependence needs to be addressed when assessing the performance of the metrics. Finally, different methods for applying constraints have been proposed. For example, some studies weight models based on their performance compared to observations (Brunner et al. 2019, 2020; Knutti et al. 2017; Lorenz et al. 2018; Sanderson et al. 2015a,b, 2017) while others use linear regression–based approaches (Cox et al. 2018; Hall et al. 2019; Nijsse et al. 2020; Schlund et al. 2020; Thackeray and Hall 2019; Tokarska et al. 2020). A direct comparison of these two approaches is needed.
To address these questions, in this paper we contrast the application of cloud metrics with the application of the global surface air temperature trend as emergent constraints on CMIP6 projected warming, and apply a multivariate linear regression model to make observationally constrained projections of twenty-first-century warming. We first apply a stepwise approach to select the most effective linear regression model from a subset of physically based metrics. The selected linear model is then evaluated in a cross-validated imperfect model test. We then use the linear regression model to constrain projected twenty-first-century GSAT changes from CMIP6 simulations. Potential model dependence is taken into account in both the metric selection and the constraint of warming projections, and different approaches to applying the constraints (weighting vs linear regression) are directly compared.
2. Data and methods
a. Model simulations
We use output from 26 global climate models participating in CMIP6 (Table 1). We calculate the various metrics considered (section 2b) using historical simulations and then use observations with the emergent constraint approach to predict GSAT changes in 2081–2100 relative to 1995–2014 under SSP5–8.5 and SSP1–2.6 scenarios (O’Neill et al. 2016; Riahi et al. 2017). For those models which contributed initial condition ensembles, we consider all ensemble members individually (we sample one random realization per model 10 000 times, as outlined in section 2d) in order to assess the contribution of internal variability, and to not bias results toward models with particularly large ensembles.
CMIP6 Historical, SSP1–2.6, and SSP5–8.5 simulations used in this study. The number of ensemble members provided for each forcing scenario is indicated in the second through the fourth columns.
b. Emergent constraint metrics considered
We consider five potential metrics for constraining GSAT projections: four cloud-related diagnostics and the historical GSAT trend (see appendix D). Caldwell et al. (2018) considered 19 cloud-based metrics proposed in the literature, and found that four have both clear physical connections with cloud feedbacks and significant correlations with ECS in CMIP5. Here, we consider these four cloud-related metrics as potential emergent constraints on twenty-first-century CMIP6 warming projections. Descriptions of these cloud metrics follow.
1) Marine boundary layer cloud metric
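A plausible form of Eq. (1), under the assumption that the metric is the regression slope of the spatially averaged monthly climatological MBL cloud fraction on the corresponding SST climatology over oceanic subsidence regions between 20° and 40° latitude in both hemispheres (cf. appendix D), is

\[
\mathrm{MBLC} \;=\; \frac{d\,\langle \overline{C} \rangle}{d\,\langle \overline{\mathrm{SST}} \rangle} \quad (\%\ \mathrm{K}^{-1}),
\]

where C denotes the MBL cloud fraction and the slope is presumably estimated across the 12 climatological months.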
The overbar and angle brackets in Eq. (1) represent the monthly climatology and spatial average, respectively. This definition of the metric is similar to that of Qu et al. (2014) but uses a region that is farther poleward, and focuses on monthly climatologies rather than interannual variations to reduce the uncertainty from internal variability. With future climate warming, an enhanced vertical humidity gradient between the marine boundary layer (MBL) and the free troposphere in subsidence regions can lead to greater in-cloud buoyancy and stronger production of turbulent kinetic energy in the MBL. This process results in a weaker temperature inversion in the MBL, which can decrease low cloud formation (Bretherton 2015; Rieck et al. 2012; Sherwood et al. 2014; Vial et al. 2016). As described in Zhai et al. (2015), the sensitivity of MBLC fraction to seasonal SST changes is similar to its sensitivity to centennial SST changes. The different model sensitivities of MBLC fraction to SST changes on centennial time scales account for much of the intermodel variation of climate sensitivity. Furthermore, as mentioned in section 1, the larger shortwave cloud feedback in the CMIP6 ensemble relative to the CMIP5 ensemble on average can be attributed to changes in the simulation of midlatitude, mixed-phase clouds (especially in the Southern Hemisphere; Zelinka et al. 2020). These facts motivate the use of the MBLC metric as an observational constraint. For the observed value of the MBLC metric, we use the value reported by Zhai et al. (2015) of −1.28% ± 0.19% K−1 (mean ± 1 standard deviation), estimated over the period 2006–10.
2) Brient cloud albedo metric
The Brient cloud albedo (BCA) metric measures the sensitivity of the anomalies of shortwave cloud albedo to SST changes over the tropical oceans (Brient and Schneider 2016). The regions considered are the monthly-varying driest quartile of ocean grid cells between 30°S and 30°N, based on 500-hPa relative humidity. In climate models, the sensitivity of the tropical low cloud albedo to the underlying surface temperature in the present-day climate correlates with the strength of the shortwave tropical low cloud feedback and with future projected warming. Thus, the sensitivity of the variation of tropical low cloud reflection to changes of SST in the present climate can be used as an emergent constraint on future GSAT changes (Brient and Schneider 2016). A significant correlation between this metric and ECS has been found in CMIP6 simulations (Schlund et al. 2020).
3) Lower-tropospheric mixing index
Sherwood et al. (2014) developed three metrics to measure lower-tropospheric mixing: the Sherwood S metric, the Sherwood D metric, and the lower-tropospheric mixing index (LTMI), all of which are correlated with CMIP5 projected warming. The Sherwood S metric quantifies climatological small-scale mixing in the tropical lower free troposphere. The Sherwood D metric quantifies the large-scale mixing over the tropical lower troposphere. LTMI is defined as the sum of Sherwood D and S metrics (Sherwood et al. 2014). Climate models with stronger vertical moisture mixing in the lower troposphere tend to have a larger increase of moisture mixing with climate warming, which could decrease the boundary layer clouds because of the stronger convective drying under climate change (Sherwood et al. 2014). Sherwood et al. (2014) demonstrated that the S, D, and LTMI metrics are correlated with equilibrium climate sensitivity across climate models because of this link to low cloud feedbacks. Schlund et al. (2020) consider all three metrics as potential constraints, and show that only LTMI is significantly correlated with ECS in both CMIP5 and CMIP6 simulations. Therefore, in our study, we take LTMI as a potential constraint.
4) Brient cloud shallowness metric
The Brient cloud shallowness (BCS) metric is defined in terms of the cloud fractions below 850 hPa (CF850) and below 950 hPa (CF950) over weakly subsiding tropical ocean regions (Brient et al. 2016). Models that have shallower clouds over weakly subsiding tropical regions (a large BCS) in the historical period tend to be more strongly influenced by convective drying of the planetary boundary layer relative to turbulent moistening under climate warming, which further decreases low cloud cover and leads to a larger positive low cloud feedback. The detailed physical mechanism is rather complicated, involving different partially cancelling effects, and is described in detail in Brient et al. (2016). Further evidence of the relation of BCS to cloud feedbacks comes from a sensitivity test considering the lateral entrainment rate of shallow convection in MPI-ESM (Mauritsen and Roeckner 2020). Mauritsen and Roeckner (2020) found that weak lateral entrainment rates lead to more stratiform clouds within the boundary layer (larger BCS), and that convection-induced drying leads to a stronger reduction of low cloud fraction under climate warming and a higher equilibrium climate sensitivity. As our observational BCS value, we use the value based on CALIPSO/GOCCP (GCM Oriented Cloud CALIPSO Product) data reported by Bretherton and Caldwell (2020): 45% ± 3% over 2006–12.
5) GSAT trend metric
Past warming simulated by climate models is well correlated with future climate warming, especially for the 1970–2014 period (Brunner et al. 2020; Jiménez-de-la-Cuesta and Mauritsen 2019; Liang et al. 2020; Nijsse et al. 2020; Tokarska et al. 2020). The high correlation between the GSAT trend (GT) over 1970–2014 and projected warming emerges because of the small change of aerosol forcing relative to the dominant contribution from greenhouse gases over this period. In our analysis, we use the spatially infilled HadCRUT5 dataset (Morice et al. 2021) to compute the observational temperature trend (0.018 ± 0.001 K yr−1 over 1970–2014; the quoted observational uncertainty is the 5%–95% range across the 200-member HadCRUT5 ensemble).
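As an illustration of how the observed GT value and its uncertainty can be obtained, the following minimal Python sketch computes the 1970–2014 least-squares trend for each member of the HadCRUT5 ensemble and summarizes the spread; the array names are hypothetical placeholders for annual-mean GSAT anomalies, not part of the paper's code.

```python
import numpy as np

def gsat_trend(years, gsat):
    """Ordinary least squares linear trend (K per year) of annual-mean GSAT."""
    slope, _intercept = np.polyfit(years, gsat, 1)
    return slope

def observed_gt(years, hadcrut5_members):
    """Best estimate and 5%-95% range of the GT metric across the HadCRUT5
    ensemble (hadcrut5_members: hypothetical array of shape
    (n_members, n_years) holding annual GSAT anomalies for 1970-2014)."""
    trends = np.array([gsat_trend(years, member) for member in hadcrut5_members])
    return trends.mean(), np.percentile(trends, [5, 95])
```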
c. Linear regression and stepwise metric selection
Our aim is to integrate information from all the metrics we consider in order to constrain projected warming based on CMIP6 models. Our primary method for doing this uses multivariate linear regression with stepwise selection (Senftleben et al. 2020), described in detail in appendix A. To avoid using multiple metrics describing related processes in a single linear regression model, we use a stepwise regression method to select a subset of metrics. A risk of using all possible metrics in a single linear regression model is overfitting resulting from spurious relations between historical metrics and future projections (Bracegirdle and Stephenson 2012). We apply an iterative approach combining forward selection with backward elimination to build the multiple regression model that best projects future GSAT changes (von Storch and Zwiers 1999). Specifically, this approach iteratively adds variables to and eliminates variables from the linear regression model, stopping when the explained sum of squares (ESS) does not change significantly based on an F test (with significance level p = 0.1). An alternative model weighting approach (Brunner et al. 2019, 2020; Knutti et al. 2017; Lorenz et al. 2018; Sanderson et al. 2015a,b, 2017) to constraining the ensemble is presented in appendix B.
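As a concrete illustration of the step criterion, the Python sketch below implements a partial F test comparing nested regression models fit to intermodel metric and warming data. The function names, and the use of an effective sample size n_eff (20, per appendix C), are assumptions for illustration rather than the paper's exact implementation.

```python
import numpy as np
from scipy import stats

def _rss(y, X):
    """Residual sum of squares of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ beta) ** 2)

def partial_f_test(y, X_reduced, X_full, n_eff=20):
    """F test for whether adding predictors (extra columns of X_full beyond
    those in X_reduced) significantly increases the explained sum of squares.
    y: projected warming across models; X_*: metric matrices (models x metrics);
    n_eff: effective number of independent models (appendix C)."""
    q = X_full.shape[1] - X_reduced.shape[1]           # number of added metrics
    dof = n_eff - X_full.shape[1] - 1                  # residual degrees of freedom
    f_stat = ((_rss(y, X_reduced) - _rss(y, X_full)) / q) / (_rss(y, X_full) / dof)
    f_crit = stats.f.ppf(0.9, q, dof)                  # critical value at p = 0.1
    return f_stat, f_crit, f_stat > f_crit
```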
d. Sampling from initial condition ensembles and uncertain observational values
About half of the models participating in CMIP6 provide ensembles of multiple initial-condition realizations. While all available ensemble members could be used together to derive an observationally constrained distribution, this approach would give us no information about the influence of intra-ensemble variability on our results. As well, if we were to follow this approach, those models with larger initial condition ensembles would have a stronger influence on our results. Therefore, when applying the linear regression model or weighting approach, we generate the constrained projections using one randomly selected ensemble member per model. To evaluate the influence of internal variability in the historical simulations on constrained projections, we repeat this process 10 000 times. This sampling approach is applied to all aspects of this study, including the imperfect model test and probabilistic validation (both described in section 2e), and observationally constrained projections. For observationally constrained projections, along with random sampling from the initial condition ensembles, we also sample the observed quantities [X0 in Eq. (A2)] within their uncertainty ranges (assuming Gaussian distributions with means and standard deviations quoted above). For each random selection of ensemble members and each realization of the observations, we calculate a PDF of future GSAT change. We then average these PDFs together to calculate an overall PDF, from which we calculate a 5%–95% range of future warming (appendix A).
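The structure of this Monte Carlo procedure is sketched below in Python. Here constrained_pdf is a hypothetical helper standing in for the regression-based prediction of appendix A, and the data containers are placeholders; this is a sketch of the sampling described above, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(42)

def averaged_projection_pdf(members_per_model, obs_mean, obs_sd, x_grid,
                            constrained_pdf, n_samples=10_000):
    """Average constrained-projection PDFs over random draws of one ensemble
    member per model and of the observed metric value.

    members_per_model: list (one entry per model) of lists of realizations,
        each holding the historical metric(s) and projected warming.
    constrained_pdf: helper that fits the regression to the sampled
        realizations and returns a PDF on x_grid for the sampled observation.
    """
    pdfs = np.zeros((n_samples, x_grid.size))
    for k in range(n_samples):
        # one randomly chosen realization per model
        sample = [model[rng.integers(len(model))] for model in members_per_model]
        # one realization of the observed metric from its Gaussian uncertainty
        obs = rng.normal(obs_mean, obs_sd)
        pdfs[k] = constrained_pdf(sample, obs, x_grid)
    return pdfs.mean(axis=0)   # overall PDF; the 5%-95% range follows from its CDF
```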
The random selection of individual model realizations is important for our analysis because internal variability contributes considerable uncertainty to observational constraints, especially for metrics based on trends (Liang et al. 2020). The CMIP6 archive provides much larger initial condition ensembles than CMIP5, allowing us to better estimate the contribution of internal variability to projected uncertainty. Note that this sampling process can only partly account for internal variability because approximately half the models used only have a single ensemble member. To estimate the effect of undersampling internal variability, we artificially reduce the number of models with multiple ensemble members in observational constraints to test whether projection statistics are convergent as more multiple ensemble models are included. We find that for our main analysis using the full CMIP6 ensemble, the effects of internal variability may be underestimated in observational constraints using the GSAT trend metric, but this is not the case when applying climatologically based cloud metrics (see Text S3 and Fig. S5 in the online supplemental material).
e. Imperfect model test
We apply a cross-validated imperfect model test to assess the performance of the emergent constraint approaches, comparing the linear regression and weighting approaches, as well as the metrics used in these approaches (Brunner et al. 2020; Liang et al. 2020). We first choose one model to act as pseudo-observations (the “truth” in the imperfect model test context), and then apply emergent constraint approaches with all remaining models to predict this truth. This procedure is repeated taking each model in turn as truth. We use probabilistic validation of the imperfect model test to assess the uncertainty estimates resulting from the emergent constraints. Across all models, we note the relative frequency with which pseudo‐observations lie in each quintile of the imperfect model constrained projection (0%–20%, 20%–40%, etc.). If the constraining approach provides well-calibrated uncertainty estimates, the relative frequency within each quintile should be close to 0.2 (Liang et al. 2020). We also calculate the coverage frequency, defined as the percentage of pseudo-observations lying in the constrained 5%–95% predicted uncertainty range. If the uncertainty estimates work well, close to 90% of pseudo-observations should fall in the 5%–95% predicted uncertainty range.
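A minimal sketch of this cross-validation, assuming a hypothetical constrain helper that returns the cumulative distribution function of the constrained projection, is:

```python
import numpy as np

def imperfect_model_test(metrics, warming, constrain):
    """Leave-one-model-out evaluation of a constraining approach.

    metrics: (n_models, n_metrics) historical metric values.
    warming: (n_models,) projected warming (the "truth" when a model is held out).
    constrain: helper fitting the constraint to the remaining models and
        returning the CDF of the constrained projection given the held-out
        model's metric values (hypothetical interface).
    """
    n = len(warming)
    quintile_freq = np.zeros(5)
    covered = 0
    for i in range(n):                           # model i acts as pseudo-observations
        keep = np.arange(n) != i
        cdf = constrain(metrics[keep], warming[keep], metrics[i])
        p = cdf(warming[i])                      # percentile of the truth in the constrained PDF
        quintile_freq[min(int(p * 5), 4)] += 1
        covered += (0.05 <= p <= 0.95)           # within the 5%-95% range
    return quintile_freq / n, covered / n        # quintile frequencies, coverage frequency
```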
There is an implicit assumption underlying our approach that cloud feedbacks not directly constrained in our study are not systematically biased in CMIP6 models. As will be demonstrated in section 3d, the agreement of constrained historical GSAT temperature evolution with observations provides evidence that the effects of any such systematic bias are limited.
3. Results
a. Metric performance
Based on the ECS results of previous studies discussed in section 2b, we take the BCS, BCA, LTMI, and MBLC metrics as our potential cloud-related constraints. Since the GSAT trend metric has been widely considered as an observational constraint on projections, we also compare constrained projections obtained solely from the cloud constraints with those using both the cloud constraints and the GT. We now evaluate how well these observable metrics are correlated with twenty-first-century warming using CMIP6 simulations. As shown in Fig. 1, projected warming is significantly correlated with BCA, MBLC, and GT, with p values smaller than 0.05 across the internal variability sampling distribution (section 2d). In contrast, the LTMI and BCS correlations are weaker, with p values larger than 0.05. These results are similar to those of Schlund et al. (2020), who found that LTMI does not provide a strong emergent constraint for ECS in CMIP6 (in contrast to CMIP5). The spread of correlation coefficients is narrower for BCS, LTMI, and MBLC than for BCA or GT. The larger spread in BCA and GT reflects a larger contribution of internal variability compared with the other metrics.
Correlation coefficients between potential observational constraints and projected warming; 5%–95% uncertainty range and mean of correlation coefficients between potential constraints, evaluated from historical simulations, and simulated warming in response to the SSP5–8.5 scenario in 2081–2100 (relative to the reference period 1995–2014) based on 10 000 random samples from the initial condition ensembles (section 2d). The p values of the mean correlation coefficients for BCA, GT, MBLC, BCS, and LTMI are 0.004, 0.0007, 0.005, 0.13, and 0.32, respectively. The horizontal gray line represents the correlation value that is significant at the 0.05 level with the number of degrees of freedom estimated based on the number of independent models (appendix C). For display purposes, the signs of the MBLC metric and BCA correlations have been reversed.
b. Stepwise regression
We now investigate which combination of metrics produces the most effective emergent constraint. We first construct a regression model using only cloud metrics. The stepwise regression approach we use adds (or removes) the most (or least) important term in the linear regression model at each step, based on the results of F tests (Figs. 2 and 3). For each step that requires calculation of an F statistic (Figs. 3b–d), we account for the effects of model dependence by using 20 as the effective number of independent models (appendix C). Following the flowchart outlining the stepwise procedure (Fig. 2), we first build a single-variable linear model with the MBLC metric because, of all the cloud metrics, it produces the largest ESS value (Fig. 3a). In step 2, we build a two-variable linear model adding the BCS metric because this is the only cloud metric that results in a significant increase in ESS relative to the MBLC-only regression model. As shown in Fig. 3b, the 5th percentile of the F statistic range for MBLC+BCS versus MBLC is greater than the critical F value, which indicates a significant improvement of the linear regression model using the MBLC and BCS metrics relative to the MBLC metric only. For the other two choices, MBLC+BCA versus MBLC and MBLC+LTMI versus MBLC, the 95th percentile of the F statistic is smaller than the critical F value, indicating that these two combinations should not be considered further.
A flowchart of the stepwise regression procedure using cloud metrics. For each step, the corresponding statistics are shown in Fig. 3.
The 5%–95% uncertainty range and mean of F statistics at each step in the stepwise regression including only cloud metrics. The horizontal dotted lines represent critical F values at the 0.1 level. As discussed in section 2d, the 5%–95% uncertainty ranges are generated by randomly sampling from the initial condition ensembles. For steps 2 to 4, the vertical lines represent the F statistics obtained taking a value of 20 as the number of statistical degrees of freedom, based on an estimate of the number of independent models in the CMIP6 ensemble (appendix C; see Text S1).
In step 3 of the procedure (Fig. 3c), we carry out a backward selection step by removing the MBLC metric. Estimates of the resulting F statistic are always significantly larger than the critical F values (irrespective of internal variability or changes in the effective number of degrees of freedom; appendix C), indicating that the model with both MBLC and BCS metrics has significantly larger ESS compared with the linear model including BCS only. Therefore, the MBLC and BCS metrics are retained in the linear model in step 3.
Three-variable linear models are considered in step 4. There is no significant improvement in the fit of the combination MBLC+BCS+LTMI or MBLC+BCS+BCA relative to MBLC+BCS. The 95th percentile of the F statistic in MBLC+BCS+LTMI versus MBLC+BCS and the 90th percentile of the F statistic in MBLC+BCS+BCA versus MBLC+BCS are smaller than the critical F value. While the internal-variability-induced range of the MBLC+BCS+BCA versus MBLC+BCS F statistic crosses the critical F value, the median value is much smaller. Hence, all three-variable linear models fail to increase the ESS significantly compared with the model using the MBLC and BCS metrics (Fig. 3d). A sensitivity test addressing potential model dependence by varying the degrees of freedom used to compute the F statistic (appendix C; see Fig. S3) yields the same set of stepwise-selected metrics. Therefore, we use the MBLC and BCS metrics as constraints in our cloud-metric-based multiple diagnostic linear regression model to predict the GSAT changes in 2081–2100 under SSP5–8.5. We also carried out the stepwise selection for SSP1–2.6 (not shown) and obtained the same selected metrics as for SSP5–8.5.
Since the GT metric has been widely applied as an observational constraint in previous studies, we repeat the previous analysis including the GT metric in our stepwise regression (Fig. S4). The resulting regression model uses the MBLC and GT metrics as constraints. A schematic of the cloud metrics entering our final regression models is provided in Fig. 4. Although we do consider the use of GT, our main focus in the subsequent analysis is on the use of the MBLC and BCS metrics in our constraining process, since these two metrics are less influenced by internal variability than the GSAT trend (also see section 3c) and result in the linear regression model with the best predictive power.
Schematic plot showing the physical basis of the BCS and MBLC metrics. For the BCS metric, models that have a stronger convective control of cloud cover in subsiding regions in the current climate tend to have shallower clouds, and tend to have a larger reduction in cloud cover associated with strengthened convective drying as the climate warms. For the MBLC metric, models with a larger decrease of MBLC fraction in response to SST warming on seasonal time scales tend to have a larger decrease of MBLC fraction in response to SST warming on centennial time scales. The MBLC and BCS metrics focus on midlatitude and tropical low-level clouds, respectively. Both metrics are calculated over oceanic subsidence regions. Detailed definitions of these selected metrics are given in section 2b.
To further illustrate the statistical relationship between the metrics considered and projected GSAT changes in the CMIP6 simulations, Fig. 5 shows scatterplots of each of the GT, MBLC, and BCS metrics with late twenty-first-century warming. These scatterplots each use only a single, randomly determined ensemble member for each model to calculate the metric and the projected warming. Figure 5 also presents the observed values of the metrics (with uncertainty ranges). For the MBLC metric, models near the center of the range of simulated GSAT changes are in best agreement with observations. These results suggest that applying the MBLC metric will not shift the center of the distribution much relative to the unconstrained ensemble, consistent with previous studies of constraints on ECS (Bretherton and Caldwell 2020; Caldwell et al. 2014; Schlund et al. 2020; Sherwood et al. 2014). In contrast, observed values of the GT and BCS metrics are toward the low end of the range of simulated values. This fact is also in agreement with previous studies (Bretherton and Caldwell 2020; Brient et al. 2016; Nijsse et al. 2020; Tokarska et al. 2020).
Scatterplots showing relationships between selected constraints and projected warming: the (a) GT, (b) MBLC (x axis reversed), and (c) BCS metrics. For illustration, one ensemble member per model is used. The correlation coefficients and p values (relative to a null hypothesis of no correlation) are reported in the bottom-right corner of each panel. The vertical solid lines show the observational means, with ±1 standard deviation shown as shading. The dashed lines in each panel show the 66% confidence interval of the linear regression model [Eqs. (A3)–(A5)].
c. Imperfect model evaluation of constrained warming
Before presenting results of projected warming constrained by observations, we evaluate the performance of the emergent constraint approach in an imperfect model setting (section 2e), based on values of the root-mean-square error (RMSE) improvement (relative to the unconstrained ensemble) and correlation coefficient (r), both calculated using the pseudo-observations and the means of the constrained imperfect model ensemble (Fig. 6). We present linear regression results for each of the metrics separately, as well as linear regression and weighting results for the two pairs of metrics resulting from the stepwise selection procedure. The linear regression approach performs better than the weighting approach, resulting in larger values of both the correlation coefficient and the RMSE reduction (Fig. 6).
Evaluation of constraining approaches with stepwise-selected constraints using an imperfect model test. (a) Correlations between predicted means of constrained projections and pseudo-observations. (b) Reductions in RMSE of constrained projections compared to unconstrained projections. The performance of the linear regression model is shown in green bars and the performance of the weighting method is shown in red bars, using stepwise-selected metrics. Note that the 5%–95% uncertainty ranges (vertical bars) and means (dots) in (a) and (b) result from initial-condition sampling.
Consistent with the results of the stepwise selection procedure, the linear regression models using only a single metric do not result in constraints as effective as those using two metrics (with the exception of GT, which has median correlation and RMSE reduction values similar to the combination of MBLC and BCS metrics). On average, the constrained projections based on the combination of MBLC and GT metrics perform slightly better than those based on the combined MBLC and BCS metrics. However, the uncertainty ranges (from the initial condition sampling) of both the correlation coefficient and the RMSE improvement are much wider for the combined MBLC and GT metrics than for the combined MBLC and BCS metrics, reflecting the substantially larger internal variability in GSAT trend estimates. As the slight improvement in performance of the combination of the MBLC and GT metrics comes with a substantial increase in uncertainty, as well as a risk of biases from the poor representation of the pattern effect in CMIP6 models, these results suggest that the more robust constraints based on the cloud-based MBLC and BCS metrics should be preferred. The robustness of the MBLC and BCS metrics was also assessed for SSP1–2.6 using an imperfect model test (not shown), with results similar to those for SSP5–8.5 described above.
To test whether the linear regression method with the selected metrics provides reliable uncertainty estimates, we conduct a probabilistic validation (section 2e). Figure 7a shows the results of this analysis for 2081–2100 under SSP5–8.5 with the uncertainty estimated by assuming Gaussian regression residuals (appendix A). Also shown in Fig. 7a are the results for the unconstrained ensemble. Similar to the unconstrained projection, the MBLC and BCS metric linear regression model produces relative frequencies of approximately 0.2 in each quintile (Fig. 7a). As shown in Fig. 7b, the average fraction of pseudo-observations lying within the 5%–95% uncertainty range predicted by linear regression with the assumption of Gaussian residuals is close to 90%, and the range of this fraction due to sampling internal variability is narrow. These results indicate that the assumption of Gaussian regression residuals results in an accurate coverage probability.
(a) Histograms showing the relative frequency with which the true twenty-first-century warming in the individual SSP5–8.5 simulations lies within each of five quintiles of projected warming derived using the unconstrained and MBLC+BCS constrained approaches in an imperfect model test, aggregated across all models. Bars denote the median of the 10 000 single-member-per-model samples. The ±1 standard deviation ranges are denoted by error bars for each quintile. Note that the constrained distributions are slightly narrower than the unconstrained distributions. (b) The frequency of the fraction of pseudo-observations lying in the 5%–95% constrained uncertainty range across 10 000 samples. The blue bars and red dot show the frequency and the mean of the 10 000 samples, respectively.
d. Observational constraints
Based on the imperfect model analyses presented above, we now apply the observed metrics to constrain warming projections using the linear regression approach assuming Gaussian residuals. As described in section 2d, we account for internal variability and observational uncertainty by constructing the regression models using one randomly selected ensemble member per model (Fig. 8) together with one random realization of the observations sampled from the estimated distribution, and repeating this process 10 000 times. The undersampling of internal variability resulting from the fact that only 12 of the SSP5–8.5 models have multiple ensemble members available is potentially important for GT but negligible for the climatologically based cloud metrics (section 2d). Throughout these calculations we use a value of 20 as the effective number of independent models in the CMIP6 ensemble (appendix C).
Schematic plot to illustrate how observationally constrained projections of warming are obtained using the GSAT trend metric with our Monte Carlo approach. (left) Scatterplot of projected warming against historical warming in individual CMIP6 simulations, with one ensemble member chosen at random from each model. Two representative random samples of ensemble members and observations are illustrated in red and green. The associated regression relation (solid line and associated dashed lines show the linear regression model with corresponding 90% prediction interval), together with a realization of the observations (vertical dashed line), sampled from within its uncertainty range, is used to infer a PDF of projected warming. (right) The process is repeated 10 000 times, and the corresponding PDFs are averaged to obtain the constrained projection (refer to appendix A).
Before applying the observational constraint to future climate change, we investigate how the metrics we are considering constrain historical warming in the CMIP6 models (Fig. 9). Both the GT metric and the MBLC–BCS metric pair result in constrained historical warming consistent with observations, such that the observed GSAT always falls within the constrained 5%–95% uncertainty range. This result demonstrates that the linear regression approach with the selected metrics can capture the historical warming, and increases confidence in the constrained future warming results. It also provides further evidence that the constrained distribution using the MBLC and BCS metrics is free of a systematic bias relative to observations.
Constrained 20-yr moving-average GSAT anomalies derived using the linear regression approach with the cloud metrics (MBLC and BCS) and with the GSAT trend, compared to observations (base period: 1961–90). The observational record we use is HadCRUT5 (spatially infilled version). The x axis shows the center of the 20-yr averaging period. The green (GSAT trend) and gray (cloud metrics) shading shows 5%–95% constrained uncertainty ranges, with solid lines showing the best estimates. We account for internal variability and observational uncertainty by constructing the regression models using one randomly selected ensemble member per model and using observed quantities sampled from their uncertainty ranges (assuming Gaussian distributions with means and standard deviations quoted in section 2b) and then repeating this 10 000 times.
Constraining projections of twenty-first-century warming under SSP5–8.5 using the GT metric alone in the linear regression model produces lower values of the mean and 5th percentile of the warming distribution than either the unconstrained estimate or the constrained estimate using cloud metrics (Fig. 10 and Table 2; Liang et al. 2020; Nijsse et al. 2020; Tokarska et al. 2020). The 5%–95% uncertainty ranges of both sets of constrained projections are narrower than that of the unconstrained ensemble.
PDFs of constrained and unconstrained GSAT changes between 2081–2100 and 1995–2014 under SSP5–8.5. The curves show the predicted distribution of GSAT changes constrained using the GSAT trend (blue), constrained using cloud metrics (green), and the unconstrained distribution (black). The shading around these PDF curves displays the contribution of internal variability and observational uncertainty, estimated by sampling one ensemble member per model and sampling the observed quantities within their uncertainty ranges (assuming Gaussian distributions with means and standard deviations quoted in section 2b) 10 000 times. The solid curves correspond to the mean of these 10 000 samples. The upper horizontal bars display the respective 5%–95% projected ranges and means (numerical values are given in Table 2) corresponding to the mean solid curves [the theoretical basis for this calculation is given in Eq. (A6) in appendix A]. These results are obtained assuming a value of 20 for the number of statistical degrees of freedom of the CMIP6 ensemble (appendix C).
Best estimates and 5%–95% uncertainty ranges of projected warming under SSP5–8.5 for GSAT changes between 1995–2014 and 2081–2100. When calculating constrained uncertainty, we use a value of 20 as the number of independent models in the CMIP6 ensemble (appendix C).
Relative to the unconstrained ensemble, the constrained projections using the cloud metrics result in an increase in the 5th percentile of warming and a decrease in the 95th percentile, with little effect on the mean. Specifically, with the cloud metric constraint, the unconstrained 5%–95% uncertainty range of warming in 2081–2100 relative to 1995–2014 under SSP5–8.5 of 2.34–5.81 K is narrowed to 2.84–5.12 K. Assuming that the models are independent slightly decreases this range (see Text S1, Table 2, and Table S1 in the online supplemental material).
Similar results are found for the projection of twenty-first-century warming under the SSP1–2.6 scenario (Table 2; see also Table S1). Constraining with the cloud metrics reduces the 5%–95% uncertainty range from 0.38–2.04 to 0.60–1.70 K, and reduces the mean from 1.30 to 1.15 K.
4. Summary and discussion
Previous studies have demonstrated the existence of physically meaningful relationships between equilibrium climate sensitivity and low-level cloud metrics across CMIP5 and CMIP6 models (Brient and Schneider 2016; Brient et al. 2016; Caldwell et al. 2016, 2018; Schlund et al. 2020; Zelinka et al. 2020; Zhai et al. 2015). These relationships enable us to constrain future warming using observed values of these metrics in a complementary approach to the use of the observed GSAT trend as a constraint. We have applied two cloud metrics, a marine boundary layer cloud (MBLC) metric (Zhai et al. 2015) and the Brient cloud shallowness (BCS) metric (Brient et al. 2016), selected from a larger set of physically motivated cloud metrics using a stepwise selection process, to constrain future warming under two different scenarios, SSP1–2.6 and SSP5–8.5. In developing the emergent constraints, we have compared two different approaches: linear regression (Cox et al. 2018; Karpechko et al. 2013; Nijsse et al. 2020; Senftleben et al. 2020; Tokarska et al. 2020) and a weighting method (Brunner et al. 2019, 2020; Knutti et al. 2017; Lorenz et al. 2018; Sanderson et al. 2015a,b, 2017). Using a cross-validated imperfect model test across available CMIP6 models, we find that for the problem considered the linear regression approach produces more effective constraints than the weighting approach.
The cloud metrics we use to constrain future warming are subject to less uncertainty, likely because they are less affected by internal variability, than the GSAT trend often used to constrain warming projections (Liang et al. 2020; Nijsse et al. 2020; Tokarska et al. 2020). As a result, the robustness of constrained projections based on cloud metrics is improved compared to those based on the GSAT trend metric. Furthermore, the SST pattern effect (Andrews et al. 2018; Watanabe et al. 2021; Zhou et al. 2016, 2021) is known to affect climate sensitivity but is not captured well in the CMIP6 archive, perhaps due to undersampling of internal variability (Forster et al. 2021; Olonscheck et al. 2020; Watanabe et al. 2021). These considerations support the use of the cloud metrics rather than the GSAT trend for constraining future warming.
We account for observational uncertainty and internal variability in our analysis by sampling from the estimated distribution of observational uncertainty and sampling individual members from initial condition ensembles when constructing observationally constrained projections. Applying the multiple observed cloud metrics as constraints to future GSAT changes, we find that for both the SSP1–2.6 and SSP5–8.5 scenarios the projected warming uncertainty ranges are considerably narrower relative to unconstrained simulations, with little change in mean warming (Fig. 10; see also Fig. S7). We also find that observationally constrained projections using climatological cloud metrics have substantially reduced prediction uncertainty associated with internal variability in historical simulations relative to constrained projections using the GSAT trend. Furthermore, our study provides evidence for increasing the lower bound of the warming range of CMIP6 projections, as well as lowering the upper bound. This result differs from constrained projections based on the GSAT trend alone, which exhibit a substantial decrease in the upper bound and the mean of the projection range, but little change in the lower bound (Brunner et al. 2020; Caldwell et al. 2018; Liang et al. 2020; Nijsse et al. 2020; Tokarska et al. 2020).
Our study provides a framework to apply multiple metrics to constrain future warming that is also appropriate for constrained projections of equilibrium climate sensitivity. Our results imply that the mean climate sensitivity of the CMIP6 ensemble may not in fact be biased high as some studies have suggested, and that uncertainties in projected warming can be considerably narrowed using physically reasonable cloud constraints.
Acknowledgments.
We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which coordinated the Sixth Coupled Model Intercomparison Project, and the modeling groups listed in Table 1. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC; Grant RGPIN‐2017‐04043) and the China Scholarship Council (CSC; Grant 202008180022).
Data availability statement.
The data used in this study are openly available. The CMIP6 simulation data are available at https://esgf-node.llnl.gov/projects/cmip6/, and the HadCRUT5 dataset from which we compute the observational temperature trend is available at https://www.metoffice.gov.uk/hadobs/hadcrut5/.
APPENDIX A
Linear Regression Method
The derivation of Eq. (A4) is provided in Hooper and Zellner (1961). In Eq. (A5), M is the number of models and p denotes the number of metrics included in the regression; n is equal to 26 when we take the full number of models, and n is set to 20 when accounting for model dependence (see appendix C).
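A plausible form of these expressions, assuming the standard ordinary least squares prediction-error formulas (a sketch consistent with the definitions above, not necessarily identical to Eqs. (A3)–(A5)), is

\[
\hat{y}_0 = \hat{\boldsymbol{\beta}}^{\top}\mathbf{X}_0, \qquad
\hat{\sigma}^2 = \frac{1}{n - p - 1}\sum_{i=1}^{M}\left(y_i - \hat{y}_i\right)^2, \qquad
\sigma_{\mathrm{pred}}^2 = \hat{\sigma}^2\left[1 + \mathbf{X}_0^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}_0\right],
\]

where X is the design matrix of model metric values (including an intercept column), X_0 is the corresponding vector of observed metrics, y_i is the projected warming of model i, and the constrained PDF is centered on the predicted value with standard deviation given by the square root of the prediction variance.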
As described in section 2d, we assess the effect of internal variability by taking one random selection of one ensemble member per model and simultaneously drawing one random realization of each observed metric from our estimate of the observed distribution. We repeat this process 10 000 times, and derive 10 000 estimates of the PDF of projected warming (shown as shading in Fig. 10). We next demonstrate that the sample mean of these PDFs (shown as solid lines in Fig. 10) is an estimate of the population mean PDF of projected warming.
Based on our sampling strategy, each derived PDF with its corresponding value of μ and σ is equally probable. Sampling μ and σ from their joint distribution and then averaging the resulting conditional distributions of x gives a sample estimate of this population mean distribution (solid curves of constrained PDFs).
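Expressed compactly, and assuming each per-sample constrained PDF is Gaussian, the averaged distribution is approximated by the mixture

\[
p(x) \;\approx\; \frac{1}{N}\sum_{k=1}^{N}\mathcal{N}\!\left(x;\,\mu_k,\,\sigma_k^2\right), \qquad N = 10\,000,
\]

where μ_k and σ_k are the regression-predicted mean and standard deviation obtained from the kth random sample of ensemble members and observations.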
APPENDIX B
Weighting Method
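The weighting described below follows the performance-independence scheme of Knutti et al. (2017) and Brunner et al. (2019); under the assumption that Eq. (B1) takes this standard form, the unnormalized weight of model i is

\[
w_i \;=\; \frac{\exp\!\left(-\,D_i^2/\sigma_D^2\right)}{1 + \sum_{j \ne i} \exp\!\left(-\,S_{ij}^2/\sigma_S^2\right)}.
\]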
We then obtain W, a vector of weights for each model, by normalizing wi by the sum of wi. In Eq. (B1), M is the number of models considered, Di is the difference between the simulated value of the metric from model i and the observed value, and Sij is the difference between models i and j for the selected metric. When we weight models with multiple metrics, we calculate Di and Sij giving equal weights to each metric after normalizing each metric by the median across models. The parameter σs regulates the degree to which model similarities are penalized in the weights, and σD regulates the effect of model performance on the weights (Knutti et al. 2017; Lorenz et al. 2018). We calculate σs and σD based on approaches proposed by Knutti et al. (2017), Lorenz et al. (2018), and Brunner et al. (2019) (see Texts S1 and S2 in the online supplemental material). In contrast to the linear regression approach, this approach weights model projections based not only on their goodness of fit to historical observations but also based on similarity between models. The model weights obtained using Eq. (B1) not only yield the constrained multimodel mean, but also are used to weight models to obtain a multimodel distribution for the constrained uncertainty range (section 2e).
APPENDIX C
Model Dependence
Models from different institutions may share model components, and some individual modeling centers produce multiple closely related model versions. As a result, the individual members of the multimodel ensemble are not expected to be independent. This interdependence is accounted for in the weighting approach by downweighting models that are similar in their historical simulations. For the linear regression approach, the number of independent models must be estimated in order to determine the number of degrees of freedom in the regression. To this end, we assess model dependence using the “model genealogy” method (Masson and Knutti 2011). To build the CMIP6 model genealogy, we apply a hierarchical clustering method (Brunner et al. 2020; Knutti et al. 2013; Masson and Knutti 2011) implemented with the linkage function in Python SciPy. The intermodel distance is defined as the area-weighted mean square difference in near-surface air temperature averaged over 1970–2014 on a 1° grid between two models. We determine the number of independent models by comparing intermodel distances with inter-ensemble-member distances in a model genealogy approach (Text S1, Figs. S1 and S2). Based on the results of this analysis, in sections 3b–3d the number of independent models considered in the linear regression approach is set to 20 rather than the full number of models. The sensitivity of the stepwise selection procedure to the number of statistical degrees of freedom used is presented in Figs. S3 and S4.
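A minimal sketch of this clustering step, assuming a precomputed symmetric matrix of intermodel distances and a threshold set from typical inter-ensemble-member distances (both hypothetical inputs rather than the paper's exact procedure), is:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def effective_number_of_models(dist, threshold):
    """Estimate the number of independent models by hierarchical clustering.

    dist: (n_models, n_models) area-weighted mean-square differences in
        1970-2014 mean near-surface air temperature between model pairs.
    threshold: distance below which models are treated as dependent
        (e.g., comparable to inter-ensemble-member distances)."""
    condensed = squareform(dist, checks=False)     # condensed distance vector
    tree = linkage(condensed, method="average")    # SciPy linkage (hierarchical clustering)
    labels = fcluster(tree, t=threshold, criterion="distance")
    return len(np.unique(labels))
```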
APPENDIX D
Abbreviations for Trends and Metrics
GSAT: Global-mean near-surface air temperature
MBLC: The sensitivity of monthly marine boundary layer cloud fraction to SST changes between 20° and 40° latitude in the Southern and Northern Hemispheres
BCA: Brient cloud albedo metric, defined as the sensitivity of the deseasonalized shortwave cloud albedo to SST changes over the tropical oceans
LTMI: Lower-tropospheric mixing index, which measures the strength of lower-tropospheric mixing in ascending regions of the tropics
BCS: Brient cloud shallowness, defined in terms of the cloud fractions below 850 hPa (CF850) and below 950 hPa (CF950) over weakly subsiding tropical ocean regions
GT: Global-mean near-surface air temperature trend (GSAT trend)
REFERENCES
Andrews, T., and Coauthors, 2018: Accounting for changing temperature patterns increases historical estimates of climate sensitivity. Geophys. Res. Lett., 45, 8490–8499, https://doi.org/10.1029/2018GL078887.
Bracegirdle, T. J., and D. B. Stephenson, 2012: Higher precision estimates of regional polar warming by ensemble regression of climate model projections. Climate Dyn., 39, 2805–2821, https://doi.org/10.1007/s00382-012-1330-3.
Bretherton, C. S., 2015: Insights into low-latitude cloud feedbacks from high-resolution models. Philos. Trans. Royal Soc., A373, 20140415, https://doi.org/10.1098/rsta.2014.0415.
Bretherton, C. S., and P. M. Caldwell, 2020: Combining emergent constraints for climate sensitivity. J. Climate, 33, 7413–7430, https://doi.org/10.1175/JCLI-D-19-0911.1.
Brient, F., and T. Schneider, 2016: Constraints on climate sensitivity from space-based measurements of low-cloud reflection. J. Climate, 29, 5821–5835, https://doi.org/10.1175/JCLI-D-15-0897.1.
Brient, F., T. Schneider, Z. H. Tan, S. Bony, X. Qu, and A. Hall, 2016: Shallowness of tropical low clouds as a predictor of climate models’ response to warming. Climate Dyn., 47, 433–449, https://doi.org/10.1007/s00382-015-2846-0.
Brunner, L., R. Lorenz, M. Zumwald, and R. Knutti, 2019: Quantifying uncertainty in European climate projections using combined performance-independence weighting. Environ. Res. Lett., 14, 124010, https://doi.org/10.1088/1748-9326/ab492f.
Brunner, L., A. G. Pendergrass, F. Lehner, A. L. Merrifield, R. Lorenz, and R. Knutti, 2020: Reduced global warming from CMIP6 projections when weighting models by performance and independence. Earth Syst. Dyn., 11, 995–1012, https://doi.org/10.5194/esd-11-995-2020.
Caldwell, P. M., C. S. Bretherton, M. D. Zelinka, S. A. Klein, B. D. Santer, and B. M. Sanderson, 2014: Statistical significance of climate sensitivity predictors obtained by data mining. Geophys. Res. Lett., 41, 1803–1808, https://doi.org/10.1002/2014GL059205.
Caldwell, P. M., M. D. Zelinka, K. E. Taylor, and K. Marvel, 2016: Quantifying the sources of intermodel spread in equilibrium climate sensitivity. J. Climate, 29, 513–524, https://doi.org/10.1175/JCLI-D-15-0352.1.
Caldwell, P. M., M. D. Zelinka, and S. A. Klein, 2018: Evaluating emergent constraints on equilibrium climate sensitivity. J. Climate, 31, 3921–3942, https://doi.org/10.1175/JCLI-D-17-0631.1.
Cox, P. M., C. Huntingford, and M. S. Williamson, 2018: Emergent constraint on equilibrium climate sensitivity from global temperature variability. Nature, 553, 319–322, https://doi.org/10.1038/nature25450.
Dong, Y., K. C. Armour, M. D. Zelinka, C. Proistosescu, D. S. Battisti, C. Zhou, and T. Andrews, 2020: Intermodel spread in the pattern effect and its contribution to climate sensitivity in CMIP5 and CMIP6 models. J. Climate, 33, 7755–7775, https://doi.org/10.1175/JCLI-D-19-1011.1.
Forster, P., and Coauthors, 2021: The Earth’s energy budget, climate feedbacks, and climate sensitivity. Climate Change 2021: The Physical Science Basis, Cambridge University Press, in press.
Gettelman, A., and Coauthors, 2019: High climate sensitivity in the Community Earth System Model version 2 (CESM2). Geophys. Res. Lett., 46, 8329–8337, https://doi.org/10.1029/2019GL083978.
Gregory, J. M., T. Andrews, P. Ceppi, T. Mauritsen, and M. J. Webb, 2020: How accurately can the climate sensitivity to CO2 be estimated from historical climate change? Climate Dyn., 54, 129–157, https://doi.org/10.1007/s00382-019-04991-y.
Hall, A., P. Cox, C. Huntingford, and S. Klein, 2019: Progressing emergent constraints on future climate change. Nat. Climate Change, 9, 269–278, https://doi.org/10.1038/s41558-019-0436-6.
Hattab, M. W., C. S. Jackson, and G. Huerta, 2019: Analysis of climate sensitivity via high-dimensional principal component regression. Commun. Stat., 5, 394–414, https://doi.org/10.1080/23737484.2019.1670119.
Hooper, J. W., and A. Zellner, 1961: The error of forecast for multivariate regression models. Econometrica, 29, 544–555, https://doi.org/10.2307/1911803.
Jiménez-de-la-Cuesta, D., and T. Mauritsen, 2019: Emergent constraints on Earth’s transient and equilibrium response to doubled CO2 from post-1970s global warming. Nat. Geosci., 12, 902–905, https://doi.org/10.1038/s41561-019-0463-y.
Karpechko, A. Y., D. Maraun, and V. Eyring, 2013: Improving Antarctic total ozone projections by a process-oriented multiple diagnostic ensemble regression. J. Atmos. Sci., 70, 3959–3976, https://doi.org/10.1175/JAS-D-13-071.1.
Knutti, R., D. Masson, and A. Gettelman, 2013: Climate model genealogy: Generation CMIP5 and how we got there. Geophys. Res. Lett., 40, 1194–1199, https://doi.org/10.1002/grl.50256.
Knutti, R., J. Sedlacek, B. M. Sanderson, R. Lorenz, E. M. Fischer, and V. Eyring, 2017: A climate model projection weighting scheme accounting for performance and interdependence. Geophys. Res. Lett., 44, 1909–1918, https://doi.org/10.1002/2016GL072012.
Liang, Y., N. P. Gillett, and A. H. Monahan, 2020: Climate model projections of 21st century global warming constrained using the observed warming trend. Geophys. Res. Lett., 47, e2019GL086757, https://doi.org/10.1029/2019GL086757.
Lorenz, R., N. Herger, J. Sedlacek, V. Eyring, E. M. Fischer, and R. Knutti, 2018: Prospects and caveats of weighting climate models for summer maximum temperature projections over North America. J. Geophys. Res. Atmos., 123, 4509–4526, https://doi.org/10.1029/2017JD027992.
Manabe, S., and R. F. Strickler, 1964: Thermal equilibrium of the atmosphere with a convective adjustment. J. Atmos. Sci., 21, 361–385, https://doi.org/10.1175/1520-0469(1964)021<0361:TEOTAW>2.0.CO;2.
Masson, D., and R. Knutti, 2011: Climate model genealogy. Geophys. Res. Lett., 38, L08703, https://doi.org/10.1029/2011GL046864.
Mauritsen, T., and E. Roeckner, 2020: Tuning the MPI-ESM1.2 global climate model to improve the match with instrumental record warming by lowering its climate sensitivity. J. Adv. Model. Earth Syst., 12, e2019MS002037, https://doi.org/10.1029/2019MS002037.
Meehl, G. A., C. A. Senior, V. Eyring, G. Flato, J.-F. Lamarque, R. J. Stouffer, K. E. Taylor, and M. Schlund, 2020: Context for interpreting equilibrium climate sensitivity and transient climate response from the CMIP6 Earth system models. Sci. Adv., 6, eaba1981, https://doi.org/10.1126/sciadv.aba1981.
Morice, C. P., and Coauthors, 2021: An updated assessment of near-surface temperature change from 1850: The HadCRUT5 data set. J. Geophys. Res. Atmos., 126, e2019JD032361, https://doi.org/10.1029/2019JD032361.
Myers, T. A., R. C. Scott, M. D. Zelinka, S. A. Klein, J. R. Norris, and P. M. Caldwell, 2021: Observational constraints on low cloud feedback reduce uncertainty of climate sensitivity. Nat. Climate Change, 11, 501–507, https://doi.org/10.1038/s41558-021-01039-0.
Nijsse, F. J. M. M., P. M. Cox, and M. S. Williamson, 2020: An emergent constraint on transient climate response from simulated historical warming in CMIP6 models. Earth Syst. Dyn. Discuss., 2020, 1–14, https://doi.org/10.5194/esd-2019-86.
Olonscheck, D., M. Rugenstein, and J. Marotzke, 2020: Broad consistency between observed and simulated trends in sea surface temperature patterns. Geophys. Res. Lett., 47, e2019GL086773, https://doi.org/10.1029/2019GL086773.
O’Neill, B. C., and Coauthors, 2016: The Scenario Model Intercomparison Project (ScenarioMIP) for CMIP6. Geosci. Model Dev., 9, 3461–3482, https://doi.org/10.5194/gmd-9-3461-2016.
Qu, X., A. Hall, S. A. Klein, and P. M. Caldwell, 2014: On the spread of changes in marine low cloud cover in climate model simulations of the 21st century. Climate Dyn., 42, 2603–2626, https://doi.org/10.1007/s00382-013-1945-z.
Ramanathan, V., E. J. Pitcher, R. C. Malone, and M. L. Blackmon, 1983: The response of a spectral general-circulation model to refinements in radiative processes. J. Atmos. Sci., 40, 605–630, https://doi.org/10.1175/1520-0469(1983)040<0605:TROASG>2.0.CO;2.
Riahi, K., and Coauthors, 2017: The shared socioeconomic pathways and their energy, land use, and greenhouse gas emissions implications: An overview. Global Environ. Change, 42, 153–168, https://doi.org/10.1016/j.gloenvcha.2016.05.009.
Ribes, A., S. Qasmi, and N. P. Gillett, 2021: Making climate projections conditional on historical observations. Sci. Adv., 7, eabc0671, https://doi.org/10.1126/sciadv.abc0671.
Rieck, M., L. Nuijens, and B. Stevens, 2012: Marine boundary layer cloud feedbacks in a constant relative humidity atmosphere. J. Atmos. Sci., 69, 2538–2550, https://doi.org/10.1175/JAS-D-11-0203.1.
Sanderson, B. M., R. Knutti, and P. Caldwell, 2015a: Addressing interdependency in a multimodel ensemble by interpolation of model properties. J. Climate, 28, 5150–5170, https://doi.org/10.1175/JCLI-D-14-00361.1.
Sanderson, B. M., R. Knutti, and P. Caldwell, 2015b: A representative democracy to reduce interdependency in a multimodel ensemble. J. Climate, 28, 5171–5194, https://doi.org/10.1175/JCLI-D-14-00362.1.
Sanderson, B. M., M. Wehner, and R. Knutti, 2017: Skill and independence weighting for multi-model assessments. Geosci. Model Dev., 10, 2379–2395, https://doi.org/10.5194/gmd-10-2379-2017.
Schlund, M., A. Lauer, P. Gentine, S. C. Sherwood, and V. Eyring, 2020: Emergent constraints on equilibrium climate sensitivity in CMIP5: Do they hold for CMIP6? Earth Syst. Dyn., 11, 1233–1258, https://doi.org/10.5194/esd-11-1233-2020.
Sellar, A. A., and Coauthors, 2019: UKESM1: Description and evaluation of the U.K. Earth System Model. J. Adv. Model. Earth Syst., 11, 4513–4558, https://doi.org/10.1029/2019MS001739.
Senftleben, D., A. Lauer, and A. Karpechko, 2020: Constraining uncertainties in CMIP5 projections of September Arctic sea ice extent with observations. J. Climate, 33, 1487–1503, https://doi.org/10.1175/JCLI-D-19-0075.1.
Sherwood, S. C., S. Bony, and J. L. Dufresne, 2014: Spread in model climate sensitivity traced to atmospheric convective mixing. Nature, 505, 37–42, https://doi.org/10.1038/nature12829.
Sherwood, S. C., and Coauthors, 2020: An assessment of Earth’s climate sensitivity using multiple lines of evidence. Rev. Geophys., 58, e2019RG000678, https://doi.org/10.1029/2019RG000678.
Stephens, G. L., 1984: The parameterization of radiation for numerical weather prediction and climate models. Mon. Wea. Rev., 112, 826–867, https://doi.org/10.1175/1520-0493(1984)112<0826:TPORFN>2.0.CO;2.
Swart, N. C., and Coauthors, 2019: The Canadian Earth System Model version 5 (CanESM5.0.3). Geosci. Model Dev., 12, 4823–4873, https://doi.org/10.5194/gmd-12-4823-2019.
Thackeray, C. W., and A. Hall, 2019: An emergent constraint on future Arctic sea-ice albedo feedback. Nat. Climate Change, 9, 972–978, https://doi.org/10.1038/s41558-019-0619-1.
Tokarska, K. B., M. B. Stolpe, S. Sippel, E. M. Fischer, C. J. Smith, F. Lehner, and R. Knutti, 2020: Past warming trend constrains future warming in CMIP6 models. Sci. Adv., 6, eaaz9549, https://doi.org/10.1126/sciadv.aaz9549.
Vial, J., J. L. Dufresne, and S. Bony, 2013: On the interpretation of inter-model spread in CMIP5 climate sensitivity estimates. Climate Dyn., 41, 3339–3362, https://doi.org/10.1007/s00382-013-1725-9.
Vial, J., S. Bony, J. L. Dufresne, and R. Roehrig, 2016: Coupling between lower-tropospheric convective mixing and low-level clouds: Physical mechanisms and dependence on convection scheme. J. Adv. Model. Earth Syst., 8, 1892–1911, https://doi.org/10.1002/2016MS000740.
Voldoire, A., and Coauthors, 2019: Evaluation of CMIP6 DECK experiments with CNRM-CM6-1. J. Adv. Model. Earth Syst., 11, 2177–2213, https://doi.org/10.1029/2019MS001683.
von Storch, H., and F. W. Zwiers, 1999: Statistical Analysis in Climate Research. Cambridge University Press, 484 pp.
Watanabe, M., J. L. Dufresne, Y. Kosaka, T. Mauritsen, and H. Tatebe, 2021: Enhanced warming constrained by past trends in equatorial Pacific sea surface temperature gradient. Nat. Climate Change, 11, 33–37, https://doi.org/10.1038/s41558-020-00933-3.
Zelinka, M. D., and Coauthors, 2020: Causes of higher climate sensitivity in CMIP6 models. Geophys. Res. Lett., 47, e2019GL085782, https://doi.org/10.1029/2019GL085782.
Zhai, C. X., J. H. Jiang, and H. Su, 2015: Long-term cloud change imprinted in seasonal cloud variation: More evidence of high climate sensitivity. Geophys. Res. Lett., 42, 8729–8737, https://doi.org/10.1002/2015GL065911.
Zhou, C., M. D. Zelinka, and S. A. Klein, 2016: Impact of decadal cloud variations on the Earth’s energy budget. Nat. Geosci., 9, 871–874, https://doi.org/10.1038/ngeo2828.
Zhou, C., M. D. Zelinka, A. E. Dessler, and M. H. Wang, 2021: Greater committed warming after accounting for the pattern effect. Nat. Climate Change, 11, 132–136, https://doi.org/10.1038/s41558-020-00955-x.