## Abstract

Physical uncertainties in global-warming projections are dominated by uncertainties about how the fraction of incoming shortwave radiation that clouds reflect will change as greenhouse gas concentrations rise. Differences in the shortwave reflection by low clouds over tropical oceans alone account for more than half of the variance of the equilibrium climate sensitivity (ECS) among climate models, which ranges from 2.1 to 4.7 K. Space-based measurements now provide an opportunity to assess how well models reproduce temporal variations of this shortwave reflection on seasonal to interannual time scales. Here such space-based measurements are used to show that shortwave reflection by low clouds over tropical oceans decreases robustly when the underlying surface warms, for example, by −(0.96 ± 0.22)% K^{−1} (90% confidence level) for deseasonalized variations. Additionally, the temporal covariance of low-cloud reflection with temperature in historical simulations with current climate models correlates strongly (*r* = −0.67) with the models’ ECS. Therefore, measurements of temporal low-cloud variations can be used to constrain ECS estimates based on climate models. An information-theoretic weighting of climate models by how well they reproduce the measured deseasonalized covariance of shortwave cloud reflection with temperature yields a most likely ECS estimate around 4.0 K; an ECS below 2.3 K becomes very unlikely (90% confidence).

## 1. Introduction

Through their reflection of incoming shortwave radiation and absorption and reemission of longwave radiation, clouds regulate Earth’s energy balance (Stephens et al. 2012). But how the fraction of shortwave radiation reflected by clouds will change as greenhouse gas concentrations rise remains uncertain, with projections differing widely among climate models (Zelinka et al. 2012; Webb et al. 2013; Lauer and Hamilton 2013). Differences in the shortwave reflection by tropical low clouds (TLCs) over oceans account for much of the spread of climate projections across current models. For example, the equilibrium climate sensitivity (ECS), the equilibrium surface warming after doubling carbon dioxide concentrations, is a convenient albeit imperfect yardstick of the sensitivity of the climate system to perturbations in atmospheric greenhouse gas concentrations (Knutti and Hegerl 2008; Andrews et al. 2015). It ranges from 2.1 to 4.7 K across current climate models (Stocker et al. 2013; Forster et al. 2013). More than half of the ECS variance across models is accounted for by differences in the shortwave reflection by TLCs (Bony and Dufresne 2005; Vial et al. 2013). An increasing TLC cover dampens global warming, exerting a negative feedback, because the primary energetic effect of low clouds is to reflect shortwave radiation. Conversely, a decreasing TLC cover amplifies global warming, exerting a positive feedback. Both the magnitude and sign of this feedback are uncertain (Zelinka et al. 2012; Webb et al. 2013).

Ground- and space-based observations point toward weakening shortwave reflection by TLCs under warming and hence an amplifying feedback (Clement et al. 2009; Dessler 2010, 2013; Zhou et al. 2013; Bellomo et al. 2014). A number of recent studies have used the observed covariation of TLC reflection with surface temperature and with other environmental variables to evaluate how well climate models simulate interannual TLC variability (Qu et al. 2014, 2015b; Myers and Norris 2015; Zhou et al. 2015; Myers and Norris 2016). These studies generally indicate that models with strongly positive low-cloud feedback are more consistent with observations than models with weakly positive or negative feedback, suggesting ECS more likely lies in the upper range of model estimates. This is in line with other model–observation comparisons that also point toward higher ECS (Fasullo and Trenberth 2012; Sherwood et al. 2014; Tian 2015). By contrast, studies focusing on Earth’s energy budget generally point toward a lower ECS (Otto et al. 2013), albeit with large uncertainties. The uncertainties in the energy fluxes and different radiative forcing terms are so large that ECSs at the high end of model estimates remain consistent with the data (Marvel et al. 2015; Forster 2016). It has proven difficult to infer quantitative constraints on likely ECS from observations (Klein and Hall 2015).

Here we show how space-based observations can be used to robustly and quantitatively constrain likely ECS. We first use space-based observations to show how TLC reflection over oceans covaries with the underlying sea surface temperature (SST). We then demonstrate that the covariance of TLC reflection with SST in historical climate simulations correlates strongly with the models’ TLC feedback and ECS. This suggests that TLC reflection and its covariance with the underlying SST are controlled by similar physical processes, both as they vary temporally in the present climate and as they change under global warming. Therefore, the covariance of TLC reflection with SST provides an “emergent constraint”—an empirical relation between past variations and future trends in models, with a plausible physical basis for generalizations (Collins et al. 2012; Klein and Hall 2015). This emergent constraint can be used to constrain ECS. We obtain an observationally constrained posterior ECS estimate given current climate models through an information-theoretic weighting of the models according to how well they reproduce the observed covariance of TLC reflection with SST. The posterior ECS estimate shifts the most likely ECS upward and renders ECS at the low end unlikely, but a wide range of ECS remain consistent with the observations.

Section 2 describes the data and climate models we use and the methods we apply. It discusses several innovations in how low-cloud regions are identified and in how models are weighted to obtain a posterior ECS, which together contribute to more robust results than those obtained in previous studies. Section 3 analyzes the covariation of TLC reflection with temperature on time scales ranging from seasonal to interannual, both in observations and in historical simulations with climate models. It also presents the posterior ECS estimate obtained by weighting current climate models. Section 4 discusses the robustness of our results by examining, additionally, how TLC reflection covaries with the strength of the trade inversion, an environmental factor considered in several previous studies (e.g., Qu et al. 2014, 2015b; Myers and Norris 2015). Finally, section 5 summarizes our conclusions and their implications.

## 2. Data, models, and methods

### a. Observational data

We use monthly shortwave fluxes and insolation at the top of the atmosphere from the Clouds and the Earth’s Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) dataset, version Ed2.8, for all 183 currently available months from March 2000 through May 2015 (Loeb et al. 2009). We obtain the concurrent monthly SST from the Extended Reconstructed Sea Surface Temperature (ERSST) dataset (Smith and Reynolds 2003). As described below, we identify TLC regions on the basis of the midtropospheric (500 hPa) relative humidity from the ERA-Interim atmospheric reanalysis (Dee et al. 2011).

We interpolate all data (simulated and observed) to an equal-area grid with 240 × 121 cells globally. A fixed land mask is used for models and observations to identify ocean areas, defined as grid cells with less than 10% land.

### b. Climate simulations

We use simulation results from 29 climate models participating in phase 5 of the Coupled Model Intercomparison Project (CMIP5) of the World Climate Research Programme. The models are listed in Table 1.

For comparison of the models with the observational data, we create simulated datasets of the same length as the observational data by using 183-month periods of the historical CMIP5 simulations of the present climate. For each model, we use three nonoverlapping 183-month periods between the simulated years 1959 and 2005. (The simulations do not cover the exact period for which observations are available.) We analyze each of the three simulated periods like the observational data and pool the results to quantify the statistics of interest and their uncertainties reliably.

For computation of TLC feedbacks under global warming, we use the CMIP5 simulations in which CO_{2} concentrations were abruptly quadrupled from preindustrial levels. We calculate TLC reflection and temperature changes from the differences between years 130–149 and years 2–11 of the CO_{2} quadrupling simulations. (The results are insensitive to the length of the averaging periods.) Excluding the first year from the analysis removes the rapid cloud adjustments that occur in response to carbon dioxide concentration changes, which can be viewed as a forcing rather than a feedback (Gregory and Webb 2008; Webb et al. 2013; Zelinka et al. 2013).

We subdivide the climate models into two groups according to their ECS. The median ECS of 3.45 K separates the 14 lower-sensitivity (LS) models from the 15 higher-sensitivity (HS) models (Table 1).

### c. Low-cloud regions

Both in observations and simulations, we identify TLC regions as the 25% of the tropical ocean area (30°N–30°S) with the lowest midtropospheric (500 hPa) relative humidity (Figs. 1a,b). The results remain similar if a more restrictive threshold (e.g., 20% of the tropical ocean area) is chosen. Because the TLC regions are identified on a month-by-month basis from the relative humidity, this gives moving TLC regions that follow regions of midtropospheric dryness. The regions in which monthly means meet the midtropospheric dryness criterion frequently (Fig. 1c) broadly correspond to regions with frequent low-cloud cover (Fig. 1a). Space-based radar and lidar data (Kay and Gettelman 2009) indicate that middle and high clouds account for 15%–25% of the cloud cover in the TLC regions, suggesting at most minor contamination of our results by radiative effects of middle and high clouds. There are some regions near 30° latitude in which low-cloud fractions are relatively large (Fig. 1a), but the midtropospheric dryness criterion does not identify them frequently as TLC regions (Fig. 1c). In these regions, low clouds likely are often associated with extratropical systems, instead of or in addition to being controlled by local tropical factors, which are our focus here. The situation is similar in climate models, which have similar regions in which the midtroposphere is frequently dry; however, some model biases are also evident, for example, in the equatorial eastern Pacific, likely related to the well-known double-ITCZ bias (Figs. S1 and S2 in the supplementary material) (Lin 2007; Li and Xie 2014; Tian 2015).

Identifying low-cloud regions with a fixed percentile of the relative humidity distribution has several advantages over using other common low-cloud proxies, such as fixed regions (Klein and Hartmann 1993; Qu et al. 2014), or fixed threshold values of midtropospheric vertical velocities (Bony and Dufresne 2005; Vial et al. 2013). For example, relative humidity is more reliably observable than vertical velocity. Moreover, relative humidity and vertical velocity change dynamically in response to reorganizations of the atmospheric circulation, which may shift low-cloud regions—an effect that may be missed when considering fixed regions. And using a percentile threshold rather than a threshold in absolute values remains justifiable even when comparing warmer and colder climates, which may have different mean relative humidities or different mean vertical velocities (circulation strengths), or when comparing different climate models, which may have different relative humidity biases (Fasullo and Trenberth 2012).
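The percentile-based selection of TLC regions described in this subsection can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name `driest_fraction_mask` and its arguments are hypothetical, and the inputs are assumed to be flat lists holding one month of 500-hPa relative humidity and the corresponding cell areas for tropical ocean cells only. On an equal-area grid all areas are equal, so the selection reduces to a plain percentile of the relative humidity.

```python
def driest_fraction_mask(rh500, area, fraction=0.25):
    """Flag the driest cells that together cover `fraction` of the total area.

    rh500: 500-hPa relative humidity of each tropical ocean cell (one month)
    area:  area of each cell, in any consistent unit
    Both names are illustrative assumptions, not identifiers from the paper.
    """
    if len(rh500) != len(area):
        raise ValueError("rh500 and area must have the same length")
    target = fraction * sum(area)
    # Visit cells from driest to most humid.
    order = sorted(range(len(rh500)), key=lambda k: rh500[k])
    mask = [False] * len(rh500)
    covered = 0.0
    for k in order:
        if covered >= target:
            break
        mask[k] = True
        covered += area[k]
    return mask


# Four equal-area cells: only the driest one falls in the lowest 25%.
print(driest_fraction_mask([80.0, 10.0, 50.0, 30.0], [1.0, 1.0, 1.0, 1.0]))
```

Recomputing the mask for every month gives the moving TLC regions described above.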

### d. TLC reflection

We calculate the monthly TLC reflection *α*_{c} = −〈*S*_{c}〉/〈*I*〉 from the top-of-atmosphere shortwave cloud radiative effect (SWCRE) *S*_{c} and insolation *I* for observations and models, with angle brackets 〈⋅〉 denoting the mean over the TLC regions. The SWCRE *S*_{c} in turn is calculated from the difference between all-sky and clear-sky shortwave fluxes at the top of the atmosphere. The TLC reflection *α*_{c} then gives the fraction of the incoming shortwave radiation that is reflected by clouds in the TLC regions. Ambiguities in attributing reflection to clouds may generally arise where cloud and surface reflection cannot be clearly distinguished; however, such ambiguities should be minimal over tropical oceans. Using the TLC reflection *α*_{c} instead of SWCRE *S*_{c} to quantify shortwave cloud effects has the advantage that the effects of insolation variations with latitude or season are normalized out.
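The definition of the TLC reflection can be written out as a short sketch. It assumes (this is not stated in the text) that the inputs are upward, i.e. reflected, top-of-atmosphere shortwave fluxes on an equal-area grid, so that the SWCRE is the clear-sky minus the all-sky reflected flux and the regional mean is a plain average. The function and argument names are hypothetical.

```python
def tlc_reflection(sw_up_all, sw_up_clear, insolation, mask):
    """Return alpha_c = -<S_c>/<I> over the cells flagged in `mask`.

    sw_up_all, sw_up_clear: reflected TOA shortwave flux per cell (W m^-2),
        all-sky and clear-sky (assumed sign convention: upward positive)
    insolation: incoming TOA shortwave flux per cell (W m^-2)
    mask: booleans marking the TLC cells of this month
    """
    cells = [k for k, selected in enumerate(mask) if selected]
    if not cells:
        raise ValueError("mask selects no cells")
    # SWCRE per cell; negative where clouds increase the reflected flux.
    mean_swcre = sum(sw_up_clear[k] - sw_up_all[k] for k in cells) / len(cells)
    mean_insolation = sum(insolation[k] for k in cells) / len(cells)
    return -mean_swcre / mean_insolation


# Two TLC cells and one excluded cell, with made-up fluxes.
alpha_c = tlc_reflection([100.0, 80.0, 90.0], [60.0, 50.0, 55.0],
                         [400.0, 400.0, 400.0], [True, True, False])
print(alpha_c)
```

Because the numerator and denominator are averaged separately, the result is the fraction of the regionally averaged insolation that clouds reflect, not the average of cell-by-cell ratios.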

We also calculate how the low-cloud fraction (LCF) depends on SST. This calculation is based on LCF data (cllcalpso field) from the *Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations* (*CALIPSO*) GCM-Oriented *CALIPSO* Cloud Product (GOCCP) dataset (Chepfer et al. 2010), for June 2006 through December 2014. However, because an accurate model-to-observation comparison needs simulators to represent how satellites would see model clouds (Bodas-Salcedo et al. 2011), we prefer using top-of-the-atmosphere radiation. Our method of identifying TLC regions makes using LCF optional, and we merely present the LCF results for comparison.

### e. Regressions, stationary bootstrap, and confidence intervals

The dependence of cloud properties on surface perturbations is calculated as the regression slope between temporal anomalies. To reduce the effect of large residuals on the estimated regression coefficients, we use robust regressions to estimate the coefficients *δα*_{c}/*δ*〈*T*〉 of the regression of TLC reflection *α*_{c} onto the underlying SST 〈*T*〉 (and analogously for the regressions including the inversion strength as a predictor in section 4). As robust regression methodology, we use iteratively reweighted least squares with a bisquare weighting function (Holland and Welsch 1977). We include an intercept term in all regression estimates.
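For a single predictor with an intercept, iteratively reweighted least squares with bisquare weights can be sketched in a few lines. The tuning constant 4.685 and the residual scale (median absolute residual divided by 0.6745) are common defaults and are assumptions here; the text does not say which values were used. The function names are illustrative.

```python
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def robust_line_fit(x, y, tuning=4.685, max_iter=50, tol=1e-10):
    """Bisquare IRLS; returns (intercept, slope)."""
    intercept, slope = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust residual scale (assumed choice): median |residual| / 0.6745.
        scale = statistics.median([abs(r) for r in resid]) / 0.6745
        if scale == 0.0:
            break  # at least half the points are fitted exactly
        cutoff = tuning * scale
        # Bisquare weights: smooth down-weighting, zero beyond the cutoff.
        w = [(1.0 - (r / cutoff) ** 2) ** 2 if abs(r) < cutoff else 0.0
             for r in resid]
        new_intercept, new_slope = weighted_line_fit(x, y, w)
        converged = abs(new_slope - slope) < tol
        intercept, slope = new_intercept, new_slope
        if converged:
            break
    return intercept, slope


# Points on a line with slope -0.96 plus one gross outlier.
x = [float(k) for k in range(10)]
y = [2.0 - 0.96 * xi for xi in x]
y[5] += 10.0
print(robust_line_fit(x, y))
```

In this example an ordinary least-squares fit is pulled toward the outlier, whereas the bisquare weights drive the outlier's influence to zero after the first iteration.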

We obtain confidence intervals on regression coefficients such as *δα*_{c}/*δ*〈*T*〉 through a nonparametric bootstrap procedure, which takes the autocorrelations of the time series into account (Politis and Romano 1994). The original pairs of *α*_{c} and 〈*T*〉 time series are resampled by drawing blocks of random length *L*_{i} and assembling new pairs of bootstrap time series from them, of the same total length *L* as the original time series (the last block to be added is simply truncated to obtain the correct total length *L*). The block lengths *L*_{i} are a sequence of independent and identically distributed random variables, drawn from a geometric distribution so that the probability that a block has length *L*_{i} = *m* is *p*(1 − *p*)^{m−1}, where *p* = *b*^{−1} and *b* is the optimal block length for the time series. The optimal block length *b* is chosen so as to minimize the mean squared difference between the original time series and versions with a time shift (Politis and White 2004). The block length is chosen for the *α*_{c} time series; however, the resulting confidence intervals are essentially unchanged if the block length is calculated for the 〈*T*〉 time series or if a fixed block length is used.
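The block resampling can be sketched as below. The sketch generates one set of bootstrap indices, which is then applied to both time series so that the *α*_{c} and 〈*T*〉 values stay paired. Two points are assumptions rather than details taken from the text: blocks that run past the end of the series wrap around to its beginning (as in the circular formulation of the stationary bootstrap), and the mean block length *b* is supplied by the caller rather than selected automatically. The function name is hypothetical.

```python
import random


def stationary_bootstrap_indices(n, mean_block, rng):
    """Indices for one stationary-bootstrap resample of a length-n series.

    mean_block: mean block length b, so that p = 1/b (must be >= 1)
    rng: a random.Random instance, passed in for reproducibility
    """
    p = 1.0 / mean_block
    indices = []
    while len(indices) < n:
        start = rng.randrange(n)
        # Geometric block length: P(L = m) = p * (1 - p)**(m - 1).
        length = 1
        while rng.random() >= p:
            length += 1
        # Assumed convention: wrap around at the end of the series.
        indices.extend((start + j) % n for j in range(length))
    return indices[:n]  # truncate the last block to the total length n


rng = random.Random(0)
idx = stationary_bootstrap_indices(183, 6.0, rng)
# Apply the same indices to both series to keep the pairs aligned, e.g.
# alpha_boot = [alpha[k] for k in idx]; sst_boot = [sst[k] for k in idx]
print(len(idx), idx[:12])
```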

For each pair of observational or model time series considered, we create 200 bootstrap samples in this way. We repeat the robust regression estimation procedure for each pair of bootstrap time series, thereby obtaining 200 bootstrap samples of the regression coefficients. The bootstrap samples allow us to quantify the sampling uncertainties in the regression coefficients (e.g., because of the finiteness of the time series), robustly and without assumptions about the underlying probability distributions. To quantify the uncertainties, we fit probability density functions (PDFs) to the bootstrap samples using a Gaussian kernel density estimator with bandwidth chosen to minimize the mean integrated squared error for normal data (Bowman and Azzalini 1997). From the fitted PDFs, we obtain most likely values (modes) and confidence intervals of the regression coefficients.
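A minimal sketch of the density-fitting step follows. For a Gaussian kernel, the bandwidth that minimizes the mean integrated squared error for normal data is *h* = *σ*(4/3*n*)^{1/5}. The text does not specify how the interval is read off the fitted PDF, so the sketch assumes an equal-tailed interval evaluated on a grid; the function names and the synthetic samples are illustrative only.

```python
import math
import random
import statistics


def normal_reference_bandwidth(samples):
    """Bandwidth minimizing the MISE for normal data (Gaussian kernel)."""
    n = len(samples)
    return statistics.stdev(samples) * (4.0 / (3.0 * n)) ** 0.2


def kde_pdf(x, samples, h):
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


def kde_cdf(x, samples, h):
    root2 = math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / (h * root2)))
               for s in samples) / len(samples)


def mode_and_interval(samples, level=0.90, grid_points=401):
    """Mode and equal-tailed interval (an assumed convention) of the KDE."""
    h = normal_reference_bandwidth(samples)
    lo, hi = min(samples) - 4.0 * h, max(samples) + 4.0 * h
    grid = [lo + (hi - lo) * k / (grid_points - 1) for k in range(grid_points)]
    mode = max(grid, key=lambda x: kde_pdf(x, samples, h))
    tail = (1.0 - level) / 2.0
    lower = next(x for x in grid if kde_cdf(x, samples, h) >= tail)
    upper = next(x for x in grid if kde_cdf(x, samples, h) >= 1.0 - tail)
    return mode, (lower, upper)


# Synthetic stand-in for 200 bootstrap regression coefficients (% per K).
rng = random.Random(1)
coefficients = [rng.gauss(-0.96, 0.13) for _ in range(200)]
print(mode_and_interval(coefficients))
```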

For the estimated confidence intervals of the TLC feedback, which are based on the scatter of the feedbacks among HS and LS models, we use multiples of the standard deviation *σ* among HS and LS models. Estimating more detailed PDFs in this case is difficult to justify, given the small sample size (14 and 15 models) and the lack of independence among the models.

### f. Spectral decomposition of temporal variations

Temporal variations in TLC reflection and SST are decomposed into four frequency bands. Seasonal variations are obtained by bandpass filtering to periods between 10 and 14 months. Deseasonalized variations are obtained by removing the mean annual cycle, through removing the mean deviation from the annual mean for each month of the year. Applying 1-yr high-pass and low-pass filters to the deseasonalized variations then yields the intra-annual and interannual variations. A twelfth-order Chebyshev filter is used throughout.
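The deseasonalization step can be sketched as follows; the Chebyshev filtering is not reproduced here because the filter's ripple parameter is not given in the text. The sketch assumes that the "annual mean" is the mean of the twelve calendar-month averages, which matters slightly for a record such as the 183-month one that does not span a whole number of years. The function name and `start_month` argument are hypothetical.

```python
import math


def deseasonalize(series, start_month=0):
    """Remove the mean annual cycle from a monthly series.

    series: monthly values in time order
    start_month: calendar month of series[0], 0 = January ... 11 = December
    The long-term mean is retained; only the mean seasonal deviation is removed.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for k, value in enumerate(series):
        month = (start_month + k) % 12
        sums[month] += value
        counts[month] += 1
    if 0 in counts:
        raise ValueError("need at least one value per calendar month")
    climatology = [s / c for s, c in zip(sums, counts)]
    annual_mean = sum(climatology) / 12.0  # assumed definition of the annual mean
    cycle = [c - annual_mean for c in climatology]
    return [value - cycle[(start_month + k) % 12]
            for k, value in enumerate(series)]


# A pure annual cycle around 9.0 is flattened to a constant 9.0.
series = [9.0 + math.cos(2.0 * math.pi * k / 12.0) for k in range(183)]
flat = deseasonalize(series, start_month=2)  # record starting in March
print(min(flat), max(flat))
```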

### g. ECS estimate from multimodel inference

We obtain a posterior ECS given the observational data and climate models by assigning a weight *w*_{i} (with ∑_{i} *w*_{i} = 1) to the equilibrium climate sensitivity ECS_{i} of each model *i*. The posterior PDF and associated ECS confidence intervals are obtained through a Gaussian kernel estimate (with bandwidth again chosen to minimize the mean integrated squared error for normal data) in which each ECS_{i} is weighted by *w*_{i}.

We obtain the weights *w*_{i} on the basis of an information-theoretic distance measure between the PDFs of the observed and modeled regression coefficients *δα*_{c}/*δ*〈*T*〉. This is justifiable in our case because they are estimated from time series of the same length *L* so that their sampling variability can be expected to be equal if a model is adequate. As the (nonsymmetric) distance measure between the observational PDF *p*(*x*) and the PDF *q*_{i}(*x*) for the *i*th model, we use the Kullback–Leibler divergence:

$$\Delta_i = \int p(x)\,\log\frac{p(x)}{q_i(x)}\,\mathrm{d}x. \tag{1}$$

The Kullback–Leibler divergence Δ_{i} is the relative entropy between *p* and *q*_{i}, which indicates how much information is lost if the model’s PDF *q*_{i} is used to approximate the observed PDF *p*. Because −Δ_{i} is proportional to an entropy, its exponential *l*_{i} = exp(−Δ_{i}) is a probability. It can be interpreted as a likelihood *l*_{i} of model *i* given the observed distribution *p* of regression coefficients *δα*_{c}/*δ*〈*T*〉 (Burnham and Anderson 2010). It follows that the normalized weights,

$$w_i = \frac{l_i}{\sum_j l_j}, \tag{2}$$

are weights of evidence in favor of model *i*. In analogy with weights in Bayesian model averages, they can be interpreted as the posterior probability that model *i* is the best model for the data according to the Kullback–Leibler measure (Burnham and Anderson 2010). We estimate the Kullback–Leibler divergence (1) from the kernel density estimate of the PDFs *p* and *q*_{i} that we obtain from the bootstrap samples of the regression coefficients *δα*_{c}/*δ*〈*T*〉. Regressions that include the inversion strength as a predictor are treated analogously.
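The weighting can be illustrated with a short sketch that estimates the Kullback–Leibler divergence between two kernel density estimates by numerical integration and converts a set of divergences into normalized weights. The integration grid, its extent (the support of the observational samples plus five bandwidths), and the floor on the model density are numerical choices made for this illustration, not details from the text; the function names and sample values are hypothetical.

```python
import math
import statistics


def bandwidth(samples):
    """Normal-reference bandwidth for a Gaussian kernel."""
    return statistics.stdev(samples) * (4.0 / (3.0 * len(samples))) ** 0.2


def kde_pdf(x, samples, h):
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


def kl_divergence(p_samples, q_samples, grid_points=801):
    """Riemann-sum estimate of the integral of p * log(p / q)."""
    hp, hq = bandwidth(p_samples), bandwidth(q_samples)
    lo, hi = min(p_samples) - 5.0 * hp, max(p_samples) + 5.0 * hp
    dx = (hi - lo) / (grid_points - 1)
    total = 0.0
    for k in range(grid_points):
        x = lo + k * dx
        p = kde_pdf(x, p_samples, hp)
        if p > 0.0:
            # Floor q to avoid log(p / 0) when the model PDF underflows.
            q = max(kde_pdf(x, q_samples, hq), 1e-300)
            total += p * math.log(p / q) * dx
    return total


def model_weights(divergences):
    """w_i = exp(-Delta_i) / sum_j exp(-Delta_j)."""
    likelihoods = [math.exp(-d) for d in divergences]
    total = sum(likelihoods)
    return [l / total for l in likelihoods]


# Made-up coefficient samples: one model matches the observations, one does not.
observed = [-1.0, -0.9, -0.8, -1.1, -0.95]
model_a = [-1.0, -0.9, -0.8, -1.1, -0.95]
model_b = [0.0, 0.1, -0.1, 0.05, 0.2]
deltas = [kl_divergence(observed, m) for m in (model_a, model_b)]
print(deltas, model_weights(deltas))
```

Because exp(−Δ_{i}) is strictly positive for any finite divergence, no model receives exactly zero weight, although the weight of a strongly inconsistent model becomes negligible.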

We found this multimodel inference procedure to yield robust results that are consistent with what one expects upon close visual examination of the data and model outputs. Other procedures that, for example, first infer a linear relation (regression line) between ECS and variables such as *δα*_{c}/*δ*〈*T*〉 from models and then use that linear relation to constrain ECS given observations (cf. O’Gorman 2012) can be strongly influenced by “bad” models that are not consistent with the data but exert large leverage on the inferred slope of the regression line. If the slope of the regression line is strongly constrained by bad models, such a procedure can, misleadingly, yield very narrow ECS estimates that could not be justified by focusing on “good” models, which are broadly consistent with the data. By contrast, our multimodel inference procedure assigns very little weight to models that are inconsistent with the data. However, it should be kept in mind that the resulting posterior ECS still is only based on how well models simulate selected climate variables, such as the regression coefficient *δα*_{c}/*δ*〈*T*〉. Better ECS bounds may be obtained by simultaneously taking more climate variables or regression coefficients *δα*_{c}/*δ*〈*T*〉 in multiple frequency bands into account. Also, the multimodel inference procedure neglects interdependences among climate models, which may skew results.

## 3. Results

### a. Observed variations of low-cloud reflection

Figure 2a shows deseasonalized variations of the low-cloud reflection *α*_{c} over the TLC regions identified month by month from the midtropospheric relative humidity, with the underlying SST variations in Fig. 2b. The TLC reflection *α*_{c} averaged over the measurement period is 8.95%. Temporal variations *δα*_{c} of the TLC reflection are negatively correlated with temperature variations *δ*〈*T*〉: TLC reflection decreases when the temperature increases, with a robust regression giving *δα*_{c}/*δ*〈*T*〉 ≈ −(0.96 ± 0.22)% K^{−1} for deseasonalized data (Fig. 2c). (Uncertainty ranges throughout this paper indicate 90% confidence intervals, estimated by the stationary bootstrap procedure described in section 2e.) This suggests a positive shortwave feedback from TLCs, because similar processes can be expected to govern the TLC response to SST changes on all time scales longer than seasonal, and because it is unlikely that much of the underlying SST variations are driven by TLC variations; rather, the deseasonalized SST variations are primarily driven by large-scale phenomena, such as El Niño–Southern Oscillation (Dessler 2010).

The reduction of TLC reflection *α*_{c} with warming implies a corresponding SWCRE change, *δ*〈*S*_{c}〉/*δ*〈*T*〉 = −*δ*(*α*_{c}〈*I*〉)/*δ*〈*T*〉, which can approximately be obtained from the reflection variation *δα*_{c}/*δ*〈*T*〉 and the mean insolation 〈*I*〉 = 387.9 W m^{−2} in the TLC regions; insolation variations owing to temperature-dependent shifts of the TLC regions are not significant for deseasonalized data (Table 2). Thus, *δ*〈*S*_{c}〉/*δ*〈*T*〉 ≈ −〈*I*〉*δα*_{c}/*δ*〈*T*〉 ≈ (3.7 ± 0.8) W m^{−2} K^{−1} (Fig. 2c, right axis), a range consistent with previously obtained estimates of SWCRE variations with temperature (Eitzen et al. 2011; Zhou et al. 2013).
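The conversion from reflection change to SWCRE change can be checked with the numbers quoted above, treating the mean insolation as constant:

$$
\frac{\delta\langle S_c\rangle}{\delta\langle T\rangle}
\approx -\langle I\rangle\,\frac{\delta\alpha_c}{\delta\langle T\rangle}
= -\left(387.9\ \mathrm{W\,m^{-2}}\right)\times\left(-0.0096\ \mathrm{K^{-1}}\right)
\approx 3.7\ \mathrm{W\,m^{-2}\,K^{-1}} .
$$

Scaling the half-width of the confidence interval in the same way gives $387.9 \times 0.0022 \approx 0.85\ \mathrm{W\,m^{-2}\,K^{-1}}$, comparable to the quoted ±0.8 W m^{−2} K^{−1}.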

The TLC reflection and SWCRE variations are associated with proportional variations of the TLC fraction (Table 3). The relative variations of LCF with SST, 〈LCF〉^{−1}*δ*〈LCF〉/*δ*〈*T*〉 [e.g., −(9.0 ± 2.5)% K^{−1} for deseasonalized variations], are similar in magnitude to the relative variations of TLC reflection with temperature, *α*_{c}^{−1}*δα*_{c}/*δ*〈*T*〉 [e.g., −(10.7 ± 1.5)% K^{−1} for deseasonalized variations]. Observed variations of TLC reflection strongly correlate with observed variations of LCF (last column of Table 3), as is the case in climate simulations (Qu et al. 2014). Thus, variations of TLC reflection appear to be primarily driven by variations in TLC fraction.

While the observed covariance of TLC reflection or SWCRE with temperature suggests a positive shortwave cloud feedback, such a covariance may not only arise because of cloud feedbacks. For example, atmospheric water vapor concentrations generally increase with temperature, leading to an enhanced atmospheric shortwave absorption that shrinks SWCRE toward zero (i.e., reduces SWCRE) and thus may appear as a reduced TLC reflection at the top of the atmosphere (Soden et al. 2004, 2008). However, such shortwave masking effects for marine TLCs are generally small (Vial et al. 2013; Zelinka et al. 2013), modifying the shortwave feedback relative to *δ*〈*S*_{c}〉/*δ*〈*T*〉 by less than 0.3 W m^{−2} K^{−1} or modifying the reflection feedback relative to *δα*_{c}/*δ*〈*T*〉 by less than 0.3/387.9 K^{−1} ≈ 0.08% K^{−1}. Both −*δα*_{c}/*δ*〈*T*〉 and *δ*〈*S*_{c}〉/*δ*〈*T*〉 are significantly greater than these small offsets between TLC feedbacks and measured radiative effects at the top of the atmosphere. Thus, the deseasonalized observations indicate a robustly positive shortwave feedback.

Similar conclusions are reached when examining the covariance of TLC reflection with temperature in the intra-annual (less than 1-yr time scale), seasonal (1 yr), and interannual (greater than 1 yr) frequency bands. TLC reflection decreases with temperature across these frequency bands: *δα*_{c}/*δ*〈*T*〉 ranges from −(1.30 ± 0.06)% K^{−1} for seasonal variations to −(0.75 ± 0.33)% K^{−1} for interannual variations (Fig. 3). Within their statistical uncertainties, the values of *δα*_{c}/*δ*〈*T*〉 in the different frequency bands are broadly consistent with each other, in line with other evidence indicating that SST exerts the primary control on TLC cover (Qu et al. 2014). For all frequency bands, the 90% confidence intervals on *δα*_{c}/*δ*〈*T*〉 are bounded well away from zero or the small offset between cloud feedbacks and measured radiative effects at the top of the atmosphere. Thus, observations in all frequency bands indicate a robustly positive shortwave feedback. (See Table 2 for the corresponding covariance of SWCRE with temperature.) We have verified that our inferences remain essentially unchanged when using different surface temperature datasets or different reanalysis datasets, such as the Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011), for the relative humidity data from which the TLC regions are inferred.

### b. Low-cloud reflection in climate simulations

The observed covariance of TLC reflection with temperature can be compared with that produced by the CMIP5 climate models (Table 1). In the TLC regions and across the different frequency bands, HS models simulating the present climate usually exhibit a negative correlation of TLC reflection with underlying SST, like the observations; LS models exhibit correlations of either sign (Fig. 3). However, HS and LS models produce a wide variety of TLC reflection variations. In simulations of the warmer climate reached after quadrupling carbon dioxide concentrations, most HS models project a reduction of TLC reflection, whereas LS models project less change or even an increase (Fig. 3). The models’ ECS correlates significantly (*r* = −0.73) with *δα*_{c}/*δ*〈*T*〉 under global warming (Fig. 4). That is, 53% of the ECS variance among the models is accounted for by differences in how the models’ TLC reflections respond to warming. This finding demonstrates the dominance of TLC uncertainties for the ECS spread even more clearly than previous studies (Bony and Dufresne 2005; Zelinka et al. 2012; Vial et al. 2013; Webb et al. 2013).

Beyond that, and what is essential for us to be able to constrain ECS in the end, *δα*_{c}/*δ*〈*T*〉 under global warming in the models correlates strongly with *δα*_{c}/*δ*〈*T*〉 inferred from temporal variations within the present climate (Fig. 5). This confirms and complements the strong correlation between present-day temporal variations of TLC cover and long-term cloud feedbacks seen in previous studies (Qu et al. 2014; Zhou et al. 2015). It is also evident that primarily models with a relatively strong TLC reflectance reduction under global warming (i.e., strongly positive shortwave TLC feedback) are consistent with *δα*_{c}/*δ*〈*T*〉 inferred from observed temporal variations within the present climate (Fig. 5). Weighting each model according to the information-theoretic measure of how well its historical simulation reproduces the observed deseasonalized *δα*_{c}/*δ*〈*T*〉 (section 2g and Table 1) gives a most likely global-warming *δα*_{c}/*δ*〈*T*〉 of −0.21% K^{−1}, with a 90% confidence interval of [−0.80%, 0.15%] K^{−1}. That is, it also indicates a positive shortwave TLC feedback under global warming, narrowing the prior likely range of TLC feedbacks indicated by the model ensemble.

As implied by the strong correlations, on the one hand, between ECS and *δα*_{c}/*δ*〈*T*〉 under global warming (Fig. 4) and, on the other hand, between *δα*_{c}/*δ*〈*T*〉 under global warming and in present-day temporal variations (Fig. 5), *δα*_{c}/*δ*〈*T*〉 inferred from variations in the present climate also correlates strongly with ECS. For deseasonalized variations, the correlation coefficient is *r* = −0.67 (Fig. 6a). It follows that almost half (45%) of the ECS variance among models can already be accounted for by the spread of *δα*_{c}/*δ*〈*T*〉 inferred from deseasonalized variations in historical simulations. That is, almost half of the ECS variance across models can already be inferred from simulations that do not involve any perturbation of the atmospheric greenhouse gas concentrations. The correlations for the other frequency bands are shown in Figs. S3–S5 in the supplementary material and are, for example, weaker for seasonal variations. [The seasonal correlation is weaker than that found by Zhai et al. (2015), who use different methods and a smaller subset of CMIP5 climate models, excluding some that in our analysis lead to weaker correlations.]

### c. Multimodel inference of climate sensitivity

Given the correlations between ECS and *δα*_{c}/*δ*〈*T*〉 inferred from temporal variations in the present climate, we can use the regression coefficients *δα*_{c}/*δ*〈*T*〉 obtained from the observations of the present climate to constrain model projections of ECS. We estimate a posterior ECS from an average of climate models in which each model is weighted according to the information-theoretic measure of how well its historical simulation reproduces the observed deseasonalized *δα*_{c}/*δ*〈*T*〉 (section 2g). This model averaging generally assigns greater weight to HS models because they are more consistent with the observations (Fig. 6a; see Table 1 for the weights). It yields a posterior PDF of ECS with the median and mode at 4.0 K, and with a 90% confidence interval of [2.3, 5.0] K (Fig. 6b). That is, ECS most likely lies in the upper half of current estimates; values below 2.3 K are very unlikely.

We specifically highlight deseasonalized variations because intermodel variations of *δα*_{c}/*δ*〈*T*〉 in this frequency band strongly correlate with ECS (Fig. 6a). However, similar analyses in the other frequency bands give consistent results (Figs. S3–S5). The posterior ECSs obtained from observations in the various frequency bands are listed in Table 4, and the estimated Kullback–Leibler divergences Δ_{i} for each model and each frequency band are listed in Table S1 of the supplementary material. The results also show, for example, that the models’ simulation of seasonal variability on average is much poorer than that of interannual variability so that the deseasonalized variability that we used above provides more robust constraints on ECS than seasonal variability.

Of course, all of these posterior ECS estimates are conditional on the range of ECS simulated by the CMIP5 models. They merely indicate which ECSs in the model range are more plausible than others, given the observations. They do not rule out the possibility that ECS lies entirely outside the range indicated by current climate models, that is, that all models are wrong. We found our information-theoretic weighting of climate models to give more robust posterior ECS estimates than methods that are based, for example, on estimating regression lines between ECS and present-day *δα*_{c}/*δ*〈*T*〉 from climate simulations and that then use the estimated regression lines for inferences about the posterior ECS. Such methods can underestimate the weight of LS models that are consistent with the data we considered; they thus can lead to unrealistically narrow posterior ECS estimates. By contrast, our information-theoretic weighting yields more realistic posterior estimates for the mode and confidence bounds of the ECS.

## 4. Influence of inversion strength

Other environmental factors besides the surface temperature may influence TLCs: for example, vertical motion (Myers and Norris 2013), surface fluxes (Bretherton and Wyant 1997), or the strength of the trade inversion (Klein and Hartmann 1993; Wood and Bretherton 2006; Seethala et al. 2015). Changes in inversion strength, in particular, may lead to different TLC responses to global warming and to local warming within a given climate. Because the free-tropospheric thermal stratification in the tropics is approximately moist adiabatic and is controlled nonlocally by deep convection in convergence zones (e.g., Sobel et al. 2001), temporary local warming within TLC regions generally reduces the local inversion strength and thus the lower-tropospheric static stability. By contrast, global warming is expected to increase the lower-tropospheric static stability because the stratification throughout the tropical free troposphere is expected to remain nearly moist adiabatic, and the moist-adiabatic potential temperature gradient increases as the climate warms (e.g., Miller 1997; Stevens and Brenguier 2009). This may imply a weak increase of inversion strength (Qu et al. 2015a; Wood and Bretherton 2006). It is thus conceivable that such different responses of the lower-tropospheric static stability or inversion strength lead to different TLC responses to global warming and to temporary local warming in the present climate (Qu et al. 2014). However, because the increase in inversion strength under global warming is small relative to the surface temperature increase (Qu et al. 2015b), the surface temperature increase still appears to exert the primary control on the TLC response to warming in climate models. Here we examine to what extent explicitly accounting for inversion strength modifies our results.

So far, we have used the univariate regression of TLC reflection onto surface temperature:

$$\delta\alpha_{c} = \beta_{1}\,\delta\langle T\rangle + \varepsilon, \tag{3}$$

where *δ* denotes fluctuations, *β*_{1} = *δα*_{c}/*δ*〈*T*〉 is the regression coefficient we have considered in the preceding analysis, and *ε* is the residual. Following Qu et al. (2015b), we now modify the regression model and include the inversion strength as an additional predictor:

$$\delta\alpha_{c} = \tilde{\beta}_{1}\,\delta\langle T\rangle + \tilde{\beta}_{2}\,\delta\mathrm{EIS} + \tilde{\varepsilon}. \tag{4}$$

Here, the estimated inversion strength (EIS) is as defined by Wood and Bretherton (2006) and is calculated as described in the appendix. EIS is a measure of the strength of the trade inversion, which takes into account that the stratification above the trade inversion is nearly moist adiabatic. The tildes indicate that the coefficients and the residuals in the bivariate regression model (4) do not necessarily coincide with the corresponding quantities in the univariate model (3).
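The difference between the univariate and bivariate regression models can be illustrated with a short sketch on synthetic anomalies. Two simplifications are assumptions of the sketch: it uses ordinary least squares (`numpy.linalg.lstsq`) in place of the robust regressions used for the actual analysis, and the "true" coefficients (−0.8 for temperature, 0.5 for EIS) and the anticorrelation between temperature and EIS anomalies are invented for illustration. The helper `ols` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 180  # number of monthly anomalies (15 years), synthetic

# Synthetic anomalies: local warming tends to reduce EIS (assumed slope -0.6).
dT = rng.normal(0.0, 0.5, n)
dEIS = -0.6 * dT + rng.normal(0.0, 0.4, n)
d_alpha = -0.8 * dT + 0.5 * dEIS + rng.normal(0.0, 0.2, n)

# Fluctuations are deviations from the mean, so no intercept is fitted.
dT, dEIS, d_alpha = (x - x.mean() for x in (dT, dEIS, d_alpha))

def ols(y, *predictors):
    """Least-squares coefficients and residuals for y = X @ coef + residual."""
    X = np.column_stack(predictors)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

(beta1,), eps = ols(d_alpha, dT)                      # univariate model
(beta1_t, beta2_t), eps_t = ols(d_alpha, dT, dEIS)    # bivariate model

print(f"univariate beta1      = {beta1:+.2f}")
print(f"bivariate  beta1, beta2 = {beta1_t:+.2f}, {beta2_t:+.2f}")
print(f"residual variance: {eps.var():.3f} (univariate), {eps_t.var():.3f} (bivariate)")
```

Because the synthetic EIS anomalies covary with temperature, the univariate coefficient absorbs part of the EIS effect, and the bivariate temperature coefficient is smaller in magnitude. This is only a qualitative illustration of why the two sets of coefficients need not coincide.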

Table 5 summarizes the temperature coefficient $\tilde{\beta}_{1}$ and the inversion strength coefficient $\tilde{\beta}_{2}$ inferred for the different frequency bands from observations. The coefficients are again obtained by robust regressions, now based on the bivariate model (4). The bivariate temperature coefficient $\tilde{\beta}_{1}$ can be compared with its univariate counterpart *β*_{1} listed in Table 2. Generally, the effect of surface temperature on TLC reflection is slightly reduced when inversion strength is explicitly accounted for, as is evidenced by the slightly shrunken coefficients $\tilde{\beta}_{1}$ relative to *β*_{1} (except in the interannual frequency band, where there is no statistically significant change of the coefficient). But it does remain true that TLC reflection weakens robustly as the underlying surface warms, even when accounting separately for variations in inversion strength. This still suggests a positive shortwave feedback from TLCs. In all frequency bands, TLC reflection strengthens when the inversion strength increases. However, the effect of inversion strength is not clearly distinguishable from zero for interannual variability, and it generally is weaker than the surface temperature effect. Adding the inversion strength as a predictor increases the explanatory power of the regression model slightly but significantly, as determined by comparing the bootstrap distribution of residuals. However, some of these results may be sensitive to the exact time period that is being analyzed (Qu et al. 2015b).

We can calculate the same bivariate regressions for the climate models and use a multimodel inference procedure analogous to the one used before to constrain ECS, now explicitly accounting for variations in inversion strength. Even after accounting for variations in inversion strength through the bivariate model, it remains true that ECS in climate models strongly correlates with the temperature coefficient $\tilde{\beta}_{1}$ obtained from deseasonalized variations in historical simulations (Fig. 7a). The correlation coefficient (*r* = −0.67) between ECS and the bivariate coefficient $\tilde{\beta}_{1}$ is the same as that between ECS and the univariate coefficient *β*_{1} (Fig. 6a). By contrast, ECS in climate models is not significantly correlated (*r* = 0.25) with the inversion strength coefficient $\tilde{\beta}_{2}$ obtained from deseasonalized variations in historical simulations (Fig. 7b).

Weighting climate models as before, but now according to how well they reproduce the observed temperature coefficient $\tilde{\beta}_{1}$ for deseasonalized variability in historical simulations, yields a posterior ECS estimate with mode at 3.9 K and with a 90% confidence interval of [2.4, 5.0] K (see Table 4, second column, for a summary of the results for all frequency bands). That is, the results of the preceding univariate analysis are essentially unchanged when variations of the inversion strength are explicitly taken into account. Given the low correlation between ECS and the inversion strength coefficient $\tilde{\beta}_{2}$ in climate models, weighting climate models according to how well they reproduce the observed inversion strength coefficient does not substantially change the range of prior ECS (Table 4, third column). Even if both $\tilde{\beta}_{1}$ and $\tilde{\beta}_{2}$ are used to weight climate models,^{1} the posterior ECS estimates remain close to those obtained when only *β*_{1} or only $\tilde{\beta}_{1}$ is used (Table 4, fourth column). Therefore, this analysis confirms the predominant role of surface temperature in controlling the TLC response to warming. It still indicates an ECS that most likely lies in the upper half of current model estimates.
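The joint weighting by both coefficients, with weights proportional to the product of two separately calculated likelihoods (footnote 1), can be sketched as follows. The likelihood values are synthetic and hypothetical: the temperature likelihood is made to vary strongly across models, whereas the inversion strength likelihood is made nearly flat and unrelated to ECS, mimicking the weak correlation noted above.

```python
import numpy as np

def normalize(x):
    """Scale nonnegative weights so that they sum to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

# Synthetic, illustrative values only (not taken from Table 4 or Table S1).
ecs = np.array([2.1, 2.6, 2.9, 3.4, 4.0, 4.1, 4.7])  # K
lik_temperature = np.exp(-np.array([4.0, 3.0, 2.5, 1.0, 0.2, 0.3, 0.8]))
lik_inversion = np.exp(-np.array([0.5, 0.4, 0.6, 0.5, 0.4, 0.6, 0.5]))

w_temperature = normalize(lik_temperature)
# Treating the two likelihoods as independent, the joint weight is their product.
w_joint = normalize(lik_temperature * lik_inversion)

for name, w in [("temperature only", w_temperature), ("joint", w_joint)]:
    print(f"{name}: weighted-mean ECS = {np.sum(w * ecs):.2f} K")
```

When the second likelihood is nearly uniform across models, multiplying by it changes the normalized weights only slightly, so the weighted ECS stays close to the temperature-only result.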

## 5. Conclusions

How reflection of shortwave radiation by TLCs changes as the climate warms is the most uncertain feedback in current climate models. Uncertainties about it account for more than half of the ECS variance among models. Space-based observations over the past 15 years show that, on time scales from seasonal to interannual, shortwave reflection by low clouds over tropical oceans robustly decreases when the underlying surface warms. This suggests a positive shortwave feedback from TLCs. In climate models, how strongly the TLC reflection covaries with the underlying surface temperature in historical simulations of the present climate correlates with the strength of the shortwave TLC feedback and with ECS. Thus, the covariance of TLC reflection with temperature in the present climate can be used as an emergent constraint on ECS.

Climate models that are consistent with the observed covariance of TLC reflection with temperature generally have a higher ECS than models that are not consistent with the observations. An information-theoretic weighting of climate models according to how well they reproduce observations gives a posterior ECS estimate with a most likely value of 4 K, and with a probability of less than 5% for values below 2.3 K. This posterior ECS is essentially unaltered if the influence of the strength of the trade inversion on TLCs is explicitly accounted for.

The consistent covariance of TLC reflection with surface temperature on time scales from seasonal to interannual in the present climate and under global warming in climate simulations indicates that temperature is a key factor controlling TLC cover and that similar processes likely govern the TLC response to warming across the time scales. This implies that a process-oriented analysis of low-cloud variations in the present climate likely can improve the representation of the low-cloud response to climate changes in models and can elucidate the mechanistic connection between low-cloud reflection and temperature.

## Acknowledgments

We thank Jennifer Kay for making available the combined *CloudSat* and *CALIPSO* data used in Fig. 1, Davide Panosetti for drafting Fig. 1, and Tobias Bischoff and Zhihong Tan for helpful discussions. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups (listed in Table 1) for producing and making available their model output. This research was supported by the Swiss National Science Foundation (Grant 200021-156109).

### APPENDIX

#### Calculation of Estimated Inversion Strength

The strength of the trade inversion is estimated by computing EIS following Wood and Bretherton (2006):

$$\mathrm{EIS} = \mathrm{LTS} - \Gamma_{m}^{850}\,(z_{700} - \mathrm{LCL}).$$

Here, the lower-tropospheric stability (LTS) is defined as the potential temperature difference between 700 hPa and the surface (Klein and Hartmann 1993), and $\Gamma_{m}^{850}$ is the moist-adiabatic potential temperature gradient at 850 hPa, calculated from the mean temperature between the surface and 700 hPa as in Wood and Bretherton (2006). We take the height *z*_{700} of the 700-hPa level to be fixed, *z*_{700} = 3.1 km, and the lifting condensation level (LCL) is calculated by lifting a near-surface parcel along a dry adiabat until saturation. For CMIP5 simulations, we use the 2-m air temperature *T*_{a} and relative humidity RH_{a} (fields tas and hurs in CMIP5 data) to calculate LCL, and we use the relative humidity in the lowest model layer when hurs is not available. For observations, we use *T*_{a} and the dewpoint temperature *T*_{d} provided by ERA-Interim to calculate LCL. We calculate the monthly LCL in the TLC regions, although the LCL (or the surface relative humidity) is assumed constant in other studies (e.g., Qu et al. 2014); however, the results are only weakly sensitive to the LCL variations (Qu et al. 2015a).
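A minimal sketch of the EIS calculation, EIS = LTS − Γ(*z*_{700} − LCL), is given below. It is not the code used for this study. Several ingredients are assumptions of the sketch: the Bolton (1980) approximation for saturation vapor pressure, rounded thermodynamic constants, a default surface pressure of 1013 hPa, and the common rule of thumb of about 125 m of LCL height per kelvin of dewpoint depression in place of an explicit parcel lifting. The moist-adiabatic gradient follows the expression given by Wood and Bretherton (2006). All function names and input values are hypothetical.

```python
import numpy as np

G, CP, RD, RV, LV = 9.81, 1004.0, 287.0, 461.0, 2.5e6  # SI units, rounded
Z700 = 3100.0                                          # fixed 700-hPa height (m)

def sat_vapor_pressure_hpa(T):
    """Saturation vapor pressure (hPa), Bolton (1980) approximation (assumed)."""
    tc = T - 273.15
    return 6.112 * np.exp(17.67 * tc / (tc + 243.5))

def potential_temperature(T, p_hpa):
    """Potential temperature (K) referenced to 1000 hPa."""
    return T * (1000.0 / p_hpa) ** (RD / CP)

def moist_adiabatic_gradient(T, p_hpa):
    """Moist-adiabatic potential temperature gradient (K per m)."""
    es = sat_vapor_pressure_hpa(T)
    qs = 0.622 * es / (p_hpa - es)  # saturation mixing ratio
    return (G / CP) * (1.0 - (1.0 + LV * qs / (RD * T))
                       / (1.0 + LV**2 * qs / (CP * RV * T**2)))

def lcl_height(T_a, T_d):
    """Approximate LCL height (m): about 125 m per K of dewpoint depression."""
    return 125.0 * (T_a - T_d)

def lower_tropospheric_stability(T_a, T_700, p_sfc_hpa=1013.0):
    """LTS (K): potential temperature at 700 hPa minus that at the surface."""
    return potential_temperature(T_700, 700.0) - potential_temperature(T_a, p_sfc_hpa)

def estimated_inversion_strength(T_a, T_d, T_700, p_sfc_hpa=1013.0):
    """EIS (K), with the gradient evaluated at 850 hPa and the mean of T_a and T_700."""
    gamma_850 = moist_adiabatic_gradient(0.5 * (T_a + T_700), 850.0)
    lts = lower_tropospheric_stability(T_a, T_700, p_sfc_hpa)
    return lts - gamma_850 * (Z700 - lcl_height(T_a, T_d))

# Hypothetical monthly-mean values for a subtropical low-cloud region (K).
T_a, T_d, T_700 = 290.0, 286.0, 280.0
print(f"LTS = {lower_tropospheric_stability(T_a, T_700):.1f} K, "
      f"EIS = {estimated_inversion_strength(T_a, T_d, T_700):.1f} K")
```

All functions accept NumPy arrays, so the same calls can be applied to gridded monthly fields such as tas, after converting relative humidity to dewpoint where needed.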

## REFERENCES

*Clouds in the Perturbed Climate System*, J. Heintzenberg and R. J. Charlson, Eds., MIT Press, 173–196.

*Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 33–115.

## Footnotes

Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JCLI-D-15-0897.s1.

^{1} We do so by calculating a separate model likelihood for each of the two coefficients and using model weights proportional to the product of the two likelihoods. As a simple first approximation, this treats the two likelihoods as independent probabilities.