## Abstract

A large number of perturbed-physics simulations of version 3 of the Hadley Centre Atmosphere Model (HadAM3) were compared with the Clouds and the Earth's Radiant Energy System (CERES) estimates of outgoing longwave radiation (OLR) and reflected shortwave radiation (RSR) as well as OLR and RSR from the earlier Earth Radiation Budget Experiment (ERBE) estimates. The model configurations were produced from several independent optimization experiments in which four parameters were adjusted. Model–observation uncertainty was estimated by combining uncertainty arising from satellite measurements, observational radiation imbalance, total solar irradiance, radiative forcing, natural aerosol, internal climate variability, and sea surface temperature and that arising from parameters that were not varied. Using an emulator built from 14 001 “slab” model evaluations carried out using the climateprediction.net ensemble, the climate sensitivity for each configuration was estimated. Combining different prior probabilities for model configurations with the likelihood for each configuration and taking account of uncertainty in the emulated climate sensitivity gives, for the HadAM3 model, a 2.5%–97.5% range for climate sensitivity of 2.7–4.2 K if the CERES observations are correct. If the ERBE observations are correct, then they suggest a larger range, for HadAM3, of 2.8–5.6 K. Amplifying the CERES observational covariance estimate by a factor of 20 brings CERES and ERBE estimates into agreement. In this case the climate sensitivity range is 2.7–5.4 K. The results rule out, at the 2.5% level for HadAM3 and several different prior assumptions, climate sensitivities greater than 5.6 K.

## 1. Introduction

Considerable uncertainty exists about how sensitive the climate is to changes in CO_{2}. This is often summarized as “equilibrium climate sensitivity” (*S*): the equilibrium global-average temperature change in response to a doubling of CO_{2}. The fourth assessment report from the Intergovernmental Panel on Climate Change (IPCC) reported that *S* was *likely* (more than 66% chance) to be in the range 2.0–4.5 K and values greater than 4.5 K could not be ruled out (Meehl et al. 2007). This uncertainty largely arises from uncertainty in modeling atmospheric processes such as cloud formation and convection as well as changes in snow and ice that act to modify the “greenhouse” effect and albedo of the planet. Uncertainties in climate sensitivity combined with uncertainty in the rate at which the oceans take up heat lead to uncertainty in the response of the climate system to changes in greenhouse gases.

Knutti and Hegerl (2008) review various approaches to providing probabilistic estimates of equilibrium climate sensitivity. In broad terms these fall into two classes. In one approach, observed change or variability over periods ranging from the last few decades to the last millennium is compared to models of varying complexity (e.g., Hegerl et al. 2006; Kettleborough et al. 2007; Olson et al. 2012). A second approach compares observed and simulated climatologies (e.g., Sexton and Murphy 2012; Stainforth et al. 2005; Murphy et al. 2004; Sanderson 2011). Such approaches are often multivariate and assess observational error by using multiple observational datasets. Huybers (2010) explored climate sensitivities in the archive from phase 3 of the Coupled Model Intercomparison Project (CMIP3) and found some evidence that models had been tuned, as there was compensation between feedbacks arising from different processes. Lemoine (2010) carried out a similar analysis but considered common biases between models and found considerable sensitivity to assumptions about these biases. Huber et al. (2011), analyzing results from the CMIP3 archive and comparing with radiation measurements, found climate sensitivities ranging from 2.9 to 4.5 K, although they did not attempt a probabilistic estimate.

In perturbed-physics ensembles (Murphy et al. 2004) key parameters in a climate model are varied within their uncertainty ranges, leading to the possibility of climate sensitivities much larger than 6 K (Stainforth et al. 2005). Recently, Rowlands et al. (2012) reported observationally constrained rates of warming to 2050, obtained by comparing a very large ensemble of perturbed-physics simulations from a lower-resolution version of the Hadley Centre Coupled Model, version 3 (HadCM3L), with observations for the period 1960–2010. They concluded that global mean warming in the 2050s, relative to 1961–90, is likely in the range 1.4–3 K.

Jackson et al. (2008), using an improved Markov chain Monte Carlo (MCMC) algorithm, generated a range of perturbed parameter versions of the Community Atmosphere Model, version 3.1 (CAM3.1). They used a range of observations, largely based on reanalyses, to constrain the plausible parameter choices and found that model configurations with the smallest systematic errors had a climate sensitivity within 0.5 K of 3 K. Järvinen et al. (2010) also applied a variant of the Markov chain Monte Carlo algorithm to estimate parameters although they did not draw inferences on climate sensitivity.

In Part I of this paper (Tett et al. 2013, hereafter Part I) we reported on successful attempts to automatically tune the Hadley Centre Atmosphere Model, version 3 (HadAM3; Pope et al. 2000), at its N48 (3.75° × 2.5°) resolution to the Loeb et al. (2009) 2000–05 observations of top-of-atmosphere (TOA) radiation values. In this paper we use results from those simulations to draw observationally constrained inferences about climate sensitivity. We hypothesize that, in an atmospheric model with fixed sea surface temperatures (SSTs), there is a relationship between TOA radiation and climate feedbacks: the processes that cause climate feedbacks also modify the outgoing radiation budget in a fixed-SST simulation. For example, water vapor is transported into the upper troposphere via convection, where it reduces outgoing longwave radiation (OLR) in the fixed-SST experiment; changes in water vapor through convective transport in response to atmospheric temperature changes are also one possible climate feedback. Tropospheric water vapor also produces clouds that affect the radiation balance of a model, and changes in cloud in response to climate change are a significant climate feedback.

The aims of this paper are as follows:

- to explore whether there is any relationship between simulated outgoing radiation and climate sensitivity for HadAM3;
- to explore if this relationship is one-to-one or one-to-many;
- to produce an uncertainty estimate for model–observational discrepancy; and
- to use this estimate to produce an uncertainty estimate for equilibrium climate sensitivity based on the use of the HadAM3 model.

The rest of the paper is structured as follows. In the methods section we first describe our modeling strategy and how we estimate climate sensitivity using an emulator based on data from the climateprediction.net ensemble (Sanderson et al. 2008); we then present our analysis of uncertainty in model–observational discrepancy before describing how we compute cumulative distribution functions for climate sensitivity. Having described our methods we present our results before concluding with an extended discussion.

## 2. Methods

In this section we first briefly describe our modeling strategy and how we estimated uncertainty in model–observational discrepancy. We then describe how we estimated climate sensitivity for any parameter combination using an “emulator” (e.g., Rougier 2007) and how we used it to compute cumulative distribution functions for climate sensitivity.

### a. Modeling

The default configuration of HadAM3 has been evaluated (Pope et al. 2000) and has an estimated climate sensitivity of 3.3 K to doubling CO_{2} (Williams et al. 2001; Randall et al. 2007). This estimate disagrees with the estimate of 3.7 K of Gregory and Webb (2008) because their calculation was done by halving the response to 4×CO_{2} while Williams et al. (2001) doubled CO_{2}. Our climate sensitivity estimates (see later) are based on the response to doubling CO_{2}. For our atmosphere-only simulations we modified the default model to include a package of natural and anthropogenic forcings (Tett et al. 2007), included a recent estimate of total solar irradiance (TSI; Kopp and Lean 2011), and corrected a bug in the Rayleigh scattering shortwave coefficients.

We “tuned” the N48 (3.75° × 2.5°) version of HadAM3 by modifying four parameters (entcoef, vf1, ct, and RHcrit, described in Part I) that previous work (Knight et al. 2007) had shown were important for climate sensitivity. The parameters were modified using an optimization algorithm that aimed to produce models with specified global-mean outgoing longwave radiation and reflected solar radiation (RSR).

We carried out several optimization experiments. For each experiment we started from 16 different extreme combinations of the four parameters and optimized each starting parameter choice to the same target values of OLR and RSR. We optimized to six targets. In all, we carried out about 2500 simulations of HadAM3, which have a broad range of OLR and RSR values, although with the highest density around the observed values. The parameter values for those configurations close to the observed targets covered a broad range.

In Part I we showed that there was a compensation in clear-sky OLR between upper tropospheric temperature and water. Consequently most of the changes seen in the ensemble were driven by changes in cloud. This is consistent with literature on reasons for uncertainty in climate sensitivity (e.g., Webb et al. 2006; Randall et al. 2007). More details can be found in Part I of this paper.

### b. Observational–model discrepancy

In this paper we concentrate on comparison with the recent Clouds and the Earth's Radiant Energy System (CERES) observations of OLR and RSR (Loeb et al. 2009) but also consider the older Earth Radiation Budget Experiment (ERBE) values (Fasullo and Trenberth 2008). Any comparison between observations and models requires a quantitative estimate of observational–model discrepancy. We make this estimate by considering several sources of uncertainty and combining them to produce a total uncertainty. Our focus is on observational uncertainties and modeling uncertainties that affect outgoing radiation. To convert some sources of uncertainty into uncertainties in OLR and RSR we make use of some HadAM3 simulations. Unless stated otherwise these use the standard configuration of HadAM3 and do not consider structural uncertainty. As we are considering large-scale processes we assume that all uncertainties are Gaussian and estimate their values for the 2001–05 period. We consider the following sources of uncertainty, quantifying them as plus or minus one standard deviation:

- Satellite measurement—From Table 2 of Loeb et al. (2009) we sum the individual components of the bias uncertainty to give a total observational uncertainty of 1.4 and 1.0 W m^{−2} for OLR and RSR, respectively. We assume these are independent of one another.

- Observational radiative imbalance—The TOA radiation dataset we use was adjusted to have the same radiation balance (Loeb et al. 2009) as ocean observations (Willis et al. 2004) for the upper 750 m (0.86 ± 0.12 W m^{−2}). Lyman et al. (2010) estimated the upper ocean has warmed by 0.55–0.73 W m^{−2}. We use 0.75 ± 0.25 W m^{−2} to include both estimates.

- Uncertainty in incoming radiation—There is a small uncertainty in the incoming TSI from which we derive the balance requirement. Solar minimum TSI (Kopp and Lean 2011) has been estimated at 1360.8 ± 0.5 W m^{−2}, significantly different from older estimates (Willson and Hudson 1991). For the period 2001–05 this gives an incoming TOA radiation of 1362.2 ± 0.5 W m^{−2}, arising from the elliptical nature of Earth's orbit and the slightly higher TSI when the sun is active.

- Internal climate variability—From an ensemble of 19 standard configurations of HadAM3 we estimated the covariance of OLR and RSR. The standard deviations are about 0.1 W m^{−2} for both. This is a negligible source of uncertainty and reflects our use of atmospheric models rather than coupled atmosphere–ocean models, where variability in outgoing radiation is much larger (e.g., Tett et al. 2007).

- Forcing uncertainty—Our simulations are all driven with a package of radiative forcings that are uncertain (Forster et al. 2007). Forcing is the change in downward radiative flux at the tropopause after stratospheric adjustment (Tett et al. 2002), when the stratosphere is in equilibrium. As our simulations are driven with observed SST, feedbacks between forcing and atmospheric state will be reduced, and so we would expect a change in forcing to produce a similar change in OLR and RSR as the longwave (LW) and shortwave (SW) forcing, respectively.

  To test if forcing and outgoing radiation changes were similar we carried out a set of three simulations using the standard configuration of HadAM3: NOSUL, in which we removed the direct and indirect effects of sulfate aerosols; Natural, in which all anthropogenic forcings were removed; and NATGHG, in which all anthropogenic forcings except well-mixed greenhouse gases (GHGs) were removed. We then compared these values for 2001–05 with the forcing calculations of Tett et al. (2007) for 2000. We expect the forcing over 2000–05 to be similar to that in 2000 and errors in the forcing estimate to be small. The changes in OLR and RSR for Natural are broadly consistent with the forcing estimates (Table 1), with a slightly larger OLR change than the forcing would suggest. However, the changes in the NOSUL and NATGHG experiments suggest that this may arise from some compensation between the effects of aerosols and GHGs on OLR and RSR. Some of this may be due to internal variability in the model simulations.

  The dominant SW forcing arises from aerosols and so, being cautious, we estimate, by comparison between NOSUL and the reference simulations, that removing sulfate aerosol reduces RSR by 1.2 W m^{−2} and increases OLR by 0.2 W m^{−2}. We compute the covariance matrix for a 1 W m^{−2} uncertainty in SW forcing from the outer product of this response vector with itself (the matrix multiplication of a vector by its transpose, so *C*_{ij} = *υ*_{i}*υ*_{j}), scaled by 1/0.9 (the SW forcing). We assume that LW forcing largely arises from changes in well-mixed greenhouse gases and compute their impact on RSR and OLR as 0.7 and 2 W m^{−2}, respectively, from the difference between the NATGHG and Natural simulations. We then scale these values by 1/2.11 (the LW forcing) and compute the covariance from the outer product to obtain a covariance for a change in LW forcing of 1 W m^{−2}. To obtain radiative forcing uncertainties we use existing uncertainty estimates (Table TS.5 of Solomon et al. 2007), assume the uncertainties are independent, and round to one significant figure to give 1*σ* uncertainties of 1 and 0.20 W m^{−2} for RSR and OLR. These are then used to scale the covariance matrices computed above. The dominant contributions to RSR come from uncertainty in the direct effect and cloud albedo effects of aerosol, while for OLR the dominant uncertainties arise from CO_{2} and ozone forcing. Our estimates of forcing uncertainty, to some extent, include structural uncertainty as we use the range of forcing values from Forster et al. (2007), although modified through use of HadAM3 to obtain TOA RSR and OLR.

- Natural aerosols—There are many natural aerosols in the climate system that largely affect RSR, with a minimal impact on OLR (Carslaw et al. 2010). For the current climate, natural aerosol feedbacks on the radiation budget are small, so we use uncertainties in natural aerosols directly. Key components are organic aerosol, aerosol from the impact of fire, and dust, whose effects on RSR have been estimated as −0.03 to −1.1, −0.05 to 0.2, and −0.7 to 0.5 W m^{−2}, respectively (Carslaw et al. 2010). These would combine, assuming independence and that the estimates are 5%–95% Gaussian ranges, to a total 1*σ* uncertainty of about 0.6 W m^{−2}. We also used three simulations from Penner et al. (2006) and, after correcting all contemporary simulations to the same RSR, found the range in preindustrial RSR was 1 W m^{−2}. We used this as a 1*σ* estimate for natural aerosols as it was greater than the Carslaw et al. (2010) estimate. As with forcing uncertainty, this uncertainty range incorporates structural uncertainty in the impact of aerosols on outgoing radiation.

- SST uncertainty—On the 5-yr time scales we are considering, the major source of SST uncertainty is the climatology rather than uncertainty in the individual years. Two Hadley Centre SST datasets have climatological differences of less than 0.2 K (Rayner et al. 2006) over most of the World Ocean. Therefore, we assume the 1*σ* uncertainty in SST is 0.2 K. We estimated its impact on RSR and OLR by increasing the SST values everywhere by 0.5 K and forcing default HadAM3 with them. We found changes in RSR and OLR of −0.4 and 1.2 W m^{−2} and scaled this response by 0.4 (= 0.2/0.5) to give the covariance for an SST uncertainty of 0.2 K. We also carried out a simulation using the SST dataset of Reynolds et al. (2002) and found this had a small impact on RSR and OLR (about 0.1 W m^{−2}).

- Other parameters—Our results are based on modifying the four HadAM3 parameters that most affect *S*. Other parameters have less effect on *S* but could affect the outgoing radiation. For example, one of our configurations could be inconsistent with the observations, but if we modify other parameters it may become consistent. We treated this as another source of uncertainty. To estimate its covariance we found the 13 distinct parameter combinations from the 14 001 climateprediction.net cases that had default values for entcoef, vf1, ct, and RHcrit and a climate sensitivity between 3.2 and 3.4 K. We then ran them to compute the RSR and OLR (Fig. 1) and computed a covariance matrix from the 13 cases. The parameters (Knight et al. 2007) that varied were precipitation threshold (cw), ice particle size (ice_size), ice albedo variation (dtice), ice albedo at the melting point (alpham), and nonspherical ice (ice). The largest changes in RSR and OLR arose from using nonspherical ice. However, these parameter changes did not greatly change the total outgoing radiation.
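The outer-product covariance construction and the summing of independent sources might be sketched as follows. This is a minimal illustration using the response vectors and 1*σ* values quoted above; scaling the response vector (rather than the outer product) by the mean forcing is our reading of the normalization.

```python
import numpy as np

# Sketch (not the paper's code): covariances for SW and LW forcing uncertainty
# built from outer products of (RSR, OLR) response vectors quoted in the text.
sw_response = np.array([-1.2, 0.2]) / 0.9    # response per W m^-2 of SW forcing
lw_response = np.array([0.7, 2.0]) / 2.11    # response per W m^-2 of LW forcing

# C_ij = v_i * v_j, then scaled by the squared 1-sigma forcing uncertainties
cov_sw = np.outer(sw_response, sw_response) * 1.0**2   # SW 1-sigma: 1 W m^-2
cov_lw = np.outer(lw_response, lw_response) * 0.20**2  # LW 1-sigma: 0.20 W m^-2

# Independent uncertainty sources combine by summing covariance matrices
cov_forcing = cov_sw + cov_lw
```

By construction each term is a rank-1, positive-semidefinite 2 × 2 matrix, so the sum over independent sources remains a valid covariance.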

Loeb et al. (2009) computed estimates for average RSR and OLR by adjusting the measured RSR and OLR values within their estimated bias uncertainties until they were consistent with the ocean heat content estimates of net imbalance. As we are using slightly different estimates of ocean heat content uncertainty and require a covariance estimate, we computed observational uncertainty by combining distributions for the individual outgoing SW and LW radiation bias uncertainties with a distribution for the total radiation. Total outgoing radiation is taken to be the total incoming minus the expected imbalance [(1362.2/4 − 0.75) ± 0.3 W m^{−2}]. We extended this to include uncertainty in the orthogonal and independent component—the difference between RSR and OLR. In the absence of other information we assume the distribution for the difference is a normal distribution with mean −40% of the incoming radiation (corresponding to an albedo of 0.3) and standard deviation 10% of the incoming radiation. These are large enough that other uncertainties dominate. We combine this covariance with the satellite bias covariance by first computing the precision matrix (the inverse of the covariance matrix), linearly transforming the precision matrix to give the individual RSR and OLR components, and combining with the precision matrix for satellite bias uncertainty through the formula

$$\boldsymbol{\Lambda}_c = \boldsymbol{\Lambda}_b + \boldsymbol{\Lambda}_L, \qquad \boldsymbol{\mu}_c = \boldsymbol{\Lambda}_c^{-1}\left(\boldsymbol{\Lambda}_b\boldsymbol{\mu}_b + \boldsymbol{\Lambda}_L\boldsymbol{\mu}_L\right),$$

where **Λ** is the precision matrix and *μ* is the mean value. Subscripts *c*, *b*, and *L* denote the combined values, the near-radiative-balance covariance (defined above), and the values from Loeb et al. (2009), respectively.
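The precision-matrix combination of two independent Gaussian estimates can be sketched as below; the function name and the test values are illustrative only.

```python
import numpy as np

def combine_gaussians(mu_b, cov_b, mu_l, cov_l):
    """Combine two independent Gaussian estimates via their precision matrices.

    Lambda_c = Lambda_b + Lambda_L and
    mu_c = Lambda_c^-1 (Lambda_b mu_b + Lambda_L mu_L).
    """
    lam_b = np.linalg.inv(cov_b)   # precision = inverse covariance
    lam_l = np.linalg.inv(cov_l)
    cov_c = np.linalg.inv(lam_b + lam_l)
    mu_c = cov_c @ (lam_b @ mu_b + lam_l @ mu_l)
    return mu_c, cov_c

# With equal covariances the combined mean is the simple average
mu_c, cov_c = combine_gaussians(np.zeros(2), np.eye(2),
                                np.array([2.0, 2.0]), np.eye(2))
```

The combined estimate always has a smaller (or equal) covariance than either input, which is why the tighter satellite bias constraint dominates the result.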

Our analysis gives *μ*_{c} = (99.7, 240.0) W m^{−2} for the RSR and OLR, slightly different from the (99.5, 239.6) W m^{−2} of Loeb et al. (2009). We estimate the covariance matrix of this combined observational error (ocean heat content and satellite bias error) to be

Other sources of uncertainty are assumed to be independent of each other and added to this covariance matrix. The different sources of uncertainty vary in their magnitude, although total modeling (all sources of uncertainty considered apart from satellite bias and ocean heat content) uncertainty is much larger than observational uncertainty. The most important contributors to total uncertainty come from forcing and parameter uncertainty (Fig. 1) with a total covariance matrix estimated to be

### c. Estimating climate sensitivity

Running full simulations to calculate the equilibrium climate sensitivity for each parameter combination was not possible given computational constraints. We have therefore used a statistical model to estimate the climate sensitivities for each of the candidate parameter combinations produced in our optimization algorithm. This approach is becoming widely adopted in the field, whereby a statistical model can be trained on past evaluations of a climate model with perturbed physics and then used to predict various output quantities for new parameter combinations (Sanderson et al. 2008; Rougier et al. 2009; Sanso and Forest 2009), with the term *emulator* being adopted. The recent U.K. Climate Projections (UKCP09; see http://ukclimateprojections.defra.gov.uk/) relied heavily on the use of statistical emulation (Murphy et al. 2007). These algorithms can simply be thought of as nonlinear regressions of the climate model parameters onto output quantities of interest.

In this study we used the random forest technique (Breiman 2001) to build our statistical emulator, based on a 14 001-member perturbed-physics ensemble generated from climateprediction.net. All simulations were from the Hadley Centre Slab Climate Model, version 3 (HadSM3), which consists of the same atmospheric model (i.e., HadAM3) coupled to a slab thermodynamic ocean. Climate sensitivities were estimated using existing methodology (Stainforth et al. 2005). More recent simulations varied parameters continuously rather than on the original grid design, improving the ability of our emulator to learn how climate sensitivity changes as the model parameters vary. In all, 10 parameters were varied independently in the climateprediction.net ensemble (Sanderson et al. 2008), of which four of the most influential are considered here.

The random forest technique is a machine learning algorithm that has been shown to be very powerful in capturing nonlinear dependencies in a wide variety of problems (Breiman 2001). The algorithm constructs an ensemble of regression trees each built with a bootstrap sample of the original training data, with randomized splitting at each node. The aggregation of a number of classifiers together, known as bagging (bootstrap aggregating), vastly improves the performance of the algorithm and avoids overfitting. The algorithm requires only three parameters in the setup—the number of regression trees, terminal node size, and number of parameters to split the data over at each stage in the tree construction. Sensitivity studies (not shown) indicate that in this case the results of random forest estimates of climate sensitivity are not significantly changed by varying these parameters.

Figure 2a shows the performance of the predictions from the random forest comparing simulated values in HadSM3 to predicted values from the random forest algorithm. All of these predictions are *out of sample*, meaning that predictions were made for models not included in the process of fitting the random forest. This was achieved through a tenfold cross validation as follows:

1. Randomly split the 14 001-member ensemble into 10 segments.
2. Remove all models in segment 1 from the ensemble, and fit the random forest on the remaining 90%. Once fitted, make predictions for the models that have been left out.
3. Repeat for segment 2 and so on until every segment has been left out.

The result is a set of 14 001 predictions for climate sensitivity, which can be compared to the actual simulated values from HadSM3. We do not expect the random forest to perfectly fit the simulations since the climate sensitivity values we fit to are contaminated by noise due to internal variability and uncertainties induced by the exponential fit used (Stainforth et al. 2005). Overall we find that the random forest predictions explain over 95% of the variability in the simulated climate sensitivity values.
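The tenfold procedure can be sketched as follows. The synthetic data and the 1-nearest-neighbour stand-in for the random forest are illustrative only; the point is the leave-one-segment-out mechanics that yield an out-of-sample prediction for every member.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(size=(n, 4))                   # stand-in for the 4 model parameters
y = 3.0 + 2.0 * X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(n)  # toy "sensitivity"

def predict(X_train, y_train, X_test):
    """1-nearest-neighbour regression as a simple stand-in predictor."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d2.argmin(axis=1)]

folds = np.array_split(rng.permutation(n), 10)  # step 1: random tenfold split
pred = np.empty(n)
for hold in folds:                              # steps 2-3: leave each fold out in turn
    train = np.setdiff1d(np.arange(n), hold)
    pred[hold] = predict(X[train], y[train], X[hold])
# every member now has an out-of-sample prediction
```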

We account for uncertainty in our climate sensitivity estimates by using the error statistics generated from the out-of-sample predictions. Specifically, we calculate the root-mean-square error (RMSE) in bins of simulated climate sensitivity (Fig. 2b). The bins are chosen to span the 5%–95% range of the simulated climate sensitivity distribution in deciles. The RMSE is approximately 0.2 K at climate sensitivities of 2–3 K, rising to 0.6 K for climate sensitivities above 6 K; we use this as a varying 1*σ* error in our analysis. Integrated over all climate sensitivities the RMSE is approximately 0.3 K (Fig. 2c). These values are in line with estimates of the uncertainty from initial-condition ensembles (Stainforth et al. 2005). Our error estimates are approximately 50% smaller than those of a similar study, which found a 1*σ* error of approximately 0.45 K (Rougier et al. 2009). We attribute this to our ensemble being approximately 50 times larger and exploring a lower-dimensional parameter space, leading to a much denser sampling of points.
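The binned-RMSE calculation might look like the following sketch, using synthetic predictions whose error grows with sensitivity (mimicking the behavior reported for Fig. 2b); all numbers are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
sim = rng.uniform(2.0, 7.0, size=5000)                        # "simulated" sensitivities
pred = sim + 0.05 * (sim - 2.0) * rng.standard_normal(5000)   # error grows with S

# Bin edges spanning the 5%-95% range of the simulated distribution in deciles
edges = np.percentile(sim, np.linspace(5, 95, 10))
idx = np.digitize(sim, edges)
rmse = np.array([np.sqrt(np.mean((pred[idx == k] - sim[idx == k]) ** 2))
                 for k in range(len(edges) + 1) if np.any(idx == k)])
# rmse[k] is then the 1-sigma error assigned to sensitivities in bin k
```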

Other methods exist to estimate uncertainty in predictions from the random forest technique (Meinshausen 2006). We use out-of-sample error statistics because this is a standard statistical technique, often used in ensemble forecasting (Roulston and Smith 2003), that simplifies and adds transparency to our analysis.

The impact of sampling uncertainty—that is, the uncertainty in the predicted climate sensitivity arising from the specific ensemble members used to fit the random forest—is very small relative to the prediction error (Fig. 2b) and so is ignored in our uncertainty analysis.

### d. Computing probability density functions

In Part I of this paper we showed that an optimization method can be used to adjust model parameters so as to produce simulated global averages of reflected solar radiation and outgoing longwave radiation values that are close to target values. We carried out about 2500 simulations in all and they have a broad range of OLR and RSR values. We would like to use results from those simulations to make probabilistic statements about the climate sensitivity of HadAM3. We chose to focus on climate sensitivity as it is a key summary parameter for future climate change. However, our method could be applied to any future prediction of a climate modeling system.

One issue we face is that the configurations we use are generated by an optimization algorithm. The algorithm has the advantage of generating configurations that are close to the target values but at the cost of making configurations dependent on each other. From the approximately 2500 cases we had 78 configurations with a root-mean-square error of less than 1 W m^{−2} to either the CERES or ERBE observations. The individual parameter values, from this subset, cover a broad range of values, suggesting that we are sampling the distribution well. However, as discussed in Part I there are correlations between the different parameter values.

We label each one of *N* model configurations ℳ_{i}, with a simulated RSR and OLR of **r**_{i}. Each model configuration has an associated climate sensitivity *S*_{i} that we compute using the emulator described in section 2c. Using Bayes theorem we can write the probability of a model configuration given the observations *P*(ℳ_{i} | *O*) as

$$P(\mathcal{M}_i \mid O) \propto P(O \mid \mathcal{M}_i)\,P(\mathcal{M}_i). \qquad (1)$$

The constant of proportionality can be computed by requiring that the probabilities sum to 1. Thus, *P*(*O* | ℳ_{i}) is the likelihood *L*_{i} of ℳ_{i}, and we now describe how this is computed. The density of model configurations near the target values is largest, while those far away from the target have low sampling densities. The probability density for a multivariate Gaussian distribution with mean *μ* and covariance **Σ** is, in two dimensions,

$$\rho(\mathbf{r}) = \frac{1}{2\pi\,|\boldsymbol{\Sigma}|^{1/2}} \exp\left[-\frac{1}{2}(\mathbf{r}-\boldsymbol{\mu})^{\mathrm T}\boldsymbol{\Sigma}^{-1}(\mathbf{r}-\boldsymbol{\mu})\right]. \qquad (2)$$

We assume that the likelihood of each ℳ_{i} varies smoothly in a small patch (Ω_{i}) around **r**_{i}, and we can compute likelihoods using

$$L_i \propto \int_{\Omega_i} \rho(\mathbf{r})\,d\mathbf{r}. \qquad (3)$$

For small enough Ω_{i}, *ρ*(**r**) is approximately constant [=*ρ*(**r**_{i})], giving

$$L_i \propto A_i\,\rho(\mathbf{r}_i), \qquad (4)$$

where *A*_{i} is the area of Ω_{i}. We compute *A*_{i} from the area of the Voronoi polygon (Aurenhammer 1991) for **r**_{i}. The Voronoi polygon is the polygon that surrounds the region that is closer to **r**_{i} than to any other **r**_{j}. These ideas can be generalized to higher dimensions.

Within the 99% consistency region a small number of the Voronoi polygons have large areas (Fig. 3). These polygons include the boundaries of the region generated by our optimization process and so their area depends on the arbitrary choice of boundary. Over such large polygons the area will no longer be sufficiently small that we can approximate Eq. (3) by Eq. (4). So we cap the polygon area at *π* corresponding to a circle with unit radius, giving zero likelihood to regions far from points we generated. We explored reducing this cap to *π*/4 and found it made little difference to our results. Our sampling across the 99% consistency region is sufficiently dense, except near some of the boundaries (Fig. 3), that our assumptions appear reasonable.
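Evaluating the capped, area-weighted likelihood for a set of configurations might be sketched as below; the observational mean, covariance, configuration coordinates, and polygon areas are placeholders.

```python
import numpy as np

def posterior_weights(r, mu, cov, areas, cap=np.pi):
    """Likelihood L_i proportional to A_i * rho(r_i), with areas capped at `cap`."""
    prec = np.linalg.inv(cov)
    d = r - mu
    # 2-D Gaussian density evaluated at each configuration's (RSR, OLR)
    rho = np.exp(-0.5 * np.einsum('ij,jk,ik->i', d, prec, d))
    rho /= 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    w = np.minimum(areas, cap) * rho     # cap large Voronoi areas at pi
    return w / w.sum()                   # normalize so the probabilities sum to 1

# Placeholder observations and three hypothetical configurations
mu = np.array([99.7, 240.0])
cov = np.array([[2.0, 0.3], [0.3, 1.5]])
r = mu + np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
w = posterior_weights(r, mu, cov, areas=np.array([1.0, 1.0, 10.0]))
```

Note the third configuration's large area is capped, so its weight is still driven to near zero by its distance from the observations.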

Our climate sensitivity values are based on integrations of a few decades of a slab climate model and use of an emulator and so are uncertain. This uncertainty depends on *S* itself (see section 2c and Fig. 2b). We assume, to be conservative, that this emulator uncertainty is coherent across all ℳ_{i} and thus make emulator uncertainty a significant contribution to total uncertainty. To include this uncertainty we generate 10 random realizations of *S* assuming the uncertainty in it is Gaussian. For each realization we generate a single realization *ɛ* from a Gaussian distribution. Based on the uncertainties shown in Fig. 2 we add 0.2*ɛ* to the emulated climate sensitivity when *S*_{i} < 3.5; 0.3*ɛ* when 3.5 ≤ *S*_{i} < 4.5; 0.4*ɛ* when 4.5 ≤ *S*_{i} < 6; and 0.5*ɛ* when *S*_{i} ≥ 6. We then renormalize the existing likelihoods, priors, and posteriors and compute cumulative distributions for *S* from these distributions.
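The piecewise emulator error and the coherent perturbation across configurations can be sketched as follows; the break points and *σ* values are those quoted above, while the emulated sensitivities and array shapes are illustrative.

```python
import numpy as np

def emulator_sigma(S):
    """Piecewise 1-sigma emulator error as a function of sensitivity (values from text)."""
    return np.select([S < 3.5, S < 4.5, S < 6.0], [0.2, 0.3, 0.4], default=0.5)

rng = np.random.default_rng(2)
S_emulated = np.array([2.8, 3.7, 5.0, 6.5])    # hypothetical emulated sensitivities

# 10 realizations; a single epsilon per realization keeps the perturbation
# coherent across all configurations, as assumed in the text
eps = rng.standard_normal(10)
S_realizations = S_emulated[None, :] + emulator_sigma(S_emulated)[None, :] * eps[:, None]
```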

## 3. Results

We now apply the uncertainty estimates and approaches described above to compute cumulative distribution functions for climate sensitivity. Before doing that we revisit the groups we used in Part I of this paper.

In Part I we split the configurations close to the CERES and ERBE observations into two groups on the basis of their land temperatures. Both groups had simulated OLR and RSR close to the target values, but one group was warmer over land and drier in the tropics (termed the CERES or ERBE warm group) than the Standard configuration. The other cluster (termed the CERES or ERBE cold group) had a surface climatology close to the default configuration. The CERES cold group has a mean *S* of 3.2 K, slightly smaller than the standard HadAM3 sensitivity (Williams et al. 2001). The CERES warm group has a slightly higher mean sensitivity of 3.6 K. The ERBE groups span a broader range, with mean sensitivities of 3.3 K (cold group) and 4.0 K (warm group). This suggests that it is possible to get different climate sensitivities for similar values of OLR and RSR, and so *S* is not a single-valued function of OLR and RSR.

Using all our simulations and the emulated climate sensitivity we can explore the dependency of *S* on OLR and RSR. This shows that high values of *S* occur for low values of RSR while the smallest values occur at high values of both OLR and RSR (Fig. 4). Zooming in closer to the observations we can see a rich structure with islands of high sensitivity surrounded by regions of lower sensitivity (Fig. 5). We can also see quite a dense sampling of model configurations in the region where configurations would be consistent at the 95% level.

Our original simulations had an *S* range of 2.5–10.2 K. Only a subset of these values is consistent with observations, at the 95% level, with climate sensitivities ranging from 3.0 to 4.1 K for the CERES observations and from 3.0 to 5.2 K for the ERBE observations. To build a coupled ocean–atmosphere model we would require that the atmospheric model be in near-radiative balance, which we interpret as being within 1 W m^{−2} of the observed value. The model configurations that had an imbalance within 1 W m^{−2} of the observed value had an *S* range of 3.0–4.1 K for CERES and 3.0–5 K for ERBE. This suggests that, for HadAM3, it is possible to build coupled models that do not need flux correction but that span a plausible range of climate sensitivities.

Given that the model configurations are not uniformly sampled or randomly generated our approach is to take five different prior distributions and then compute five posterior distributions. If the posterior distributions are similar then the observations are important constraints on the posterior probabilities.

We use equally probable prior distributions where some property is equally likely within the range of simulated values. The five we consider are as follows:

uniform—All configurations are equally likely. This gives a posterior probability equal to the likelihood.

radiation—All values of OLR and RSR are equally likely.

parameter—All parameter values are equally likely.

*S*—All climate sensitivities are equally likely.

1/*S*—All climate feedback values are equally likely.

For equal-probable climate sensitivity we computed the weights from the difference in the climate sensitivities with the boundary values having the same weight as those immediately interior. Climate sensitivities were ordered monotonically prior to computing the weights. Similar computations were done for equal-probable climate feedbacks. For radiation and parameter posteriors we constructed the priors from the area/4-volume of the Voronoi polygons/4-polytopes. For radiation weights we capped the area of the polygon at *π*. For parameter weights we capped the 4-volume of the 4-polytopes at 1000 times the median 4-volume of the Voronoi polytopes.
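
The equal-probable *S* weighting described above can be sketched as follows. This is an illustrative numpy fragment, not the authors' code; the central-spacing rule for interior points is our assumption, while the boundary convention (boundary values get the same weight as their immediate interior neighbors) follows the text. The 1/*S* prior is obtained by passing 1/*S* values instead; the radiation and parameter priors would instead use Voronoi areas/4-volumes (e.g., via `scipy.spatial.Voronoi`).

```python
import numpy as np

def equal_probable_weights(values):
    """Prior weights making a quantity equally probable over its simulated range:
    weight ~ spacing between neighboring sorted values; boundary values take the
    same weight as the values immediately interior to them."""
    v = np.asarray(values, dtype=float)
    order = np.argsort(v)                      # sort monotonically first
    s = v[order]
    gaps = np.empty_like(s)
    gaps[1:-1] = 0.5 * (s[2:] - s[:-2])        # central spacing, interior points
    gaps[0], gaps[-1] = gaps[1], gaps[-2]      # boundaries copy interior weight
    w = gaps / gaps.sum()
    out = np.empty_like(w)
    out[order] = w                             # back to original config order
    return out
```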

The priors we considered lead to a range of different cumulative distribution functions (Fig. 6a). When combined with the observations and our estimated uncertainty, the posterior distributions are all very similar (Fig. 6a), with 95% of climate sensitivities between 3 and 4 K. Taking account of uncertainty in the emulated climate sensitivity leads to a broader distribution function (Fig. 6b), again with little sensitivity to prior assumption. This suggests that, for our uncertainty estimate and HadAM3, the CERES observations provide a strong constraint on climate sensitivity. Examining the 2.5%, best estimate, and 97.5% values of the distribution (Table 2) shows that sensitivity to prior assumption is about 0.1, giving climate sensitivities ranging from 2.7 to 4.2 K for HadAM3 with a best estimate of 3.4 K. Climate sensitivities outside this range are inconsistent with the CERES observations.
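
The prior-to-posterior step is a simple reweighting followed by a cumulative sum. A minimal sketch (illustrative names, not the authors' code), assuming per-configuration prior weights and likelihoods are available:

```python
import numpy as np

def posterior_cdf(s, prior, likelihood):
    """Posterior ~ prior * likelihood (renormalized); returns the sorted
    sensitivities, the weighted CDF, and the 2.5%-97.5% range."""
    post = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    post /= post.sum()
    order = np.argsort(s)
    s_sorted = np.asarray(s, dtype=float)[order]
    cdf = np.cumsum(post[order])
    lo = s_sorted[np.searchsorted(cdf, 0.025)]
    hi = s_sorted[np.searchsorted(cdf, 0.975)]
    return s_sorted, cdf, (lo, hi)
```

For the uniform prior, `prior = np.ones(len(s))`, so the posterior equals the (normalized) likelihood.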

We then explored how sensitive our results are to different assumptions. These sensitivity experiments are as follows:

ERBE—We treated the ERBE values of Fasullo and Trenberth (2008) as we did the CERES results. We first scaled the ERBE RSR value by the Kopp and Lean (2011) TSI values divided by 1365 W m^{−2}. As modeling uncertainty dominates our total covariance we used the same covariances as before in our analysis.

CERES 2×—We scaled the covariance matrix generated from our uncertainty analysis by a factor of 2 but used the CERES observations.

CERES 20×Sat—We scaled the Loeb et al. (2009) RSR and OLR covariance by a factor of 20 and used the CERES observations. This is sufficient (Fig. 5) to make the ERBE and CERES values consistent with one another.

2002 CERES—We only used simulated data for 1 December 2001–30 November 2002 in our observational–model comparison. Internal variability was computed from the 19-member ensemble of HadAM3. The CERES observations and other contributions to total uncertainty were the same as the base case.

CERES Sample—We only used the first one of seven simulations in each of the optimization iterations (see Part I). The same observations and uncertainties were used. This should increase the independence of the samples.
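
Two of the rescalings above are simple enough to state in code. This is an illustrative sketch (names are ours); the TSI value of 1360.8 W m^{−2} is the Kopp and Lean (2011) estimate, and 1365 W m^{−2} is the value assumed in the earlier ERBE processing.

```python
import numpy as np

TSI_KOPP_LEAN = 1360.8  # W m^-2, Kopp and Lean (2011)
TSI_OLD = 1365.0        # W m^-2, value assumed in the ERBE processing

def rescale_rsr(rsr_erbe, tsi_new=TSI_KOPP_LEAN, tsi_old=TSI_OLD):
    """Scale the ERBE reflected shortwave to the updated total solar irradiance."""
    return rsr_erbe * tsi_new / tsi_old

def inflate_covariance(cov, factor):
    """Scale a covariance matrix, e.g. factor=2 for the CERES 2x case, or
    factor=20 applied to the satellite term for the CERES 20xSat case."""
    return factor * np.asarray(cov, dtype=float)
```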

For each of these sensitivity studies we repeated our earlier analysis. Using the ERBE observations rather than the CERES observations has a large impact (Fig. 7 and Table 2) with much greater sensitivity to prior assumptions, a marginally increased lower bound for climate sensitivity, and a much increased upper bound for climate sensitivity. Using the ERBE results we would report a 2.5%–97.5% climate sensitivity range of 2.8–5.6 K with best estimates around 4 K.

Increasing the covariance and using the CERES observations (CERES 2×), not unexpectedly, increases the range of plausible climate sensitivities, with a small impact at the lower end but an increase in the upper end to 5 K. It also increases the sensitivity of our results to prior assumptions. Increasing the CERES measurement uncertainties (CERES 20×Sat) has little impact on the lower bound for climate sensitivity but again increases the upper bound quite considerably (Table 2). The covariance in this case provides a strong constraint on the total outgoing radiation though not on the individual components. Using this analysis we would report a 2.5%–97.5% climate sensitivity of 2.6–5.4 K with best estimates around 3.3–3.6 K and considerable sensitivity to prior assumptions. The CERES Sample case gives very similar results to the original CERES case although with an increase of 0.1 K in both the overall lower and upper ranges.

Turning now to the 2002 case, here we only use one year of simulated data to compare with one year of CERES data. Considering the climate sensitivity as a function of RSR and OLR (Fig. 8) we see a very similar plot with higher climate sensitivities at low values of RSR and smallest climate sensitivities at high RSR/OLR. The posterior distributions of climate sensitivity using only this year are similar to the reference CERES case with the same range of 2.7–4.2 K. This suggests we could have done our analysis with 2-yr simulations (1 yr to spin up and 1 yr to compare with observations) rather than the 6.5 yr we actually did use. However, we would still need to have estimates of the climate sensitivity for those configurations.

One other observation is that ERBE and CERES results are quite different from one another. Using the CERES observations and changing covariances changes the upper range of the cumulative distribution functions (CDFs) but the CDFs are all similar to one another below the 60%–80% level. The ERBE CDF appears to be characterized by a general shift toward higher sensitivities with differences between it and the CERES distributions apparent at all levels. This leads to differences in the best estimate climate sensitivity, giving 3.3 K for the CERES observations and about 4 K, although with considerable sensitivity to the prior, for the ERBE observations.

## 4. Discussion and conclusions

Having shown that there is a relationship between the two components of outgoing radiation and climate sensitivity we now consider total outgoing radiation. Climate models show on average that

*R*′ = *αT*′ − *G*,

where *R*′ is the change in outgoing radiation, *T*′ the change in surface temperature, *G* the forcing, and *α* is termed the “climate feedback parameter” (Gregory and Webb 2008); *α* is related to the climate sensitivity by *S* = *F*_{2×}/*α*, where *F*_{2×} is the forcing from doubling CO_{2}. We might expect, with fixed SSTs and thus largely constant surface temperatures, that increasing *α* would lead to an increase in total outgoing radiation while decreasing *α* would lead to a decrease. If this were the case it would provide physical justification for our focus on outgoing radiation as an observational constraint on climate sensitivity. However, Gregory and Webb (2008), Andrews and Forster (2008), and Colman and McAvaney (2011) have shown that CO_{2} forcing can also generate, in a model-dependent way, rapid changes in tropospheric structure and clouds. The impact of this process could explain why one perturbed-physics version of HadSM3 had a low sensitivity (Gregory and Webb 2008). Initially, we neglect this process and assume that the forcing from doubling CO_{2} is 3.76 W m^{−2} (Myhre et al. 1998). Using this forcing we compute *α* from the emulated climate sensitivities.

There is an approximately linear, though noisy, relationship between simulated total outgoing radiation and *α* (Fig. 9) with large outgoing radiation associated, as expected, with large values of *α*. The slope of the best-fit robust line is approximately 30 K (i.e., for a 0.1 W m^{−2} K^{−1} increase in climate feedback strength the outgoing radiation increases by about 3 W m^{−2}). For climate feedback parameter values greater than about 1.2 W m^{−2} K^{−1} the regression slope appears to be stronger with a value of about 75 K (Fig. 9). This increase in slope at high climate feedback values cannot be explained by our neglect of rapid cloud responses to CO_{2} changes. Gregory and Webb (2008) found for one low climate sensitivity configuration of HadSM3 that rapid cloud responses caused the effective forcing to decrease and thus the estimated value of *α* to reduce. This process, if anything, would steepen the best-fit regression slope for the low climate sensitivity cases.
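
The two computations just described, *α* from the emulated sensitivities and a robust line fit, can be illustrated as follows. The data here are synthetic, constructed to have the ~30 K slope quoted in the text (they are not the model output), and the Theil–Sen median-of-pairwise-slopes estimator stands in for whatever robust fit was actually used.

```python
import numpy as np

F2X = 3.76  # W m^-2, forcing from doubling CO2 (Myhre et al. 1998)

def feedback_from_sensitivity(s):
    """Climate feedback parameter alpha = F_2x / S, in W m^-2 K^-1."""
    return F2X / np.asarray(s, dtype=float)

# Synthetic illustration: outgoing radiation rising ~30 W m^-2 per W m^-2 K^-1.
rng = np.random.default_rng(1)
alpha = rng.uniform(0.4, 1.2, 200)                           # feedback values
outgoing = 240.0 + 30.0 * alpha + rng.normal(0.0, 0.5, 200)  # W m^-2

# Theil-Sen robust slope: median of all pairwise slopes.
i, j = np.triu_indices(alpha.size, k=1)
slope = np.median((outgoing[j] - outgoing[i]) / (alpha[j] - alpha[i]))
```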

To test if these results arise because of the use of four parameters or because of “tuning” the model to observations, we used data from an ensemble of 100 randomly sampled configurations from the 14 001 we used to generate our emulator (see Part I). The 98 cases that did not fail due to numerical problems also show an increase, with considerable scatter, in outgoing radiation as the climate feedback parameter increases (Fig. 9). Thus, at least for HadAM3, there appears to be a link between outgoing radiation and climate sensitivity, supporting our initial focus on outgoing radiation to constrain climate sensitivity.

Our comparison between model and observation is testing the model fidelity of global-averaged outgoing radiation. With our experimental design it is not possible to test the relative importance of temperature feedbacks and rapid responses to CO_{2} and other forcings.

Could we have carried out our analyses more efficiently? We generated about 2500 configurations of HadAM3 and ran each of them for six years. This is computationally expensive and is only possible as HadAM3 is a relatively cheap model. We have already shown that 1 year of data is enough to determine model–observational consistency. One way to proceed might be to generate the 16 extreme cases and use the three most extreme of these cases to start a series of optimization cases. To explore this we subsampled our data to only include CERES, ERBE, and three targets on the edge of the model–observation consistent region. We then only considered the 2002 data in the analysis. If we had done this, we would have concluded that climate sensitivity lay in the range 2.8–4.4 K with some sensitivity to prior assumptions. This is not hugely different from our results with 2500 simulations each run for six years. However, we would still need to compute climate sensitivity for those configurations.

Our focus has been on climate sensitivity for which key processes and parameters have already been identified by Knight et al. (2007). Other impacts of climate change may be less obviously related to present-day observations than climate sensitivity appears to be. However, Fowler et al. (2010) found that changes in U.K. extreme precipitation were strongly controlled by the entcoef and vf1 parameters, suggesting our results might also provide some constraints on changes in future extreme precipitation.

We have shown that CERES observations of reflected shortwave radiation (RSR) and outgoing longwave radiation (OLR) provide a significant constraint on the plausible (2.5%–97.5%) range of climate sensitivities for HadAM3. Using the more recent CERES observations we find a range of 2.7–4.2 K for HadAM3 climate sensitivity with little sensitivity to a range of prior distributions and a best estimate of 3.4 K. Using the older ERBE observations we find greater sensitivity to the prior distribution and a range of 2.8–5.6 K with a best estimate around 4 K. Amplifying the CERES OLR and RSR errors to make ERBE and CERES observationally consistent leads to high uncertainty in the individual components but, still, small uncertainty in the total outgoing radiation arising from uncertainty in the ocean heating rate and total solar irradiance. This uncertainty estimate gives a best estimate of about 3.4 K and a climate sensitivity range of 2.7–5.4 K.

Some caveats on our results are necessary. We may be missing or underestimating key uncertainties in the model–observational comparison. For example, we have assumed that internal climate variability as simulated by default HadAM3 is adequate. The model does not simulate well the average land surface temperature or the clear-sky outgoing radiation regardless of tuning, so had we included these in our analysis we might have reached different conclusions.

Our uncertainty range only includes the effect of atmospheric and land surface processes and does not take account of oceanographic processes such as changes in ocean circulation or of changes in sea ice. However, Brierley et al. (2010) suggest that perturbations in ocean parameters have little impact on future climate change in HadCM3, suggesting that our neglect of them is not critical. Thus, our results suggest that climate sensitivity, for the HadAM3 model, is unlikely (2.5%) to be greater than 5.6 K. This uncertainty could be narrowed given focused work by the satellite community to resolve differences between the CERES and ERBE measurements.

Our results also have implications for the recent U.K. Climate Projections (Murphy et al. 2010) analysis, which is based on a set of 11 perturbed physics regional model simulations of HadAM3 driven by perturbed physics simulations of the HadCM3 atmosphere–ocean general circulation model (Collins et al. 2011). All these configurations of HadCM3 require flux correction and some have climate sensitivities larger than our results suggest is plausible. If our results are correct then the impact of climate change may be less severe than some of those simulations suggest.

Other groups have found quite different results for the range of plausible climate sensitivities. Shiogama et al. (2012) report a climate sensitivity range of 2.2–3.2 K for the Model for Interdisciplinary Research on Climate, version 5 (MIROC5). However, unlike our range, their range is not based on a measure of model–observational difference but assumes instead that the atmospheric model should have a small net TOA imbalance when driven with SSTs taken from a preindustrial control simulation. This experimental design is likely to underestimate the range of climate sensitivities as any coupled model configuration run to equilibrium would have a small net TOA imbalance. So it is possible that other configurations, with a broader range of climate sensitivities, when run with different preindustrial SSTs would have also been in radiation balance.

Sanderson (2011), using an ensemble of Community Atmosphere Model, version 3.5, simulations, also found a narrow range of climate sensitivities (2.2–3.2 K) by perturbing four parameters across their plausible values. No observational constraints were applied; applying them would presumably narrow the range still further.

Our study used HadAM3 and varied four parameters that previous work had suggested were important to the model's climate sensitivity. Thus, our results are conditional on both the model and the parameters varied. Our uncertainty estimates, based as they are on HadAM3, largely do not include structural uncertainty, for which using additional models is one way forward. As the work described above suggests, different models are likely to produce different ranges of climate sensitivities. One way forward is to generate, for each model, a range of perturbed models consistent with observations. If this can be done efficiently then this would allow a better understanding of the range of possible future climates in response to emissions of greenhouse gases.

## Acknowledgments

SFBT was supported by the National Centre for Earth Observations (NERC Grant NE/F001436/1). Support for MJM and computer time on the Edinburgh Computing and Data Facility was provided by the Centre for Earth Dynamics, which is part of the Scottish Alliance for GeoScience, Environment and Society. CC, SFBT, and MJM also acknowledge support from “Bridging the Gaps” and “Maximaths” funding. DJR was supported by a NERC PhD studentship with a CASE award from CEH Wallingford. We are grateful to volunteers who participated in the climateprediction.net experiment, donating spare computing cycles to integrate the HadSM3 model versions, from which we were able to calculate the climate sensitivity for each model parameter configuration. We thank Joyce Penner for providing results from simulations used to estimate the uncertainty in natural aerosol effect on RSR.

## REFERENCES

CO_{2} forcing induces semi-direct effects with consequences for climate feedback interpretations. *ACM Comput. Surv.,* **23,** 345–405.

*Climate Change 2007: The Physical Science Basis.* S. Solomon et al., Eds., Cambridge University Press, 129–234.

CO_{2} forcing. *Atmos. Chem. Phys.,* **10,** 9993–10 002.

**38,** L01706.

*Climate Change 2007: The Physical Science Basis,* S. Solomon et al., Eds., Cambridge University Press, 747–845.

*J. Geophys. Res.,* **117,** D04103.

*Climate Change 2007: The Physical Science Basis,* S. Solomon et al., Eds., Cambridge University Press, 589–662.

*J. Roy. Stat. Soc.,* **38,** 2543–2558.

*Climate Change 2007: The Physical Science Basis,* S. Solomon et al., Eds., Cambridge University Press, 19–91.