## 1. Introduction

Estimates of climate sensitivity often use global energy balance or other simple climate models with a limited number of adjustable parameters and compare modeled and observed values of multidecadal warming and other climate variables. Such estimates play an important role in assessment of climate sensitivity (Hegerl et al. 2007; Bindoff et al. 2014). Most of these studies use a Bayesian framework as a basis for assessing uncertainty and developing a probability density function (PDF) for climate sensitivity. This paper addresses the methodological challenge of selecting the appropriate Bayesian prior distributions for climate sensitivity and other parameters employed in simple climate model analyses.

Deriving valid probabilistic estimates for climate sensitivity and other uncertain parameters such as effective ocean diffusivity has proved challenging. These challenges primarily arise from the strongly nonlinear relationships between observable variables and these climate system parameters, combined with large observational uncertainties. Such factors make the selection of appropriate Bayesian prior distributions for the parameters crucial but nonobvious. In suitable cases, the problem of prior selection may be addressed by considering Bayesian parameter inference as consisting of first generating probabilistic estimates for the “true” values of the observable variables—those corresponding to what would have been observed in the absence of uncertainty—and then performing a transformation of variables from the observables to the parameters.

Frame et al. (2005, hereafter FR05) was a seminal paper that addressed the role of prior assumptions regarding climate sensitivity and is particularly well suited for illustration of a transformation-of-variables approach. FR05 used probabilistic observationally derived estimates of twentieth-century warming attributable to greenhouse gas increases and of effective heat capacity to estimate a PDF for climate sensitivity (and implicitly also for effective ocean diffusivity) under different sampling strategies, representing different prior assumptions. Such an analysis can only be viewed in Bayesian terms, since in frequentist statistics there is no role for prior assumptions, nor is putting a probability distribution on a fixed but unknown parameter permitted.

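Bayes’s theorem states that the posterior probability density for a parameter vector ***θ*** on which observed data **y** depend is proportional to the probability density of the data given ***θ*** (the likelihood function, when considered as a function of ***θ***, with **y** fixed) multiplied by the density of a prior distribution (prior) for ***θ***:

$$p(\boldsymbol{\theta} \mid \mathbf{y}) \propto p(\mathbf{y} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta}).$$

In subjective-Bayesian analysis, the prior purely reflects existing knowledge about ***θ***. In objective-Bayesian analysis, where such knowledge is disregarded or nonexistent, the prior is designed to be noninformative so that the data dominate inference about ***θ***. Noninformative priors depend on the characteristics of the data and experiment concerned and have no probability interpretation (Bernardo and Smith 1994). The likelihood function required to apply Bayes’s theorem is a probability density for data **y**, expressed as a function of ***θ*** and evaluated at **y**_{0}, the actually observed **y**. Typically, **y** will reflect some function of ***θ*** and a random error ***ε***.

Bayes’s theorem applies where **y** is not simply a deterministic function of ***θ***. Where it is, as for the true values of the observables, **y** has a fixed value for any given ***θ*** and thus no likelihood function exists. Therefore, Bayes’s theorem is not required and instead one simply uses the standard transformation-of-variables formula. Provided that the dimensionalities of **y** and of ***θ*** are the same, as they are in FR05, and provided that in the region where the PDFs are nonzero the function *f* relating them, **y** = *f*(***θ***), is one-to-one and differentiable, the formulas for converting a PDF for ***θ*** into a PDF for **y** and vice versa are

$$p_{\mathbf{y}}(\mathbf{y}) = p_{\boldsymbol{\theta}}\!\left(f^{-1}(\mathbf{y})\right)\left|\det \frac{\partial f^{-1}}{\partial \mathbf{y}}\right|,$$

where the Jacobian determinant is that of *f*^{−1} with respect to **y**; and, conversely,

$$p_{\boldsymbol{\theta}}(\boldsymbol{\theta}) = p_{\mathbf{y}}\!\left(f(\boldsymbol{\theta})\right)\left|\det \frac{\partial f}{\partial \boldsymbol{\theta}}\right|,$$

where the Jacobian determinant is that of *f* with respect to ***θ*** (Mardia et al. 1979).

If the dimensionality of the observables exceeds that of the parameters, a dimensionally reducing version of the transformation-of-variables PDF conversion formula may be used (Mardia et al. 1979; Lewis 2013b), provided the observables can first be whitened (made independent and of equal variance, as in optimal fingerprint methods; Hegerl et al. 1996).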
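The conversion of a PDF under a change of variables can be checked numerically in one dimension. The sketch below is illustrative only: the map *y* = log *θ* and the *N*(0.5, 0.4) PDF for *y* are hypothetical choices, not quantities from FR05, and the helper names are assumptions made for this example.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution (stands in for the PDF of the observable y)."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def trapezoid(values, grid):
    """Trapezoidal integral of `values` over `grid`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def pdf_theta_from_pdf_y(theta, f, df_dtheta, pdf_y):
    """p_theta(theta) = p_y(f(theta)) * |df/dtheta| for a one-to-one map y = f(theta)."""
    return pdf_y(f(theta)) * np.abs(df_dtheta(theta))

# Hypothetical toy problem: y = log(theta), with a N(0.5, 0.4) PDF for y.
theta = np.linspace(0.05, 20.0, 20000)
pdf_theta = pdf_theta_from_pdf_y(
    theta,
    f=np.log,
    df_dtheta=lambda t: 1.0 / t,                 # Jacobian of f with respect to theta
    pdf_y=lambda y: normal_pdf(y, 0.5, 0.4),
)

# The converted density should still integrate to (almost exactly) one.
total_probability = trapezoid(pdf_theta, theta)

# Quantiles are preserved by the monotonic map, so the median of theta
# should equal exp(median of y) = exp(0.5).
cdf = np.concatenate(
    ([0.0], np.cumsum(0.5 * (pdf_theta[1:] + pdf_theta[:-1]) * np.diff(theta)))
)
median_theta = float(np.interp(0.5, cdf, theta))
print(total_probability, median_theta, np.exp(0.5))
```

Omitting the Jacobian factor would leave a function of *θ* that no longer integrates to one and whose percentiles no longer correspond to those of *y*.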

The primary aims of this paper are to provide insight into the use of objective-Bayesian methods for estimating climate sensitivity by considering their relationship to transformations of variables in the context of the simple case considered in FR05, discussing also estimation using profile likelihood methods, and to dispel some misunderstandings about Bayesian inference.

Although the FR05 authors had no intention of using an objective-Bayesian approach, they stated that “unless they are warned otherwise, users will expect an answer to the question ‘what does this study tell me about *X*, given no knowledge of *X* before the study was performed?’” (p. 3), going on to assert that “this requires sampling nonobservable parameters to simulate a uniform distribution in *X*, the forecast quantity of interest, before other constraints are applied.” By contrast, the objective approach presented here, which is intended not to incorporate any prior knowledge as to the values of the parameters involved, is equivalent to advocating a uniform prior, not in the forecast quantity but in a transformation of observables that has errors with a Gaussian or other fixed distribution.

The data, model, and model parameters used in this paper’s analysis follow FR05, although several inconsistencies and misinterpretations in FR05 are pointed out. FR05 mistakenly derived distributions for its observables that, as will be seen, equated to estimated posterior PDFs for them rather than likelihood functions. Accordingly, FR05 is not fully consistent with Bayes's theorem. Moreover, the FR05 authors misinterpreted the ocean heat content change estimate they used, which pertains to a 44-yr period, as covering only the somewhat shorter period used in FR05. These errors, which have a relatively modest net effect, are addressed in a 2014 corrigendum (FR05).

The material is organized as follows: Section 2 summarizes the methods used and discusses their implications. Section 3 evidences replication of FR05’s original results. Section 4 deals with inference based on likelihood functions derived for the observables. Section 5 discusses climate sensitivity estimation and various misconceptions about it.

## 2. The observables, model, and model parameters

### a. Overview

The analysis uses a global energy balance model (EBM) with a diffusive ocean (Andrews and Allen 2008). The EBM uses given values of climate sensitivity *S* and effective ocean vertical diffusivity *K*_{υ} and a greenhouse gas (GHG) forcing time series estimated by the Met Office Hadley Centre Coupled Model, version 3 (HadCM3; Gordon et al. 2000), to simulate global surface temperature and ocean heat content changes over 1861–2000. The EBM is believed to be equivalent to and employ the same forcing series as that used in FR05 and is run at the same *S* and *K*_{υ} values.

The observables used are twentieth-century warming attributable to greenhouse gases *T*_{A} (attributable warming) and effective heat capacity *C*_{H} (the ratio of changes in ocean heat content and global surface temperature). Thus, **y** and ***θ*** are both bivariate, with observables **y** = (*T*_{A}, *C*_{H}) and parameters ***θ*** = (*S*, *K*_{υ}). Although *K*_{υ} usually denotes effective ocean vertical diffusivity, for notational convenience here *K*_{υ} represents the square root of effective ocean vertical diffusivity, which controls *C*_{H} in an approximately linear manner (Sokolov et al. 2003) and in effect is treated as being the parameter.

Model accuracy is assumed, with there being a true setting of (*S*, *K*_{υ}); uncertainty in *S* then entirely reflects uncertainties in the observationally based estimates of *T*_{A} and *C*_{H}.

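EBM simulations are run using all parameter value combinations lying on a grid that is uniformly spaced in terms of *S* and *K*_{υ}. The ranges used are sufficiently wide for there to be negligible probability of the true values of *S* or *K*_{υ} lying outside them. The annual model-simulation time series are used to compute the simulated observables (*T*_{A}, *C*_{H}) at each grid point, and the true joint values of the observables are taken to be (*T*_{A}, *C*_{H}) = *f*(*S*, *K*_{υ}) at the true parameter setting, where *f* is the functional relationship between the parameters and the observables embodied in the EBM.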

### b. Derivation of climate sensitivity PDFs from the observables

The attraction of using *C*_{H} rather than ocean heat content is that *C*_{H} should be independent of the change in global surface temperature and hence of *T*_{A}. On the basis of independence between the observables *C*_{H} and *T*_{A}, which holds for similar variables in the HadCM3 control run, their joint density can be obtained by multiplying their individual densities. To undertake Bayesian inference for the parameters, that joint density (l_hood) would need to be the likelihood, at any selected (*T*_{A}, *C*_{H}) combination [and hence at the corresponding (*S*, *K*_{υ}) combination], of the observed values of *T*_{A} and *C*_{H}.

The EBM simulation runs and interpolation from *T*_{A}–*C*_{H} space onto the *S*–*K*_{υ} grid can be used to derive the value of l_hood at all *S*–*K*_{υ} grid combinations. By integrating the resulting l_hood values over all *K*_{υ} values, a posterior distribution for *S* based on assuming a uniform initial distribution in sensitivity can be derived, provided l_hood is a likelihood. Alternatively, the impact of different prior distributions can be simulated by weighting in different ways when interpolating between the *S*–*K*_{υ} and *T*_{A}–*C*_{H} grids. Provided l_hood is a likelihood, this procedure is equivalent to using differing prior distributions in Bayes’s theorem and thereby obtaining different marginal posterior PDFs for *S*.

However, for Bayes's theorem to be applied, l_hood must be a likelihood as defined above and not a posterior PDF. In the latter case, Bayes’s theorem is not applicable and instead the standard Jacobian determinant rule applicable to transformations of variables enables computation of a posterior PDF for *S* upon integrating out *K*_{υ}.

### c. Methods used to derive the observables from the data and their resulting status

The observationally based estimate of *C*_{H} is derived from the change in ocean heat content (ΔOHC) of 144.7 ZJ and the change in global surface temperature Δ*T*_{G} of 0.338 K over the 1957–94 period stated in FR05, allowing for the uncertainty in both quantities (45 ZJ and 0.066 K standard errors, respectively, assumed independent and Gaussian). As discussed in section 1, the ΔOHC estimate actually related to a longer period, resulting in an overestimate of *S* and *K*_{υ}, but it is used in order to provide comparability with FR05. Section 5 gives an estimate of *S* with ΔOHC and Δ*T*_{G} determined (as 128.3 ZJ and 0.360 K, respectively) over matching 1957–96 periods. Since the ΔOHC estimate used does not represent total heat uptake, a small allowance for omitted elements should perhaps be added; consideration of that issue is beyond the scope of this paper.

For both ΔOHC and Δ*T*_{G}, the observed value is taken to equal the true value plus a Gaussian error of known standard deviation, so that the true value is a location parameter, for which a uniform prior is noninformative. The *N*(144.7, 45) ZJ and *N*(0.338, 0.066) K distributions involved accordingly represent not only independent likelihood functions for the observed values of ΔOHC and Δ*T*_{G} but also independent objective posterior PDFs, derived using uniform priors, for the true values of those variables.

I derive an estimated PDF for the true effective heat capacity *C*_{H} from the posterior PDFs for the true ΔOHC and Δ*T*_{G} by calculating the quotients of many pairs of random samples taken from them and computing their histogram. This is essentially an identical method to that used in Gregory et al. (2002) directly to compute a PDF for *S* from the error distributions for the relevant observables. The sampling-based method provides a correctly calculated objective-Bayesian estimated posterior density for *C*_{H}. However, since the uncertainty in the *C*_{H} estimate differs at varying values of *C*_{H}, its likelihood function differs from the posterior density for *C*_{H}.
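The quotient-sampling step can be sketched as follows, using the *N*(144.7, 45) ZJ and *N*(0.338, 0.066) K distributions stated in the text. The sample size, random seed, and histogram bin range are arbitrary choices made for this illustration, not values from FR05.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen only for reproducibility
n_samples = 400_000                   # arbitrary; larger gives a smoother histogram

# Independent posterior PDFs for the true values, as stated in the text.
delta_ohc = rng.normal(144.7, 45.0, n_samples)    # ZJ
delta_tg = rng.normal(0.338, 0.066, n_samples)    # K

# Each quotient is one draw from the posterior for the effective heat capacity.
c_h_samples = delta_ohc / delta_tg                # ZJ per K

# Histogram of the quotients gives the estimated posterior PDF on a grid.
bin_edges = np.linspace(-200.0, 2000.0, 441)      # illustrative range, 5 ZJ/K bins
pdf_c_h, _ = np.histogram(c_h_samples, bins=bin_edges, density=True)
bin_centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])

median_c_h = float(np.median(c_h_samples))
p05, p95 = (float(v) for v in np.percentile(c_h_samples, [5.0, 95.0]))
print(median_c_h, p05, p95)
```

Because the denominator is almost surely positive, the median of the quotient coincides with the quotient of the two central estimates, while the PDF itself is skewed toward high values.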

The other observable,

It follows that l_hood actually represents a joint posterior density for *C*_{H} and *T*_{A}, not a likelihood.

### d. Implications of using a joint posterior PDF instead of a likelihood

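Since the density l_hood for the observables is a joint posterior PDF for them rather than a likelihood, it can be converted into a joint posterior PDF for the parameters by a transformation of variables. The EBM simulations provide the mapping between *T*_{A}–*C*_{H} space and *S*–*K*_{υ} space and enable computation of the Jacobian determinant, and hence of a joint posterior PDF for *S* and *K*_{υ}.

In geometrical terms, the Jacobian determinant is the area of a small region in *T*_{A}–*C*_{H} space relative to that of the region in *S*–*K*_{υ} space to which it corresponds. High values of the Jacobian determinant therefore occur where the observables respond strongly to changes in the parameters.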

The shape of the Jacobian determinant surface is shown in Fig. 1. It is highest in the low *S*, low *K*_{υ} corner and declines with both *S* and *K*_{υ}, declining faster when the other variable is high.
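A minimal sketch of how a Jacobian determinant surface can be computed on a parameter grid by finite differences and used to convert a joint density is given below. The forward model is a hypothetical stand-in for the EBM, chosen only because its Jacobian is known analytically, and the normal PDFs for the observables are invented for illustration.

```python
import numpy as np

def toy_forward_model(s, k):
    """Hypothetical stand-in for the EBM: maps (S, K) to (T_A, C_H)."""
    t_a = s / (1.0 + s + k)        # rises with S, saturates, falls with K
    c_h = k * np.ones_like(s)      # linear in K (illustrative assumption)
    return t_a, c_h

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

s_values = np.linspace(0.2, 10.0, 981)
k_values = np.linspace(0.2, 4.0, 191)
s_grid, k_grid = np.meshgrid(s_values, k_values, indexing="ij")
t_a, c_h = toy_forward_model(s_grid, k_grid)

# Finite-difference partial derivatives of the observables on the grid.
dta_ds = np.gradient(t_a, s_values, axis=0, edge_order=2)
dta_dk = np.gradient(t_a, k_values, axis=1, edge_order=2)
dch_ds = np.gradient(c_h, s_values, axis=0, edge_order=2)
dch_dk = np.gradient(c_h, k_values, axis=1, edge_order=2)
jacobian_det = np.abs(dta_ds * dch_dk - dta_dk * dch_ds)

# Hypothetical independent posterior PDFs for the observables.
joint_pdf_obs = normal_pdf(t_a, 0.5, 0.07) * normal_pdf(c_h, 1.5, 0.4)

# Change of variables: density in (S, K) = density in (T_A, C_H) * |J|.
joint_pdf_params = joint_pdf_obs * jacobian_det

# Integrate out K (trapezoidal rule), then normalize over the S range.
marginal_s = np.sum(
    0.5 * (joint_pdf_params[:, 1:] + joint_pdf_params[:, :-1]) * np.diff(k_values),
    axis=1,
)
marginal_s /= np.sum(0.5 * (marginal_s[1:] + marginal_s[:-1]) * np.diff(s_values))
```

For this toy map the determinant is (1 + *K*)/(1 + *S* + *K*)², which is largest in the low *S*, low *K* corner; that qualitative feature is a property of the invented model and should not be read as a reproduction of Fig. 1.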

The PDF for

## 3. Inference for *S* using a transformation of variables

In this section, I use a transformation-of-variables approach to infer *S* and demonstrate replication of two FR05 PDFs for *S*, those stated to correspond to uniform initial distributions in transient climate response (TCR)/attributable warming (jointly with effective heat capacity) and in climate sensitivity (jointly with *K*_{υ}), respectively. Given the linear response to transient forcing, TCR and attributable warming are approximately linearly related; hence, the PDFs for *S* in Fig. 1c of FR05 based on uniform initial distributions in each of them are indistinguishable. Although FR05 was not an objective-Bayesian study, it unintentionally used incorrect methods for the calculation of the likelihood functions: namely, the methods set out in section 2 for the derivation of objective-Bayesian posterior PDFs for the observables. As a result, the correctness of my emulation of FR05’s model simulations and other calculations can be tested by comparing its (uncorrected) PDFs for *S* based on uniform initial distributions in TCR/attributable warming and in climate sensitivity with those I compute using a transformation-of-variables approach to convert the PDFs for the observables, respectively, using and omitting the applicable standard Jacobian determinant factor. The uncorrected FR05 PDFs, which do not accurately reflect the stated intentions of its authors, are shown only to demonstrate that the computations in this study match those in FR05.

### a. Replication of PDFs for *S*

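Figure 2 shows marginal posterior PDFs for *S* after (as in FR05) integrating out *K*_{υ}. The black line, computed by transforming the joint posterior PDF for *T*_{A} and *C*_{H} using the Jacobian determinant, closely matches the FR05 PDF based on a uniform initial distribution in TCR, which presumably overlays the nonvisible line based on uniform sampling in attributable warming. Sampling uniformly in the observables weights parameter combinations by volumes in *T*_{A}–*C*_{H} space relative to corresponding volumes in *S*–*K*_{υ} space. It thus equates to multiplication by the Jacobian determinant required upon the change of variables from (*T*_{A}, *C*_{H}) to (*S*, *K*_{υ}).

The gray line in Fig. 2 shows a PDF for *S* computed by restating, in terms of *S*–*K*_{υ} coordinates, the joint posterior PDF for *T*_{A} and *C*_{H} without applying the Jacobian determinant factor, which corresponds to a uniform initial distribution in climate sensitivity.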

Percentile points of the posterior PDFs give Bayesian credible intervals for *S*, which are shown by the box plots in Fig. 2 for 10%–90% and 5%–95% ranges.
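Reading percentile points off a gridded PDF can be sketched as below. The helper name `credible_interval` and the lognormal-shaped example PDF are assumptions made for illustration; the example PDF is not one of the PDFs in Fig. 2.

```python
import numpy as np

def credible_interval(grid, pdf, lower=0.05, upper=0.95):
    """Percentile points of a PDF tabulated on an increasing grid.

    The PDF need not be normalized; the cumulative integral is rescaled to one.
    """
    increments = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(increments)))
    cdf /= cdf[-1]
    return float(np.interp(lower, cdf, grid)), float(np.interp(upper, cdf, grid))

# Hypothetical skewed PDF for S (lognormal shape, median 2.4), for illustration only.
s_grid = np.linspace(0.1, 20.0, 4000)
pdf_s = np.exp(-0.5 * ((np.log(s_grid) - np.log(2.4)) / 0.45) ** 2) / s_grid

ci_5_95 = credible_interval(s_grid, pdf_s, 0.05, 0.95)
ci_10_90 = credible_interval(s_grid, pdf_s, 0.10, 0.90)
print(ci_5_95, ci_10_90)
```

Because the interval is defined through the CDF, it depends on where the grid is truncated; the grid should therefore extend far enough that negligible probability lies outside it.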

### b. Discussion of the climate sensitivity PDF obtained by the transformation-of-variables approach

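In a first stage, I derived, using a sampling-based objective-Bayesian approach, a joint posterior PDF for *T*_{A} and *C*_{H}. In a second stage, a transformation of variables converts it into a joint posterior PDF for *S* and *K*_{υ}, from which the marginal posterior PDF for *S* (the black line in Fig. 2) follows upon applying the standard Bayesian method of integrating out *K*_{υ}.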

If one accepts that the posterior PDF derived for

The sampling-derived estimated posterior PDF for *T*_{G} are valid: it corresponds with repeated sampling results. Moreover, it is implicitly assumed in FR05 that, after the additional allowance is made for forcing uncertainty, *T*_{A}–*C*_{H} space, with the posteriors derived here for

## 4. Inference based on likelihood functions

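I have shown that objective posterior PDFs for *T*_{A} and *C*_{H} can readily be derived from the data used in FR05 and converted, by a transformation of variables, into a joint PDF for *S* and *K*_{υ}. By comparison, the usual Bayesian approach would be to derive PDFs for *S* and *K*_{υ} directly from likelihood functions for *T*_{A} and *C*_{H}, using Bayes’s theorem with a chosen prior and without any transformation of variables.

In objective-Bayesian terms, it is possible to regard the posterior PDFs for *T*_{A} and *C*_{H} as each being the product of a likelihood function and a noninformative prior, and to recover the likelihood functions by matching parameterized distributions to those PDFs. For *C*_{H} a more direct, non-Bayesian method of deriving an approximate likelihood function can also be used, but for *T*_{A} the data used in FR05 do not provide the necessary information.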

### a. Deriving a joint likelihood function for *S* and *K*_{υ} and related noninformative priors

Normally, where (as here) estimated objective posterior PDFs for the observables are already available or can be easily and exactly calculated, there would be no reason to derive corresponding likelihood functions and to carry out Bayesian inference using them. Rather, one would just carry out a transformation of variables to parameter space. The reason for doing so here is partly to provide insight into the nature of Bayesian inference; partly to show that sampling uniformly in the observables at the likelihood function level may not provide satisfactory results; and partly in order to present comparative, purely likelihood-based inference results.

When employing the parameterized-distributions approach to deriving likelihood functions set out above, the matching process can be simplified by restricting the considered likelihood functions to simple transformations of standard location distributions. That is because a uniform prior is known to be noninformative for a location parameter. In such cases the required noninformative prior is simply the derivative of the transform involved and is the Jeffreys prior (the square root of the determinant of the Fisher information matrix; Jeffreys 1946). The Jeffreys prior can be thought of as a uniform sampling of Gaussian (or other location distribution) data, subsequently transformed.
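The statement that the noninformative prior is the derivative of the transform can be verified with a short derivation. The notation is assumed for this sketch: a single datum *z* whose density is a fixed shape *φ* with known scale *c*, centred on a monotonic transform *g*(*θ*) of the parameter.

```latex
\begin{aligned}
p(z \mid \theta) &= \frac{1}{c}\,\phi\!\left(\frac{z - g(\theta)}{c}\right), \\[4pt]
I(\theta) &= \mathrm{E}\!\left[\left(\frac{\partial \ln p(z \mid \theta)}{\partial \theta}\right)^{2}\right]
           = \left(\frac{dg}{d\theta}\right)^{2} \frac{\kappa_{\phi}}{c^{2}},
           \qquad \kappa_{\phi} = \int \frac{\phi'(u)^{2}}{\phi(u)}\,du, \\[4pt]
\pi_{J}(\theta) &\propto \sqrt{I(\theta)} \;\propto\; \left|\frac{dg}{d\theta}\right| .
\end{aligned}
```

Since *κ*_{φ} and *c* do not depend on *θ*, only the derivative of the transform survives. For a log transform of a shifted variable, *g*(*θ*) = ln(*θ* + *a*), this gives a prior proportional to 1/(*θ* + *a*).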

Fitting *t* distributions to the logs of upward-shifted versions of *T*_{A} and *C*_{H} provides likelihood functions for them that give rise, using the Jeffreys prior, to posterior densities that are extremely close fits to the posterior PDFs for *T*_{A} and *C*_{H}. For each of *T*_{A} and *C*_{H}, denoted here by *x*, the likelihood is chosen as

$$L(x) \propto t_{\upsilon}\!\left(\frac{\ln(x + a) - b}{c}\right),$$

and the corresponding Jeffreys prior, being the derivative of the log transform, is

$$\pi(x) \propto \frac{1}{x + a},$$

where *t*_{υ} is the density of a Student’s *t* distribution with *υ* degrees of freedom. Here, (*a*, *b*, *c*, *υ*) are selected to minimize the sum of squared differences between the actual PDF for the variable concerned and the posterior density given by the product of the likelihood and the prior. The same posterior density can also be obtained by sampling from the fitted *t* distribution and then exponentiating and shifting the samples. Figure 3 shows that both ways of deriving posterior PDFs for *T*_{A} and *C*_{H} give such close approximations to the PDF to be matched that the three PDFs are visually indistinguishable. The relevant likelihood functions are also shown (cyan lines). The PDFs are each shifted leftward relative to the corresponding likelihood functions and, at the right, are slightly better constrained than them. That is consistent with each prior being monotonically declining with its parameter value. The variation in each prior reflects the fact that, as the parameter value varies, the likelihood function measures density at the fixed value of the observation, while the posterior measures density at the varying value of the parameter.
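The two routes to a posterior from a shifted log-*t* model (likelihood times Jeffreys prior on a grid, and sampling followed by exponentiating and shifting) can be compared in a short sketch. The parameter values (*a*, *b*, *c*, *υ*) below are hypothetical and are not the fitted values underlying Fig. 3.

```python
import math
import numpy as np

def student_t_pdf(u, dof):
    """Density of a standard Student's t distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma(0.5 * (dof + 1)) - math.lgamma(0.5 * dof)
                - 0.5 * math.log(dof * math.pi))
    return np.exp(log_norm - 0.5 * (dof + 1) * np.log1p(u ** 2 / dof))

def quantile(grid, pdf, prob):
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid))))
    return float(np.interp(prob, cdf / cdf[-1], grid))

# Hypothetical parameters: shift a, location b, scale c, degrees of freedom.
shift_a, loc_b, scale_c, dof = 0.5, math.log(1.3), 0.25, 10

# Route 1: likelihood times Jeffreys prior on a grid (requires x + a > 0).
x = np.linspace(-0.4, 6.0, 6401)
u = (np.log(x + shift_a) - loc_b) / scale_c
likelihood = student_t_pdf(u, dof)
jeffreys_prior = 1.0 / (x + shift_a)          # derivative of log(x + a)
posterior = likelihood * jeffreys_prior
posterior /= np.sum(0.5 * (posterior[1:] + posterior[:-1]) * np.diff(x))

# Route 2: sample the t distribution, then exponentiate and shift.
rng = np.random.default_rng(seed=1)
samples = np.exp(loc_b + scale_c * rng.standard_t(dof, size=200_000)) - shift_a

median_grid = quantile(x, posterior, 0.5)
median_samples = float(np.median(samples))
mode_likelihood = float(x[np.argmax(likelihood)])
mode_posterior = float(x[np.argmax(posterior)])
print(median_grid, median_samples, mode_likelihood, mode_posterior)
```

In this toy case the posterior mode lies below the likelihood mode, which is the leftward shift described in the text for a prior that declines with the parameter value.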

In the case of *C*_{H} only, the underlying data error distributions are available, providing likelihood functions for ΔOHC and Δ*T*_{G} and—since these are assumed independent—upon their multiplication for (ΔOHC, Δ*T*_{G}). A frequentist profile likelihood can then be obtained for *C*_{H} by taking, at each *C*_{H} value, the maximum across all ΔOHC and Δ*T*_{G} combinations whose quotient equals that *C*_{H} value (as is done in the FR05 corrigendum). Profile likelihood is a pragmatic approach that typically provides a reasonable but not exact likelihood (Pawitan 2001, chapter 3.4). The profile likelihood for *C*_{H} closely matches the likelihood obtained from fitting a shifted log*t* distribution: the likelihoods have indistinguishable modes and shapes apart from the profile likelihood being fractionally wider. For consistency, the likelihoods derived from the shifted log*t* distribution fits are used for both *C*_{H} and *T*_{A} in carrying out the Bayesian inference discussed next.

### b. Bayesian inference for *S* based on likelihood functions for *T*_{A} and *C*_{H}

Having derived likelihood functions for *T*_{A} and *C*_{H}, one can now validly apply Bayes’s theorem to derive marginal posterior PDFs for *S* based on various joint prior distributions. In particular, it is of interest to consider the following: a uniform joint prior distribution in *S* and *K*_{υ}; a joint prior equivalent to a uniform initial distribution in *T*_{A} and *C*_{H}; and a computed noninformative prior. Doing so requires conversion of the joint likelihood function and of the noninformative prior (each formed by multiplication, given the assumption of conditional independence of *T*_{A} and *C*_{H}) from *T*_{A}–*C*_{H} to *S*–*K*_{υ} space. Since likelihood functions do not change upon reparameterization, their values are simply restated in terms of *S*–*K*_{υ} coordinates, by interpolating between EBM simulation runs. The joint prior for *S* and *K*_{υ} that is equivalent to a given joint prior for *T*_{A} and *C*_{H} is instead obtained by restating the latter in *S*–*K*_{υ} coordinates and multiplying it by the Jacobian determinant, since priors transform as densities.

The resulting shape of the noninformative Jeffreys prior in *S*–*K*_{υ} space is shown in Fig. 4. It is very highly peaked in the low *S*, low *K*_{υ} corner and has an extremely small value (slightly higher at very low *K*_{υ} values) when *S* is high. The idea that a prior having such a shape a priori assumes an upper bound on climate sensitivity, or discriminates against high sensitivity values, is mistaken. A valid noninformative prior asserts nothing about the value of the parameters concerned. Rather, it primarily represents how informative the data are expected to be about the parameters in different regions of parameter space. That depends both on how responsive the data are to variations in parameter values in different parts of parameter space and on how precise the data are in the corresponding parts of data space. Here, the responsiveness of the combined data variables to the parameters is given by the Jacobian determinant (Fig. 1), while their precision varies across data space, the noninformative prior in *T*_{A}–*C*_{H} space being the product of reciprocals of the shifted data variables.

It may be asked why parameter values falling in a range in which observed quantities are fairly insensitive to parameter changes should be considered improbable relative to the data likelihood value (downweighted by the noninformative prior having a low value) simply for that reason. The answer is that, unless posterior probability is scaled down relative to likelihood in such a parameter range, more probability will be assigned to parameter values within it than corresponds to the probability of the quantities that have been observed having resulted from such parameter values, given their assumed error distributions. Viewing probabilities in terms of CDFs rather than densities makes this clear. For example, suppose what is observed varies linearly with the climate feedback parameter *λ*, the reciprocal of *S*, and thus is insensitive to *S* at high *S* levels, but that *λ* can be estimated sufficiently accurately to put a positive lower bound *λ*_{L} on it at an acceptable confidence level. Then, *S* can be constrained above at *λ*_{L}^{−1} with the same confidence. However, one should be aware that the magnitude of an objective-Bayesian posterior PDF reflects how responsive the observed quantities are to parameter changes in various parameter ranges as well as how likely it was that the observed values would be obtained given each possible parameter value combination. In strongly affected cases care should therefore be taken in interpreting objective-Bayesian posterior PDFs, but the uncertainty ranges derived from them are nevertheless valid.

Marginal posterior PDFs for *S* derived on the selected bases from the joint likelihood function for *T*_{A} and *C*_{H} are shown in Fig. 5. The black line uses the Jacobian determinant to convert the joint posterior PDF for *T*_{A} and *C*_{H} into a joint posterior PDF for *S* and *K*_{υ}, as in Fig. 2. The red line shows the result of applying Bayes’s theorem to the joint likelihood function for *T*_{A} and *C*_{H} using the Fig. 4 noninformative prior, which gives an objective-Bayesian posterior PDF. The green line is the published FR05 PDF stated to be based on a uniform initial distribution in TCR. These three posterior PDFs are almost identical. Indeed, the black and red line PDFs would be identical in the absence of discretization errors. The identity of these two PDFs confirms that the highly peaked Fig. 4 noninformative prior does not convey an initial assumption that *S* is very low but rather has the shape required to achieve objective inference about *S*.

The coral line in Fig. 5 assumes a uniform initial distribution in TCR/attributable warming and effective heat capacity, using the same weighting method as for the green line but weighting *S* values by the joint likelihood function for *T*_{A} and *C*_{H} rather than by their joint posterior density: that is, carrying out Bayesian inference using a uniform joint prior for *T*_{A} and *C*_{H} or sampling uniformly in the observables. The form of the prior used corresponds to that in Fig. 1 rather than in Fig. 4. The resulting PDF is significantly worse constrained above than are the black, red, and green PDFs. The reason is that, although assuming a uniform joint prior in *T*_{A} and *C*_{H} effectively incorporates the Jacobian determinant factor necessary for correctly converting a joint density in *T*_{A}–*C*_{H} space to one in *S*–*K*_{υ} space, it does not also provide a noninformative joint prior for *T*_{A} and *C*_{H} in their own space.

If errors in estimating *T*_{A} and *C*_{H} were Gaussian, *t* distributed, or otherwise independent of the values of those variables so that *T*_{A} and *C*_{H} themselves were location parameters, uniform priors in *T*_{A}–*C*_{H} space would be noninformative for those variables and their resulting objective posterior PDFs would have the same shapes as their likelihood functions. In such a case, sampling uniformly in the observables would provide an objective-Bayesian posterior PDF for *S*. However, since the *t* distribution applies to log transforms of *T*_{A} and *C*_{H}, suitably shifted, a uniform prior is not noninformative, the likelihood functions for *T*_{A} and *C*_{H} are not identical to their objective posterior PDFs, and sampling uniformly in the observables does not provide an objective posterior PDF for *T*_{A} and *C*_{H}.

The dashed gray line in Fig. 5 is the same as the gray line in Fig. 2 and is shown for the purposes of comparison. The blue line shows the posterior PDF resulting from applying Bayes’s theorem to the joint likelihood function for *T*_{A} and *C*_{H} using a uniform joint prior in *S* and *K*_{υ}. It is substantially worse constrained even than the dashed gray PDF, since the latter effectively employs noninformative priors for inferring posterior PDFs for *T*_{A} and *C*_{H} from their likelihoods whereas the blue line does not.

Although under a subjective Bayesian approach all the prior distributions considered here are in principle acceptable, it would be incorrect to argue—based on deriving a climate sensitivity PDF using a prior that samples uniformly in *S* and *K*_{υ}—that we cannot from the data available rule out high sensitivity, high heat uptake cases that are consistent with but nonlinearly related to twentieth-century observations. All high sensitivity, high heat uptake cases would have given rise to an increase in ocean heat content that is inconsistent with observations at a high confidence level and so can logically be ruled out (with the implication that use of a uniform in *S* and *K*_{υ} prior is inappropriate).
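The effect of the prior on the upper tail can be illustrated with a one-parameter toy problem, in which the datum is linear in the feedback parameter *λ* = 1/*S* with a Gaussian error. The central value and standard error below are hypothetical, and the single-parameter setting is a simplification of the joint *S*–*K*_{υ} inference discussed above.

```python
import numpy as np

def quantile(grid, pdf, prob):
    """Quantile of an unnormalized PDF tabulated on an increasing grid."""
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(grid))))
    return float(np.interp(prob, cdf / cdf[-1], grid))

LAMBDA_OBS, LAMBDA_SE = 1.2, 0.3          # hypothetical estimate of lambda = 1/S

s = np.linspace(0.1, 20.0, 19901)         # parameter grid for S
likelihood = np.exp(-0.5 * ((1.0 / s - LAMBDA_OBS) / LAMBDA_SE) ** 2)

# Uniform prior in S versus the Jeffreys prior |d(1/S)/dS| = 1/S^2.
posterior_uniform = likelihood * np.ones_like(s)
posterior_jeffreys = likelihood / s ** 2

q95_uniform = quantile(s, posterior_uniform, 0.95)
q95_jeffreys = quantile(s, posterior_jeffreys, 0.95)

# Upper bound implied by the 5% lower bound on lambda (normal quantile 1.6449).
q95_from_lambda = 1.0 / (LAMBDA_OBS - 1.6448536 * LAMBDA_SE)
print(q95_uniform, q95_jeffreys, q95_from_lambda)
```

In this toy case the Jeffreys-prior posterior reproduces the bound obtained by inverting the lower bound on *λ*, whereas the uniform-in-*S* posterior has a higher 95% point that also moves with the arbitrary upper limit of the grid.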

### c. Profile likelihood inference for *S*

Figure 5 also shows, in cyan, confidence intervals (CIs) for *S* derived by employing the frequentist signed-root-(log-)likelihood-ratio (SRLR) profile likelihood method on the joint likelihood for *T*_{A} and *C*_{H}, restated in *S*–*K*_{υ} coordinates. Only a box plot is shown, since profile likelihoods are not comparable with PDFs. The SRLR method, which depends on an asymptotic normal approximation to the probability distribution involved, provides CIs for individual parameters from their joint likelihood function. The method is exact in cases involving a normal distribution or a transformed normal distribution. Unlike objective-Bayesian approaches, it does not require computation of a noninformative prior. The SRLR method was employed in Allen et al. (2009). That study used exactly the same approach as set out here, of carrying out many EBM simulations and comparing their results with estimates of *T*_{A} and *C*_{H}. However, it used parameterized lognormal distributions for those variables rather than estimating actual distributions, correctly calculating from them the likelihood functions required for its profile likelihood inference.
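A minimal sketch of SRLR-based confidence intervals is given below for a toy joint likelihood in which one datum is linear in 1/*S* and another in *K*, both with Gaussian errors. All numerical values, and the function name `srlr_interval`, are assumptions for illustration; the interpolation step assumes a unimodal profile likelihood.

```python
import numpy as np
from statistics import NormalDist

def srlr_interval(param_grid, profile_log_lik, confidence=0.90):
    """Confidence interval from the signed root of the (log-)likelihood ratio."""
    i_max = int(np.argmax(profile_log_lik))
    deviance = 2.0 * (profile_log_lik[i_max] - profile_log_lik)
    signed_root = np.sign(param_grid - param_grid[i_max]) * np.sqrt(deviance)
    z = NormalDist().inv_cdf(0.5 + 0.5 * confidence)
    # signed_root increases with the parameter for a unimodal profile likelihood.
    lower = float(np.interp(-z, signed_root, param_grid))
    upper = float(np.interp(z, signed_root, param_grid))
    return lower, upper

# Hypothetical data: one datum linear in lambda = 1/S, one linear in K.
LAMBDA_OBS, LAMBDA_SE = 1.2, 0.3
K_OBS, K_SE = 1.5, 0.4

s_values = np.linspace(0.1, 20.0, 3981)
k_values = np.linspace(0.2, 4.0, 191)
s_grid, k_grid = np.meshgrid(s_values, k_values, indexing="ij")
joint_log_lik = (-0.5 * ((1.0 / s_grid - LAMBDA_OBS) / LAMBDA_SE) ** 2
                 - 0.5 * ((k_grid - K_OBS) / K_SE) ** 2)

# Profile out the nuisance parameter K by maximizing over it at each S.
profile_log_lik = joint_log_lik.max(axis=1)
ci_90 = srlr_interval(s_values, profile_log_lik, confidence=0.90)
print(ci_90)
```

Because the toy likelihood is a transformed normal in *S*, the interval agrees with the one obtained by inverting the normal interval for *λ*, in line with the statement that the method is exact in such cases.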

Credible intervals from the objective-Bayesian PDF for *S* derived from the posterior PDFs for *T*_{A} and *C*_{H}, using the standard method for converting PDFs on a transformation of variables (shown by the black box plot in Fig. 5), may properly be used to judge profile likelihood SRLR-derived CIs, since the posterior PDFs for *T*_{A} and *C*_{H} are objectively and exactly derived from the original data used as the basis for inference. Although the SRLR method gives CIs that correspond fairly closely to credible intervals implied by the posterior PDF for *S*, the agreement is not exact. Part of the difference may reflect the fits of the transformed *t* distributions not being exact. But most of it likely relates to the shapes of the posterior PDFs for *T*_{A} and *C*_{H} being best matched by transforming *t* distributions with modest degrees of freedom (10 for *T*_{A} and 17 for *C*_{H}), which is consistent with neither of those PDFs closely corresponding to a transform of a normal distribution.

An illuminating comparison [in a case corresponding to that in Allen et al. (2009)] between profile likelihood and Bayesian inference based on various priors, including Jeffreys prior, is given in Rowlands (2011), which also contains other analyses of relevance. It found nearer identity between profile likelihood–derived CIs and Bayesian credible intervals derived using Jeffreys prior than here. Since the SRLR profile likelihood method is based on a normal approximation, it performs very well where the likelihood functions are normal or (as in Rowlands 2011) transformed normal distributions. Where data–parameter relationships are strongly nonlinear and/or data uncertainties have unsuitable forms, a simple SRLR method may not always provide acceptably accurate CIs. In such cases one of the various modified versions of the SRLR method may be employed in order to improve accuracy (see, e.g., Cox and Reid 1987). Modified profile likelihood–based inference has close links to objective-Bayesian inference and often gives results closely corresponding to those from objective-Bayesian marginal posterior PDFs (Bernardo and Smith 1994, their section 5.5 and appendix B.4).

## 5. Discussion

The various methods for estimating climate sensitivity from the data used in FR05 and the resulting estimates are summarized in Table 1. Using objective methods and with the ΔOHC error corrected, a median estimate of 2.2 K and 5%–95% bounds of 1.2–4.5 K (uncorrected: 2.4 K and 1.2–5.2 K) are obtained using the transformation-of-variables method, based on the initial objective-Bayesian posterior PDFs for *T*_{A} and *C*_{H}. Although this estimate is objective given the data, the data it reflects are somewhat dated and may have shortcomings. For instance, Gillett et al. (2012) found that using (as in this case) temperature data spanning just the twentieth century, where the first two decades were anomalously cool, produced a high estimate for attributable warming, which would bias the estimation of *S* upward. However, the main object of this paper is to illustrate the relative effects of various methods of inference for climate sensitivity, which are unaffected by such bias.

Table 1. Summary of different methods of estimating climate sensitivity *S* from the data FR05 used (with the ΔOHC error uncorrected unless otherwise stated) and their results. For consistency, all computations using the uncorrected ΔOHC data are based on the fitted parameterized distributions. Integration out of *K*_{υ} from the joint (*S*, *K*_{υ}) posterior to obtain a marginal posterior for *S* is taken as read.

Since the best estimated PDF for a parameter value does not depend on the use to which that estimate is put (Bernardo and Smith 1994, their section 3.4; Bernardo 2009), one should use the same prior assumptions for estimating climate sensitivity, irrespective of the purpose for which the resulting PDF is used. Differing loss functions associated with differing purposes may lead to the same probabilistic estimate then being used in different ways.

Most climate sensitivity studies have embodied a subjective Bayesian perspective, in which probability represents a personal degree of belief as to uncertainty and prior distributions represent subjective assumptions of the investigators. However, for scientific reporting, it is usual to assume no prior information as to the value of unknown parameters being estimated in an experiment. To achieve that using Bayesian methods, a noninformative prior distribution must be mathematically derived from the assumed statistical model. That corresponds to an objective-Bayesian approach, the results of which, like frequentist results, depend only on the assumed model and the data obtained (Bernardo 2009). A proposal for adapting the objective-Bayesian approach to allow for incorporation of probabilistic prior information, represented as if derived from data, is set out in Lewis (2013a).

In cases where a prior that is noninformative for inference about the observables is easily identified and the observables are independent (or can be transformed to be independent, as by whitening in optimal fingerprint methods), an objectively correct joint posterior PDF for the observables may be derived and hence, via a transformation of variables, a joint posterior PDF for the parameters may be computed. In many cases it may be easier to adopt an objective-Bayesian approach by performing Bayesian inference in observation space and then effecting a transformation of variables than by deriving a noninformative joint prior for the parameters. The climate model simulations that are generally performed can be used to convert the joint posterior PDF from observation space to parameter space through a transformation of variables. The objectivity of the inference procedure may be more obvious with such a two-stage procedure than when the parameters are inferred directly from the observables’ likelihoods using Bayes’s theorem with a highly nonuniform noninformative prior. In some cases, as here, the available data more readily provide objective posterior PDFs for the observables than accurate likelihood functions for them, so the use of a transformation-of-variables approach is particularly advantageous.

The transformation-of-variables approach can be employed even where the model used is not deterministic, provided a location parameter relationship applies between actual and true observables, since the effects of model noise on simulated observables and measurement error plus internal climate variability on actual observables are then equivalent. Where, as in the case considered here, the dimensionality of the observables and the parameters is the same, the PDF conversion factor is the absolute Jacobian determinant. The transformation-of-variables approach should produce an equivalent result to computing a noninformative joint prior for the parameters (Lewis 2013b).
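The equal-dimension case can be illustrated with a minimal Python sketch. It assumes independent Gaussian posterior PDFs for two observables and a smooth, one-to-one forward map from two parameters to those observables. The toy map, the grids, and the numerical values are hypothetical stand-ins, not the FR05 energy balance model or its data. The joint density for the parameters is the observables' posterior density evaluated at the simulated observables, multiplied by the absolute Jacobian determinant, which is estimated here by central finite differences.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Gaussian density, standing in for the observables' posterior PDFs."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def trapezoid(values, grid, axis=-1):
    """Trapezoidal integration of `values` over `grid` along one axis."""
    v = np.moveaxis(np.asarray(values), axis, -1)
    return np.sum(0.5 * (v[..., 1:] + v[..., :-1]) * np.diff(grid), axis=-1)

def parameter_density(forward, s_grid, k_grid, obs_mean, obs_sd, eps=1e-5):
    """Joint (S, K) density obtained by a transformation of variables."""
    S, K = np.meshgrid(s_grid, k_grid, indexing="ij")
    t, c = forward(S, K)
    # Central finite differences for the 2x2 Jacobian d(t, c)/d(S, K).
    t_sp, c_sp = forward(S + eps, K)
    t_sm, c_sm = forward(S - eps, K)
    t_kp, c_kp = forward(S, K + eps)
    t_km, c_km = forward(S, K - eps)
    dt_ds, dc_ds = (t_sp - t_sm) / (2 * eps), (c_sp - c_sm) / (2 * eps)
    dt_dk, dc_dk = (t_kp - t_km) / (2 * eps), (c_kp - c_km) / (2 * eps)
    abs_det = np.abs(dt_ds * dc_dk - dt_dk * dc_ds)
    # Observation-space density at the simulated observables, times |det J|.
    p_obs = normal_pdf(t, obs_mean[0], obs_sd[0]) * normal_pdf(c, obs_mean[1], obs_sd[1])
    return p_obs * abs_det

def toy_forward(s, k):
    """Hypothetical map, one-to-one for positive S and K (illustration only)."""
    root_k = np.sqrt(k)
    return s / (1.0 + 0.5 * s * root_k), root_k * (1.0 + 0.2 * s)

s_grid = np.linspace(0.2, 12.0, 591)
k_grid = np.linspace(0.05, 4.0, 396)
joint = parameter_density(toy_forward, s_grid, k_grid,
                          obs_mean=(0.9, 1.2), obs_sd=(0.15, 0.2))

# Integrate out K, renormalise on the finite grid, and read off the median of S.
marginal_s = trapezoid(joint, k_grid, axis=1)
marginal_s = marginal_s / trapezoid(marginal_s, s_grid)
steps = 0.5 * (marginal_s[1:] + marginal_s[:-1]) * np.diff(s_grid)
cdf_s = np.concatenate(([0.0], np.cumsum(steps)))
median_s = float(np.interp(0.5, cdf_s, s_grid))
print(f"median S on the toy problem: {median_s:.2f}")
```

The Jacobian factor is computed on the full (*S*, *K*) grid before marginalizing, since it depends on both parameters jointly.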

If there are more observables than parameters, a dimensionally reducing transformation-of-variables PDF conversion formula may be used, after whitening the observables. To facilitate doing so, it may be possible to transform individual observable variables having non-Gaussian (e.g., lognormal) uncertainty so that their distributions are at least approximately Gaussian, although transformations estimated from data are themselves uncertain.
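As a sketch of the kind of conversion formula meant here, assume *n* whitened observables *x*(θ), simulated as a smooth function of *m* < *n* parameters θ, with *J* the *n* × *m* Jacobian of *x* with respect to θ. The symbols *x*, θ, *J*, *n*, and *m* are introduced for illustration only and are not notation from FR05. Under those assumptions, one commonly used form replaces the absolute Jacobian determinant of the equal-dimension case by the square root of the determinant of *J*ᵀ*J*:

```latex
p_{\theta}(\theta) \;\propto\; p_{x}\bigl(x(\theta)\bigr)\,
  \left[\det\!\left(J(\theta)^{\mathsf{T}} J(\theta)\right)\right]^{1/2},
\qquad
J_{ij}(\theta) = \frac{\partial x_i}{\partial \theta_j},
\quad i = 1,\dots,n,\; j = 1,\dots,m .
```

When *n* = *m* the factor reduces to the absolute Jacobian determinant. For whitened observables with Gaussian errors of fixed variance, *J*ᵀ*J* is the Fisher information for θ, so the same factor is the Jeffreys prior for the parameters.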

Whether objective-Bayesian inference is undertaken directly or using a transformation-of-variables approach, it should be appreciated that the prior or PDF conversion factor is a function of all parameters (here *S* and *K*_{υ}) jointly. It does not vary with the parameter of interest alone.

Objective-Bayesian methods are not a universal, perfect solution to the issue of quantifying uncertainty or the only objective approach to estimating climate sensitivity. Nor, when there are multiple parameters, is Jeffreys prior always the best noninformative prior to use for marginal parameter inference (see, e.g., Bernardo and Smith 1994, their sections 5.4 and 5.6). There is always merit in reporting likelihood functions and prior distributions as well as posterior PDFs, which enables the effect of the prior to be seen, and in appropriate cases exploring the effects of different prior distributions may be helpful. Exploring data error assumptions, which affect the shape of noninformative priors used for objective-Bayesian inference as well as the likelihood functions, is also advisable.

Profile likelihood methods of varying complexity offer a viable objective alternative to Bayesian methods where likelihood information about sensitivity jointly with other climate system parameters is obtained. The basic SRLR profile likelihood method has the advantage of being simpler to apply than objective-Bayesian approaches but may be less accurate. It is worth undertaking even if Bayesian methods are used, as it provides a cross-check on uncertainty bounds given by Bayesian credible intervals. That profile likelihood only provides CIs is not a real drawback, since, if one accepts the Bayesian paradigm but not its methods, a PDF may be obtained by computing one-sided CIs at all values of *S* and differentiating. However, for many purposes the combination of a best estimate (50th percentile or likelihood peak) and uncertainty ranges may provide as much useful information as a PDF does and is less susceptible to misinterpretation.
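Differentiating one-sided CIs to obtain a PDF can be sketched as follows, assuming the basic SRLR normal approximation and a (profile) log-likelihood tabulated on a grid of *S*. The one-parameter model `g`, the observation, and its error are hypothetical values chosen for illustration. The one-sided confidence level at each *S* is computed from the signed root likelihood ratio and then differentiated numerically.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF, vectorised via math.erf (no SciPy needed)."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(z) / math.sqrt(2.0)))

def confidence_density(s_grid, profile_loglik):
    """One-sided SRLR confidence levels on a grid of S, and their derivative."""
    i_hat = int(np.argmax(profile_loglik))
    # The grid maximum stands in for the true maximum, so the grid should
    # resolve the likelihood peak finely (ideally with the peak on a grid point).
    r = np.sign(s_grid[i_hat] - s_grid) * np.sqrt(
        2.0 * (profile_loglik[i_hat] - profile_loglik))
    confidence = normal_cdf(-r)               # confidence that S lies below each value
    density = np.gradient(confidence, s_grid) # differentiate with respect to S
    return confidence, density

def g(s):
    """Hypothetical saturating relationship between S and one observable."""
    return 2.0 * s / (1.0 + 0.5 * s)

y_obs, sigma = 1.5, 0.3                       # assumed observation and Gaussian error
s_grid = np.linspace(0.05, 10.0, 1991)
loglik = -0.5 * ((y_obs - g(s_grid)) / sigma) ** 2

confidence, density = confidence_density(s_grid, loglik)
median_s = float(np.interp(0.5, confidence, s_grid))
print(f"50th percentile of S on the toy problem: {median_s:.2f}")
```

In this one-parameter Gaussian example the resulting curve equals the likelihood multiplied by |d*g*/d*S*|/σ, which is what a Jeffreys prior would give, so the two routes coincide. With strongly non-Gaussian errors or nuisance parameters the agreement would only be approximate.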

Whatever method of inference is used, there are many other subjective choices to be made, such as the observables, datasets, error distribution assumptions, and model, which will all affect the results. However, the objective-Bayesian approach does offer a solution to the issue of the relevant prior to use when estimating climate sensitivity.

## Acknowledgments

I thank Myles Allen for providing FR05 data and draft code; Dan Rowlands for helpful discussions; and Judith Curry, Steven Mosher, and four reviewers for helpful comments.

## REFERENCES

Allen, M. R., D. J. Frame, C. Huntingford, C. D. Jones, J. A. Lowe, M. Meinshausen, and N. Meinshausen, 2009: Warming caused by cumulative carbon emissions towards the trillionth tonne. *Nature*, **458**, 1163–1166, doi:10.1038/nature08019.

Andrews, D. G., and M. R. Allen, 2008: Diagnosis of climate models in terms of transient climate response and feedback response time. *Atmos. Sci. Lett.*, **9**, 7–12, doi:10.1002/asl.163.

Bayes, T., 1763: An essay towards solving a problem in the doctrine of chances. *Philos. Trans. Roy. Soc. London*, **53**, 370–418, doi:10.1098/rstl.1763.0053.

Bernardo, J. M., 2009: Modern Bayesian inference: Foundations and objective methods. *Philosophy of Statistics*, P. Bandyopadhyay and M. Forster, Eds., Elsevier, 263–306.

Bernardo, J. M., and A. F. M. Smith, 1994: *Bayesian Theory*. Wiley, 608 pp.

Bindoff, N. L., et al., 2014: Detection and attribution of climate change: From global to regional. *Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 867–952.

Cox, D. R., and N. Reid, 1987: Parameter orthogonality and approximate conditional inference. *J. Roy. Stat. Soc.*, **49B**, 1–39.

Datta, G. S., and T. J. Sweeting, 2005: Probability matching priors. *Bayesian Thinking, Modeling and Computing*, D. K. Dey and C. R. Rao, Eds., Vol. 25, *Handbook of Statistics*, Elsevier, 91–114.

Frame, D. J., B. B. B. Booth, J. A. Kettleborough, D. A. Stainforth, J. M. Gregory, M. Collins, and M. R. Allen, 2005: Constraining climate forecasts: The role of prior assumptions. *Geophys. Res. Lett.*, **32**, L09702, doi:10.1029/2004GL022241; Corrigendum, **41**, 3257–3258, doi:10.1002/2014GL059853.

Gillett, N. P., V. K. Arora, G. M. Flato, J. F. Scinocca, and K. von Salzen, 2012: Improved constraints on 21st-century warming derived using 160 years of temperature observations. *Geophys. Res. Lett.*, **39**, L01704, doi:10.1029/2011GL050226.

Gordon, C., C. Cooper, C. A. Senior, H. Banks, J. M. Gregory, T. C. Johns, J. F. B. Mitchell, and R. A. Wood, 2000: The simulation of SST, sea ice extents and ocean heat transports in a version of the Hadley Centre coupled model without flux adjustments. *Climate Dyn.*, **16**, 147–168, doi:10.1007/s003820050010.

Gregory, J., R. J. Stouffer, S. C. B. Raper, P. A. Stott, and N. A. Rayner, 2002: An observationally based estimate of the climate sensitivity. *J. Climate*, **15**, 3117–3121, doi:10.1175/1520-0442(2002)015<3117:AOBEOT>2.0.CO;2.

Hartigan, J. A., 1965: The asymptotically unbiased prior distribution. *Ann. Math. Stat.*, **36**, 1137–1152, doi:10.1214/aoms/1177699988.

Hegerl, G. C., et al., 1996: Detecting greenhouse-gas-induced climate change with an optimal fingerprint method. *J. Climate*, **9**, 2281–2306, doi:10.1175/1520-0442(1996)009<2281:DGGICC>2.0.CO;2.

Hegerl, G. C., et al., 2007: Understanding and attributing climate change. *Climate Change 2007: The Physical Science Basis*, S. Solomon et al., Eds., Cambridge University Press, 663–745.

Jeffreys, H., 1946: An invariant form for the prior probability in estimation problems. *Proc. Roy. Soc. London*, **186A**, 453–461, doi:10.1098/rspa.1946.0056.

Levitus, S., J. Antonov, and T. Boyer, 2005: Warming of the World Ocean, 1955–2003. *Geophys. Res. Lett.*, **32**, L02604, doi:10.1029/2004GL021592.

Lewis, N., 2013a: Modification of Bayesian updating where continuous parameters have differing relationships with new and existing data. arXiv Rep., 25 pp. [Available online at http://arxiv.org/ftp/arxiv/papers/1308/1308.2791.pdf.]

Lewis, N., 2013b: An objective Bayesian improved approach for applying optimal fingerprint techniques to estimate climate sensitivity. *J. Climate*, **26**, 7414–7429, doi:10.1175/JCLI-D-12-00473.1.

Mardia, K. V., J. T. Kent, and J. M. Bibby, 1979: *Multivariate Analysis*. Academic Press, 518 pp.

Pawitan, Y., 2001: *In All Likelihood: Statistical Modeling and Inference Using Likelihood*. Oxford University Press, 514 pp.

Rowlands, D. J., 2011: Quantifying uncertainty in projections of large scale climate change. Ph.D. thesis, Oxford University, 195 pp.

Sokolov, A. P., C. E. Forest, and P. H. Stone, 2003: Comparing oceanic heat uptake in AOGCM transient climate change experiments. *J. Climate*, **16**, 1573–1582, doi:10.1175/1520-0442-16.10.1573.

Stott, P. A., and J. A. Kettleborough, 2002: Origins and estimates of uncertainty in predictions of twenty-first century temperature rise. *Nature*, **416**, 723–726, doi:10.1038/416723a.

Stott, P. A., S. F. B. Tett, G. S. Jones, M. R. Allen, W. J. Ingram, and J. F. B. Mitchell, 2001: Attribution of twentieth century temperature change to natural and anthropogenic causes. *Climate Dyn.*, **17**, 1–21, doi:10.1007/PL00007924.