Emergent constraints underreport uncertainty and are based on strong, unrealistic statistical assumptions, but they need not be. We show how to weaken the assumptions and quantify important uncertainties while retaining the simplicity of the framework.
Emergent constraints have become a popular and controversial topic within the climate science community over recent years (Hall and Qu 2006; Wenzel et al. 2016; Cox et al. 2018). For some policy-relevant quantity that we cannot observe now, for example equilibrium climate sensitivity (ECS), researchers seek to discover whether there are observations we can make now that would quantify or constrain our uncertainty in that quantity.
To answer this question, the community has looked to the ensembles of the Coupled Model Intercomparison Projects CMIP3 (Meehl et al. 2007) and CMIP5 (Taylor et al. 2012), and now CMIP6 (Eyring et al. 2016). The idea is to find a (typically linear) “emergent” relationship across the models between the quantity of interest (QoI; e.g., ECS) and something that can be measured. For example, Hall and Qu (2006) found that the current seasonal cycle had a linear relationship with snow albedo feedback in CMIP models. Cox et al. (2018) relate ECS to a particular metric of climate variability. Once such a relationship is found, it is estimated from the models via regression. Observations from the real world, coupled with the regression, then produce a constraint on the QoI in reality.
There are a number of reasons that this practice has caused controversy. One is the way in which the constraints are found. Some use physical reasoning to argue that we should expect a linear relationship between model quantities, and then look to confirm this through the ensemble (e.g., Cox et al. 2018). Others have suggested data mining be used to find them (e.g., Karpechko et al. 2013). Hall et al. (2019) highlight the importance of understanding the physical basis for emergent relationships. We discuss these ideas later. Another source of controversy is the simplicity of the treatment compared with the complexity of the models and the quantities of interest. The argument is that the observed relationships are not emergent from the physics (and hence predictive), but rather a result of the interaction of many different processes, well captured in the models or not, which must be better understood in order to say anything about reality. A final concern is that emergent constraints actually underestimate uncertainty. Several authors have attempted to quantify the effect of uncertainty in the observations themselves without a formal statistical framework (e.g., Brient and Schneider 2016; Wenzel et al. 2016; Cox et al. 2018). Bowman et al. (2018) constructed a statistical framework for emergent constraints that properly accounts for uncertainty in the observations, but it neglects other sources that we seek to address here.
In this paper we will explain the underpinning statistical assumptions and judgements that lead to the existing emergent constraints model. We will highlight the different sources of uncertainty that should be present when finding emergent constraints and show where they can enter the usual framework. We will argue for a simple generalization of existing methods that allows hitherto neglected uncertainties to be quantified, and then compare results from this extended model to existing results in the literature. Our goal is to set out the statistical assumptions that underpin existing emergent constraints and then place them in a more general framework that allows the assumptions of any emergent constraint analysis to be transparently understood. Our framework highlights all sources of uncertainty and offers methodology for guided quantification of these additional uncertainty sources. To accompany the paper we present an open-source software tool, capable of fitting the general emergent constraints model to user-supplied data, that allows users to explore the effects of all sources of uncertainty on the analysis. Whether the statistical assumptions themselves are valid for any particular emergent constraint, or at all when using CMIP and observations in this way, is a question for the climate community to resolve. This paper and its accompanying software can help to frame this discussion.
In the second section we present the strong statistical assumptions behind emergent constraints and generalize the framework by weakening them. We show where key uncertainties have been ignored and how they can be quantified going forward. In the third section we apply the generalized framework to the emergent constraint on ECS recently presented by Cox et al. (2018) to demonstrate the effect of acknowledging additional sources of uncertainty. In the fourth section we discuss quantifying these additional sources of uncertainty and present a default guided specification, which is available through our software tool. In the fifth section we apply the new framework to a collection of constraints on ECS from the literature and discuss the interpretation of different emergent constraints analyses for the same quantity. The final section contains a discussion. The appendix contains some of the mathematical results used to derive our more general framework. The software tool and user instructions are available at https://github.com/ps344/emergent-constraints-shiny.
EXCHANGEABILITY AND EMERGENT CONSTRAINTS.
Ordinary least squares and classical regression.
Least squares estimates of the regression coefficients β minimize the sum of squared deviations of the model responses from the fitted line βᵀx, with the residual variance σ² estimated from the resulting residuals.
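As a concrete illustration of this classical calculation, the following R sketch fits ordinary least squares across the models and then predicts reality at an observed value of the predictor with a 66% prediction interval. The ensemble values here are made up for illustration; they are not CMIP output.

```r
# Hypothetical ensemble values standing in for model output (not CMIP data).
psi <- c(0.18, 0.25, 0.31, 0.12, 0.22, 0.28, 0.35, 0.16)  # model predictors (K)
ecs <- c(2.6, 3.1, 3.8, 2.2, 2.9, 3.4, 4.1, 2.5)          # model responses, ECS (K)

fit <- lm(ecs ~ psi)        # least squares estimates of intercept and slope
summary(fit)$sigma          # residual standard deviation (estimate of sigma)

# Plug in the observed predictor (treated as exact) for a 66% prediction interval.
predict(fit, newdata = data.frame(psi = 0.13),
        interval = "prediction", level = 0.66)
```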
There are two ways to treat the models so that fitting this type of regression would make sense, and we will argue that, when unpacked, neither stands up to scrutiny. The first is to assume the existence of a large population of models from which we obtain independent random samples through CMIP. Reality is then another independent random draw from the same population that the models come from. Lack of independence is well documented across climate models, so that even if we did believe in the existence of such a population, we would be sampling a narrow part of it and the regression model would simply not hold. If the model is right but the sample is biased, we cannot conclude anything about the model parameters, and hence about the underlying population of models, without modeling the bias specifically. We know there is no “random sample”; the models in CMIP were specifically designed. That reality should be an independent draw from the population of models with the same error structure is indefensible and against everything we know about models and their relationship to reality. But what does the population-of-models argument mean anyway? What counts as a model from the population? Is there a resolution dependence, or a modeled-process dependence? Does the population include future models at new resolutions we cannot currently run? These questions have yet to be addressed.
A second way to treat the models that does not require a large population sampled independently would be to assume that the models themselves are random. For this interpretation, uncertainty arises through the random nature of the climate model as it deviates from the line βᵀx. As the models are deterministic, this randomness can only come from initial condition uncertainty, leading us to view the deviation as the result of observing a random point on each model’s attractor, with the line representing the mean of the attractor as it changes with x. Note this implies every model’s attractor has the same “variability” (σ²), a claim that is difficult to defend.
More natural is a Bayesian approach in which we acknowledge that, before we observe the models, we are uncertain as to what their xi and yi values will be, just as we are uncertain about the corresponding x* and y* values for reality. We do not need to view any of these values as random and coming from some distribution; they can be fixed and deterministic. To quantify uncertainty through probabilities, the key concept here is the prior judgement of exchangeability between the responses given the predictors. Exchangeability is a weak assumption that amounts to indifference over labels (de Finetti 1974, 1975). Here it says that, for any i, j, we think that no information about the pairs (yi, xi) and (yj, xj) is encoded in their labels i and j. Hence, if xi and xj took the same value, our distribution for yi and yj would be the same a priori.
Here the i and j are the labels for the different climate models, so applying this assumption for an emergent constraint means that, if the value of the predictor turned out to be the same for any subset of models, there is nothing else that we know about those models that would lead us to change our distribution for the response before seeing the model responses. On the other hand, a view that a particular model better represented various processes might break exchangeability, if, that is, it could be articulated how, for a given x, the better representation of processes would change our view of the distribution for y|x. For example, we might think feedbacks were captured that raised or lowered the expectation for y compared with a model with a poorer representation. The key difference between classical independence and Bayesian exchangeability is that the former is a property of the models and the way they are chosen, while the latter is a property of the beliefs of the analyst before the data from the models have been observed.
Emergent constraints and exchangeable reality.
The standard procedure in the emergent constraints literature is to assume reality, y*, follows the same regression as the other models. From the statistical view we have given, this implies that y* is assumed to be exchangeable with all of the climate models given x*. Usually x* is taken to be the observed predictor [though Wenzel et al. (2016) and Cox et al. (2018) numerically integrate out variability in x* and Bowman et al. (2018) provide a framework that includes modeling x* explicitly, as we will later], and then the regression is used to predict y* and calculate prediction intervals.
Taking the stronger classical version of this assumption first, reality is assumed to be an independent draw from the same distribution that the models were drawn from. This is the strongest possible form of assumption linking models and reality and does not seem defensible, or necessary given that it implies the weaker exchangeability assumption that we shall argue against below.
Rather than assume reality is an independent draw from the distribution of the models, we could assume conditional exchangeability of y* given x* with the yi given xi. This would amount to the view that there are no processes systematically missing from the models, but present in reality, that might cause us to view the behavior of the real world to be distinguishable from that of the models. Rougier et al. (2013) dismissed this idea out of hand, yet it is the weakest form of the key assumption driving the calculations currently performed for emergent constraints. We propose a general framework to aid our discussion of the issues.
The current exchangeability between models and reality assumed within the literature is recovered if the extra sources of uncertainty we introduce below (Σβ* and ξ*) are set to zero.
A complete framework for emergent constraints.
In any particular problem, we specify the rest of our prior uncertainty through the quantities Σβ*, ξ*, µx, and σx in Eqs. (3), (5), and (7), together with prior distributions for the model regression parameters β and σ.
The code we have provided with this paper samples from this distribution and is sufficiently flexible that any of the distributional assumptions we have made (such as the use of Normal and Half-Normal distributions) can be easily altered if required. The app we have provided allows users to add their own emergent constraint data and to experiment with the different sources of uncertainty for themselves. What follows is an illustration of these ideas through a reexamination of the Cox et al. (2018) constraint accounting for different levels of uncertainty.
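To make the structure concrete, the R sketch below draws samples in the way the framework suggests, under assumed distributional forms (a Normal for β*|β, a folded Normal for σ*|σ, and a Normal prior on x* updated by the observation z); the exact forms in Eqs. (2)–(7) and in the accompanying implementation may differ in detail, and the function and argument names are illustrative. Setting Σβ* and ξ* to zero recovers the reference (standard) emergent constraints calculation.

```r
# Rough Monte Carlo sketch of the generalized emergent constraints sampling.
sample_ec <- function(x, y, z, sigma_z, mu_x, sigma_x,
                      Sigma_beta_star = diag(0, 2), xi_star = 0,
                      n_samp = 10000) {
  n <- length(y)
  X <- cbind(1, x)
  XtX_inv <- solve(crossprod(X))
  beta_hat <- XtX_inv %*% crossprod(X, y)
  s2 <- sum((y - X %*% beta_hat)^2) / (n - 2)

  replicate(n_samp, {
    # Model regression parameters from the reference-prior posterior.
    sigma2 <- (n - 2) * s2 / rchisq(1, df = n - 2)
    beta <- beta_hat + t(chol(sigma2 * XtX_inv)) %*% rnorm(2)

    # "Reality" parameters; the tiny jitter keeps chol() valid when the matrix is zero.
    beta_star <- beta + t(chol(Sigma_beta_star + diag(1e-12, 2))) %*% rnorm(2)
    sigma_star <- abs(rnorm(1, sqrt(sigma2), xi_star))   # folded Normal about sigma

    # x* given its prior and the observation z (conjugate Normal update).
    prec <- 1 / sigma_x^2 + 1 / sigma_z^2
    x_star <- rnorm(1, (mu_x / sigma_x^2 + z / sigma_z^2) / prec, sqrt(1 / prec))

    # Reality's response.
    rnorm(1, beta_star[1] + beta_star[2] * x_star, sigma_star)
  })
}
```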
ILLUSTRATION USING A RECENTLY FOUND EMERGENT CONSTRAINT.
We start with the Ψ statistic presented by Cox et al. (2018) as an emergent constraint on climate sensitivity. Ψ is a metric of temperature variability (the standard deviation of global temperature divided by the square root of the negative logarithm of its one-year-lag autocorrelation), with a given physical justification for why it should have a linear relationship with ECS (though that justification is disputed in the published discussion of that paper).
We begin by introducing what we view as sensible uncertainty judgements, adding the uncertainty in layers so that the effects on the constraint can be observed. Throughout, the reference model refers to the standard emergent constraints model computed by sampling from the posterior under the reference prior. Note, throughout, that the reference prior on the regression coefficients [π(β, σ²) ∝ 1/σ²] with π(x*) ∝ 1 and with Σβ* and ξ* in Eqs. (3) and (5) collapsed to zero recovers the usual emergent constraints model.
We use the HadCRUT4 dataset tabulated in Cox et al. (2018) to give the observation, z = 0.13 K, and its uncertainty, σz = 0.016 K, in Eq. (6). For our nonreference calculations we set µx = 0.15 K and σx = 1 K in Eq. (7), based on Fig. 2a of Cox et al. (2018), which shows model time series of Ψ (estimated using a moving-average approach) across CMIP5 that are all centered between 0.1 and 0.5 K, with an average of around 0.15 K (by eye). By setting a prior that covers all of the models with much larger uncertainty than an expert might set, we ensure our analysis is not sensitive to the prior choice (the observation variance is orders of magnitude smaller, so this will not change the posterior very much). Figure 1 shows the posterior distribution of the emergent constraint with these prior choices and reference priors elsewhere. The shading represents the 66% Bayesian prediction interval [the probability that ECS is inside the interval is 0.66, corresponding to the IPCC’s “likely” range and chosen to mirror Cox et al. (2018)], with the red curve and shading representing our model with the informed prior on x* and the black curve representing the Bayesian reference model that coincides with the usual analysis. The reference model gives the same interval as reported in Cox et al. (2018), [2.20, 3.41 K] [black shading (left plot) and black contour (right plot)]. We overlay our model results in red, with the same median estimate of 2.80 K and interval of [2.20, 3.41 K].
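For example, reusing the hypothetical psi and ecs vectors and the sample_ec() sketch from above with the observation values quoted in this section (the real analysis uses the CMIP5 values of the Cox constraint and ECS, not these stand-ins):

```r
samples <- sample_ec(psi, ecs, z = 0.13, sigma_z = 0.016,
                     mu_x = 0.15, sigma_x = 1)
quantile(samples, c(0.17, 0.50, 0.83))   # median and ~66% prediction interval
```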
Fig. 1. (left) Posterior density for ECS given the models and the observations under the reference prior and with all other uncertainties reduced to 0 K (black), and our model with x* ∼ N(0.15, 1) (red). The shading represents the 66% Bayesian prediction intervals under the two models. (right) The Cox constraint vs ECS. Black dots are the CMIP5 models; gray dots are samples from our posterior distribution for ECS. Blue vertical lines represent the uncertainty on the observation of the Cox constraint, and the straight red lines are the median and prediction intervals for the regression relationship for reality. The red and black contours represent the uncertainty on ECS as it depends on the Cox constraint, with black belonging to the reference model and red to our model.
Acknowledging additional uncertainty.
Instead of assuming no uncertainty for β*|β and σ*|σ, we look at the effect on the emergent constraint of adding a “reasonable” amount by specifying nonzero Σβ* and ξ* in Eqs. (3) and (5). In the “Confidence-linked default priors for physically motivated constraints” section we offer a principled approach to setting values for these quantities, which will require a number of additional arguments and results. For illustration here, we shall define reasonable in terms of the relationship of these “reality parameter” uncertainties to the regression parameter uncertainties that come from the Bayesian model.
Having fit the Bayesian regression, we have our beliefs about the relationship between the models through samples from the posterior π(β, σ|Y), which can be used to calculate posterior means and standard deviations for the parameters, shown, for the Cox et al. (2018) constraint in Table 1. The posterior correlation between β0 and β1 is
Table 1. Posterior means and standard deviations for the model regression parameters.
We begin with the scenario where, given the values of β and σ, we would have the same uncertainty (in terms of standard deviations) for β* and σ* as we currently do for β and σ, using the numbers in Table 1 and a correlation of ρ* =
Fig. 2. As in Fig. 1, but with the posterior uncertainties for the regression parameters adopted for the conditional variances of the reality parameters.
Note that even with the additional uncertainty specification given above, we are still virtually certain that the emergent constraint exists in reality given the expected value of the models; that is, our mean for β1* would be 12.08 K and our standard deviation would be 3.75 K. For there to be no relationship (β1* crosses 0 K) in reality under this model would involve roughly a three standard deviation event, or a probability of 6.34 × 10⁻⁴! Setting the standard deviation of β1* so that no relationship in reality is a two standard deviation event (≈2.5% chance) and a one standard deviation event (≈16.6% chance), and setting the standard deviation of β0* at 1 and 2 K for these scenarios respectively (based on an argument that says if β1* = 0, then β0* should be our current best guess for ECS, which we will make more carefully in the “Confidence-linked default priors for physically motivated constraints” section), gives 66% prediction intervals of [2.10, 3.50 K] and [1.88, 3.73 K] respectively. These constraints are shown in Fig. 3 (note we added no additional uncertainty for σ* for these calculations).
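For reference, the tail calculations behind these scenarios can be reproduced as plug-in Normal probabilities; the percentages quoted above may differ slightly if computed from the full posterior rather than from these point values.

```r
# One-sided Normal tail probabilities for beta1* crossing zero.
pnorm(0, mean = 12.08, sd = 3.75)          # ~6.4e-4: current specification
pnorm(0, mean = 12.08, sd = 12.08 / 2)     # ~2.3%: a two standard deviation event
pnorm(0, mean = 12.08, sd = 12.08)         # ~15.9%: a one standard deviation event
```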
Fig. 3. (top) Emergent constraint plots given a 2.5% chance of no constraint. (bottom) Emergent constraint plots given a 16.6% chance of no constraint.
This example shows that non-negligible additional uncertainty can be acknowledged for an emergent constraint without dramatically changing the conclusions of the analysis. However, there are clearly sensible levels of additional uncertainty that could matter to an emergent constraint. In any given application, what should the additional uncertainty be? This is a fair question that might often receive the answer “that depends on the beliefs of the scientist.” It is hard to argue with this answer, and any firm beliefs of the scientist that can be captured with the parameters above and openly defended should be used. Nevertheless, we think there is a place for sensible default settings for these uncertainties that can be used and understood by any practitioner. The risk of not having such defaults is that these real additional uncertainties continue to be swept under the carpet by the community and set to zero. We present and justify our default choices below.
CONFIDENCE-LINKED DEFAULT PRIORS FOR PHYSICALLY MOTIVATED CONSTRAINTS.
The app that accompanies this paper allows the user to work with reference priors throughout and allows all of the quantities that we have introduced to be set by hand, giving the user ultimate control and the freedom to express their judgements. For the model regression parameters we go no further than this. In the first subsection, we describe useful subjective default priors for the regression, but we believe that in many instances ensemble sizes will be sufficient to enable the relatively safe use of the reference prior. For the reality relationships our app offers a third, guided specification option, based on the arguments and results of the “Priors for the real world” subsection.
Priors for the model relationships.
Though the reference prior is often deemed the “objective” prior choice for regression, it actually imparts far less information than any scientist is capable of providing. For example, the prior states that all intervals of the same width on the real line are equally likely to contain the true intercept and slope, which is preposterous given even a rudimentary knowledge of the scale of the predictors and responses we might see in the models. Physical knowledge of the response should at least be able to bound the prior support for β and σ². For example, consider finding an emergent constraint for ECS. We might view it as (nearly) impossible that ECS in any model were outside the range [0, 10 K]. So if there were no constraint at all, σ² should be such that the ensemble mean ECS ±3σ did not cross both bounds.
Priors for the real world.
Equations (2), (3), and (5) gave a model for reality y* as a regression on some predictor x*, with “reality parameters” β* and σ* that we link to the output of the models. But the interpretation, particularly of β*, could be problematic. Succinctly, how can there be a regression relationship between x* and y* in reality when there is only one reality (one x* and one y*)? The following construct offers us a way to think about this statistical model.
Suppose that, for the generation of models in our ensemble, the values of β and σ could be made known to us (e.g., through many more models of the current generation being included in the sample). At some future time, an ensemble of the next generation of models will be made available to the community and we can reexamine our emergent constraint, finding β′ and σ′. We expect the next generation of models to represent physical processes better. Some models will have higher resolution; others will have used the intervening years to develop new parameterizations that overcome known structural biases. If β′ and σ′ could be made known to us, we would expect them to differ from β and σ, as the new physics in the models alters the relationships, even if we do not know whether the improved physics would make the slope of the constraint stronger or weaker. We might consider β* and σ* to be the model parameters at the limit of this process of improving all of the models and submitting large ensembles. This idea is similar to that introduced as “reification” by Goldstein and Rougier (2009) (where there is discussion of why this theoretical limit should not be reality itself). By considering how different the relationship could be from one generation of models to the next, we may more easily consider the effect of missing processes on the relationship and more comfortably conceptualize how and why β* might differ from β (and similarly for σ*).
If limiting relationships between model processes are not a helpful thought construct for considering beliefs about β* and σ*, a practitioner could instead consider the effects of missing processes in the models on the constraint. For example, suppose we knew that a systematically missing or misrepresented process led to the response (ECS, say) being 2 K too high for every model, but that the slope of the line captured the underlying physical relationships perfectly. Then we would want to lower β0* by 2 K to account for this. Similarly, if a feedback process that would strengthen (or weaken) the physical constraint were missing, we would want to adjust β1* appropriately. In this way, uncertainty on β* and σ* can be considered in terms of whether the current models accurately capture the perceived constraint.
We present arguments for sensible default priors for β* and σ* that depend on the level of confidence we have in the physical reasoning leading to the existence of the emergent constraint in the models transferring to reality (or the relationship between different classes of models at the conceptual limit of improvement). Our basic argument will be that, for constraints that were effectively data-mined using the current ensemble, we should have low confidence in their holding in the real world (or the next generation of models), and for those based on purely physical reasoning we might have a greater degree of confidence. To enable us to talk about our confidence in a constraint given the ensemble and to enable other researchers to make similar arguments or debate the level of confidence that should be present, we require further probabilistic arguments.
Application to the Cox constraint.
Applying the ideas from the “Priors for the model relationships” subsection, we use the following simple arguments to set Σβ and σs. We know from previous IPCC reports that models typically have a climate sensitivity “around” 3 K and that an ECS of 10 K or a negative ECS would be hugely surprising (in a CMIP model). Under the naive assumption that each model ECS was a uniform draw from [0, 10 K] with no emergent signal at all, the regression should fit a mean of around 5 K with no slope, and the residual standard deviation, σ, should be around 2.5 K (so that two standard deviations covers the interval). This is a “worst case” type of regression in which the data are far more spread than anyone familiar with ECS could possibly expect, and there is no signal at all. We can therefore set σs = 2.5 K as a weakly informative prior on σ in Eq. (9).
Parameter Ψ is on the order of 0.1 K, and ECS is on the order of 1 K. Hence, as Ψ changes, when multiplied by β1, we should still expect a change that is on the order of 1 K. Thus, if there is a relationship, β1 should not be more than order 10. To be cautious and only weakly informative, we set the prior standard deviation
Given changes in ECS that are order 1 K at most, we would expect the intercept, β0 to be order 1 K for ECS. To allow for the possibility of strong negative effects, we set a very cautious prior standard deviation of
For the guided real world uncertainty specification, we interpret the IPCC likely range for ECS of [1.5, 4.5 K] as implying a central estimate of μy* = 3 K and a standard deviation of σy* = 1.5 K. Table 2 shows the 66%, 90%, and 95% prediction intervals under four different confidence levels. What we refer to as “coin flip” is a 50% confidence level, though we use 50.1% to avoid numerical issues in our estimation procedure. We say more about this option in the discussion.
Table 2. Bayesian prediction intervals for ECS using the Cox et al. (2018) emergent constraint with four different confidence levels in the physical arguments behind the constraint.
The posterior distributions for ECS under the three main levels of confidence are given in Fig. 4, and the updated intervals for the Cox et al. (2018) constraint are given in Table 2. We see in all cases that acknowledging the additional uncertainty inflates the posterior distribution and the intervals, but not so much as to remove the constraint. In all cases, having some physical confidence behind the constraint is enough to ensure that something is learned from the analysis. This is even true in the coin-flip scenario, which leads to a note of caution that we expand upon in the discussion: if constraints have been data-mined from an ensemble rather than physically motivated, we do not think this procedure should be used at all. Even fitting the model and specifying some level of confidence requires a strong scientific statement that one must be prepared to back up with physical reasoning. Note also that one consequence of the emergent constraints framework, even our generalized one, is that the central estimate will be determined by the observations and will not be altered by the confidence level.
Fig. 4. (top) Emergent constraint plots for ECS given Ψ under a confidence level of virtually certain in the existence of the constraint. (middle) As in (top), but under a very likely confidence level. (bottom) As in (top), but with a confidence level of likely. The black lines and shading represent the reference model.
For the Cox et al. (2018) constraint in particular, we do not offer any judgements as to what the confidence in the constraint should be, as we are not physicists. If the physical reasoning is sound, however, we do insist that the reference model, with all legitimate reality uncertainties ignored, is not appropriate.
EMERGENT CONSTRAINTS IN THE LITERATURE.
In this section we apply our extended framework to selected emergent constraints for equilibrium climate sensitivity published within the literature. We select only constraints published with respect to CMIP5 models, and we do not include CMIP3 results, which may lead our reference intervals to differ from those published. The constraints we choose are the sum of large- and small-scale indices for lower-tropospheric mixing (Sherwood et al. 2014), the temporal covariance of low-cloud reflection with temperature (Brient and Schneider 2016), the double intertropical convergence zone bias (Tian 2015), and the seasonal variation of marine boundary layer cloud fraction with sea surface temperature (Zhai et al. 2015). The observations and their standard deviations used for each constraint are given in Table 3.
Table 3. Observations and standard deviations used in our analyses of four emergent constraints from the literature.
The results of applying our extended framework for emergent constraints to these data are given as 66% prediction intervals in Table 4, and shown as PDFs in Fig. 5, for different levels of confidence in the physical arguments behind the constraints. From the figure we see that in cases where we weaken the confidence in the constraint but where the 66% interval remains relatively unchanged, the effect of the additional uncertainty has been to inflate the tails so that our probability of extreme ECS has increased.
Table 4. Bayesian 66% prediction intervals for ECS for different published emergent constraints using the reference model and three different confidence levels in the physical arguments behind the constraint, as per our extended framework.
Fig. 5. Posterior probability density functions for ECS found for four different emergent constraints (colors) and four different levels of confidence in the constraint. The solid line in each case is the reference analysis.
We have compared these analyses of alternative emergent constraints on ECS for two reasons. First, to show that the effect of acknowledging reasonable doubt about the existence of each constraint, as discussed via the method of the “Priors for the real world” subsection, is to inflate the prediction intervals, but by a small amount rather than by an amount that points to no result. We can say that emergent constraints have underreported uncertainty in the past, but, through the given framework, in the future they need not, so long as researchers are willing to state their confidence in the underlying physical argument for the linear relationship.
Our second reason is to highlight that published constraints can lead to quite different probability distributions over ECS (e.g., Sherwood predicts a higher climate sensitivity and Cox a much lower one), and to make it clear that these distributions are not compatible in any sense. In each analysis, the authors have (implicitly) made quite different and incompatible conditional exchangeability judgements for ECS given their individual predictors, leading to different models that capture residual variability as Normal with zero mean. A meta-analysis or review of this literature for ECS that sought to give an idea of the current uncertainty in ECS itself might stray into somehow combining these intervals or central estimates to give an objective view of the state of the science. This would be particularly troublesome if that combination put more weight on intervals that overlapped. Each interval must be thought of as the scientific judgement of its authors, based on their confidence and a transparent set of statistical assumptions, as outlined in the “Exchangeability and emergent constraints” section. A form of meta-analysis might seek to take the individual judgements of a group of scientists and summarize them, but that would not lead to an objective uncertainty assessment for ECS; rather, it would be an honest survey of the opinions of different scientists, asserted with perhaps differing levels of confidence and based on transparent assumptions and beliefs.
As noted by a reviewer, each of the posterior distributions from the different emergent constraints on ECS is symmetric about a central estimate, and this may not be a realistic quantification of uncertainty for ECS. It may be more realistic for the posterior to be skewed, with a longer tail toward higher climate sensitivities. Though our folded Normal representation for σ* breaks the usual symmetry in Normal models, the correct place to establish this type of scientific uncertainty judgement within the model is to change the Normal assumption for y*|x* in Eq. (2) (Normality across the models need not be changed). The linear mean might still be used, and our arguments for uncertainty on the intercepts and slopes would transfer, but a lognormal or shifted gamma-type structure could be used to describe reality given the observations. A benefit of our having formally set out the statistical modeling behind emergent constraints is that practitioners can see clearly which elements of the modeling can be changed in order to capture different types of assumptions.
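A minimal sketch of one such change, assuming a simple moment-matched lognormal in place of the Normal draw for y*|x* (the function name and the moment matching are illustrative, not the paper's prescription):

```r
# Draw a right-skewed reality response with mean m (the linear prediction
# beta0* + beta1* x*) and standard deviation s (sigma*); valid for m > 0.
r_y_star_skewed <- function(m, s) {
  tau2 <- log(1 + (s / m)^2)                      # lognormal shape matching sd s
  rlnorm(1, meanlog = log(m) - tau2 / 2, sdlog = sqrt(tau2))
}
```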
DISCUSSION.
In this paper we sought to unwrap the underpinning statistical assumptions behind the use of emergent constraints to quantify uncertainty for key unknowns in the climate system. We discussed the strong foundational assumptions underpinning the usual classical regression analysis and the interpretation of the real world as a random sample from the distribution of models. We argued that these ideas were too difficult to defend objectively.
We presented the Bayesian view of emergent constraints and the far weaker, more reasonable a priori conditional exchangeability judgements that lead to regression analyses coinciding with the classical analysis under reference priors. We showed how, under this framework, standard emergent constraints analyses ignore key uncertainties that are present when there are potential structural deficiencies in the current generation of models. We then presented a generalized framework for emergent constraints that acknowledges these additional uncertainties, yet collapses back to the standard model when they are set to zero.
Our modeling looks to adopt the prior judgement that the emergent constraint is informative for reality after having observed the ensemble, to avoid incoherent models for reality beforehand and to acknowledge that these judgements should only be made sparingly. We also believe that this is how scientists think about emergent constraints. As one scientist put it to us by email, “nobody publishes an emergent constraint that doesn’t correlate.”
We presented a guided prior uncertainty specification that links confidence in the physical reasoning for a linear relationship between the response and the constraint to reasonable additional uncertainties through judgements about the response itself which are either simple to specify or generally available through literature review. We have developed a software tool that allows users to do this for themselves, and have ensured that this tool also allows scientists full freedom to specify any levels of uncertainty on any of the parameters that they wish, if they do not want to follow our guided specification. Our tool is simple to use and we will maintain it for the community through GitHub. When scientists have specific judgements relating to the models and their deficiencies, we would recommend using our tool and a structured prior elicitation (Gosling 2018) to quantify these effects.
Our modeling accounts for parameter uncertainty, observation uncertainty, and uncertainty about how the emergent relationship observed in the models applies to the real world. Sansom et al. (2019) also demonstrate that emergent constraints can be sensitive to uncertainty in the values of the model predictors xi (i = 1, …, n). Our method can readily be extended to account for these errors in variables without affecting the guided uncertainty specification, since only the posterior distribution of the parameters given the models, π(β, σ|Y, X), will change.
The arguments in this paper make it very clear that strong scientific judgement is implied when linking models to reality, particularly when claiming that a linear relationship between quantities across models indicates a physical relationship. Data mining for constraints may very well lead to a multiple-testing problem. A simple numerical experiment illustrates the point (a sketch is given below). Generating 430,000 Normal random numbers and stacking them into a matrix with 43 rows generates a pseudo ensemble with 43 members and no physical links between its 10,000 outputs. The maximum absolute correlation between outputs across such an ensemble will usually lie between 0.70 and 0.85, well above the threshold at which relationships are reported as emergent constraints. To base the strong beliefs required to take a relationship into the real world (in the way we have made clear) only on the discovery of a large correlation cannot be justified. For that reason, even specifying a low confidence in the constraint through our guided framework would still be inappropriate. See also Caldwell et al. (2014) for discussion of this point.
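A sketch of that experiment in R, computing the largest absolute pairwise correlation column by column so that the full 10,000 × 10,000 correlation matrix never has to be formed:

```r
# Pseudo ensemble: 43 "members" of 10,000 mutually unrelated outputs.
set.seed(1)                                   # any seed; for reproducibility only
n_models  <- 43
n_outputs <- 10000
ens <- matrix(rnorm(n_models * n_outputs), nrow = n_models)

Z <- scale(ens)                               # standardize each output across members
max_r <- 0
for (j in seq_len(n_outputs - 1)) {
  # Correlations of output j with all later outputs.
  r <- crossprod(Z[, j], Z[, (j + 1):n_outputs]) / (n_models - 1)
  max_r <- max(max_r, max(abs(r)))
}
max_r                                         # typically in the 0.70-0.85 range
```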
One criticism of emergent constraints is that they are overly simple, ignoring complex nonlinearities or interactions with processes that are not yet well understood or resolved by models. We do not fully agree with this criticism. When the linear relationship can be well established through mathematical and physical arguments, the conditional exchangeability judgements we have explained in this paper seem plausible in many situations: they amount to indifference over labels and, while appreciating that the relationship will not be exactly linear, to having no strong judgements as to systematic deviations from it. While the models and reality themselves may well be more complex, that does not invalidate the statistical model, which, rather than making strong statements about how reality and the models actually behave, captures our current knowledge and can be defended on those grounds. Of course, more complex forms of regression could be used within the framework we discuss, but the implied beliefs, and the way these would be amended when transferring the constraint from models to reality, would be far more complex and difficult to defend.
We hope that by making the required statistical assumptions clear and transparent, the validity of any given constraint, new or existing, can be discussed by the community in terms of the physical reasoning, the reasonableness of the exchangeability judgements, and the confidence in the current generation of models and linear relationship for a given quantity. By making software available to the community, we hope to help this debate move forward by allowing different researchers to look at the sensitivity of intervals to these judgements and to form their own views.
ACKNOWLEDGMENTS
This work was funded by NERC Grant NE/N018486/1. The authors thank Ben Sanderson for sharing his data on emergent constraints within the literature. We’d like to thank Peter Cox, Mark Williamson, and Femke Nijsse for useful discussions about emergent constraints and for sharing their data. The lead author would also like to thank Michel Crucifix for his encouragement to write this paper.
The software tool, user instructions, and data for the Cox et al. (2018) example are available at https://github.com/ps344/emergent-constraints-shiny.
APPENDIX: MATHEMATICAL DETAILS.
Posterior predictive sampling.
Bayesian updates.
REFERENCES
Bernardo, J. M., and A. Smith, 1994: Bayesian Theory. Wiley, 675 pp.
Bowman, K. W., N. Cressie, X. Qu, and A. Hall, 2018: A hierarchical statistical framework for emergent constraints: Application to snow-albedo feedback. Geophys. Res. Lett., 45, 13 050–13 059, https://doi.org/10.1029/2018GL080082.
Brient, F., and T. Schneider, 2016: Constraints on climate sensitivity from space-based measurements of low-cloud reflection. J. Climate, 29, 5821–5835, https://doi.org/10.1175/JCLI-D-15-0897.1.
Caldwell, P. M., C. S. Bretherton, M. D. Zelinka, S. A. Klein, B. D. Santer, and B. M. Sanderson, 2014: Statistical significance of climate sensitivity predictors obtained by data mining. Geophys. Res. Lett., 41, 1803–1808, https://doi.org/10.1002/2014GL059205.
Carpenter, B., and Coauthors, 2017: Stan: A probabilistic programming language. J. Stat. Software, 76 (1), https://doi.org/10.18637/jss.v076.i01.
Cox, P. M., C. Huntingford, and M. S. Williamson, 2018: Emergent constraint on equilibrium climate sensitivity from global temperature variability. Nature, 553, 319–322, https://doi.org/10.1038/nature25450.
de Finetti, B., 1974: Theory of Probability. Vol. I, John Wiley & Sons, 300 pp.
de Finetti, B., 1975: Theory of Probability. Vol II, John Wiley & Sons, 375 pp.
Diaconis, P., and D. Freedman, 1980: Finite exchangeable sequences. Ann. Probab., 8, 745–764.
Draper, N. R., and H. Smith, 1998: Applied Regression Analysis. 3rd ed., John Wiley & Sons, 736 pp.
Eyring, V., S. Bony, G. A. Meehl, C. A. Senior, B. Stevens, R. J. Stouffer, and K. E. Taylor, 2016: Overview of the Coupled Model Intercomparison Project phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev., 9, 1937–1958, https://doi.org/10.5194/gmd-9-1937-2016.
Gelman, A., 2006: Prior distributions for variance parameters in hierarchical models. Bayesian Anal., 1, 515–534, https://doi.org/10.1214/06-BA117A.
Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin, 2013: Bayesian Data Analysis. 3rd ed. Chapman and Hall/CRC, 675 pp.
Goldstein, M., and J. C. Rougier, 2009: Reified Bayesian modelling and inference for physical systems. J. Stat. Plann. Inference, 139, 1221–1239, https://doi.org/10.1016/j.jspi.2008.07.019.
Gosling, J. P., 2018: SHELF: The Sheffield Elicitation Framework. Elicitation: The Science and Art of Structuring Judgement, L. C. Dias, A. Morton, and J. Quigley, Eds., Springer, 61–93, https://doi.org/10.1007/978-3-319-65052-4_4.
Hall, A., and X. Qu, 2006: Using the current seasonal cycle to constrain snow albedo feedback in future climate change. Geophys. Res. Lett., 33, L03502, https://doi.org/10.1029/2005GL025127.
Hall, A., P. Cox, C. Huntingford, and S. Klein, 2019: Progressing emergent constraints on future climate change. Nat. Climate Change, 9, 269–278, https://doi.org/10.1038/s41558-019-0436-6.
Hewitt, E., and L. J. Savage, 1955: Symmetric measures on Cartesian products. Trans. Amer. Math. Soc., 80, 470–501, https://doi.org/10.1090/S0002-9947-1955-0076206-8.
Karpechko, A. Y., D. Maraun, and V. Eyring, 2013: Improving Antarctic total ozone projections by a process-oriented multiple diagnostic ensemble regression. J. Atmos. Sci., 70, 3959–3976, https://doi.org/10.1175/JAS-D-13-071.1.
Meehl, G. A., C. Covey, K. E. Taylor, T. Delworth, R. J. Stouffer, M. Latif, B. McAvaney, and J. F. B. Mitchell, 2007: The WCRP CMIP3 Multimodel Dataset: A new era in climate change research. Bull. Amer. Meteor. Soc., 88, 1383–1394, https://doi.org/10.1175/BAMS-88-9-1383.
Rougier, J. C., M. Goldstein, and L. House, 2013: Second-order exchangeability analysis for multimodel ensembles. J. Amer. Stat. Assoc., 108, 852–863, https://doi.org/10.1080/01621459.2013.802963.
Sansom, P. G., D. B. Stephenson, and T. J. Bracegirdle, 2019: On constraining projections of future climate using observations and simulations from multiple climate models. J. Amer. Stat. Soc., in press.
Sherwood, S. C., S. Bony, and J.-L. Dufresne, 2014: Spread in model climate sensitivity traced to atmospheric convective mixing. Nature, 505, 37–42, https://doi.org/10.1038/nature12829.
Taylor, K. E., R. J. Stouffer, and G. A. Meehl, 2012: An overview of CMIP5 and the experiment design. Bull. Amer. Meteor. Soc., 93, 485–498, https://doi.org/10.1175/BAMS-D-11-00094.1.
Tian, B., 2015: Spread of model climate sensitivity linked to double-intertropical convergence zone bias. Geophys. Res. Lett., 42, 4133–4141, https://doi.org/10.1002/2015GL064119.
Wenzel, S., V. Eyring, E. P. Gerber, and A. Y. Karpechko, 2016: Constraining future summer austral jet stream positions in the CMIP5 ensemble by process-oriented multiple diagnostic regression. J. Climate, 29, 673–687, https://doi.org/10.1175/JCLI-D-15-0412.1.
Zhai, C., J. H. Jiang, and H. Su, 2015: Long-term cloud change imprinted in seasonal cloud variation: More evidence of high climate sensitivity. Geophys. Res. Lett., 42, 8729–8737, https://doi.org/10.1002/2015GL065911.