New details about natural and anthropogenic processes are continually added to models of the Earth system, anticipating that the increased realism will increase the accuracy of their predictions. However, perspectives differ about whether this approach will improve the value of the information the models provide to decision makers, scientists, and societies. The present bias toward increasing realism leads to a range of updated projections, but at the expense of uncertainty quantification and model tractability. This bias makes it difficult to quantify the uncertainty associated with the projections from any one model or with the distribution of projections from different models. This in turn limits the utility of climate model outputs for deriving useful information, such as in the design of effective climate change mitigation and adaptation strategies or in identifying and prioritizing sources of uncertainty for reduction. Here we argue that a new approach to model development is needed, focused on the delivery of information to support specific policy decisions or science questions. The central tenet of this approach is the assessment and justification of the overall balance of model detail that reflects the question posed, current knowledge, available data, and sources of uncertainty. This differs from contemporary practices by explicitly seeking to quantify both the benefits and costs of details at a systemic level, taking into account the precision and accuracy with which predictions are made when compared to existing empirical evidence. We specify changes to contemporary model development practices that would help in achieving this goal.
A new mode of development for Earth system models is needed to enable better targeted and more informative projections for both decision makers and scientists.
The Earth system is the thin layer of the Earth that contains and supports life. It ultimately governs most processes vital to human health and wellbeing: from food and water availability to disease spread and global economics. It is the canonical example of a complex system, in which its dynamics, resulting from interacting multiscale and nonlinear processes, cannot be predicted from understanding any of its isolated components. Attempts to understand the Earth system and how it will change in the future therefore depend on computational models that represent, with varying levels of abstraction, physical, chemical, and biological components of the Earth system and their interactions (Randall et al. 2007; Edwards 2010).
Decades of research using such models have resulted in advances in the understanding of many Earth system processes, including the impacts of humans on climate. Models have also produced projections, combining current knowledge of the underlying science with a set of plausible future societal change scenarios to provide information to guide climate change mitigation policy. But what confidence can be assigned to the projections? Confidence about a particular climate projection is often judged by the agreement between different climate models, with greater confidence assigned to projected changes for which there is close agreement (Pachauri and Reisinger 2007), although other considerations, such as why the models disagree, are also taken into account. Agreements between climate models have predominantly occurred for physical phenomena occurring over large spatial scales. All models agree that the world will become warmer on average if CO2 levels continue to increase (Stainforth et al. 2005; Pachauri and Reisinger 2007; Oreskes et al. 2010; Kerr 2011; Rowlands et al. 2012), and they all agree that the increases will be greater at higher latitudes (Pachauri and Reisinger 2007). However, they disagree in other aspects, such as by how much the world will warm (Oreskes et al. 2010; Bretherton et al. 2012), and these disagreements become more pronounced at finer, regional spatial scales (Pachauri and Reisinger 2007).
It is now widely recognized that differences between projections not only fail to capture a variety of additional sources of uncertainty, but that the projections themselves also inevitably become increasingly uncertain the further into the future they are extended. For instance, projections still rarely incorporate estimates of uncertainty in many model parameters and uncertainty arising from internal model variability, and so typically underestimate the uncertainty (Stainforth et al. 2007a,b; Brekke et al. 2008; Fischer et al. 2011; Bretherton et al. 2012). Moreover, any model will always represent the dynamics of the real world imperfectly because it cannot account for all the factors determining those dynamics: some known processes are deliberately omitted, and unknown processes obviously cannot be included. Thus, projections will inevitably become less reliable, and thus more uncertain, the further into the future they are extended (Smith 2002; Parker 2011; Smith and Stern 2011).
Despite their limitations, model projections are used by governments, businesses, and scientists to help make decisions. However, the lack of clarity about their uncertainty limits the extent to which they can be treated with any more sophistication than simply a collection of plausible outcomes (Cox and Stephenson 2007; Moss et al. 2010; Oreskes et al. 2010; Kerr 2011; Maslin and Austin 2012). For example, land use managers wishing to assess how precipitation might change in the future are typically confronted with a wide range of predictions about the direction and timing of change (Stainforth et al. 2007b; New et al. 2007; Stainforth 2010; Maslin and Austin 2012). Decision makers do not depend on consistent or confident projections in order to make decisions (Polasky et al. 2011; Kunreuther et al. 2013), but analyses of uncertainty and estimates of confidence in projections provide a much clearer understanding of the need for, and likely consequences of, different decisions (Weaver et al. 2013; Lemos et al. 2012).
Of course, we do not suggest that Earth system modeling has not been useful in informing decision making. However, it has followed an approach that is better suited to exploring the plausible than to identifying the probable. How can this situation be improved? How can we improve how model projections are made to provide clearer information to decision makers? We believe this can be facilitated by pursuing an alternative mode of model development: one whose central aim is to allow the balance of detail in models to be adjusted, so that a balance can be found that provides useful information for specific decisions.
DIFFERENT PERSPECTIVES ON THE FUTURE EVOLUTION OF MODELS.
There is a diverse variety of models of the Earth system (Fig. 1) because the level of detail has evolved over time to address different scientific questions. The predominant direction of model development to date has been the addition of more details, simulating an increasing number of different processes. In so doing, they have increased our understanding of the Earth system. They have also evolved to simulate processes at increasingly finer spatial resolutions (Claussen et al. 2002; Randall et al. 2007; Slingo et al. 2009), enabling phenomena to be simulated that only begin to occur at finer spatial scales (such as hurricanes). However, it is important to recognize that the same advances have also brought costs that can reduce predictive accuracy (by “accuracy” we mean the degree to which model predictions are centered on the dynamics and states of real-world phenomena rather than, for example, the number of real-world processes that the models depict). We will detail these costs below but, as an example, efforts have been biased toward adding details to individual models that are already technically unwieldy and intractable (Held 2005), rather than enabling uncertainty in the different aspects to be assessed, quantified, and incorporated into predictions and projections (“predictions”: estimates of how the Earth system will change; “projections”: estimates of how the Earth system might change under different scenarios; Weaver et al. 2013).
Perspectives differ on how much time and resources should be spent on adding yet more details. One perspective is that this is likely to be worthwhile because the model projections, incorporating more processes and at finer spatial resolutions, will become more realistic (Slingo et al. 2009; Gent et al. 2009; Slingo 2010). However, this is true only if our understanding of those new processes, as expressed in model formulations and parameter values, is sufficient to enable the projections to reliably predict the dynamics of the system under future scenarios. For example, Oppenheimer et al. (2008) show that the continual refinement of model details can actually lead to “negative learning”: where confidence improves over time to an answer that is different from the truth.
An alternative perspective is that continually adding details is unlikely to deliver the desired improvements in decision-making capabilities (Dessai et al. 2009). Instead, it is proposed that the focus of making climate change decisions using projections should change from one that awaits sufficiently high confidence in what will happen before acting (“predict then act”) to one that uses the projections as a set of plausible examples of what might happen to decide on how best to act in light of that knowledge and uncertainty (Stainforth et al. 2007b; Dessai et al. 2009; Kunreuther et al. 2013). Toward this goal, studies have investigated improving the process of decision making to make more robust decisions while incorporating information with different and diverse sources of uncertainty (Lempert and Collins 2007; Stakhiv 2011; Kunreuther et al. 2013; Weaver et al. 2013). These have led to the refinement of robust decision-making (RDM) methods within climate change decision making (Weaver et al. 2013). Existing model projections can already inform decision making under such frameworks, but only for a limited range of scenarios and scales of spatial and temporal resolution, and they are not necessarily best suited for this purpose. To examine a range of scenarios, it would be more convenient to use models that could easily be simulated under many different scenarios (Weaver et al. 2013; Bretherton et al. 2012). Another focus has therefore been on how to redesign methodological frameworks for producing climate change projections so that they can be better targeted toward the needs of users (Weaver et al. 2013; Bretherton et al. 2012).
However, even if more robust decision-making frameworks are adopted, contemporary climate models still do not adequately convey important information that can be used to assess the confidence that can be placed in their projections. By “confidence” we mean an estimate of the probability of how the real-world system will behave. A related concept is the credibility of projections, which describes the assessed ability of a mechanistic model to reliably reproduce particular real-world phenomena (e.g., Brekke et al. 2008). Our perspective is that significant improvements to how climate models are developed are needed to provide more informative climate change projections. Such projections should go beyond being just a set of plausible outcomes to also convey a much more rigorous depiction of uncertainty in those projections than has been done to date. While any estimates of uncertainty will always become decreasingly reliable the further into the future they are projected (unless some proof can be given about the extent to which they truly bound real-world dynamics), they can still be seen as information-constrained predictions of the future, based on past evidence and understanding. Our focus here is on the methodological process of making the climate model projections themselves, so that they better convey the information and uncertainty relevant to the question being asked.
THE COSTS OF MODEL COMPLEXITY.
Contemporary practices are still limited in the extent to which they incorporate different sources of uncertainty into model projections. Uncertainty arises from multiple sources: from uncertainty in the data used to initialize, parameterize, and evaluate models; from uncertainty in how adequately models represent reality; from differences in our scientific understanding of processes and how to represent them in models; from uncertainty in whether the model has been implemented correctly; and from uncertainty arising from simulated random events occurring in real-world processes (Stainforth et al. 2007a; Masson and Knutti 2011; Slingo and Palmer 2011). Yet Earth system models are currently so computationally demanding that only between 3 and 10 simulations per scenario were recommended for decadal forecasts and hindcasts to inform the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (Meehl and Bony 2011). This is a very low number of replicates with which to characterize a distribution in something as dynamically rich as global climate. For example, Deser et al. (2012; see Daron and Stainforth 2013 for another example) estimated how many simulations were needed to detect anthropogenic changes in air temperature, precipitation, and sea level pressure predicted by a general circulation model with its between-run variations arising only from simulated random atmospheric and oceanic processes. They found that fewer than 10 simulations were insufficient to detect changes in precipitation and sea level pressure for most regions of Earth, even over multidecadal time windows, with only changes in surface air temperature being reliably detected for most regions of Earth with so few replicates. Yet this is just for one model, not even a full Earth system model, under one scenario, using one major source of variation (internal variability), and for a few global properties.
Awareness of the need to understand the degree of replication required to capture internal variability is increasing and it is notable that some modeling groups conducted many more than 10 simulations per scenario for their phase 5 of the Coupled Model Intercomparison Project (CMIP5) simulations (Stainforth et al. 2005; World Climate Research Programme 2013).
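To give a sense of how ensemble size relates to detectability, the sketch below computes the smallest number of simulations for which a forced change would exceed a chosen number of standard errors of the ensemble mean. It assumes independent runs whose internal variability is Gaussian with a known standard deviation. This is a textbook signal-to-noise calculation, not the procedure used by Deser et al. (2012); the function name and the numbers in the usage lines are hypothetical.

```python
import math


def min_ensemble_size(signal, noise_sd, z=1.96):
    """Smallest n such that |signal| >= z * noise_sd / sqrt(n).

    signal   : forced change we hope to detect (hypothetical units)
    noise_sd : between-run standard deviation from internal variability
    z        : number of standard errors required (1.96 is roughly 95%)
    """
    if signal == 0:
        raise ValueError("a zero signal can never be detected")
    if noise_sd < 0:
        raise ValueError("noise_sd must be non-negative")
    return max(1, math.ceil((z * noise_sd / abs(signal)) ** 2))


# Hypothetical variable with a strong signal relative to internal variability.
n_strong = min_ensemble_size(signal=1.0, noise_sd=0.5)

# Hypothetical variable with a weak signal relative to internal variability.
n_weak = min_ensemble_size(signal=0.2, noise_sd=0.5)

print(n_strong, n_weak)
```

Because the required number of runs scales with the inverse square of the signal-to-noise ratio, halving that ratio quadruples the ensemble size needed under these assumptions.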
There have been significant efforts to incorporate parameter uncertainty into general circulation model projections to assess how it influences uncertainty (e.g., Murphy et al. 2004; Stainforth et al. 2005; Collins et al. 2006; Knight et al. 2007; Piani et al. 2007; Murphy et al. 2009; Fischer et al. 2011; Sanderson 2011; Rowlands et al. 2012; Sexton et al. 2012; Sexton and Murphy 2012). However, these have predominantly not been extended to other components of the Earth system (e.g., biological components), to full Earth system models, or to the functional forms assumed to represent particular Earth system processes (structural uncertainty; Slingo and Palmer 2011; Maslin and Austin 2012). Quantifying this additional uncertainty and reporting its consequences is becoming more common for components of Earth system models, but it is not yet known whether fully incorporating these, or the additional uncertainty of whether policy recommendations will be implemented when making future projections, will swamp any anticipated improvements in predictive precision from increasing the realism of existing components (Smith and Stern 2011; Palmer 2012).
It is also currently unknown how the partitioning of detail amongst model components influences the confidence that can be placed in predictions or projections (Smith 2002; Oreskes et al. 2010). For example, increasing the spatial resolution of a specific atmospheric physics model is justifiable in order to predict atmospheric dynamics more precisely (Shaffrey et al. 2009; Palmer 2012), but if its computational requirements restrict the inclusion of other details then the model may be less accurate than if an alternative atmospheric model formulation had been adopted that allowed other component processes to be represented more accurately. The adequacy of a model structure, including the level of detail, can be assessed by the degree to which predictions can recapture the known (and relevant) dynamics of interest, although such assessments are not in widespread use (Judd et al. 2008; Le Bauer et al. 2013; Smith et al. 2013).
It is also worth being aware that, fundamentally, more detailed models can make worse predictions than simpler models. This could occur, for example, if mechanisms or parameters are included as if they apply generally where they are in fact applicable to a much narrower range of circumstances. This is more likely to occur if models are tested against unchanging datasets because it raises the chances of a hypothetical mechanism being found that explains the variance in the specific dataset. This is known as overfitting, in which overly detailed models make worse predictions than simpler models through being overly tuned to the specifics of the calibration or training datasets (Bishop 2006; Crout et al. 2009; Masson and Knutti 2013).
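A constructed toy example may help to make the overfitting argument concrete. In the sketch below, the "truth" is a straight line, the ten calibration values carry small made-up errors, and two models are fitted: a least-squares line and a degree-nine polynomial that passes through every calibration point. All names and numbers are hypothetical and chosen for illustration only; they are not drawn from any of the cited studies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return lambda x: y_bar + slope * (x - x_bar)


def fit_interpolant(xs, ys):
    """Polynomial of degree len(xs) - 1 through every point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def truth(x):
    """Hypothetical 'real world' relationship."""
    return 2.0 * x


def rmse(model, points, targets):
    """Root-mean-square difference between model output and targets."""
    squared = [(model(x) - t) ** 2 for x, t in zip(points, targets)]
    return (sum(squared) / len(squared)) ** 0.5


# Calibration data: the truth plus small, made-up observation errors.
xs = [float(i) for i in range(10)]
errors = [0.3, -0.2, 0.1, -0.4, 0.2, 0.5, -0.3, 0.1, -0.1, 0.2]
ys = [truth(x) + e for x, e in zip(xs, errors)]

simple = fit_line(xs, ys)
detailed = fit_interpolant(xs, ys)

# Fit to the calibration data: the detailed model looks better here.
calibration_error_simple = rmse(simple, xs, ys)
calibration_error_detailed = rmse(detailed, xs, ys)

# Prediction one step beyond the calibration range.
x_new = 10.0
prediction_error_simple = abs(simple(x_new) - truth(x_new))
prediction_error_detailed = abs(detailed(x_new) - truth(x_new))

print(calibration_error_simple, calibration_error_detailed)
print(prediction_error_simple, prediction_error_detailed)
```

In this toy setting the detailed model reproduces the calibration data exactly, yet its prediction just outside the calibration range is worse than the straight line's by orders of magnitude, because it has fitted the errors as if they were signal.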
Inadequate justification of the balance of details in models of the Earth system ultimately makes it difficult to meaningfully compare the projections of different models. Intercomparison exercises [such as the current Coupled Model Intercomparison Project (Doblas-Reyes et al. 2011; Meehl and Bony 2011; Stouffer et al. 2011)] illustrate the degree of consistency between projections. However, the diverse and incompatible approaches to formulating and simulating models make it very difficult to conduct detailed intercomparisons and to understand the origins of differences in projections. The fact that some models used in intercomparisons are related, either by ancestry or by adoption of common formulations, means that the variation between their projections may be an overly conservative and biased estimate of the actual uncertainty (Stainforth et al. 2007a; Tebaldi and Knutti 2007; Dessai et al. 2009; Schwalm et al. 2009; Frank et al. 2010; Knutti et al. 2010; Smith and Stern 2011; Rowlands et al. 2012; Bishop and Abramowitz 2013). A rigorous analysis of the large ensemble from the general circulation model available at www.climateprediction.net reassuringly highlighted that variation due to hardware and software differences had relatively small effects on variation in model projections and that the effects arising from parameter variation were much larger (Knight et al. 2007). While extremely insightful, this analysis was performed for just one model structure, and so it is largely unknown whether these findings apply more generally across a wider range of models.
A lack of detailed assessments of the consistency of model components with historical observations, and of the contributions of the uncertainties associated with model components to uncertainty in predictions, makes it unclear how best to proceed with future refinements. What are the most important and most reducible sources of uncertainty? Which components should be prioritized for refinement? What new data do we need to achieve this? Model development practices to date have insufficiently quantified the contributions of known sources of uncertainty to enable such questions to be addressed, although a number of model component intercomparison projects have been conducted, are planned, or are underway to help address some of these issues, such as the Program for the Intercomparison of Land surface Parameterization Schemes (PILPS; Henderson-Sellers et al. 1996). Moreover, on top of the multiple necessary improvements to Earth system models, the research community will need to decide how best to make them in light of limited computational resources (Shukla et al. 2010; Palmer 2012).
AN ALTERNATIVE APPROACH.
It is increasingly recognized that the current generation of model projections often do not provide decision makers, scientists, or climate model output users in general with the specific information they need (Corell et al. 2009; Dessai et al. 2009; Oreskes et al. 2010; Kerr 2011; Lemos et al. 2012; Lemos and Rood 2012; Maslin and Austin 2012; Kunreuther et al. 2013). For example, those making decisions in relation to future land use planning might wish to understand the diversity of risks (e.g., floods) posed to potential developments (e.g., flood barriers or wind farms) as a consequence of climate change (Weaver et al. 2013) but are confronted with a wide range of projections that differ in relevance, resolution, parent model, and uncertainty (to name a few) without clear information on their credibility or uncertainty. Recent advances in decision theory have gone a long way to enabling rational decisions in light of projected climate changes, irrespective of how models are developed (Polasky et al. 2011; Liverman et al. 2010; Kunreuther et al. 2013; Weaver et al. 2013). For example, instead of decision makers awaiting confident estimates of the likelihood of particular events happening in future before acting (e.g., the chances that storm surges in a particular port will exceed 5 m), decision theory now provides robust ways of estimating the costs and benefits of acting now given the range of costs associated with different plausible events (Weaver et al. 2013). However, given the costs arising from contemporary model development practices, it is also clear that a number of changes to those practices would not only enable projections to provide more useful information for decision makers, such as providing more complete estimates of uncertainty, but also better target the needs of a much wider community of climate model users (Liverman et al. 2010; Bretherton et al. 2012). 
We therefore recommend here changes to model development practices to better suit the needs of those aiming to make more informative climate projections. In Table 1 we summarize the differences between the approach we recommend and contemporary practices.
Given the costs of model complexity we think there should be a greater emphasis on adopting models that are at least simpler than the current generation of extremely computationally demanding Earth system models to permit more informative uncertainty quantification. Such quantification should be conducted to reflect the sensitivity of model projections to the most important known sources of uncertainty in relation to the phenomena being targeted for prediction. Such uncertainty assessments are becoming more common in relation to parameter uncertainty and internal variability and the results of these predominantly argue for many more replicates than typically conducted for the most complex models (Stainforth et al. 2005; Sanderson 2011; Rowlands et al. 2012; Sexton and Murphy 2012; Deser et al. 2012). However, uncertainty assessments also need to be extended to enable structural uncertainties to be assessed in more informative ways than is achieved to date through model intercomparisons. Ideally, structural uncertainty assessments would be part of the uncertainty quantification conducted by any one modeling team, incorporating the effects of alternative formulations for internal processes (e.g., alternative ways of representing vegetation fires) or even for entirely different formulations (e.g., comparing simpler models to more detailed models). Thus, when projections are served to users, they can be accompanied by a more rigorous exposition of the sensitivity of relevant model predictions to these different sources of uncertainty. However, such information should always be delivered with the caveat that any uncertainty or probability projection is increasingly likely to become misleading the further into the future it is projected.
Data assimilation and parameter inference methods will play key roles in future approaches to quantifying uncertainty in how the model reflects present day and historical phenomena. Such methods will be important for propagating uncertainty into projections and enabling assessments of the value of alternative model formulations in terms of precision, accuracy, and overall confidence in how well the model captures reality (Vrugt et al. 2005; Berliner and Wikle 2007; Scholze et al. 2007; Sexton and Murphy 2012; Le Bauer et al. 2013; Smith et al. 2013). New studies examining the tradeoffs between the level of model detail and the ability to quantify uncertainty would be informative in relation to this (Smith 2002; Ferro et al. 2012; Palmer 2012). Formal probabilistic methods (i.e., Bayesian inference) are particularly well suited for comparing models with data and making projections that incorporate estimates of uncertainty, so would be particularly attractive for our proposed approach (Kass and Raftery 1995; Berger et al. 1999; Kennedy and O'Hagan 2001; Oakley and O'Hagan 2002; Berliner 2003). So far, these have proven computationally unfeasible for the most detailed models (Oreskes et al. 1994; Smith and Stern 2011; van Oijen et al. 2011; Palmer 2012), but this could be addressed in the short term in a number of ways. For instance, Bayesian emulators of detailed models could be employed to make probabilistic predictions based on limited runs of the computer code (Kennedy and O'Hagan 2001; Oakley and O'Hagan 2002), or the number of details could even be restricted to a level where their suitability could be assessed using Bayesian methods. However, other, non-Bayesian approaches to uncertainty quantification could also be used to provide useful information, such as the adjoint method, a popular choice for investigating the parameter sensitivity of computationally intensive models (Courtier et al. 1993).
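The idea of an emulator can be sketched in a few lines. The example below fits a zero-mean Gaussian process with a squared-exponential covariance to five runs of a stand-in "expensive" model and then predicts, with a variance, at parameter values that were never run. It is a minimal illustration under simplifying assumptions (one input parameter, fixed covariance settings, noise-free runs) and is not the specific formulation of any of the cited studies; `expensive_model`, `emulate`, and all numerical settings are hypothetical.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def emulate(train_x, train_y, x_new, jitter=1e-9):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    k_star = [rbf(a, x_new) for a in train_x]
    weights = solve(K, train_y)
    mean = sum(k * w for k, w in zip(k_star, weights))
    reduction = sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(rbf(x_new, x_new) - reduction, 0.0)


def expensive_model(p):
    """Hypothetical stand-in for a costly simulator run."""
    return math.sin(p) + 0.5 * p


# A limited design of five simulator runs.
runs_x = [0.0, 1.0, 2.0, 3.0, 4.0]
runs_y = [expensive_model(p) for p in runs_x]

mean_at_run, var_at_run = emulate(runs_x, runs_y, 2.0)    # a value that was run
mean_between, var_between = emulate(runs_x, runs_y, 2.5)  # between two runs
mean_far, var_far = emulate(runs_x, runs_y, 8.0)          # far from any run

print(mean_at_run, var_at_run)
print(mean_between, var_between)
print(mean_far, var_far)
```

The emulator reproduces the simulator at the values that were actually run and reports growing variance as predictions move away from them, which is the property that makes such surrogates attractive when only a handful of full simulations is affordable.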
Climate models convey more confidence in projected phenomena when those phenomena arise in multiple different models and the reasons for them occurring are understood to be plausibly consistent with reality (Pachauri and Reisinger 2007; Held 2005). The classic example is the consistent prediction from all modes of abstraction, from simple physical principles to multiple complex climate models, that increasing greenhouse gas concentrations lead to a global warming response (although, of course, such agreement does not guarantee that this will actually occur in reality). New approaches to climate modeling to inform decision makers need improved ways not only to demonstrate the consistency of projections under the different sources of uncertainty described above, but also to enable the reasons behind the occurrence of projected phenomena to be investigated, understood, and assessed for real-world relevance. Structuring the diversity of possible model details hierarchically is one way to facilitate this, which encouragingly was also the first recommendation of the National Academy of Sciences' “National Strategy for Advancing Climate Modelling” (Bretherton et al. 2012; “Evolve to a common national software infrastructure that supports a diverse hierarchy of different models for different purposes . . .”). The hierarchical organization of model details is particularly helpful for enabling the reasons for particular model predictions to be understood and then tested (Held 2005). Emergent phenomena can be studied at the simplest possible level and the reasons for their emergence investigated without having to also simulate and account for excessive detail.
Thus, one of the most useful (and challenging) improvements that we recommend is to develop widely applicable hierarchical descriptions of Earth system processes to provide frameworks within which models of the Earth system can be formulated and characterized, both in terms of their structure and in terms of their predictions and projections (this was also recommended by Held 2005).
The balance of detail and sources of uncertainty considered relevant to a problem will obviously depend on the problem being addressed. For example, United Nations Framework Convention on Climate Change decision makers recommending global climate mitigation decisions may require models with a different balance of details (in terms of spatial resolution and numbers of processes) than water agencies aiming to plan new water supply and wastewater management systems for their region to meet the demands of the next 25 years (Rogelj et al. 2013; Weaver et al. 2013). Thus, computational methods are also needed that facilitate the assessment and adjustment of the overall balance of model details relative to the questions posed, current knowledge, data, and uncertainty. Computational methods to enable the adjustment of details within the same modeling framework have already been developed for individual families of models to meet this requirement (e.g., the Met Office Unified Model; Pope et al. 2007), although these need to also be able to incorporate estimates of uncertainty in model components and parameters, so that the various costs and benefits of adopting different levels of abstraction discussed above can be assessed. These methods will also need to be extended to apply beyond an individual family of models, as described above, to allow the quantification and assessment of structural uncertainty. Adding to these challenges is the requirement (at least occasionally) to conduct assessments of the overall balance of details at a systemic level. This is for several reasons. First, because the different components are coupled through feedbacks, the coupling of different components might be necessary to assess the accuracy with which they can predict important emergent phenomena. Second, systemic assessments can enable the detection of logical inconsistencies between the predictions of different components. Third, systemic assessment can also help to avoid the development of details of any one area in such a way as to detrimentally affect the accuracy of the overall model or the assessment of its accuracy. Fourth, such assessments can be used to help identify the most important reducible sources of uncertainty. Such approaches are being developed for numerical weather prediction models, where the poorest performing model features can be identified (Judd et al. 2008).
Achieving the sort of “balanced complexity” modeling paradigm (Fig. 1) we advocate above will obviously be extremely challenging, for both sociological and technological reasons. Modeling groups adopt different methodological approaches and have differing incentives to adopt cross-institutional standards. Much model development to date has been conducted in government-funded research institutions, where there is typically an incentive structure for individuals and research teams to produce research findings within a period of months to years that are publishable in peer-reviewed journals. The high resource and technical requirements to build even one detailed Earth system model mean that individuals and groups are reluctant to undertake projects involving radical modifications to their modeling architectures because of the likely time and financial costs involved. However, just as the increasing recognition of the need to conduct model intercomparisons and benchmarking has promoted standards in the compatibility of data and model outputs, we believe that the increasing need to utilize climate model information in decision-making contexts will promote methods that allow assessment of the costs and benefits of adopting different balances of detail for providing useful information.
So how could modeling systems be engineered to enable the methodological improvements described? One of the first steps will be to develop new, or evolve the existing, online repositories for model components, whole models, driver and assessment data, and model outputs (e.g., the Earth System Grid; Williams et al. 2009). Such repositories should enable access to components independently from their original parent models so that other research groups can assess the implications of alternative formulations for that component. Similar assessments could be made in relation to the driver and assessment datasets as well as model structures. To facilitate the exchange of model components, future components could be developed in such a way that facilitates their use within alternative model structures. This strategy was recently employed by Smith et al. (2013) to develop a global terrestrial carbon model with the intention of facilitating investigations into the costs and benefits of alternative model components and formulations for predicting global terrestrial carbon. In that study the modeling framework included code libraries that enabled model components to obtain the data they require to produce outputs from online databases, from local computers, or from other model components, depending on the structural information specified. This facilitated rapid experimentation with a wide variety of model structures.
Investigations into the effects of alternative model formulations will also benefit from adopting conventions for the description of models—their structures, components, and use histories (Dunlap et al. 2008). This should also help to minimize or eliminate reducible sources of uncertainty associated with the technical implementation of models. Uncertainty in model projections also exists because of differences in datasets, algorithms, methods, models, and simulation architectures used by different research groups. The importance of these specific details cannot practically be assessed among very different models (though see Knight et al. 2007). Thus, any new approach will benefit from enabling scientists anywhere to access pools of models and datasets and verify whether or not a change was an improvement (by various measures of performance).
The comparison of different model structures and component formulations could also be aided by the adoption of programming languages that make it easier for the intentions of the code to be understood. Functional modeling languages (Pedersen and Phillips 2009) allow for succinct and functional descriptions of models. This would not only aid in conveying the intended purpose of the code but would also aid the translation of the same underlying model to different coding languages (e.g., FORTRAN versus C++). This is one promising way of allowing interoperability between the components of models written by different institutions, given that some resistance to initiatives to adopt standards in model development is inevitable. Enabling models to work with data, parameters, and predictions as probability distributions, just as naturally as they work with constants today, would also greatly facilitate modeling with uncertainty. Probabilistic programming languages are a relatively recent area of research and development aimed at facilitating the use of probability distributions and machine learning in general applications (Bishop 2013). Their application in Earth system modeling could simplify the process of computing with probability distributions. Recent developments in functional probabilistic programming languages could enable modelers to combine the benefits of both functional and probabilistic programming languages (Bhat et al. 2013). Enabling the continual quantification, storage, and retrieval of uncertainty associated with model components and projections will also require much more computer memory; this could be facilitated with online data storage and retrieval capabilities.
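To make concrete what it means for a model to treat distributions "as naturally as constants," the following sketch represents an uncertain quantity by a set of Monte Carlo samples and overloads arithmetic so that the same model function accepts either a plain number or a distribution. It is an illustration only, not one of the probabilistic programming languages cited above; the `Uncertain` class, the `warming` function, and the parameter values are hypothetical.

```python
import random
import statistics


class Uncertain:
    """An uncertain quantity represented by Monte Carlo samples."""

    def __init__(self, samples):
        self.samples = list(samples)

    @classmethod
    def normal(cls, mean, sd, n=10000, seed=0):
        rng = random.Random(seed)  # seeded so the sketch is reproducible
        return cls(rng.gauss(mean, sd) for _ in range(n))

    def _combine(self, other, op):
        if isinstance(other, Uncertain):
            # Samples are paired element by element, so two quantities are
            # treated as independent only if drawn with different seeds.
            if len(other.samples) != len(self.samples):
                raise ValueError("sample sets must have equal length")
            return Uncertain(op(a, b) for a, b in zip(self.samples, other.samples))
        return Uncertain(op(a, other) for a in self.samples)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    def __mul__(self, other):
        return self._combine(other, lambda a, b: a * b)

    __radd__ = __add__
    __rmul__ = __mul__

    def mean(self):
        return statistics.fmean(self.samples)

    def interval(self, level=0.9):
        """Central interval containing roughly `level` of the samples."""
        ordered = sorted(self.samples)
        tail = (1.0 - level) / 2.0
        lower = ordered[int(tail * (len(ordered) - 1))]
        upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
        return lower, upper


def warming(sensitivity, co2_doublings):
    """Toy model: the same code runs with constants or distributions."""
    return sensitivity * co2_doublings


point_estimate = warming(3.0, 0.5)  # plain numbers in, plain number out
# Illustrative prior only; the mean and spread are not taken from any study.
uncertain_estimate = warming(Uncertain.normal(3.0, 0.5, seed=1), 0.5)
lower, upper = uncertain_estimate.interval(0.9)
```

A sample-based representation is memory hungry, since every uncertain value carries thousands of numbers rather than one, which is one simple way to see why routinely storing uncertainty alongside projections raises storage demands.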
One of the benefits of adopting a hierarchical approach to defining the relationships amongst model components is that it should facilitate model reconstruction from simple representations to avoid becoming locked into one model or modeling approach. Ideally, model developers would be able to identify all top-level components currently known to be relevant to a particular set of phenomena (one of which, for example, might be the sea level in 100 years' time) and then, starting from the simplest possible representations of each of these, critically assess and reassess the adequacy of the level of detail used to model them. Further details, in terms of new model components (Fig. 1), would be added if justifiable. This approach to model development would not only lead to better predictions for less computing time, but also tend to check the sociological imbalances inherent to current Earth system science, helping to direct intellectual effort and scientific funding toward those components that are the least understood and most useful in relation to the phenomena being targeted for prediction.
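The procedure of starting from the simplest representation and adding detail only where justified can be sketched as a walk up a hierarchy of candidate models. The version below uses a deliberately crude score (mean squared error against observations plus a fixed penalty per parameter) as a stand-in for a proper assessment of costs and benefits. The function names, the penalty, the prefitted toy models, and the observations are all hypothetical.

```python
def mean_squared_error(model, observations):
    """Average squared misfit of `model` over (x, y) observation pairs."""
    return sum((model(x) - y) ** 2 for x, y in observations) / len(observations)


def select_level_of_detail(hierarchy, observations, penalty_per_parameter):
    """Walk a hierarchy ordered from simplest to most detailed.

    `hierarchy` is a list of (name, model, n_parameters). Extra detail is
    kept only while it lowers the penalized score; the walk stops at the
    first level where added detail is not justified.
    """
    best_name, best_score = None, None
    for name, model, n_parameters in hierarchy:
        score = (mean_squared_error(model, observations)
                 + penalty_per_parameter * n_parameters)
        if best_score is None or score < best_score:
            best_name, best_score = name, score
        else:
            break  # the added detail costs more than it gains
    return best_name, best_score


# Illustrative observations lying exactly on y = 2x.
observations = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Candidate representations, assumed to have been fitted already.
hierarchy = [
    ("constant", lambda x: 3.0, 1),
    ("linear", lambda x: 2.0 * x, 2),
    ("quadratic", lambda x: 2.0 * x + 0.0 * x ** 2, 3),
]

chosen, score = select_level_of_detail(hierarchy, observations, 0.5)
```

Here the quadratic term fits no better than the linear model, so its extra parameter is rejected. In practice the score would need to reflect predictive skill on independent data and the real computational cost of each component, rather than a fixed per-parameter penalty.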
Any new modeling approach that is to deliver informative projections to users on demand and operate within formal or informal decision-making frameworks will also require that researchers be able to specify, combine, and compare projections from multiple models (some perhaps projecting on demand and others obtained from archives) to suit their analyses (Bretherton et al. 2012; Weaver et al. 2013). Such systems should be designed to allow a much broader community of experts to contribute to model development and use, including some who have had little influence on model development to date. They should also permit the coproduction of new climate information by climate modelers, domain experts, and decision makers, so that a balance can be struck between providing the information that decision makers want and the information that scientists think decision makers need to know (Lemos and Rood 2012).
It is now time to build on the wealth of modes of abstraction of the Earth system developed so far, on the wealth of data in existence, and on advances in computation and statistics to construct climate models that deliver much more predictive information for users. A key step toward this is to enable models to be built that include much more robust estimates of uncertainty, which in turn would guide where scientific and computational resources need to be directed in order to reduce uncertainties further. Combining adaptive hierarchical modeling frameworks with assessments of the uncertainty in model formulations and projections will enable much better targeted explorations of model-detail space and allow urgent questions to be answered in a much more timely and reliable way.
We thank Suraje Dessai, Julia Slingo, Tim Palmer, Beth Fulton, Peter Cox, Colin Prentice, Rachel Warren, Leonard Smith, Trevor Keenan, Yiqi Luo, Michael White, David Stainforth, Doug McNeall, and one anonymous reviewer for valuable discussions that helped us develop the ideas we present in this manuscript.