1. Introduction
Uncertainty in climate change prediction arises from three sources: model uncertainty, scenario uncertainty, and internal variability. Model uncertainty arises from an incomplete understanding of the physical processes and from limitations in how that understanding is implemented in models. Scenario uncertainty arises from incomplete information about future emissions. Internal variability is the natural unforced fluctuation of the climate system; it is aleatoric and cannot be reduced by improving our scientific knowledge. However, Smith et al. (2007) demonstrated that properly initializing climate predictions with observations can reduce the uncertainty for the next decade. An obvious exploratory way to evaluate the total of these uncertainties is to calculate the spread of a multimodel ensemble. However, further statistical analysis is needed to quantify the contributions of particular sources of uncertainty and to describe how a particular model reacts to a particular emissions scenario.
Various methods to decompose the total uncertainty into its sources have been suggested in climate science. Cox and Stephenson (2007) propose a conceptual framework for this purpose and illustrate its use with a single energy balance model. Hawkins and Sutton (2009, hereafter HS09) and Hawkins and Sutton (2011) fit polynomial trend models over time and calculate various sources of uncertainty. These studies offer simple interpretations of uncertainty, but the drawback is that the total uncertainty cannot be easily interpreted.
We use the analysis of variance (ANOVA) to decompose sources of uncertainty [for a complete review, see Von Storch and Zwiers (2001, chapter 9)]. ANOVA is a model-based approach that partitions the total variance into components due to different sources of variation, allowing a fuller interpretation. The seminal work of Madden (1976) suggests an ANOVA approach to test for the likelihood of potentially predictable long-range variability. Formal ANOVA models are used in Zwiers (1987, 1996) for analyzing seasonal observations and an ensemble of climate simulations, respectively, and are discussed from a statistical point of view. Several previous papers use ANOVA extensively for evaluating model uncertainty from ensembles. Räisänen (2001) uses ANOVA to divide the surface air temperature (SAT), precipitation, and sea level pressure change into a common signal and variances associated with internal variability and model differences under the same forcing scenario. Hingray et al. (2007) use ANOVA to estimate the uncertainty in temperature and precipitation for a collection of atmosphere–ocean general circulation models (AOGCMs).
The remainder of this paper is organized as follows. Section 2 describes the available data and compares the methods of HS09 and ANOVA. We illustrate the results in section 3 and present our discussion and conclusions in section 4.
2. Data and methodologies
a. CMIP3 data
Following HS09, we illustrate the methodology by applying it to global, decadal mean SAT multimodel ensemble predictions for the years 2001–99 from the Coupled Model Intercomparison Project phase 3 (CMIP3) archive. The multimodel ensemble data are converted from the original monthly data to decadal-scale data using a 10-yr moving average (Table 1). The predictions are from Nm = 7 general circulation models (GCMs) under Ns = 3 different future emissions scenarios [Special Report on Emissions Scenarios (SRES) A1B, A2, and B1] with Nr = 2 initial condition ensemble members for each model and scenario. This gives a total of Nm × Ns × Nr = 42 predictions for 2001–99. These future scenarios are summarized in Solomon et al. (2007, chapter 10). We use fewer GCMs than other studies, such as Boer (2009) and HS09, because only seven GCMs have simulated all three scenarios with at least two ensemble runs. Figure 1 shows time series of all the simulation runs for all models, scenarios, and replicates. The total uncertainty increases with time because the simulation runs diverge with time.
Table 1. Climate model data used in this study are obtained from the CMIP3 archive for years 2001–99 for the A1B, B1, and A2 scenarios for GCMs with more than one ensemble member.
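As a rough illustration of the preprocessing described above, the following sketch applies a 10-value running mean to a single synthetic annual series and checks the ensemble size quoted in the text. The function name `decadal_running_mean`, the linear trend, and the use of annual rather than monthly input are assumptions made for brevity; they are not taken from the CMIP3 processing itself.

```python
import numpy as np

def decadal_running_mean(annual, window=10):
    """Return the running mean of a 1-D series over `window` consecutive values."""
    annual = np.asarray(annual, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that lie fully inside the series
    return np.convolve(annual, kernel, mode="valid")

# Ensemble size from the text: 7 models x 3 scenarios x 2 members
n_models, n_scenarios, n_members = 7, 3, 2
n_runs = n_models * n_scenarios * n_members

# Synthetic annual series for 2001-2099 (not CMIP3 output): a linear trend
years = np.arange(2001, 2100)
annual_sat = 14.0 + 0.02 * (years - 2001)
decadal_sat = decadal_running_mean(annual_sat)

print(n_runs, decadal_sat.shape)
```

With 99 annual values and a 10-value window, only 90 complete windows exist, so the smoothed series is shorter than the input.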



Fig. 1. Global, annual mean SAT prediction from seven different GCMs under three different emission scenarios from 2001 to 2099. Two ensemble members are shown here per model per emission scenario.
Citation: Journal of Climate 24, 17; 10.1175/2011JCLI4085.1

b. Methodologies used in previous studies


















c. Analysis of variance method


The notion of interaction is an important concept in ANOVA. Mathematically, interaction is said to occur if the separate effects do not combine additively (Berrington de González and Cox 2007). For climate projections, it arises because models react differently to emission scenarios, which we call scenario-dependent model uncertainty as opposed to scenario-independent model uncertainty. To demonstrate that there is a potential interaction term, we show the mean response at different lead times to different emission scenarios for each model in Fig. 2. The lines are not parallel, indicating an apparent interaction between some models and scenarios, especially for long lead times. For example, for the years 2061–99, the effect of changing emission scenario is different for CGCM3.1 (refer to Table 1 for model name expansions) than for MIUBECHO, as the lines for these two models cross.

Fig. 2. Interaction plots show the means of the SAT for the years 2001–30, 2031–60, 2061–99, and 2001–99 against emission scenarios for all the models. Symbols defined as C (CGCM3.1), M [MIROC3.2(medres)], E (MIUBECHO), H (ECHAM5/MPI-OM), G (MRI CGCM2.3.2), S (CCSM3), and P (PCM). Interaction between model and scenario is present where the lines are not parallel.
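The parallel-lines criterion for interaction can also be checked numerically: if model and scenario effects combine additively, every model–scenario mean equals the grand mean plus a model effect plus a scenario effect, and the residuals from that additive fit vanish. The sketch below is a minimal illustration with made-up numbers; the function name `interaction_residuals` and the two small tables are hypothetical and are not values from the ensemble.

```python
import numpy as np

def interaction_residuals(cell_means):
    """Departure of each model-scenario mean from a purely additive fit."""
    cell_means = np.asarray(cell_means, dtype=float)
    grand = cell_means.mean()
    model_effect = cell_means.mean(axis=1, keepdims=True) - grand
    scenario_effect = cell_means.mean(axis=0, keepdims=True) - grand
    return cell_means - (grand + model_effect + scenario_effect)

# Synthetic means (K): rows = 2 models, columns = 3 scenarios
parallel = np.array([[1.0, 2.0, 3.0],
                     [1.5, 2.5, 3.5]])   # constant offset: lines are parallel
crossing = np.array([[1.0, 2.0, 3.0],
                     [2.0, 2.0, 2.0]])   # second model ignores the scenario

print(np.allclose(interaction_residuals(parallel), 0.0))  # True
print(np.allclose(interaction_residuals(crossing), 0.0))  # False
```

Nonzero residuals correspond to lines that are not parallel in an interaction plot.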





















Since the multimodel ensembles used here are “ensembles of opportunity” (Tebaldi and Knutti 2007), we do not think of the available models and scenarios as a sample from a wider population of possible models and scenarios. If a wider population could be envisaged from which the ensemble members form a sample, then our ANOVA model could be adapted to include so-called random effects that would enable inferences about the population (Eisenhart 1947). The simulation runs under the same model–scenario combination are treated as samples drawn from a finite population. However, it is also appropriate to interpret internal variability as independent realizations from an infinite population. Under that interpretation our estimate V(t) underestimates the internal variability by 50% when n = 2, because a finite-population variance divides by n rather than n − 1. A possible solution in that case is to use a random effects model (e.g., Gelman 2005), but we retain simplicity by assuming a finite population.
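To make the decomposition concrete, the sketch below implements a standard balanced two-way fixed-effects ANOVA with interaction at a single lead time. It is an assumption that this matches the estimators used here term by term: the function name `anova_variances`, the effect names `beta` and `gamma`, and the random test data are illustrative only, while M, S, I, and V follow the symbols used in the text. All variances divide by the number of terms (the finite-population convention), which is what makes the four components sum exactly to the total variance.

```python
import numpy as np

def anova_variances(x):
    """Split the variance of x[model, scenario, member] into M, S, I and V.

    Every variance divides by the number of terms (finite-population
    convention), so the four components sum exactly to np.var(x).
    """
    x = np.asarray(x, dtype=float)
    grand = x.mean()
    cell = x.mean(axis=2)                      # model-scenario means
    alpha = cell.mean(axis=1) - grand          # model effects
    beta = cell.mean(axis=0) - grand           # scenario effects
    gamma = cell - grand - alpha[:, None] - beta[None, :]   # interaction
    resid = x - cell[:, :, None]               # member departures from cell mean
    return {
        "M": np.mean(alpha**2),
        "S": np.mean(beta**2),
        "I": np.mean(gamma**2),
        "V": np.mean(resid**2),
    }

rng = np.random.default_rng(0)
x = rng.normal(size=(7, 3, 2))   # synthetic stand-in for one lead time
parts = anova_variances(x)
total = sum(parts.values())

# With two members per cell, dividing by n instead of n - 1 halves the estimate
v_unbiased = x.var(axis=2, ddof=1).mean()

print(np.isclose(total, np.var(x)), np.isclose(parts["V"], 0.5 * v_unbiased))
```

The last comparison reproduces the 50% figure quoted above for n = 2: the divide-by-n estimate of internal variability is exactly half of the divide-by-(n − 1) estimate.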
d. Connection to the estimates in HS09


3. Results
We now use the ANOVA approach described above to quantify the uncertainty in global mean and decadal mean SAT in a subset of the CMIP3 climate projections. Figure 3a shows how the different variance components vary with lead time. The scenario uncertainty [S(t), thick dashed line] dominates the total uncertainty after the year 2050, which agrees with HS09. Over the entire period, model uncertainty [M(t), thick solid line] is also greater than the internal variability [V(t), thin solid line], which itself is roughly constant over time (as assumed by HS09). The model–scenario interaction variance [I(t), thin dashed line] increases from less than 1 × 10^−3 to 2.5 × 10^−2 K^2 and becomes larger than the internal variability component, demonstrating that interaction is an important component of uncertainty (see Fig. 3b).

Fig. 3. Uncertainty in global, decadal mean surface temperature projections, shown as variances. (a) All four components of uncertainty: scenario-independent model uncertainty (thick solid line), scenario uncertainty (thick dashed line), internal variability (thin solid line), and model–scenario interaction uncertainty (thin dashed line). (b) Contribution of internal variability and the model–scenario interaction effect variance, i.e., internal variability (thin solid line) and model–scenario interaction uncertainty (thin dashed line). Internal variability from an ANOVA model without an interaction term (thick solid line) is also superposed in the same diagram.
Figure 4 compares the fractions of uncertainty obtained from the HS09 methodology and from ANOVA. With our ANOVA approach, the scenario uncertainty dominates the total uncertainty after the year 2050. In the ANOVA method, the fraction of variance due to internal variability decreases rapidly in the first few decades, which may be due to a random fluctuation in the internal variability V(t) in the first decade. Meanwhile, scenario uncertainty contributes slightly less in the ANOVA method for about the first 30 years. The model uncertainty dominates in the first few decades and has similar values in HS09 and the ANOVA approach. The model–scenario interaction variance increases from a small value and saturates at about 5% of the total after two decades.

Fig. 4. Comparison of the fraction of variance for global, decadal mean SAT using (a) HS09 methodology and (b) the ANOVA-based approach.
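The fractions compared in Fig. 4 are each variance component divided by the total at the same lead time. A minimal sketch of that arithmetic follows; the function name `variance_fractions` is hypothetical and the numbers are made up purely to show the calculation, not read off the figure.

```python
import numpy as np

def variance_fractions(components):
    """Convert variance components (arrays over lead time) to fractions of the total."""
    stacked = np.vstack([np.asarray(v, dtype=float) for v in components.values()])
    fractions = stacked / stacked.sum(axis=0)
    return dict(zip(components.keys(), fractions))

# Made-up variances (K^2) at three lead times, for illustration only
components = {
    "M": [0.02, 0.04, 0.06],
    "S": [0.00, 0.04, 0.20],
    "I": [0.00, 0.01, 0.02],
    "V": [0.02, 0.01, 0.02],
}
frac = variance_fractions(components)

# Fractions at each lead time sum to one
print(np.allclose(sum(frac.values()), 1.0))
```

Because the ANOVA components add up to the total variance, these fractions always sum to one, which is what allows them to be read as shares of the total uncertainty.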
Figure 5 shows how the various fitted ANOVA parameters evolve over time. In Fig. 5a, the separation between the model effects α is seen to increase with time, and some of the models, such as ECHAM5/MPI-OM and PCM, contribute more to the model uncertainty than others. Some of these models (such as PCM) also show systematic changes in their mean deviations. Such a pattern may be attributable to the fact that the PCM simulations in the historical period are not continuous with those in the twenty-first-century SRES simulations. In Fig. 5b, SRES A2 overtakes SRES A1B in the year 2070, and they have a 0.5-K separation in the year 2100. This separation can be understood as the response to the socioeconomic differences between the emission scenarios and also to how the models treat forcings, such as aerosols, differently. A plot of the interaction term (Fig. 5c) is helpful for understanding the contribution of model–scenario interaction, which increases with lead time.

Fig. 5. Time series plots of the fitted terms: (a)–(c) estimates
Figure 6 explores the interaction effect in more detail and shows that it varies widely between models. For example, the variation across scenarios is greatest for CCSM3, which is relatively cool in SRES B1 and relatively warm in A2.

Fig. 6. Time series plots of the fitted model–scenario interaction effects
4. Discussion and conclusions
a. Methodology
We introduce a simple, coherent approach for the modeling of uncertainty in multimodel ensembles. The sources of uncertainty are estimated from the ANOVA model and add up to give the total variance, which is a natural measure of global uncertainty. In contrast to the uncertainty decompositions constructed in previous studies, this approach does not need to specify a particular type of trend and noise distribution and does not assume constant internal variability over time. The ANOVA approach is a powerful way to quantify sources of uncertainty, and the results are often easy to interpret. It is easy to summarize the structure of all ensemble members under different scenarios with simple exploratory techniques. Another important feature is interaction. In this framework, model–scenario interaction is defined as a form of nonconstancy of variance across scenarios in different models. The framework supports the decomposition of model uncertainty into a term that measures the uncertainty due to variation between scenario-independent model deviations and an interaction term that measures the uncertainty due to variation between scenario-dependent model deviations. Ignoring a significant interaction term in the analysis would substantially change the interpretation of the data. Together with some exploratory data analysis techniques, the framework offers a more detailed interpretation of uncertainty and of the contribution from a particular ensemble member.
b. Scientific interpretation
Our results for uncertainty in the global and decadal mean temperature change broadly agree with previous studies; however, some details are different, especially for short lead times. Here are some important findings:
1) Scenario uncertainty, conditional on the choice of scenarios, is of the greatest importance after the year 2050.
2) Internal variability is roughly constant over time but decreases rapidly as a fraction of the total.
3) Uncertainty from scenario-independent model deviations dominates uncertainty from scenario-dependent model deviations over the entire study period.
4) The model–scenario interaction effect is an important contribution to uncertainty, especially at long lead times.
The first finding is fully in agreement with previous papers, such as HS09 and Cox and Stephenson (2007). The second finding is an assumption in HS09 and is now validated using the ANOVA framework. The latter two findings are more closely tied to the presence of a significant interaction term. There are several possible reasons why certain models have large interaction terms. The most likely reason is that the same forcings are treated differently across the range of models (e.g., Kiehl 2007). However, even if two models treat a forcing in exactly the same way, there could still be a contribution to the interaction if the models respond differently to the forcing, in other words, if the models have different effective climate sensitivities.
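The role of differing sensitivities can be seen in a deliberately idealized case. Suppose, purely as an assumption for illustration, that model m responds linearly to a common scenario forcing F_s with sensitivity λ_m and no internal variability, so that x_{ms} = λ_m F_s. The symbols λ_m, F_s, β_s, and γ_{ms} are introduced here and are not taken from the analysis above; the decomposition is the standard additive one with overbars denoting averages over models or scenarios.

```latex
\begin{align*}
x_{ms} &= \lambda_m F_s, \qquad \mu = \bar{\lambda}\,\bar{F}, \\
\alpha_m &= \bar{x}_{m\cdot} - \mu = (\lambda_m - \bar{\lambda})\,\bar{F}, \\
\beta_s &= \bar{x}_{\cdot s} - \mu = \bar{\lambda}\,(F_s - \bar{F}), \\
\gamma_{ms} &= x_{ms} - \mu - \alpha_m - \beta_s
             = (\lambda_m - \bar{\lambda})(F_s - \bar{F}).
\end{align*}
```

Under these assumptions the interaction term vanishes only if all models share the same sensitivity or all scenarios share the same forcing, and it grows as the scenario forcings diverge, which is consistent with the interaction becoming more important at long lead times.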
c. Future work
The ANOVA approach, because of its simplicity, is a good starting point for tackling more complex problems in attributing uncertainty from multimodel ensembles. Apart from global mean temperature, it is also interesting to investigate uncertainty for different space–time scales and other meteorological fields, such as precipitation and stratospheric ozone (Hawkins and Sutton 2011; Charlton-Perez et al. 2010). It is possible to extend this approach to a more general class of models. These extensions are not currently common in the climate science community, but they have been used extensively in areas such as biology, epidemiology, and financial modeling. For example, an obvious extension to climate science is a multivariate ANOVA (MANOVA; see details in Press 1972, chapter 8), which incorporates the relationship between different atmospheric fields. This is particularly useful because atmospheric fields are often correlated, and a separate analysis of fields such as temperature and precipitation may lead to repeated use of data. For epidemiology applications, Zhang et al. (2009) developed smoothed ANOVA (SANOVA) to smooth spatial random effects by taking advantage of the spatial variation.
Acknowledgments
S. Yip and E. Hawkins are funded by NERC National Centre for Atmospheric Science (NCAS). We thank Mat Collins and three anonymous reviewers for their thoughtful comments, which helped improve the paper. We also thank the modeling groups, the Program for Climate Model Diagnosis and Intercomparison, and the WCRP’s Working Group on Coupled Modelling for making the CMIP3 multimodel dataset freely available (https://esg.llnl.gov:8443/index.jsp). Support of this dataset is provided by the Office of Science, U.S. Department of Energy.
REFERENCES
Berrington de González, A., and D. Cox, 2007: Interpretation of interaction: A review. Ann. Appl. Stat., 1, 371–385.
Boer, G., 2009: Changes in interannual variability and decadal potential predictability under global warming. J. Climate, 22, 3098–3109.
Charlton-Perez, A. J., and Coauthors, 2010: The potential to narrow uncertainty in projections of stratospheric ozone over the 21st century. Atmos. Chem. Phys., 10, 9473–9486.
Cox, P., and D. Stephenson, 2007: A changing climate for prediction. Science, 317, 207–208.
Eisenhart, C., 1947: The assumptions underlying the analysis of variance. Biometrics, 3, 1–21.
Gelman, A., 2005: Analysis of variance: Why it is more important than ever? Ann. Stat., 33, 1–31.
Hawkins, E., and R. Sutton, 2009: The potential to narrow uncertainty in regional climate predictions. Bull. Amer. Meteor. Soc., 90, 1095–1107.
Hawkins, E., and R. Sutton, 2011: The potential to narrow uncertainty in projections of regional precipitation change. Climate Dyn., 37, 407–418, doi:10.1007/s00382-010-0810-6.
Hingray, B., A. Mezghani, and T. Buishand, 2007: Development of probability distributions for regional climate change from uncertain global mean warming and an uncertain scaling relationship. Hydrol. Earth Syst. Sci., 11, 1097–1114.
Kiehl, J., 2007: Twentieth century climate model response and climate sensitivity. Geophys. Res. Lett., 34, L22710, doi:10.1029/2007GL031383.
Madden, R., 1976: Estimates of the natural variability of time-averaged sea-level pressure. Mon. Wea. Rev., 104, 942–952.
Press, S. J., 1972: Applied Multivariate Analysis. Holt, Rinehart and Winston, 521 pp.
Räisänen, J., 2001: CO2-induced climate change in CMIP2 experiments: Quantification of agreement and role of internal variability. J. Climate, 14, 2088–2104.
Smith, D. M., S. Cusack, A. W. Colman, C. K. Folland, G. R. Harris, and J. M. Murphy, 2007: Improved surface temperature prediction for the coming decade from a global climate model. Science, 317, 796–799.
Solomon, S., D. Qin, M. Manning, M. Marquis, K. Averyt, M. M. B. Tignor, H. L. Miller Jr., and Z. Chen, Eds., 2007: Climate Change 2007: The Physical Science Basis. Cambridge University Press, 996 pp.
Tebaldi, C., and R. Knutti, 2007: The use of the multi-model ensemble in probabilistic climate projections. Philos. Trans. Roy. Soc. London, A365, 2053–2075.
Von Storch, H., and F. Zwiers, 2001: Statistical Analysis in Climate Research. Cambridge University Press, 484 pp.
Zhang, Y., J. S. Hodges, and S. Banerjee, 2009: Smoothed ANOVA with spatial effects as a competitor to MCAR in multivariate spatial smoothing. Ann. Appl. Stat., 3, 1805–1830.
Zwiers, F., 1987: A potential predictability study conducted with an atmospheric general circulation model. Mon. Wea. Rev., 115, 2957–2974.
Zwiers, F., 1996: Interannual variability and predictability in an ensemble of AMIP climate simulations conducted with the CCC GCM2. Climate Dyn., 12, 825–847.