## Abstract

In recent years, climate model experiments have been increasingly oriented toward providing information that can support local and regional adaptation to the expected impacts of anthropogenic climate change. This shift has magnified the importance of downscaling as a means to translate coarse-scale global climate model (GCM) output to a finer scale that more closely matches the scale of interest. Applying this technique, however, introduces a new source of uncertainty into any resulting climate model ensemble. Here a method is presented, built on a previously established variance decomposition method, to partition and quantify the uncertainty in climate model ensembles that is attributable to downscaling. The method is applied to the southeastern United States using five downscaled datasets that represent both statistical and dynamical downscaling techniques. The combined ensemble is highly fragmented, in that only a small portion of the complete set of downscaled GCMs and emission scenarios is typically available. The results indicate that the uncertainty attributable to downscaling approaches ~20% of the total for large areas of the Southeast for precipitation and ~30% for extreme heat days (>35°C) in the Appalachian Mountains. However, the uncertainty attributable to downscaling is significantly lower for time periods in which only a subsample of the full ensemble of downscaled GCMs and emission scenarios is available, suggesting that overconfidence could be a serious problem in studies that employ a single set of downscaled GCMs. This article concludes with recommendations to advance the design of climate model experiments so that the uncertainty that accrues when downscaling is employed is more fully and systematically considered.

## 1. Introduction

In the past, ensembles of global climate models (GCMs) have been used to predict the response of the earth’s climate to different trajectories of anthropogenic greenhouse gas emissions to help inform climate change mitigation and adaptation strategies. The agreement made at the United Nations Conference on Climate Change in 2015 is a recent example of such policy decisions where GCMs are employed to help inform a decision-making process (Tribett et al. 2017). But both the developers of these ensembles and the decision-makers who seek to use their output have long recognized that a careful consideration of the uncertainty in the resulting climate change projections is necessary to properly frame and assess the costs, benefits, and risks associated with rising greenhouse gas emissions and any proposed abatement policies. As such, three sources of uncertainty are typically identified from these ensembles of GCMs (Hawkins and Sutton 2009; Gettelman and Rood 2016): scenario uncertainty, model uncertainty, and natural variability.

More recently, climate modeling has been drafted into service to provide guidance for local and regional adaptation decisions. This new use requires a finer resolution of information because of the myriad physical processes, feedbacks, and features that are either parameterized or nonexistent in GCMs and can affect how the global anthropogenic forcing manifests at the local scale (and therefore could affect decision-maker responses). This has led to the use of downscaling to capture the key local- and regional-scale processes that cannot be represented by a GCM (e.g., Wootten et al. 2014; Gettelman and Rood 2016). And yet, this inclusion of downscaling in the development of climate model ensembles also represents an added source of model uncertainty. This is important because impact assessment studies typically use either dynamically downscaled projections (e.g., Argüeso et al. 2015; Koutroulis et al. 2015) or statistically downscaled projections (e.g., Basso et al. 2015; Parmesan et al. 2015; Werth and Chen 2015), but rarely both. Within these two distinct approaches to downscaling, uncertainty arises in the resulting ensemble for different reasons. In dynamically downscaled ensembles, structural uncertainty arises from the choice of regional and global climate model pairing, while in statistically downscaled ensembles, the statistical technique [bias correction and spatial downscaling (BCSD), bias-corrected constructed analogs (BCCA), localized constructed analogs (LOCA), multivariate adaptive constructed analogs (MACA), etc.] or training data will vary from one dataset to another, all of which can contribute to the total downscaling uncertainty. 
But while significant attention has been devoted to characterizing and quantifying the uncertainty due to the choice of scenario, GCM, or parameterization, to our knowledge no study has yet assessed whether relying on only one of the two dominant downscaling techniques causes a significant truncation of the overall uncertainty estimate in climate model ensembles. Given the popularity of downscaling for use in impact assessments and adaptation planning, such a result would suggest there is a high risk of overconfident decision-making that would reduce the utility of this information. In other words, are impact assessments that only employ one type of downscaling overconfident because they are ignoring the superstructure of uncertainty that exists across the chosen and nonchosen downscaling techniques?

Studies addressing downscaling as a dominant source of uncertainty have led to mixed conclusions. Chen et al. (2011) used six downscaling techniques and 28 combinations of GCM and emissions scenarios to simulate river discharges for the Manicouagan River basin (45 800 km^{2}) in Canada. The authors estimated that the ensemble uncertainty that was due to downscaling was comparable to the uncertainty attributable to GCMs and emissions scenarios for a single river basin and an impact variable (discharge). Likewise, Mandal et al. (2016) used six statistical downscaling techniques with four GCMs and four emissions scenarios to estimate the uncertainty related to precipitation projections in the Campbell River basin (1856 km^{2}) in Canada. In this case downscaling was found to be the largest source of uncertainty in the ensemble. However, the small geographic domain, small member ensemble, single downscaling type (statistical), and potential commingling of uncertainty due to downscaling with that due to natural variability potentially limit the applicability of these conclusions. Pourmokhtarian et al. (2016) used three statistical downscaling techniques, four GCMs, and two emissions scenarios to generate output for a forest biogeochemical model applied in the Hubbard Brook Experimental Forest (0.13 km^{2}) in New Hampshire. The investigators concluded that the statistical downscaling technique was an important source of uncertainty for variables from the biogeochemical model but that the observations used to train the statistical downscaling are also important. Dobler et al. (2012) estimated that the uncertainties from GCMs and regional climate models (RCMs) were comparable to each other when evaluating the results from a suite of hydrological models for use in a small watershed (~1000 km^{2}) in the Northern Limestone Alps. Kay et al. (2009) focused on sources of uncertainty from climate projections related to flood frequency for catchments in England (point locations). This was one of the few studies to consider multiple types of downscaling (both statistical and dynamical), and it concluded that the difference between GCMs, rather than downscaling, is the dominant source of uncertainty related to flood frequency.

These studies reveal several common threads that limit their ability to answer the question of how critical downscaling is as a source of uncertainty in climate model ensembles. First there is a tendency to focus on impact variables (in hydrology or ecology) rather than on the climate variables themselves. The relationships between climate variables and impact variables are often nonlinear (e.g., Jin et al. 2005); therefore, prior results that hold for impact variables may not hold for projections of the climate variables themselves. Second, prior literature tends to focus on small regions (such as river basins) rather than larger regions (such as the southeastern United States). While some statistical downscaling techniques may work well in areas with complex topography, other techniques may be designed to more accurately model coastal interaction processes (Wootten et al. 2014). And in contrast, dynamical downscaling approaches use different numerical modeling (such as finite differencing, finite volume, or a spectral representation) to capture the same physical processes. This implies that the conclusions drawn from these small-domain studies may not be transferable to other regions because the uncertainty from downscaling is likely not the same everywhere. Finally, most of these studies employ only one of the two types of downscaling (statistical or dynamical), which implies that the uncertainty from downscaling will be underrepresented. This study characterizes the uncertainty that results from downscaling global climate models using an extension of a prior methodology developed by Hawkins and Sutton (2009) for use with GCM ensembles. This extension is applied to an ensemble of climate models representing multiple types of downscaling techniques for a domain covering the Southeast (~3 484 800 km^{2}).

To characterize the contribution of sources of uncertainty from climate model ensembles, Hawkins and Sutton (2009, hereinafter HS09) introduced a variance decomposition method to quantify the proportion of total uncertainty that is attributable to natural variability, model (structural) uncertainty, and scenario uncertainty (i.e., uncertainty about the future pathway of human greenhouse gas emissions). The HS09 approach decomposes the ensemble variance by partitioning the residuals from a polynomial model fit to the projected anomalies for each combination of GCM and emissions scenario. The original HS09 approach was developed specifically for use with GCM projections and implemented in the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC; Kirtman et al. 2013). The HS09 approach is a simple way to characterize the contribution of different sources of uncertainty from GCM-based projections, but in its original incarnation, the method did not consider downscaling as an additional source of uncertainty. The main goal of this study is to characterize the contribution of downscaling as a source of uncertainty in climate model ensembles in comparison with the sources of uncertainty from GCMs. To achieve this, this study extends the HS09 approach to include downscaling by adjusting and advancing the variance decomposition method. This extension is then applied using multiple publicly available downscaled projections (both dynamic and statistical) to characterize the contribution from four sources of uncertainty across the Southeast.

In section 2, the HS09 approach is reviewed and the changes to HS09 to incorporate downscaling are described. In section 3, the datasets, experiments, and variables used for this analysis are discussed. In section 4, the results of the extended approach are applied to two experiments for the Southeast. In sections 5 and 6 the conclusions from the study and possible improvements to the approach are discussed.

## 2. Methods

The approach of HS09 was originally developed to characterize the contribution of three sources of uncertainty from GCM projections from phase 3 of the Coupled Model Intercomparison Project (CMIP3), and was also implemented with CMIP5 GCMs. In HS09, a fourth-order polynomial is fit to the yearly time series of absolute anomalies of temperature for each combination of GCM and emissions scenario from 1950 to 2099. Hawkins and Sutton (2011) extended HS09 for use with precipitation projections by using a percent anomaly instead of the absolute anomaly. The fitted values and residuals from this polynomial fit are used to perform a variance decomposition to capture the following sources of uncertainty:

- natural variability, defined as year-to-year changes not influenced directly by human activities (e.g., El Niño–Southern Oscillation or the Pacific decadal oscillation);
- scenario uncertainty, defined as uncertainty associated with human actions such as decisions about policy and levels of greenhouse gas emissions, the development and adoption of cleaner technology, and changes in the global population; and
- GCM model uncertainty (called model uncertainty in HS09), defined as differences between GCM construction, that is, different representation of scientific knowledge at large scales.

For a description of the equations used in the original methodology, refer to HS09. The sources are assumed to be independent, and the total uncertainty is the sum of the sources. The natural or internal unforced variability of the climate system is also assumed to be constant over time. Briefly, HS09 decompose these three sources of uncertainty as

$$
V = \frac{1}{N_m}\sum_{m} W_m \,\mathrm{Var}_{s,t}\!\left(\varepsilon_{m,s,t}\right),
$$

where *V* is the proportion of total uncertainty attributable to natural variability, *W*_{m} is the weight given to each GCM by the ability to estimate the anomaly for the year 2000 from the baseline period as given by observations, and *ε* is the residuals from a fourth-order polynomial fit for each GCM *m*, scenario *s*, and time *t*. The decomposition equations for the scenario uncertainty *S* and GCM model uncertainty *M* are

$$
S(t) = \mathrm{Var}_s\!\left(\frac{1}{N_m}\sum_{m} W_m\, x_{m,s,t}\right), \qquad
M(t) = \frac{1}{N_s}\sum_{s} \mathrm{Var}^{W}_{m}\!\left(x_{m,s,t}\right),
$$

where *x* is the fitted values from a fourth-order polynomial fit for each combination of GCM and emissions scenario in time, and *N*_{s} is the number of emissions scenarios. The variance (Var) is used to calculate the values of *V* and *S*, while the weighted variance (Var^{W}) is used to calculate the value of *M*. The weighted variance used by HS09 has also been used in numerous studies related to climate model evaluation, ensemble development, and uncertainty analyses. While not explicitly defined here, Haughton et al. (2015) provide a thorough definition of the weighted variance formula used in HS09 and in this study.
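To make the computation concrete, the weighted variance can be sketched as follows (an illustrative implementation only; the reliability-weight bias correction shown here is one common form and is assumed, not quoted verbatim from Haughton et al. 2015):

```python
import numpy as np

def weighted_variance(x, w):
    """Weighted variance of x under weights w (normalized to sum to 1),
    with a standard reliability-weight bias correction."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                          # normalize weights to sum to 1
    mu = np.sum(w * x)                       # weighted mean
    biased = np.sum(w * (x - mu) ** 2)       # biased weighted variance
    return biased / (1.0 - np.sum(w ** 2))   # bias correction

x = np.array([1.0, 2.0, 3.0, 4.0])
w = np.ones_like(x)
# With equal weights this reduces to the unbiased sample variance:
print(weighted_variance(x, w))  # ~1.667
```

With unequal weights, better-performing members pull the mean and variance toward themselves, which is the behavior the decomposition relies on.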

The fourth-order polynomial fit that yields *x* and *ε* is defined in HS09 as the following:

$$
X_{m,s,t} - i_{m,s} = x_{m,s,t} + \varepsilon_{m,s,t},
$$

where *X* is the raw simulated value for each model *m*, scenario *s*, and time *t*, *i* is the average of the simulated variable of interest over a 30-yr period, and *ε* is the residual of the fourth-order fit.

For projections from GCMs, HS09 provide a reasonable approach to characterizing these three sources of prediction uncertainty. However, extending this approach to downscaled projections requires two important considerations. First, the extension must accommodate multiple downscaling techniques of both types (statistical and dynamical); that is, it must include a decomposition equation for a source of uncertainty unique to downscaled projections, namely, downscaling uncertainty: differences between downscaling techniques, including differences among dynamical techniques, among statistical techniques, and between the dynamical and statistical families.

Second, the extension must account for the fragmented nature of downscaling as much as possible. This fragmented nature results from the fact that in practice, downscaling is typically performed in an independent and project-specific manner, without a common set of GCMs, emission scenarios, and downscaling techniques. For example, as seen in Table 1, each set of publicly available downscaled projections used in this study incorporates different GCMs and emissions scenarios. Each of these publicly released downscaled datasets is the product of a unique set of questions, requirements, and constraints that all influenced which GCMs, emission scenarios, time periods, and temporal resolutions were ultimately chosen. The resulting collection of publicly available downscaled projections is an incomplete and fragmented sample of the available combinations of downscaling techniques, GCMs, and emissions scenarios (Fig. 1). This fragmented sample causes the sum of the weights in HS09 to be less than one in situations where not all combinations exist, which biases the variance estimates; this must be addressed in any extension of the HS09 approach that includes downscaling.

To address these limitations, differences between downscaling techniques are directly incorporated into the fourth-order polynomial fit:

$$
X_{m,d,s,t} - i_{m,d,s} = x_{m,d,s,t} + \varepsilon_{m,d,s,t}.
$$

The left-hand side of this equation reflects the anomalies of a variable in time where *X* is the value of the variable of interest in time for each combination of GCM *m*, downscaling technique *d*, and emissions scenario *s*, and *i* is the average of the simulated variable of interest over 1981–2000. The right-hand side contains the fitted values (*x*) and residuals (*ε*) from a fourth-order polynomial fit in time (*t*) for each combination of *m*, *d*, and *s*. The fourth-order polynomial fit represents the slowly varying signal of anthropogenic climate change (Benestad 2003), while the residuals of that fit represent the remaining natural variability. In this way downscaling is incorporated at the outset, which enables the variance decomposition method to include this source of uncertainty. The full period used for each fit is 1950–2099, and variance decomposition is applied to the fitted values and residuals.
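A minimal sketch of this fit for a single combination of GCM, downscaling technique, and emissions scenario follows (the synthetic series and all names here are illustrative, not data from the study):

```python
import numpy as np

def fit_signal(anomalies, years):
    """Fit a fourth-order polynomial to yearly anomalies, returning the
    slowly varying forced signal x and the residual natural variability eps."""
    t = years - years.mean()                  # center time for conditioning
    coeffs = np.polyfit(t, anomalies, deg=4)
    fitted = np.polyval(coeffs, t)
    return fitted, anomalies - fitted

years = np.arange(1950, 2100)
rng = np.random.default_rng(0)
trend = 2.5 * ((years - 1950) / 150.0) ** 2       # smooth warming signal
anoms = trend + rng.normal(0.0, 0.2, years.size)  # plus interannual noise
x, eps = fit_signal(anoms, years)                 # x -> forced, eps -> natural
```

In the decomposition, this fit is repeated for every available combination of *m*, *d*, and *s* over 1950–2099.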

A weighting scheme is included to account for the accuracy of the GCMs and downscaling techniques for the historical anomaly:

$$
w_m = \left|\, x_{\mathrm{obs},1999} - \frac{1}{N_d}\sum_{d} x_{m,d,1999} \right|^{-1}, \qquad
w_d = \left|\, x_{\mathrm{obs},1999} - \frac{1}{N_m}\sum_{m} x_{m,d,1999} \right|^{-1}.
$$

In these weighting schemes, *x*_{obs} is the anomaly from the fourth-order polynomial fit of the variable in 1999 taken from the Parameter-Elevation Regressions on Independent Slopes Model dataset (PRISM, designated as obs; Daly et al. 2008), and *x*_{m,d,1999} is the anomaly from the polynomial fit for 1999 for each combination of downscaling technique *d* and GCM *m*. Given that the downscaled datasets have the same baseline period for each emissions scenario, the subscript *s* is omitted in Eqs. (6) and (7). The year 1999 is used as the reference year because it is common to all the downscaled projections; the results are assumed not to be sensitive to the reference year chosen. The anomaly from the fourth-order polynomial in 1999 is used to assess the ability to simulate the recent change of a variable of interest, allowing the GCMs and downscaling techniques that perform best in recent history to retain the highest weights in the variance decomposition equations. To focus on the historical accuracy of each individual GCM or downscaling technique, the mean of the values across downscaling techniques or GCMs is used. The weights for GCMs and downscaling techniques calculated this way do not sum to 1. To adjust for this, the weights are rescaled by

$$
W_m = \frac{w_m}{\sum_{m=1}^{N_m} w_m}, \qquad
W_d = \frac{w_d}{\sum_{d=1}^{N_d} w_d},
$$

where *W* is the rescaled weights for the GCMs and downscaling techniques, and *N*_{m} and *N*_{d} are the number of GCMs and downscaling techniques. At this point, the weighting scheme is the same as that used in HS09. As in HS09, this weighting is used to downweight GCMs and downscaling techniques for which the recent anomaly is too large or too small. We note that the variance decomposition results are not strongly sensitive to this weighting approach. However, the usual limitations associated with model weighting apply, in that the available sample of models and downscaling techniques represents an ensemble of opportunity and cannot be considered an unbiased sample (Sanderson and Knutti 2012). Finally, an additional rescaling is applied to account for situations where the full complement of GCM and downscaling combinations is not available:

$$
\omega_m = \frac{W_m}{\sum_{m' \in A_m} W_{m'}}, \qquad
\omega_d = \frac{W_d}{\sum_{d' \in A_d} W_{d'}},
$$

where *A*_{m} and *A*_{d} denote the sets of GCMs and downscaling techniques present in the combination under consideration.

This calculation is performed iteratively for all terms. The values of *ω* depend on which GCMs or downscaling techniques exist in the sample at the particular point of the calculation. Therefore, the values of *ω* do not remain constant, whereas the values of *W* do remain constant regardless of the calculation. More importantly, as the sample becomes more complete the values of *ω* approach *W*. In addition, the calculation of *ω* forces the weights used for any combination to sum to one, ensuring that the estimates of mean and variance in the equations are unbiased.
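The weighting chain described above can be sketched as follows (a simplified illustration with hypothetical anomaly values; the iterative bookkeeping over every partial combination is omitted):

```python
import numpy as np

def raw_weights(model_anoms_1999, obs_anom_1999):
    """w = 1/|error| of each member's fitted 1999 anomaly vs observations."""
    return 1.0 / np.abs(model_anoms_1999 - obs_anom_1999)

def rescale(w):
    """Rescale so the full set of weights sums to one (W in the text)."""
    return w / w.sum()

def omega(W, available):
    """Renormalize W over the members actually present (boolean mask), so
    the weights used in any partial combination sum to one."""
    W = np.where(available, W, 0.0)
    return W / W.sum()

# Hypothetical 1999 anomalies for four GCMs (mean across techniques) vs PRISM
x_m = np.array([0.30, 0.55, 0.42, 0.90])
W = rescale(raw_weights(x_m, 0.40))
# Suppose only GCMs 0, 1, and 2 exist for some (d, s) combination:
w_avail = omega(W, np.array([True, True, True, False]))
print(w_avail.sum())  # 1.0
```

The renormalization in `omega` is what prevents the fragmented sample from deflating the weighted variance estimates.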

The calculation for each component of uncertainty uses the fitted values, residuals, and weights and is described separately here.

- GCM model uncertainty:

  $$
  M(t) = \frac{1}{N_s N_d}\sum_{s}\sum_{d} \mathrm{Var}^{\omega}_{m}\!\left(x_{m,d,s,t}\right),
  $$

  where *M* is the GCM model uncertainty for each time *t*. In this case, rather than taking the variance of the fitted values across the GCMs directly, Eq. (13) uses the weighted variance (Var^{ω}; Haughton et al. 2015) across GCMs (*m*) with the values of *ω*_{m} [from Eq. (10)] as the weights. Here *N*_{s} and *N*_{d} represent the total number of emissions scenarios and downscaling techniques available.

- Downscaling uncertainty:

  $$
  D(t) = \frac{1}{N_s N_m}\sum_{s}\sum_{m} \mathrm{Var}^{\omega}_{d}\!\left(x_{m,d,s,t}\right),
  $$

  where *D* is the downscaling uncertainty for each time *t*. Like the calculation of *M*, Eq. (14) uses the weighted variance (Var^{ω}; Haughton et al. 2015) across downscaling techniques (*d*) with the values of *ω*_{d} [from Eq. (11)] as the weights. Here *N*_{m} represents the total number of GCMs available.
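The two weighted-variance components can be sketched for a single time *t* as follows (illustrative only: equal weights and random fitted values stand in for the real *ω* and *x*, and the bias correction in the weighted variance is omitted for brevity):

```python
import numpy as np

def wvar(x, w, axis):
    """Weighted variance along one axis (weights sum to 1 on that axis)."""
    mu = np.sum(w * x, axis=axis, keepdims=True)
    return np.sum(w * (x - mu) ** 2, axis=axis)

# x[m, d, s]: fitted anomalies at one time t for N_m GCMs,
# N_d downscaling techniques, and N_s scenarios (synthetic values)
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, (4, 3, 2))
N_m, N_d, N_s = x.shape
om_m = np.full((N_m, 1, 1), 1.0 / N_m)   # omega_m (equal here for brevity)
om_d = np.full((1, N_d, 1), 1.0 / N_d)   # omega_d

# Eq. (13): average over d and s of the weighted variance across GCMs
M = wvar(x, om_m, axis=0).mean()
# Eq. (14): average over m and s of the weighted variance across techniques
D = wvar(x, om_d, axis=1).mean()
```

Averaging over the remaining axes implements the double sums divided by *N*_{s}*N*_{d} and *N*_{s}*N*_{m}, respectively.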

The total variance is then defined as

$$
T(t) = V + M(t) + D(t) + S(t),
$$

where *T* is the total variance. Note that using this equation to define *T* requires assuming that each source of uncertainty is independent. The mean change relative to the baseline period (1981–2000) is defined as

$$
G(t) = \frac{1}{N_s}\sum_{s}\sum_{m}\sum_{d} \omega_m\, \omega_d\, x_{m,d,s,t},
$$

where *G* is the mean change relative to the baseline period for the projections. This is calculated from the fitted values of the regression equation and model weights.
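Once the four components are in hand, the total variance and the percent contributions reported in section 4 follow directly (the component values here are hypothetical):

```python
# Total variance under the independence assumption, and the percent of the
# total attributable to each source (illustrative values, one time t)
V, M, D, S = 0.12, 0.30, 0.10, 0.25   # natural, GCM, downscaling, scenario
T = V + M + D + S
shares = {k: 100.0 * v / T
          for k, v in {"natural": V, "GCM": M,
                       "downscaling": D, "scenario": S}.items()}
print(round(shares["downscaling"], 1))  # prints 13.0
```

It is these percentage shares, evaluated at each grid cell and time period, that are mapped in Figs. 3–10.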

## 3. Downscaled projections, experiments, and variables

The downscaled projections used in this analysis 1) are commonly used and publicly available, 2) represent multiple downscaling techniques, and 3) reflect the fragmented nature of publicly available downscaled datasets (Table 1). Four statistical techniques and one dynamical technique are used in this analysis. The Southeast Regional Assessment Project (SERAP; 16 GCMs, four emissions scenarios; Stoner et al. 2013) and Center for Climatic Research (CCR; 13 GCMs, three emissions scenarios; Lorenz 2012) datasets represent two statistical downscaling techniques based on transfer functions. MACA (20 GCMs, two emissions scenarios; Abatzoglou and Brown 2012) and BCCA (21 GCMs, four emissions scenarios; Bureau of Reclamation 2013) use constructed-analog approaches (also considered a form of statistical downscaling). Finally, the Hostetler dataset (three GCMs, one emissions scenario; Hostetler et al. 2011) was created using dynamical downscaling with the RegCM3 regional climate model. The simulations from the Hostetler dataset are available across the study region at a spatial resolution of 15 km. Little overlap exists between the GCMs used and the downscaling techniques across these five datasets (Fig. 1a). The use of different generations of GCMs (i.e., CMIP3 or CMIP5 members) also limits which emissions scenarios overlap between downscaled projections (Fig. 1b). With these five datasets, the available combinations of GCMs, downscaling techniques, and emissions scenarios are far fewer than all the possible combinations, even when considering just the available GCMs and emission scenarios from the CMIP archives. In addition, not all the downscaled projections are continuous; several only have output available for one or more GCM time slices, which do not always overlap (Table 1).

This incomplete and truncated ensemble of opportunity illustrates the challenge of providing an accurate and robust characterization of uncertainty in regional climate change projections. To assess how this incomplete sample can affect the representation of uncertainty, and the contribution of downscaling to the overall uncertainty, the extended variance decomposition method is applied to two experiments:

- IDEAL, in which a subset of the five downscaled projections is used that reflects an ideal complete sample (this subset has 16 common GCMs and two common emissions scenarios from MACA and BCCA, both of which contain continuous model output for the twenty-first century), and
- ALLDATA, in which all of the downscaled projections in this study (whether continuous or discontinuous), including eight emissions scenarios and 45 GCMs in addition to time slices, are used for variance decomposition.

The downscaled projections represent a range of statistical techniques and one example of dynamical downscaling [see Wootten et al. (2014) for an extended description of the differences between these downscaling techniques].

The domain in this analysis is the Southeast (Fig. 2), covering a much larger geographical area (~3 484 800 km^{2}) than prior studies [e.g., <46 000 km^{2} in Minville et al. (2008) and Mandal et al. (2016)]. This region incorporates complex topography and coastlines, as well as active climate change adaptation planning efforts in which downscaled projections are being considered. For the variance decomposition, both time series and gridded analyses are performed. For the gridded analyses, all the downscaled datasets are aggregated to a 15-km grid, the coarsest resolution of the available downscaled projections. Multiple methods are available to regrid the downscaled datasets to a common resolution, and it is assumed that using a different method would have a negligible effect on the results. For the time series analysis, the two subdomains shown in Fig. 2 are used.
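The aggregation to a common grid can be sketched as a simple block average (a stand-in for whatever regridding method is actually used; real applications may prefer area-weighted or conservative regridding):

```python
import numpy as np

def block_mean(field, factor):
    """Aggregate a 2-D field to a coarser grid by averaging
    factor x factor blocks of cells."""
    ny, nx = field.shape
    assert ny % factor == 0 and nx % factor == 0
    return field.reshape(ny // factor, factor,
                         nx // factor, factor).mean(axis=(1, 3))

fine = np.arange(36, dtype=float).reshape(6, 6)  # e.g., 5-km cells
coarse = block_mean(fine, 3)                     # -> 15-km cells
print(coarse.shape)  # prints (2, 2)
```

Each coarse cell is the mean of the fine cells it covers, which preserves the domain mean of the field.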

The variance decomposition methodology described in the previous section uses yearly anomalies. However, for different temporal means (such as the decadal means used in this study), the residuals (*ε*_{m,d,s,t}) are smoothed with a 10-yr moving average before the variances are calculated. The gridded and time series analyses are performed for the following variables:

- decadal mean of the annual average high temperature (TMAX),
- decadal mean of the annual number of days with high temperatures > 95°F (35°C; TMAX95),
- decadal mean of the annual average low temperature (TMIN),
- decadal mean of the annual number of days with low temperatures < 32°F (0°C; TMIN32),
- decadal mean of the annual total precipitation (PR), and
- decadal mean of the annual number of days with precipitation > 1 in. (25.4 mm; PR25).

The variables TMAX, TMIN, and PR are similar to those variables characterized in HS09 and Hawkins and Sutton (2011). The variables TMAX95, TMIN32, and PR25 represent a subset of the exposure and adaptation-relevant indices highlighted in the Third National Climate Assessment (Carter et al. 2014). In addition to the decadal means of these variables, the annual values from each were also analyzed.
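As an illustration of how a threshold index is formed, the TMAX95-style count can be sketched from daily values as follows (synthetic data; leap days ignored for simplicity):

```python
import numpy as np

def annual_count_above(daily_tmax_c, threshold_c=35.0, days_per_year=365):
    """Annual number of days with high temperature above a threshold
    (the TMAX95-style exposure index, in deg C)."""
    years = daily_tmax_c.reshape(-1, days_per_year)
    return (years > threshold_c).sum(axis=1)

rng = np.random.default_rng(2)
# Ten years of synthetic daily highs centered on 27 deg C
tmax = rng.normal(27.0, 6.0, 10 * 365)
counts = annual_count_above(tmax)   # days > 35 deg C in each year
decadal_mean = counts.mean()        # decadal mean of the index
```

TMIN32 and PR25 follow the same pattern with the inequality and threshold swapped.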

The approach used in HS09 and in this extension could be described as an ad hoc statistical approach. As an alternative, a random effects model was also considered as a means to estimate the variance contribution from each source of uncertainty. The random effects model (Diggle et al. 2002) is a more robust and formal statistical modeling approach, and it provides a point of comparison to assess the validity of the sample variances calculated by this ad hoc technique. The random effects model has the following structure:

$$
X_{m,d,s,t} - i_{m,d,s} = \mu_t + \beta_m + \beta_d + \beta_s + \varepsilon_{m,d,s,t},
$$

where *μ* is the mean trend, *β*_{m} ~ *N*(0, *σ*_{m}^{2}), *β*_{d} ~ *N*(0, *σ*_{d}^{2}), *β*_{s} ~ *N*(0, *σ*_{s}^{2}), and *ε* ~ *N*(0, *σ*_{ε}^{2}). As it relates to the HS09 extension developed in this study, *σ*_{m}^{2}, *σ*_{d}^{2}, *σ*_{s}^{2}, and *σ*_{ε}^{2} correspond to the values of *M*, *D*, *S*, and *V*, respectively. The random effects model is fit to the annual anomalies using ordinary least squares. Results from the random effects model and the annual values analysis are only discussed as needed.
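The structure of this random effects model can be illustrated by simulation (a sketch only: the variance components below are recovered with a crude method-of-moments step, not the fitting procedure used in the study, and all sizes and values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
N_m, N_d, N_s, N_t = 10, 5, 4, 150
t = np.linspace(0.0, 1.0, N_t)

# True variance components (illustrative)
sig_m, sig_d, sig_s, sig_e = 0.6, 0.3, 0.4, 0.2
b_m = rng.normal(0, sig_m, N_m)
b_d = rng.normal(0, sig_d, N_d)
b_s = rng.normal(0, sig_s, N_s)

# X[m, d, s, t] = mu(t) + beta_m + beta_d + beta_s + eps
mu = 2.0 * t ** 2
X = (mu
     + b_m[:, None, None, None]
     + b_d[None, :, None, None]
     + b_s[None, None, :, None]
     + rng.normal(0, sig_e, (N_m, N_d, N_s, N_t)))

# Method-of-moments sketch: each component's variance is estimated from
# the spread of the corresponding group means after removing mu(t)
resid = X - mu
est_m = np.var(resid.mean(axis=(1, 2, 3)))   # ~ sigma_m^2 (i.e., M)
est_d = np.var(resid.mean(axis=(0, 2, 3)))   # ~ sigma_d^2 (i.e., D)
est_s = np.var(resid.mean(axis=(0, 1, 3)))   # ~ sigma_s^2 (i.e., S)
```

With so few distinct levels per factor these estimates are noisy, which mirrors the sampling limitations of a small ensemble of GCMs and techniques.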

It is important to note that the typical idea of a variance estimate based on a sample drawn from a population can be quite problematic when applied to climate model ensembles. In this case, the “sample” of GCMs and downscaled GCMs that form an ensemble can either be interpreted as being drawn from a distribution that represents “truth plus error” (random samples from a distribution of plausible models centered around the true future realized climate), or it can be interpreted as being drawn from an “indistinguishable” distribution (where the truth and all models are thought to be drawn from the same distribution with equal probability; Sanderson and Knutti 2012). Weighting the variance estimates by model performance is particularly problematic if the ensemble is actually derived from an indistinguishable population. Given the weighting scheme used by HS09 and as followed in this study, the “truth plus error” interpretation is followed (implicitly in HS09, explicitly here). An alternative interpretation could yield different variance estimates.

## 4. Results

For each experiment and variable, the percent contribution from each source to the total variance is calculated. To visualize this contribution in space and time, three time periods are shown: early century (2016–25), midcentury (2051–60), and late century (2086–95). In the IDEAL TMAX experiment for the early century period, GCM model uncertainty is the dominant source of uncertainty (>50% of the total) followed by natural variability (Fig. 3). Scenario uncertainty becomes the dominant source by the end of the century across the domain, which agrees with prior GCM-only studies (e.g., HS09). The downscaling uncertainty in Florida is greater than 30% of the total uncertainty in the IDEAL experiment for TMAX in the early century and midcentury. For the ALLDATA TMAX experiment (Fig. 4), the midcentury and late-century pattern is consistent with the results from the IDEAL experiment. However, the downscaling uncertainty is now less than 5% of the total in the early century and midcentury periods in Florida. This suggests that adding more downscaling techniques increased the consistency (decreased the variability) in Florida.

For TMAX95, the downscaling uncertainty is larger than 30% for much of the domain in the early century in the IDEAL experiment (Fig. 5). Downscaling is the dominant source of uncertainty for areas of Florida and the Appalachians in the early century. In addition, downscaling uncertainty is dominant in the midcentury for areas of the Appalachian Mountains and remains significant (>30%) through the end of the century in this area in the IDEAL experiment. Although the contribution from downscaling uncertainty is diminished in the ALLDATA experiment for TMAX95 (Fig. 6), it remains greater than 30% for parts of the Appalachians through the end of the century. As with TMAX, the scenario uncertainty becomes the dominant source of uncertainty across much of the domain for both experiments by the end of the century. Note also that for TMAX and TMAX95 the scenario uncertainty appears to be much larger in the ALLDATA experiment during the early century than in the IDEAL experiment. Although more scenarios are included in ALLDATA, some projection information is not available for the early century period (2016–25). This is discussed further in section 5.

The results for TMIN and TMIN32 (not shown) were similar to the TMAX results for both experiments, with a transition to scenario uncertainty as the dominant source of uncertainty by late century. However, the patterns for downscaling uncertainty in Florida are ~10% larger for TMIN32 than TMAX for all three future periods in the IDEAL experiment. Finally, downscaling uncertainty in TMIN32 is not significant in the Appalachians for any time period for both experiments.

For precipitation (PR) in the IDEAL experiment, the natural variability and GCM model uncertainty are the dominant sources of uncertainty throughout the three time periods (Fig. 7). However, the downscaling uncertainty reaches 20% of the total uncertainty in some areas and is greater than 30% of the total uncertainty in south Florida. The ALLDATA experiment (Fig. 8) shows that at the end of the century much of the uncertainty is similarly partitioned between natural variability, GCM uncertainty, and scenario uncertainty. While the signal from downscaling is diminished, there are areas of the domain where the attributable uncertainty reaches 20% of the total.

For more extreme precipitation events represented in the PR25 ensemble projections, the pattern of uncertainty partitions is similar to the patterns for mean precipitation (PR). The primary source of uncertainty in the IDEAL experiment for PR25 transitions from natural variability in the early century (2016–25) to GCM model uncertainty in the late century (2086–95; Fig. 9). While not the dominant source, the contribution from downscaling uncertainty is more than 30% in portions of the domain for all three periods. As with the PR projections, the three non-downscaled sources of uncertainty are equally important by the end of the century for PR25 in the ALLDATA experiment (Fig. 10). The uncertainty attributable to downscaling still reaches ~20% of the total uncertainty for areas of the domain by the end of the century.

A possible concern with the HS09 methodology, and this extended version, is that the calculated variability may not match the variance of the ensemble spread. The HS09 approach, this extended version, and the random effects model are all expected to overestimate the total variance of this ensemble because they rely on the assumption of independence between sources (Pennell and Reichler 2011). The upper and lower bounds in Fig. 11 reflect two standard deviations above and below the mean, calculated both from the ensemble and from the variance decomposition methodology (derived bounds). The extended methodology captures the mean projected change of variables in climatologically different areas of the Southeast but overestimates (underestimates) the upper (lower) bounds of the ensemble, particularly at the end of the period. That is, the variance decomposition estimated a higher variance than that calculated directly from the ensemble. This overestimation is present for all tested variables and decadal means (not shown). The random effects model produced the same result as the extended methodology, overestimating the variance while capturing the mean change. This demonstrates the expected overestimation of the total variance connected to the assumption of independence between the sources in the extended methodology.
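The decomposition and the derived bounds can be illustrated with a minimal sketch. This is not the study's code: the array layout, the helper name `partition_uncertainty`, and the toy numbers are all assumptions, and the natural-variability estimate is simply passed in rather than fit from residuals about each member's polynomial fit as in HS09.

```python
import numpy as np

def partition_uncertainty(proj, resid_var):
    """Partition ensemble variance into four sources in the spirit of the
    extended HS09 decomposition (illustrative sketch only).

    proj      : array (n_scen, n_gcm, n_dsc) of smoothed projected changes
                for one grid cell and one future period
    resid_var : estimate of internal (natural) variability, e.g. the
                variance of residuals about each member's fit
    """
    # Scenario uncertainty: variance across scenario means
    # (each mean averages over GCMs and downscaling methods).
    S = np.var(proj.mean(axis=(1, 2)))
    # GCM (model) uncertainty: variance across GCMs, averaged over scenarios.
    M = np.mean(np.var(proj.mean(axis=2), axis=1))
    # Downscaling uncertainty: variance across downscaling methods,
    # averaged over scenarios and GCMs.
    D = np.mean(np.var(proj, axis=2))
    V = resid_var
    total = S + M + D + V  # sum of variances: assumes independent sources
    frac = {k: v / total for k, v in
            dict(scenario=S, gcm=M, downscaling=D, natural=V).items()}
    return total, frac

# Toy ensemble: 2 scenarios x 16 GCMs x 2 downscaling methods
rng = np.random.default_rng(0)
proj = (rng.normal(2.0, 0.5, (2, 16, 2))          # GCM/downscaling spread
        + np.array([0.0, 1.0])[:, None, None])    # scenario separation
total, frac = partition_uncertainty(proj, resid_var=0.1)

# Derived bounds on the projection, as in Fig. 11: mean +/- 2 sqrt(total)
mean = proj.mean()
lower, upper = mean - 2 * np.sqrt(total), mean + 2 * np.sqrt(total)
```

The fractional contributions sum to one by construction, which is why any shared variability between sources (the independence concern above) distorts the partition rather than appearing as a separate term.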

For the two subdomains representing the Appalachian Mountains and central Florida, there are striking differences in how uncertainty is partitioned between the IDEAL and ALLDATA experiments (Fig. 12). While the uncertainty attributable to different emission scenarios for TMAX95 becomes the dominant source by the end-of-century period in all cases (and conversely the contribution from natural variability uncertainty diminishes through time), there are substantial differences in the contributions from the GCMs versus the downscaling methodologies to the total uncertainty over both space and time. Notably, while the attributable downscaling uncertainty in central Florida for the IDEAL experiment peaks at ~20% early in the twenty-first century, it reaches nearly 50% in the Appalachian subdomain with a concomitant reduction in the proportion of uncertainty attributable to the GCMs.

The results for the ALLDATA experiment are more complex because of the incorporation of ensemble members with widely disparate simulation characteristics. In particular, the effect of relying on shorter time slice experiments rather than full century-length model runs is apparent from the midcentury spikes in estimated uncertainty contributions from the GCMs and downscaling techniques. For extreme precipitation events (PR25; Fig. 13), as in HS09, the uncertainty contributions from natural variability, GCMs, and downscaling are larger than the scenario uncertainty in the IDEAL experiment. In the ALLDATA experiment, the scenario uncertainty has a larger contribution through the future period than in the IDEAL experiment. However, the contributions from downscaling, GCMs, and natural variability increase to a magnitude similar to or larger than that of the scenario uncertainty in the ALLDATA experiment, particularly during the time periods common to the downscaled projections. The natural variability contribution in both subdomains is even larger in the ALLDATA experiment, never falling below 25% of the total uncertainty. And as with the TMAX95 results, the variability in ensemble size and membership results in mid- and late-century spikes in the uncertainty attributable to the GCMs and downscaling technique.

## 5. Discussion

The appeal of high-resolution downscaled climate model output continues to grow apace with local climate change adaptation efforts. But the ability to carefully consider the expected magnitude of uncertainty in climate model ensembles that can be attributed to downscaling is limited by a multitude of factors including domain size, the variables considered, and the type of downscaling employed. In this study, the HS09 methodology is extended to include downscaling as one of four major sources of uncertainty in climate model ensembles and is applied to publicly available projections over a larger region than in prior studies. These projections were created with four different statistical downscaling techniques but only one dynamical downscaling technique. For the temperature indices (TMAX, TMAX95, TMIN, and TMIN32), the dominant source of uncertainty transitions from natural variability and GCM model uncertainty in the early and midcentury to scenario uncertainty by late century. For both precipitation indices (PR and PR25), the dominant source in the early century is natural variability, but it becomes split between natural variability, GCM model uncertainty, and scenario uncertainty by the end of the century. The uncertainty contribution due to the use of downscaling is never dominant (i.e., >50% of the total) but nevertheless can be significant. The contribution from downscaling uncertainty is greater than 10% of the total uncertainty for projections of PR25 for 40% of the Southeast domain in the late century and 30% for the projections of TMAX95 in the Appalachians.

The IDEAL experiment estimated the respective uncertainty contributions using an ensemble consisting of two downscaling techniques, 16 common GCMs, and two common emission scenarios. In this experiment, the contribution of downscaling to the total uncertainty is most apparent for projections of TMAX95. The two statistical downscaling techniques in the IDEAL experiment are both weather typing approaches. The primary difference is that MACA is a multivariate technique originally implemented in the Pacific Northwest region of the United States, which, like the Appalachians, has complex topography. BCCA was not designed for complex topography and does not use multivariate analysis. The differences between these methods are large enough to produce the relatively large contributions to the total uncertainty attributable solely to the choice of downscaling method for TMAX95 in the Appalachians and for temperatures in Florida (with less complex topography).

In recent years, individual downscaling efforts have often originated under circumstances where the intended use was for a relatively narrow range of users and uses. However, the resulting information is still typically made available to the public regardless of the original intent. While the availability of multiple downscaled datasets presents a significant opportunity to develop ensemble climate projections that are potentially more robust and less likely to be underdispersed, it remains a significant challenge to rigorously and consistently quantify and partition the uncertainty from all relevant sources. For the future time periods where all five sets of downscaled data provide guidance, the contributions from downscaling and GCMs are amplified in the percent contribution time series for the ALLDATA experiment (as shown by the spiked patterns in Figs. 12 and 13). The contributions from these sources during these periods more accurately reflect the contribution from all four sources when compared with other periods in the ALLDATA experiment, because all five sets of projections provide guidance during these periods (reflecting the full range of information possible). But three of these projections (SERAP, CCR, and Hostetler) provide guidance only for these time slices, which may not match the time horizon relevant to a decision. For instance, a decision-maker in the Appalachians may have a 30-yr time horizon for planning purposes. In this situation, the results of the IDEAL experiment suggest that using multiple downscaling techniques (i.e., downscaling uncertainty) is an important consideration. The results of the ALLDATA experiment indicate otherwise for this region and that time period, but the sample in the ALLDATA experiment for that period is smaller than the hypothetical “complete” sample size based on the full set of available scenarios, GCMs, and downscaling techniques. This in turn affects the contribution of all sources to the total uncertainty.
This illustrates that the lack of standardization among downscaling experiments artificially limits the estimated uncertainty contribution. If any particular combination of downscaled GCM output under a given emissions scenario is a reasonable reflection of possible future climatic changes, then the absence of portions of the full sample for time horizons of interest not only complicates the representation of uncertainty but also means that the full range of model-simulated climatic changes, let alone the full range of possible future climatic responses to anthropogenic forcing, is not being provided for decision-making. This raises a critical concern for stakeholders interested in using downscaled climate model projections for impact assessments and adaptation planning activities. Without the full complement of scenarios, models, and downscaling techniques, decision-makers may underestimate the magnitude of uncertainty associated with projections of the local manifestations of global climatic change, potentially by 20% or more.
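The underdispersion that comes from a truncated ensemble can be seen in a toy calculation (purely illustrative; the component magnitudes are invented, and an additive structure for scenario, GCM, and downscaling effects is assumed):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "full" ensemble of projected changes (arbitrary units):
# 2 scenarios x 16 GCMs x 5 downscaling methods, built additively.
scen = np.array([0.0, 1.5])[:, None, None]        # scenario separation
gcm = rng.normal(0.0, 0.6, (1, 16, 1))            # GCM-to-GCM spread
dsc = rng.normal(0.0, 0.4, (1, 1, 5))             # downscaling spread
full = 2.0 + scen + gcm + dsc                     # shape (2, 16, 5)
full_spread = full.std()

# The truncated ensemble a practitioner often actually has:
# one scenario and one downscaling method.
truncated = full[0, :, 0]
trunc_spread = truncated.std()
# trunc_spread omits the scenario and downscaling components entirely,
# so it understates the spread the full ensemble would show.
```

The missing components never appear in the truncated sample, so no amount of careful analysis of that sample alone can recover them; this is the sense in which single-technique studies risk overconfidence.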

The two experiments characterize the uncertainty in an idealized experiment (IDEAL) versus a larger sample of what is available from the publicly released downscaled projections (ALLDATA). During the time periods with the full complement of ensemble members, the uncertainty contribution from downscaling and GCMs tends to be smaller in the ALLDATA experiment than in the IDEAL experiment. This suggests that during these periods there is more consistency between the downscaled projections when these additional datasets are included. One possible reason is that the addition of statistical downscaling techniques causes the resulting ensemble members to become more similar to each other because they were developed with a common set of training data. This is difficult to assess further with this subset of the projections because it incorporates different generations of GCMs (CMIP3 and CMIP5). Further discussion of this topic using additional downscaled projections from the most recent generation (such as LOCA; Pierce et al. 2014, 2015) is warranted.

When all model runs are available, the contribution of downscaling to the total uncertainty never rises above 50%. The ALLDATA experiment during these periods can provide guidance on the relative contribution of downscaling to the total ensemble uncertainty, but cannot be expected to be reliable outside of those periods. The ALLDATA experiment results considered only one form of dynamical downscaling and are therefore weighted heavily toward the application of statistical downscaling. While the IDEAL experiment may provide some guidance during the other periods, the reliability of those estimates is hard to assess given that only two statistical downscaling approaches were included in the ensemble. In addition, the experiments done here and the assumption of independence in the variance decomposition methodology do not allow this study to consider the effect of the application of downscaling upon the representation of natural variability. While the effect of downscaling upon natural variability is beyond the scope of this study, it is an important aspect that, to the authors’ knowledge, remains unexplored.

While not the dominant source of uncertainty in these “superensembles,” the added structural uncertainty that comes with applying a downscaling technique to an ensemble of GCMs should not be ignored, as it can reach 20%–30% of the total uncertainty in temperature and precipitation projections. Therefore, researchers and stakeholders who wish to reduce the risk of relying on overconfident climate model ensembles for decision-making may want to consider using more than one set of downscaled projections to better characterize and propagate this source of uncertainty through to impact assessments. The modeling community involved in downscaling has long done its work independently with little overlap between the GCMs, emissions scenarios, and time periods used to construct downscaled ensembles. The result of this is a fragmentation of information that ultimately can underrepresent the risks associated with anthropogenic climate change (Weaver et al. 2013). Time periods common to all the sets of projections produced independently can potentially be used to better characterize the uncertainty, but these common periods do not always align with the time horizons of interest to decision-makers. The consequence of this unintended fragmentation of downscaled projections is a generation of ensembles that are likely overconfident and underdispersed.

## 6. Conclusions

This analysis used an extension of the methodology presented in HS09 to characterize the contributions from downscaling to the total uncertainty in downscaled climate projections. Using the Southeast for this analysis extends the literature on this question across a region with complex physical and climatological features and a wide range of potentially climate-sensitive decisions. In addition, the analyses in this paper use projections created with both statistical and dynamical downscaling techniques by incorporating publicly available projections. While this facilitated a more rigorous treatment of downscaling as a source of uncertainty, it also allowed for a focus on the influence that the fragmented style of downscaling may have on adaptation planning.

While downscaling is not the dominant source of uncertainty in downscaled projections, it should not be ignored. For example, in situations where complex terrain is present and climatic extremes are of interest (e.g., days exceeding a high temperature of 95°F, days with a low temperature below freezing, or rainfall totaling an inch or more), the choice of downscaling technique alone can represent between ~30% and 50% of the total uncertainty in ensemble climate change projections. For the Southeast domain, the results suggest that the uncertainty associated with downscaling varies, and while not the dominant source of uncertainty for average temperature and precipitation change, it can still account for up to 20% of the total uncertainty. These results suggest that studies relying on a single downscaling technique are likely to suffer from overconfidence.

Several caveats apply to the results of this study. First, this approach assumes that the sources of uncertainty are independent and that natural variability is constant in time. The assumption of independence was shown to cause an overestimate of the variance of the sample of projections used here. Both assumptions should be evaluated and tested in future work with this methodology. Second, we note that only one set of dynamically downscaled projections was used in this analysis. The inclusion of additional regional climate model experiments (e.g., NARCCAP; Mearns et al. 2009) could improve estimates of the uncertainty attributable to the downscaling technique. It is also important to acknowledge that the sample used was based upon what was provided in publicly available projections; it does not reflect all the emissions scenarios and GCMs provided in the CMIP3 or CMIP5 archives, nor does it reflect all the possible downscaling techniques in the literature. However, while this methodology can be applied to the full CMIP archive with many more downscaling techniques, the calculated variances would still represent a sample of the hypothetically possible futures. Finally, this analysis focuses on the total downscaling uncertainty rather than the individual components of downscaling. That is, this study did not attempt to estimate the proportion of uncertainty attributable to downscaling that is due to methodological and structural considerations such as the choice of training dataset or the selection of a statistical model. In particular, future work regarding the influence of training dataset on the uncertainty is warranted given the results of analysis from Pourmokhtarian et al. (2016).
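The overestimation tied to the independence assumption can be illustrated with a toy simulation (an assumption-laden sketch, not the study's analysis): when each GCM's projected change already contains sampled natural variability, the spread of those changes is inflated, and adding a separate natural-variability term double-counts part of the total.

```python
import numpy as np

rng = np.random.default_rng(3)
n_gcm = 1000

# Each GCM's mean change = true forced response + sampled natural variability.
forced = rng.normal(0.0, 0.3, n_gcm)   # true model-to-model spread
noise = rng.normal(0.0, 0.5, n_gcm)    # natural variability in each simulation
sims = forced + noise                  # what each GCM actually shows

V_hat = 0.5 ** 2                       # natural variability (known here)
M_hat = np.var(sims)                   # "model uncertainty" from the GCM spread
total_decomposed = M_hat + V_hat       # sum of components under independence
total_ensemble = np.var(sims)          # spread the ensemble itself shows

# total_decomposed exceeds total_ensemble by exactly V_hat: the spread of
# the GCM means already contains the sampled natural variability, so the
# decomposition counts that component twice.
```

This is one mechanism consistent with the overestimated bounds in Fig. 11; the actual balance of effects in the real ensemble is harder to isolate.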

The results from the analyses of climate variables agree with prior studies where downscaling was shown not to be a dominant source of uncertainty (Dobler et al. 2012; Kay et al. 2009). Much of the literature discussing downscaling as a source of uncertainty has considered it with respect to impact-relevant variables in hydrology (e.g., streamflow) or ecology (e.g., species distribution) rather than climate variables. The relationships between climate and impact variables are often nonlinear (e.g., Jin et al. 2005). Therefore, while the projection uncertainty attributable to downscaling may not be the dominant source for most climate variables, it may be a larger source for impact variables, and efforts to consider and characterize this uncertainty are warranted. Additional considerations for future work also include an examination of the effect of decreasing domain size, increasing spatial resolution, and any relevant interaction effects.

Our results show that the fragmented and use-specific manner in which downscaled products are currently developed complicates efforts to provide the most realistic representation of uncertainty for decision-makers, impact assessments, and climate change adaptation planning. Importantly, the depiction of the range of possible future climatic change may be artificially limited over certain time horizons of interest because some downscaled projections are unavailable, leading to a truncated model ensemble. To increase the utility of downscaled climate model projections to decision-makers and society, consideration should be given to methods that could fill in these gaps so that a more realistic representation of model and epistemic uncertainty is provided (e.g., climate model emulation methods such as in Urban and Fricker 2010). Without such efforts, impact assessments that make use of downscaled climate model projections for these time periods are potentially overconfident because of the lack of downscaling techniques represented.

## Acknowledgments

We thank David Blodgett (USGS Center for Integrated Data Analytics) for assisting with accessing some of the publicly available projections via the USGS GeoData Portal. We also thank the anonymous reviewers for their feedback and suggested improvements to this article. This research was conducted under the Defense Coastal/Estuarine Research Program (DCERP), funded by the Strategic Environmental Research and Development Program (SERDP). Views, opinions, and/or findings contained in this report are those of the author(s) and should not be construed as an official U.S. Department of Defense position or decision unless designated by other official documentation. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. government.

## REFERENCES

*Climate Change Impacts in the United States: The Third National Climate Assessment*, J. M. Melillo, T. C. Richmond, and G. W. Yohe, Eds., U.S. Global Change Research Program, 396–417.

*Analysis of Longitudinal Data*. 2nd ed. Oxford University Press, 169–171.

*Demystifying Climate Models—A Users Guide to Earth System Models*. Springer Nature, 274 pp.

*Climate Change 2013: The Physical Science Basis*, T. F. Stocker et al., Eds., Cambridge University Press, 953–1028.

*Paris Climate Agreement: Beacon of Hope*. Springer Climate, 115–146.

## Footnotes

^a Current affiliation: South Central Climate Science Center, University of Oklahoma, Norman, Oklahoma.

© 2017 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).