Simulations of future climate change impacts on water resources are subject to multiple and cascading uncertainties associated with different modeling and methodological choices. A key facet of this uncertainty is the coarse spatial resolution of GCM output compared to the finer-resolution information needed by water managers. To address this issue, it is now common practice to apply spatial downscaling techniques, using either higher-resolution regional climate models or statistical approaches applied to GCM output, to develop finer-resolution information. Downscaling, however, can also introduce its own uncertainties into water resources impact assessments. This study uses watershed simulations in five U.S. basins to quantify the sources of variability in streamflow, nitrogen, phosphorus, and sediment loads associated with the underlying GCM compared to the choice of downscaling method (both statistically and dynamically downscaled GCM output). This study also assesses the specific, incremental effects of downscaling by comparing watershed simulations based on downscaled and nondownscaled GCM model output. Results show that the underlying GCM and the downscaling method each contribute to the variability of simulated watershed responses. The relative contribution of GCM and downscaling method to the variability of simulated responses varies by watershed and season of the year. Results illustrate the potential implications of one key methodological choice in conducting climate change impact assessments for water: the selection of downscaled climate change information.
Scenario analysis using general circulation model (GCM) output to drive hydrologic models is a common approach for assessing the potential effects of climate change on water resources. These studies are complicated by two challenges: 1) the large uncertainties associated with GCM simulations of future climate change, particularly for precipitation (see, e.g., Cox and Stephenson 2007; Räisänen 2007; Stainforth et al. 2007; Hawkins and Sutton 2011), and 2) the coarse spatial resolution of GCM output that does not incorporate local topographic effects compared to the finer-resolution information needed by water managers (e.g., Fowler et al. 2007). Progress has been made with the first challenge by adopting approaches such as use of an ensemble of model runs to capture the range of variability across multiple GCMs (e.g., Tebaldi and Knutti 2007). Exploring the full range of variability in this way can reveal system vulnerabilities and guide risk management (Wilby et al. 2000; Weaver et al. 2013).
To address the second challenge, it is common practice to apply spatial downscaling methods (DSMs), using either higher-resolution regional climate models (dynamical DSMs) or statistical approaches applied to GCM output. Dynamical DSMs use GCMs to drive nested, regional-scale, numerical models at higher spatial resolution to simulate local conditions in greater detail (Elguindi and Grundstein 2013; Pryor et al. 2012; Mearns et al. 2009, 2013). Statistical DSMs are based on relationships that interpolate large-scale GCM output to observations of historical weather and climate (Abatzoglou and Brown 2012; Burger et al. 2012; Wood et al. 2004; Maurer et al. 2009). Downscaling yields information at a finer spatial resolution more appropriate for watershed analysis. However, the process can also modify and/or compound the uncertainties associated with the choice of a particular underlying GCM. While downscaling can improve local-scale representation of topographic effects, this may have little meaning if the GCM misplaces key features, such as the location of the jet stream or storm tracks relative to the site of interest (Hall 2014). The choice of which projected future climates, and thus specific spatial and temporal details, are used in an assessment has a direct influence on results. It is important for practitioners to understand the potential implications of this methodological choice—the choice of a DSM—on assessment results.
Previous studies have assessed the sources of variability in simulations of hydrologic response to climate change. Chen et al. (2011a,b) evaluated the sources of uncertainty in hydrologic projections for the Manicouagan 5 watershed in Quebec through 2100, examining the role of emission scenarios, GCM, statistical DSM, hydrologic model structure, and hydrologic model parameter sets. They found that the choice of GCM is consistently a major contributor to variability across the outputs of different simulations; however, they found that the choice of DSM, as well as the GCM initial conditions, could have a comparable or even larger contribution for some hydrologic endpoints. Extension of this analysis to include downscaling using four regional climate models (RCMs) found that results from statistical downscaling and RCMs had similar envelopes of uncertainty, although the RCM methods had a larger impact for some endpoints (Chen et al. 2013).
Conversely, variability across simulations driven by different GCMs was more pronounced in runoff projections for major French drainage basins than among DSMs, including both statistical and dynamical downscaling with quantile bias correction (Boé et al. 2009). Similarly, Habets et al. (2013) found that GCM-related variability was the largest driver of the magnitude of climate impacts on hydrogeology in northern France. Mpelasoka and Chiew (2009) found greater variability in precipitation projections among multiple GCMs than among three statistical DSMs. Finally, in hydrologic simulations of the Alpine Rhine using statistical DSMs, the choice of GCM was the dominant contributor to intersimulation variability in summer and fall, but the choice of DSM was of greater importance in winter and spring (Bosshard et al. 2013). These studies suggest the relative contributions of GCM versus DSM to variability differ among locations, but direct comparison across sites is complicated by differences in methods. The studies discussed above also focus only on water quantity (e.g., streamflow) and do not consider water quality endpoints.
In this study, we use watershed model simulations in five U.S. watersheds to illustrate the effects of DSM on the variability of simulated streamflow and water quality (nitrogen, phosphorus, and suspended solids) responses to climate change. Watershed model simulations are driven by meteorological inputs representing mid-twenty-first-century climate developed from nondownscaled and downscaled GCM output. Our analysis addresses two questions: 1) What is the relative contribution of GCM, DSM, and interannual variation on simulated watershed responses? 2) How do simulated watershed responses change when driven by downscaled versus nondownscaled output from a single underlying or “parent” GCM (hereafter referred to as the incremental effects of downscaling)?
The first question allows us to explore overall sources of variation within the ensemble of simulated future changes in climate evaluated in these watersheds, while the second allows us to address three subquestions: (i) Does the variability (i.e., range) of simulated watershed responses to climate change differ when driven by downscaled versus nondownscaled GCM information? (ii) Does using downscaled data lead to the identification of regional patterns of streamflow and water quality variability not found using nondownscaled GCM output, for example, small-scale orographic effects? (iii) Do the simulated watershed responses to climate change depend on the particular GCM and/or type of downscaling used?
Results illustrate the potential implications of one key methodological choice in conducting climate change impact assessments for water—the selection of DSM. Other known sources of variability, including watershed model structure and parameters (see, e.g., Mendoza et al. 2015), emissions scenarios, or other factors in the “uncertainty cascade” (see, e.g., Wilby and Dessai 2010), are not evaluated. Our intent is to help bridge gaps between the climate and hydrologic modeling communities and improve the integration of modeling efforts across these communities (Lofgren and Gronewald 2013).
Our analysis is based on simulations of five large watersheds: the Minnesota River watershed, the Apalachicola–Chattahoochee–Flint River (ACF) watersheds, the Willamette River watershed, the Salt River watershed, and the Susquehanna River watershed (Figure 1). All watershed simulations used in the analysis were conducted as part of a larger, previous modeling effort to assess streamflow and water quality sensitivity to climate change in 20 U.S. watersheds [U.S. Environmental Protection Agency (EPA) 2013; Johnson et al. 2012]. Except for the Salt River watershed, the study watersheds are comparable in size to the U.S. Geological Survey’s (USGS) Hydrologic Unit Code (HUC) 4-digit basins, ranging from 15 025 km² (Salt) to 71 236 km² (Susquehanna), and were selected to represent different hydroclimatic and watershed conditions occurring throughout the nation (Table 1).
Watershed simulations were conducted using the Soil and Water Assessment Tool (SWAT, version 2005; Neitsch et al. 2005). The SWAT watershed model incorporates data for weather, soils, topography, vegetation, and land use and cover to estimate water and sediment movement, nutrient cycling, and other watershed processes in large, complex watersheds (Neitsch et al. 2005). Potential evapotranspiration (PET) was calculated internally in SWAT using the Penman–Monteith energy balance method (Allen et al. 2005). Land use and land cover was from the 2001 National Land Cover Database (NLCD) and held constant in all simulations to focus on the effects of climate change and DSM. In each of the five study watersheds, SWAT was used to simulate changes in total streamflow, total nitrogen (TN), total phosphorus (TP), and total suspended solids (TSS) loads in response to simulated mid-twenty-first-century climate change.
SWAT models for each study watershed were calibrated and validated at the scale of USGS 8-digit HUCs. All models performed credibly for hydrology with total volume errors within 20% and Nash–Sutcliffe coefficients of model fit efficiency for monthly streamflow ranging from 0.32 to 0.83. Confidence limits (95%) on mean monthly flows at downstream gauges ranged from ±3% (Susquehanna) to ±15% (Salt). Water quality simulation focused on monthly loads and has much higher uncertainty because of the limited availability of sampling data. In most cases, however, the pollutant load simulations from the SWAT models appear to be in the fair to good range (median absolute error of 16.5% relative to loads estimated from sparse monitoring data). All analyses in this study are based on simulation results expressed as mid-twenty-first-century changes relative to historical baseline conditions. The setup, calibration, and validation of SWAT models in each of the five study watersheds are described in detail in the appendices to U.S. EPA (2013). Simulations use consistent methods, models, and scenarios to facilitate comparison among study watersheds.
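The two hydrology fit statistics quoted above can be illustrated with a minimal sketch. It assumes the standard Nash–Sutcliffe definition and a simple percent error in total volume; the function names and the sample flows are hypothetical and are not taken from the calibration reported in U.S. EPA (2013).

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def percent_volume_error(observed, simulated):
    """Error in total simulated volume, as a percentage of observed volume."""
    return 100.0 * (sum(simulated) - sum(observed)) / sum(observed)


# Hypothetical monthly flows (arbitrary units), for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 37.0]
nse = nash_sutcliffe(obs, sim)
pve = percent_volume_error(obs, sim)
```

A value of 1 indicates a perfect match, and values near 0 indicate that the model predicts no better than the observed mean.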
2.1. Simulated future climate change
All projected future climates are based on mid-twenty-first-century (2041–70) climate model simulations using the four GCMs from phase 3 of the Coupled Model Intercomparison Project (CMIP3) under the A2 emissions scenario (IPCC 2007) covered by the regional downscaling efforts of the North American Regional Climate Change Assessment Program (NARCCAP; http://www.narccap.ucar.edu). The specific future climate information used to drive watershed simulations differs depending on if or how the GCM output was downscaled. We consider three categories of climate change information based on these same underlying GCMs: nondownscaled GCMs, dynamically downscaled NARCCAP projections, and statistically downscaled bias-corrected and spatially disaggregated (BCSD; Maurer et al. 2009) projections (Table 2). These three categories of climate change information, while not comprehensive of all GCMs or DSMs, are representative of commonly applied “off the shelf” datasets used in climate change impacts studies.
The NARCCAP information is dynamically downscaled using RCMs nested within parent GCM models to represent detailed subgrid, regional processes and is intended to provide greater detail at finer spatial resolution than the driving GCM. NARCCAP RCM output is spatially downscaled to a 50 × 50 km² grid over North America (Mearns et al. 2009; Mearns et al. 2014). This downscaled output is archived for two 30-yr time slices (1971–2000 and 2041–70) at a temporal frequency of 3 h. All the NARCCAP simulations assume the IPCC’s A2 greenhouse gas storyline (IPCC 2007). We evaluated six NARCCAP GCM/RCM combinations (U.S. EPA 2013).
The BCSD information is statistically downscaled as described by Wood et al. (2004) and Maurer et al. (2007). This dataset provides temperature and precipitation on a 1/8° (approximately 14 × 10 km² at 45°N) horizontal grid. We evaluate four BCSD-derived future climates based on the same four underlying GCMs used by NARCCAP. For consistency with the NARCCAP scenarios, we use the 2050 CMIP3 BCSD scenarios for the A2 emissions storyline.
Finally, nondownscaled future climate projections are based directly on GCM output. We evaluate four projected future climates based directly on the four parent GCM output used by NARCCAP and BCSD. (Note, however, that many of the CMIP3 GCMs ran multiple versions of the A2 simulation, differing only in initial conditions, to better capture the random internal variability of the climate system and to extract a more robust signal of the anthropogenic climate forcing. These multiple versions are called “ensemble members,” and the BCSD data we used that were derived from the HadCM3 and CCSM GCMs are from different ensemble members compared to the corresponding nondownscaled or NARCCAP data derived from the same GCM. By contrast, for CGCM3 and GFDL, the ensemble member used is identical across BCSD, NARCCAP, and nondownscaled GCM samples.)
The climate change information (e.g., from NARCCAP, BCSD, and nondownscaled GCM) used to drive SWAT watershed models in each study watershed was implemented as a daily meteorological time series. In each case, daily time series were created using the “change factor” or “delta” method (see, e.g., Anandhi et al. 2011). The change factor method combines information about relative change (between a historical period and future period, generally of a number of years or decades in length) in a particular climate variable of interest, such as temperature or precipitation, with one or more observed local time series of the same variable, to create a synthetic future input dataset for (in this case) the SWAT model.
Climate model outputs were bilinearly interpolated to each of the NCDC weather stations used by the SWAT models (see Table 1 for the number of stations in each watershed). Monthly change statistics (change factors) for each of the 14 total sources of future climate information (from NARCCAP, BCSD, or the nondownscaled GCM output) at each weather station were then calculated as the difference between mid-twenty-first-century (2041–70) and simulated baseline (1971–2000) values. The monthly change factors were additive for temperature and multiplicative (in terms of percent change) for precipitation. These change factors were then used to adjust 30 years of hourly historical precipitation and surface air temperature observations. See U.S. EPA (2013) for additional detail about the approach used.
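The change factor calculation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the processing chain documented in U.S. EPA (2013): the function names, the month-keyed dictionaries, and all sample values are hypothetical, and the precipitation factor is expressed as a ratio rather than a percent change.

```python
def change_factors(baseline_means, future_means, kind):
    """Monthly change factors from simulated baseline and future monthly means.

    kind="additive" (used here for temperature): future - baseline.
    kind="multiplicative" (used here for precipitation): future / baseline.
    """
    factors = {}
    for month, base in baseline_means.items():
        fut = future_means[month]
        if kind == "additive":
            factors[month] = fut - base
        elif kind == "multiplicative":
            factors[month] = fut / base
        else:
            raise ValueError("kind must be 'additive' or 'multiplicative'")
    return factors


def apply_change_factors(observed, factors, kind):
    """Adjust observed (month, value) records with the matching monthly factor."""
    adjusted = []
    for month, value in observed:
        f = factors[month]
        adjusted.append((month, value + f if kind == "additive" else value * f))
    return adjusted


# Hypothetical climate model monthly means at one station (January and July only).
temp_cf = change_factors({1: -5.0, 7: 20.0}, {1: -2.5, 7: 22.0}, "additive")
prcp_cf = change_factors({1: 50.0, 7: 80.0}, {1: 60.0, 7: 72.0}, "multiplicative")

# Hypothetical station observations as (month, value) pairs.
future_temp = apply_change_factors([(1, -8.0), (7, 25.0)], temp_cf, "additive")
future_prcp = apply_change_factors([(1, 4.0), (7, 0.0), (7, 10.0)], prcp_cf, "multiplicative")
```

A multiplicative factor keeps dry periods dry (zero precipitation stays zero), which is one reason the delta method treats precipitation differently from temperature.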
It is important to note that the different DSMs do not all provide the same meteorological variables. SWAT watershed simulations in this study estimate PET using the Penman–Monteith energy balance method, which requires inputs for solar radiation, humidity, and wind. To provide a consistent basis for comparison, simulated future climate change in this study represents changes only in air temperature and precipitation, the only two variables commonly archived for each DSM. The other climate variables needed to compute PET through the energy balance method are left unperturbed in this study as supplied by SWAT’s weather generator representation of existing climate. Accordingly, information about potential future change represents the effects of changes in air temperature on PET but does not account for changes in solar radiation, humidity, and wind. This delinking of mass inputs (precipitation) and energy inputs other than average air temperature is a simplification but reflects common practice in many climate impact studies (Milly and Dunne 2011). Note that results for SWAT simulations of these watersheds reported in U.S. EPA (2013) and Johnson et al. (2015) do make use of projected changes in these energy inputs where available.
2.2. Data aggregation and analysis
SWAT simulations in each study watershed resulted in 29–30 years of daily output for each future climate simulation evaluated. Daily output was first aggregated to time series of annual and seasonal averages. Aggregated values within each study watershed were then normalized by the mean and standard deviation of their baseline scenario (1971–2000). This converts each time series to a set of deviations from mean baseline conditions that share a common scale of projected change across watersheds. Our analysis focuses on projected changes in endpoint values. Changes were calculated by subtracting baseline deviations from future climate deviations. The endpoints we consider are the total streamflow, TN load, TP load, and TSS load. All analyses were evaluated, using mixed effects models with restricted maximum likelihood, at seasonal and annual intervals with the lme4 package in R (R Core Team 2014; Bates et al. 2014).
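The normalization and change calculation described above can be sketched as follows. The sketch makes two assumptions that the text does not state: it uses the sample standard deviation, and it pairs baseline and future values year by year. The function names and sample values are hypothetical.

```python
from statistics import mean, stdev


def standardize(series, baseline):
    """Express each value as a deviation from the baseline mean, in baseline SD units."""
    mu = mean(baseline)
    sd = stdev(baseline)  # sample standard deviation (an assumption)
    return [(x - mu) / sd for x in series]


def projected_changes(baseline, future):
    """Future deviations minus baseline deviations, paired by position (an assumption)."""
    z_base = standardize(baseline, baseline)
    z_future = standardize(future, baseline)
    return [f - b for f, b in zip(z_future, z_base)]


# Hypothetical annual averages for one endpoint in one watershed.
baseline_flow = [8.0, 10.0, 12.0]
future_flow = [10.0, 13.0, 16.0]
changes = projected_changes(baseline_flow, future_flow)
```

Because every series is scaled by its own watershed's baseline statistics, a change of 1 means a shift of one baseline standard deviation regardless of the watershed's absolute flow or load.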
Mixed effects models (i.e., hierarchical or multilevel models) are useful when data are nested within groups or categories, such as the climate models in this study (Zuur et al. 2009). These models contain both fixed and random effects, where the fixed effects evaluate overall, population-level relationships, and the random effects account for and produce estimates of heterogeneity among the groups or categories for the fixed effects. Like classical analyses of variance (ANOVAs) that can incorporate random effects, for example, those used to analyze randomized block designs, the goal is not to evaluate differences between the groups but to estimate the variability among them.
2.2.1. Analysis of sources of hydrologic variation within the ensemble
Mixed effects models were used to quantify the overall variability associated with parent GCM and DSM in the ensemble of 14 sets of simulated future climate for streamflow and water quality endpoints in each of the five study watersheds. In these models, we used the parent GCM (four groups; columns of Table 2) and DSM (three or four groups per parent GCM; cells of Table 2) as categorical random factors, with the DSM factor nested within the GCM factor. The mixed effects models produced estimates of the mean projected ensemble change and of three sources of variation, each expressed as a standard deviation: variation among parent GCMs, variation among DSMs within parent GCMs, and the unaccounted-for interannual variability within DSMs. The approach is analogous to a traditional, nested ANOVA, except it produces estimates of variation for the two random factors instead of mean GCM and DSM estimates. These effects are illustrated in Figure 2. The three estimated standard deviations produced by these statistical models are useful because they allow us to visualize the distribution of projected changes across the full ensemble of simulations with respect to both the parent GCMs and DSMs.
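One way to write the nested random effects model described above is given below. The notation is assumed for illustration and is not necessarily the notation used by the authors: $y_{ijk}$ is the normalized projected change in year $k$ for DSM $j$ within parent GCM $i$, and $\mu$ is the mean projected ensemble change.

```latex
\begin{align}
y_{ijk} &= \mu + a_i + b_{j(i)} + \varepsilon_{ijk}, \\
a_i &\sim \mathcal{N}(0, \sigma_{\mathrm{GCM}}^2), \quad
b_{j(i)} \sim \mathcal{N}(0, \sigma_{\mathrm{DSM}}^2), \quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma_{\varepsilon}^2).
\end{align}
```

Under this sketch, $\sigma_{\mathrm{GCM}}$, $\sigma_{\mathrm{DSM}}$, and $\sigma_{\varepsilon}$ correspond to the three reported standard deviations. In lme4 formula syntax, a model of this form would typically be written as `change ~ 1 + (1 | GCM/DSM)` and fitted with REML, where `change`, `GCM`, and `DSM` are hypothetical column names.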
2.2.2. Analysis of incremental effects of downscaling GCM output on hydrologic simulations
Analyses were also conducted to assess how simulated watershed responses change within the overall ensemble of simulations, when driven by downscaled versus nondownscaled GCM output (hereafter referred to as the incremental effects of downscaling). First, we used a mixed effects model with two fixed, binary, categorical variables to compare nondownscaled GCMs (controls) to downscaled means: BCSD, yes (1) or no (0), and NARCCAP, yes (1) or no (0). To measure the variability in these effects, we allowed the relationship between nondownscaled GCMs and downscaled means to vary randomly across parent GCMs (four groups; columns of Table 2). Because the NARCCAP variable in this statistical model does not distinguish between the two RCMs associated with CGCM3 and GFDL (e.g., both CGCM3-CRCM and CGCM3-RCM3 would have the same covariate values), we evaluated each mixed effects model four times: once with each unique combination of CGCM3 and GFDL GCMs and RCMs (Table 2) and then reported average parameter estimates and p values derived from those averages.
Each statistical model produced estimates of the mean projected change among the group of nondownscaled GCM data and of the differences between that mean and the mean projected changes in the groups of BCSD and NARCCAP data (the BCSD and NARCCAP fixed effects). These differences are illustrated in Figure 3a. These models also produce estimates of four sources of variation (standard deviations): variation among nondownscaled GCMs, variation in the effect of downscaling with BCSD among parent GCMs, variation in the effect of downscaling with different NARCCAP projections among parent GCMs, and the unaccounted-for interannual variation. This approach is analogous to a traditional analysis of covariance (ANCOVA) with two covariates and four groups (the parent GCMs), except it produces estimates of variation among the groups for the three fixed parameters (the nondownscaled GCM mean and the BCSD and NARCCAP effects), instead of four intercepts associated with individual nondownscaled GCMs and eight projected changes associated with individual BCSD and NARCCAP projections.
To visualize these random effects consider Figures 3b and 3c. Figure 3b shows differences between individual NARCCAP projections and their associated nondownscaled GCM projections (dashed red lines), compared to the overall group difference (thick black line). Figure 3c shows how those differences are represented in the model. Orange arrows show how each nondownscaled GCM projection compares to the overall group. The mixed effects model estimates the variability associated with those differences, while a traditional ANCOVA would estimate the magnitude of each difference separately. Blue arrows show how the difference between each NARCCAP projection and its associated nondownscaled GCM projection compares to the overall group difference (thin black lines have been added to highlight this comparison).
The mixed effects model estimates the variability in these downscaling effects (the BCSD and NARCCAP random effects), while a traditional ANCOVA would estimate the magnitude of each effect separately. Of these sources of variation, we are interested in the variability in the BCSD and NARCCAP effects. When significant, BCSD or NARCCAP fixed effects indicate that the application of downscaling consistently found regional patterns not found in the nondownscaled GCM output, resulting in directional shifts in simulated streamflow or water quality responses. Significant BCSD or NARCCAP random effects indicate that the magnitude or direction of the BCSD or NARCCAP effect depends on the parent GCM and downscaling model; the larger the estimated standard deviation of that effect, the larger the discrepancy between the overall BCSD or NARCCAP effect and individual model combinations.
We then used three simple mixed effects models to estimate the variability of watershed simulations using BCSD and NARCCAP downscaled climate to compare against the variability among nondownscaled GCMs. This differs from the previous analysis in that it allows us to visualize the variability among the three categories of climate simulations separately. In these statistical models, either the nondownscaled GCM, BCSD, or NARCCAP (four groups per category; rows of Table 2) was used as a categorical random effect. Here, too, we evaluated each mixed effects model four times, once with each unique combination of CGCM3 and GFDL GCMs and RCMs (Table 2).
These statistical models produced estimates of the mean projected change for each group of projections, the nondownscaled GCM, BCSD, or NARCCAP projections, and of two sources of variation: variation among DSMs (estimated separately for the nondownscaled GCM, BCSD, and NARCCAP groups) and the unaccounted-for interannual variation within the DSMs. This approach is analogous to a traditional one-way ANOVA, except that it produces estimates of variation instead of mean nondownscaled GCM, BCSD, or NARCCAP responses. Figure 3d can be used to compare these random effects to the previous model. Orange arrows again show how each nondownscaled GCM projection compares to the overall group of nondownscaled GCMs. Both models produce a valid estimate of the variation among nondownscaled GCMs, but for consistent comparisons, we report the version estimated here. Green arrows, however, show how each NARCCAP projection compares to the overall group of NARCCAPs rather than showing how individual downscaling effects differ from the overall effect. Of these two sources of variation, we are interested in the variability among DSMs for each group. Taking all parts of Figure 3 together, the group of NARCCAP scenarios estimates a larger hydrologic response to climate change than the nondownscaled GCMs, but the variability in the NARCCAP effect also leads to larger variability among the NARCCAP scenarios.
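The model with fixed and random downscaling effects can be sketched in equation form. The symbols are assumed for illustration and may differ from the authors' notation: $y_{ik}$ is a normalized projected change associated with parent GCM $i$, and $B_{ik}$ and $N_{ik}$ are the binary BCSD and NARCCAP indicators.

```latex
\begin{align}
y_{ik} &= (\beta_0 + u_{0i}) + (\beta_B + u_{Bi})\, B_{ik}
        + (\beta_N + u_{Ni})\, N_{ik} + \varepsilon_{ik}, \\
u_{0i} &\sim \mathcal{N}(0, \sigma_0^2), \quad
u_{Bi} \sim \mathcal{N}(0, \sigma_B^2), \quad
u_{Ni} \sim \mathcal{N}(0, \sigma_N^2), \quad
\varepsilon_{ik} \sim \mathcal{N}(0, \sigma_{\varepsilon}^2).
\end{align}
```

Here $\beta_0$ plays the role of the mean projected change among nondownscaled GCMs, $\beta_B$ and $\beta_N$ are the fixed BCSD and NARCCAP effects, and $\sigma_0$, $\sigma_B$, $\sigma_N$, and $\sigma_{\varepsilon}$ are the four standard deviations. Whether correlations among the random effects were estimated is not stated in the text, so the sketch leaves them unspecified.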
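Each of the three simple models can be sketched as a one-way random effects model fitted separately to one category $c$ of projections (nondownscaled GCM, BCSD, or NARCCAP). The notation is assumed for illustration: $y_{jk}$ is the normalized projected change in year $k$ for projection $j$ within the category.

```latex
\begin{align}
y_{jk} &= \mu_c + d_j + \varepsilon_{jk}, \\
d_j &\sim \mathcal{N}(0, \sigma_c^2), \quad
\varepsilon_{jk} \sim \mathcal{N}(0, \sigma_{\varepsilon}^2).
\end{align}
```

In this sketch $\mu_c$ is the category mean and $\sigma_c$ is the among-projection standard deviation that is compared across the three categories.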
3. Results and discussion
Simulated changes in streamflow and water quality endpoints in response to the 14 different projected future climates in each of the five study watersheds are shown in Figure 4. The values shown are ratios of the future (2041–70) to baseline (1971–2000) annual average values at the downstream outlet of each study watershed. Symbols represent watershed responses to climate change based on nondownscaled GCM, NARCCAP, and BCSD data. Projected average changes in air temperature, precipitation, actual evapotranspiration (AET), and PET for each climate future are also shown for comparison. Figure 4 illustrates a wide range in simulated water quality endpoints when different categories of future climate change information are used to drive SWAT. The range of water quality responses is generally wider (on a percentage basis) than the range in driving climate variables and in most cases spans unity (indicating disagreement about the sign of future change). This reflects the cascading effects of variability in climate drivers when coupled with watershed modeling to assess watershed responses. In addition, Figure 4 shows the important role of water limitation in certain regions and seasons of the year, as revealed by the difference in future changes between actual and potential evapotranspiration.
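The ratio and the "spans unity" check described above can be illustrated with a short sketch. The function names and load values are hypothetical and are not results from Figure 4.

```python
def change_ratio(future_mean, baseline_mean):
    """Ratio of future to baseline annual average; 1.0 means no change."""
    return future_mean / baseline_mean


def spans_unity(ratios):
    """True when the ensemble disagrees about the sign of change."""
    return min(ratios) < 1.0 < max(ratios)


# Hypothetical annual average loads for one endpoint under four climate futures.
baseline_load = 200.0
future_loads = [170.0, 190.0, 230.0, 260.0]
ratios = [change_ratio(f, baseline_load) for f in future_loads]
disagreement = spans_unity(ratios)
```

In this example two futures project decreases and two project increases, so the ensemble spans unity.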
3.1. Sources of variation within the ensemble
Analysis of simulation results in the five study watersheds shows that parent GCM, and DSMs within each parent GCM, can each be a significant source of variability in the overall ensemble of projected streamflow and water quality responses to climate change. The relative contribution of GCM and DSM to the variability of simulated responses, however, varies by watershed, season of the year, and streamflow and water quality endpoint (Figure 5). Interannual variability in simulated streamflow and water quality that cannot be attributed to GCM and DSM also varies among watersheds, season, and endpoint. Parameter estimates for all models are presented in Table S1.
Results show differences in the hydrologic and water quality response to climate change among the five study watersheds. This is expected because of the differences in watershed physiographic and hydroclimatic conditions, land use, and other factors. Hydroclimatic conditions vary from the arid southwest (Salt) to the humid Pacific Northwest (Willamette) and southeast (ACF) and represent both continental and maritime midlatitude climates. For example, in ACF and Minnesota the estimated variability among parent GCMs was most often smaller than the variability of DSMs within the parent GCMs (Figure 5). In ACF, the variability among parent GCMs was smaller than the variability among DSMs within parent GCMs in all models considered, while in Minnesota this was true in 80% of the models considered. In contrast, variability among parent GCMs in Willamette was greater than the variability among DSMs within parent GCMs in 75% of the models, suggesting either that differences among parent GCMs are large or that differences among DSMs are small there. Much of the variability we observe across regions may depend on simulated precipitation and spring warming, as the timing and spatial distribution of precipitation has been shown to vary widely across climate models, which in topographically complex watersheds, or those that are influenced by small-scale meteorology, can result in very different flow patterns (Rasmussen et al. 2012).
Differences in the relative contribution of GCM and DSM among study watersheds can be illustrated by comparing results for the ACF and Willamette basins (Figure 6), which have different hydroclimatic and watershed attributes. For Willamette, GCMs tend to be more important than DSMs in determining variability for streamflow and water quality endpoints, while the reverse is true for ACF. Contributing to this difference, Willamette is strongly influenced by the large-scale flow (e.g., the North Pacific storm track) year-round, particularly in the cold season, over which the choice of GCM would be expected to play a larger role. By contrast, temperature and precipitation in ACF strongly depend on smaller-scale meteorology (e.g., local convection) that DSMs (particularly dynamical downscaling) would be more likely to resolve. In addition, the Willamette is closer to the inflow boundary of the RCM domains, so it is likely more strongly influenced by the driving GCM solution, whereas regional climate simulated at ACF experiences more modification as the meteorological flow traverses the RCM domains.
These results are consistent with Wang et al. (2009), who compared the performance of six RCMs over the intermountain region of the western United States to data from the North American Regional Reanalysis (NARR) dataset (Mesinger et al. 2006) and demonstrated that the different RCMs are largely consistent in the Cascade Range (Oregon, Washington) where the dominant upper-level flow first encounters land. The differences among RCMs reported by Wang et al. and the difference from NARR are greatest on the windward side of the Rocky Mountains in Colorado and remain large into Arizona (location of the Salt watershed).
Simulations within each of the five study watersheds also show differences in the relative contributions of GCM and DSM in different seasons of the year. In our ensemble of projected future climates, variability among parent GCMs was smaller than the variation among DSMs within parent GCMs most often in autumn and winter (Figure 5). The variability among parent GCMs was always smaller than among DSMs in winter, while in autumn it was smaller in 65% of the models considered. The converse was true in spring, where variability among parent GCMs was larger than among DSMs in 65% of the models. Projected changes that used downscaled results tended to deviate from nondownscaled results most in winter (discussed below; Figures 7–11). Bosshard et al. (2013) also found that DSM contribution to variance was larger during winter months. Apart from these patterns in the larger dataset, each watershed had their own unique characteristics driven by its hydroclimatic setting (Figure 5). For example, variability in projected changes tended to be highest in spring for Willamette, summer for Salt, autumn for ACF and Susquehanna, and winter for Minnesota. Specifically, Willamette has relatively high mountains where spring snowmelt is important. The Salt is affected by summer monsoons, and ACF has highly variable tropical storms in late summer and fall. Winter has the highest variability in Minnesota, likely in part because scenarios resolve winter temperatures and the difference between precipitation as rain or snow differently.
Simulations show less pronounced differences in the relative contributions of GCM and DSM for different streamflow and water quality endpoints. The variability of streamflow and water quality endpoints is most pronounced for the Minnesota and Salt Rivers (Figures 4, 5). The relationship between GCM and DSM effects across endpoints was relatively consistent, but interannual variation that could not be attributed to either source varied widely by metric. Unaccounted-for interannual variation in streamflow was larger than the other two effects in only 32% of the models, but this value increases to 68%, 80%, and 96% for the TSS, TN, and TP models, respectively. These results illustrate the greater variability in projected changes in water quality metrics, especially TN and TP, due to the multiple interacting factors affecting pollutant sources, fate, and transport, such as changes in precipitation intensity and seasonal timing relative to plant growth cycles.
3.2. Incremental effects of downscaling GCM output on hydrologic simulations
The incremental effects of downscaling were evaluated by comparing SWAT simulations in the five study watersheds when driven by downscaled versus nondownscaled climate change information from the same parent GCM. By “incremental effects,” we mean the quantified impacts, on simulated hydrologic endpoints, of using dynamical or statistical downscaling to modify the output from a given GCM. This is distinct from the overall variability among GCMs and/or DSMs, as presented in the previous section.
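As a minimal illustration of this definition, the sketch below computes an incremental effect as the difference between the projected change obtained with downscaled forcing and the projected change obtained with nondownscaled forcing from the same parent GCM. It assumes changes are expressed as percent of a baseline value; the function names and all numbers are hypothetical, and the sketch does not reproduce the fixed and random effects estimated in the statistical models.

```python
def percent_change(future, baseline):
    """Percent change of a simulated endpoint relative to its baseline value."""
    return 100.0 * (future - baseline) / baseline


def incremental_effect(downscaled_future, gcm_future, baseline):
    """Projected change with downscaled forcing minus projected change with
    nondownscaled forcing from the same parent GCM (percentage points)."""
    return percent_change(downscaled_future, baseline) - percent_change(gcm_future, baseline)


# Hypothetical mean annual streamflow values (arbitrary units, illustrative only).
baseline_flow = 100.0   # historical simulation
gcm_flow = 110.0        # future, nondownscaled GCM forcing
bcsd_flow = 118.0       # future, statistically downscaled forcing
narccap_flow = 104.0    # future, dynamically downscaled forcing

effects = {
    "BCSD": incremental_effect(bcsd_flow, gcm_flow, baseline_flow),
    "NARCCAP": incremental_effect(narccap_flow, gcm_flow, baseline_flow),
}
```

With these made-up values the two downscaling methods shift the projection in opposite directions relative to the nondownscaled GCM (+8 and -6 percentage points).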
Figures 7–11 show results for each streamflow and water quality endpoint by season of the year. The significance of fixed and random BCSD and NARCCAP effects is also shown in Figures 7–11. Parameter estimates for all of the effects models are presented in Tables S2 and S3. Results show significant variability in the effects of downscaling among watersheds, seasons of the year, and to a lesser extent with the different streamflow and water quality endpoints. In some cases (e.g., watersheds/seasons), watershed simulations driven by downscaled (BCSD, NARCCAP) versus nondownscaled (GCM) climate change information deviate in a consistent direction, suggesting that downscaling is capturing some common underlying process in the watershed, for example, orographic effects or lake snow, that the GCMs are not. In other cases, however, simulations using NARCCAP versus BCSD deviate from the GCM in ways that are not consistent with each other, including the sign of the projected change [e.g., recall Figure 4 and see discussion in Johnson et al. (2012)].
Simulations within individual study watersheds tend to show greater incremental effects of downscaling when driven by climate change information from NARCCAP RCMs (i.e., significance and size of the fixed BCSD and NARCCAP effects; 16% vs 26% overall; Figures 7–11). In many cases the variability among BCSD and NARCCAP scenarios was similar, but random NARCCAP effects were significant more often (64% vs 96% overall). In other words, for the ensemble of projected future climates in this study, NARCCAP RCMs were on the whole more likely to find consistent regional patterns that differed from nondownscaled GCMs, but individually these differences were more variable. This result could occur because, unlike with statistical downscaling, RCMs are able to alter the atmospheric circulation and convective environment in the parent GCM.
Simulations across the five study watersheds show significant variability in the incremental effects of downscaling in these different hydroclimatic and physiographic locations. For example, looking across all streamflow and water quality endpoints and seasons of the year, watershed simulations driven with downscaled climate change information (i.e., NARCCAP and BCSD) differed from simulations using nondownscaled GCMs most often in the Salt watershed. In the Salt, simulations using BCSD differed significantly from the nondownscaled GCM runs in 20% of the simulations, while those based on NARCCAP differed significantly in 60% of the simulations (Figures 7–11). With the exception of the summer season, which had highly variable changes in streamflow (Figures 5, 10), the use of climate change information from BCSD in the Salt resulted in relatively higher streamflows and loads, while the use of NARCCAP data resulted in relatively lower streamflows and loads. This contrasts with the ACF and Willamette watersheds, where, in the former, simulations driven with BCSD differed significantly from the nondownscaled GCMs in 30% of the cases (and simulations with NARCCAP did not differ), while in the latter, simulations driven by NARCCAP differed significantly in 20% of the cases (and simulations with BCSD did not differ). Across all five study watersheds, however, random NARCCAP effects were significant more often than random BCSD effects (Figures 7–11).
Last, Figures 7–11 illustrate variability in the incremental effects of downscaling when considering different streamflow and water quality endpoints and seasonal differences in endpoint values throughout the year. While the effects of downscaling were relatively consistent among the different endpoints, results are more variable across seasonal endpoint values. For example, considering annual average streamflow, simulations driven by downscaled climate change information from BCSD and NARCCAP differed significantly from those using nondownscaled GCMs in 5% and 35% of models, respectively (Figure 7). A similar pattern of significant fixed effects occurs in the autumn and summer seasons, where BCSD and NARCCAP effects differed from nondownscaled GCMs in 5% and 25% of models in autumn and 0% and 35% of models in summer, respectively (Figures 10, 11). In the spring and winter seasons, however, BCSD and NARCCAP effects differed in 20% and 0% of models in spring and in 50% and 30% of models in winter, respectively. In other words, BCSD effects were most significant in spring and especially winter (20% and 50% of models, respectively), while NARCCAP effects were significant in roughly equal proportions (25%–35% of models) for all periods except spring.
3.3. Assumptions and research needs
This study describes a particular set of watershed simulations to illustrate how driving a watershed model with different approaches to downscaling climate change information can influence simulation results. All results are conditional on the methods, models, and climate change information evaluated in the underlying simulations. Several caveats should be noted. First, to provide a consistent basis for comparison, all simulations of watershed response to climate change assume future changes only in air temperature and precipitation. We intentionally do not consider the implications of representing changes in other meteorological variables such as humidity, radiation, and wind speed that are necessary to calculate PET using an energy balance approach (see, e.g., Milly and Dunne 2011). Representation of these additional meteorological variables can have a significant influence on watershed simulation results. Sensitivity studies in the five study watersheds suggest the inclusion of projected changes in dewpoint resulted in a reduction in estimated annual PET of about 11% across all the meteorological stations, implying an underestimation of soil moisture and streamflow when change in dewpoint is not used (see U.S. EPA 2013). The importance of accounting for dewpoint for properly simulating hydrology under future climate change was also noted by Pierce et al. (2013).
In addition, all meteorological inputs used to drive the watershed models were created using the change factor method applied to historical time series. Use of change factors to translate climate change information to site-specific information to drive watershed models is, in itself, a simple additional step of statistical downscaling from the gridded DSM output to point gauge locations. Chen et al. (2013) have shown that different approaches for translating regional climate projections to site-specific inputs for hydrologic models can impact watershed simulations. In this study, we do not consider the implications of using different types of change factors (e.g., scaling vs quantile mapping), nor do we compare the change factor application to approaches that use the climate model–simulated sequences of precipitation events.
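A minimal sketch of a scaling-type change factor application is given below. It assumes monthly change factors, multiplicative for precipitation and additive for air temperature, which is a common form of the method but is an assumption here rather than a description of the exact procedure used in this study. The function name, record layout, and all values are hypothetical.

```python
def apply_change_factors(records, precip_ratio, temp_delta):
    """Perturb a historical time series with monthly change factors.

    records:      list of (month, precip_mm, temp_c) tuples from a gauge.
    precip_ratio: dict mapping month (1-12) to future/historical precipitation ratio.
    temp_delta:   dict mapping month (1-12) to temperature change in deg C.
    Months missing from either dict are left unchanged.
    """
    adjusted = []
    for month, precip_mm, temp_c in records:
        # Precipitation is scaled; temperature is shifted.
        new_precip = precip_mm * precip_ratio.get(month, 1.0)
        new_temp = temp_c + temp_delta.get(month, 0.0)
        adjusted.append((month, new_precip, new_temp))
    return adjusted


# Hypothetical daily records and change factors (illustrative only).
historical = [(1, 10.0, -5.0), (1, 0.0, -8.0), (7, 4.0, 24.0)]
ratios = {1: 1.10, 7: 0.90}
deltas = {1: 2.0, 7: 3.0}

future = apply_change_factors(historical, ratios, deltas)
```

Because the historical sequence is only rescaled, dry days stay dry and the timing of events is unchanged, which is why this approach differs from using climate model-simulated sequences of precipitation events.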
Finally, it must be noted that simulations of potential climate change impacts are subject to multiple and cascading uncertainties associated with different watershed model characteristics and methodological choices. In this study we address only one source—the effects of downscaling climate change information used to drive watershed models—and do not address other uncertainties affecting watershed simulations. The analysis is nevertheless illuminating and shows promise for providing systematic, quantitative, uncertainty characterization in the study of watershed responses to climate change.
Future research addressing the above and related methodological questions would be valuable. In addition, application of this type of statistical analysis to additional study areas, increasing the sampling across diverse hydroclimatic regimes, would be helpful for eliciting clearer patterns in the relative importance of GCM and DSM by watershed characteristics. Increasing the size of the GCM ensemble considered, and therefore the range of future climates, might similarly produce insights into more systematic patterns of response in watershed simulations to GCM versus DSM forcing. Finally, while the kind of “ensemble of opportunity” approach to pairing of GCMs and DSMs we have used here allows for the leveraging of a large volume of existing projection data, it makes it difficult to separate variability in hydrologic endpoints due simply to the increased resolution from that due to factors such as RCM model formulation or the particular statistical downscaling algorithm used. It would therefore be worthwhile to repeat this type of analysis in the context of a “big brother” or “perfect model” approach (see, e.g., Denis et al. 2002), where a high-resolution climate simulation is degraded to coarser resolution to create a synthetic analog of both the GCM and downscaled data from the same underlying model run.
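The degradation step in such an experiment can be sketched as follows. Block averaging is used here purely as a simple stand-in for whatever filtering a real "big brother" experiment would apply; the function name and grid values are hypothetical.

```python
def block_average(grid, factor):
    """Degrade a high-resolution 2D field to a coarser grid by averaging
    non-overlapping factor x factor blocks of cells."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for i in range(0, rows, factor):
        coarse_row = []
        for j in range(0, cols, factor):
            block = [grid[i + di][j + dj]
                     for di in range(factor)
                     for dj in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 4 x 4 high-resolution precipitation field (illustrative only).
fine_field = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
coarse_field = block_average(fine_field, 2)
```

The coarse field plays the role of the synthetic GCM, while the original fine field serves as the reference against which any downscaling of the coarse field can be judged.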
4. Summary and conclusions
Assessments of climate change impacts on water resources are complicated by the scale, complexity, and inherent uncertainty of the problem. This study illustrates one poorly understood but important facet of this complexity: the potential effects of DSM (including the choice to use downscaling at all) on simulations of hydrologic and water quality changes. Our results show that both the parent GCM and how downscaling is done can contribute to the variability of projected watershed responses. Moreover, sources of variability differ among watersheds, among seasons of the year, and among streamflow and water quality endpoints governed by different watershed and hydroclimatic processes. The differences among GCMs can be the major source of variability in some cases, while if and how the data are downscaled can be a major factor in others. Our results also provide a detailed illustration of how downscaling GCM output can alter simulations of watershed processes as compared to simulations based on nondownscaled GCMs. Water resources practitioners should be aware that while models are a useful and necessary part of management planning, there is significant uncertainty in projections associated with both GCM choice and DSM choice. Given the uncertainties, managers should seek to examine a wide range of plausible futures, identify potential vulnerabilities, and focus on solutions that are robust across that range rather than tailored to a single most likely future.
Statistical downscaling has power in its ability to reproduce local-scale deviations from areal average results, such as finer-scale orographic effects, and can adjust for some inherent spatial biases in GCMs, but it assumes historical spatial relationships between GCM output and local climate will remain unchanged over time. Statistical downscaling is also less computationally intensive and thus more conducive to running larger ensembles of scenarios. Dynamical downscaling with RCMs is a physics-based approach that attempts to account for changes in the relationship between global and local climate but requires a high level of effort and is not yet proven to yield more credible results. There is no consensus on a “best” downscaling approach for use in the assessment of climate change impacts on water resources. Statistical and dynamical methods each have advantages and disadvantages, and there is a wide variety of specific methods within each category. In choosing information sources for potential future climate change, one should consider the study goals and specific questions being asked, level of confidence required for information to be actionable, time and resources available, and other relevant questions that determine the decision context.
The authors thank the entire project team at Tetra Tech, Inc., Texas A&M University, AQUA TERRA, Stratus Consulting, and FTN Associates for their support contributing to the development of the watershed model simulations used in this analysis. We also thank Seth McGinnis of the National Center for Atmospheric Research (NCAR) for processing the North American Regional Climate Change Assessment Program (NARCCAP) output into change statistics for use in the watershed modeling. NCAR is supported by the National Science Foundation. We acknowledge the modeling groups, the Program for Climate Model Diagnosis and Intercomparison, and the WCRP’s Working Group on Coupled Modeling for their roles in making available the WCRP Coupled Model Intercomparison Project Phase 3 (CMIP3) multimodel dataset. Support of this dataset is provided by the Office of Science, U.S. Department of Energy. The views expressed in this paper represent those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency.
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/EI-D-15-0024.s1.