The Colorado River is the primary water source for much of the rapidly growing southwestern United States. Recent studies have projected reductions in Colorado River flows from less than 10% to almost 50% by midcentury because of climate change, a range that has clouded potential management responses. These differences in projections are attributable to variations in climate model projections but also to differing land surface model (LSM) sensitivities. This second contribution to uncertainty (specifically, variations in LSM runoff change with respect to precipitation, termed elasticities, and with respect to temperature, termed sensitivities) is evaluated here through comparisons of multidecadal simulations from five commonly used LSMs (Catchment, Community Land Model, Noah, Sacramento Soil Moisture Accounting model, and Variable Infiltration Capacity model), all applied over the Colorado River basin at ⅛° latitude by longitude spatial resolution. The annual elasticity of modeled runoff (fractional change in annual runoff divided by fractional change in annual precipitation) at Lees Ferry ranges from two to six for the different LSMs. Elasticities generally are higher in lower precipitation and/or runoff regimes; hence, the highest values are for models biased low in runoff production, and the range of elasticities is reduced to two to three when adjusted to current runoff climatology. Annual temperature sensitivities (percent change in annual runoff per degree change in annual temperature) range from declines of 2% to as much as 9% per degree Celsius increase at Lees Ferry. For some LSMs, small areas, primarily at midelevation, have increasing runoff with increasing temperature; however, on a spatial basis, most sensitivities are negative.
The Colorado River is the major water source for much of the southwestern United States. The river’s discharge is regulated by numerous dams on tributaries and two major main stem dams—Glen Canyon and Hoover, which form impoundments (Lakes Powell and Mead, respectively) that store about four times the mean annual natural flow of the river (observed discharge adjusted for effects of upstream diversions and storage) at its mouth. In a relative sense, the storage provided by these reservoirs is large (in contrast, the storage in the Columbia River basin is only about one-third of the river’s mean naturalized—absent water management effects—flow). Therefore, the Colorado River reservoir system is operated to carry over water from wet years to dry years, notwithstanding that it also reshapes the seasonal pattern of discharge. Because of the large storage, the reliability of the Colorado River reservoir system for water supply is relatively insensitive to changes in the seasonal pattern of discharge (see, e.g., Christensen et al. 2004 and Christensen and Lettenmaier 2007). However, the system is vulnerable to long-term changes in climate, such as the general drying projected for the southwestern United States by many climate models (Christensen and Lettenmaier 2007; Solomon et al. 2007; Seager et al. 2007). The multiyear drought experienced over the last decade highlights the potential impacts of long-term changes in Colorado River discharge (Barnett and Pierce 2009; Cayan et al. 2010; Overpeck and Udall 2010).
Because of the Colorado River’s water supply importance, it has been evaluated in several studies that have projected future flows (e.g., Milly et al. 2005; Hoerling and Eischeid 2007; Christensen and Lettenmaier 2007; Seager et al. 2007; Ray et al. 2008). The methods used in these studies differ, and methodological differences explain some of the wide range of projected future changes in Colorado River discharge, which vary from reductions in mean annual discharge of less than 10% to almost 50% by the mid-twenty-first century (Hoerling et al. 2009). Part of this range is attributable to differences in the forcing data used, both in resolution and in the set of climate models included in the different studies. Differences also reflect biases in the runoff projections (as shown below, hydrologic sensitivities are highly nonlinear, and hence depend on the models’ estimates of current climate runoff) and reflect variations in hydrologic model sensitivities to changes in precipitation and temperature. In this paper, we evaluate the land surface model (LSM) contribution to the overall uncertainty, specifically how runoff change differs depending on model responses to changes in precipitation and temperature.
Many LSMs have been developed with varying levels of complexity and philosophies with respect to their representations of the land surface hydrologic cycle. We focus on five LSMs that have been widely applied at regional to global scales. Included are three that were developed for use in global climate models [Catchment, Community Land Model (CLM), and Noah] and two that have been used primarily in uncoupled hydrologic applications [Sacramento Soil Moisture Accounting (SAC) and Variable Infiltration Capacity (VIC)]. LSMs used in global and regional climate models have the primary purpose of partitioning net radiation into turbulent and ground heat fluxes; hence, the lower boundary conditions for the atmosphere. Hydrologic models, in contrast, have the primary purpose of partitioning precipitation into evapotranspiration and runoff. The two LSMs with hydrologic heritage were developed primarily for hydrologic prediction purposes, albeit at large scales. They generally have more detailed representations of runoff production dynamics than do the LSMs intended for coupled applications. Nonetheless, with one exception discussed below, all models predict the full energy and water balances at the land surface, typically at time steps of one day or shorter.
Many studies have compared the performance of different LSMs in different hydroclimatic settings (e.g., Henderson-Sellers et al. 1995; Pitman et al. 1999; Boone et al. 2004; Mitchell et al. 2004; Wang et al. 2009). The Project for Intercomparison of Land Surface Parameterization Schemes (PILPS), for example, had four phases of controlled experiments that were intended to better understand the implications of parameterizations of water, energy, and momentum exchanges between the atmosphere and land surface (Henderson-Sellers et al. 1995; Pitman et al. 1999). As many as 30 LSMs participated in the various phases of PILPS. The North American Land Data Assimilation System (NLDAS) project compared the performance of four LSMs across the continental U.S. domain with respect to their capabilities in the context of land data assimilation (Mitchell et al. 2004; Lohmann et al. 2004). These and other studies have improved LSM performance by comparing models with one another and with observations to highlight their strengths and weaknesses. The studies have also illustrated the implications of differences in LSM parameterizations and complexity (Koster and Milly 1997; Mitchell et al. 2004). These differences, compounded by potential issues of numerical processes (e.g., Kavetski and Clark 2011) and by the diverse philosophies that underlie LSM development (e.g., focusing on the largely vertical processes that control land–atmosphere fluxes, as contrasted with the largely horizontal processes that control runoff generation), make understanding differences and uncertainties in LSM formulations exceedingly difficult.
Little previous work has been done to compare LSMs with respect to their hydrologic sensitivities to changes in long-term precipitation and temperature. Many previous studies have, however, investigated how changes in climate influence streamflow, and in so doing have attempted to quantify hydrologic sensitivities (e.g., Schaake 1990; Dooge 1992; Dooge et al. 1999; Sankarasubramanian et al. 2001; Fu et al. 2007; Gardner 2009; Zheng et al. 2009; among others), mostly from observations. These studies define hydrologic sensitivities in various ways. For example, Schaake (1990) first defined precipitation elasticity as the fractional change in runoff produced by a unit fractional change in precipitation. He demonstrated how variable this sensitivity (or elasticity, as defined in economics) was across regions of the United States, with a range from less than 1 in very humid areas to as high as 10 in some arid areas. Dooge (1992) calculated sensitivity factors (similar in nature to Schaake’s runoff elasticities) using empirical expressions that related (annual) actual evaporation normalized by potential evaporation to different humidity indexes, where the higher the humidity index, the lower the elasticity. He cautioned that the usefulness of the annual analysis was limited in some ways, and that inclusion of seasonality, which he later addressed (Dooge et al. 1999), could change the interpretation. Sankarasubramanian et al. (2001) estimated streamflow elasticity across the continental United States through use of both parametric and nonparametric estimators applied to time series of annual streamflow and precipitation. They found elasticity values for tributaries and main stem locations in the Colorado River basin (CRB) that ranged from a little less than two to a little more than four.
These two bodies of research, the LSM comparison studies and the hydrologic sensitivity studies, provide important foundations for understanding hydrologic sensitivities to changes in precipitation and temperature. LSMs play an important role in global and regional climate models, which in turn are the basis, directly or indirectly, for most projections of hydrologic impacts of climate change. We argue that better understanding of the relative sensitivities of these models to climate forcings is essential to understanding uncertainties in hydrologic projections. In this paper, we take a straightforward approach to evaluating the hydrologic sensitivities of five commonly used LSMs to precipitation and temperature changes, with the intent of better understanding 1) the extent to which land surface hydrology modulates or exacerbates regional-scale sensitivities to global climate change, 2) how much of the range in these hydrologic sensitivities is attributable to model bias, and 3) how these sensitivities vary spatially across the CRB.
2. Study area
The Colorado River (Fig. 1) drains parts of seven U.S. states and Mexico (642 000 km2) and ranges in elevation from over 4000 m in its northeastern headwaters to sea level at its mouth at the head of the Sea of Cortez 2250 km downstream. For purposes of water allocation, the Colorado River Compact of 1922 divided the CRB into the “upper basin” (above Lees Ferry, including Lake Powell) and the “lower basin” (below Lees Ferry, including Lake Mead). On basin average, about half of the CRB’s precipitation falls in winter (more so in the north than in the south) and about half in summer; however, from a hydrological standpoint, runoff is dominated by winter precipitation, which occurs mostly as snow in the upper basin and is the source of the spring freshet that accounts for most of the river’s annual streamflow (Fig. 2).
The CRB has large temperature and precipitation gradients, as well as diverse vegetation and soil types, all of which combine with variability in precipitation to produce one of the most variable hydrologic regimes in the continental United States. The coefficient of variation (standard deviation/mean) of annual flow volume at Lees Ferry as calculated by McMahon (1982) is 0.37, which is higher than that of most of the 126 rivers (median 0.25) included in the United Nations Educational, Scientific and Cultural Organization (UNESCO) study of world rivers [for comparison, the coefficients of variation of annual discharge for other large rivers are 0.18 (Columbia), 0.28 (Mississippi), 0.06 (Amazon), and 0.12 (Mekong)]. In addition to spatial variations in surface and subsurface properties, hydrologic variability in the basin is strongly affected by elevation. Hence, relatively high spatial resolution is required for adequate representation of the basin’s water balance. Water balance variations are strongly driven by evapotranspiration, which accounts for over 85% of precipitation on average (however, runoff ratios are much higher than the ~15% basin average in the high-elevation headwaters).
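As a minimal sketch of the statistic quoted above, the coefficient of variation of a series of annual flow volumes can be computed as follows. The flow values are hypothetical, and the use of the sample standard deviation is an assumption; the text does not state which form McMahon (1982) used.

```python
import statistics

def coefficient_of_variation(annual_flows):
    """Standard deviation divided by mean of a series of annual flow volumes."""
    mean_flow = statistics.fmean(annual_flows)
    # Assumption: sample (n - 1) standard deviation
    return statistics.stdev(annual_flows) / mean_flow

# Hypothetical annual flow volumes in arbitrary units, for illustration only
flows = [10.0, 14.0, 6.0, 12.0, 8.0]
cv = coefficient_of_variation(flows)
print(f"CV = {cv:.2f}")
```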
We focus on LSM water balance calculations and the sensitivity of these calculations to changes in precipitation and temperature. To do so, we forced all five models with the same surface meteorological data and implemented all models at the same (⅛° latitude–longitude) spatial resolution over the same domain. For each model, we compared both gridcell runoff (the sum of the model’s surface runoff and drainage) and streamflow (runoff routed through a simplified representation of the channel network) at selected stream gauge locations (Fig. 1). In the context of LSMs, runoff is a quantity that is generated instantaneously at each grid cell. We applied the routing model of Lohmann et al. (1996, 1998) to all five LSMs. The routing model accounts, in a simplified manner using a unit hydrograph, for the effect of the channel system in transforming spatially distributed (grid cell) runoff to streamflow and for lags both within a grid cell that affect the timing of runoff exiting a grid cell and within the channel system, which represents lagged effects from gridcell outlets to a (stream gauge) location on the channel network. A spatial average of runoff is slightly different than streamflow (primarily because of travel time considerations), but the long-term average is nearly the same.
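The role of a unit hydrograph can be illustrated with a generic discrete convolution. This is a sketch of the general technique only, not the Lohmann et al. (1996, 1998) scheme itself; the function name, the hydrograph ordinates, and the runoff series are hypothetical. Because the ordinates sum to one, routing shifts and spreads the runoff in time but conserves its total, which is why long-term averages of runoff and streamflow nearly coincide.

```python
def route_with_unit_hydrograph(runoff, unit_hydrograph):
    """Convolve a runoff series with unit hydrograph ordinates."""
    n_out = len(runoff) + len(unit_hydrograph) - 1
    flow = [0.0] * n_out
    for i, r in enumerate(runoff):
        for j, u in enumerate(unit_hydrograph):
            # Runoff generated at step i arrives at the outlet at step i + j
            flow[i + j] += r * u
    return flow

# Hypothetical ordinates (sum to 1) and hypothetical gridcell runoff
unit_hydrograph = [0.25, 0.5, 0.25]
runoff = [4.0, 0.0, 0.0, 8.0]
flow = route_with_unit_hydrograph(runoff, unit_hydrograph)
print(flow, sum(flow), sum(runoff))
```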
a. Meteorological forcing dataset
We used methods described in Maurer et al. (2002) as modified by Wood and Lettenmaier (2006) to generate daily historical gridded forcings of temperature minima and maxima, precipitation, and wind speed at ⅛° latitude–longitude resolution from observed station data. The modification of the approach by Wood and Lettenmaier (2006) uses a smaller set of index stations, which reliably report over many years, and avoids the problem of ephemeral stations, which can bias the long-term variability of the dataset. To meet requirements for models that run on a subdaily time step, daily values of precipitation and temperature were disaggregated into 3-hourly time steps according to methods outlined in Nijssen et al. (2001) and Wang et al. (2009). Similar to Maurer et al. (2002), other meteorological and radiation variables are calculated from established relationships—for example, downward solar and longwave radiation and dewpoint were derived from the daily temperature and temperature range using methods described in Nijssen et al. (2001). Surface air temperature, precipitation, wind speed, specific humidity, air pressure, and surface incident shortwave and longwave radiation forcings were identical for all LSMs. Because we focus on relative differences, we distributed precipitation uniformly throughout the day in all LSMs even though the Catchment model is more sensitive to diurnal precipitation variations than are the other four models (Wang et al. 2009). We selected 1975–2005 as our period of analysis because it includes the drought years in the early 2000s.
b. Land surface models
The five LSMs we used have diverse heritages; however, all of them have been used in either or both of the past multi-LSM studies of Mitchell et al. (2004) and Wang et al. (2009) and are structured to run offline in a semidistributed manner (Table 1). We included two versions of the Noah LSM as recent changes to Noah model parameterization processes to improve warm season simulations (see Wei et al. 2012 for modification details) have resulted in much different model performance (see Fig. 2). Each LSM was initialized by running it from 1970 to 2005, then cycling with 1970 forcings 10 times before beginning the simulation in 1970. The simulation was then continued through water years 1975–2005, which was our period of analysis.
We used LSM versions—including parameters—used in previous studies, with modifications only in spatial resolution and extent and forcing data. VIC was implemented as in Christensen and Lettenmaier (2007), which was calibrated at ⅛° spatial resolution. We used Noah 2.7 and 2.8, SAC, and CLM as used within the University of Washington’s real-time national Surface Water Monitor multimodel system (see http://www.hydro.washington.edu/forecast/monitor), which required an increase in the resolution from ½° to ⅛°. SAC does not generate potential evapotranspiration (PET) and instead uses PET generated by Noah 2.7 similar to Mitchell et al. (2004) multiplied by monthly vegetation adjustment factors. In this study we focus on results of the semidistributed version of SAC as used in Wang et al. (2009), although we have found in limited comparisons that results are comparable to the operational version (which is well calibrated, using only catchment-level characteristics) used by the CRB River Forecast Center; for example, for historical values at Lees Ferry, the elasticity was 2.4 (versus 2.6 with the distributed version) and the temperature sensitivity was 4% (versus 5%). We increased the spatial resolution of Catchment as used by Wang et al. (2009) from ½° to ⅛°, and used Catchment parameters provided by S. Mahanama [National Aeronautics and Space Administration (NASA) Goddard Space Flight Center, 2010, personal communication] that were produced specifically for a ⅛° implementation of the model.
As their diverse heritage indicates (Table 1), the various LSMs were constructed for different purposes and thus also differ in the extent to which they have been calibrated and compared with observed streamflow (Mitchell et al. 2004). VIC and SAC were developed specifically for streamflow simulation purposes. VIC was calibrated to a number of stream gauges within the Colorado basin (Christensen et al. 2004; Christensen and Lettenmaier 2007); however, its implementation here uses slightly different forcing data, and the model version is slightly different than the one for which calibrations were performed. The SAC version we used has not been calibrated, nor did we attempt to transfer parameters from the operational version, which has been implemented for somewhat different subbasins than the ones we used. No previous attempts had been made to calibrate Catchment, CLM, or either version of Noah. As a result, in general we do not expect simulated streamflows to match observations closely, and this is not our goal. However, in section 4, we do evaluate the nature of variations across models in their sensitivities, and assess how the model sensitivities are affected by biases in the LSM simulations.
LSM formulation and validation is particularly challenging in the CRB because there are limited in situ observations that capture the basin’s highly variable (in space and time) snow and soil moisture, especially at high elevations. Also, because such a large fraction of the basin’s precipitation either evaporates or is transpired, the accuracy of runoff predictions is highly susceptible to evapotranspiration (ET) prediction errors. For example, a 5% error in annual ET prediction (assuming ET is 85% of the total mass balance) translates to an error of roughly 25%–30% in annual runoff, because the same absolute error falls on a runoff term that is only about 15% of precipitation.
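The amplification in this example follows directly from the long-term balance Q = P − ET. A minimal sketch, assuming no storage change and using a hypothetical function name: with ET at 85% of P, a 5% ET error works out to roughly 28% of runoff, the same order as the figure quoted above.

```python
def runoff_error_from_et_error(et_fraction, et_relative_error):
    """Relative runoff error implied by a relative ET error when Q = P - ET."""
    runoff_fraction = 1.0 - et_fraction  # Q/P, assuming no storage change
    # The absolute ET error (as a fraction of P) is carried entirely by Q
    return et_relative_error * et_fraction / runoff_fraction

runoff_error = runoff_error_from_et_error(et_fraction=0.85, et_relative_error=0.05)
print(f"runoff error = {runoff_error:.1%}")
```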
c. Precipitation elasticity and temperature sensitivity formulation
We perturbed both precipitation (P) and temperature (T) by uniform amounts each day of the year throughout the period of record. We created reference climates by using multiplicative perturbations in P (70%, 80%, 90%, 100%, and 110%) and additive perturbations in T (0°, 1°, 2°, and 3°C). For both precipitation elasticity (ɛ) and temperature sensitivity (S) computations, we calculated the model response to an incremental change (1% and 0.1°C change, respectively) relative to each reference climate. We selected these increments of change (1% and 0.1°C) to be as small as possible so as to approximate the tangent (versus the secant) while limiting computational artifacts. As an example, for historical ɛ we compared the historical simulation (0°C T change and 100% of historical P) to a simulation where P was multiplied by 1.01; whereas for a 110% reference climate, we compared historical climate P multiplied by 1.10 with historical climate multiplied by 1.11. We estimated ɛ as the fractional change in annual average runoff (Q) divided by the imposed fractional change in P. In this study we use Δ = 1%:
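Written out from the definition above, Eq. (1) would take a form such as the following. The subscript notation is an assumption: Q_ref denotes annual average runoff under a reference climate, and Q_ref+Δ denotes runoff when P in that reference climate is incremented by Δ.

```latex
\varepsilon = \frac{\left( Q_{\mathrm{ref}+\Delta} - Q_{\mathrm{ref}} \right) / Q_{\mathrm{ref}}}{\Delta},
\qquad \Delta = 1\% \tag{1}
```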
Sensitivities to T changes are not as straightforward to estimate as those to P changes. Dooge (1992) suggests formulation of an elasticity of runoff with respect to PET; however, unlike P, which is a common forcing for all models, PET is not a measurable quantity but a function of measurable quantities, which is computed differently depending on LSM, and can be difficult to extract from the various models. While net radiation, rather than T, is the primary variable that defines PET, net radiation is T dependent, especially because downward solar radiation depends on daily T range as described below. Additionally, surface air T is the best-understood and widely archived variable simulated by climate models, so it makes some sense to formulate a T sensitivity rather than a PET elasticity. Furthermore, surface air T is the variable that is most often perturbed in hydrologic simulations of climate change (e.g., Christensen and Lettenmaier 2007; Elsner et al. 2010; and others). Surface air T affects and/or is affected by downward solar and net longwave radiation, sensible and latent heat fluxes, ground heat flux, and snow processes, which change evaporative demand and thus runoff. It is also notable that net radiation, vapor pressure deficit, and wind speed (along with air T) all affect evapotranspiration rates. All of these variables have been changing in the last few decades, and as Donohue et al. (2010) found by investigating these variables in Australia, nontemperature attributes can have an influence greater than T. For this reason, we perturb T in two ways as described below.
We defined S as the percent change in annual average Q per 1°C T change [Eq. (2)]:
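Following that definition, and by analogy with the elasticity, Eq. (2) can be written in a form such as the one below. The subscript notation is an assumption: Q_ref is annual average runoff under a reference climate and Q_ref+Δ is runoff after T is raised by the increment Δ = 0.1°C, so dividing by Δ expresses the result per 1°C.

```latex
S = \frac{\left( Q_{\mathrm{ref}+\Delta} - Q_{\mathrm{ref}} \right) / Q_{\mathrm{ref}}}{\Delta} \times 100\%,
\qquad \Delta = 0.1\,^{\circ}\mathrm{C} \tag{2}
```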
Because the model forcing data include daily temperature maxima (Tmax) and minima (Tmin), we used two methods to perturb T. Both increase the daily average T by the same amount, Δ—either increasing both Tmin and Tmax by Δ (referred to as STmin&max) or by fixing Tmin and increasing Tmax by 2Δ (referred to as STmin_fixed). We used Δ = 0.1°C, where in STmin&max calculations Tmin and Tmax are both increased by 0.1°C and in STmin_fixed calculations Tmin remains the same and Tmax is increased by 0.2°C. In the model forcing dataset, downward solar radiation is indexed to the daily temperature range, Tmax–Tmin, using the method of Thornton and Running (1999). In this algorithm, if Tmin and Tmax are both changed by the same increment (for STmin&max), the daily T range, and hence downward solar radiation, is unchanged (however, downward longwave radiation and humidity, both of which are forcings to the evapotranspiration algorithms used by the various LSMs, do change). On the other hand, when only Tmax is increased (for STmin_fixed), it has the effect of changing downward solar radiation, as well as downward longwave radiation and humidity, hence in general resulting in larger changes in net radiation and vapor pressure deficit.
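The two perturbation methods can be sketched as follows. The function names and the example day are hypothetical, and the daily average is assumed to be the midpoint of Tmin and Tmax. Both methods raise the daily average by Δ, but only the second changes the daily range to which downward solar radiation is indexed.

```python
def perturb_tmin_and_tmax(tmin, tmax, delta):
    """Raise both Tmin and Tmax by delta (the Tmin&max case)."""
    return tmin + delta, tmax + delta

def perturb_tmax_only(tmin, tmax, delta):
    """Hold Tmin fixed and raise Tmax by 2 * delta (the Tmin_fixed case)."""
    return tmin, tmax + 2.0 * delta

def daily_mean(tmin, tmax):
    # Assumption: daily average T is the midpoint of Tmin and Tmax
    return 0.5 * (tmin + tmax)

# Hypothetical day (degrees Celsius); delta = 0.1 as in the text
tmin, tmax, delta = 2.0, 16.0, 0.1
both = perturb_tmin_and_tmax(tmin, tmax, delta)
fixed = perturb_tmax_only(tmin, tmax, delta)

mean_shift_both = daily_mean(*both) - daily_mean(tmin, tmax)
mean_shift_fixed = daily_mean(*fixed) - daily_mean(tmin, tmax)
range_shift_both = (both[1] - both[0]) - (tmax - tmin)
range_shift_fixed = (fixed[1] - fixed[0]) - (tmax - tmin)
print(mean_shift_both, mean_shift_fixed, range_shift_both, range_shift_fixed)
```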
Our approach allows us to investigate changes in ɛ and S spatially both within and across models. We also evaluate ɛ and S of streamflow at Lees Ferry, where this spatially averaged value avoids having ɛ and S dominated by areas of the basin that produce little runoff but have large ɛ or S.
d. Precipitation and temperature interactions
As the climate changes, both T and P are projected to change simultaneously; however, in our LSM simulations in sections 3b and 3c, we altered the reference climate’s P or T, but not both. To determine the extent to which changes in P and changes in T interact, we compared four simulations for each LSM: 1) historical runoff, Qhis (ΔT = 0°C, P = 100%); 2) perturbed T, QΔT (1°C, 100%); 3) perturbed P, QΔP (0°C, 101%); and 4) both perturbed T and P, QΔTΔP (1°C, 101%). We compared QΔTΔP and QΔTΔPest, where QΔTΔP is the model simulation with both T and P perturbed simultaneously, and QΔTΔPest is estimated by Eq. (3):
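One natural reading of Eq. (3), assuming the estimate simply superposes the two single-perturbation responses on the historical runoff, is the following. Under this form, any difference between the simulated and estimated values measures the interaction between the P and T changes.

```latex
Q_{\Delta T \Delta P\,\mathrm{est}} = Q_{\mathrm{his}}
  + \left( Q_{\Delta T} - Q_{\mathrm{his}} \right)
  + \left( Q_{\Delta P} - Q_{\mathrm{his}} \right) \tag{3}
```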
For these simulations, unlike the ones outlined in section 3c, we use ΔT = 1°C (instead of 0.1°C) changes in T so that differences in runoff resulting from T and P changes are more similar in magnitude. We also select opposing responses (ΔT = 1°C will decrease runoff, whereas ΔP = 1% will increase runoff) to better distinguish the effects of each perturbation. Our analysis of P and T interactions includes all 4518 grid cells, effectively representing a wide range of reference conditions.
4. Results and discussion
We examine the influence of changes in forcing datasets by evaluating LSM performance with historical forcing data (section 4a) through perturbations in P (section 4b), T (section 4c), and both P and T (section 4d). Because Lees Ferry is the key gauge in the Colorado basin for water management purposes, we report naturalized flows at this gauge. Flows at Lees Ferry are generally representative of basinwide runoff as average annual naturalized streamflow at Lees Ferry is greater than 90% of the flow at Imperial Dam (USBR 2010)—the most downstream location for which naturalized streamflows are reported. The spatial patterns within each model also reveal meaningful differences between LSMs; therefore, we provide both gauge information and values at each ⅛° grid. Although we focus on water years 1975–2005, results changed little when different multidecadal periods of analysis were used.
a. Historical water balance
We compared LSM routed streamflow to naturalized flows at Lees Ferry as estimated by the U.S. Bureau of Reclamation (USBR); we refer to USBR values as observed (USBR 2010) (Fig. 2). In the observations, flows generally peak in spring (usually June) and vary in total average annual flow from 220 to 1010 m3 s−1 in individual years between 1975 and 2005, with an average of 590 m3 s−1. The seasonality and magnitude of streamflows differed among models, although trends in wet and dry years generally coincided with observations (Fig. 2). When compared to naturalized streamflow, baseline historical simulations for each LSM run had a range of biases. VIC simulated the peak in seasonal flow in the same month as observed, but had a dry bias averaging 12% for 1975–2005. SAC simulated peak flows 1 month earlier than observations on average and had a negative bias of 9% for 1975–2005. The two versions of Noah were the two extreme LSMs. Noah 2.8 was extremely wet relative to observations (64% wet bias in annual flows) and peaked 3 months earlier, whereas Noah 2.7 was dry (49% dry bias in annual flows) and peaked 1 month earlier. CLM also consistently underestimated (28%) annual flow and had a much narrower peak in monthly flow relative to observations and other models. Catchment peaked 2 months earlier and had a dry bias (44% in annual flows).
In LSM simulations, Q by construction should equal the residual of P minus ET in the long-term mean, assuming no long-term change in storage. When averaged over the entire basin (Fig. 3, top panels; Table 2), most LSMs had long-term storage changes that were less than 1% of P over the period of simulation. Noah 2.8 was an exception, with 8% of P per year on average not accounted for in either Q or ET. Basinwide averages of water balance in Table 2 differ from those at Lees Ferry (e.g., P = 1.0 mm day−1 at Lees Ferry versus 0.96 mm day−1 across the entire basin) primarily because the entire basin includes the drier lower basin; that is, similar values of runoff are averaged over larger areas.
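The closure check described above amounts to computing the fraction of P that appears in neither ET nor Q. A minimal sketch with a hypothetical function name follows; fluxes are normalized so that P = 1. The ET fraction of 0.69 and the residual of 0.08 are the Noah 2.8 figures quoted in this section, while Q = 0.23 is only the value those two figures imply, not a number reported in the text.

```python
def unaccounted_fraction(p, et, q):
    """Fraction of precipitation that appears in neither ET nor runoff."""
    return (p - et - q) / p

# Normalized long-term means for Noah 2.8 (P = 1).
# Q = 0.23 is implied by ET = 0.69 and an 8% residual; it is an assumption here.
residual = unaccounted_fraction(p=1.0, et=0.69, q=0.23)
print(f"unaccounted fraction of P = {residual:.2f}")
```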
Averaged over the entire basin, ET varied from 69% of P in Noah 2.8 to 94% of P in Noah 2.7 (Table 2). Basinwide P from the forcing dataset minus an estimate of observed streamflow averaged for 1975–2005 was approximately 0.87 mm day−1, or 90% of P. In this estimate we add 40 m3 s−1, the lower bound, to naturalized flow values at Imperial to include the Gila basin, as USBR does not report values for the Gila. In the LSMs, ET was influenced by both T and P—higher in the summer with a slight decline in June when P was lowest, which resulted in a bimodal peak. All LSMs simulated this phenomenon to some extent (Fig. 3). Catchment and SAC had ET peaks that occurred early (in April and May, respectively) at the same time Q peaked and again in August, while Noah (both versions) and CLM peaked in August with a smaller increase occurring in May. VIC ET had one well-defined peak in July, although when temperatures were increased, a bimodal peak began to appear.
The springtime peak in runoff coincided with the ET peak in all LSMs except VIC and Noah 2.8. In VIC, snow melted later than in the other LSMs, which might be associated with lower ET, as the growing season is shorter. In Noah 2.8, this is likely because the runoff peak occurred much earlier—before net radiation, which drives ET, was high enough to support a peak in ET.
Spatially, all LSMs had somewhat similar patterns in ET, Q, and snow water equivalents (SWEs) with most Q produced at the highest elevations (Fig. 4). Noah 2.8 had low SWE throughout the winter and ET values were much lower than for the other models, resulting in more Q. The relative fraction of Q generated in the lower basin varied considerably across models: VIC, SAC, Noah 2.8, and Catchment generated 32%–36% of total basin Q from the lower basin (below Lees Ferry), whereas CLM and Noah 2.7 generated much smaller amounts (12% for CLM and 23% for Noah 2.7), with most of the runoff coming from elevations greater than 1500 m. The lower basin is heavily managed (e.g., according to Blinn and Poff 2005, the Gila River’s virgin flows were greater than 40 m3 s−1, whereas now they are less than 6 m3 s−1). If, however, we conservatively assume naturalized flows below Imperial are 40 m3 s−1, lower basin flows are approximately 15% of the total basin. Similarities in model performance were somewhat surprising given that the partitioning of water between surface runoff and drainage differs considerably among models (e.g., about 30% of total runoff is surface runoff in VIC versus 73% in Catchment; Table 2).
The dataset we used (Wood and Lettenmaier 2006) may have a slight dry annual bias. When VIC streamflow generated with this forcing dataset is compared to other VIC simulations run with forcing datasets of Maurer et al. (2002) and Hamlet and Lettenmaier (2005) from 1950 to 1999, streamflow values at Lees Ferry were lower when using Wood and Lettenmaier (2006) by about 9% and 12%, respectively. Therefore, some dry bias in streamflow is not unexpected; the extent to which Noah 2.7, Catchment, and CLM streamflows were dry, however, resulted in biases that go beyond the effects of modest P differences. In particular, the wetness of Noah 2.8 relative to Noah 2.7 likely has to do with model physics parameterizations of turbulent fluxes (see Wei et al. 2012 for details on version differences) that result in low ET in the early spring and an apparent imbalance in Noah 2.8’s water budget.
b. Precipitation changes
Precipitation P perturbations of +10%, −10%, −20%, and −30% relative to historical values (1975–2005) resulted in large but relatively consistent changes in ET and Q across the LSMs (Fig. 3, second row). The applied percentage changes in P were uniform over the year, but because P had a strong seasonality, the magnitude of change in P (e.g., in mm) was not uniform. For instance, the smallest absolute changes in P averaged over the basin were in summer. Other water balance terms, however, did not have the same seasonal fluctuations. In fact, the largest absolute change in ET occurred for all models in the summer. The largest absolute Q change occurred when Q peaked, which varied among LSMs but was typically in the spring or early summer.
Precipitation elasticities ɛ (the percent change in Q for a 1% change in P) varied depending on reference climate (Fig. 5) and location within the basin (Fig. 6, top row). Values of ɛ calculated with historical values (see reference P = 100% on Fig. 5, left panel) ranged from 2.2 to 3.3 at Lees Ferry for different LSMs. In other words, a decrease in P of 1% relative to the climatology resulted in a decrease in Q of between 2.2% and 3.3%. This compares with an observed elasticity of 2.2 calculated at Lees Ferry from 1975 to 2005 using the nonparametric median estimator described in Sankarasubramanian et al. (2001).
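The two ways of estimating ɛ mentioned above can be sketched in Python as follows. The first function applies the perturbation definition directly; the second is our reading of the nonparametric median estimator of Sankarasubramanian et al. (2001), taken here as the median over years of (Q_t − mean Q)/(P_t − mean P) scaled by mean P/mean Q. Function names and the sample numbers are hypothetical.

```python
import numpy as np


def perturbation_elasticity(q_ref, q_pert, p_ref, p_pert):
    """Fractional change in runoff divided by fractional change in precipitation."""
    return ((q_pert - q_ref) / q_ref) / ((p_pert - p_ref) / p_ref)


def median_elasticity(p_annual, q_annual):
    """Nonparametric median estimator from annual P and Q series (assumed form)."""
    p = np.asarray(p_annual, dtype=float)
    q = np.asarray(q_annual, dtype=float)
    p_mean, q_mean = p.mean(), q.mean()
    dp = p - p_mean
    valid = dp != 0.0  # skip years where P equals its mean (ratio undefined)
    ratios = (q[valid] - q_mean) / dp[valid] * (p_mean / q_mean)
    return float(np.median(ratios))


# Hypothetical example: a 1% decrease in P produces a 2.2% decrease in Q.
eps_model = perturbation_elasticity(q_ref=100.0, q_pert=97.8, p_ref=400.0, p_pert=396.0)

# Hypothetical annual series with a linear P-Q relation, Q = 0.5 * (P - 200).
p_years = [300.0, 350.0, 400.0, 450.0, 500.0]
q_years = [50.0, 75.0, 100.0, 125.0, 150.0]
eps_obs = median_elasticity(p_years, q_years)
```

The median is used rather than a mean so that years with P very close to its long-term mean, where the ratio becomes unstable, do not dominate the estimate.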
Values of ɛ varied within LSMs depending on the reference climate and basin location. Drier (−30% to −10%) and wetter (+10%) simulations relative to climatology had ɛ values that ranged from 2.0 to 6.0. Values of ɛ for all LSMs decreased with increasing P. Declines in ɛ with increasing P had relatively similar slopes between LSMs with a slight concave curve that is more pronounced in LSMs and at locations that have lower flows (i.e., if the current climate flow was anomalously low for a given LSM, its ɛ value tended to be high relative to the others). Elasticity values greater than one, and strong increases in elasticity with declining precipitation, denote water limitations (Dooge 1992). These limitations become increasingly severe as precipitation declines. This is consistent with the Budyko hypothesis and analysis of climate sensitivities by Dooge (1992). It is therefore not surprising that the rank of the models from most to least elastic closely aligns with the magnitude of their historical flows. Specifically, Noah 2.7 (0.11 mm day−1), Catchment (0.12 mm day−1), and CLM (0.15 mm day−1) have the lowest average current climate flows at Lees Ferry, and the highest ɛ values (3.3, 3.0, and 2.9, respectively). This trend continues across different reference P values. If ɛ is plotted as a function of total flow rather than percent change, Noah 2.7, Catchment, CLM, and SAC tend to align on a single curve (Fig. 5, right panel), while VIC is slightly lower and Noah 2.8 higher. This highlights the importance of computing ɛ using simulations that reproduce historical streamflow. This result suggests that Noah 2.7, CLM, Catchment, and SAC would have similar ɛ values if their parameters were adjusted so that the simulated (reference climatology) streamflows were similar, whereas VIC would be less elastic and Noah 2.8 more elastic (if Noah 2.8 were not biased wet, it would have considerably higher ɛ values than shown in Fig. 5, right panel).
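One way to compare models at a common flow, in the spirit of the right panel of Fig. 5, is to interpolate each model's ɛ-versus-flow curve to a target flow. The sketch below uses linear interpolation, which is an assumption on our part; the function name and the sample curve are hypothetical and do not reproduce values from Fig. 5.

```python
import numpy as np


def elasticity_at_flow(target_flow, flows, elasticities):
    """Linearly interpolate an elasticity-versus-flow curve to a target flow."""
    flows = np.asarray(flows, dtype=float)
    elasticities = np.asarray(elasticities, dtype=float)
    order = np.argsort(flows)  # np.interp requires increasing x values
    return float(np.interp(target_flow, flows[order], elasticities[order]))


# Hypothetical curve for one LSM: mean flow (mm/day) under each P perturbation
# and the elasticity computed at that reference climate.
flows = [0.08, 0.10, 0.12, 0.14]
eps = [4.0, 3.4, 3.0, 2.8]

eps_common = elasticity_at_flow(0.11, flows, eps)
```

Note that `np.interp` holds the end values constant outside the sampled range, so a target flow beyond the simulated flows would need additional perturbation runs rather than extrapolation.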
LSMs differ considerably as to where within the basin they are most elastic (Fig. 6). For example, VIC has its highest ɛ values at high elevations that contribute the most to runoff, whereas these same areas have the lowest ɛ values in SAC (see the histogram below each map in Fig. 6). Most VIC grid cells have ɛ values that range from 0.3 to 5.0, whereas for SAC the range is from about 2 to 8, yet overall basin ɛ values are nearly identical between these two models because the area that contributes most of the runoff has similar ɛ values—even though these values are at opposite ends of their entire-basin histograms. In other words, the highest 25% of runoff comes from the parts of the basin where ɛ values are most similar between models (see dark blue areas of histograms in Fig. 6). For parts of the basin that generate less flow, the model ɛ values are much more divergent (Fig. 7). Notably, the basin’s runoff is strongly controlled by the relatively small headwaters area. Therefore, it clearly is most important for models to simulate the headwaters accurately since it is such a large contributor to the entire basin’s flows.
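The dominance of the headwaters follows from a short derivation, given here as a sketch in our own notation (Q_i and ɛ_i for runoff and elasticity of grid cell i, equal cell areas assumed, and δ for the uniform fractional change in P). Because each cell's runoff change is ΔQ_i = ɛ_i δ Q_i, the basin elasticity is the runoff-weighted mean of the cell elasticities:

```latex
\varepsilon_{\mathrm{basin}}
  = \frac{\Delta Q / Q}{\delta}
  = \frac{1}{\delta}\,\frac{\sum_i \Delta Q_i}{\sum_i Q_i}
  = \frac{1}{\delta}\,\frac{\sum_i \varepsilon_i\,\delta\,Q_i}{\sum_i Q_i}
  = \frac{\sum_i \varepsilon_i\,Q_i}{\sum_i Q_i}
```

Cells that generate little runoff therefore carry little weight, so two models can have very different ɛ histograms and still produce nearly the same basin-wide value.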
Negative ɛ values were rare, but occurred in all models except for SAC. These values appear to be computational artifacts, as the few values disappeared when perturbation values were increased from 1% to 10% and only continued to occur with any frequency in CLM at locations where runoff values were smaller than any other model (values of 0.001 mm day−1 and less). CLM also had some very high ɛ values in the lower basin (Fig. 6). This appears to occur because CLM generated exceptionally low runoff in the arid parts of the lower basin (Fig. 4); hence, even small increases can imply high ɛ values.
c. Temperature changes
Essentially all climate projections indicate that air T will increase in the CRB, as over most of the globe (Solomon et al. 2007). To explore how the basin will respond to increases in T, we increased reference T by 1°, 2°, and 3°C by increasing daily T minimum and maximum (Fig. 3, bottom row). Generally, as T increased, ET seasonal peaks became wider, and Q declined in all LSMs. Changes in water balance resulting from T increases had a stronger seasonal signal than P changes, and the magnitude of the annual Q change varied considerably among LSMs. Most notably, the results of T increases were declines in Q primarily in the spring and summer, and a shift in peak Q to earlier in the year.
As noted above, the daily T range is used to infer downward solar radiation. Therefore, keeping the daily T range the same implies no change in downward solar radiation, which suppresses changes in net radiation. If, however, the T perturbation increases Tmax without changing Tmin, the increase in T range results in increased net radiation. We calculated S by perturbing T in the two ways described in section 3c [Fig. 6, middle (STmin&max) and lower row (STmin_fixed)]. We first focus on STmin&max results and then discuss the differences between the two T perturbation approaches.
Values of S were largely negative and differed only slightly between reference conditions, but varied considerably among the different LSMs (Fig. 8) and spatially (Fig. 6; Fig. 7). CLM and Noah 2.7 had the highest aggregate S, whereas Catchment tended to be the least sensitive. When Tmin and Tmax were both increased, STmin&max ranged from a −2.8% change in Q per °C for Catchment to −8.4% in Noah 2.7 (Fig. 8, left panel) at Lees Ferry. Although STmin&max remained relatively constant with reference T (range for all reference T was −2.3% to −8.9%), VIC and Noah 2.7 had S that became slightly less negative as the reference T increased, whereas CLM and SAC had slight decreasing trends (became more negative) (Fig. 8). Aggregate S for different subbasins (not shown) had generally similar trends.
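For reference, the temperature sensitivity S (percent change in annual Q per °C of warming) can be computed as in the following minimal sketch. The function name and the sample flows are hypothetical; the flows were chosen only so that the result matches the −2.8% per °C order of magnitude quoted above.

```python
def temperature_sensitivity(q_ref, q_warm, delta_t):
    """Percent change in runoff per degree Celsius of imposed warming."""
    if delta_t == 0:
        raise ValueError("delta_t must be nonzero")
    return 100.0 * (q_warm - q_ref) / q_ref / delta_t


# Hypothetical annual mean flows (mm/day) for a reference run and a +1 degC run.
s_example = temperature_sensitivity(q_ref=0.200, q_warm=0.1944, delta_t=1.0)
```

Dividing by the imposed warming makes values comparable across increments of different size, such as the small increments discussed in connection with Eq. (2).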
The spatial distributions of STmin&max (Fig. 6, middle row) varied considerably, although most values were negative, reflecting an increase in ET and subsequent decline in runoff. The greatest differences in S among LSMs were at locations with the lowest runoff values (Fig. 7).
There were grid cells in all LSMs that had positive S, although the number and magnitudes of these values differed considerably among LSMs. Positive S appears to occur in three types of conditions. Two are essentially computational artifacts, while the third relates to physical processes. One condition has to do with outliers in S (dark blue or dark red cells in Fig. 6, middle panel) that are related to small, imposed T changes. These positive S values disappeared when larger T increments were used. A second condition occurs only in CLM (manifested as a large, light blue area in the lower basin in Fig. 6, middle panel, which constitutes about 10% of the basin area). This coincides with very small CLM total runoff (less than 0.005 mm day−1) and more specifically with the sandiest soil in the basin. These values appear to reflect internal computational issues within the model.
The third category of positive S comprises values that appear consistently when the T references change and are relatively insensitive to the T increment. These conditions were most noticeable in Catchment (9% of grid cells), but also occurred in Noah 2.7 (3%), Noah 2.8 (0.7%), VIC (0.6%), and CLM (0.02%). LSMs with higher surface runoff ratios tended to have the largest fraction of positive S (especially Catchment, where surface runoff was almost always greater than 50% of total runoff). Many of these positive S values occurred around 2000 m (over 60% of positive values occurred between elevations of 2000 and 2500 m).
The magnitude and direction of S demonstrate how land surface hydrology can both exacerbate, and more rarely modulate, regional-scale sensitivities to global climate change. Generally, as T increases, ET increases and runoff decreases (resulting in negative S). There are, however, some locations (the third category of positive S noted above) where there is a plausible mechanism for T increases to result in runoff increases because of land surface processes. This mechanism for positive S is the so-called Dettinger hypothesis, the details of which are described by Jeton et al. (1996). The hypothesis is that warmer T advances spring snowmelt and provides greater availability of moisture for runoff at a time of year when the energy available for ET is small. Hence, snowmelt is more efficiently transformed to runoff than later in the year, when evaporative demand is higher. Arguably, this mechanism should be most prevalent in locations where there is transitional snow (i.e., modest T increase results in large decreases in snowpack). It also stands to reason that this phenomenon would be more prevalent when there is more surface runoff relative to drainage, meaning that moisture available for runoff leaves the system sooner. There were, however, few locations where these positive values exert much change in runoff. The San Juan subbasin (near Bluff, Utah) in Catchment appears to be one location where both positive and negative S contributed to the lower overall negative S, but in other basins and especially in other LSMs, the aggregate flow changes were always negative and influenced little by areas with positive S.
The effect of increasing reference T on S was modest; however, changing the method of perturbing daily T (STmin&max versus STmin_fixed) resulted in large changes—roughly double for most of the LSMs (Fig. 8). When Tmin was fixed and Tmax was increased by 0.2°C for an average increase of 0.1°C [STmin_fixed; see Eq. (2) discussion above], S became more pronounced in all LSMs (Fig. 8, right panel), although spatial patterns remained similar (Fig. 6, bottom row). The historical climate had STmin_fixed that ranged from −7% to −15% for Catchment and Noah 2.7, respectively. Also, STmin_fixed was about a factor of 2 larger than STmin&max averaged over all models for aggregate streamflow at Lees Ferry. The ratio was largest for Catchment (about 2.4) and least for CLM and SAC (about 1.5). Noah 2.8, VIC, and Noah 2.7 were intermediate, with factors of 1.6, 1.9, and 1.9, respectively. Notably, changes in net radiation and vapor pressure deficit from changing the range in our T formulations changed S and thus indicate an important consideration in understanding the uncertainty of future climate impacts to water resources—highlighting considerations of climate variables beyond T and P. Donohue et al. (2010) found there are other climate variables that matter to potential ET calculations (e.g., net radiation, vapor pressure deficit, and wind speed).
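The difference between the two perturbation schemes can be illustrated with a short sketch. It assumes that daily mean T is the average of Tmin and Tmax, which is consistent with a 0.2°C increase in Tmax giving a 0.1°C mean increase; the function names and sample temperatures are hypothetical.

```python
def warm_min_and_max(tmin, tmax, dt_mean):
    """Shift Tmin and Tmax equally; the diurnal range is unchanged."""
    return tmin + dt_mean, tmax + dt_mean


def warm_max_only(tmin, tmax, dt_mean):
    """Hold Tmin fixed and raise Tmax by twice the target mean increase."""
    return tmin, tmax + 2.0 * dt_mean


def daily_mean(tmin, tmax):
    """Daily mean temperature taken as the average of Tmin and Tmax (assumption)."""
    return 0.5 * (tmin + tmax)


tmin, tmax = 2.0, 16.0  # hypothetical daily values in degC
both = warm_min_and_max(tmin, tmax, 0.1)
max_only = warm_max_only(tmin, tmax, 0.1)

range_change_both = (both[1] - both[0]) - (tmax - tmin)              # about 0.0 degC
range_change_max_only = (max_only[1] - max_only[0]) - (tmax - tmin)  # about 0.2 degC
```

Both schemes raise the daily mean by the same amount, but only the second widens the diurnal range, which is the quantity used to infer downward solar radiation.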
In the larger context, T minima and maxima have been shown to not be changing uniformly over recent decades. Easterling et al. (1997) found that T records globally indicated a decline in the diurnal T range primarily from T minima increasing more than T maxima. They attributed changes to increases in cloudiness, surface evaporative cooling from precipitation, greenhouse gases, and tropospheric aerosols that may be the result of urbanization, irrigation, desertification, and variations in local land use. Within the Colorado basin, their collection of nonurban stations showed that T minima values in the past 100 years increased while T maxima decreased, resulting in an overall decrease in the diurnal T range. Although extensive investigation into the implication of observed changes in CRB T ranges is beyond the scope of this study, our results do suggest that changes in the diurnal T range can have strong implications for S, which vary among LSMs, and thus are important to consider when constructing and evaluating climate change scenarios.
d. Precipitation and temperature
Understanding P and T impacts on runoff independently provides a foundation from which to understand future changes. In reality, however, climate change most likely will be manifested by a combination of changes (e.g., both P and T). To test the extent to which the two effects can be superimposed (i.e., the combined effect estimated as the sum of P and T effects), we compared a simulation where both P and T were changed with the predicted sum of the individual effects for each variable (Figs. 9 and 10)—see Eq. (3). For all of the LSMs, the combined effects were quite close to those estimated by superposition, indicating that interaction effects were quite small; for example, differences between QΔTΔPest and QΔTΔP, assuming no interaction, were within 1.5% of QΔTΔP for over 90% of all grid cells in all LSMs except for CLM (for which 90% of grid cells were within 2.5%, and 77% of grid cells were within 1.5%). In addition, QΔTΔP tended to be minutely smaller than QΔTΔPest, which would be expected since the combined run (QΔTΔP) has more P—and therefore runoff—for simultaneously occurring T increases to diminish.
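The superposition test can be sketched as follows. We assume the estimate takes the additive form Q_est = Q_ref + (Q_ΔP − Q_ref) + (Q_ΔT − Q_ref), which is our reading of Eq. (3) rather than a reproduction of it; the function names and the three-cell arrays are hypothetical.

```python
import numpy as np


def superposition_estimate(q_ref, q_dp, q_dt):
    """Combined-run runoff estimated as the reference plus the two separate changes."""
    q_ref, q_dp, q_dt = (np.asarray(a, dtype=float) for a in (q_ref, q_dp, q_dt))
    return q_ref + (q_dp - q_ref) + (q_dt - q_ref)


def percent_difference(q_est, q_combined):
    """Percent difference of the estimate from the run with both changes applied."""
    q_est = np.asarray(q_est, dtype=float)
    q_combined = np.asarray(q_combined, dtype=float)
    return 100.0 * (q_est - q_combined) / q_combined


# Hypothetical runoff (mm/day) at three grid cells.
q_ref = [1.00, 2.00, 0.50]   # reference climate
q_dp = [0.80, 1.70, 0.35]    # precipitation perturbed only
q_dt = [0.95, 1.90, 0.48]    # temperature perturbed only
q_both = [0.755, 1.60, 0.30]  # both perturbed in one simulation

q_est = superposition_estimate(q_ref, q_dp, q_dt)
pct = percent_difference(q_est, q_both)
frac_within = float(np.mean(np.abs(pct) <= 1.5))  # share of cells within 1.5%
```

A share close to one indicates that interaction between the P and T effects is small at most grid cells.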
We also examined the spatial patterns of the inferred interaction effect (Fig. 10). The percent difference of QΔTΔPest from QΔTΔP had spatial patterns that coincided with areas where runoff is more sensitive to changes in T and P. VIC had the greatest correlation (r%diff,elast = 0.72) with ɛ values and r%diff,sens = 0.68 with S values. SAC had the second highest correlations: r%diff,elast = 0.61 and r%diff,sens = 0.58. Correlations with ɛ values were greater than with S values, except for Noah 2.7 (which had small correlations for both r%diff,elast = 0.20 and r%diff,sens = 0.26).
More broadly, areas that were more sensitive to changes in P were also, generally, more sensitive to changes in T as evident in similar, yet opposite, spatial patterns and histograms of ɛ and S (Figs. 6 and 7). Correlation coefficients between ɛ and S for all individual grid locations (n = 4518) were, however, small for most models with VIC having the greatest correlation (r = 0.81), SAC the second largest (r = 0.48), and all others less than 0.3.
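A grid-cell correlation of this kind can be computed as in the sketch below. It assumes a Pearson correlation over cells where both fields are finite; the function name and sample arrays are hypothetical.

```python
import numpy as np


def gridcell_correlation(field_a, field_b):
    """Pearson correlation between two gridded fields, ignoring non-finite cells."""
    a = np.asarray(field_a, dtype=float).ravel()
    b = np.asarray(field_b, dtype=float).ravel()
    ok = np.isfinite(a) & np.isfinite(b)  # drop cells missing in either field
    return float(np.corrcoef(a[ok], b[ok])[0, 1])


# Hypothetical fields with one missing cell; the remaining cells are linearly related.
x = [1.0, 2.0, 3.0, np.nan, 5.0]
y = [2.0, 4.0, 6.0, 1.0, 10.0]
r = gridcell_correlation(x, y)
```

Masking non-finite cells matters in practice because elasticities and sensitivities are undefined where reference runoff is zero.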
It is unclear why models converge in ɛ and S in the headwaters and diverge elsewhere. This divergence in lower-flow regions is of somewhat diminished importance for understanding overall Colorado River flow, but arguably is important for understanding changes in water demand (which is beyond the scope of this study). For example, SAC and Noah 2.8 simulations have hydrologic sensitivities that indicate considerably more water stress as T increases and P decreases in the lower basin (i.e., a need for more water to compensate for climate change) than in VIC and Catchment simulations.
We investigated the hydrologic sensitivities of five commonly used land surface models (LSMs) to precipitation and temperature changes. We found the magnitude of predicted runoff changes resulting from these changes differs considerably among models as evidenced by large variations in precipitation elasticities and temperature sensitivities among the LSMs. Identifying the nature of these differences helps to better understand how land surface hydrology exacerbates or, in the Colorado River basin (CRB) more rarely, modulates regional-scale sensitivities to global climate change, how the range of hydrologic sensitivities is attributable to model bias, and how hydrologic sensitivities vary spatially across the CRB. More specifically, we found the following.
The direction of runoff change among LSMs is similar, with declines in annual streamflow at Lees Ferry when either precipitation decreased or temperature increased in all models. However, in most LSMs, there are some areas where temperature sensitivities are positive—mostly in the transient snow zone. In these areas, land surface processes modulate change, although the fraction of the CRB affected is quite small (and accounts for at most 7% of total runoff).
Model biases have an overt effect on the range of precipitation elasticity values, but an equivalent effect is not apparent in temperature sensitivity values. Most models have larger elasticities with respect to precipitation than are inferred from observations. This is in part because most of the models are biased downward in their reproduction of current runoff. Differences in LSM elasticities are amplified by these dry biases, which highlights the importance of simulating historical runoff magnitudes reasonably before performing climate perturbations. However, even with runoff bias accounted for, with elasticities interpolated to observed flows at Lees Ferry, there remains a range from about 2.2 to 3.1 in the elasticity of aggregate flows. Temperature sensitivities vary by at least a factor of 2 among LSMs but are not influenced much by biases in the models’ runoff simulations (an analogous interpolation for temperature to adjust for dry biases would not be appropriate).
The elasticities and temperature sensitivities are more consistent among models in headwater regions that produce most of the CRB’s runoff relative to other locations in the basin. Convergence in the headwaters tends to mask larger differences in parts of the basin that produce less runoff.
Superposition of precipitation and temperature changes largely holds with respect to annual runoff in the CRB in all LSMs across 4518 grid locations that represent a range of reference conditions; that is, the combined effect of precipitation and temperature changes is essentially equivalent to the sum of the precipitation and temperature contributions computed separately.
These findings for the CRB highlight a way to evaluate LSM performance that relates directly to their use in climate change studies. Similar investigations in other major river basins would be advantageous to further compare LSM performance. Notably, the CRB is more extreme than most river basins: evapotranspiration constitutes more than 85% of the basin’s water balance, so the CRB is particularly sensitive to the differences in the models’ evapotranspiration parameterizations that underlie the differences in LSM responses.
The authors thank Ted Bohn and Ben Livneh for assistance with model setup; Randy Koster for use of the Catchment model and feedback on earlier versions of the paper; Brad Udall, Robin Webb, Dan Cayan, Levi Brekke, and Kevin Werner for early feedback on research direction; and two anonymous reviewers for suggestions on manuscript revisions. This publication was funded by the NOAA Regional Integrated Sciences and Assessments program and the NOAA Climate Dynamics and Experimental Prediction/Applied Research Centers program under NOAA Cooperative Agreements NA17RJ1232 and NA10OAR4320148 to the Joint Institute for the Study of the Atmosphere and Ocean (JISAO), and by U.S. Department of Energy Grant DE-FG02-08ER64589 to the University of Washington.
Joint Institute for the Study of the Atmosphere and Ocean Contribution Number 1871.
Current affiliation: CH2M HILL, San Diego, California.