Observations show snowpack has declined across much of the western United States over the period 1950–99. This reduction has important social and economic implications, as water retained in the snowpack from winter storms forms an important part of the hydrological cycle and water supply in the region. A formal model-based detection and attribution (D–A) study of these reductions is performed. The detection variable is the ratio of 1 April snow water equivalent (SWE) to water-year-to-date precipitation (P), chosen to reduce the effect of P variability on the results. Estimates of natural internal climate variability are obtained from 1600 years of two control simulations performed with fully coupled ocean–atmosphere climate models. Estimates of the SWE/P response to anthropogenic greenhouse gases, ozone, and some aerosols are taken from multiple-member ensembles of perturbation experiments run with two models. The D–A shows the observations and anthropogenically forced models have greater SWE/P reductions than can be explained by natural internal climate variability alone. Model-estimated effects of changes in solar and volcanic forcing likewise do not explain the SWE/P reductions. The mean model estimate is that about half of the SWE/P reductions observed in the west from 1950 to 1999 are the result of climate changes forced by anthropogenic greenhouse gases, ozone, and aerosols.
The western United States is an arid region with a large and growing population. Water is a precious resource, and changes in the hydrological cycle have important societal and economic effects. Retention of winter precipitation in the form of snowpack is an integral part of the hydrological cycle in the region. Precipitation from winter storms can be retained in snowpack and released gradually, often months later, in the drier parts of the year. The majority of streamflow in the western United States originates from melting snowpack (Palmer 1988), and in much of the west more water is stored in snowpack than in man-made reservoirs (Mote et al. 2005). Future changes in snowpack are therefore a subject of considerable economic and societal importance.
Many studies have documented a reduction in snow over the western United States. Groisman et al. (1994) examined snow cover extent from satellite data over the period 1972–92 and found a statistically significant reduction over North America in the spring and summer. Groisman et al. (2004) used National Weather Service Cooperative network (co-op) station data to show March snow cover extent in the west has diminished over the period 1950–2003. Mote (2003), using data from snow courses, showed 1 April snow water equivalent (SWE) in the Pacific Northwest decreased strongly from 1950 to 2000. Mote et al. (2005) extended this analysis to the rest of the western United States, showing the reduction in SWE is widespread, with the notable exception of the southern Sierra Nevada. There, high altitudes and colder temperatures combined with unusually wet conditions acted to increase SWE (Mote 2006). Knowles et al. (2006) documented a regional trend toward more winter precipitation falling as rain instead of snow in the period 1949–2004; this is one of the factors contributing to the decreasing snowpack, along with an increase in winter daily melt events in the region (Mote et al. 2005).
In this work we use a formal detection and attribution (D–A) methodology to examine if some part of observed changes in snowpack over the western United States can be confidently ascribed to anthropogenic greenhouse gases (GHGs), aerosols, and ozone. Detection examines whether the changes in snowpack are likely to have arisen from natural internal climate variability. Attribution asks whether the snowpack changes are consistent with the expected effects of these anthropogenic forcings and inconsistent with other natural external climate forcings, such as variability in solar irradiance and atmospheric burdens of volcanic dust. D–A allows an explicit estimation of the likelihood of obtaining the observed decrease in snow given the model-estimated background of natural variability and compares the observed changes to those expected to occur owing to anthropogenic effects. Only if changes are both outside the likely range expected due to natural climate variability and consistent with the changes expected due to anthropogenic forcing can it be concluded that human activity has a role in reducing winter snowpack.
We use 1600 years of control run data from fully coupled global general circulation climate models (GCMs) to provide estimates of natural internal variability. Multiple ensemble members of two GCMs run with estimated historical changes in well-mixed GHGs, aerosols, and ozone supply the expected response of snowpack to these anthropogenic forcings. We statistically downscale the GCM results to ⅛° resolution then use the downscaled fields as input to a fine-resolution hydrological model. The hydrological model calculates the SWE values as well as soil moisture, runoff, and other variables in the hydrologic water balance used in companion work (Barnett et al. 2008; Bonfils et al. 2008; Hidalgo et al. 2008, manuscript submitted to J. Climate).
Previous studies have examined if changes in western U.S. snow are consistent with anthropogenic effects, although none have used a formal D–A method. The primary focus of these works has been on determining the relative roles of temperature and precipitation in causing the decline. Mote (2003, 2006), for example, used linear regressions between SWE, temperature, and precipitation to conclude that increases in SWE over the period 1930–50 were caused by increases in precipitation, but the widespread reductions in SWE since 1950 are due to regional warming. They found high elevation snow courses (permanent sites where manual measurements of snow depth and SWE are taken) with cold winter temperatures are little influenced by the warming—with SWE that can increase where local precipitation has increased. Natural internal climate variability, such as that associated with the North Pacific index (Trenberth and Hurrell 1994), was insufficient to explain the observed post-1950 warming. Mote et al. (2005) and Hamlet et al. (2005) used runs of the variable infiltration capacity (VIC) hydrological model (Liang et al. 1994; Cherkauer and Lettenmaier 2003) with specified forcing to explore this issue; after first verifying that VIC reproduced the observed trends in SWE when forced with historical meteorological data, they showed with fixed-precipitation runs that SWE decreases are predominately driven by regional warming. Again, these works found natural internal climate variability, such as from the Pacific decadal oscillation (PDO) (Mantua et al. 1997), could not explain the magnitude of the observed warming or SWE depletions. The contribution of the present work is to use formal D–A methodology to develop a model-estimated fingerprint of the SWE changes expected due to anthropogenic effects, assess the likelihood of the fingerprint pattern arising by chance from natural internal or external climate variability, and determine if the observed pattern of SWE changes agrees with that expected to be seen from anthropogenic warming.
The remainder of this paper is organized as follows. In section 2 the various data sources are described, including the observed snow and precipitation datasets, the global model control and anthropogenic runs, and the statistically downscaled model datasets. The D–A methodology and results are presented in section 3. A summary and our conclusions are given in section 4.
a. Snow courses
We use snow course data from the National Water and Climate Center (NWCC; http://www.wcc.nrcs.usda.gov), part of the U.S. Department of Agriculture Natural Resources Conservation Service (downloaded from ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course, accessed 22 September 2006). Over 2500 locations are available across the western United States, Alaska, and parts of British Columbia and Alberta, Canada. In the southern Sierra Nevada, the NWCC data were augmented with data from the California Cooperative Snow Surveys obtained from the California Data Exchange Center (CDEC; http://cdec.water.ca.gov; accessed 22 September 2006), California Department of Water Resources, Division of Flood Management. As described on those sites, snow courses are generally around 300 m long and sampled at multiple locations along their length to reduce the effects of drifting and small depressions in the ground. Measurements are taken by driving a hollow tube into the snow, which is then weighed to measure the SWE.
We use snow course measurements taken within 10 days of 1 April, which tends to be the most sampled date and near peak SWE in the western United States (Cayan 1996; Bohr and Aguado 2001). Mote et al. (2005) note that improved travel has resulted in a snow courses being sampled closer to 1 April over time, but conclude that this effect is small compared to climate effects on the snowpack. Although a few courses have data starting before 1910, coverage increases markedly by 1940. We used stations starting by 1950 with no more than 20% missing data from 1950 to 1999. This yields a pool of 661 stations for subsequent consideration (continued below). The beginning year is chosen partly to ensure good snow course coverage and partly because measured trends in precipitation are less reliable in earlier years (Groisman and Easterling 1994). The ending year is determined by the availability of model data produced for the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4) database.
Considerable work has been done to isolate the relative contributions of temperature and precipitation (P) to variations in spring SWE (e.g., Groisman et al. 1994; Cayan 1996; Mote 2003; Mote et al. 2005; Hamlet et al. 2005; Mote 2006). In short, temperature fluctuations have a larger effect in the warmer elevations near the freezing level, while changes in precipitation become more important at higher elevations where winter temperatures are far below freezing. In this work we focus on whether there is a detectable temperature-driven effect on western U.S. snowpack. We therefore analyze SWE/P rather than simply SWE to minimize the effects of year-to-year precipitation fluctuations, which are not of primary interest here and add noise to the temperature-driven snowmelt signal (Dettinger and Cayan 1995; cf. Barnett et al. 2008). The effectiveness of this technique is examined in section 3f, where it is shown that SWE/P has no statistically significant sensitivity to P.
We divide 1 April SWE by the total P summed from October through March, so the SWE/P ratio shows what fraction of current water year precipitation remains in the snowpack by 1 April. Seventeen stations (3% of the total) with a mean SWE/P ratio less than 0.15 were eliminated, as we wanted to focus on areas where an appreciable fraction of winter precipitation was retained in the snowpack on 1 April. The ideal precipitation data for this purpose would be measured at the same locations as the snow courses. Some snow courses are collocated with automated snow telemetry (SNOTEL) precipitation stations; however, the majority of SNOTEL data begins less than 20 years ago, too short a time to detect a slowly evolving anthropogenic signal. There also tend to be disagreements between reported snowfall and SWE even for collocated SNOTEL and snow courses, although snow course SWE measurements correlate over longer distances than SNOTEL snowfall measurements (Dressler et al. 2006). This is helpful for our purposes, as it suggests the snow course measurements have less noise.
We use the gridded daily precipitation dataset of Hamlet and Lettenmaier (2005; hereafter HL05), which is available at relatively fine spatial resolution (⅛° longitude by latitude) across the western United States. As a sensitivity test we also tried monthly data from the Parameter–elevation Regressions on Independent Slopes Model (PRISM) Group at Oregon State University (http://prism.oregonstate.edu, accessed 7 March 2007; Daly et al. 1994, 2002) and the daily gridded data of Maurer et al. (2002). The D–A results were little changed (not shown), although it should be kept in mind that these are not independent data sources. All rely on co-op station data as input, and both the daily datasets use PRISM monthly average precipitation maps in their topographic correction of precipitation (HL05; Maurer et al. 2002). The co-op precipitation gauges have smaller mouths than the SNOTEL gauges (20 cm versus 30.5 cm) and tend to be unshielded (Doesken and Schaefer 1987), which both increase undercatch. SNOTEL gauges, by contrast, have Alter wind shields that reduce the undercatch (Groisman and Easterling 1994). Our choice to use HL05 therefore requires a correction for undercatch, as described in the next section.
c. Treatment of gauge undercatch
Precipitation gauges tend to undercatch both rain and snow, with the undercatch for snow being larger (Groisman and Easterling 1994), primarily due to wind effects over the gauge opening. This means more SWE can be reported as present on the snow course than was measured by precipitation gauges (Serreze et al. 1999). Additionally, when there is an upward trend in the fraction of precipitation that falls as rain instead of snow (as found by Knowles et al. 2006), the increasing gauge catch efficiency will report a decreasing SWE/P, even if the actual SWE/P remains constant. It is important to account for these effects.
Details of our treatment of gauge undercatch are given in appendix A and outlined briefly here. We analyze SWE/P divided by its time mean (which we term “fractional SWE/P”) since this quantity is less sensitive to gauge undercatch than SWE/P itself. We estimate the trend in undercatch based on the changing rain/snow ratio using data from Knowles et al. 2006. This trend is small compared to the mean undercatch, so we develop a correction term that is first order in (trend/mean) and apply it to our fractional SWE/P estimate. Stations where the mean SWE is greater than estimated P, even when undercatch is taken into account or that do not fall within one of our mountain regions, are discarded. The remaining 548 stations (83% of our original pool of 661) used in the detection and attribution analysis are shown in Fig. 1.
d. Global climate model control runs
We use two different multicentury coupled climate model control runs for our estimates of natural internal climate variability: a finite volume version of the Community Climate System Model, version 3 (CCSM3) (Bala et al. 2008), which we will refer to as CCSM3-FV, and the Parallel Climate Model (PCM) (Washington et al. 2000).
CCSM3-FV was run for over 1000 years with preindustrial atmospheric constituents and forcing conditions. The first 239 yr were treated as a spinup; we analyzed the 850 yr starting at model year 240. The atmospheric resolution is 1.25° longitude by 1° latitude, somewhat finer than the equivalent spatial resolution of CCSM3 run with T85 spectral truncation, and 26 vertical levels. The model configuration included numerical parameterizations specific to the finite volume version of CCSM3 that correct problems with representing the sea ice and ocean around Greenland. The ocean model has a nominal 1° resolution and 40 vertical levels. Since we are performing a regional D–A study, our choice of this model configuration was motivated by the desire to have a relatively finescale representation of natural climate variability.
We used 750 yr from the same PCM control run (B06.62) analyzed in Barnett et al. (2005) and Pierce et al. (2006). The model was run at T42 spectral truncation for the atmospheric component and used a stretched and rotated grid with an average of about ⅔° spatial resolution in the ocean. We used this model because our previous analyses have shown PCM captures many important features of observed climate over the North Pacific and western United States—our area of interest here.
The natural internal climate variability of both control runs is compared to observations in section 3a.
e. Global climate model anthropogenically forced runs
To increase the robustness of our study, we used more than one set of anthropogenically forced climate simulations to compare to the observations. One of the downscaling methods used (described in the next section) requires daily maximum and minimum temperature (Tmax and Tmin) and precipitation. Our choice of model runs to use for the anthropogenic fingerprint estimation was therefore limited to cases where 1) daily data were available for the period 1950–99; 2) multiple ensemble members were available; and 3) the model had a good representation of western U.S. climate, including a realistic amplitude of natural variability. The requirement for 50 yr of daily data was one of the biggest limitations; for instance, while the World Climate Research Programme (WCRP) Coupled Model Intercomparison Project phase 3 (CMIP3) climate model archive has some daily Tmax, Tmin, and precipitation data for the twentieth-century runs, it is generally only available for 40 yr, and often for only one ensemble member.
The first set of anthropogenic model runs we used are from PCM: cases B06.22, B06.23, B06.27, and B06.28 (Washington et al. 2000; available online at http://www.earthsystemgrid.org). In addition to fulfilling the requirements for daily data and multiple realizations, PCM has a reasonable simulation of ENSO, the PDO, and the mean state and variability over the western United States. The PCM runs were forced with anthropogenic GHGs, ozone, and the direct effects of sulfate aerosols.
The second set of model runs we used are from the Model for Interdisciplinary Research on Climate (MIROC) model run at T42 atmospheric resolution (Hasumi and Emori 2004; Nozawa et al. 2007). We selected this model because an extensive set of runs with daily data and various forcing combinations was available; this will support future, more detailed work in examining how the various components of anthropogenic forcing affect climate over the western United States. Additionally, MIROC has a good representation of the PDO [defined as the leading EOF of SST anomalies north of 20°N in the North Pacific, Mantua et al. (1997)], with a peak SST anomaly of 0.5°C (versus 0.6°C observed) and spatial correlation with the observed PDO SST pattern of 0.75 (not shown). These were not the highest values of the models present in the CMIP3 database—both PCM and CCSM3 show higher spatial correlations of 0.91 and 0.88, respectively—but are nearly so, and having many runs with daily data available is advantageous for reducing the noise in the D–A work. The representation of ENSO is less satisfactory, however, with a signal that extends too far west and an amplitude about half that observed. The MIROC runs were forced with GHGs, ozone, the direct effects of sulfate aerosols, some indirect effects of sulfate and carbonaceous aerosols, and land-use changes. More details on the various model runs can be found in Bonfils et al. (2008).
f. Downscaling methodology
We explored two methods for downscaling the control and anthropogenically forced global model results to our region of interest: the bias correction/spatial disaggregation (BCSD) technique (Wood et al. 2002, 2004) and constructed analogs (CA) technique (H. G. Hidalgo et al. 2008). Both are statistical methods that use the global model’s fields of precipitation and temperature to construct a physically consistent representation of weather with finescale detail (⅛°). A comparison of the two methods is given in Maurer and Hidalgo (2007), who show that the CA method tends to produce weaker trends than the BCSD approach. Our results are consistent with this finding. Details of the methods are provided in the references given above, so only a brief summary is given here. The two methods make the trade-off between data volume and signal time step differently: the BCSD method requires only monthly model data to produce a daily time step but does not preserve the daily sequence of weather simulated by the model, while the CA technique requires daily model data but in turn preserves the daily sequence of weather simulated by the model. One consequence is BCSD will always give the same submonthly temporal relationships between the variates as found in the historical period, while CA can evolve the daily relationships between Tmin, Tmax, and P if the GCM responds in that way.
The BCSD method first bias corrects GCM monthly average temperature (T) and precipitation (P) using a quantile-based mapping from model to observed climatology. The bias-corrected T and P anomalies are then interpolated onto the fine grid, using additive anomalies for T and multiplicative ones for P. A random month from the historical observations is chosen (subject to the constraint that it be the same month of year as the month being downscaled), and the observed, fine-resolution daily T min, T max and P fields are adjusted (temperatures are shifted and P is scaled) by the bias-corrected, interpolated PCM anomaly. This method was used for the PCM control run.
The CA method first bias corrects the daily GCM data to have the same mean and variance as observed. It then makes use of a “library” of observed daily T and P fields available on both the fine (⅛°) grid and the global model grid. For each day’s GCM fields, the best 30 regional matches to the large-scale T and P fields are found. The final downscaled T and P fields are a weighted sum of the finescale T and P patterns corresponding to the closest 30 matches on the coarse scale. This method was used for the CCSM3-FV control run and MIROC anthropogenic runs.
g. Hydrological model
The downscaled T min, T max, and P fields from the control and anthropogenically forced GCMs were used to drive the VIC hydrological model (version 4.0.5; Liang et al. 1994; Cherkauer and Lettenmaier 2003) run at ⅛° resolution over the western United States. The VIC model calculates soil moisture, runoff, and snow cover given the meteorological forcing and parameterizations describing the physical characteristics of the region (soil depth, vegetation types, elevation, etc.). VIC was configured with five snow elevation bands to better represent snow processes in grid cells with changing topography. Our final analysis uses fractional SWE/P calculated by VIC forced by the statistically downscaled GCM results. The VIC model has been validated against observations and used in many climate change and streamflow forecasting studies in our region of interest (cf. Mote et al. 2005 and references therein).
a. Natural internal variability in the control runs
It is important to compare the natural internal climate variability in the control model runs to observations since control data with variability significantly weaker than observed could bias the D–A results.
Two important modes of natural variability that affect climate in the western United States are the El Niño–Southern Oscillation and the Pacific Decadal Oscillation (Mantua et al. 1997). Figure 2 shows ENSO and PDO variability in the model control runs, defined as the leading empirical orthogonal functiuons of winter [December–February (DJF)] sea surface temperature anomalies over the regions plotted. The same quantity is plotted from observations over the period 1946–2006 for comparison.
The PDO is reasonably well simulated by the control models, with a realistic pattern but somewhat higher variability than is observed. CCSM3-FV tends to have more loading in the Kuroshio Extension region than observed, while PCM mirrors the observations in showing largest expression in the central North Pacific. ENSO variability is again fairly well represented in the control runs, with somewhat stronger than observed amplitudes in CCSM3-FV and weaker than observed in PCM. The meridional scale of the variability is narrower in CCSM3-FV than observed (Bala et al. 2008). In both models the variability extends too far west, as is common in GCMs. We note in particular that there is no systematic underestimation of the strength of the PDO or ENSO in these models.
Although a reasonable simulation of the relevant modes of natural variability is important, our detection variable is fractional SWE/P, which depends on a wide variety of climate influences. Figure 3 shows the standard deviation of SWE/P from the (detrended) observations and CCSM3-FV and PCM control runs. Observed SWE/P values are detrended, so any linear change in SWE/P arising from anthropogenic forcing will not inflate their standard deviations. The left column is the standard deviation of the annual values and the right column shows the standard deviation of the 5-yr-averaged data. On annual time scales, both control runs tend to modestly overestimate the standard deviation in the Sierra Nevada and in Utah, while underestimating the variability in the extreme north of the Washington Cascades. On the 5-yr and longer time scales, both models continue to overestimate the variability in the Sierra Nevada and in Utah, while CCSM3-FV tends to underestimate variability in the Washington and Oregon Cascades. PCM does a better job of capturing the variability in the Oregon Cascades, but has slightly too much variability in western Montana. Overall, despite these small-scale differences, we find no evidence that either PCM or CCSM3-FV is systematically underestimating natural internal climate variability for our region and variable of interest.
b. Constructing the fingerprint
We use a fingerprint-based detection and attribution methodology (Santer et al. 1995; Hegerl et al. 1996, 1997; Tett et al. 1999; Allen and Tett 1999; Barnett et al. 2001). The intent is to isolate the signature of anthropogenically forced changes in SWE/P from the historical model runs, determine how similar the observed changes in SWE/P are to the fingerprint, and calculate the likelihood a signal of the observed strength could have occurred by chance in the control run. In this section, the method of constructing the fingerprint will be described and applied to nine mountainous regions across the western United States. In section 3g, the same methods are used to develop and analyze a fingerprint as a function of elevation.
To compare observations to the model results, we assign each snow course to the location of the nearest VIC model grid cell (on the ⅛° grid). When multiple snow courses were assigned to the same VIC grid cell, the fractional SWE/P time series from those snow courses were averaged. We then regionally average the observed and model SWE/P values over the nine regions shown in Fig. 4; this reduces small-scale noise and some spatial redundancy from areas where many snow courses are near each other. When regionally averaging the model SWE and P, we subsampled the model data at the same locations as the gridded snow courses to avoid introducing discrepancies from different model and observed data coverage.
The regionally averaged fractional SWE/P time series from observations are shown in Fig. 5. SWE/P has increased in the southern Sierra Nevada (although the trend is not significantly different from zero at the 5% level using the test described in Wigley et al. 2006); it decreases in all the other regions (although not significantly in the northern Sierra or Great Basin). SWE changes as a function of elevation (Mote et al. 2005, their Fig. 2; see also section 3g) indicate warmer, lower elevations tend to have experienced the most relative SWE loss, while precipitation increases can drive SWE increases at the colder, higher elevations.
The model-based fingerprint is formed from the nine regionally averaged time series of the anthropogenic model runs. To reduce natural variability unrelated to the anthropogenic forcing, we first ensemble averaged each regional time series across all the ensemble members available (4 PCM, 10 MIROC). In the results shown here each ensemble member was weighted equally: however, we found that weighting the PCM realizations so that they would contribute equally to the more numerous MIROC realizations made little difference. The resulting regional time series are shown in Fig. 6. SWE/P decreases in all areas, although the magnitude varies by region; losses range from 5% to 20% over the period examined here (1950–99). The 95% confidence interval on the estimated trend (adjusted for temporal autocorrelation) excludes zero for all regions except in the Rockies. As noted in previous works (e.g., Mote et al. 2005; Mote 2006), colder, higher elevation regions tend to have less temperature sensitivity than places where winter and early spring temperatures are near freezing; precipitation tends to be smaller in such locations, and sublimation more comparable with melt in the energy budget.
The model fingerprint (Fig. 7) is the leading EOF of the nine ensemble-averaged time series. It accounts for 72% of the overall variance. The EOF was calculated with each regional time series weighted by the snow-covered (>1 cm climatological 1 April SWE) area it represents. As a sensitivity test we tried setting all the area weights to 1 and found it made little difference (not shown). The fingerprint is a monopole over the western United States, although different locations contribute differently to the final result. In general, the weighting is higher in the central Cascade region than in the surrounding areas. This pattern is set by the combination of the relative areas, the covariation of the location with other places, and the characteristic winter temperature.
c. Signal strength
Given the model fingerprint F(x), the signal strength S is defined as
where D(x, t) are the nine regional time series (from a model run, observations, or model ensemble mean when the ensemble mean S is being calculated), and “trend” indicates the slope of the least squares best-fit line. The normalizing value S0 is the trend of the fingerprint’s associated principal component, chosen so that all signal strengths are relative to the fingerprint. (Recall that Fig. 5 shows the actual SWE/P losses in the various regions, which average about 20% over the study period.)
The values of S for the various model runs and observations are shown in Fig. 8. The signal strengths reported from the CCSM3-FV and PCM control runs are calculated in 50-yr segments to match the observed record, but are overlapped (starting progressively 5 yr later than the previous segment) to better estimate the effects of sampling variability with respect to the initial start date of the control run. With 850 years of CCSM3-FV control run and 750 years of PCM, we have a total of 32 independent 50-yr segments of control run data to compare to the forced runs and observations; this was the value used as the degrees of freedom in all following statistical tests. (We use overlapped 50-yr segments to reduce the chance that a particular initial starting date will unduly affect our results, but independent 50-yr segments to calculate the degrees of freedom to avoid artificially inflating the statistical confidence of our estimates.) The shaded region is the one-tailed (since we a priori expect warming produces a reduction in SWE/P) 95% confidence interval of the signal strength in the PCM and CCSM3-FV combined control model results. The 95% confidence interval on the observed trend is the standard error of the least squares linear trend, appropriately adjusted for serial correlation (Santer et al. 2000a; Wigley et al. 2006).
Figure 8 shows that the signal strength in the anthropogenic model runs is larger than in the control runs, but not to such a degree that the distributions are completely separated. There is a range of S, roughly from 0.05 to 1.5, where a single measurement is consistent with both the response to anthropogenic forcing and natural internal climate variability (although one of the anthropogenic runs, MIROC 10, falls outside this range). There is also a range of larger values, roughly 1.5–3, where a single measurement is consistent with the anthropogenic runs but outside the range expected if only natural internal variability were acting. The observed signal strength (S = 2.08 ± 1.11) falls in this latter region, showing significantly more reduction in SWE/P than is likely to be found from natural variability alone (p < 0.05). The signal strengths in the solar/volcanic runs are opposite sign to that found in the observations, so solar and volcanic variability do not provide an explanation for the observed changes in SWE/P.
How likely is it that the anthropogenic model signal strengths might be drawn from the same underlying distribution as the control run? This can be estimated from a Kolmogorov–Smirnov test (Press et al. 1989). Using all available information, with the combined control runs compared to the combined anthropogenic runs, shows a highly significant difference between the anthropogenic runs and control distribution (p < 0.01 using a one-tailed test). We conclude the signal strengths are inconsistent with the null hypothesis of natural internal climate variability.
The observed signal strength falls outside the 95% confidence interval of the control runs. However, our estimate of the observed signal strength is uncertain. What is the likelihood that the signal actually falls within the control model distribution? To address this question we use the pooled standard errors of S from the control runs and observations (Santer et al. 2000b; Lanzante 2005) and find the observed value is unlikely to have been drawn from the control distribution (p < 0.05 using a one-tailed test) even when uncertainty in the observed signal strength is taken into account.
For attribution, we consider whether the observed signal S is consistent with the anthropogenically forced model runs—that is, whether it is likely the observed signal strength could be drawn from the population of anthropogenic signal strengths. The signal strength from all the PCM and MIROC anthropogenic runs have an ensemble mean value of 1.0 ± 0.67 (with the 95% confidence interval in the ensemble-mean value calculated from the standard error of the ensemble-mean trend, as noted above), and the standard deviation of the values that make up this distribution is 0.91. Using pooled standard errors of the observations and all the anthropogenic runs we have available (combined PCM and MIROC results), we find no significant difference between the model-estimated signal strength and that observed.
The ratio of the observed signal strength to the ensemble mean model value is about 2, which indices the best model estimate is that approximately half of the observed changes in SWE/P arise from the anthropogenic effects included in our global climate models (GHGs, ozone, and some aerosols). The rest could be due to 1) anthropogenic effects not included here, such as “graying” of snow due to dust or soot deposition (Hansen and Nazarenko 2004; Painter et al. 2007; Flanner et al. 2007) or missing aerosol physics; 2) the models’ underestimation of the effect of the included anthropogenic forcings (cf. Rahmstorf et al. 2007); or 3) natural internal climate variability. Selecting between these possibilities is beyond the scope of this work.
In summary, the observed changes in SWE/P are consistent with those expected from anthropogenic forcing. Taken together with the finding that the observations are not consistent with either natural internal climate variability or solar/volcanic forcing, we conclude that changes in the climate from human-emitted greenhouse gases, ozone, and aerosols are causing a reduction in SWE/P in the western United States. The mean model estimate is that approximately half the decrease is due to anthropogenic effects.
A point worth reiterating is that the distributions of S in the anthropogenic and control model runs overlap significantly. This is not unexpected since we are examining a smaller domain than typically used in D–A studies. Weather noise increases when averaging over smaller areas, making it challenging to identify an anthropogenic fingerprint. One implication is that we require large ensemble sizes to reliably estimate the regional-scale response to anthropogenic forcing. The signal strength in an individual PCM or MIROC realization falling within the distribution of internal natural variability is not “evidence of absence” of an anthropogenic effect on climate. As shown here, however, observed SWE/P changes in the western United States fall outside the range of natural variability. It might be asked, what fraction of the anthropogenic distribution of S falls outside the distribution of natural variability? The 95th percentile on the combined control run distribution is S = 1.43. This signal strength is in the 63rd percentile of the anthropogenic runs, so approximately 37% of the anthropogenic distribution of S is inconsistent with the explanation of natural internal variability.
e. Detectability over time
We determine the earliest year anthropogenic changes in SWE/P are detectable following Santer et al. (1995, 2007). Starting in 1950, we calculate observed SWE/P signal strength for increasingly longer intervals and do likewise for the control run. As the interval lengthens, a consistent signal tends to stabilize around some asymptotic value, while the amplitude of the control run noise decreases (Santer et al. 2007). The signal-to-noise ratio therefore changes as a function of the averaging length, as shown in Fig. 9. With two control model runs available, we can estimate the noise either from CCSM3-FV (triangles) or from PCM (dots). Either way, the signal is detectable at the 95% significance level by the early 1990s: a record length of about 40 years. A similar analysis (not shown) indicates that the mid-1950s is the latest that the analysis can be started to achieve detection by 1999 (a record length of about 45 years). Either way, the data suggest four to five decades of observations are needed for detection of an anthropogenic SWE/P signal. Performing the same analysis on the anthropogenically forced model runs gives a mean model estimate that about 41 years of data are required, consistent with the observed values.
The observations (Fig. 5) show that in several regions, such as the Washington Cascades, SWE/P increased from 1975 to 1999. Is this behavior consistent with the model-estimated characteristics of anthropogenic SWE/P changes? To examine this, we used the 14 anthropogenically forced model runs to compare SWE/P trends in the Washington Cascades over the period 1950–99 to the period 1975–99. Only one of the 14 anthropogenic runs has a positive SWE/P trend over the period 1950–99. By contrast, half of the anthropogenic runs have a positive trend over the period 1975–99, so the models indicate that this is not an unusual happenstance. Together with the finding noted above that at least four decades of observations are necessary for detection of an anthropogenic signal, this suggests that a 25-yr period (1975–99) is too short to make any robust conclusions about the existence (or lack thereof) of an anthropogenic signal in SWE/P.
f. The role of changes in precipitation
As noted in the introduction, previous work has examined the role of precipitation changes in causing the reduction in snowpack and concluded that the widespread declines across the western United States (particularly at lower elevations) are primarily driven by an increase in temperature, not by changes in precipitation (Mote et al. 2005; Hamlet et al. 2005; Mote 2006). We also find no evidence for a systematic reduction in precipitation at our snow course locations: in fact, precipitation increases at 60% of the snow courses used here, while 71% show a reduction in SWE (Fig. 10). Stewart et al. (2005) also note the tendency for changes in P and streamflow center of timing to be at odds.
We have used SWE/P as the detection variable, rather than simply SWE, on the grounds that SWE/P would be less sensitive to precipitation variations that add noise to the temperature-driven signal. We can examine the extent to which this is true by comparing the relationship of P and SWE to the relationship of P and SWE/P, as shown in Fig. 11. The SWE values are from the snow courses, and the P values from the HL05 analysis. The plotted values are divided by their mean to reduce the large regional differences in P. The left-hand panel shows there is a strong relationship between SWE and P, which is not surprising; more precipitation gives proportionally more snow. The correlation is 0.64, and the slope of the best-fit linear trend is close to 1. In the right-hand panel it can be seen that dividing SWE/P removes essentially all this relationship; the correlation now is only 0.03 (not statistically significant if the spatial autocorrelation of the snow courses is taken into account), and the slope is near zero.
As a final check of the possible influence of precipitation changes on our results, we repeated the D–A analysis using P at the snow course locations instead of SWE/P. Neither the observations nor the signal in the anthropogenically forced models fell outside the 90% confidence interval of the control runs. We conclude that the snowpack reductions we are seeing are principally driven by increases in temperature over the western United States.
g. The role of elevation
Previous work (e.g., Mote et al. 2005; Mote 2006) has shown how observed reductions in SWE diminish with elevation. This raises the question of what a D–A analysis with stations partitioned by elevation, rather than geographically, would show. To explore this, we binned the snow courses into nine bands (shown in Fig. 12) that span the stations’ elevation range while having a nearly constant increment. Fractional SWE/P shows a pattern of strong negative trends at the lower elevations, reducing to near-zero trends at the higher elevations. Only elevations below ∼2000 m have statistically significant trends, given the noise.
The corresponding SWE/P trends for the ensemble-averaged anthropogenic model runs are shown in Fig. 13. The model also shows a decrease in trends with elevation, although not to as pronounced a degree as observed. Decreasing trends are statistically significant up to ∼2800 m in the model. This is higher than for the observations, consistent with the fact that the ensemble-averaged values have less noise by construction.
The D–A fingerprint is the leading EOF of the time series in Fig. 13. Various choices could be made for areal weighting in its construction. The bands could be weighted equally, which would address the question of how the decrease in trends with elevation affects the D–A process, and assess whether the observed decrease in trends with elevation matches the model prediction. Or the bands could be weighted by the amount of area they represent in the mountainous regions of the western United States, which would be a more traditional D–A approach since agreement or disagreement in a small fraction of the area (i.e., at high elevations) is less important than agreement or disagreement over a broad area. We tried both approaches and found that it had little effect (Fig. 14): the difference in fingerprint strength in the lowest versus highest elevation class is slightly larger when weighting according to the area represented since lower elevations (where the SWE/P trends are large) receive more areal weighting than higher elevations (where the trends are small). With equal weighting, the fingerprint accounts for 78% of the variance (versus 72% for the geographic fingerprint, Fig. 7). The values drop by 85% from the lowest to highest elevation, consistent with the decrease in the effect of warming on snowpack at the colder, higher elevations.
The D–A results using equal weighting for the elevation bands are shown in Fig. 15 (the results for area-weighted bands are little different, not shown). The observed signal strength (2.56 ± 1.54) is higher than when the geographical regions are used (2.08 ± 1.11), leading to detection that is significant at the 99% level rather than the 95% level. The attribution results are unchanged from the geographical case.
The snowpack in the western United States serves as a natural reservoir of freshwater from winter storms, gradually melting and releasing that water in late spring and early summer. Changes in the amount of precipitation retained in snowpack can have an important effect on human and natural systems that anticipate this kind of water storage, so it is important to know whether part of the observed changes in snowpack over the western United States can be attributed to anthropogenic inputs of greenhouse gases (GHGs), ozone, and aerosols into the global atmosphere.
In this work we have performed a formal detection and attribution (D&A) analysis of changes in western U.S. snowpack over the period 1950–99. Our detection variable was defined as the ratio of 1 April snow water equivalent (SWE) to precipitation (P) over the period October–March, normalized by its time mean. The resulting fractional SWE/P ratio is relatively insensitive to precipitation-driven snowpack changes, and was corrected for changing undercatch driven by the evolving snow/rain mix.
We employed a total of 1600 years of statistically downscaled (to a ⅛° grid) control run data from two climate models (CCSM3-FV and PCM) to construct our estimate of natural variability in SWE/P. The model ENSO, PDO, and SWE/P variability show realistic amplitudes that provide a good test of the hypothesis that natural internal climate variability can account for all the observed reduction in snowpack.
We find that SWE/P declines in the observations and anthropogenically forced model runs are significantly greater than expected if only natural internal variability were acting on the system (p < 0.05). Solar and volcanic effects likewise cannot explain the observed changes. We conclude that there is a detectable change in snowpack over the western United States, which cannot be fully explained by natural internal climate variability or the effects of solar and volcanic forcing. This finding did not depend on whether the stations were grouped geographically (by mountain range across the west) or in equally weighted or area-weighted elevation bands.
We have used 700 years of downscaled, anthropogenically forced model runs from two climate models (PCM and MIROC) to construct the distribution of SWE/P changes expected from human effects on climate and find that observed reductions in SWE/P are consistent with the anthropogenically forced model results. The mean model estimate is that approximately half of the observed changes in snowpack over the western United States during the period 1950–99 arise from climate responses to anthropogenic GHGs, ozone, and aerosols. The remainder may arise from natural internal climate variability or neglected or improperly modeled anthropogenic effects.
It should be noted that the anthropogenic forcings examined here are not the only ones relevant to SWE/P. For example, modification of snow albedo through soot or dust deposition has been suggested as a contributing factor to earlier snowmelt in the Arctic and parts of the west (Hansen and Nazarenko 2004; Painter et al. 2007; Flanner et al. 2007) but is not included here. The global models we used also lack a comprehensive treatment of aerosol physics. It is possible that inclusion of such forcings would improve the agreement between the model-estimated and observed reductions in SWE/P. A finer breakdown of the effects of various anthropogenic forcing mechanisms on SWE/P awaits further study.
Looking ahead, what do these results portend for the western United States? Since greenhouse-gas-induced warming is already contributing significantly to the decline in snowpack and is predicted to continue over the twenty-first century, we can anticipate that the snowpack loss is likely to continue and even accelerate over the next half century. Further out, the behavior of snowpack will be influenced by what greenhouse gas reduction strategies, if any, are put into effect. Areas that have insufficient reservoir capacity to capture the earlier spring melt while still having enough margin to prevent floods from late-winter storms will end up losing water that would otherwise be retained in the natural snowpack reservoir. As water is a precious and limited resource in the western United States, this suggests that we will be faced with difficult and expensive political, social, and environmental choices for how to deal with this problem.
The authors thank Phil Mote and two anonymous reviewers for their helpful comments, and Noah Knowles for providing data on the changing fraction of rain versus snow. Support was provided by the Lawrence Livermore National Laboratory through an LDRD grant to the Scripps Institution of Oceanography (SIO) via the San Diego Super Computer Center (SDSC) for the LUCSiD project. The MIROC simulations were supported by the Research Revolution 2002 of the Ministry of Education, Culture, Sports, Science and Technology of Japan. The PCM simulation had previously been made available to SIO by the National Center for Atmospheric Research for the ACPI project. The California Energy Commission provided partial salary support for DP and HH at SIO. The Department of Energy and NOAA supported TPB as part of the International Detection and Attribution Group (IDAG). The LLNL participants were supported by DOE-W-7405-ENG-48 funds to the Program of Climate Model Diagnosis and Intercomparison (PCMDI). The USGS and SIO provided partial salary support DC and MD at SIO.
Treatment of Undercatch
Assume we know the true SWE from the snow course data but only have an estimate of precipitation P′ = gP, where P is the true precipitation and g (<1) is the gauge catch efficiency, which can vary in space. The quantity 1–g is called the “undercatch.” SWE/P normalized by its time mean (a quantity we call “fractional SWE/P”), calculated with our estimated value of precipitation and for the moment assuming g is constant in time, is
where angle brackets indicate the time mean. So, ideally, the change in fractional SWE/P is insensitive to undercatch if g is constant.
The net catch efficiency can be thought of as arising from parts due to snow and rain:
where the subscripts S and R indicate snow and rain and f are the fractions of total precipitation in each phase ( fR + fS = 1). Knowles et al. (2006; K06 hereafter) showed that the fraction of winter precipitation falling as snow is decreasing. Since precipitation gauge catch efficiency is higher for rain than snow, this gives a systematic upward trend in g, violating the assumption of constant g used in Eq. (A1).
The effect the changing rain/snow mix has on our results is greater when the difference in gauge efficiency catching rain versus snow is larger, so to be conservative we use gS = 0.5 and gR = 0.9, a difference in gauge efficiency at the upper end of estimates (e.g., Groisman and Easterling 1994). These estimates of gS and gR along with the linear trend estimates of fS and fR from K06 allow us to express g as
where t goes from −1 to +1 (covering our period 1950–99).
Table A1 shows Δg, g, and their ratio estimated from the K06 data, averaged over our regions. The ratio tends to be small, Δg/g ≡ ɛ ≈ 0.02 ≪ 1, which suggests calculating a correction term to Eq. (A1) that is first order in ɛ. In fact, 95% of the stations in K06 have ɛ < 0.04, so the higher-order terms are quite small. As a sensitivity test we also calculated g using the procedure outlined in Serreze et al. (1999), which involves directly estimating the mean undercatch from large snowfall events in January, and got similar values of g.
where α ≡ SWE/gP. Making repeated use of the expansion 1/(1 + δ) = 1 − δ + O(δ2), and dropping terms O(ɛ2) or higher, we obtain after some algebra
which is our desired correction to Eq. (A1).
The middle term on the rhs of Eq. (A5), −ɛt, becomes more negative over time when precipitation gauge efficiency increases over time (as expected with less precipitation coming as snow). It indicates that observed SWE/P will show a negative trend even if the true SWE/P ratio is constant, simply owing to the increasing amount of precipitation being caught.
The last term on the rhs of Eq. (A5) can be understood by noting that it is constant, and zero if either the catch efficiency or the true SWE/P is constant. It therefore arises from an interaction between the trends in catch efficiency and true SWE/P and tends, in the observations, to exaggerate (if >0) or diminish (if <0) the actual trend in SWE/P. In practice, however, this term is negligible as it depends on the interaction of trends that are modest [SWE/P, with a trend O(0.2)] and small [g, with a trend O(ɛ)] to begin with.
The gauge catch efficiency at each snow course location was estimated from a weighted average of the five nearest stations in K06, with the weights inversely proportional to the distance. Stations where 〈SWE/P〉 was greater than g−1S before the correction for undercatch was applied were eliminated (81 stations, 12% of the total). Such locations might suffer from a poor estimate of precipitation due to locally rough topography or unusually large snow accumulation due to drifting or small-scale meteorological effects. The station fractional SWE/P estimates were then computed, and corrected using Eq. (A5). Corrected fractional SWE/P values were area averaged over the nine mountain regions; 15 stations (2% of the total) did not fall within one of our mountain regions and were not included in the analysis over mountain regions (but were included in the analysis over elevation).
The correction for changing undercatch due to the evolving snow/rain mix is modest compared to the changes in SWE/P. The correction reduces the SWE/P trend about 5% on average, with the largest reduction in the northern Sierra (17%). This leads to an overall signal strength S (section 3c) for the observations that is 6% weaker when the changing snow/rain mix is included.
Corresponding author address: David W. Pierce, Division of Climate, Atmospheric Sciences, and Physical Oceanography, Scripps Institution of Oceanography, Mail Stop 0224, La Jolla, CA 92093-0224. Email: firstname.lastname@example.org