There is a large uncertainty of snow water equivalent (SWE) in reanalyses and the Global Land Data Assimilation System (GLDAS), but the primary reason for this uncertainty remains unclear. Here several reanalysis products and GLDAS with different land models are evaluated and the primary reason for their deficiencies are identified using two high-resolution SWE datasets, including the Snow Data Assimilation System product and a new dataset for SWE and snowfall for the conterminous United States (CONUS) that is based on PRISM precipitation and temperature data and constrained with thousands of point snow observations of snowfall and snow thickness. The reanalyses and GLDAS products substantially underestimate SWE in the CONUS compared to the high-resolution SWE data. This occurs irrespective of biases in atmospheric forcing information or differences in model resolution. Furthermore, reanalysis and GLDAS products that predict more snow ablation at near-freezing temperatures have larger underestimates of SWE. Since many of the products do not assimilate information about SWE and snow thickness, this indicates a problem with the implementation of land models and pinpoints the need to improve the treatment of snow ablation in these systems, especially at near-freezing temperatures.
Accurate representation of the amount of seasonal snow on the land surface is important because seasonal snow cover has a large influence on land–atmosphere interactions (Robinson and Kukla 1985). For example, because of its high albedo, snow influences how much solar energy is absorbed by the land surface, and further, it affects the turbulent exchange of water and energy between the land surface and the atmosphere, as well as the transfer of heat to the subsurface (due to its low thermal conductivity). Hydrologic simulation also depends critically on correct representation of snow because the hydrology of cold regions is heavily influenced by annual snowmelt (Bowling et al. 2003; Bales et al. 2006). Incorrect representation of snow water equivalent (SWE) and subsequent snowmelt may also have lasting impacts after the snow has melted. For example, the timing and magnitude of snowmelt influences streamflow, soil moisture, and vegetation health well into the summer. In addition, there can be a connection between snow and summertime rainfall (Hahn and Shukla 1976; Gutzler and Preston 1997).
There remains substantial uncertainty about the representation of snow on the ground in many reanalysis and Global Land Data Assimilation System (GLDAS; Rodell et al. 2004) products, as evidenced by the wide spread of SWE and snow depth simulations in such systems (e.g., Mudryk et al. 2015). Incorrect representation of SWE might lead to incorrect representation of climate variables, even subsequent to snowmelt. For example, an incorrect amount of snowmelt may influence simulation of runoff and soil moisture well after the snow has melted. In some cases, these variables are used for applications outside of the specific reanalysis or GLDAS model. For example, soil moisture from GLDAS has been used to initialize global models for subseasonal forecasts (Koster et al. 2004; De Goncalves et al. 2006) and to supplement observational soil moisture datasets for hydrologic land–atmosphere coupling studies (e.g., Zhang et al. 2008; Syed et al. 2008; Rodell et al. 2007). Reanalyses are routinely used to study and monitor variables that may be heavily influenced by the amount of snow on the ground (e.g., near-surface air temperature).
One of the challenges of snow evaluation in large-scale reanalyses and GLDAS products is determining the quantity of snow that is actually on the ground. Scale differences make it difficult to use in situ point data to determine how much snow there is over larger areas, as points may not be representative of larger grid boxes (Brasnett 1999; Brown et al. 2003). Point data can be upscaled to a gridded surface using a variety of methods (e.g., Molotch et al. 2005; Erxleben et al. 2002; López-Moreno and Nogués-Bravo 2006; Dixon et al. 2014; Dawson et al. 2016). Often, this involves a fusion of modeling with in situ observations (Carroll et al. 2001; Brown et al. 2003; Drusch et al. 2004) or satellite observations (Takala et al. 2011). Even though such combinations generate snow fields that are consistent with the available observations, there is still uncertainty about how well these products can be treated as “ground truth.”
An even more critical factor that complicates the evaluation of snow in large-scale products is the difficulty to disentangle various factors that might lead to simulation errors. For one thing, the forcing data can be uncertain, causing ambiguity about whether simulation errors are the result of forcing data errors, land model deficiencies, or both. There are specific instances where simulation errors can clearly be alleviated by improving either the land models themselves or the input data to those models. For example, Pan et al. (2003) found that differences between observed and model-generated SWE in a North American Land Data Assimilation System (NLDAS) model could be largely alleviated by adjusting precipitation data to match observations. Numerous other studies have also shown that there are differences of snow simulation with different models with the same forcing (e.g., Schlosser et al. 2000; Etchevers et al. 2002; Bowling et al. 2003; Essery et al. 2009; Rutter et al. 2009; Essery et al. 2013; Chen et al. 2014; Magnusson et al. 2015; Mudryk et al. 2015) and that new model formulations (e.g., for snow albedo) can lead to greater agreement between observed and modeled SWE (e.g., Barlage et al. 2010; Livneh et al. 2010; Wang et al. 2010). It is uncertain, however, whether deficiencies in snow simulation in reanalyses and GLDAS products are more commonly related to the land model components of these systems, to data assimilation, or to the input forcing data.
In this study, we seek to disentangle these various factors by evaluating the climatic dependence of snow in modeling systems compared to that from observations. Section 2 discusses the reanalysis and GLDAS products to be evaluated and the two high-resolution SWE validation datasets over the conterminous United States (CONUS) that are used for the validation. Section 3 quantifies the SWE deficiencies of the reanalysis and GLDAS products and identifies the primary reason for these deficiencies. Section 4 discusses the results presented in section 3.
2. Reanalysis and GLDAS products and validation data
a. Reanalysis and GLDAS products
The reanalysis and GLDAS data are from four GLDAS, version 1 (GLDAS1; Rodell et al. 2004), land models (VIC, Noah, Mosaic, and CLM); one GLDAS, version 2 (GLDAS2), land model (Noah); and three recent reanalyses: the Modern-Era Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011) and its second version (MERRA2), the Climate Forecast System Reanalysis (CFSR; Saha et al. 2010), and the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-Interim, hereafter ERA-I; Dee et al. 2011). ERA-I and MERRA also include their offline versions, ERA-I/Land (Balsamo et al. 2015) and MERRA-Land (Reichle et al. 2011), respectively. In this paper, “GLDAS products” refers to the four GLDAS1 models and GLDAS2-Noah, and “reanalysis products” refers to CFSR, ERA-I, ERA-I/Land, MERRA, MERRA-Land, and MERRA2. The temporal and spatial resolutions vary among these products, as summarized in Table 1. Data for SWE, snowfall, total precipitation, and 2-m air temperature are obtained for each product.
The forcing data and snow models employed by these products vary considerably (also summarized in Table 1). These products (except for GLDAS1-Mosaic) employ mass and energy balance snow models that have prognostic formulations for albedo and snow density, though the complexity of these models differs among products. CFSR, GLDAS1-Noah, and GLDAS2-Noah use the Noah land surface model (LSM; Chen and Dudhia 2001), which has a single-layer snow scheme. CFSR uses the Global Data Assimilation System (GDAS) forcing with precipitation adjustment from the Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP; for low latitudes) and the CPC unified global daily gauge analysis (CPCU; for midlatitudes). GLDAS1-Noah is forced with GDAS forcing data with precipitation adjustment from CMAP, and GLDAS2-Noah is forced using the Princeton meteorological forcing dataset (Sheffield et al. 2006).
ERA-I uses the relatively simple single-layer snow scheme of Douville et al. (1995), and in ERA-I/Land, this scheme was revised to that in the Hydrology Tiled ECMWF Scheme of Surface Exchanges over Land (HTESSEL; Balsamo et al. 2009) model to include treatment for liquid water storage in the snowpack and revisions to snow density, albedo, and snow cover fraction. ECMWF forcing drives the land components in both cases, though for ERA-I/Land, precipitation data are corrected using the Global Precipitation Climatology Project (GPCP) dataset. MERRA, MERRA-Land, and MERRA2 use the Catchment LSM (Koster et al. 2000), which has a more complex three-layer snow scheme (Stieglitz et al. 2001). The Catchment model is unique in that it uses a model grid that is based on hydrologic boundaries. In addition to MERRA forcing, precipitation for MERRA-Land and the land model in MERRA2 is corrected with CPCU.
GLDAS1 also has three additional LSMs besides Noah: VIC, CLM, and Mosaic. VIC (Liang et al. 1994) and CLM (Zeng et al. 2002) represent the snowpack using multilayer snow models: VIC has a thin surface layer and a thick deeper layer (Andreadis et al. 2009), and CLM has up to five snow layers. To account for subgrid variability, VIC has subgrid tiles based on elevation bands and land-cover type, whereas CLM just has subgrid tiles based on land-cover type. Mosaic (Koster and Suarez 1992) has very simple treatment for snow, but does include subgrid tiles based on land-cover type. These models share the same forcing information as GLDAS1-Noah, but each GLDAS1 model determines the partitioning between rainfall and snowfall individually (so in this study, snowfall is obtained as model output for each GLDAS1 model).
The reanalysis and GLDAS products also differ in terms of the assimilation of snow data. CFSR uses snow thickness data from the U.S. Air Force Weather Agency (Saha et al. 2010), which combines point measurements with satellite microwave estimates of snow depth. ERA-I uses Cressman interpolation to incorporate synoptic (SYNOP) reports of in situ snow depth into their analyses (Drusch et al. 2004; De Rosnay et al. 2014). MERRA, MERRA-Land, and ERA-I/Land do not assimilate any observational snow information (Rienecker et al. 2011; Reichle et al. 2011; Balsamo et al. 2015). The GLDAS models only assimilate information about snow cover from the Moderate Resolution Imaging Spectroradiometer (MODIS; Rodell and Houser 2004; Zaitchik and Rodell 2009).
b. Validation data
The above reanalysis and GLDAS products are evaluated with high-quality observationally constrained gridded precipitation and temperature data for the CONUS from Parameter-Elevation Regressions on Independent Slopes Model (PRISM; Daly et al. 1994), SWE data from the Snow Data Assimilation System (SNODAS; Carroll et al. 2001; Barrett 2003), and SWE and snowfall data that are based on our recent study (Broxton et al. 2016), which are referred to here as the UA data. The PRISM data include daily estimates of air temperature and precipitation on a 2.5-arc-min grid. The PRISM dataset incorporates data from 13 000 stations for precipitation and nearly 10 000 stations for maximum and minimum temperature across the CONUS. The basis for PRISM is a local climate–elevation regression that varies spatially, but it also accounts for a variety of other local factors affecting climate, such as rain shadow affects, temperature inversions, and the effects of nearby water bodies on temperature and precipitation. PRISM is especially well suited for interpolating climate data in physiographically complex landscapes (Daly et al. 1994, 2008).
Our new SWE and snowfall dataset is generated by combining station data of SWE, snow depth, and precipitation with the gridded PRISM precipitation and temperature data. In this study, we choose to use validation products based on ground data (for precipitation, temperature, SWE, and snow depth) as opposed to satellite data because of the uncertainty of using satellite remote sensing to measure SWE in areas of complex terrain (such as the western United States, where the deepest snowpacks in the CONUS occur). In addition, across the United States, there is a relatively high density of stations that allows for relatively high-quality precipitation and temperature data to constrain our SWE estimates. Note that a recent study (Mudryk et al. 2015) also evaluated many of the products that we are testing, but compared them with a product derived from satellite and in situ snow data. In this study, we feel that constraining our SWE estimates using precipitation and temperature data gives a better estimate for the CONUS, where the station density is relatively high and where there are substantial areas of complex terrain.
Here, in situ data come from the National Resources Conservation Service (NRCS) Snowpack Telemetry (SNOTEL; Serreze et al. 1999) network and the National Weather Service’s (NWS) Cooperative Observer Program (COOP) network in the United States. The COOP data used in this study are from the National Climatic Data Center (NCDC) COOP Summary of the Day DSI-3200 dataset, which includes observations from ~8000 active stations, including NWS “first order” or principal climatological stations (operated by trained observers) and “second order” (operated by volunteers) stations (NCDC 2006). Figure 1 shows the locations of the SNOTEL and COOP stations used in this study.
The first step in the creation of our dataset is to determine the precipitation phase at the in situ stations, as this is the basis of whether to consider PRISM precipitation as rainfall or snowfall. Following Knowles et al. (2006) and Huntington et al. (2004), who perform snowfall/precipitation studies in the western and eastern United States, respectively, precipitation is considered to be snow at COOP stations on days for which newly fallen snow was recorded. Similarly, it is considered to be snow at SNOTEL stations if there is a positive change in snow depth. As pointed out by Huntington et al. (2004), this methodology may sometimes result in positive biases in snowfall, because on days when precipitation falls as both rain and snow, it is recorded entirely as snow. In general, this method has been shown by Knowles et al. (2006) and Huntington et al. (2004) to partition between rainfall and snowfall reasonably well in both the western and eastern CONUS.
Next, a temperature threshold is used to define the phase (rainfall vs snowfall) of the PRISM precipitation data. This threshold is determined individually for each station that records precipitation. For stations where precipitation occurs as rainfall (based on the above methodology) on days when the daily average air temperature is below freezing, the temperature threshold is set as the station temperature minus 1°C. For stations where precipitation occurs as snowfall (based on the above methodology) on days where the daily average air temperature is above freezing, the temperature threshold is set as the station temperature plus 1°C. Otherwise, for stations that record snowfall at subfreezing temperatures and rainfall at temperatures warmer than 0°C (which occur a majority of the time), the threshold is set to 0°C. These threshold temperatures are then interpolated from each station location to the PRISM 2.5-arc-min grid using ordinary kriging. PRISM precipitation that occurs when the air temperature below this interpolated threshold is considered snow; otherwise, it is considered rain.
The generation of SWE estimates follows the methodology of Broxton et al. (2016). Namely, SWE at each station is divided by the difference between accumulated snowfall (from the time that no snow was on the ground to the time of the SWE estimate) measured at the station and an estimate of accumulated ablation (for the same period), which is given by a simple temperature index snow model. We then interpolate these ratios to the PRISM 2.5-arc-min grid and multiply them by the difference between accumulated snowfall within each PRISM pixel and our estimate of accumulated ablation (for each pixel). For this dataset, we are essentially using forward modeling to predict accumulated snowfall and accumulated ablation. This approach (but with a different model) has been shown to produce reliable SWE estimates in areas with dense precipitation networks (Livneh et al. 2014); however, in places with fewer gauges, it has also been shown that incorporation of satellite data (e.g., using inverse modeling to work backward from snow-free conditions; Molotch 2009) can generate comparable measurements, or even improve upon SWE measurements generated from forward modeling (Raleigh and Lundquist 2012; Livneh et al. 2014).
While the SNOTEL stations measure SWE directly, COOP stations do not. Here the COOP snow depths are converted to SWE using a relationship between the SNOTEL SWE and snow depth data at the time of maximum snow accumulation (Fig. 2a). In general, snow density increases over time because of settling and metamorphosis of grains, compaction due to the weight of overlying snow, and refreezing of meltwater (Essery et al. 2013), which occurs at different rates in different environments. Figure 2b shows that, commonly, when there is more snow on the ground, the bulk snow density is higher. Furthermore, this relationship generally holds even when half of the data are withheld during calibration and used as evaluation data instead. This also holds even when the data are stratified so that the calibration data are from a different environment than the validation data. For example, Fig. 2c shows that this relationship holds for sites that are in the Cascades and Sierra Nevada (which generally have a higher snow density through much of the winter) versus those in the Intermountain West (which generally have a lower snow density through much of the winter). This transformation, along with the snowfall calculations below, results in a nonlinear regression slope between computed maximum SWE and snowfall at COOP stations that is quite similar to the SNOTEL record, albeit with more scatter (the square of the correlation coefficient R2 in Fig. 2d is 0.74 for the COOP data vs 0.92 for the SNOTEL data).
Besides the UA dataset, we also use the SNODAS SWE data. SNODAS provides data for SWE on a 30-arc-s grid. It is the National Operational Hydrologic Remote Sensing Center’s (NOHRSC) operational snow monitoring system that provides the SWE and snow depth estimates in the CONUS (Barrett 2003). The assimilation system uses downscaled forcing data from a regional weather model to drive a multilayered, uncoupled energy and mass balance model. SNODAS can then be nudged using a large amount of snow measurements from ground, airborne, and satellite sources if it is determined by an analyst that such nudging is necessary (Barrett 2003).
SNODAS assimilates much more snow and near-surface atmospheric forcing data than our method, but its method (Carroll et al. 2001; Barrett 2003) was not published in peer-reviewed journals and the subjective adjustment by analysts during the data development is not reproducible. The SNODAS data are available beginning 30 September 2003. In contrast, our method is based directly on daily PRISM precipitation and temperature data (and hence relies on more in situ precipitation and temperature data) and relies on station reports of SWE, snow depth, and precipitation that go back further than 30 years, enabling the creation of a dataset spanning 30+ years. Broxton et al. (2016) found that the two SWE datasets are comparable, and their differences can be interpreted as a rough estimate of our SWE data uncertainty.
In this study, we use data from water year (WY) 2004 to WY 2009 (meaning that the data go from 1 October 2003 to 30 September 2009), as this covers most of the overlap period for our products. SNODAS begins in WY 2004 and CFSR is only available until March 2011 (after which it was superseded by the CFSv2 operational analysis).
3. Evaluation of SWE in the reanalysis and GLDAS products
In this section, we first present a comparison between SWE values from each product over all of North America to give a sense of which products predict the most snow and which products predict the least (section 3a). Because this analysis partly examines snow cover extent, we also compare the overall snow-covered extent to that indicated by the 24-km Ice Mapping System (IMS) product (Ramsay 1998; National Ice Center 2008) over North America. Next, we evaluate reanalysis and GLDAS SWE with the SNODAS and UA SWE data (section 3b). We evaluate both the temporal evolution of the reanalysis and GLDAS SWE for six 2° × 2° grid boxes, as well as the spatial distribution of maximum SWE (SWEmax) across the entire CONUS. Next, to understand how deficiencies in the reanalysis and GLDAS SWE relate to deficiencies of the forcing data, we evaluate the reanalysis and GLDAS products in terms of how well they represent “snow season” precipitation (both the total amount and the fraction that falls as snow) and snow season temperatures (section 3c). The snow season is defined, for grid cells of each product separately, as the period between the first day of snow on the ground and the last. Reanalysis and GLDAS precipitation amounts are also compared to PRISM precipitation data, and their snowfall fractions are compared to the UA snowfall data and PRISM precipitation data.
Last, to understand how deficiencies in reanalysis and GLDAS SWE might be related to the amount of snow ablation in each product, we evaluate the SWE differences between the reanalysis and GLDAS products versus the SNODAS and UA data in terms of the rate of snow ablation predicted in each product (section 3d). For consistency among all products, we treat the reduction of SWE on precipitation-free days as representing the amount of daily snow ablation due to evaporation/sublimation losses and snowmelt. The results in sections 3a–d focus on a single water year for easy interpretation of the results, and these results are extended to six water years (from WY 2004 to WY 2009) in section 3e.
a. Spatial patterns of GLDAS and reanalysis SWE over North America
Generally, the spatial patterns of SWE are relatively consistent among the reanalysis and GLDAS products that we tested over North America. This can be seen in Fig. 3, which shows a SWE “snapshot” for each product on 1 March 2008, which is near the time of maximum snow cover extent. The two products that look the most dissimilar from the rest are CFSR and ERA-I. For CFSR, the total snow-covered area is relatively consistent with the other products, though there is a much smaller area with higher amounts of SWE. For example, it only has 1.3 × 106 km2 of land area with more than 100 mm of SWE, compared to areas from 3.4 × 106 to 8.5 × 106 km2 for the other GLDAS or reanalysis products (Fig. 4). For ERA-I, there are many areas with large amounts of SWE that are not reflected in the other reanalysis or GLDAS products. For example, ERA-I is the only product depicting large amounts of SWE in pockets through central Canada east of the Rocky Mountains (Fig. 3). These patches are reminiscent of artifacts in the ERA-I’s snow analysis that have been identified in the past. ERA-I/Land looks more consistent with the other products than ERA-I (Fig. 3). GLDAS1-CLM and GLDAS1-VIC are the snowiest GLDAS models, while Noah (both GLDAS1 and GLDAS2) and GLDAS1-Mosaic have fewer areas of deeper snow (Fig. 4). MERRA2 also has considerably more area with >100 mm of SWE than the other products (8.5 × 106 km2). This is much more than MERRA, which has 4.6 × 106 km2, and MERRA-Land, which has 4.9 × 106 km2 with SWE > 100 mm. Note that products that have more SWE as of 1 March also have more SWE through the entire course of the winter (not shown).
Figure 4 also shows the sensitivity of the estimated snow-covered area (SCA) to the threshold SWE value used. Some products have substantial areas with <1 mm of SWE, giving unrealistically high estimates of total SCA when the threshold is very low. For example, using a threshold of 0 mm gives total SCA that is from 0.1 × 106 to 3.2 × 106 km2 greater than using a threshold of 1 mm. GLDAS2-Noah, MERRA, MERRA-Land, and MERRA2 have particularly large differences when these two thresholds are used (>1.5 × 106 km2), while ERA-I, ERA-I/Land, GLDAS1-Noah, and CFSR have moderately large differences (from 0.5 × 106 to 1.5 × 106 km2). The other GLDAS1 models have smaller differences. Compared with the IMS data, using a threshold of 1 mm seems to give a better estimate of SCA than 0 mm for the products with the largest differences between SCA computed with the 0- and 1-mm thresholds (e.g., GLDAS2-Noah, ERA-I, ERA-I/Land, MERRA, and MERRA-Land), as the 0-mm threshold results in SCA that is significantly higher than the IMS data in these products (Fig. 4). Using a threshold of 1 mm slightly underpredicts SCA compared to the IMS data across most products.
b. Comparison with high-resolution SWE data over the CONUS
In comparison with the validation data (SNODAS SWE and our UA SWE dataset), SWE is too low in the reanalysis and GLDAS products in the CONUS. For example, Fig. 5 shows that through the winter, area-averaged SWE is commonly higher in the high-resolution data than the GLDAS and reanalysis datasets in the six 2° × 2° boxes shown in Fig. 1 (centered on northern Colorado, southern Idaho, western Washington, central Wisconsin, central Indiana, and central New York). This is particularly true of the Colorado and Idaho boxes. Furthermore, this underestimation of SWE persists for much of the winter, and particularly in the Colorado, Idaho, and Washington grid boxes, snow persists much longer in the gridded observational data than in the reanalysis and GLDAS products.
This general underestimation of SWE by the reanalysis and GLDAS products compared to the higher-resolution SWE data can also be seen across the CONUS at maximum snow accumulation (Fig. 6). In many areas where there are deeper snowpacks, the reanalysis and GLDAS products predict less snow than the UA (and SNODAS) data. Many of the products (including SNODAS), however, predict more snow than the UA data along the southern margin of the extent of seasonal snow, as well as in some portions of the western United States and northern Great Plains, primarily in areas where the snow is shallow (Fig. 6). However, as can be seen in Fig. 6a, the magnitudes of SWEmax in these regions are often quite low, so even though there are substantial relative differences in these areas (shown in Figs. 6b–m), the absolute differences are quite small. Overall, averaged across the entire CONUS, both the UA and SNODAS SWE data have considerably higher SWE than the reanalysis and GLDAS products.
This conclusion holds for all of the reanalysis and GLDAS products, regardless of their resolution. For example, GLDAS2-Noah has a spatial resolution of 0.25° × 0.25°, but it underestimates SWEmax more than many GLDAS1 models, which have a spatial resolution of 1° × 1°. Furthermore, there is more spread in the CONUS average SWEmax, predicted by the GLDAS1 models (31.5–57.6 mm, or 28%–51% of UA SWEmax), which all share the same spatial resolution of 1° × 1°, than that predicted by the other products (41.3–55.3 mm, or 37%–49% of UA SWEmax), which have spatial resolutions ranging from 0.25° × 0.25° to 0.50° × 0.67° (column 1 of Table 2).
The same conclusions can also be drawn if the reanalysis and GLDAS SWEmax are compared to SNODAS (rather than UA) SWEmax. Not only is the seasonal evolution of SNODAS SWE similar to that in the UA dataset for the 2° × 2° boxes (Fig. 5), but averaged across the entire country, maximum SWE predicted by the UA dataset is nearly the same as that predicted by SNODAS (column 1 of Table 2). This gives us confidence that the underestimation of SWE by the reanalysis and GLDAS products are real and substantial.
c. Evaluating the role of atmospheric forcing
One of the first things that comes to mind when trying to explain this widespread underestimation of SWE are deficiencies in the model precipitation data, as the amount of SWE is particularly sensitive to the amount of snow that has fallen (Broxton et al. 2016). However, this widespread underestimation of SWE (and SWEmax) occurs in the reanalysis and GLDAS products despite the fact that some products have more precipitation than is suggested by the PRISM data, and some have less. CFSR generally has more precipitation than the PRISM data do while the GLDAS models have less. ERA-I, ERA-I/Land, MERRA, MERRA2, and MERRA-Land have smaller precipitation biases compared to PRISM. This is generally true for the 2° × 2° boxes (Fig. 5) as well as across the entire CONUS (Fig. 7 and column 2 of Table 2). For example, for WY 2008, the reanalysis products have 97%–155% of the PRISM snow season precipitation, averaged across the entire CONUS (column 2 of Table 2), while the GLDAS products have 44%–80% of the PRISM snow season precipitation (column 2 of Table 2).
Furthermore, this widespread underestimation of SWE probably cannot be explained by incorrect partitioning of precipitation between rainfall and snowfall. Snow season snowfall S divided by snow season precipitation P is, in general, lower in CFSR, ERA-I, and ERA-I/Land than is suggested by the high-resolution data. At the same time, this ratio is higher in MERRA, MERRA-Land, MERRA2, and GLDAS products than is suggested by the high-resolution data (column 3 of Table 2; Fig. 8). For example, averaged across the CONUS during WY 2008, for CFSR, ERA-I, and ERA-I/Land, this ratio ranges from 63% to 90% of S/P determined from the PRISM precipitation and UA snowfall datasets. For the remainder of the reanalysis and GLDAS products, this ratio ranges from 109% to 158% of S/P determined from the PRISM precipitation and UA snowfall datasets (column 3 of Table 2). This means that for these products, snowfall makes up a greater proportion of total precipitation when snow is on the ground than is suggested by the high-resolution datasets.
It is also unlikely that temperature TAir biases can account for the widespread underestimation of SWE. For example, some products have higher temperatures than are suggested by the PRISM dataset, and some have lower temperatures. In general, the reanalysis products (especially CFSR, ERA-I, and ERA-I/Land) and GLDAS2-Noah have higher temperatures than are suggested by the PRISM data, and the GLDAS1 products have lower temperatures (column 4 of Table 2; Fig. 9). These results are complimentary with the above results because, in general, the same products that have lower S/P than is suggested by the high-resolution data have higher temperatures, and vice versa (cf. columns 3 and 4 of Table 2).
d. Evaluating the role of snow ablation
Section 3c shows that the significant underestimation of reanalysis and GLDAS SWE occurs irrespective of biases in atmospheric forcing. This points to a deficiency in the land components of these reanalyses and GLDAS products. In particular, SWEmax is too low in the reanalysis and GLDAS products for the amount of snow that falls. This can be seen by comparing the ratio of SWEmax to accumulated snowfall (SWEmax/S) for the reanalysis and GLDAS products with that for the UA data. Figure 10 clearly shows that this ratio is much lower when computed for the reanalysis and GLDAS products than for the high-resolution data. Averaged across the CONUS, SWEmax/S for the reanalysis products ranges from 36% to 76% of that computed using the UA dataset, and SWEmax/S for the GLDAS products ranges from 47% to 88% of that computed using the UA dataset (column 5 of Table 2). Higher-resolution products are not necessarily better at representing this ratio, as the two highest-resolution products (CFSR and GLDAS2-Noah) perform the worst and GLDAS1-VIC represents this ratio better than the other reanalysis and GLDAS products (column 5 of Table 2).
At the same time, many of the reanalysis and GLDAS products have more ablation than the UA or SNOTEL SWE datasets. Figure 11a plots, in 2° daily temperature bins, the SWE change (ΔSWE) on precipitation-free days for all products. This quantity can be thought of as approximately equivalent to daily snow ablation. The reanalysis and GLDAS products have different ΔSWEs at particular temperatures. MERRA, MERRA-Land, MERRA2 (green lines in Fig. 11a), and CFSR (red line in Fig. 11a) have the most ΔSWE at subfreezing temperatures. Under these conditions, the UA and SNODAS SWE data show very little ΔSWE, as even the 25th percentile of ΔSWE is close to zero. Even at temperatures above 0°C, most of the reanalysis and GLDAS products have ΔSWEs that are larger in magnitude (more negative) than the UA SWE and SNODAS SWE data. Only ERA-I and ERA-I/Land have median values of ΔSWE that are consistently within the 25th–75th percentile range of ΔSWE for the UA SWE and SNODAS SWE data. At temperatures between 0° and +5°C, GLDAS1-VIC also has less ΔSWE than the remainder of the reanalysis and GLDAS products. Many of the differences between the GLDAS and reanalysis products can be captured by comparing the median ΔSWE for the temperature bin from −1° to +1°C for different products (Table 2, column 6).
There is a good relationship between products that have too much ablation at near-freezing temperatures and those that have ratios of SWEmax/S that are too low (cf. columns 5 and 6 of Table 2). Figure 11b shows this graphically by plotting the differences between SWEmax/S for each reanalysis and GLDAS product and that computed from the UA SWE and snowfall data (these values are determined by subtracting rows 3–13 from row 1 in column 5 of Table 2) versus the median ΔSWE for the temperature bin from −1° to +1°C for each product (column 6 of Table 2). Products that have more ablation at near-freezing temperatures clearly have smaller values of SWEmax/S.
e. Consistency of results for additional water years
For the remainder of the water years in the study period (from WY 2004 to WY 2009), the results were similar, that is, SWEmax was underpredicted by all of the reanalysis and GLDAS products (Fig. 12a), and the underestimation was a little bit worse in comparison to SNODAS because SNODAS predicted slightly more SWE than the UA dataset for many of the years. Despite the interannual climatic variability, CFSR consistently overestimated precipitation, and GLDAS1 products consistently underestimated precipitation (Fig. 12b). Often, the same products that predicted higher or lower snow season snowfall fraction (S/P) and average snow season temperature than the PRISM and UA data were consistent in doing so (Figs. 12c,d). Furthermore, during the entire period, the only metric that was consistently lower than the UA data was SWEmax/S (Fig. 12e). This interannual consistency indicates that the results in sections 3a–d are robust.
4. Discussion and conclusions
Similar to another recent study (Mudryk et al. 2015), we find that there are considerable differences of the climatological values of SWE among the GLDAS and reanalysis products that we tested, even when the forcing for those models is consistent (e.g., for GLDAS1). However, we find that all of the reanalysis and GLDAS products that we tested generally underpredict SWE in the CONUS. They have considerably less snow than higher-resolution, observationally based gridded SWE data, especially in the western CONUS. This is true when these products are compared to either the UA or the SNODAS data, indicating that these results are probably fairly robust. Even though there is a wide spread in the model precipitation, with GLDAS1 products (which all share the same precipitation amount) having the least amount of precipitation (well below what is suggested by PRISM) and CFSR having the most (more than what is suggested by PRISM), SWEmax, normalized by precipitation, is significantly lower in all of the reanalysis and GLDAS products than is suggested by the high-resolution data.
The primary reason for this underestimation is that most reanalysis and GLDAS products have too much snow ablation at near-freezing temperatures. In fact, the UA and SNODAS data suggest that, on average, significant snow ablation does not occur until the daily temperature is above freezing, though some of the GLDAS and reanalysis products that we tested have significant snow ablation when the daily average temperatures are well below freezing. In particular, CFSR, MERRA, MERRA-Land, and MERRA2 have the most snow ablation at subfreezing temperatures. In general, the more ablation that the reanalysis or GLDAS products have at near-freezing temperatures, the larger the difference between SWEmax/S for the UA dataset versus that computed from reanalysis or GLDAS data (Fig. 11b).
This does not mean that other factors do not contribute to the widespread underestimation of SWE, especially on a case-by-case basis. For example, as mentioned in section 2, CFSR and ERA-I include initialization that affects snow thickness. This certainly affects SWE for these products. CFSR uses a similar initialization procedure as the operational CFS, which was shown in Dawson et al. (2016) to lead to significant underestimations of SWE. The MERRA and GLDAS products do not initialize snow thickness, so it is not expected that snow initialization could dramatically alter the amount of SWE in these products. In addition, forcing biases (e.g., in precipitation and temperature) certainly contribute to either worsening or alleviating the underestimation of SWE in the reanalysis and GLDAS products. It is well known that forcing biases can lead to large uncertainties in SWE simulation, especially if those biases are large (Raleigh et al. 2015). For example, Brun et al. 2013 find that the Crocus snow model yields much better snow simulations with relatively small biases over Eurasia if the model is forced with ECMWF forcing versus the Princeton meteorological forcing dataset. In this study, we find that GLDAS products, especially GLDAS1, underestimate precipitation, which undeniably contributes to these products having low SWE. However, at the same time, some of the GLDAS1 products, particularly GLDAS1-CLM, have higher snowfall ratios (S/P) than is suggested by the UA data, which would likely increase SWE. Note that even though GLDAS1 models share nearly identical precipitation forcing, each model decides how to partition between rainfall and snowfall individually. Furthermore, the GLDAS1 products tend to underestimate snow season temperature, especially in the western CONUS (Fig. 9, Table 2).
In addition, the above conclusions are not much affected by the different spatial resolutions of these products. For example, GLDAS1-Noah (which has a grid size of 1°) predicts SWEmax/S that is closer to the UA data than GLDAS2-Noah (which has a grid size of 0.25°). In addition, even though GLDAS1-VIC has a grid size of 1° (among the coarsest resolution), it was most similar to the UA data in terms of SWEmax/S and to the UA and SNODAS data in terms SWEmax. This does not mean that resolution is not an issue because all products that we evaluated have resolutions that are too coarse to resolve complex topography (mountains) in some regions. However, what we have shown is that even after removing this effect (i.e., evaluating how each product predicts SWE given a set of climatological conditions within each product), the GLDAS and reanalysis products still underestimate SWE.
We have shown that the degree to which each product underestimates SWE is related to the propensity of each product to predict more or less ablation at near-freezing temperatures. Although full model sensitivity studies are required to fully understand, on a case by case basis, why this may be so (a task that is beyond the scope of the current manuscript), it is reasonable to assume that it can potentially be understood in terms of the differences in model physics or the way in which the models are implemented. Consistent with previous studies (e.g., Essery et al. 2013), more complex models do not necessarily predict SWE better than simpler models. For example, MERRA, which employs a multilayer snow model, predicts SWEmax/S that is lower than GLDAS1-Mosaic, which employs a single-layer snow model (and with much simpler snow physics). However, deficiencies in the representation of certain aspects of the model physics (e.g., for albedo) can have a big influence on snow simulation (Wang et al. 2010; Feng et al. 2008). Furthermore, other factors, such as how individual models are influenced by forest cover or whether they include subgrid tiling of topography and vegetation, may also have a large influence on the snow simulation (Chen et al. 2014). It is striking that GLDAS1-VIC is the best at representing SWEmax/S and yet is the only model to include subgrid elevation bands. Representing this sort of subgrid variability may be particularly appropriate for snow simulation because of SWE’s strong dependence on elevation. It is possible that better representation of subgrid topography (e.g., elevation and aspect) may lead to more realistic snow simulation in large-scale reanalysis and GLDAS.
While the UA dataset is likely much more accurate at predicting SWE than the GLDAS and reanalysis products, it still contains a number of uncertainties. One potential uncertainty is the statistical model to convert COOP snow depths into SWE (Fig. 2a) prior to their inclusion into the dataset. In general, this relationship potentially underestimates SWE at COOP stations because it is derived from SNOTEL stations, which are often located in cold environments with continuous snowpacks, and applied to COOP stations, which are located in warmer environments with ephemeral snowpacks. Ephemeral snowpacks often have a higher density, at a given SWE, than cold snowpacks (Brown et al. 2003). Furthermore, there is a substantial difference between the scale of in situ measurements and the grid scale of the UA data (4 km). However, the effect of this difference on the SWE assimilation is mediated somewhat in the UA data by the fact that a normalized quantity (SWE/S), instead of SWE itself, is interpolated between the point observations. The normalized quantity was shown in Broxton et al. (2016) to be spatially more consistent than the unnormalized quantity. Furthermore, the dataset is strongly constrained by the gridded temperature and precipitation data, which makes it relatively consistent, regardless of what SWE data are included (and their uncertainty). For example, Figure S1 in the supplemental material shows the area-averaged SWE through time for WY 2008 for the Colorado and Wisconsin grid boxes where 0%, 50%, 75%, and 90% of the SNOTEL and COOP SWE and snow depth data have been withheld (i.e., are not used in the interpolation). These area-averaged quantities rarely differ by more than 5%. It is clear that withholding large amounts of SWE and snow depth data have relatively little impact on the overall averaged SWE for the grid boxes, demonstrating the robustness of the UA product.
Since the UA SWE dataset is based on daily PRISM precipitation and temperature data and station SWE and snow depth data, it covers a period that goes back much further than the SNODAS data. Because of the potential use of such a dataset, we are in the process of documenting and releasing the entire dataset (from WY 1981 to WY 2010) in a separate study. Furthermore, recognizing the uncertainty in the simple relationship between SWE and snow depth in Fig. 2, we are in the process of developing a new observational data-based snow density model in a separate study.
This study was funded by NASA (NNX14AM02G; received by Xubin Zeng), NOAA (NA13NES4400003; received by Xubin Zeng), and NSF (AGS-0944101; received by Xubin Zeng). Patrick Broxton and Nicholas Dawson were supported by the above grants. Three anonymous reviewers are thanked for their helpful and insightful comments. SNOTEL data are obtained from the NRCS SNOTEL and snow course data and products web page (http://www.wcc.nrcs.usda.gov/snow/). COOP data are obtained from the University Corporation for Atmospheric Research’s (UCAR) Computational and Information Systems Lab (CISL) Research Data Archive (RDA; http://rda.ucar.edu/index.html?hash=cgi-bin/dssearch?words=td3200), PRISM data are obtained from the Oregon State University’s PRISM web page (http://www.prism.oregonstate.edu/), and SNODAS data are obtained from the National Snow and Ice Data Center (NSIDC; http://nsidc.org/data/g02158). IMS data are obtained from the NSIDC (http://nsidc.org/data/docs/noaa/g02156_ims_snow_ice_analysis/). Data for four GLDAS, version 1 (GLDAS1), models (VIC, Noah, Mosaic, and CLM), as well as one GLDAS, version 2 (GLDAS2), model (Noah) are obtained from the Goddard Earth Sciences Data and Information Services Center (GES DISC; http://disc.sci.gsfc.nasa.gov/services/grads-gds/gldas), CFSR data are obtained from the NOAA National Operational Model Archive and Distribution System (NOMADS; http://nomads.ncdc.noaa.gov/data.php), MERRA and MERRA-Land data are obtained from the GES DISC MERRA Data Subset (http://disc.sci.gsfc.nasa.gov/daac-bin/FTPSubset.pl), MERRA2 data are obtained from the GES DISC MERRA2 Data Subset (http://disc.sci.gsfc.nasa.gov/daac-bin/FTPSubset2.pl), and ERA-I and ERA-I/Land data are obtained from the ECMWF (apps.ecmwf.int/datasets).
Supplemental information related to this paper is available at the Journals Online website: http://dx.doi.org/10.1175/JHM-D-16-0056.s1.