## Abstract

In many practical applications snow depth is known, but snow water equivalent (SWE) is needed as well. Measuring SWE takes ∼20 times as long as measuring depth, which in part is why depth measurements outnumber SWE measurements worldwide. Here a method of estimating snow bulk density is presented and then used to convert snow depth to SWE. The method is grounded in the fact that depth varies over a range that is many times greater than that of bulk density. Consequently, estimates derived from measured depths and modeled densities generally fall close to measured values of SWE. Knowledge of snow climate classes is used to improve the accuracy of the estimation procedure. A statistical model based on a Bayesian analysis of a set of 25 688 depth–density–SWE data collected in the United States, Canada, and Switzerland takes snow depth, day of the year, and the climate class of snow at a selected location from which it produces a local bulk density estimate. When converted to SWE and tested against two continental-scale datasets, 90% of the computed SWE values fell within ±8 cm of the measured values, with most estimates falling much closer.

## 1. Introduction

As global temperatures rise, the world’s snow resources are predicted to change in significant ways (Hosaka et al. 2005; Christensen et al. 2007; Räisänen 2008; Deser et al. 2010). Long-term changes in global, regional, and local snow depth (*h _{s}*), snow water equivalent (SWE), and extent will ultimately have major ramifications for ecosystem function, human utilization of snow resources, and the climate itself through feedback mechanisms like snow albedo (Barry 1996). Unfortunately, of the three snow metrics listed above, only extent [i.e., snow cover area (SCA)] is easily monitored using satellites. This monitoring, under way for several decades (Robinson 1993; Frei and Gong 2005), has shown that global SCA has been decreasing for the past 30 years (Lemke et al. 2007). SCA, however, is only an indirect measure of the world’s snow water resources (Brown 2000; Brown et al. 2003). To fully understand global snow water trends, the most fundamental metric to monitor is SWE, with depth a close second. For monitoring these metrics, two potential methods are available: passive microwave remote sensing and estimations based on direct measurements. Many issues make estimating snow mass from remote sensing problematic (König et al. 2001), leaving field and station measurements currently the primary means of inferring critical trends in snow resources.

Of the two fundamental measurements, depth is quicker and easier to measure in the field than SWE. No detailed estimates of the total number of depth and SWE measurements made worldwide is available, but what is available suggests that considerably more depths are collected than SWE measurements. For example, Environment Canada operates 1556 snow depth sites, but only 27 sites where SWE is measured. In the United States, over 700 snow pillows (Beaumont 1965; Johnson and Schaefer 2002), chiefly operated by the National Resource Conservation Service (NRCS), are used to make continuous SWE measurements, but conservative estimates (based on sales of sonic sounders) suggest that thousands of depth-monitoring stations are in operation. The U.S. Weather Service as well as the Swiss Service measure more depths than water equivalents.

The two measurements are related but not interchangeable. The decision to make one type of measurement versus the other (or a mix of both) is often determined as much by tradition as an objective assessment of accuracy, need, and available resources. To our knowledge, no formal cost–benefit analyses of measuring depth versus SWE have been conducted, but a rough estimate can be made of the relative time effort involved in making each type of measurement. During field campaigns we try to split our available time equally between the two. A census of records for the past decade indicates an approximate ratio of depth to SWE measurements of about 30 to 1. In a more objective comparison, our standard field protocol calls for 201 depth measurements and 10 SWE measurements at each station. These are done in the same amount of time giving a 20:1 ratio. Part of the reason for this disparity is that we use an efficient self-recording depth probe (U.S. Patent No. 5864059). In the U.S. alone approximately 6000 man hours are expended monthly during winter and spring in the collection of ∼20 000 SWE data, but based on the 20:1 ratio, the same work effort could produce nearly half a million depth measurements.

What if there was a reliable method of converting depth measurements into SWE? Such a method would allow for a significant expansion in the number of locations where SWE could be estimated without incurring much additional expense. This would improve our ability to assess worldwide snow water resources. Here we describe a method that uses historical station data and knowledge of snow climate classes (Sturm et al. 1995) to estimate the bulk density of a snowpack. The bulk density is then used to convert depth into SWE. Specifically, using a large (*n* = 25 688) training set of depth, density, and SWE measurements, we fit the data, including the climate class of each datum (Sturm et al. 1995), with a nonlinear analysis of covariance model (ANCOVA) using Bayesian methods. From the analysis an equation is derived that allows estimation of bulk density based on input of *location*, *snow depth*, and the *day of the year*. We examine the residuals to assess model fit and test the model against a large independent dataset to provide the reader with an understanding of model accuracy.

The method opens the possibility of converting thousands of depth observations that have been, or currently are being, collected in to SWE values. For example, it could be used to convert Environmental Technical Applications Center monthly observed global snow depth climatologies (Foster and Davy 1988) into SWE climatologies. It is also likely to become even more useful in the future if one promising new method of measuring snow depth becomes operational: airborne LiDAR (Hopkinson et al. 2004; Deems and Painter 2006; Deems et al. 2006; Minoru and Hiroshi 2006; Schaffhauser et al. 2008). Airborne LiDAR snow surveys are conducted much like gamma ray surveys (Peck et al. 1980) with a set of measurements made prior to the start of the snow season subtracted from a second set made after the snow has accumulated. Unlike gamma ray surveys, however, LiDAR surveys produce snow depth swath maps along the flight path that can contain literally millions of depth data. Accuracy depends on flight altitude, terrain, and GPS navigation, but is typically about ±15 cm in absolute terms, with considerably higher accuracy possible when frequent bench marks are included in the swath. Coupled with the depth-to-SWE conversion method described here, airborne LiDAR could greatly improve our ability to estimate local-to-hemispheric snow resources.

## 2. Background

At a given point, snow depth (*h _{s}*) is related to SWE by the local bulk density (

*ρ*):

_{b}where depth is measured in centimeters, density in grams per centimeters cubed, *ρ _{w}* is the density of water (1 g cm

^{−3}), and SWE is measured in centimeters of water. It has long been observed (e.g., Dickinson and Whitely 1972; Steppuhn 1976) that the natural range of

*h*is many times greater than the range of

_{s}*ρ*. For a northern Alaskan dataset (

_{b}*n*= 5323), 95% of all bulk density values fell between 0.12 and 0.42 g cm

^{−3}while 95% of all depths fell between 8 and 100 cm, a dynamic range 4 times greater than that of density. For the training dataset used in this paper the ratio is about 10:1. Based on these ratios, estimating the more conservative parameter (

*ρ*) while directly measuring the more dynamic (and easier to measure) parameter (

_{b}*h*) is the most practical and potentially accurate method of estimating SWE.

_{s}The form of Eq. (1), along with the large dynamic range of *h _{s}*, ensures that there will be a strong correlation between SWE and depth, but it is mute about the relationship between depth and bulk density. These two parameters are functionally related, albeit in a complex way. Steppuhn (1976) noted that Eq. (1) requires a covariance term between depth and density (

*C*) when used to compute mean areal values (indicted by overbars):

The covariance between depth and bulk density, however, tends to be negligible for a snow cover less than 80 cm deep (Pomeroy and Gray 1995), and at greater depths is only about 2.5% of SWE (data from this study). The error in neglecting *C* is therefore typically smaller than other potential error sources in Eqs. (1) or (2), so for simplicity, C is ignored.

Despite the fact that estimating *ρ _{b}* using Eq. (1) makes it possible to determine SWE from depth, there have been relatively few formal attempts to do so. A handbook for the NRCS field surveyors (Davis et al. 1970) provides qualitative rules for adjusting the local mean snow density upward or downward depending on wind, exposure, thaws, and the day of the year. Wilks and McKay (1996) developed a power-law version of Eq. (1) with the bulk density term (their “pseudodensity”) varying with climate. More recently, estimates of

*ρ*have been produced using snow models that solve the surface energy balance and evolve a snowpack layer by layer (Brun et al. 1989; Liston and Elder 2006; Lehning et al. 2006), but these models require extensive meteorological input as well as the calculation of settlement of individual snow layers. They tend to be difficult to apply to large basins and regions because of their heavy data and computational demands.

_{b}## 3. Data and modeling

### a. Data

The bulk density model was developed using a training set of 25 668 records of snow depth, density, and SWE from three countries and two continents (Table 1). These data span the climate classes of seasonal snow defined by Sturm et al. (1995): alpine, maritime, prairie, tundra, taiga, and ephemeral. The classes are defined by the general physical attributes of a snow cover (depth, density, type of snow layers, etc.), attributes that tend to remain relatively consistent within climate zones. The set combines research surveys done in Canada, Switzerland, and Alaska by the authors with data from monthly agency surveys done by the NRCS on snow courses in the western United States (Fig. 1a). Because few systematic measurements have been made in ephemeral snow it had to be excluded from the analysis. The data in the training set tend to be of high quality because a high percentage were taken during research campaigns.

At NRCS stations, data are collected on the first of the month starting in November and continuing through May or June each year. The Canadian and Alaskan data were taken by the authors during over-snow traverses conducted in March and April when conditions were still well below freezing but the snowpack was at near-maximum depth (Table 1). The Swiss data were taken at Alptal, Switzerland (Staehli and Gustafsson 2006) on a weekly and biweekly schedule between November and April. All data used in the analysis are available online (see http://cdp.ucar.edu/cadis).

After developing the model, we located a large depth–density–SWE dataset from Canada (i.e., the test dataset; Fig. 1b; *n* = 226 009) against which the model was tested. This set, compiled in 2000 (Meteorological Service of Canada 2000) and updated through 2004 by R. Brown, comes from snow courses measured between 1935 and 2004 and is wholly different than the Canadian data used in model development. The practices and equipment used in collection of the large set varied widely and are poorly documented. Considering this, and the age of the data, they are probably not as accurate as the training set. Additionally, the set is heavily biased toward sites clustered along the U.S.–Canada border, many of which are classified as maritime sites despite having markedly lower depths than in the training set.

Both the training and test sets undoubtedly include erroneous depth and SWE data (cf. Carroll 1995). SWE measurements, in particular, are difficult to make accurately. Using a corer that is either plunged or twisted into the snow until it hits the ground, a snow core is removed from the snowpack and weighed in the corer using a spring balance calibrated to read out directly in SWE units, or bagged and weighed later using a digital balance. In either case, the measurement is gravimetric and returns the bulk density of the snow; though in one case the measurement is reported in SWE units. While the corer is touching the ground, the depth is read off graduated marks on the side of the corer with a nominal precision of ±0.2 cm, but an accuracy that is closer to ±1.0 cm.

At least four models of snow corer are in use in North America [the Federal or Mt. Rose, the Adirondack, the Eastern Snow Conference (ESC), and the Snow-Hydro] and all suffer from similar problems, the most common being a loss of snow. If the corer fails to retain a plug of soil, vegetation, and/or ice, snow will fall out the bottom of the corer when it is extracted from the snowpack (Turcan and Loijens 1975). The corer can also hit an ice layer, which will push snow out of the corer’s path, again resulting in a light sample. We have tried to quantify the magnitude of this undersampling by routinely comparing core-based density values to those determined from a density profile done in nearby snow pit. For measuring the latter, a 3-cm high steel box cutter with a volume of 100 cm^{3} is used with a digital balance. By integrating the individual layer densities measured in the pit, a bulk density can be computed and compared to the adjacent values obtained by coring. Our experience suggests that the pit measurements are more reliable and accurate, albeit more labor-intensive, than the core measurements. The results (spanning 8 yr, 5 field programs, and 3 types of corers: Table 2) suggest core-based bulk densities average 7.1% lower than layer-integrated values.

Our results contradict prior studies (Work et al. 1965; Peterson and Brown 1975; Goodison 1978; Farnes et al. 1982) that found that coring results in a high SWE bias. This happens when excess snow is forced into a core tube, a situation that is said to happen more frequently with smaller diameter cutters (like the Federal Sampler with a cross-sectional area of 11 vs 30 cm^{2} for the ESC and Snow-Hydro samplers), deeper, denser snow (Peterson and Brown 1975), and poor cutter design (bad taper; fewer and/or dull teeth). The type of snow (nature and amount of depth hoar, number and thickness of ice layers) also matters. SWE errors also arise when a spring balance is used for weighing the cores (Bray 1973). Given our findings, and those of the previous studies, we conclude that within the training set (i) the SWE errors are small and (ii) they are a function of the equipment used, the care taken in collecting the measurement, and the nature of the snowpack (none of which is routinely recorded). They are effectively random in nature, and therefore cannot be corrected easily, so no correction is applied.

### b. Modeling

While it would be possible to develop a model relating SWE directly to depth, it is more appropriate to model bulk density (as detailed above) and use that relationship to convert measured or assumed depths to SWE. Bulk density is a complex function of snow depth (*h _{s}*), snow temperature (

*θ*), snow deposition history (

*t*), and the initial density of the individual snow layers (

*ρ*

_{0}):

thus Eq. (1) can be written as

Without knowing explicitly the form of Eq. (3), it is clear that SWE is a complex and nonlinear function of *h _{s}.*

To model the bulk density we used Bayesian statistical methods to develop a nonlinear ANCOVA model. Bayesian methods offer versatility when modeling complex systems (Congdon 2003; Gelman et al. 2004; see also http://www.bayesian.org/ and the journal *Bayesian Analysis*). When *ρ _{b}* is plotted as a function of depth, it exhibits nonlinearity in the mean response, a skewed error structure, and a nonconstant error variance (heteroscedasticity). These complexities can be handled using Bayesian methods, but are more problematic when using nonlinear least squares. The Bayesian methods also produce useful posterior probability distributions that define the uncertainties associated with the model predictions, as well as accounting for temporal and spatial dependencies. Three easily obtained predictor variables: snow depth, DOY (day of the year), and snow class, were used as input to the model. The snow class, as discussed below, was intended to capture environmental variables like temperature and initial density, which are well known for impacting the bulk density.

Many different functional forms of Eq. (3) (i.e., linear models with and without quadratic and cubic functions of depth and DOY; nonlinear functions) and all combinations of predictor variables were tested. To choose which form was best, as well as whether all three predictor variables were needed, we used an objective method of evaluation called the deviance information criterion (DIC; Burnham and Anderson 2002; Spiegelhalter et al. 2002; Wheeler et al. 2010). DIC scores become lower (in our case increasingly negative) as model accuracy goes up, but if, and only if, the improvement in accuracy exceeds a penalty assessed for increasing model complexity. A decrease in DIC score of 5–10 points implies a significant improvement in model performance. The final model that was selected had a DIC score ∼1630 lower than its closest competitor, suggesting that it was significantly better.

The variance in the bulk density data (when plotted against snow depth, DOY, and subdivided into climate classes) was modeled using both beta and normal distributions, but in the end, a beta distribution was selected because it could accommodate the skew in the data. Incorporating a term to address the nonconstant variability (heteroscedasticity) present in the training dataset improved model performance (decreasing DIC by almost 2000 points) and was also incorporated into the final model.

The software package WinBUGS (Lunn et al. 2000) was used to run the various models and determine the optimal result. Multiple Markov chains (using a range of initial values) were run for more than 100 000 iterations to simulate posterior distributions, with the first 50 000 iterations eliminated to allow for burn-in. We verified convergence by examining run histories, autocorrelation functions for simulated values, the Gelman–Rubin convergence statistic (Gelman and Rubin 1992) as modified by Brooks and Gelman (1998), and by performing additional diagnostic tests using the CODA (Plummer et al. 2006) libraries in Program R (R Development Core Team 2008).

Two potential types of errors affect the modeling. The first relates to the use of snow climate classes as a proxy for the physical processes that lead to the densification of a snow cover. Snow bulk density is known to increase with time and depth as the weight of the overlying snow compacts underlying layers (Kojima 1966). It is also moderated by temperature, solar radiation, the nature of the initial snow deposit, infiltration of meltwater, and the rate, sequence, and rapidity with which new snow layers are added to the snow on the ground. Bilello (1969), McKay and Findlay (1971), and McKay and Gray (1981) first demonstrated that these differences in bulk snow densification could be captured empirically using a climate classification; Sturm and Holmgren (1998) confirmed this finding. Nonetheless, the approach is general. The alternative would be to explicitly model compaction processes, as has been done in several physically based snow models (cf. Anderson 1976; Koren et al. 1999; Liston et al. 2007; see Rutter et al. 2009 for an extensive list). The problem is that these physical models require high-quality daily or even hourly weather and snowfall data because they must track individual snow layer settlement through time. Simpler quasi-physical models have been proposed based on time since snow deposition (Martinec 1966; Elder et al. 1991; Sturm and Holmgren 1998), but even these models are too complex and too computationally intensive to be used in global and regional applications (cf. Anderton et al. 2004). So while our model is general and potentially less accurate than explicit locally applied physical models, it is far easier to apply globally or regionally.

The second potential error is misclassification of the snow cover. We used a 1-km by 1-km global snow class map (G. E. Liston and M. Sturm 2010, unpublished manuscript) for classifying both the training and test datasets. The map is a refinement of the 0.5° by 0.5° grid map that appears in Sturm et al. (1995). For the training set the classification was checked using aerial photos (vegetation can indicate class) and satellite images, and in many cases there was local knowledge of the snow. Classification of the test set was based strictly on location (latitude–longitude) and likely includes greater classification errors.

## 4. Results

The training dataset (Table 3 and Fig. 2) has a mean density (*ρ*_{b}) of 0.312 g cm^{−3} and exhibits a nearly symmetric distribution, while depth (mean = 108.4 cm) and SWE (mean = 38.8 cm) have asymmetric distributions. Consistent with the findings of Skaugen (1999) and Bocchiola and Rosso (2007), these distributions can be fit using normal (density) and lognormal (depth and SWE) functions, respectively. The similarity between the SWE and depth distribution curves is indicative of the fact that SWE is more closely linked to depth than it is to bulk density, as previously discussed. A regression line of SWE versus depth [forced through (0, 0), see Fig. 3a] has a slope of 0.394 and an *r* ^{2} value of 0.95 (Table 3). The correlation value is high, but it should be given the nature of Eq. (1). The same equation suggests that the slope of the regression should be equal to the mean bulk density (0.312 g cm^{−3}), but it is not. Instead the slope is 26% higher because the functional relationship between depth and SWE is nonlinear. The regression line is skewed upward by the greater snow densities associated with deeper snow. This problem illustrates one of the motivations for modeling *ρ _{b}* rather than estimating SWE directly from depth.

A plot of bulk density (*ρ _{b}*) versus depth (

*h*) [Fig. 3b; see also Fig. 3 in Pomeroy and Gray (1995)] shows considerably more scatter. There is a distinct increase in the minimum value of

_{s}*ρ*(the floor of the data cloud in Fig. 3b) with increasing depth, a result that could have been anticipated because with increasing depth there is greater compaction. There is almost no increase in the maximum bulk density (

_{b}*ρ*

_{max}) with depth; this too is consistent with previous work that has found that the maximum density of seasonal snow (in the absence of water infiltration) approaches 0.6 g cm

^{−3}(Paterson 1981). A linear function could be used to relate bulk density to depth, but a better fit is achieved using a nonlinear function asymptotic to the maximum seasonal density:

where *ρ*_{0} is the initial density and *k* is a fitting parameter. Unlike the SWE versus depth regression, the bulk density fit should not be forced through (0, 0) because even a vanishingly thin snow layer will have a nonzero density.

Within the training dataset there are distinctly different density patterns for each class of snow (Fig. 4). Maritime snow, for example, has the largest range (in both depth and density) of all the classes, forming a data cloud that has an axis oriented primarily in the horizontal direction, reflecting the large range of snow depths this class exhibits (Sturm et al. 1995). Deep maritime snow, always relatively close to its melting point, compacts rapidly to produce high bulk densities by late winter. In contrast, the tundra snow data cloud has a vertical axis, indicating a large density range, but little range in depth. Tundra snow never gets very deep, but its two major components, depth hoar and wind slab, bracket the full range of density found in seasonal snow (Benson and Sturm 1993). Variations in the relative percentages of these components can alter the bulk density dramatically without much variation in depth.

Within each snow class shown in Fig. 4 there are further variations in depth and density controlled by landscape, vegetation, microtopography, and time (e.g., Adams and Roulet 1982; Elder et al. 1991; Lapen and Martz 1996). These within-class variations are well illustrated by the tundra snow class. In the tundra regions, locations exposed to higher winds (in Alaska the arctic coastal region) tend to exhibit higher bulk densities and lower snow depths than more protected places like the foothills just north of the Brooks Range (Fig. 5: see also Bilello 1957, 1984). Here, we ignore these local, landscape-controlled density differences, though they could be introduced into the methodology in manner similar to that of Anderton et al. (2004), Watson et al. (2006), and Jonas et al. (2009).

The scatter in Figs. 3a,b suggest that in estimating the bulk density of snow using a statistical model there is likely to be a fairly large uncertainty. To place the model results in context, it is useful to look at the range of bulk density values observers might encounter if they were to take several accurate measurements at a single site or within a very small basin. This natural range is the base against which model-based estimates need to be compared. Unfortunately, there has been relatively little systematic effort to quantify it. A sense of its magnitude, however, can be obtained by analyzing data collected at 124 sites (10 measurements per site) during 2 of the Alaskan traverses that are included in the training set. Each of these sites was flat and nearly featureless (cf. Liston and Sturm 2002; Sturm and Liston 2003) limiting local variations because of topography or vegetation. The bulk density varied between 0.25 to 0.33 g cm^{−3} in 2000, and between 0.23 to 0.28 g cm^{−3} in 2002 (Fig. 6), a range that when converted to SWE implies a 27% and 20% variation, respectively, about the mean. In a more comprehensive survey that included five different locations and several snow types, Jonas et al. (2009) concluded that the local natural variation in bulk density ranged from 7% to 23% of the mean bulk density value, corresponding with a variation in SWE of 13%–80%, an amount that translates into 2.5–6 cm of SWE.

Finally, in preparation for testing the model results, training and test datasets were compared using standard statistical measures (Table 3). The mean values of depth, density, and SWE were higher for the training set by 35%, 12%, and 42%, respectively. We note that these differences are of the same order as the local variances shown in Fig. 6. The test set was also more scattered than the training set, probably because it is considerably larger, older, and contains more problematic data points.

## 5. Modeling the bulk density

The simplest method of converting snow depth to SWE is to replace *ρ _{b}* in Eq. (1) with the mean density of the training set (Table 3; 0.312 g cm

^{−3}). A refinement of this approach would be to first classify the sites being modeled and then use the mean densities of the individual classes. The mean density of the whole training set differs from the whole test set by only 12%, a small error considering other uncertainties in calculations of global snow reserves. The differences in snow class densities range from 1% to 19% (in all cases the training set values are higher than the test set), still reasonably close for regional and global modeling.

A more accurate approach, however, is to directly include the effect of snow depth [*h _{s}* in Eq. (5)], the effect of snow aging (older snow is generally deeper and denser than younger snow) through the use of DOY, and to indirectly include the effects of climate (temperature and wind) through the use of snow classes. Equation (5) becomes

where *k*_{1} and *k*_{2} are densification parameters for depth and DOY, respectively; *ρ*_{max}, *ρ*_{0}, *k*_{1}, and *k*_{2} vary with snow class (Table 4); and *i* indicates the *i*th observation. Because the winter season in the Northern Hemisphere spans two calendar years, DOY runs from −92 (1 October) to +181 (30 June), with no 0 value. In this system, 1 February is DOY = 32, while 15 November is DOY = −47.

Before we settled on Eq. (6) other functional forms relating bulk density to depth and DOY were evaluated using the DIC criteria. Several linear models with and without quadratic and cubic functions of depth and DOY produced nonphysical results and were discarded. Incorporating nonconstant variability (i.e., the heteroscedasticity exhibited in Fig. 3b) improved model performance, decreasing DIC by almost 2000 points, but had virtually no effect on the fitted curve and resulting estimates of *k*_{1} and *k*_{2}. The final model [Eq. (6)] included all three predictor variables (snow depth, DOY, and snow class) and had a DIC score that was 1630 lower than the closest nonlinear competitor (which included only snow class and depth as input; Fig. 7). The chosen model explains about 45% of the variability in the data. By way of comparison, the adjusted *r* ^{2} for a linear model with depth as the only input explains less than 24% of the variability (Fig. 7).

To quantify the model fit, bulk density residuals (*ρ*_{obs} − *ρ*_{model}) for the training set were used to calculate the root mean squared error (RMSE) as a function of snow depth:

where *n* is the number of observations falling in depth class *h*_{1} to *h*_{2}, and obs and mod indicate measured and modeled values, respectively. As expected, the predictive power of the model for snow depths less than 30 cm is relatively poor (Fig. 8), but the error diminishes quite rapidly as snow depth increases, quickly dropping to levels well below those of natural local variability (Fig. 6). For example for snow deeper than 50 cm, modeled bulk density errors are less than 0.066 g cm^{−3}.

Application of the model over an area, region, or the Northern Hemisphere is simple: each snow depth station is classified using the snow class map of Sturm et al. (1995) or the newer 1-by-1 km map developed by G. E. Liston and M. Sturm (2010, unpublished data). Equation (6) is applied to the station snow depth using the date of observation and the appropriate parameters from Table 4. The resulting value of *ρ _{b}* is multiplied by the depth to produce the SWE for the station.

## 6. Testing the model

For each site and date of observation in the test dataset Eq. (6) was applied using the appropriate values from Table 4 based on the site classification. The estimated bulk density was compared to the observed value and two errors were defined:

and

Probability distribution functions for these errors are shown in Figs. 9a,b. Ninety percent of all model-estimated bulk densities fell within ±0.13 g cm^{−3} of the measured density. For comparison, site-specific field surveys based on multiple SWE cores commonly find local variations in density of ±0.07 g cm^{−3} (Fig. 6; see also Goodison 1978 and Table 4 in Jonas et al. 2009). SWE values produced from the model densities fell between −7 and +9 cm of the observed SWE: locally observed variations in SWE are on the order of half this range.

The biases between the training and test datasets listed in Table 3 and shown in Figs. 2a–c reappear in Figs. 9a,b. The test data are on average 0.04 g cm^{−3} lower in density than the training data and 2.2 cm lower in SWE (shown by vertical dashed lines in Figs. 8a,b). There is no definitive way to tell which dataset is better. However, given the uncertain heritage of many of the datum in the test set, we favor the training set as being more universal, in part because it is more regional in scope, and in part because its heritage is better known and it is known to include data of higher quality.

The probability of error as a function of snow depth has been contoured in Figs. 10a–c, removing the previously noted bias in order to focus on how the error varies with depth. The bulk density error is nearly constant across the entire range of depth (Fig. 10a), while the probable error in SWE increases as depth increases (Fig. 10b). The relative error in SWE, on the other hand, decreases with increasing depth (Fig. 10c). This is because the depth increases faster than the error. We examined whether the errors showed any systematic variation with other factors besides depth (and SWE) but found none. The errors are also of approximately the same magnitude across all latitudes, longitudes, and dates in the test dataset.

In a second test, model results were compared against data collected from three midlatitude basins in the United States where considerable snow research has been done. These are: Tokopah basin in the Sierra’s of California (Maritime or Alpine snow: see Sickman et al. 2001), Green Lakes basin in the Front Range of Colorado (Alpine snow; data courtesy of M. Williams), and the Fraser and North Park basins measured during the Cold Lands Processes Experiment (CLPX) in Colorado in 2001 and 2002 (Alpine and Prairie snow). The comparison (Table 5) shows reasonable agreement for estimated SWE values for all areas with the exception of the very shallow prairie snow of North Park. With the exception of the latter, the SWE error as a percentage of the observed value ranged from −1.5% to 26%, and was in all cases less than the standard deviation of the observed values.

## 7. Discussion

The strength of the model developed here is its simplicity and ease of use. It seems ideally suited for use in sparsely populated snow-covered areas like the Arctic and mountainous regions where access can be difficult. It is also useful where the computational cost of physically based modeling to estimate bulk density may not be warranted, or where meteorological station density may be too low to support such modeling. Given its simplicity, the performance of the model is surprisingly good. The reason the model works is because bulk density is a conservative value. Fully 50% of all bulk density measurements for seasonal snow lie in a narrow range between 0.244 to 0.375 g cm^{−3} (Fig. 2c); 80% lie between 0.194 and 0.435. Reasonable estimates of SWE can be derived even using a fixed bulk density value (either 0.312 g cm^{−3} for all snow, or the values in Table 3 for each snow class).

This approach of deriving SWE from snow depth is not new: NRCS instructions from the 1970s for converting aerial marker snow depths into SWE (Davis et al. 1970) incorporate several of the basic ideas presented here, though in a rule-based form. What is new is our use of snow climate classes to refine bulk density estimates, though the work of McKay and Findlay (1971), Wilks and McKay (1996), and Sturm and Holmgren (1998) presaged this part of the approach. The addition of the climate classes broadens the area over which estimates can be derived and improves the estimates.

The methodology applied here (using historical data to develop statistically based methods for estimating bulk density) has already been applied to a single snow class (Alpine snow: see Jonas et al. 2009) where snow depth, altitude, and DOY were used as input to a within-class statistical model. The same principle, that bulk density is conservative, served as the basis for this local study and ensured that the approach worked.

The ultimate value of the method is that it could potentially improve local, regional, and global estimates of snow resources at a time when budgets for operating traditional measurement monitoring networks have become more difficult to obtain and sustain. Using the model increases efficiency and cost effectiveness in snow monitoring by leveraging depth measurements (which can be taken 20 times as fast as SWE) into SWE. Dickinson and Whitely (1972), Steppuhn (1976), and Rovansek et al. (1993) have all proposed schemes that in some way do this, though here we have made the application of the method easier. Steppuhn (1976), writing more than 30 years ago, stated that:

“…statistically valid water equivalents [could] be derived from sampling where depth measurements outnumber[ed] those for density four-fold or more.”

We could not agree more, and put our model forward as a potential way to do this.

## 8. Conclusions

Here we present a method of converting snow depths to snow water equivalents (SWE). Using as input data snow depth, day of year, and station location (from which can be inferred the climate class of snow following Sturm et al. 1995), estimates of the bulk density of the snow are derived using a statistical model. The model was trained on historical depth–density–SWE data from the United States, Canada, and Switzerland, then tested on an extensive set of independent data from Canada and a second set from the western United States. The method works well because there is a satisfactory statistical relationship between bulk density and snow depth, time of year, and climate class of snow. It also works because density is a conservative variable that in nature is constrained between fairly narrow limits, while snow depth varies over a much greater range. Consequently, estimates of bulk density derived from historical data generally fall close to measured values. In a test against extensive Canadian data, 90% of the model-derived SWE estimates fell within ±8 cm of measured values, with relative errors in SWE not much higher than the range of SWE that would be encountered at a single site due to local real variability. The model is easy to apply and requires simple input data, making it useful in local to global snow water resource applications.

Recent efforts using remote sensing to monitor snow water resources have met with limited success. For example, direct satellite remote sensing methods for determining SWE are still not reliable enough (Cline et al. 1998; Mote et al. 2003) for tracking local, regional, or global snow mass or volume. As an alternative, our ability to measure snow extent has led to hybrid procedures where SCA has been used with a melt model to compute water volume (Cline et al. 1998; Molotch and Margulis 2008; Molotch 2009). In light of these limitations in our ability to measure and monitor local to global snow water resources, there remains a critical need for more station measurements of SWE. The method described here allows reasonable estimates of SWE to be made from simple observations of snow depth.

## Acknowledgments

We express our thanks to several generations of NRCS, Canadian, and Swiss snow researchers, unknown to us by name, who painstaking collected the data used here. In our research programs, we thank Jon Holmgren, Eric Pyne, April Cheuvront, Wylie Bogren, Chris Polashenski, Paul Heflinger, Arvid Silis, Peter Toose, Andrew Rees, and others for the depth and SWE measurements they made in the field. James Sickman and John Melack provided snow data from Tokopah Basin. Mark Williams provided data from Niwot Ridge in Colorado. CLPX data were provided by Don Cline and Kelly Elder. Noah Molotch suggested the inclusion of Table 5 as a second test of model results. Ron Barry helped develop the Bayesian model. Ross Brown generously provided the test dataset. This work was supported by the U.S. Army Cold Regions Research and Engineering Laboratory and by the National Science Foundation, Office of Polar Program, Grants 0629279, 0632398, and 0632131.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**(

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

## Footnotes

*Corresponding author address:* Dr. Matthew Sturm, USA-CRREL-Alaska, P.O. Box 35170, Ft. Wainwright, AK 99712. Email: msturm@crrel.usace.army.mil