## 1. Introduction

With the continued growth in world population and industrial and commercial productivity, demands on global water resources have increased greatly. For effective water resources management there is a need to accurately quantify the various components of the hydrological cycle at different space and time scales. Snow is a renewable water resource of vital importance in large portions of the world and is one of the major hydrological cycle components. It is also a major source of water storage and runoff for many parts of the world. For example, in the western United States snow contributes over 70% of total water resources. To better predict snow storage and detect trends in the variations of water resources, accurate snowpack information with known error characteristics is necessary.

Traditionally, rulers, fixed snow stakes, and snow boards are used to measure the snow depth (SD) at a point. In general, point measurements of SD produce high quality data representative of a small location (<10 m scale length). To monitor SD in a temporally and spatially comprehensive manner, optimum interpolation of the points must be undertaken (Brasnett 1999; Brown et al. 2003). However, the spatial representativity of point measurements in a basin or at larger scale is uncertain (Atkinson and Kelly 1997). Furthermore, the spatial density of SD measurements in most parts of the world is rather low. Thus, the accuracy of spatially integrated point measurements of SD needs to be assessed carefully.

Spaceborne scanning microwave sensors, which cover a wide swath and can provide rapid repeat global coverage, are ideally suited to augmenting global snow measurements. For example, passive microwave radiometers such as the Scanning Multichannel Microwave Radiometer (SMMR) on *Nimbus-7* and *Seasat-A* and the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave Imager (SSM/I) have been utilized to retrieve global SD. To assess the representativity of satellite-derived SD, it is necessary to determine how and whether the point SD measurements can be compared with the spaceborne-derived SD, which typically represents an area of about 25 × 25 km^{2}.

The uncertainties in point and areal SD measurements of natural snowpacks need to be understood if comparisons are to be made between point SD measurements and satellite-derived SD. The statistical variability of the snow depth, as represented by the variogram, has a direct effect on the accuracy of SD derived from satellite data. Consequently, it is essential that the magnitude and cause of any variability are clearly defined for robust global validation of satellite-derived SD. In this paper we use sparsely distributed SD data from the National Weather Service (NWS) cooperative station network and SSM/I-derived SD data to study large-scale snow distribution.

To understand the snow-distribution characteristics from ground-measured SD, it is necessary to know the density of point SD measurements and the defined SD areal accuracy. Geostatistical analysis can be used to gain a better understanding of the spatial variability of snow depth in large areas, such as the northern Great Plains. In large portions of the world the spatial density of point-measured SD is less than 1 per 10 000 km^{2} (approximately the area of 1° latitude by 1° longitude). The aims of this study are to understand and quantify statistically the uncertainties associated with sparse sampling of SD over a regional scale, and to determine how these uncertainties affect the validation of global SD derivations from satellite observations at a local to regional scale. In short, the study asks how well remote sensing–derived SD can be validated by current ground-measured point SD data, specifically:

How well does ground-measured SD compare with satellite-derived SD?

What are the characteristics of snow spatial distribution?

What are the sampling criteria for making ground snow measurements so they can be used to validate satellite-derived SD, given a predefined accuracy requirement?

Throughout the paper we refer to ground SD, which refers to the measurement of snow depth at a point made by a temporary ruler or permanent ruled staff. The snow depth is the accumulated vertical thickness of snow from the ground to the snow–air interface at any given moment.

## 2. Background

Any remote sensing technique that can estimate accurately snow storage is of great benefit for global water cycle research and water resources applications. With spaceborne satellite sensor data, global snow measurements can be achieved. Spaceborne sensors can image the earth with spatial resolutions varying from tens of meters (e.g., visible and infrared spectrometer; synthetic aperture radar) to tens of kilometers (e.g., passive microwave radiometers). Visible and infrared sensor applications to snow are limited to clear-sky occurrences and are sensitive only to snow surface properties, while passive microwave sensors are solar illumination independent and are sensitive to snow volume properties. Both remote sensing approaches have been used to monitor snow cover areas. With the improvement in satellite instrumentation, regional and local scales can now be mapped effectively. Passive microwave sensors have been used to monitor continental-scale snow cover area extent in the Northern Hemisphere for several years (Chang et al. 1987). However, passive microwave retrieval methods of snow water equivalent (SWE) and/or SD are less mature than visible (VIS)/IR sensor mapping approaches and often result in large uncertainties from retrievals at the global scale (e.g., Armstrong and Brodzik 2001, 2002).

Microwave brightness temperature measured by spaceborne sensors originates from radiation from 1) the underlying surface, 2) the snowpack, and 3) the atmosphere. The atmospheric contribution is usually small at microwave frequencies and can be neglected over most snow-covered areas, especially at higher latitudes. In this paper, therefore, we neglect the atmospheric effects when extracting snowpack parameters from satellite data. Snow crystals within snowpacks are effective at scattering upwelling microwave radiation, and the microwave signature of a snowpack depends on both the number of scatterers and their scattering efficiency. The degree of scattering is frequency dependent, with higher frequency (shorter wavelength) radiation scattered more than lower frequency (longer wavelength) radiation. The deeper the snowpack, the more snow crystals there are available to scatter microwave energy away from the sensor. Hence, microwave brightness temperatures are generally lower for deep snowpacks, with a larger number of scatterers, than they are for shallow snowpacks, with fewer scatterers (Matzler 1987; Foster et al. 1997). The scattering effect increases rapidly with effective snowpack grain size; for example, when a depth hoar layer is present. Such large snow crystals often develop in thin snowpacks that are subject to cold air temperatures at the snow surface and a large thermal and vapor gradient through the pack. This can result in very strong signals from thin snowpacks, as observed by Josberger et al. (1996) in a comparison of microwave observations and snowpack observations from the upper Colorado River basin. Based on radiative transfer theory, Chang et al. (1987) successfully developed a method to derive SWE using SMMR observations. SSM/I data have been used routinely to infer the SWE in prairie and boreal forest regions of Canada (Goodison and Walker 1995; Goita et al. 2003). Derksen et al. 
(2002) found that the time series of SSM/I SWE remains within 10 to 20 mm of surface measurements in the Canadian prairie. Walker and Silis (2001) reported derived snow cover variations over the Mackenzie River basin. Their algorithm was tested using “ground truth” in situ data and shows that the inferred SWE estimates generally underestimate the measured SWE by 10 to 30 mm. The derivation of an accurate algorithm is complicated by the snow crystal metamorphism that occurs through the winter. To model this effect, Josberger and Mognard (2002) and Mognard and Josberger (2002) developed an algorithm for the U.S. northern Great Plains that includes a proxy for crystal growth based on air temperature. Kelly et al. (2003) coupled a spatially and temporally varying empirical grain growth expression with a radiative transfer model to derive SD in the Northern Hemisphere. All of these results encourage us to study further the interaction of microwaves with snow parameters to derive a validated algorithm with known errors.

Ideally, it is recognized that SWE is closely related to the volume of snowpack stored in a basin. However, global SWE datasets are not available; rather, SD is the quantity that is recorded at many weather station locations. Snow depth is measured at a point, usually from a ruler or snow board. In the high latitudes the distributions of liquid precipitation (rain) and solid precipitation (snow) are very similar. Precipitation gauges also have been used to record accumulated snowfall, although such devices are often subject to large uncertainties (Yang et al. 1998, 1999, 2000). Precipitation gauges are also sparsely distributed around the globe with large regional variations in spatial density. For example, in Germany there may be three to five stations in a 25 × 25 km^{2} area (Rudolf et al. 1994). In the United States there are some areas with one station in 25 × 25 km^{2}, while in some areas of Russia, for example, there is typically only one station in an area of 100 × 100 km^{2}. Snow courses provide more detailed measurements of snow parameters located at discrete sites along a defined transect; however, they are even sparser in occurrence. Thus, with ground SD data more readily available for comparison with satellite derivations, ground SD measurements are the prime validation source used in this study. Also, since SD is the most widely measured variable, we use the SD form of the microwave retrieval algorithm from Chang et al. (1987) in this study.
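As a rough illustration of what an SD-form retrieval looks like, the sketch below applies a linear brightness temperature difference relation of the kind commonly attributed to Chang et al. (1987). The coefficient of 1.59 cm K^{-1}, the choice of the horizontally polarized 19- and 37-GHz channels, the clamping of negative differences to zero, and the function name are all assumptions that are not stated in this section; they are included only to make the idea concrete.

```python
def chang_snow_depth_cm(tb19h_k, tb37h_k, coeff_cm_per_k=1.59):
    """Snow depth (cm) from a 19H minus 37H brightness temperature difference (K).

    coeff_cm_per_k is an assumed value, not one quoted in this paper.
    """
    diff_k = tb19h_k - tb37h_k
    # Assumption: a non-positive difference carries no scattering signal,
    # so the pixel is reported as snow free rather than as negative depth.
    return max(0.0, coeff_cm_per_k * diff_k)


# Hypothetical brightness temperatures for one EASE-Grid pixel
depth_cm = chang_snow_depth_cm(250.0, 230.0)
print(round(depth_cm, 1))
```

The linear form makes the dependence on scattering explicit: a larger difference between the low- and high-frequency channels implies more scatterers and therefore a deeper estimated snowpack.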

## 3. Snow field descriptions and data used in the study

The northern Great Plains study region covers a geographical area from 42° to 49°N and 91° to 104°W. The test area is about 800 000 km^{2}. This encompasses the states of North Dakota, South Dakota, and Minnesota. The geomorphology of this area is rather homogeneous. For example, the Roseau River in Minnesota and Manitoba, Canada, is a typically small basin that flows into the Red River. The Roseau basin has relatively low relief (<500 m) with a mixture of cropland and forests (hardwoods and conifers). Recently Josberger et al. (1998) reported a comparison of the satellite and aircraft remote sensing snow water equivalent estimates in this region. They found that in this prairie ecosystem passive microwave observations could be used to derive SWE. This area, therefore, is ideal for studying the characteristics of ground SD measurements and microwave-derived snow depths.

Snow depth retrievals were performed using observations from SSM/I instruments aboard DMSP *F-8*, *F-11*, and *F-13* platforms. Ground snow depth measurements, archived by the NWS and obtained from the cooperative network of observers, were collected for the study region. These ground measurements consist of daily weather observations of temperature, precipitation, snowfall, and snowpack thickness at more than 351 stations in the area, although this number varies from year to year. Typically, the snow depth information is collected daily but with a long time lag before the data become available. Cooperative station data for 12 February each year from 1988 to 1997 were used in the analysis. Figure 1 shows the location of the ground SD data within the northern Great Plains (NGP) study region. All data were georeferenced to the Equal-Area Scalable Earth Grid (EASE-Grid). The SSM/I data were obtained from the National Snow and Ice Data Center (NSIDC) in 25 × 25 km^{2} EASE-Grid projection (Armstrong and Brodzik 1995). For the SSM/I, with the exception of 1994, our analysis focused on the mean of 3 days (10–12 February) for each of the 10 yr from 1988 to 1997. This averaging process ensured complete coverage of the study region for the selected date. In 1994, because of incomplete SSM/I coverage, the 3-day mean was shifted to 27–29 January. Only morning SSM/I passes were used in the analysis (approximately 0400–0600 local time).

The NWS cooperative station data can take some time to become available. This time frame represents a 10-yr period when paired observations of SSM/I data (launched in 1987) and NWS cooperative station data were available. These 3 days were chosen since they represent a time in winter when the snowpack is potentially at its most stable and extensive, with minimal liquid water content. Figure 2 shows five time series plots of mean daily minimum air temperature and mean daily snow depth from 1 January to 28 February at the five selected cooperative stations identified in Fig. 1. The data demonstrate that not only were the mean daily minimum temperatures well below 0°C, at this time of year the standard deviation of snow depth was generally small, as shown by the error bars in the snow depth time series. By undertaking the analysis for the same time in consecutive years, potentially consistent biases in the data (either satellite or ground) could be identified. It was also decided that the analysis would focus only on midwinter snow packs. It is recognized that SSM/I-derived SDs in the early season underestimate field-measured conditions because snow is often highly discontinuous in space and time (Armstrong and Brodzik 2001). Also, in late winter and early spring, the presence of melt–refreezing events and snow that contains free liquid water is known to be a very challenging environment for SD estimates from passive microwave instruments (Stiles and Ulaby 1980). To investigate instances when confidence was high that only relatively stable and dry snowpacks were present, therefore, this research is concerned with SD estimates derived from passive microwave instruments during midwinter conditions only (January and February).

## 4. Data analysis

### a. Comparisons of paired ground snow depth measurements and passive microwave snow depth estimates

Statistical analysis of paired ground SD and SSM/I SD for each year (from 1988 through 1997) and for the 10-yr composite shows that the SSM/I estimates generally compare well with the ground SD measurements. Table 1 gives summary statistics for each year plus composite means. The mean ground SD is highly variable from year to year (1.5 to 45.4 cm). In the 10-yr period, there are 3 yr when SD is less than 10 cm, 5 yr when SD is between 10 and 30 cm, and 2 yr when SD is greater than 30 cm. The corresponding range of SSM/I-derived mean SD (1.7 to 43.4 cm) is similar to the ground SD measurements, with the total composite mean SD from SSM/I estimates almost identical to that of the ground data. The correlation between yearly mean ground SD and SSM/I SD values for the period is 0.82.

A statistical test, the paired *t* test, was used to determine whether or not there were significant differences between the ground and SSM/I-derived SD. The paired *t* statistic (*t*) is defined as

$$t = \frac{\mu}{\sigma / \sqrt{N}}, \tag{1}$$

where *μ* and *σ* are the mean and standard deviation of the paired differences of the two variables (ground SD and SSM/I SD), and *N* is the number of data pairs (McClave and Dietrich 1979). For *N* > 30 (in this case *N* > 250), *t* follows approximately a normal distribution. The hypothesis of difference is rejected if |*t*| < 1.96 at the 95% confidence level. The paired *t* statistics are included in Table 1. Inspection of the *t* values shows no systematic pattern from year to year. There are 8 yr with |*t*| > 1.96 and 2 yr (1990 and 1997) with |*t*| < 1.96. For those years where |*t*| is larger than 1.96, there is a significant difference between the ground SD and SSM/I-derived SD at the 95% confidence level. The paired *t*-test value is –1.13 for the composite 10-yr dataset. The mean difference between the ground SD and satellite SD is 0.4 cm. However, the standard deviation of the difference is 18.7 cm, which is slightly larger than both the accumulated ground and SSM/I snow depth means (17.9 and 18.3 cm, respectively). Overall, for half of the years the SSM/I-derived SDs are less than the ground SDs, and for the remaining 5 yr they are more than the ground SDs. No clear explanation for this feature could be found from these bulk statistics. It is possible, however, that snowpack stratigraphy variations at the local scale (and within the 25 × 25 km^{2} SSM/I pixels) have an important effect on microwave responses from the snow. While we assume in the retrieval algorithm that grain size and density are constant, spatial and temporal variations in snow stratigraphy can play a significant role in attenuating the upwelling radiation (Matzler 1987). Changes in the vertical properties of the snowpack are generally caused by thermal and vapor gradients through the snowpack along with the surface melt and refreezing of water (Colbeck 1982). Unfortunately, no information on these properties is available in the data used for this research. It is probable, however, that variations in density and grain size contributed to variations in SSM/I-derived SD in addition to variations in SD accumulation.
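The paired comparison described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual form of the paired statistic (the mean of the paired differences divided by its standard error); the function name and the five sample pairs are hypothetical and are not values from Table 1.

```python
import math
import statistics


def paired_t(ground_cm, satellite_cm):
    """Paired t statistic for two equal-length lists of snow depths (cm)."""
    diffs = [g - s for g, s in zip(ground_cm, satellite_cm)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of differences
    return mean_diff / (sd_diff / math.sqrt(n))


# Hypothetical ground and SSM/I snow depths at five paired locations
ground = [12.0, 20.0, 5.0, 30.0, 18.0]
satellite = [10.0, 23.0, 6.0, 26.0, 17.0]

t_value = paired_t(ground, satellite)
# The 1.96 threshold is the large-sample (N > 30) normal approximation;
# with only five pairs it is used here purely to show the decision rule.
significant = abs(t_value) > 1.96
print(round(t_value, 3), significant)
```

With several hundred pairs per year, as in this study, even a small mean difference can produce |*t*| > 1.96 because the standard error shrinks with the square root of *N*.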

Measurement error is an important factor explaining why ground and satellite data have different statistical characteristics. Snow depth measured at a point reflects snow accumulation subject to local microscale processes, while SSM/I-derived SD reflects mean snow conditions, subject to controls at the local to regional scale. For example, wind speed is the most important environmental factor contributing to the undermeasurement of snow at a point (Goodison et al. 1989). In the NGP region, high wind speeds are common and cause a systematic bias. At cooperative stations, snow rulers or snow boards are typically used to measure SD. To obtain a representative SD measurement under snow drifting or snow scouring conditions, careful judgment by the observer is required. Assuming that such processes occurred with equal frequency over the study domain, a varying positive (drifting) or negative (scouring) bias from year to year might also help to explain why for half of the years the bias is positive and the other half negative.

### b. An assessment of the spatial variation of ground-measured and passive-microwave-derived snow depth

Large spatial and temporal variations exist in global and local snow cover extent and volume (Frei and Robinson 1999). The errors associated with these variations are not well understood, although understanding them is important for better climate observation. It is therefore necessary to better understand the spatial characteristics of SD at different scales. Jacobson (1999) defined five spatial-scale lengths of weather parameters: planetary scale (>10 000 km), synoptic scale (500 to 10 000 km), mesoscale or regional variation (2 to 2000 km), microscale (2 mm to 2 km), and molecular scale (<2 mm). In the NGP study region, we are concerned with the characterization of snow distribution at the microscale and mesoscale.

From the *t*-test values of the previous section, mesoscale (SSM/I) and microscale (ground) comparisons of snow depth revealed that for 8 out of the 10 yr, significant differences existed between these two datasets. However, when the data were aggregated over a longer time period (10 yr), the |*t*| value was less than 1.96, suggesting that when averaged over successive years the two datasets are not significantly different. To further understand these characteristics, analysis of the spatial variability of the two datasets was undertaken.

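The semivariance (*γ*) may be defined as half the expected squared difference between the random functions *Z*(*x*) and *Z*(*x* + *h*) at a particular lag *h*. The semivariogram (hereafter referred to as the variogram), defined as a parameter of the random function model, is then the function that relates semivariance to lag:

$$\gamma(h) = \tfrac{1}{2} E\left\{[Z(x) - Z(x + h)]^{2}\right\}. \tag{2}$$

The variogram *γ*(*h*) can be estimated for *p*(*h*) pairs of observations or realizations, [*Z*(*x*_{l}), *Z*(*x*_{l} + *h*)], *l* = 1, 2, . . . , *p*(*h*), by

$$\hat{\gamma}(h) = \frac{1}{2p(h)} \sum_{l=1}^{p(h)} \left[Z(x_{l}) - Z(x_{l} + h)\right]^{2}. \tag{3}$$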

A mathematical function or model is usually fitted to the experimental values, which are discrete, to represent the true variogram of the region, which is continuous. The experimental values are often erratic because they are subject to error. In general, the variogram model is either unbounded (increases indefinitely with lag) or bounded (increases to a maximum value of semivariance, known as the sill, at a finite positive lag, known as the range *a*). The sill is equal to the a priori variance (that defined for an infinite region) of the random function (RF), while the range indicates the limit to spatial dependence, beyond which data are statistically uncorrelated. Often the model approaches and intercepts the ordinate at some positive value of semivariance known as the nugget variance *c*_{0}. The nugget variance results from measurement error (Atkinson 1993), the uncertainty in estimating the variogram from a sample, the uncertainty in model fitting, and spatially dependent variation acting at scales finer than the sampling interval. The structured component of variation *c*_{1} is then the sill minus the nugget variance, so that *c*_{0} + *c*_{1} = sill.
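A minimal sketch of the experimental variogram and of one bounded model is given below. It assumes planar coordinates in kilometres, lag classes centred on multiples of the lag width, and the standard spherical model form with nugget *c*_{0}, structured component *c*_{1}, and range *a*. The function names and the three sample points are illustrative only, and this is not the GSTAT implementation.

```python
import math


def empirical_variogram(points_km, values, lag_width_km, max_lag_km):
    """Return {lag: semivariance}, pooling pairs into the nearest lag class."""
    n_bins = int(max_lag_km // lag_width_km)
    sums = [0.0] * (n_bins + 1)
    counts = [0] * (n_bins + 1)
    for i in range(len(points_km)):
        for j in range(i + 1, len(points_km)):
            h = math.dist(points_km[i], points_km[j])
            k = int(h / lag_width_km + 0.5)  # index of the nearest lag class
            if 1 <= k <= n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Half the mean squared difference in each non-empty lag class
    return {k * lag_width_km: sums[k] / (2 * counts[k])
            for k in range(1, n_bins + 1) if counts[k]}


def spherical_model(h, nugget, structured, range_a):
    """Spherical variogram: rises from the nugget to the sill at h = range_a."""
    if h <= 0:
        return 0.0
    if h >= range_a:
        return nugget + structured  # sill = c0 + c1
    r = h / range_a
    return nugget + structured * (1.5 * r - 0.5 * r ** 3)


# Three hypothetical stations on a line, 25 km apart, with snow depths in cm
gamma = empirical_variogram([(0, 0), (25, 0), (50, 0)], [10.0, 12.0, 16.0], 25.0, 50.0)
print(gamma)
```

The double loop over all pairs is quadratic in the number of points, which is acceptable for a few hundred stations but would need spatial indexing for every pixel of a large grid.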

For each dataset, variograms were computed for the ground and SSM/I data using GSTAT software (Pebesma and Wesseling 1998). As an example, Fig. 3 shows the variograms with spherical models of SD fitted for the ground data and the SSM/I data for 1988. For the variograms of ground measurements, the variograms were calculated using the sparsely distributed ground data for each year. The experimental variograms were, therefore, not always smooth in definition. In the case of the SSM/I data, all pixels in the study region were used for each year to produce the variograms. Therefore, the variograms are “smooth” in character. Authorized models were fitted to all experimental variograms using a least squares criterion (Pebesma and Wesseling 1998) with the exception of the ground data for 1990 and 1991 when the experimental variograms were unbounded. The reason for the lack of structure for these 2 yr is probably because there was so little snow accumulated at the stations (means of 1.8 and 1.5 cm). These means are substantially composed of 0-cm measurements such that very little spatial variation was present. For two datasets (1996 ground and 1997 SSM/I), data were detrended using first-order polynomials; this was because a trend in the data produced unbounded variograms. Unlike the 1990 and 1991 data for the ground SD measurements, appreciable snow accumulation was present in both 1996 and 1997, and clear-direction snow accumulation gradients were present (NE to SW in the case of the 1996 ground data and NW to SE for the SSM/I-derived SD). These directional trends were caused by the presence of synoptic-scale variations of snow depth, which can be present at the large regional scale. As stated earlier, this research aims to test passive-microwave-derived SDs, which are calculated at the local to regional scale, so larger synoptic-scale variations are considered an unwanted trend in the data. 
Figure 4 shows the SSM/I-derived SD for 1997 and kriging interpolation of ground-measured snow depth for 1996, based on the variogram derived using the method below. The SSM/I-derived SD data reveal a northwest to southeast trend, while the ground-measured data reveal a southwest to northeast trend of snow depth.

The main parameters of interest for the comparative analysis were the nugget variance and the range. Variograms were computed and estimated to a maximum lag of 1000 km at 25-km lag separations (the support of the SSM/I data). Spherical variogram models were fitted to the variograms using the weighted least squares criterion. Table 2 shows the model ranges and nugget variances for the ground and the satellite snow depth datasets. The mean values are also shown. Note that the SSM/I means are area integrated and are different from the means of the ground-measured SD, which are the means of sparsely distributed snow depth measurements. The variograms for the SSM/I and ground SD data for each of the 10 days show some broad similarity with respect to the mean snow depths for each year. For example, the nugget variance increases with increasing mean snow depth for both datasets, suggesting that representation of microscale effects of snow distribution is not possible for thicker snowpacks. The range decreases with increased mean snow depth in both ground and SSM/I datasets, also suggesting that snow depth variability is smaller over short distances only when the snow is thick. For shallower snowpacks, the spatial variability is small over comparatively larger distances.

With respect to differences in variogram structure between ground- and satellite- derived snow depth data, four variogram pairs (SSM/I-derived and ground-measured) have range differences less than 200 km, one pair has a range difference between 200 and 300 km, two have differences between 300 and 400 km, and one pair has a difference between 400 and 500 km. Furthermore, applying the paired *t* test to the SSM/I and ground SD variogram range data in Table 2 gives a |*t*| value of 0.32, and the critical *t* value for a sample of 8 is 2.37. These results suggest that there is an overall general agreement between the spatial correlation length of SD derived from the SSM/I retrievals and ground measurements for the 10-yr period. This agreement reflects the fact that in both datasets the range tends to be larger for shallower snowpacks while it is smaller for thicker snowpacks. However, it should be noted that this is a generalized pattern and that there are differences in spatial correlation lengths between ground SD measurements and SSM/I SD estimates at the interannual scale. The range differences are probably caused by the differences in snow accumulation processes acting at the local scale (e.g., snow redistribution) that affect point measurements made by a ruler and those acting at a local to regional scale (e.g., development of complex stratigraphy) that affect the snowpack microwave responses.

### c. Error analysis and determination of the required sampling procedures of ground snow depth measurements

Before determining how many point snow depth measurements are needed to achieve a specified SD areal accuracy, it is necessary to understand the error characteristics of satellite-derived SD and ground SD measurements. Data from both SD sources are subject to systematic and nonsystematic or random errors. For ground SD measurements, there might be both systematic and random errors associated with each measurement. Systematic errors from ground-measurement data are attributed to the situation of the ruler and its representativity of the local conditions (Goodison et al. 1981). Ideally, several measurements are needed to produce a representative sample, but such information is usually not available in the cooperative data archive so that inferences about ground systematic errors cannot be made. For satellite SD data, both systematic and random errors are associated with the retrieval algorithm and are referred to as retrieval errors (Bell et al. 1990). Systematic biases are known to exist in relation to vegetation cover and snowpack parameterization of the algorithms. While the quantification of these errors is the focus of ongoing studies (e.g., see Derksen et al. 2003), in general, the error biases are consistent as confirmed by the results in Table 1. For the satellite SD derivations, therefore, we assume a constant systematic error because the study location and the study date each year are constant. In this study, therefore, the error term in both datasets that we use to determine the accuracy achievable from a predefined number of ground SD points is the random error term of the total error.

Let the ground-measured and satellite-derived SD estimates be denoted as *g* and *s*, respectively, and it is assumed that there are random errors associated with these estimates. Thus, we can write

$$g = \langle g \rangle + e_{g}, \tag{4}$$

$$s = \langle s \rangle + e_{s}, \tag{5}$$

where *e*_{g} and *e*_{s} are the random errors associated with independent SD estimates of the ground and satellite variables. Assuming that the estimates are unbiased with uncorrelated errors, *e*_{g} and *e*_{s} contain errors due to ground point sampling and satellite retrievals from the ensemble averaging. We can express the mean square difference of ground measurements and satellite estimates as

$$\langle (g - s)^{2} \rangle = (\langle g \rangle - \langle s \rangle)^{2} + \langle (e_{g} - e_{s})^{2} \rangle. \tag{6}$$

For the 10-yr dataset (1988–97), 1° × 1° grid cells of latitude and longitude (approximately 10^{4} km^{2}) were used as the study framework to investigate the random errors. The center of each grid cell was located at every half degree of latitude and longitude with the origin of each grid cell located at the southwest corner. All grid cells were classified according to the number of point ground SD measurements within the cell, a measure of the spatial density of SD sites. To ensure that no forest cover was included in the analysis, using a map of forest cover identified in Foster et al. (1997), any 1° × 1° grid cell containing a forest fraction greater than 1% was discarded. The frequency distribution of sites per cell varied from 1 SD site to 10 SD sites per cell and is plotted in Fig. 5. The most frequent category was three SD sites per cell, which had 159 cells. Typically 22 SSM/I SD retrieval points were located within each 1° × 1° grid cell and linearly averaged to produce a cell mean.
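The cell classification can be sketched as follows. The sketch assumes that cells are aligned on whole degrees, so that each centre falls on a half degree and each cell is identified by its southwest corner. The forest-fraction screening is omitted, and the function names and station coordinates are hypothetical.

```python
import math
from collections import Counter


def cell_of(lat_deg, lon_deg):
    """Southwest corner (whole degrees) of the 1 x 1 degree cell holding a site."""
    return (math.floor(lat_deg), math.floor(lon_deg))


def sites_per_cell(stations):
    """Map each occupied cell to the number of ground SD sites it contains."""
    return Counter(cell_of(lat, lon) for lat, lon in stations)


def cells_by_site_count(stations):
    """Frequency distribution: number of cells having 1, 2, 3, ... sites."""
    return Counter(sites_per_cell(stations).values())


# Hypothetical station coordinates (latitude, longitude; west is negative)
stations = [(46.2, -97.5), (46.8, -97.1), (46.5, -98.3), (47.1, -97.9)]
print(sites_per_cell(stations))
print(cells_by_site_count(stations))
```

Using `math.floor` rather than `int` matters for western longitudes, because truncation toward zero would assign a site at 97.5°W to the wrong cell.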

Table 3 lists, for each category of *N* ground SD sites per cell, the mean and standard deviation of ground SD and satellite SD, the mean difference between the ground SD and satellite SD (mean difference = 〈*g*〉 − 〈*s*〉) and the standard deviation of the differences, and the root-mean-square of the difference (rmsd) between ground and satellite SD [rmsd = 〈(*g* − *s*)^{2}〉^{1/2}]. By rearranging Eq. (6) above, the total error (〈(*e*_{g} − *e*_{s})^{2}〉^{1/2}) was calculated and is also shown in Table 3. The total error can also be written as

$$\langle e^{2} \rangle = \langle e_{g}^{2} \rangle + \langle e_{s}^{2} \rangle. \tag{7}$$

The paired *t* statistic of the means was also computed.

Overall, in Table 3, the mean difference and rmsd between ground and satellite SD were 1.1 and 16.0 cm, respectively. The mean difference between ground SD and SSM/I SD for each category is relatively small. From the paired *t* test, none of the |*t*| values were greater than 1.96, suggesting there was no significant difference between the mean ground and satellite-derived SD values. Additionally, the standard deviation of the mean difference (16.1 cm) is about the same as the mean of satellite and ground SD for each category (18.2 cm). An important characteristic of these data is that the rmsd decreases as the number of ground SD sites per cell increases, suggesting that the number of sites within a cell might influence the estimated random error. The total error decreases from 20.4 cm for one ground SD site to 10.0 cm at nine sites, supporting the possibility that the number of SD sites might influence the estimated error.
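The summary quantities discussed here are related by a simple identity, sketched below under the assumption that the mean square difference splits into the squared mean difference plus a random-error term. Under that assumption the random part equals the population standard deviation of the cell differences. The function name and the sample cell means are hypothetical and are not values from Table 3.

```python
import math
import statistics


def difference_stats(ground_cm, satellite_cm):
    """Return (mean difference, rmsd, random error) for paired cell means (cm)."""
    diffs = [g - s for g, s in zip(ground_cm, satellite_cm)]
    mean_diff = statistics.mean(diffs)
    rmsd = math.sqrt(statistics.mean([d * d for d in diffs]))
    # Assumed decomposition: rmsd^2 = mean_diff^2 + random_error^2
    random_error = math.sqrt(max(0.0, rmsd ** 2 - mean_diff ** 2))
    return mean_diff, rmsd, random_error


# Hypothetical cell-mean snow depths for one site-count category
ground = [12.0, 20.0, 5.0, 30.0, 18.0]
satellite = [10.0, 23.0, 6.0, 26.0, 17.0]
mean_diff, rmsd, random_error = difference_stats(ground, satellite)
print(round(mean_diff, 2), round(rmsd, 2), round(random_error, 2))
```

When the mean difference is small relative to the rmsd, as in the overall figures quoted above (1.1 versus 16.0 cm), the rmsd and the random error are nearly identical.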

Figure 6 shows a series of graphs of SSM/I-derived snow depth, averaged for each 1° × 1° grid cell plotted against the average snow depth from ground measurements. Each graph represents data for each year. The size of circles represents the number of ground samples per cell for each year, and cells with more than five ground measurements are shaded gray. The diagonal line represents 100% agreement between SSM/I and ground data. For 1988, 1989, and 1996, there appears to be good agreement between datasets especially for cells with more than five ground measurements per grid cell. The unshaded circles, which have less than six sites per 1° × 1° grid cell, tend to be located farther away from the line of agreement. For 1992, 1994, and 1995, the biases indicated by the mean snow depth data in Table 1 are demonstrated; for 1992 and 1995, there is an overestimation of snow depth by the SSM/I, and in 1994 there is an underestimation. For the underestimation bias in 1994, the smaller (open) circles tend to be located within the complement of larger circles, while for the overestimation years (1992 and 1995) the smaller circles are more randomly distributed in the graphs. For 1993, the SSM/I tends to overestimate the ground-measured snow depth, although the grid cells with a greater density of ground measurements tend to be clustered. Grid cells with fewer ground sites are more widely dispersed in both graph dimensions. For 1997, the SSM/I overestimates and underestimates the snow depth (which explains why the difference between means in Table 1 is small). In general, Fig. 6 suggests that better understanding of the relationship between SSM/I estimates and ground-measured snow depth data can be obtained if a greater spatial density of ground sites is available to undertake the comparison.

*ɛ*) and the number of samples is of the form *ɛ*^{2} ∼ 1/*n* for precipitation. Rudolf et al. (1994) reported a similar error estimation relationship of the form *ɛ*^{2} ∼ 1/*n*^{1.11} in a 2.5° × 2.5° grid domain of gauge precipitation data. In this snow study we attempt to determine the number of samples required for SD estimates at a 1° × 1° grid domain within a limit of sampling error *ɛ*. To be 95% confident that the true mean is within ±*ɛ* of the observed mean, the number of samples (*N*) required is given by Snedecor and Cochran (1967), where *σ* is the standard deviation of the variable.
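As a rough illustration of a sample-size calculation of this type, the sketch below assumes the textbook normal-approximation form *N* = (1.96*σ*/*ɛ*)^{2}. That expression, the function name `required_samples`, and the numbers in the usage note are assumptions for illustration; the exact form used in the source is not shown here.

```python
import math


def required_samples(sigma, epsilon, z=1.96):
    """Number of samples needed so that the observed mean lies within
    +/- epsilon of the true mean at the confidence level implied by z.

    Assumption: normal approximation, N = (z * sigma / epsilon) ** 2,
    rounded up to the next whole sample. z = 1.96 corresponds to 95%.
    """
    if sigma <= 0 or epsilon <= 0:
        raise ValueError("sigma and epsilon must be positive")
    return math.ceil((z * sigma / epsilon) ** 2)
```

With hypothetical values of *σ* = 10 cm and *ɛ* = 5 cm, `required_samples(10.0, 5.0)` gives 16. Because *N* scales with the inverse square of *ɛ*, halving the tolerated error roughly quadruples the number of samples.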

From Eq. (7), the total error 〈*e*^{2}〉^{1/2} can be calculated by the square root of the sum of ground SD error 〈*e*^{2}_{g}〉 and satellite error 〈*e*^{2}_{s}〉. The ground SD mean error is dominated by ground-site sample configuration, especially the site spatial density. The satellite error is caused by algorithm error and is not directly related to the ground-site spatial density. Since the snowfield of the NGP area is rather uniform (relatively homogeneous low-height vegetation and snowpack properties), it is possible to make the assumption that the satellite algorithm error is probably about the same for all grid cells. By calculating *e*_{s} using Eq. (8) for each grid cell category (number of ground sites per cell in Table 3), the calculated mean *e*_{s} for all categories is 8.8 cm (with a standard deviation of less than 1.0 cm). By incorporating this value with the total error from Eq. (6) (also shown in Table 3), *e*_{g} can be individually estimated for each category using Eq. (8). It is then possible to optimize a model fit using the least squares criterion of *e*^{2}_{g} in proportion to 1/*N*. The result is *e*^{2}_{g} = 466.7/*N*. Figure 7 shows the estimated sampling error for different numbers of ground sites per grid cell and the fitted model. The ground SD error varies from about 20 cm for 1 site per grid cell and decreases to 7 cm for 10 sites per cell. In other words, in order to achieve 5-cm accuracy, more than 10 ground SD measurement sites within a grid cell are required. This is an important outcome since it defines a limitation to the error characteristic as a function of the measurement-site spatial density at this grid cell scale (i.e., 1° × 1° of latitude and longitude).
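The error separation and the 1/*N* fit described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the ground and satellite errors add in quadrature (as stated for the total error), fits a single coefficient with no intercept, and uses hypothetical function names. Only the values 8.8 cm and 466.7 are taken from the text.

```python
import math

E_S = 8.8         # cm, mean satellite error quoted in the text
A_FITTED = 466.7  # cm^2, fitted coefficient quoted in the text


def ground_error_sq(total_error, satellite_error=E_S):
    """e_g^2 = e^2 - e_s^2, assuming the two errors add in quadrature."""
    return total_error ** 2 - satellite_error ** 2


def fit_inverse_n(n_sites, e_g_sq):
    """Least-squares coefficient a for the model e_g^2 = a / N (no intercept).

    Minimizing sum((y - a * x)^2) with x = 1/N gives a = sum(x*y) / sum(x*x).
    """
    x = [1.0 / n for n in n_sites]
    return sum(xi * yi for xi, yi in zip(x, e_g_sq)) / sum(xi * xi for xi in x)


def sampling_error(a, n):
    """Modeled ground sampling error (cm) for n sites per grid cell."""
    return math.sqrt(a / n)


def sites_required(a, target_error):
    """Smallest whole number of sites whose modeled error is <= target_error."""
    return math.ceil(a / target_error ** 2)
```

Using the fitted coefficient from the text, `sampling_error(A_FITTED, 1)` is about 21.6 cm and `sampling_error(A_FITTED, 10)` is about 6.8 cm, while `sites_required(A_FITTED, 5.0)` gives 19. Under this model, 5-cm accuracy therefore needs well over 10 sites per cell, in line with the statement above.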

## 5. Summary and discussion

Ten years of ground SD data were used to evaluate the single SSM/I-footprint-derived SD for northern Great Plains snowfields during midwinter. In the year-to-year comparisons, 8 out of 10 yr had significant differences between ground and SSM/I-derived SD data. The mean ground SD for the 10-yr composite was 17.9 cm with a standard deviation of 21.9 cm, while the SSM/I-derived SD was 18.3 cm with a standard deviation of 17.9 cm. The 10-yr mean difference between ground SD and SSM/I-derived SD was 0.4 cm, which is not statistically significant.

The variograms of ground SD and SSM/I-derived SD were comparable. The absolute difference in geostatistical range between ground SD and SSM/I SD is less than 500 km. In general, the ranges decreased as the snow depth increased, suggesting that correlation lengths are longer for thinner snowpacks and shorter for thicker snowpacks. Also, the nugget variances were larger for thicker snowpacks, suggesting that there is more unresolved variation at each sample point when greater snow accumulations are present.

Comparisons of the 1° × 1° latitude–longitude gridded data showed that the yearly differences between ground SD and SSM/I SD were not significant. The 10-yr composite mean and standard deviation of the ground SD were 17.7 and 19.7 cm, respectively, and those of the SSM/I-derived SD were 18.8 and 16.9 cm, respectively. The mean difference between ground and satellite-derived SD was 1.1 cm and was not significant. The standard deviation of the difference between ground and SSM/I SD was slightly smaller (16.1 cm) than that of the point data comparison (18.7 cm).

For the composite mean of the 10-yr period, this research suggests that at the 1° × 1° grid cell scale, SSM/I data can be used effectively to map snow depth in the NGP area. At the interannual time scale, however, there was both agreement and disagreement between ground and SSM/I-derived SD data. When the spatial density of ground SD measurements is increased, we suggest that snow depth spatial variability can be captured by the SSM/I retrieval algorithm, which has a calculated error of 8.8 cm. In comparing the SSM/I data with ground measurements, the advantage of increasing the number of measurement sites within a grid cell is reported. The modeled sampling error of ground SD measurements is about 22 cm for 1 site, 7 cm for 10 sites, and less than 7 cm for more than 10 sites on a 1° × 1° grid cell domain. This curve relating estimation error to the number of measurement sites per cell shows that for the northern Great Plains area, the sampling error does not decrease quickly, even as the number of ground SD sites approaches 10. In the context of global snow depth estimates, this research demonstrates that it is rather difficult to quantify the global SD accuracy by using only the limited ground SD data where measurement-site density is often less than one per 1° × 1° of latitude and longitude. Perhaps, therefore, the only way to use these spatially limited datasets is to scale up (average) both the passive microwave data and the ground measurements to a grid cell size that is in excess of 1° × 1° of latitude and longitude. Even then, in certain parts of the world where ground data are very sparse, comprehensive validation of passive microwave estimates of snow depth may not be possible without a dedicated ground or aircraft field campaign.

The Advanced Microwave Scanning Radiometer (AMSR) was launched on board the Japanese *Advanced Earth Observing Satellite-II (ADEOS-II)* and the U.S. Earth Observation System (EOS) *Aqua* satellite in 2002. AMSR can provide the best ever spatial resolution multifrequency passive microwave radiometer observations from space [18-GHz channel instantaneous field of view (IFOV) is 27 × 16 km^{2} and 36-GHz channel IFOV is 14 × 8 km^{2}]. This capability provides us with an opportunity to estimate surface snow mass quantities at finer spatial resolutions than have been possible with previous microwave instruments and so is an opportunity to improve snow depth observations both with respect to spatial resolution and accuracy of retrieval. However, for field experiments designed to test satellite observations, the ground-sampling network requires careful planning to ensure snow cover parameters such as SD are accurately measured.

## Acknowledgments

We thank three anonymous referees for their most helpful and insightful comments. We are also grateful to the editor of the *Journal of Hydrometeorology* for his valuable assistance.

## REFERENCES

Armstrong, R. L., and M. Brodzik, 1995: An earth-gridded SSM/I data set for cryospheric studies and global change monitoring. *Adv. Space Res.*, **10**, 155–163.

Armstrong, R., and M. Brodzik, 2001: Recent Northern Hemisphere snow extent: A comparison of data derived from visible and microwave satellite sensors. *Geophys. Res. Lett.*, **28**, 3673–3676.

Armstrong, R., and M. Brodzik, 2002: Hemispheric-scale comparison and evaluation of passive-microwave snow algorithms. *Ann. Glaciol.*, **34**, 38–44.

Atkinson, P. M., 1993: The effect of spatial resolution on the experimental variogram of airborne MSS imagery. *Int. J. Remote Sens.*, **14**, 1005–1011.

Atkinson, P. M., and R. E. J. Kelly, 1997: Scaling-up point snow depth data in the U.K. for comparison with SSM/I imagery. *Int. J. Remote Sens.*, **18**, 437–443.

Bell, T. L., A. Abdullah, R. L. Martin, and G. R. North, 1990: Sampling errors for satellite-derived tropical rainfall: Monte Carlo study using a space–time stochastic model. *J. Geophys. Res.*, **95**, 2195–2205.

Brasnett, B., 1999: A global analysis of snow depth for numerical weather prediction. *J. Appl. Meteor.*, **38**, 726–740.

Brown, R., B. Brasnett, and D. Robinson, 2003: Gridded North American monthly snow depth and snow water equivalent for GCM evaluation. *Atmos.–Ocean*, **41**, 1–14.

Chang, A. T. C., and L. S. Chiu, 1999: Nonsystematic errors of monthly oceanic rainfall derived from SSM/I. *Mon. Wea. Rev.*, **127**, 1630–1638.

Chang, A. T. C., J. L. Foster, and D. K. Hall, 1987: Nimbus-7 derived global snow cover parameters. *Ann. Glaciol.*, **9**, 39–44.

Chang, A. T. C., L. S. Chiu, and T. T. Wilheit, 1993: Random errors of oceanic monthly rainfall derived from SSM/I using probability distribution functions. *Mon. Wea. Rev.*, **121**, 2351–2354.

Colbeck, S. C., 1982: An overview of seasonal snow metamorphism. *Rev. Geophys. Space Phys.*, **20**, 45–61.

Derksen, C., E. LeDrew, A. Walker, and B. Goodison, 2002: Time-series analysis of passive microwave derived central North American snow water equivalent (SWE) imagery. *Ann. Glaciol.*, **34**, 1–7.

Derksen, C., A. Walker, and B. Goodison, 2003: A comparison of 18 winter seasons of in situ and passive microwave-derived snow water equivalent estimates in western Canada. *Remote Sens. Environ.*, **88**, 271–282.

Foster, J. L., A. T. C. Chang, and D. K. Hall, 1997: Comparison of snow mass estimates from a prototype passive microwave snow algorithm, a revised algorithm and snow depth climatology. *Remote Sens. Environ.*, **62**, 132–142.

Frei, A., and D. A. Robinson, 1999: Northern Hemisphere snow extent: Regional variability 1972–1994. *Int. J. Climatol.*, **19**, 1535–1560.

Goita, K., A. Walker, and B. Goodison, 2003: Algorithm development for the estimation of snow water equivalent in the boreal forest using passive microwave data. *Int. J. Remote Sens.*, **24**, 1097–1102.

Goodison, B. E., and A. E. Walker, 1995: Canadian development and use of snow cover information from passive microwave satellite data. *Passive Microwave Remote Sensing of Land–Atmosphere Interactions*, B. Choudhury et al., Eds., VSP Press, 245–262.

Goodison, B. E., H. L. Ferguson, and G. A. McKay, 1981: Measurement and data analysis. *Handbook of Snow: Principles, Processes, Management and Use*, D. M. Gray and D. H. Male, Eds., Pergamon Press, 191–274.

Goodison, B. E., B. Sevruk, and S. Klemm, 1989: WMO solid precipitation measurement intercomparison: Objectives, methodology, analysis. *Atmospheric Deposition*, J. W. Delleur, Ed., IAHS Publication 179, 57–64.

Huffman, G. J., 1997: Estimates of root-mean-square random error for finite samples of estimated precipitation. *J. Appl. Meteor.*, **36**, 1191–1201.

Isaaks, E. H., and R. M. Srivastava, 1989: *Applied Geostatistics*. Oxford University Press, 580 pp.

Jacobson, M. Z., 1999: *Fundamentals of Atmospheric Modeling*. Cambridge University Press, 656 pp.

Josberger, E. G., and N. M. Mognard, 2002: A passive microwave snow depth algorithm with a proxy for snow metamorphism. *Hydrol. Processes*, **16**, 1557–1568.

Josberger, E. G., P. Gloersen, A. Chang, and A. Rango, 1996: The effects of snowpack grain size on satellite passive microwave observations from the upper Colorado River basin. *J. Geophys. Res.*, **101** (C3), 6679–6688.

Josberger, E. G., N. M. Mognard, B. Lind, R. Matthews, and T. Carroll, 1998: Snowpack water-equivalent estimates from satellite and aircraft remote-sensing measurements of the Red River basin, north-central U.S.A. *Ann. Glaciol.*, **26**, 119–124.

Kelly, R. E. J., A. T. C. Chang, L. Tsang, and J. L. Foster, 2003: Development of a prototype AMSR-E global snow area and snow volume algorithm. *IEEE Trans. Geosci. Remote Sens.*, **41**, 230–242.

Matzler, C., 1987: Applications of the interaction of microwave with the natural snow cover. *Remote Sens. Rev.*, **2**, 259–387.

McClave, J. T., and F. H. Dietrich II, 1979: *Statistics*. Dellen, 681 pp.

Mognard, N. M., and E. G. Josberger, 2002: Northern Great Plains 1996/97 seasonal evolution of snowpack parameters from passive microwave measurements. *Ann. Glaciol.*, **34**, 15–23.

Pebesma, E. J., and C. G. Wesseling, 1998: GSTAT, a program for geostatistical modeling, prediction and simulation. *Comput. Geosci.*, **24**, 17–31.

Rudolf, B., H. Hauschild, W. Rueth, and U. Schneider, 1994: Terrestrial precipitation analysis: Operational method and required density of point measurements. *Global Precipitation and Climate Change*, M. Desbois and F. Desalmand, Eds., NATO ASI Series, Vol. 126, Springer-Verlag, 173–186.

Snedecor, G. W., and W. G. Cochran, 1967: *Statistical Methods*. 6th ed. Iowa State University Press, 593 pp.

Stiles, W. H., and F. T. Ulaby, 1980: The active and passive microwave response to snow parameters. 1. Wetness. *J. Geophys. Res.*, **85**, 1037–1044.

Walker, A. E., and A. Silis, 2001: Snow cover variations over the Mackenzie River basin derived from SSM/I passive microwave satellite data. *Ann. Glaciol.*, **34**, 8–14.

Yang, D., B. Goodison, S. Ishida, and C. Benson, 1998: Adjustment of daily precipitation data at 10 climate stations in Alaska: Application of World Meteorological Organization intercomparison results. *Water Resour. Res.*, **34**, 241–256.

Yang, D., and Coauthors, 1999: Quantification of precipitation measurement discontinuity induced by wind shields on national gauges. *Water Resour. Res.*, **35**, 491–508.

Yang, D., and Coauthors, 2000: An evaluation of the Wyoming gauge system for snowfall measurement. *Water Resour. Res.*, **36**, 2665–2677.

Table 1. Ten years of mean (*μ*) and standard deviation (*σ*) of paired ground and SSM/I snow depth estimates and the paired *t*-test values.

Table 2. Variogram characteristics of ground-measured SD and SSM/I-retrieved snow depth data. Note that SSM/I *μ* are area averages rather than the paired averages shown in Table 1.

Table 3. The number of cells, mean, and standard deviation of ground SD and SSM/I SD estimates; mean difference between ground and SSM/I SD estimates, and the standard deviation of these differences; rmsd between the ground and SSM/I SD estimates; the nonsystematic errors of ground and SSM/I SD; and paired *t*-test value between ground and SSM/I SD for each ground SD measurement-site category.