• Barnett, T. P., and Coauthors, 2008: Human-induced changes in the hydrology of the western United States. Science, 319, 1080–1083, doi:10.1126/science.1152538.
  • Christy, J. R., 2002: When was the hottest summer? A state climatologist struggles for an answer. Bull. Amer. Meteor. Soc., 83, 723–734.
  • Christy, J. R., and J. J. Hnilo, 2010: Changes in snowfall in the southern Sierra Nevada of California since 1916. Energy Environ., 21, 223–234.
  • Doesken, N. J., and A. Judson, 1996: The snow booklet: A guide to the science, climatology, and measurement of snow in the United States. Colorado Climate Center Rep., Department of Atmospheric Science, Colorado State University, 92 pp.
  • Howat, I. M., and S. Tulaczyk, 2005: Climate sensitivity of spring snowpack in the Sierra Nevada. J. Geophys. Res., 110, F04021, doi:10.1029/2005JF000356.
  • Kapnick, S., and A. Hall, 2010: Observed climate–snowpack relationships in California and their implications for the future. J. Climate, 23, 3446–3456.
  • Knowles, N., M. D. Dettinger, and D. R. Cayan, 2006: Trends in snowfall versus rainfall in the western United States. J. Climate, 19, 4545–4559.
  • Kunkel, K. E., M. A. Palecki, K. G. Hubbard, D. A. Robinson, K. T. Redmond, and D. R. Easterling, 2007: Trend identification in twentieth-century U.S. snowfall: The challenges. J. Atmos. Oceanic Technol., 24, 64–73.
  • Mote, P. W., A. F. Hamlet, M. P. Clark, and D. P. Lettenmaier, 2005: Declining mountain snowpack in western North America. Bull. Amer. Meteor. Soc., 86, 39–49.

    Number of stations whose median exceeds the fractional total (f) of the reference station snowfall median in each region.

    Square root of the individual stations’ average monthly squared departures (December–March only) from the region’s calculated value (Y) of the normalized metric (median of 1.0). Three values per region were calculated by setting f equal to 0.1, 0.3, and 0.5 (see text). When f = 0.5, the standard errors of regions 6, 8, 15, and 17 are ≤0.2.

    Total number of “annual” totals (December–March) from the 1878/79 season to 2010/11 season able to be generated by region based on the threshold fraction f. Note: considering Fig. 6, we note that when f = 0.5, only regions 6, 8, and 17 have time series longer than 100 years with an error metric ≤0.2.

    The time series average of station elevation (m) per region for three thresholds: f = 0.1, 0.3, and 0.5 for December–March. For each year, the elevation of stations that reported data were averaged, then the average of all years was calculated and shown above.

    (a) For region 3, the time series of individual station values (y, + symbols) and consensus values (Y, ○ symbols) of the normalized snowfall metric for months (from top) December, January, February, and March through 2011. (bottom) Seasonal total of normalized monthly consensus values. The proportional factor threshold is f = 0.5. The median December–March snowfall for the reference station is 14.0 cm with the COOP IDs of the 12 stations utilized in the assembling of the time series listed at the top. The standard error of individual monthly values from the consensus is 0.76 (recall that median is 1.0 and minimum is zero). (b) As in (a), but for region 6. The standard error of the normalized individual station values from the consensus is 0.20. (c) As in (a), but for region 8, which has the longest time series. Standard error of individual monthly, normalized values is 0.19.

    (a) Reconstructed seasonal snowfall totals (December–March) through 2011 for region 3. (b) As in Fig. 10a, but for region 6. (c) As in Fig. 10a, but for region 8.

    (a) Station locations used to calculate December–March seasonal snowfall using factor f = 0.5. Stations above 1200-m elevation are designated with +, and below 1200 m with ○. (b) As in Fig. 10a, but for November–April seasonal snowfall totals and factor f = 0.3.

    Simple average of the December–March snowfall for regions 6–9, representing the western slope of the Sierra Nevada mountain range (gray) and 9-point running average (black). The metric is the average of each region’s ratio relative to its mean. Region 8 supplies all of the pre-1895 data. None of the trends of the time series of individual regions or of the average were significantly different from zero. The linear trend of the combined time series is −0.7% decade−1.

    As in Fig. 12, but for March–May only.

Searching for Information in 133 Years of California Snowfall Observations

  • 1 Earth System Science Center, University of Alabama in Huntsville, Huntsville, Alabama

Abstract

Monthly snowfall totals from over 500 stations in California, some of which date back to 1878, are examined. Most data were accessed through the NOAA archive, but several thousand station months of data were separately keyed in from image files of original documents. Over 26 000 of these entries were new relative to the NOAA archive, generally providing data prior to 1920. The stations were then subdivided into 18 regions for the construction of representative time series of each area. There were problems with the basic data—the most difficult with which to deal was the increasing presence of “zero” totals that should have been recorded as “missing.” This and other issues reduce the confidence that the regional time series are representative of true variations and trends, especially for regions with few systematically reporting stations. Interpreting linear trends on time series with infrequent large anomalies of one sign (i.e., heavy snowfall years) and unresolved data issues should be done with caution. For those regions characterized by consistent monitoring and with the most robust statistical reproducibility, no statistically significant trends in their periods of record (up to 133 years) nor in the most recent 50 years are found. This result encompasses the main snowfall region of the western slope of the Sierra Nevada Mountains.

Corresponding author address: John R. Christy, Earth System Science Center, Cramer Hall 4040, University of Alabama in Huntsville, Huntsville, AL 35899. E-mail: christy@nsstc.uah.edu

A comment/reply has been published regarding this article and can be found at http://journals.ametsoc.org/doi/abs/10.1175/JHM-D-12-070.1 and http://journals.ametsoc.org/doi/abs/10.1175/JHM-D-12-089.1

1. Introduction

With regular measurements of snowfall at several California (CA) locations beginning as early as 1878, it appears possible to examine a record exceeding 130 consecutive snow seasons. Considering the vital importance of snowfall for the natural and human-engineered systems of CA, understanding the long-term variations of this quantity is relevant to a wide variety of stakeholders. Since 1878, many stations operated by volunteer observers, government employees, and private industry have come and gone, ostensibly recording daily snowfall as part of a suite of measurements. Many of these reports, especially since 1920, have been keyed into the climatic database of the National Climatic Data Center (NCDC) and conveniently made available online.

Weather observers report three quantities related to snow: 1) liquid equivalent (i.e., of melted snow), 2) total snowfall in the past 24 h, and 3) depth of snow on ground at time of observation (Doesken and Judson 1996). This study addresses the second quantity: total snowfall in the past 24 h. Note that snowfall is not simply the difference in the “depth-of-snow” measurement on consecutive 24-h periods, but the total accumulation during the previous 24 h. This type of measurement, taken in as open an area as possible, minimizes the impacts of melting and compaction. National Weather Service offices (and many other official stations) report snow accumulation (i.e., snowfall) every 6 h to be summed for the daily total, while volunteer observers generally measure the snowfall at cessation for the “greatest accumulation.” Thus, it is very often the case that a 24-h “snowfall” report exceeds the simple difference in depth of snow measured on consecutive days at the official time of observation.

There is some uncertainty about the frequency of measurement for a given 24-h period during which volunteer and/or early observers documented the accumulation through the years. At low-elevation stations, the observer should have measured the snow at cessation, but we do not have clear documentation that this was always the case. These issues are largely made moot by the method of analysis described below, as this study eventually describes results for regions containing numerous stations with relatively high snowfall and good record keeping. Other regions that were subject to problems are identified and low confidence will be assigned to their results.

What appears to be an opportunity to perform a straightforward assessment of snowfall variations over the past 13 decades, however, runs into additional difficulties—two of which are introduced here. First, many of the early records are available only as documents since they have yet to be keyed as digital, machine-readable values. As a result, most of the readings from 1878 to 1920 were keyed in manually from the imaged documents—primarily the coldest months, November–April.

The second difficulty, and a significant challenge for climate studies such as this, was devising a method to deal with the inconsistent record keeping and official keying of the snowfall variable. In particular, the single most pervasive problem was identifying values recorded in the database as “zero” when the value should have been “missing.” Had this problem been random and sparse in time, there would be minimal concern. However, it clearly increases with time; the 1970s to 1990s are particularly troublesome. Indeed, near the end of the 133-yr period, several station records are dominated by “zero” values, which is inconceivable for stations that receive snow every year. What makes this problem so difficult is that for many of these stations a “zero” total is plausible for any particular month, so deciding whether to convert a “zero” to a “missing” involves considerable uncertainty.

In this paper, several thousand newly keyed-in monthly snowfall totals for CA stations, not available in the current NCDC machine-readable archive, are studied. A description of how problems such as erroneous zero totals were handled is given. For this climate analysis, the snowy areas of the state are divided into 18 geographic regions to generate time series based on the clustering of stations into regions that have very similar snowfall characteristics, considering the storm-scale size of events in this part of the country. Time series are generated for each of the 18 regions by merging individual station time series with a technique designed to minimize the problems encountered and to adjust for interstation biases. There is a wide range of confidence in the results depending on the region being considered.

Calculating linear trends of a non-Gaussian distribution (skewness is usually 1 or greater, whereas Gaussian distributions are characterized by a skewness between −0.5 and +0.5) may not be wholly satisfying in terms of useful information, but in doing so the results indicate small, statistically insignificant trends—mostly negative—for the longest and most robust time series in these regions. It is not certain, however, that many erroneous zeros remain near the end of the individual records and thus the negative sign of such regional trends may indeed be an artifact of this problem. December–March trends of the 18 regions beginning in more recent decades become mixed (and still insignificant) between positive and negative trends.

2. Data

The first and very convenient source of snowfall values was the data archive of the NCDC in Asheville, North Carolina (NC). All snowfall records for all CA stations were initially accessed. Many of these stations, being near sea level, recorded snowfall only rarely, if at all, and were not utilized further.

Our initial database was built from two categories of data files from the NCDC archive. First, the data already digitized explicitly as monthly totals were gathered. Second, all daily records were obtained, and monthly totals were calculated separately from the daily values. As would be expected, there was virtually no difference between the monthly totals derived from daily values and the monthly values recorded as such in the NCDC archive when both were present. Of interest is that the daily data were often digitized further back in time, sometimes decades earlier, than the explicit monthly data and thus provided much extra information. In several cases, one dataset would give a total while the other recorded a missing value, so we selected the nonmissing value.

A major effort of this project was the keying in of monthly totals from the digital document images also available from NCDC, allowing data not available to this point to be examined as machine-readable numbers. Though the documents began in 1891, occasionally a table would be included with monthly snowfall totals for a few stations back to 1878. In Fig. 1, we show such a valuable table from the March 1911 edition of Climatological Data for March, District 11, California (available online at http://www7.ncdc.noaa.gov/IPS/cd/cd.html). Here is the complete list of monthly totals for Summit, CA, from 1878 to early 1911 as recorded by the Southern Pacific Railroad (SPRR) Company.

Fig. 1.

Table of monthly snowfall totals (in.) for Summit (COOP ID 048646), Placer County, CA, 1878–1911, as appeared in Climatological Data for March, District 11, California (March 1911 edition) (data available online at http://www7.ncdc.noaa.gov/IPS/cd/cd.html).

Citation: Journal of Hydrometeorology 13, 3; 10.1175/JHM-D-11-040.1

The data were keyed in for all stations for which snowfall was a common occurrence, but also for a few low-elevation stations, where snowfall was a rarity, for future studies of unusual events. Though many of the keyed-in values were found to be duplicates of those already obtained from the NCDC archive, especially after 1920, eventually over 26 000 station months not available through NCDC were entered. The last year to key in data for all listed stations was 1930, as by 1931 nearly all of the entries were already represented in the NCDC archive.

Most of the focus here is on seasonal (i.e., quasi-annual) totals, which will generally cover the cold months December–March. This period was chosen because, for these 4 months, the climatological reports would often include a page of supplementary information titled “Snowfall in the Mountains” with totals for many stations not in the regular listing. However, this supplementary information did not appear for months other than December–March. “Seasonal totals” for November–April and March–May will also be calculated; however, fewer years could be completed for some regions because the “Snowfall in the Mountains” supplementary information was not included in November, April, and May reports.

3. Geographic regions

Because the goal is to examine variations and changes in snowfall in relatively small areas on the time scale of a century, the state was subdivided into 18 regions for which a physical scale of high homogeneity would be assumed for all stations in each region. This was demonstrated in Christy and Hnilo (2010, hereafter CH10) for the southern Sierra Nevada where annual (i.e., cold season) snowfall totals were highly correlated for separations of up to 100 km.

All of the station records in each region were examined and from these a reference station was selected as a representative of the region. The reference station was selected based on two characteristics: 1) having a long period of record, and 2) having a relatively high, nonzero snowfall amount in most cold months. This latter condition is satisfied by stations located at relatively high altitudes, which for some regions was not possible because all or nearly all of the stations were less than 1500 m above sea level. In the analysis section, a quasi-normalized snowfall index will be computed for each station individually, and then for each region, which then will be converted back to a snowfall depth representative of each region’s reference station. In one sense, this may be viewed as an exercise in which the full record of each reference station is reconstructed by using all nearest-neighbor information (CH10).

In Fig. 2, the 18 regions are shown by number along with the December–March median snowfall total (cm) of each region’s reference station. Table 1 provides information on the selected reference stations. California extends across 10° of latitude and includes elevations from below sea level to over 4000 m. The diversity of this topography is somewhat evident by noting the median totals for the selected reference stations in each region (Fig. 2). Where snowfall has a significant impact on infrastructure or operations, there have been relatively long-term observations—that is, SPRR east–west transportation corridors (regions 8 and 17), hydroelectric utilities in the southern Sierra Nevada (region 6), and water resource concerns in the relatively dry southern and eastern ranges (regions 1, 2, and 15). For these varied operations, it was important for daily snowfall to be systematically measured and reported, and today these early measurements provide climate scientists a valuable resource.

Fig. 2.

The state of California with dashed polygons representing the 18 separate regions (identified by top number) into which stations were divided. The lower number indicates the seasonal (December–March) total snowfall for the selected reference stations in each region (cm). The solid lines indicate Climate Division boundaries, which largely follow ridge lines of the major basins. Tick marks are 0.5° lat/lon.

Table 1.

List of reference stations with associated information for each region. Note: AP = airport, HD = hydrologic dam, and RS = ranger station.

Through the years, unfortunately, most of these early stations were closed or passed to volunteers who in general were not as diligent in recording snowfall. With the establishment of national and state parks, on-site employees began monitoring snowfall in mountainous areas, but many of these efforts were not consistent, as there was no immediate, urgent, operational requirement for the measurement. One would think, for example, that a location such as the park headquarters of Yosemite Valley would have maintained a meticulous record of snowfall given the high priority and immense public interest this park has in the national system. However, we find only spotty records in recent decades, to the point that the observations after 1975 must be neglected (see below). Further south at Grant Grove and Lodgepole in the Kings and Sequoia National Parks, respectively, the Park Service rangers have maintained decades of excellent snowfall records even when depths exceeded 3 m.

About 10 of the keyed-in stations had locations that did not coincide with any station already named and numbered in the NCDC cooperative network database. These stations were given unique station identification (ID) numbers and incorporated into our database.

4. Data processing

The raw data values came from three sources: 1) the NCDC monthly digital values, 2) the NCDC monthly values derived from daily digital values, and 3) the keyed-in values. These were combined for direct comparison. In virtually every case where more than one source was represented, the values were identical. A single monthly time series for each station was generated by taking the nonmissing value if one existed. In rare cases where the sources differed, the value with the highest magnitude was selected. In most of these cases, one of the sources would list zero while the other would have a plausible total.
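The merging rule just described can be illustrated with a minimal sketch. The function name, the use of None for a missing entry, and the sample values are assumptions made for illustration only; they are not taken from the study's processing code.

```python
from typing import Optional, Sequence


def merge_monthly_values(candidates: Sequence[Optional[float]]) -> Optional[float]:
    """Combine one station-month reported by several sources.

    None marks a missing entry (an assumed convention for this sketch).
    """
    present = [v for v in candidates if v is not None]
    if not present:
        return None  # every source is missing
    # Snowfall totals are nonnegative, so the largest value is the one with
    # the highest magnitude; a zero next to a plausible total is discarded.
    return max(present)


# Hypothetical totals (cm), in the order: NCDC monthly, NCDC daily-derived, keyed-in.
print(merge_monthly_values([None, 35.6, 35.6]))   # sources agree
print(merge_monthly_values([0.0, 48.3, None]))    # zero versus plausible total
print(merge_monthly_values([None, None, None]))   # nothing reported
```

Taking the maximum covers both rules at once: when only one source reports, its value is returned, and when sources disagree the larger total is kept.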

As mentioned above, the single most confounding issue that was faced was determining whether a value of “zero” was real or an erroneous entry that should have been “missing.” This is demonstrated in Fig. 3 with five stations from regions 6 and 7. The metric in the figure is the proportion of months over a running 5-yr period in which the value of zero was recorded for December–March. In other words, of the 20 months in each 5-yr period of December–March, Fig. 3 displays the proportion that is recorded as zero.
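A rough sketch of this diagnostic follows, under stated assumptions: each season is a list of four monthly totals (December through March), None marks a missing month, and missing months are left out of both the count of zeros and the denominator. The text does not say how missing months entered the calculation, so that last choice is a guess, and all names and data here are hypothetical.

```python
from typing import List, Optional

Season = List[Optional[float]]  # four monthly totals, December through March


def zero_fraction(seasons: List[Season], window: int = 5) -> List[Optional[float]]:
    """Fraction of reported months equal to zero in each running window of seasons."""
    fractions: List[Optional[float]] = []
    for start in range(len(seasons) - window + 1):
        months = [m for season in seasons[start:start + window] for m in season]
        reported = [m for m in months if m is not None]
        if not reported:
            fractions.append(None)  # no usable months in this window
            continue
        zeros = sum(1 for m in reported if m == 0)
        fractions.append(zeros / len(reported))
    return fractions


# Hypothetical station: three snowy seasons followed by three all-zero seasons.
snowy = [10.0, 20.0, 15.0, 5.0]
empty = [0.0, 0.0, 0.0, 0.0]
print(zero_fraction([snowy, snowy, snowy, empty, empty, empty]))  # [0.4, 0.6]
```

For a station that normally receives snow every winter month, a fraction that climbs toward 1.0 and stays there is the kind of signature discussed for stations 5496 and 4176.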

Fig. 3.

Fraction of months recorded as zero per 5 years during December–March of five stations from regions 6 and 7.

In Fig. 3 it is clear that measurement or recording practices changed dramatically for stations 5496 and 4176 to the point that no snow was recorded for several 5-yr periods after a certain date—a situation that is inconceivable for these stations. For comparison we show 3351 and 5026 (Grant Grove and Lodgepole mentioned earlier), which indicate a very high level of consistency, with snow falling in almost every one of the 20 months in each 5-yr period.

The examples of the “zero” problem of 5496 and 4176 are trivial to locate and correct (i.e., setting the obvious zeros to missing for years so identified). However, we also include 9855 (triangles in Fig. 3; Yosemite Park Headquarters), which in terms of snowfall is highly correlated with the other stations in the figure. What we see is a station that, from 1904 to about 1975, was reporting zero monthly totals less than 35% of the time (usually less than 20%) for any given 5-yr period. After 1975, the zero reports rise to 45%–72% before dropping back to about 30% at the end. An examination of the record indicates several zeros in months where snow occurred in all other nearby stations. These zeros appear to be randomly interspersed among other reasonable monthly values. We additionally checked the south entrance of Yosemite (8380) and found no 5-yr period that exceeded 25% zeros between 1940 and 2000.

It is clear that many of the zeros at 9855 are erroneous. Rather than attempting to determine which individual months may be correct (which for a true zero month would be very difficult), we assumed there was a pathology in the observing and/or recording process that is unpredictably manifested in the years after 1975. This would compromise any climate dataset construction process if one tried to repair the problem month by month. In other words, it could not be established with high confidence that a particular month was accurate or not with the time available for this project. As a result, all monthly values for 9855 were set to missing after 1975.

Figure 4 displays another example from region 8 where we show five relatively low-elevation stations for which a zero monthly total is entirely reasonable at any given time (region 8 has a total of 78 stations, though many are not used in the calculations because of minimal snowfall averages). Station 1912 displays a gradual rise in zeros from the 1920s to around 1960. Is this an indication that the snow level rose during those 40 years, or is this an indication of the “zero” problem? A similar feature is seen in stations 4288, 6136, and 6597. We decided not to alter any of these stations as they seem to corroborate a relatively sparse period of low-elevation snow around 1960 with erratic, but not remarkable variations afterward. We suspect station 1912 has at least a few erroneous zeros around 1995 because the other stations do not show this dry spell, but our decisions to convert “zeros” to “missings” required more obvious evidence than this. We also include station 3491 in Fig. 4 from region 8 to demonstrate, again, that there are stations for which the zeros-to-missing conversion is easy to spot. Thus, while we believe we have eliminated a great majority of erroneous zero reports, we suspect there are still many remaining in several stations, particularly near the end.

Fig. 4.

As in Fig. 3, but for five stations from region 8.

We examined all snowfall stations in CA with this graphical process and in our judgment determined that some portion of 126 of the 500+ stations was compromised with erroneous zeros. In many cases, as in Fig. 4, there were more gradual rises in the reports of zeros over several decades, usually for low-elevation stations. In general, we are fairly confident this was an indication of erroneous zeros creeping into the record, but without a significant effort to examine each entry for possible error, we left them as is.

This problem was also noted in CH10 regarding some stations eventually not used in that study. For example, when one new observer began reporting for a station, he wrote the snowfall totals in the “comment” column rather than the snowfall column for years, so the true snowfall was never entered in the digital files and the official record contains zeros. In other cases it was noted that a change of observer often brought in someone who was simply less diligent in recording snowfall. In any case, without the presence of significant shifts, we did not apply the zeros-to-missing conversion, but we recognize that any long-term trends of these time series will likely, and unfortunately, contain a spurious negative component. One means of reducing the effect of the zero problem is to utilize stations with relatively high snowfall totals, for which records tended to be much more consistent because these were in locations of operational importance (e.g., SPRR stations and their follow-ons).

Kunkel et al. (2007) investigated several issues with regard to snowfall measurements that might impact the homogeneity of the time series, concentrating on stations east of the Rocky Mountains. One issue they examined was a rather systematic increase in the ratio of snowfall depth to liquid water equivalent, rising roughly from 10 to 13 from 1930 to 2000. The implication, borne out in further analysis, was an apparent tendency for early observers to measure the snowfall accurately, then divide by 10 for the liquid equivalent (which was an option provided in the instructions at the time). As time went on, the snowfall and liquid equivalent were separately measured and the ratio rose. This change in observing practice was evidently not an important issue for the stations they studied in CA (see their Fig. 8, and Knowles et al. 2006). It appears from their study that the snowfall total was generally the more accurate measurement, as it was directly observed and not a derived quantity. This was also noted in CH10, where the observer at one station (4176) reported having difficulty melting the snow with the device consisting of a large rain gauge into which three 100-W light bulbs were inserted for heat. The problem was that the bulbs would often burn out, and the snow could not be melted and measured properly. Again, this suggests that in some very snowy regions, the measurement of snowfall was the more reliable precipitation metric.

5. Data analysis

The task of generating a relatively homogeneous time series is rather difficult because the stations 1) have inconsistent time periods, 2) report a metric that is non-Gaussian (skewness 0.7–2.3 for seasonal totals), and 3) have characteristics that vary considerably according to local elevation. The reconstruction process was relatively easy in CH10 because the individual time series of the selected stations (roughly region 6) were highly correlated and were selected because snowfall had generally been diligently monitored for decades at each station. In that study, lower-elevation stations (<1200 m) were not used because of problems such as those noted earlier.

The processing for a homogeneous time series begins by converting the observed snowfall amounts to a quasi-normalized (or simply “normalized”) time series y that is near Gaussian by this formula:
$$ y_{i,j,k} = \frac{x_{i,j,k}}{\mu_{j,k}} \qquad (1) $$
where x is the observed monthly snowfall for month j in year i and station k, and μ is the median snowfall amount calculated from all available years for month j at station k. This transformation preserves the zero values (if there are any) and the maximum value while having a central value of 1.00. If μ is zero, then μ is redefined as the average of the snowfall values. When accumulated in seasonal totals, the skewness of this transformed metric is between −0.2 and +0.9 for the regions, or more nearly Gaussian.
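As a hedged illustration, the sketch below assumes the transformation is the ratio of each monthly total to the station's median for that calendar month, which is what the stated properties imply (zeros stay zero and the central value is 1.00). The function name, the None convention for missing values, and the sample data are hypothetical.

```python
from statistics import mean, median
from typing import List, Optional


def normalize_month(values: List[Optional[float]]) -> List[Optional[float]]:
    """Scale one station's totals for one calendar month by their median."""
    reported = [v for v in values if v is not None]
    if not reported:
        return list(values)  # nothing to scale
    mu = median(reported)
    if mu == 0:
        mu = mean(reported)  # fallback described in the text when the median is zero
    if mu == 0:
        return list(values)  # every reported total is zero; leave unchanged
    return [None if v is None else v / mu for v in values]


# Hypothetical January totals (cm) across years at a snowy station.
print(normalize_month([0.0, 20.0, 40.0, None, 60.0, 120.0]))
# [0.0, 0.5, 1.0, None, 1.5, 3.0]

# Hypothetical low-elevation station: the median is zero, so the mean (7.5) is used.
print(normalize_month([0.0, 0.0, 0.0, 30.0]))
# [0.0, 0.0, 0.0, 4.0]
```

The second call shows why less snowy stations widen the range of the normalized metric: a single moderate event becomes several multiples of the scaling value.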

Stations were screened to meet minimum criteria, which included a requirement that a station's median be at least a prescribed fraction f of the median of the region's reference station (μs). For example, the January μs for a particular region may be 40 cm, and it will be noted as $\mu s_{j,m}$ for month j in region m. A fraction f of 0.30 would then restrict the subsequent analysis to those stations in the region whose own January median $\mu_{j,k}$ is at least 12 cm, or 0.30 of μs. Increasing the threshold f eliminates more of the low-elevation, less snowy stations, whose values have more erratic statistical properties. We shall test the impact of varying this parameter later.
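The screening step can be sketched as a simple filter. This is an illustration only: the station identifiers and medians are invented, and treating a station exactly at the threshold as passing is an assumption.

```python
def screen_stations(station_medians, reference_median, f):
    """Keep stations whose monthly median is at least f times the reference median.

    station_medians: dict of station id -> median snowfall (cm) for one month.
    reference_median: median (cm) of the region's reference station, same month.
    f: threshold fraction (0.1, 0.3, and 0.5 are the values tested in the text).
    """
    threshold = f * reference_median
    # Assumption: a station exactly at the threshold is retained.
    return {sid: mu for sid, mu in station_medians.items() if mu >= threshold}


# Hypothetical January medians (cm); the reference station has a median of 40 cm.
medians = {"A": 55.0, "B": 15.0, "C": 8.0, "D": 2.0}
for f in (0.1, 0.3, 0.5):
    print(f, sorted(screen_stations(medians, 40.0, f)))
```

Raising f from 0.1 to 0.5 shrinks the retained set from three stations to one in this toy case, which mirrors the trade-off between station count and reproducibility discussed in the text.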

Regions with many stations at higher elevations tended to provide robust results (i.e., small error; see below) for values of f of 0.5 and more. Other regions, with few snowy stations and limited station record lengths, required values of f less than 0.5 to achieve a time series several decades long. We note that when stations with low snowfall amounts were included (i.e., by setting a small f), their effect on the normalized regional anomalies was to increase the range of seasonal totals, sometimes dramatically. As f is increased, the range of regional totals decreases. This is because low-elevation stations may have a very small μ, so that a relatively heavy snow event results in a value of y several multiples of the station's median, a situation that generally would not occur at higher elevations.

The last step before calculating a seasonal total of normalized values from the monthly snowfall time series is to debias the individual station records relative to each other. At this point, all individual station time series are calculated relative to the median determined over their variable respective periods of record. Since the median can change considerably from one decade to the next, a short station record could have a decidedly different reference base than other time series to which it will be merged. If the records were combined at this point, the low-frequency component of the time series would not be preserved.

To account for the difference in station periods of record, a merging technique described in Christy (2002) is applied to debias the individual station time series relative to all others. The process utilizes all periods for which overlapping data are available to calculate a best-estimate bias for each station (see Christy 2002 for details). Once this is done, the single regional time series is computed as follows.

The regional value Y for year i, month j, and region m is the weighted average of the y values, in which the nonmissing values from stations k that lie in region m are multiplied by a weight w that is proportional to the length of each station’s time series:
$$Y_{i,j,m} = \frac{\sum_{k \in m} w_k \, y_{i,j,k}}{\sum_{k \in m} w_k} \qquad (2)$$
where the sums run over the stations in region m with nonmissing values. The transformation back to regional snowfall totals X for year i, month j, and region m is straightforward:
$$X_{i,j,m} = \mu s_{j,m} \, Y_{i,j,m} \qquad (3)$$
where $\mu s_{j,m}$ is the median of the reference station in month j, as noted above. So, in summary, this method 1) transforms the individual station time series of snowfall to time series of quasi-normalized, positive-definite, near-Gaussian values with a median of 1.0; 2) debiases the individual station time series relative to each other; 3) combines the available monthly totals to produce a best guess for a regional, monthly quasi-normalized snowfall time series; 4) calculates from the available regional monthly values a seasonal quasi-normalized time series; and 5) transforms the regional time series back to a snowfall amount proportional to the selected reference station’s median, μs.
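The merging and back-transformation steps can be sketched as follows for a single year and month. The sketch assumes the weights are simply the station record lengths, normalized by their sum over the stations that reported. The debiasing step of Christy (2002) is not reproduced here, and all names and numbers are hypothetical.

```python
def regional_value(y_by_station, record_lengths):
    """Weighted average of normalized station values for one year and month.

    y_by_station: dict of station id -> normalized value y (None if missing).
    record_lengths: dict of station id -> length of record, used as the weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for station, y in y_by_station.items():
        if y is None:
            continue  # missing values carry no weight
        w = record_lengths[station]
        weighted_sum += w * y
        weight_total += w
    return None if weight_total == 0 else weighted_sum / weight_total


def to_snowfall(Y, reference_median):
    """Scale a regional normalized value back to snowfall (cm)."""
    return None if Y is None else Y * reference_median


# Hypothetical month: station C did not report.
y = {"A": 1.2, "B": 0.8, "C": None}
lengths = {"A": 90, "B": 30, "C": 60}
Y = regional_value(y, lengths)
print(Y, to_snowfall(Y, 40.0))
```

Because station A has three times the record length of station B, the consensus value sits closer to A's 1.2 than to B's 0.8.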

6. Results

Some general comments are in order regarding the merging of snowfall totals from multiple stations of varying elevations and differing periods of record. Firstly, the determination of a region’s representative anomaly for a given month may be confounded by the variability of the weather systems that cause snow. A particularly cold storm may have much above-median snowfall at low-elevation stations but below-median snowfall at high elevations (and vice versa for warm storms). By using monthly totals, some of this peculiarity is limited as, generally, several systems are summed together. Seasonal totals will further minimize this feature. Also, keeping f at 0.5 reduces this noise factor as it reduces the number of low-elevation stations employed.

Secondly, the sensitivity of the results to the parameter f is demonstrated by displaying its consequences for 1) the number of stations utilized and 2) the standard error of monthly reconstructions. Results for f = 0.1, 0.3, and 0.5 are shown in Figs. 5 and 6.

Fig. 5.

Number of stations whose median exceeds the fractional total (f) of the reference station snowfall median in each region.

Citation: Journal of Hydrometeorology 13, 3; 10.1175/JHM-D-11-040.1

Figure 5 displays, for each value of f, the number of stations in each region that provided at least 20 months of December–March snowfall data. In some regions, mainly southern and/or low elevation, the reference station may be the only station with snowfall reports over several years. In Fig. 6, we show the root-mean-square of the monthly differences between the individual station values and the consensus value for the normalized time series. In general, the results indicate, as would be expected, that the errors become smaller as the snowfall totals of the selected stations increase. In particular, regions 6, 8, 15, and 17 demonstrate high reproducibility among the stations that qualify.
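The error metric can be illustrated with a short sketch. The pairing of each station value with the consensus value for the same month, and the sample numbers, are assumptions made for the example.

```python
from math import sqrt


def rms_departure(pairs):
    """Root-mean-square departure of station values from the regional consensus.

    pairs: list of (y, Y) tuples, one per station-month, where y is the
    normalized station value and Y the regional consensus for that month.
    """
    if not pairs:
        return None
    squared = [(y - Y) ** 2 for y, Y in pairs]
    return sqrt(sum(squared) / len(squared))


# Hypothetical station-months, all compared against a consensus of 1.0.
sample = [(1.2, 1.0), (0.8, 1.0), (1.0, 1.0), (1.4, 1.0)]
print(round(rms_departure(sample), 3))
```

Since the normalized metric has a median of 1.0, a value near 0.2 means a typical station departs from the consensus by about a fifth of a median month.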

Fig. 6.

Square root of the individual stations’ average monthly squared departures (December–March only) from the region’s calculated value (Y) of the normalized metric (median of 1.0). Three values per region were calculated by setting f equal to 0.1, 0.3, and 0.5 (see text). When f = 0.5, the standard errors of regions 6, 8, 15, and 17 are ≤0.2.

In Fig. 7, we show the total number of years for which a regional seasonal total may be generated, which requires having all four cold months present (December–March). One notices that only regions 6, 8, and 17 have error metrics ≤0.2 and a length >100 years for f = 0.5. Regions whose snowfall is minimal, as indicated by the median total of the representative station (Fig. 2), were generally less reproducible.
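The all-four-months requirement can be sketched as below. The sketch assumes, following the summary in the conclusions, that the seasonal total combines the monthly normalized regional values using the reference station's monthly medians as weights. Month labels and numbers are hypothetical.

```python
COLD_MONTHS = ("Dec", "Jan", "Feb", "Mar")


def seasonal_total(monthly_Y, reference_medians, months=COLD_MONTHS):
    """Seasonal snowfall (cm) from monthly normalized regional values.

    monthly_Y: dict of month -> regional normalized value Y (None if missing).
    reference_medians: dict of month -> reference-station median (cm).
    Returns None unless every month in `months` is present.
    """
    if any(monthly_Y.get(m) is None for m in months):
        return None  # a season with any missing month is not generated
    return sum(monthly_Y[m] * reference_medians[m] for m in months)


# Hypothetical reference-station medians (cm) and one complete season.
mu_s = {"Dec": 30.0, "Jan": 40.0, "Feb": 40.0, "Mar": 20.0}
complete = {"Dec": 1.0, "Jan": 1.5, "Feb": 0.5, "Mar": 1.0}
incomplete = {"Dec": 1.0, "Jan": 1.5, "Feb": None, "Mar": 1.0}
print(seasonal_total(complete, mu_s), seasonal_total(incomplete, mu_s))
```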

Fig. 7.

Total number of “annual” totals (December–March) from the 1878/79 season to the 2010/11 season that could be generated for each region, based on the threshold fraction f. Considering also Fig. 6, when f = 0.5 only regions 6, 8, and 17 have time series longer than 100 years with an error metric ≤0.2.

Finally, in Fig. 8, we show the impact of the threshold factor f on average station elevation. To compute this, the elevations of the stations that reported in each year were first averaged, and then these yearly averages were averaged over the entire time series. As expected, average elevation increases as f increases. To keep the following evaluation concise, examples of the extremes in reproducibility will be presented.
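The two-stage elevation average can be sketched as follows, with invented station identifiers and elevations. Averaging within each year first gives every year equal weight, which differs from pooling all station-years together.

```python
from statistics import mean


def mean_reporting_elevation(stations_by_year, elevation_m):
    """Average elevation of reporting stations: per year first, then over years.

    stations_by_year: dict of year -> list of station ids that reported.
    elevation_m: dict of station id -> elevation (m).
    """
    yearly = [
        mean(elevation_m[s] for s in stations)
        for stations in stations_by_year.values()
        if stations  # skip years with no reporting stations
    ]
    return mean(yearly) if yearly else None


# Hypothetical network: station B starts reporting in the second year.
elev = {"A": 1000.0, "B": 2000.0}
reports = {1900: ["A"], 1901: ["A", "B"]}
print(mean_reporting_elevation(reports, elev))  # yearly means are 1000 and 1500
```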

Fig. 8.

The time series average of station elevation (m) per region for three thresholds: f = 0.1, 0.3, and 0.5 for December–March. For each year, the elevations of the stations that reported data were averaged; the average over all years was then calculated and is shown above.

Figure 9a shows the monthly values of the normalized individual y’s (+ symbols) and the regional consensus value Y (○ symbols) for region 3, a region in the southern portion of the state (Tehachapi Mountains) with only low-elevation stations (all <1500 m) and few consistent long-term records. The spread of the individual y’s is clearly large (the standard error of the normalized metric is 0.76), indicating low reproducibility of Y among the reporting stations. This will be the case when the median snowfall is quite low (14 cm for the reference station) and almost no long-term stations exist.

Fig. 9.

(a) For region 3, the time series of individual station values (y, + symbols) and consensus values (Y, ○ symbols) of the normalized snowfall metric for months (from top) December, January, February, and March through 2011. (bottom) Seasonal total of normalized monthly consensus values. The proportional factor threshold is f = 0.5. The median December–March snowfall for the reference station is 14.0 cm with the COOP IDs of the 12 stations utilized in the assembling of the time series listed at the top. The standard error of individual monthly values from the consensus is 0.76 (recall that median is 1.0 and minimum is zero). (b) As in (a), but for region 6. The standard error of the normalized individual station values from the consensus is 0.20. (c) As in (a), but for region 8, which has the longest time series. Standard error of individual monthly, normalized values is 0.19.

On the other hand, in Fig. 9b (region 6; southern Sierra) and Fig. 9c (region 8; central Sierra), we see that the individual monthly station values of y are tightly grouped around the regional consensus values Y, with standard errors of 0.20 and 0.19, respectively. This indicates high reproducibility of Y by the individual stations. In other words, with high median snowfall, the signal is large relative to the associated noise related to individual station variations. This is evidence that the methodology has, at least to some extent, minimized the concerns indicated in the previous section.

Seasonal regional snowfall totals, tied to the reference station median, are reconstructed and displayed for the three regions discussed above in Figs. 10a–c. Results for all 18 regions for f = 0.5 are listed in Table 2. For period-of-record trends, 12 of the 18 realizations are negative, but since 1962 (50 years), 11 of 17 are positive (one region does not provide December–March data past 1987). The stations used for each region are located on the map in Fig. 11a, with different symbols for stations above and below 1200 m.

Fig. 10.

(a) Reconstructed seasonal snowfall totals (December–March) through 2011 for region 3. (b) As in Fig. 10a, but for region 6. (c) As in Fig. 10a, but for region 8.

Table 2.

Statistics for December–March snowfall by region for parameter f = 0.5. Start and ending seasons are represented as the year each season closed (i.e., 1889 is season 1888/89). The total trend (cm decade−1) includes all available years with data. “50-yr” trend begins in the 1961/62 season to 2010/11. The median (μ) is the median of the full time series and will differ from the median of the reference station (μs), which was determined only over its years of observations. The value of “s.e. norm.” is the standard error of the normalized statistic, which itself has a minimum of zero and median of 1.0.

Under simple statistical significance testing of the trend, in which the individual seasonal values are assumed to be accurate, three regions (10, 11, and 13) produce marginally significant negative trends. These regions have reference stations with relatively minimal snowfall (65, 154, and 29 cm) and are based on more erratic station reports. Additionally, a negative bias is likely present in the trends, most prominently in low-snow regions, because of the “zero versus missing” problem noted earlier. (When significance testing is performed on the normalized time series, only region 13 is marginally significant.) However, as noted in Figs. 6 and 9a, the individual regional values of Y are not known with high precision for several regions.

To account for this additional error due to imprecision of the individual measurements, 200 random time series were generated with a standard deviation equal to that of the particular region’s error and then the distribution of those trends was determined. When this measurement-error range is included in the significance test, the 95% significance range for these regions expands to include the zero value. None of the 50-yr trends are significantly different from zero. Thus, the picture that emerges is one of weakly varying snowfall trends with no long-term direction one way or another. However, a more reasonable conclusion regarding linear trends of the 18 regions is that for those whose standard error (s.e.) > 0.5 (regions 1, 3, 4, 5, 10, 13, and 14), it would be more accurate to say the linear trend and its significance are indeterminable because of the poor reproducibility.
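One way to sketch this measurement-error test is to perturb a seasonal series with Gaussian noise and look at the spread of the resulting least-squares trends. Because the slope is linear in the data, this gives the same spread as computing trends of noise-only series and adding them to the observed trend. The function names, the fixed seed, the flat input series, and the simple rank-based 95% range are all assumptions of this illustration, not details taken from the text.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def trend_range(years, values, sigma, n_sim=200, seed=0):
    """Approximate 95% range of trends under Gaussian measurement error.

    sigma: standard deviation of the error added to every seasonal value.
    Returns (low, high) slopes taken from the sorted simulated trends.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sim):
        noisy = [v + rng.gauss(0.0, sigma) for v in values]
        slopes.append(ols_slope(years, noisy))
    slopes.sort()
    k = n_sim // 40  # drop 2.5% of the simulations from each tail
    return slopes[k], slopes[n_sim - 1 - k]


# Hypothetical 50-season normalized series with no underlying trend.
years = list(range(1962, 2012))
flat = [1.0] * len(years)
low, high = trend_range(years, flat, sigma=0.5)
print(low < 0.0 < high)
```

If the range obtained this way brackets zero, the trend cannot be distinguished from zero once the imprecision of the regional values is taken into account.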

In Table 3 the results are displayed for the longer cold season of November–April. We have decreased the threshold factor f to 0.3 to gather more stations for time series construction, because November and April were not included in the “Snowfall in the Mountains” list in the early climatological reports. The seasonal totals are larger, of course, and the errors are larger as well, and lower-elevation stations are mixed in with f = 0.3. Even with f = 0.3, the number of years for which seasonal totals can be constructed is lower for some regions than for December–March because of missing Novembers or Aprils. Two of the time series (regions 10 and 13) indicate marginally significant negative trends, but neither retains significance when measurement error is included. The locations of the stations that contributed to the November–April time series are displayed in Fig. 11b.

Table 3.

As in Table 2, but for November–April seasonal totals and f = 0.3.

Fig. 11.

(a) Station locations used to calculate December–March seasonal snowfall using factor f = 0.5. Stations above 1200-m elevation are designated with +, and those below 1200 m with ○. (b) As in (a), but for November–April seasonal snowfall totals and factor f = 0.3.

It is interesting to note that though region 8’s 50-yr trends in Tables 2 and 3 are nonsignificant, we see a decidedly more negative 50-yr trend for November–April relative to December–March. One reason for this discrepancy is the unusually heavy snowfall for both November and April during 1962–76. April 1967 was especially heavy (see below). Over this period, the 15-yr average for the sum of only November and April was over 50% greater than their long-term average, and was the highest 15-yr total during the entire 133-yr record. This tilts the regression line downward when starting in 1962 and using both November and April.

Region 13 also displays strongly negative trends. This is a region populated by low-elevation stations (Fig. 8), which often record no snowfall in any given month (therefore subject to the zero problem) and whose stations operating in the recent years are all below 720 m. Earlier decades were populated by stations at 850–1000-m elevation. Thus, the time series of region 13 is composed of a type of heterogeneous station data that the merging methodology utilized here cannot overcome—that is, if snow is not recorded in the reporting stations, the method cannot generate an amount that may have fallen at now-closed, higher-elevation stations. Hence, this region’s trend is clearly in the “indeterminate” category.

Finally, the December–March snowfall time series for regions 6–9 are combined by calculating a simple average of their normalized December–March snowfall (i.e., dividing each region’s seasonal total by the mean value so that 1.0 represents the mean). This is displayed in Fig. 12. These regions comprise the western slope of the Sierra Nevada on which much of the state’s water supply depends. The 133-yr time series reveals a spiky, non-Gaussian character, with a linear trend of −0.7% decade−1, and indicates no significant long-term trends in the four-region average or in each region individually. Similarly, the most recent 50-yr period produces an insignificant trend of +0.5% decade−1.
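The four-region average can be sketched as a ratio-to-mean followed by a simple average. The sketch assumes that years are averaged over whichever regions have data (in line with the note in the Fig. 12 caption that region 8 supplies all of the early data), and the seasonal totals below are invented.

```python
from statistics import mean


def combined_index(region_series):
    """Average of each region's seasonal total expressed as a ratio to its mean.

    region_series: dict of region id -> dict of year -> seasonal total (cm).
    Returns dict of year -> average ratio over the regions with data that year.
    """
    ratios = {}
    for region, series in region_series.items():
        region_mean = mean(series.values())
        ratios[region] = {yr: total / region_mean for yr, total in series.items()}

    years = sorted({yr for series in ratios.values() for yr in series})
    return {
        yr: mean(ratios[r][yr] for r in ratios if yr in ratios[r])
        for yr in years
    }


# Hypothetical totals (cm): region 8 starts a year before region 6.
totals = {
    8: {1879: 100.0, 1880: 300.0},
    6: {1880: 50.0, 1881: 150.0},
}
print(combined_index(totals))
```

Dividing by each region's own mean keeps a very snowy region from dominating the average simply because its totals are larger.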

Fig. 12.

Simple average of the December–March snowfall for regions 6–9, representing the western slope of the Sierra Nevada mountain range (gray) and 9-point running average (black). The metric is the average of each region’s ratio relative to its mean. Region 8 supplies all of the pre-1895 data. None of the trends of the time series of individual regions or of the average were significantly different from zero. The linear trend of the combined time series is −0.7% decade−1.

7. Discussion

There have been several studies of other snow-related metrics in CA and the western United States. These, in general, were performed over shorter periods and examined different metrics such as snow water equivalent (SWE) on a particular date, snowpack depth on a particular date, or river runoff during the snowmelt season. It is important to note that the current paper is examining what might be called the beginning of the hydrological process—the amount of snow that falls, when it falls. This is a metric in its own right and has application to, for example, snow ecology, snow removal, general operations, and recreation, besides more fundamental physical quantities.

While SWE and other metrics are quite useful for predicting late-spring and summer runoff volume, they do not correspond well at times to snowfall. CH10 compared snowfall, SWE, and runoff, noting, for example, that SWE on 1 April does not capture snowfalls in late spring, which are sometimes massive (e.g., 1967; see below), nor does it capture heavy snows that have already melted by that date because of warm rains. Integrated measurements such as SWE are valuable without question, but they are somewhat apart from the present study. We shall therefore comment on some recent studies in light of the present results, while recognizing the basic difference in the metrics studied.

Howat and Tulaczyk (2005), in studying the possible sensitivity of spring snowpack in the Sierra to temperature variations, found a differential change in SWE dependent on elevation over the 53-yr period 1950–2002. They documented an increase in higher-elevation SWE (from snow-course measurements) and a decrease in lower-elevation SWE. CH10 examined SWE for the southern Sierra and found a similar, though slight, relationship for 1930–2009, but no elevation dependence for 1950–2009. We note that the period 1950–2002, using the data in Fig. 12, generates a snowfall trend for the western slope of −3.0% decade−1, while the most recent 50-yr trend is +0.5% decade−1 and the trend for the entire time series is only −0.7% decade−1. (Note, as stated earlier, many of the 18 regions are too uncertain in their precision for a discussion of trends, but for regions 6–9, it has been demonstrated that there is a high degree of reproducibility, especially if the four regions are averaged to reduce noise further.)

Barnett et al. (2008) looked at the western United States, concentrating on river flow and SWE for 1950–99, and found a trend toward an earlier peak in spring runoff for Sierra-fed rivers. Again, from Fig. 12, this can be considered consistent with the results here in that the trend in total snowfall is also negative (−3.0% decade−1) for that particular 50-yr segment (i.e., lower snowfall totals generally melt away faster, reaching their peak earlier). However, the impact on the linear trend of a few singular heavy snowfall years is confounding for studies such as this, as the most recent 50-yr December–March snowfall trend for the western slope is +0.5% decade−1. (Calculating all 50-yr trends for 1878/79 to 2010/11 produces a range of −5.6% to +4.2% decade−1.) Mote et al. (2005), utilizing 1950–97 data only, generally found a slight upward trend in water resource availability on the western slope, related to an increase in SWE (snow courses are generally higher in elevation than the snowfall stations). Kapnick and Hall (2010) also examined SWE for the period 1930–2008 (similar to CH10) and found, as did Barnett et al. (2008), a tendency for SWE to peak earlier in the season, whether 1 April SWE was increasing or not.

Because most of these studies document a decline in some snow-amount metric in spring (i.e., lower-elevation snow-course SWE that more closely corresponds to station elevations), we reconstructed the March–May snowfall totals and generated a western slope value for regions 6–9 (Fig. 13) to see if this result was also detectable in snowfall. For 1878–2011 (134 values), none of the individual regional trends (+0.8%, −1.0%, +0.0%, and −0.3% decade−1, respectively) is significantly different from zero, nor is the four-region average (−0.4% decade−1). The most recent 50-yr period (1962–2011) produces a decidedly downward slope of −7.6% decade−1, yet this is not significantly different from zero because of its dependence on the singular heavy snowfall spring of 1967. Removing this single year from the time series produces a trend of only −0.8% decade−1. Thus, studies of Sierra snow metrics over the past few decades should be viewed with caution as single years have significant impact on the linear trend.
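The leverage of a single heavy year on a short linear trend can be illustrated with a synthetic series. Nothing below uses the paper's data: the series is flat apart from one artificial spike, and expressing the slope as a percentage of the series mean per decade is an assumption about how such a figure might be computed.

```python
def percent_per_decade(years, values):
    """Least-squares trend expressed as percent of the series mean per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope_per_year = sxy / sxx
    return 1000.0 * slope_per_year / mean_y  # 10 yr per decade times 100%


def trend_without_year(years, values, drop_year):
    """Same trend after removing one year from the series."""
    kept = [(yr, v) for yr, v in zip(years, values) if yr != drop_year]
    return percent_per_decade([yr for yr, _ in kept], [v for _, v in kept])


# Synthetic 50-yr series: flat at 1.0 with one very heavy spring early on.
years = list(range(1962, 2012))
values = [5.0 if yr == 1967 else 1.0 for yr in years]
print(percent_per_decade(years, values) < 0.0)
print(trend_without_year(years, values, 1967))
```

A single outlier near the start of the window is enough to produce a negative slope in an otherwise trendless series, which is the kind of sensitivity the paragraph above describes for spring 1967.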

Fig. 13.

As in Fig. 12, but for March–May only.

For the Sierra Nevada, then, the results regarding particular snow-amount metrics (not timing) as depicted in these shorter-period studies are generally consistent with the present results in noting that the trends are not profoundly of one sign or the other.

8. Conclusions

Combining over 26 000 additional monthly snowfall totals for several stations in California with data available from the NCDC archive, we are able to develop seasonal snowfall totals for 18 regions in the state—some extending back to the season of 1878/79. The robustness of the regional time series increases with the number of stations in a region that have long records of significant snowfall (i.e., >200 cm per season). Regions of low snowfall and few (or no) stations with long time series give incomplete results with low confidence.

A major pitfall in the construction of the regional time series is the fact that many archived station values of snowfall have been erroneously recorded as “zero” when they should have been recorded as “missing.” In an attempt to overcome this significant problem, we examined each station’s frequency of snow reports to discover unrealistic changes. Over 100 of the 500 stations appeared to contain segments in which erroneous increases in zero monthly totals occurred, mostly after 1970. For these, we set the segments to missing. A more thorough examination requires an investigation of each original document with comparisons of adjacent stations to determine when a zero value is appropriate. Indeed, it is recommended that a dedicated project to systematically solve this problem be instituted. As it is, this study likely provides long-term trends that are slightly more negative than in reality.

To create homogeneous and long time series of monthly and seasonal (December–March, November–April, or March–May) regional totals, individual monthly station totals were “normalized” because the distributions are highly skewed and the snowfall magnitudes of individual stations varied greatly. The homogenization methodology was performed on this quasi-normalized station time series to remove station biases before the stations were merged into a single, consensus monthly time series for each region. A “seasonal” total was calculated by linearly combining the monthly normalized values using a reference station’s median quantity as a weight for each month. A snowfall total was then computed by reversing the normalization procedure.

Regional trends ending in the 2010/11 season for the longest and most robust time series were not significantly different from zero, whether starting from their initial year (as early as the season of 1878/79) or over the most recent 50 years, starting in 1961/62. This is partly due to the high interannual variability of this metric, which creates wide significance bands. The four regions that comprise the western slope of the Sierra Nevada provide relatively robust values, so their result of nonsignificance in trends has highest confidence. For other regions, though, the poor reproducibility of the regional value by the individual station values is a major factor. Indeed, for about half of the regions, it is more reasonable to conclude that trends are indeterminate rather than insignificant.

Snowfall is an important climate metric. Utilizing snowfall, especially in regions where its presence is critical for natural and human circumstances, has some advantages. As a phenomenon that is readily visible to the human eye, it is rather easy to detect and measure with proper guidelines (Doesken and Judson 1996). The measuring device, a passive snow stake, has virtually no way to fail or malfunction on its own. This measurement has been performed with little change for over 100 years. Because of its simple measuring method, it offers the potential to be a systematic metric for the study of climate variability and change, and it is important because it relates to precipitation, temperature, and storminess.

Unfortunately, snowfall measurements have not been made with the diligence seen in the late nineteenth and early twentieth centuries, and many (and possibly most) long-term stations are either now defunct or not faithfully reporting the daily values. Efforts such as the Community Collaborative Rain, Hail and Snow Network (CoCoRaHS) are starting to fill the gap with citizen observers. However, long-term stations that form the backbone of our historical network are now being replaced by automated systems without robust snowfall reporting mechanisms so that the number and quality of snowfall measurements will likely continue to diminish significantly. In other cases, the official observers simply do not report daily snowfall consistently. It is recommended, therefore, that those who have influence on the establishment of weather- and climate-observing system requirements insist that the network retain the capability of monitoring the daily snowfall amount as a key, long-term weather and climate variable.

Acknowledgments

Funding for this study was provided through the U.S. Department of Energy DE-SC0005330 and the Alabama Office of the State Climatologist.

REFERENCES

  • Barnett, T. P., and Coauthors, 2008: Human-induced changes in the hydrology of the western United States. Science, 319, 1080–1083, doi:10.1126/science.1152538.
  • Christy, J. R., 2002: When was the hottest summer? A state climatologist struggles for an answer. Bull. Amer. Meteor. Soc., 83, 723–734.
  • Christy, J. R., and J. J. Hnilo, 2010: Changes in snowfall in the southern Sierra Nevada of California since 1916. Energy Environ., 21, 223–234.
  • Doesken, N. J., and A. Judson, 1996: The snow booklet: A guide to the science, climatology, and measurement of snow in the United States. Colorado Climate Center Rep., Department of Atmospheric Science, Colorado State University, 92 pp.
  • Howat, I. M., and S. Tulaczyk, 2005: Climate sensitivity of spring snowpack in the Sierra Nevada. J. Geophys. Res., 110, F04021, doi:10.1029/2005JF000356.
  • Kapnick, S., and A. Hall, 2010: Observed climate–snowpack relationships in California and their implications for the future. J. Climate, 23, 3446–3456.
  • Knowles, N., M. D. Dettinger, and D. R. Cayan, 2006: Trends in snowfall versus rainfall in the western United States. J. Climate, 19, 4545–4559.
  • Kunkel, K. E., M. A. Palecki, K. G. Hubbard, D. A. Robinson, K. T. Redmond, and D. R. Easterling, 2007: Trend identification in twentieth-century U.S. snowfall: The challenges. J. Atmos. Oceanic Technol., 24, 64–73.
  • Mote, P. W., A. F. Hamlet, M. P. Clark, and D. P. Lettenmaier, 2005: Declining mountain snowpack in western North America. Bull. Amer. Meteor. Soc., 86, 39–49.