  • Abatzoglou, J. T., 2013: Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol., 33, 121–131, https://doi.org/10.1002/joc.3413.

  • Barnes, S. L., 1964: A technique for maximizing details in numerical weather map analysis. J. Appl. Meteor., 3, 396–409, https://doi.org/10.1175/1520-0450(1964)003<0396:ATFMDI>2.0.CO;2.

  • Daly, C., 2006: Guidelines for assessing the suitability of spatial climate data sets. Int. J. Climatol., 26, 707–721, https://doi.org/10.1002/joc.1322.

  • Daly, C., R. Neilson, and D. Phillips, 1994: A statistical–topographic model for mapping climatological precipitation over mountainous terrain. J. Appl. Meteor., 33, 140–158, https://doi.org/10.1175/1520-0450(1994)033<0140:ASTMFM>2.0.CO;2.

  • Daly, C., W. P. Gibson, G. H. Taylor, G. L. Johnson, and P. Pasteris, 2002: A knowledge-based approach to the statistical mapping of climate. Climate Res., 22, 99–113, https://doi.org/10.3354/cr022099.

  • Daly, C., W. P. Gibson, G. H. Taylor, M. K. Doggett, and J. I. Smith, 2007: Observer bias in daily precipitation measurements at United States cooperative network stations. Bull. Amer. Meteor. Soc., 88, 899–912, https://doi.org/10.1175/BAMS-88-6-899.

  • Daly, C., M. Halbleib, J. I. Smith, W. P. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris, 2008: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol., 28, 2031–2064, https://doi.org/10.1002/joc.1688.

  • Daly, C., D. R. Conklin, and M. H. Unsworth, 2010: Local atmospheric decoupling in complex topography alters climate change impacts. Int. J. Climatol., 30, 1857–1864, https://doi.org/10.1002/joc.2007.

  • Daly, C., M. P. Widrlechner, M. D. Halbleib, J. I. Smith, and W. P. Gibson, 2012: Development of a new USDA plant hardiness zone map for the United States. J. Appl. Meteor. Climatol., 51, 242–264, https://doi.org/10.1175/2010JAMC2536.1.

  • Daly, C., M. Slater, J. A. Roberti, S. Laseter, and L. Swift, 2017: High-resolution precipitation mapping in a mountainous watershed: Ground truth for evaluating uncertainty in a national precipitation dataset. Int. J. Climatol., 37, 124–137, https://doi.org/10.1002/joc.4986.

  • Di Luzio, M., G. L. Johnson, C. Daly, J. Eischeid, and J. G. Arnold, 2008: Constructing retrospective gridded daily precipitation and temperature datasets for the conterminous United States. J. Appl. Meteor. Climatol., 47, 475–497, https://doi.org/10.1175/2007JAMC1356.1.

  • Ensor, L. A., and S. M. Robeson, 2008: Statistical characteristics of daily precipitation: Comparisons of gridded and point datasets. J. Appl. Meteor. Climatol., 47, 2468–2476, https://doi.org/10.1175/2008JAMC1757.1.

  • Fassnacht, S. R., 2004: Estimating Alter-shielded gauge snowfall undercatch, snowpack sublimation, and blowing snow transport at six sites in the coterminous USA. Hydrol. Processes, 18, 3481–3492, https://doi.org/10.1002/hyp.5806.

  • Fiebrich, C. A., C. R. Morgan, A. G. McCombs, P. K. Hall, and R. A. McPherson, 2010: Quality assurance procedures for mesoscale meteorological data. J. Atmos. Oceanic Technol., 27, 1565–1582, https://doi.org/10.1175/2010JTECHA1433.1.

  • Funk, C., and Coauthors, 2000: The Collaborative Historical African Rainfall Model: Description and evaluation. Int. J. Climatol., 1, 47–66, https://doi.org/10.1002/joc.866.

  • Gervais, M., L. B. Tremblay, J. R. Gyakum, and E. Atallah, 2014: Representing extremes in a daily gridded precipitation analysis over the United States: Impacts of station density, resolution, and gridding methods. J. Climate, 27, 5201–5218, https://doi.org/10.1175/JCLI-D-13-00319.1.

  • Goodison, B. E., P. Y. T. Louie, and D. Yang, 1998: WMO solid precipitation intercomparison. WMO Rep. WMO/TD-872, 212 pp., https://library.wmo.int/index.php?lvl=notice_display&id=6441#.YW7EBhrMKUk.

  • Hamlet, A. F., and D. Lettenmaier, 2005: Production of temporally consistent gridded precipitation and temperature fields for the continental United States. J. Hydrometeor., 6, 330–336, https://doi.org/10.1175/JHM420.1.

  • Hamlet, A. F., and Coauthors, 2010: Final report for the Columbia Basin Climate Change Scenarios Project. University of Washington Climate Impacts Group Rep., 221 pp., http://warm.atmos.washington.edu/2860/.

  • Henn, B., M. P. Clark, D. Kavetski, A. J. Newman, M. Hughes, B. McGurk, and J. D. Lundquist, 2018a: Spatiotemporal patterns of precipitation inferred from streamflow observations across the Sierra Nevada mountain range. J. Hydrol., 556, 993–1012, https://doi.org/10.1016/j.jhydrol.2016.08.009.

  • Henn, B., A. J. Newman, B. Livneh, C. Daly, and J. D. Lundquist, 2018b: An assessment of differences in gridded precipitation datasets in complex terrain. J. Hydrol., 556, 1205–1219, https://doi.org/10.1016/j.jhydrol.2017.03.008.

  • Herrera, S., S. Kotlarski, P. M. M. Soares, R. M. Cardoso, A. Jaczewski, J. M. Gutiérrez, and D. Maraun, 2019: Uncertainty in gridded precipitation products: Influence of station density, interpolation method and grid resolution. Int. J. Climatol., 39, 3717–3729, https://doi.org/10.1002/joc.5878.

  • Hofstra, N., M. New, and C. McSweeney, 2009: The influence of interpolation and station network density on the distributions and trends of climate variables in gridded daily data. Climate Dyn., 35, 841–858, https://doi.org/10.1007/s00382-009-0698-1.

  • Horel, J. D., and Coauthors, 2002: MesoWest: Cooperative mesonets in the western United States. Bull. Amer. Meteor. Soc., 83, 211–225, https://doi.org/10.1175/1520-0477(2002)083<0211:MCMITW>2.3.CO;2.

  • Kochendorfer, J., and Coauthors, 2020: Undercatch adjustments for tipping-bucket gauge measurements of solid precipitation. J. Hydrometeor., 21, 1193–1205, https://doi.org/10.1175/JHM-D-19-0256.1.

  • Lin, Y., and K. E. Mitchell, 2005: The NCEP stage II/IV hourly precipitation analyses: Development and applications. 19th Conf. on Hydrology, San Diego, CA, Amer. Meteor. Soc., 1.2, https://ams.confex.com/ams/Annual2005/techprogram/paper_83847.htm.

  • Livneh, B., T. J. Bohn, D. W. Pierce, F. Munoz-Arriola, B. Nijssen, R. Vose, D. R. Cayan, and L. Brekke, 2015: A spatially comprehensive, hydrometeorological data set for Mexico, the U.S., and southern Canada 1950–2013. Sci. Data, 2, 150042, https://doi.org/10.1038/sdata.2015.42.

  • Lundquist, J. D., M. Hughes, B. Henn, E. D. Gutmann, B. Livneh, J. Dozier, and P. Neiman, 2015: High-elevation precipitation patterns: Using snow measurements to assess daily gridded datasets across the Sierra Nevada, California. J. Hydrometeor., 16, 1773–1792, https://doi.org/10.1175/JHM-D-15-0019.1.

  • Lundquist, J. D., M. Hughes, E. D. Gutmann, and S. Kapnik, 2019: Our skill in modeling mountain rain and snow is bypassing the skill of our observational networks. Bull. Amer. Meteor. Soc., 100, 2473–2490, https://doi.org/10.1175/BAMS-D-19-0001.1.

  • Ma, Y., Y. Zhang, D. Yang, and S. B. Farhan, 2015: Precipitation bias variability versus various gauges under different climatic conditions over the Third Pole Environment (TPE) region. Int. J. Climatol., 35, 1201–1211, https://doi.org/10.1002/joc.4045.

  • Maurer, E. P., A. W. Wood, J. C. Adam, D. P. Lettenmaier, and B. Nijssen, 2002: A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States. J. Climate, 15, 3237–3251, https://doi.org/10.1175/1520-0442(2002)015<3237:ALTHBD>2.0.CO;2.

  • Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily database. J. Atmos. Oceanic Technol., 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1.

  • Merino, A., E. Garcia-Ortega, and A. Navarro Martinez, 2021: Evaluation of gridded rain-gauge-based precipitation datasets: Impact of station density, spatial resolution, altitude gradient and climate. Int. J. Climatol., 41, 3027–3043, https://doi.org/10.1002/joc.7003.

  • Reges, H., N. Doesken, J. Turner, N. Newman, A. Bergantino, and Z. Schwalbe, 2016: CoCoRaHS: The evolution and accomplishments of a volunteer rain gauge network. Bull. Amer. Meteor. Soc., 97, 1831–1846, https://doi.org/10.1175/BAMS-D-14-00213.1.

  • Sevruk, B., and W. R. Hamon, 1984: International comparison of national precipitation gauges with a reference pit gauge. WMO Rep. WMO/TD-38, 20 pp., https://library.wmo.int/pmb_ged/wmo-td_38.pdf.

  • Tang, L., J. Zhang, M. Simpson, A. Arthur, H. Grams, Y. Wang, and C. Langston, 2020: Updates on the radar data quality control in the MRMS quantitative precipitation estimation system. J. Atmos. Oceanic Technol., 37, 1521–1537, https://doi.org/10.1175/JTECH-D-19-0165.1.

  • Thornton, P. E., S. W. Running, and M. A. White, 1997: Generating surfaces of daily meteorological variables over large regions of complex terrain. J. Hydrol., 190, 214–251, https://doi.org/10.1016/S0022-1694(96)03128-9.

  • Timmermans, B., M. Wehner, D. Cooley, T. O’Brien, and H. Krishnan, 2019: An evaluation of the consistency of extremes in gridded precipitation data sets. Climate Dyn., 52, 6651–6670, https://doi.org/10.1007/s00382-018-4537-0.

  • Willmott, C. J., and S. M. Robeson, 1995: Climatologically aided interpolation (CAI) of terrestrial air temperature. Int. J. Climatol., 15, 221–229, https://doi.org/10.1002/joc.3370150207.

  • Xia, Y., and Coauthors, 2012a: Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products. J. Geophys. Res., 117, D03109, https://doi.org/10.1029/2011JD016048.

  • Xia, Y., and Coauthors, 2012b: Continental-scale water and energy flux analysis and validation for North American Land Data Assimilation System project phase 2 (NLDAS-2): 2. Validation of model-simulated streamflow. J. Geophys. Res., 117, D03110, https://doi.org/10.1029/2011JD016051.

  • Yang, D., D. Kane, Z. Zhang, D. Legates, and B. Goodison, 2005: Bias corrections of long-term (1973–2004) daily precipitation data over the northern regions. Geophys. Res. Lett., 32, L19501, https://doi.org/10.1029/2005GL024057.

  • Zhang, J., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) quantitative precipitation estimation: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 621–638, https://doi.org/10.1175/BAMS-D-14-00174.1.

    Example reapportioning of daily precipitation for an off-time station. Initial grids of daily precipitation are interpolated over a period of days with PRISM using on-time stations only. At an off-time station location, a precipitation event of 1 or more days is identified as beginning and ending on days with zero daily precipitation according to both (a) the initial grid, and (b) the station, in this case on days 0 and 5. (c) For this event, daily precipitation values at the off-time station are reapportioned based on the relative daily precipitation from the on-time grid, while maintaining the event total from the off-time station.

    Precipitation analysis for 20 May 2010 in the east-central United States using (a) all observations regardless of observation time and (b) all observations after off-time stations have been adjusted. Off-time stations not adhering to the 1200–1200 UTC day definition recorded significant precipitation from a system approaching from the west a day early as compared with the on-time observations.

    PRISM final monthly total precipitation analysis for February 2020 in the southeastern United States overlain with (a) stations before QC screening, and (b) stations remaining after all QC screening steps were performed. Color fill of station symbols matches that of the map. Many of the stations screened out had relatively low monthly totals.

    Spatial distribution of the standard deviation of gridded daily precipitation over southern Missouri mapped with climatologically aided interpolation (CAI). Standard deviation is expressed as a percent of the average monthly total precipitation over all months in 2020. The interpolation process dampens the day-to-day variability of the gridded precipitation values, resulting in local maxima at station locations and minima between stations. Stations plotted were used in the July 2020 analysis; stations used in other months may vary.

    Western Oregon (a) PRISM daily precipitation analysis for 3 Jan 2021 using CAI, and (b) PRISM January 1981–2010 average precipitation climatology. Orographic enhancement and rain-shadowing patterns produced by the Coast and Cascade Ranges on 3 Jan 2021 are similar to those of the long-term climatology for January and are typical of most storms during January.

    Conterminous U.S. maps showing (a) the green-shaded region east of the Rockies where RAI is applied, (b) daily average RAI weighting factor for July 2010, and (c) daily average RAI weighting factor for December 2010. RAI is most effective in improving the analysis in summer, when precipitation is spotty and convective, and when precipitation is in the form of rainfall rather than snowfall.

    Daily accumulated precipitation on 9 Jul 2013 over Alabama and Georgia modeled with PRISM using (a) CAI and (b) RAI. Station locations are denoted by black dots. The RAI analysis more clearly delineates convective cells that are not resolved by the station data.

Challenges in Observation-Based Mapping of Daily Precipitation across the Conterminous United States

PRISM Climate Group, Northwest Alliance for Computational Science and Engineering, College of Engineering, Oregon State University, Corvallis, Oregon
Open access

Abstract

There is a great need for gridded daily precipitation datasets to support a wide variety of disciplines in science and industry. Production of such datasets faces many challenges, from station data ingest to gridded dataset distribution. The quality of the dataset is directly related to its information content, and each step in the production process provides an opportunity to maximize that content. The first opportunity is maximizing station density from a variety of sources and assuring high quality through intensive screening, including manual review. To accommodate varying data latency times, the Parameter-Elevation Regressions on Independent Slopes Model (PRISM) Climate Group releases eight versions of a day’s precipitation grid, from 24 h after day’s end to 6 months of elapsed time. The second opportunity is to distribute the station data to a grid using methods that add information and minimize the smoothing effect of interpolation. We use two competing methods, one that utilizes the information in long-term precipitation climatologies, and the other using weather radar return patterns. Last, maintaining consistency among different time scales (monthly vs daily) affords the opportunity to exploit information available at each scale. Maintaining temporal consistency over longer time scales is at cross purposes with maximizing information content. We therefore produce two datasets, one that maximizes data sources and a second that includes only networks with long-term stations and no radar (a short-term data source). Further work is under way to improve station metadata, refine interpolation methods by producing climatologies targeted to specific storm conditions, and employ higher-resolution radar products.

Daly’s ORCID: 0000-0002-6566-3556.

© 2021 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Christopher Daly, chris.daly@oregonstate.edu

1. Introduction

There is a great need for spatially interpolated gridded precipitation datasets for a variety of uses in hydrology, agriculture, natural resources modeling, water supply forecasting, and many other domains. Estimates of total precipitation on a daily time step are especially valued, but high spatial variability makes it challenging to develop grids that faithfully represent actual conditions on the ground. A number of gridded daily precipitation datasets have been developed for the conterminous United States (Abatzoglou 2013; Di Luzio et al. 2008; Hamlet and Lettenmaier 2005; Hamlet et al. 2010; Lin and Mitchell 2005; Livneh et al. 2015; Maurer et al. 2002; Thornton et al. 1997; Xia et al. 2012a,b). Many incorporate Parameter-Elevation Regressions on Independent Slopes Model (PRISM; Daly et al. 2008) datasets in one form or another to provide spatial detail in physiographically complex areas (Lundquist et al. 2015). All have strengths and weaknesses, depending on the approach taken and intended applications.

The primary mission of the PRISM Climate Group at Oregon State University is to advance the field of geospatial climatology, which is the study of the spatial and temporal patterns of climate and how Earth’s physiographic features, such as terrain and coastlines, affect those patterns. The practical output of this research is the development of spatial climate products on a regular grid to support modeling and analysis. We have been publishing gridded datasets for precipitation, as well as other climate elements, since the early 1990s using the PRISM climate mapping system (Daly et al. 1994, 2002, 2008). These datasets were produced at a monthly time step until 2011, when daily time step datasets were added to respond to needs in the crop insurance sector. Moving to the daily time step presented new challenges to an already complex operation, from station data ingest to product generation and short latency delivery. While most papers addressing the development and comparison of spatial precipitation datasets have focused primarily on interpolation methods (which is indeed an essential component), generating high-quality daily precipitation datasets requires careful attention to many other steps in what is a lengthy and detailed process. This paper does not focus on comparing one method with another, but rather attempts to highlight some of the challenges our team has faced along the entire product development chain, and briefly describes how we have addressed those issues. Challenges presented here include station density (section 2), data quality control (section 3), spatial interpolation (section 4), temporal consistency (section 5), and operational considerations (section 6). We follow with a discussion of further work on data quality control, interpolation methods, and uncertainty estimation (section 7), and conclude the paper with a summary of the main points (section 8).

2. Challenge: Station density

Station density refers to the number of weather stations with relevant data per unit area on the ground. Increasing station data density from reliable sources is arguably the single most effective way to increase the accuracy of spatial precipitation datasets (e.g., Hofstra et al. 2009). To that end, we have put substantial effort into identifying and ingesting precipitation data from nearly 40 different sources, which include local, state, regional, and national networks that cover the United States and border areas of Canada and Mexico (Fig. 1; Table S1 in the online supplemental material). The largest of these is the Community Collaborative Rain, Hail and Snow Network (CoCoRaHS; Reges et al. 2016), a volunteer network that began in 1998 and has since grown to supply over 10 000 reports each day. The second largest and longest-running network, also volunteer, is the National Weather Service Cooperative Observer Program (COOP). COOP currently provides about 4000 daily observations and is accessed via the National Centers for Environmental Information (NCEI) Global Historical Climatology Network-Daily (GHCN-D) database (Menne et al. 2012). State volunteer networks include those administered by the North Dakota State Water Commission and the Minnesota State Climatology Office (MNGage). All in all, volunteer observations typically account for over 75% of our precipitation data, which underlines the importance of citizen science in helping to quantify daily precipitation patterns across the country.

Fig. 1.

Distribution of precipitation station locations included in the daily grids for the month of July 2020.

Citation: Journal of Atmospheric and Oceanic Technology 38, 11; 10.1175/JTECH-D-21-0054.1

Automated national and regional networks used by PRISM include USDA Forest Service and Bureau of Land Management Remote Automatic Weather Stations (RAWS), the Bureau of Reclamation AgriMet, and the U.S. Climate Reference Network (USCRN). A number of state-level mesonets are ingested, including those in Alabama, Arizona, California, Colorado, Delaware, Florida, Georgia, Kansas, Kentucky, Michigan, Nebraska, New Jersey, North Carolina, North Dakota, Oklahoma, Texas, and Washington. Valuable precipitation data for the mountains of the western United States are acquired from the USDA Natural Resources Conservation Service (NRCS) Snow Telemetry (SNOTEL) network.

We prefer to access station data directly from the providers, so that we can receive complete quality flags, accurate metadata, and edited versions of the data, which are not always transmitted to the user via third-party sources. This also affords the opportunity to communicate directly with the providers and, in some cases, alert them when data quality issues arise. That said, we access the Advanced Hydrologic Prediction Service (AHPS) River Forecast Centers data through the National Weather Service Daily Hydrometeorological Products (HYD) feed to ingest networks for which we have not established direct access such as the Hydrometeorological Automated Data System (HADS) and Automatic Local Evaluation in Real-Time (ALERT) networks. We also access Weather Bureau Army–Navy (WBAN), Environment and Climate Change Canada (ECCC), and Mexican station data from NCEI’s GHCN-D, and we tap selected networks from MesoWest (Horel et al. 2002) and the California Data Exchange Center (CDEC). At present, the total number of stations used in our precipitation mapping each day varies from about 19 000–20 000 in summer to about 15 000–18 000 in winter after screening for appropriate measuring equipment (see section 3 for details).

We follow four steps when adding a new network or data source: 1) set up data delivery methods with the provider, 2) obtain historical data, 3) write ingest scripts to process the data and insert them into a database, and 4) operationalize data delivery and ingest systems. Developing and maintaining station ingest systems for so many disparate data sources requires significant ongoing resources. Each data source has its own data access protocols, data formats, units, and metadata, requiring that ingestion software be tailored to each source individually. However, the data ingest process does have a set of components in common, so our approach has been to store frequently used routines in a central library to be used as needed. Hourly or subhourly data are converted to daily values in common SI units for storage in a PostgreSQL database (https://www.postgresql.org). Data source formats and access protocols can change unexpectedly, requiring us to monitor the retrieval status on a daily basis to ensure that data are being ingested properly, and to make adjustments as needed. Data interruptions, such as transmission issues, power outages, government shutdowns, and server upgrades and maintenance, invariably occur, but most can be handled subsequently and captured in later releases of our precipitation grids (see section 6 for details).
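
The conversion of subdaily reports to daily values in common units can be illustrated with a minimal sketch. The unit table, function names, and record layout below are hypothetical and are not taken from the PRISM ingest library; the sketch assumes that each report has already been assigned to a day by the ingest script.

```python
from collections import defaultdict
from datetime import date

# Hypothetical conversion factors from source units to millimeters.
TO_MM = {"mm": 1.0, "cm": 10.0, "in": 25.4}


def to_mm(value, unit):
    """Convert a precipitation amount to millimeters."""
    return value * TO_MM[unit]


def daily_totals(reports, unit):
    """Sum subdaily reports into daily totals in millimeters.

    `reports` is an iterable of (day, amount) pairs, where `day` is the
    date label already assigned to each report (an assumption of this
    sketch). Returns {day: (total_mm, n_reports)} so that later checks
    can see how many reports contributed to each total.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, amount in reports:
        sums[day] += to_mm(amount, unit)
        counts[day] += 1
    return {day: (round(sums[day], 2), counts[day]) for day in sums}
```

Returning the report count alongside the total is a design choice of the sketch: it keeps the information needed to judge completeness attached to the daily value.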

3. Challenge: Data quality control

Precipitation is difficult to observe accurately, and is subject to a variety of errors, including wind-induced gauge undercatch, freeze-up, evaporative losses, misreporting, and underreporting (Daly et al. 2007; Fassnacht 2004; Fiebrich et al. 2010; Goodison et al. 1998; Kochendorfer et al. 2020; Ma et al. 2015). Not all of these errors are identified and resolved by the quality control (QC) procedures of the data providers themselves. As developers of spatial precipitation analyses using numerous independent data sources, we are often better positioned than individual providers to assess data for spatial consistency. Spatial inconsistencies among stations can also be caused by temporal shifts in the reported data, which can arise when station observations are taken or summarized under different day definitions or reported on the wrong day. Such errors compromise the statistical integrity of the spatial precipitation field by damping extremes and day-to-day variance, altering wet day frequency, etc. Therefore, we attempt to adjust for temporally misaligned data as part of our QC process. The QC steps can be categorized as follows (in the sequence they are executed): network-level screening, single-station QC, temporal adjustments, radar QC, spatial QC, and manual QC review (Fig. 2). Each step is summarized briefly below.

Fig. 2.

Schematic diagram of the quality control process flow. Observations failing single-station QC do not proceed to the next QC steps.


a. Network-level screening

Not all networks measure precipitation effectively in all seasons. Many networks use equipment that is not designed for winter use, such as unheated tipping-bucket gauges, or equipment that is not maintained in winter (e.g., Fiebrich et al. 2010). We do not include such stations in our analysis during the months of October–April (Table S1 in the online supplemental material). Using networks with improper equipment or infrequent maintenance during the winter months typically results in low precipitation readings, creating biases that accumulate over time. These networks can also present timing issues, such as when frozen precipitation melts and falls into the gauge days after a precipitation event has ended. At present, the decision is made on a network-wide basis; station-by-station screening is under development. The network-wide approach may result in the omission of stations that do observe valid data in winter or are in warmer locations where frozen precipitation is not a factor during the October–April time period. However, experience has shown that this protocol results in a significant reduction in the number of suspicious observations that must be examined during manual QC review (see section 3f).
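
The network-wide seasonal screen amounts to a simple lookup, sketched below. The network names and their flags are placeholders invented for illustration; the actual per-network decisions are those listed in Table S1.

```python
from datetime import date

# Months October-April, when networks lacking winter-capable gauges are excluded.
COLD_SEASON_MONTHS = {10, 11, 12, 1, 2, 3, 4}

# Hypothetical flags; the real per-network decisions are given in Table S1.
WINTER_CAPABLE = {"NETWORK_A": True, "NETWORK_B": False}


def network_is_used(network, obs_date):
    """Return True if observations from `network` are used on `obs_date`."""
    if obs_date.month in COLD_SEASON_MONTHS:
        return WINTER_CAPABLE[network]
    return True
```

Because the decision is keyed on the network rather than the station, a station in a warm location is excluded along with the rest of its network, which is the trade-off described above.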

While network-level screening is effective at flagging inappropriate equipment for measuring frozen precipitation, we do not attempt to adjust the remaining stations for gauge undercatch that often occurs even during nonfrozen events. It is not straightforward to do so in an accurate and comprehensive manner, and requires data that are not normally available at gauge sites (Goodison et al. 1998; Sevruk and Hamon 1984; Yang et al. 2005). Adjustments also complicate evaluation exercises that compare grid estimates with independent station data, which are typically unadjusted. However, gauge undercatch can create discrepancies in hydrologic and water balance calculations (Lundquist et al. 2015).

b. Single-station QC

As precipitation observations are ingested into the PRISM station database, checks are made to ensure basic validity and completeness. For example, a daily value fails if it exceeds 115% of the published 24-h record precipitation value for the state (https://www.ncdc.noaa.gov/extremes/scec/records). (Given the marked increases in precipitation intensity that have recently occurred across much of the United States, we will be monitoring whether the 115% threshold needs to be increased to ensure that valid extreme values are maintained.) Similarly, if a station providing data at a subdaily time step is missing more than six hours on a given day, that day’s data fail the check because of the possibility that precipitation may have gone unreported.
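The two daily checks named above might be expressed as follows. This is a sketch under stated assumptions: the function names are hypothetical, the state record is passed in by the caller, and the subdaily station is assumed to report hourly.

```python
def daily_value_passes(value_mm, state_record_24h_mm, factor=1.15):
    """Range check: fail values above 115% of the state 24-h record."""
    return value_mm <= factor * state_record_24h_mm


def subdaily_day_passes(hours_reported, max_missing_hours=6):
    """Completeness check for one 24-h day of hourly reports (assumed hourly).

    The day fails when more than six hours are missing.
    """
    missing_hours = 24 - hours_reported
    return missing_hours <= max_missing_hours
```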

Checks are also made at the monthly time step. The value fails if it exceeds 115% of the world record monthly precipitation value (9300 mm; https://www.weather.gov/owp/hdsc_world_record). If more than two days are missing or invalid in a month, all days within that month fail the check. Experience has shown that the existence of more than two missing days in a month is often a sign that some storms may have been missed, leading to low monthly totals, and that there is a risk that the values reported as daily may actually be multiday accumulations of unknown duration or represent the wrong day. An exception to this logic is when a major precipitation event such as a hurricane occurs, causing a station to go offline after otherwise reporting consistently; such exceptions must be handled during the manual QC review (see section 3f). Observations that fail a single-station QC check normally do not proceed to the subsequent QC steps.
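A minimal sketch of the monthly checks follows. The representation of a month as a list with None for missing or invalid days, and the use of the daily sum as the monthly value, are assumptions made for illustration.

```python
WORLD_RECORD_MONTHLY_MM = 9300.0  # value quoted in the text


def month_passes(daily_values, factor=1.15, max_bad_days=2):
    """daily_values: one entry per day; None marks a missing or invalid day.

    Returns False when more than two days are missing or invalid (all days
    in the month then fail), or when the monthly total exceeds 115% of the
    world record monthly value.
    """
    bad_days = sum(1 for v in daily_values if v is None)
    if bad_days > max_bad_days:
        return False
    # Assumption: the monthly value is taken as the sum of the valid days.
    total = sum(v for v in daily_values if v is not None)
    return total <= factor * WORLD_RECORD_MONTHLY_MM
```

The hurricane-type exception mentioned in the text is deliberately not coded here, since it is handled during manual review.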

c. Temporal adjustments

Our approach to improving temporal consistency is to screen once-per-day station observations for adherence to a fixed day definition. Our “day” is defined as 1200–1200 UTC (e.g., 0700–0700 EST), which is near the observation time of a large majority of the CoCoRaHS and COOP stations and is also consistent with the AHPS “hydrologic day” definition (https://water.weather.gov/precip/about.php). Defining a day in the morning, rather than at midnight local time, minimizes the need to try to modify the precipitation values to correspond to a time when few precipitation observations are actually made. In addition, a universal rather than a local time provides a baseline that is consistent across time zones and allows a more straightforward downscaling to subdaily time steps. Once-per-day observation times must fall within ±4 h of 1200 UTC to be considered “on time.”
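The "on time" test can be sketched as a window check around 1200 UTC. The function name and the decimal-hour input are assumptions for illustration.

```python
def is_on_time(obs_hour_utc, center_hour=12.0, tolerance_h=4.0):
    """True when the observation hour (UTC, 0-24) is within +/-4 h of 1200 UTC."""
    # Circular difference on a 24-h clock, so the check stays valid
    # if a different center hour is ever supplied.
    diff = abs(obs_hour_utc - center_hour) % 24.0
    diff = min(diff, 24.0 - diff)
    return diff <= tolerance_h
```

For example, a 0700 EST observer reports at 1200 UTC and is on time, whereas a local-midnight observer in the eastern United States (0500 UTC) is off time.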

Station observations that fall outside this day definition (termed “off time,” currently 25%–30% of COOP observations) or that are multiday accumulations are processed with an algorithm that estimates on-time station values from off-time values and also disaggregates multiday accumulations. This is done by creating initial daily precipitation grids with PRISM using on-time stations only. For daily values from off-time stations, discrete precipitation events of one day or more are identified (Fig. 3). An event is defined as beginning and ending with zero daily precipitation at both the station and the on-time grid cell containing the station. The total observed station precipitation for that event is then disaggregated to match the relative apportionment of the daily precipitation values for that event on the on-time grid cell, while retaining the event total from the off-time station (Fig. 3). A similar method is used to reapportion multiday accumulations into daily values. Reapportioned daily station values are then appended to the daily precipitation station dataset and used in a second PRISM interpolation run that uses all stations. The second run exhibits improved spatial consistency among stations, although some artifacts can remain because of inaccurate observation times or reporting issues (Fig. 4); these issues are the subject of further work (see section 7a).
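The event-based reapportioning can be illustrated with a short sketch. It assumes two aligned daily series (the off-time station and the on-time grid cell containing it). Events that touch either end of the series, or for which the grid shows no precipitation at all, are left unchanged; that handling is an assumption, since the text does not describe those cases.

```python
def reapportion(station, grid):
    """Redistribute off-time station totals using on-time grid proportions.

    station, grid: equal-length lists of daily precipitation (mm).
    An event is a run of days bounded on both sides by days that are
    zero at BOTH the station and the grid cell.
    """
    n = len(station)
    dry = [s == 0 and g == 0 for s, g in zip(station, grid)]
    out = list(station)
    i = 0
    while i < n:
        if dry[i]:
            i += 1
            continue
        j = i
        while j < n and not dry[j]:
            j += 1
        # Event spans days i..j-1; require a dry day before and after it.
        bounded = i > 0 and j < n
        grid_total = sum(grid[i:j])
        if bounded and grid_total > 0:
            station_total = sum(station[i:j])  # event total is retained
            for d in range(i, j):
                out[d] = station_total * grid[d] / grid_total
        i = j
    return out


station = [0, 0, 10, 20, 0, 0]  # off-time station reports a day late
grid = [0, 5, 10, 0, 0, 0]      # on-time grid cell
print(reapportion(station, grid))  # [0, 10.0, 20.0, 0.0, 0, 0]
```

In this example the 30-mm event total is preserved, but its daily split follows the 1:2 ratio of the on-time grid.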

Fig. 3.

Example reapportioning of daily precipitation for an off-time station. Initial grids of daily precipitation are interpolated over a period of days with PRISM using on-time stations only. At an off-time station location, a precipitation event of 1 or more days is identified as beginning and ending on days with zero daily precipitation according to both (a) the initial grid, and (b) the station, in this case on days 0 and 5. (c) For this event, daily precipitation values at the off-time station are reapportioned based on the relative daily precipitation from the on-time grid, while maintaining the event total from the off-time station.

Fig. 4.

Precipitation analysis for 20 May 2010 in the east-central United States using (a) all observations regardless of observation time and (b) all observations after off-time stations have been adjusted. Off-time stations not adhering to the 1200–1200 UTC day definition recorded significant precipitation from a system approaching from the west a day early as compared with the on-time observations.

d. Radar QC

Radar QC is applied at the daily time step and is designed to screen for false zero values and false “spikes” in the record that cannot be corroborated. It compares the observation with the AHPS stage-4 (ST4) 4-km gridded radar–rain gauge product (Lin and Mitchell 2005). The ST4 analysis is a national radar mosaic that has been locally adjusted with station data (see section 4 for details). Only areas east of 105°W longitude are subject to radar QC; this corresponds to the area of greatest radar coverage with minimal terrain interference. The daily observation fails if the ST4 analysis at or within about 8 km around the station location indicates substantial precipitation (≥2.5 mm) but the station observation shows zero, or vice versa.
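A hedged sketch of the radar comparison follows. How the ST4 cells within about 8 km are combined is not stated in the text; this sketch assumes the conservative reading that all neighborhood cells must agree before a station value fails. The function name and inputs are hypothetical.

```python
def radar_qc_passes(station_mm, st4_neighborhood_mm, lon, threshold_mm=2.5,
                    west_limit_lon=-105.0):
    """Screen a daily value for a false zero or an uncorroborated spike.

    st4_neighborhood_mm: ST4 values at and within about 8 km of the station.
    lon: station longitude in degrees east (negative in the United States).
    """
    if lon <= west_limit_lon:
        return True  # radar QC applies only east of 105 deg W
    # Assumption: every neighborhood cell must contradict the station.
    radar_all_wet = all(v >= threshold_mm for v in st4_neighborhood_mm)
    radar_all_dry = all(v == 0 for v in st4_neighborhood_mm)
    false_zero = station_mm == 0 and radar_all_wet
    false_spike = station_mm >= threshold_mm and radar_all_dry
    return not (false_zero or false_spike)
```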

e. Spatial QC

Spatial QC takes advantage of the PRISM interpolation system to identify observations that are spatial outliers. In a process called jackknife cross validation with replacement, the procedure removes a station observation, uses PRISM to predict a precipitation value at that station’s location, then replaces the station’s observation in the dataset (Daly et al. 2008). The observed and predicted values are then compared. If the difference between the two is larger than can reasonably be expected (see next paragraph), the observation may fail spatial QC. Tests are made to determine confidence in the model estimate; if it is low, the station value may be retained. Confidence is determined by evaluating station density in the area, and how much scatter exists in the regression function between the predictor grid and the station observations (see section 4 for details on PRISM operation). Spatial QC is repeated iteratively over 3–5 cycles, with failed stations removed at each step, until no further stations fail the test. To maximize the ability of PRISM to provide a robust estimate for comparison in what is typically a spatially noisy field, spatial QC is performed at the monthly time step.
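The withhold, predict, and replace loop can be sketched as below. PRISM itself is not reproduced here; a simple inverse-distance-weighted estimate is swapped in as a stand-in interpolator, purely so that the loop is runnable. The confidence tests and the iterative removal of failed stations are omitted.

```python
import math


def idw_predict(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y); a stand-in for PRISM."""
    num = den = 0.0
    for sx, sy, value in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def jackknife_differences(stations):
    """Withhold each station in turn, predict at its location, then restore it.

    stations: list of (x, y, value). Returns observed minus predicted values.
    """
    diffs = []
    for i, (x, y, observed) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]  # station i withheld
        diffs.append(observed - idw_predict(x, y, others))
    return diffs
```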

Since the inherent variability of precipitation increases as the average value increases, we must account for this when evaluating differences between predictions and observations. Using precipitation data over a 30-yr period, linear relationships between mean monthly precipitation and its standard deviation were developed using data from COOP stations. An estimated standard deviation for a given monthly precipitation value can then be derived from these equations. Our approach is to determine if the difference between the prediction and observation exceeds a threshold number of standard deviations. The threshold, currently set to 1.59 standard deviations, was determined through a manual process of finding the approximate dividing line between differences that indicated truly erroneous data, and data that appear to be accurate.
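The threshold test might look like the following sketch. The slope and intercept of the linear relationship are not given in the text, so the caller supplies them and the values used in any example are hypothetical. Evaluating the estimated standard deviation at the predicted value (rather than the observed value) is also an assumption.

```python
def exceeds_spatial_threshold(observed_mm, predicted_mm, slope, intercept,
                              n_sd=1.59):
    """True when |observed - predicted| exceeds n_sd estimated std. deviations.

    slope, intercept: coefficients of the linear fit of standard deviation
    against mean monthly precipitation (hypothetical inputs).
    """
    estimated_sd = intercept + slope * predicted_mm
    return abs(observed_mm - predicted_mm) > n_sd * estimated_sd
```

With hypothetical coefficients of slope 0.5 and intercept 10 mm, a prediction of 100 mm gives an estimated standard deviation of 60 mm, so an observation would need to differ by more than about 95 mm to be a candidate for failure.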

f. Manual QC review

Each month, PRISM analysts visually review the daily and monthly precipitation maps for the prior six months. Using an in-house web application, these maps are examined for outliers and inconsistencies that were not flagged during the automated QC steps described above. Analysts review the precipitation maps in their native form, as well as derived statistics such as anomalies from long-term normals and number of wet days per month, to detect values that may look reasonable on the surface but are actually outside expected norms. Suspicious stations are subjected to a visual time series analysis, where values over the surrounding weeks or months are plotted alongside those of nearby stations to determine if the value in question is a temporal or spatial outlier. The station value fails manual QC if the analyst determines that the observation should not be included in the gridded product, given all available evidence. The observation can be flagged at either the daily or monthly time step; if monthly, all days in the month are automatically flagged and will not be used in that month’s daily mapping process. The analyst can also reinstate a value that had failed a spatial QC step. At the completion of manual QC, each daily and monthly grid is remodeled with PRISM. The new maps are again screened to ensure that all suspicious values have been investigated and rectified. The end result of the QC procedure is a set of what are believed to be spatially and temporally consistent data (Fig. 5).

Fig. 5.

PRISM final monthly total precipitation analysis for February 2020 in the southeastern United States overlain with (a) stations before QC screening, and (b) stations remaining after all QC screening steps were performed. Color fill of station symbols matches that of the map. Many of the stations screened out had relatively low monthly totals.

4. Challenge: Spatial interpolation

Daily precipitation patterns are often highly variable and may contain both zero and large nonzero values in close proximity. By drawing data from stations that are outside the grid cells being estimated, spatial interpolation to a regular grid invariably has a smoothing effect on the field being analyzed (Ensor and Robeson 2008; Gervais et al. 2014). This often leads to overestimating areas of locally low or zero precipitation (i.e., “smearing” of nonzero precipitation into dry areas) and underestimating areas of locally high precipitation (Herrera et al. 2019; Hofstra et al. 2009). These effects are most noticeable at the lowest and highest percentiles, which can compromise extreme value statistics (Merino et al. 2021), but also dampen the overall day-to-day variability of gridded daily precipitation values between stations (Fig. 6). Other than increasing station density, spatial smoothing can be lessened (although not completely eliminated) and the accuracy of the interpolated precipitation field improved, by employing grids of predictor variables that are most highly correlated with local precipitation patterns. Ideal predictor grids are those that add information over and above that provided by the station precipitation data alone.

Fig. 6.

Spatial distribution of the standard deviation of gridded daily precipitation over southern Missouri mapped with climatologically aided interpolation (CAI). Standard deviation is expressed as a percent of the average monthly total precipitation over all months in 2020. The interpolation process dampens the day-to-day variability of the gridded precipitation values, resulting in local maxima at station locations and minima between stations. Stations plotted were used in the July 2020 analysis; stations used in other months may vary.

We have developed two complementary methods to minimize spatial smoothing and improve accuracy. The first uses climatologically aided interpolation (CAI; Daly 2006; Daly et al. 2012; Funk et al. 2000; Hamlet and Lettenmaier 2005; Willmott and Robeson 1995). CAI relies on the assumption that local spatial patterns of the element of interest being interpolated (e.g., daily or monthly precipitation) bear some resemblance to those of an existing grid of long-term climatology of the same, or related, element of interest (called the predictor grid). In our application of CAI, we employ PRISM to create a local linear regression at each grid cell between previously developed PRISM monthly normals (as the x values) and the station data (as the y values). PRISM accounts for variations in the relationship between the predictor grid and the station data caused by physiographic features, such as proximity to moisture sources and rain shadowing by terrain features. This is done by weighting the stations in the regression function by the similarity of the station’s physiography to that of the grid cell’s. CAI that uses mean precipitation climatology as the predictor grid is most useful in the mountainous western United States, where long-term precipitation patterns are often the result of an accumulation of repeatable patterns of orographic amplification and rain shadowing of precipitation as storms interact with terrain features (Fig. 7). Our use of CAI is most effective at the monthly time step, which is then used to inform the daily time series (see section 5a for details).
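The local regression at the heart of CAI can be illustrated with a weighted least-squares fit for a single grid cell. This is a sketch only: the station weights stand in for PRISM's physiographic similarity weighting, whose calculation is not reproduced here, and all names are hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def cai_estimate(normals_at_stations, station_values, weights, normal_at_cell):
    """Predict the cell value from its climatological normal.

    x values are the normals at nearby stations, y values are the station
    observations; weights are assumed to be supplied by the caller.
    """
    slope, intercept = weighted_linear_fit(normals_at_stations, station_values,
                                           weights)
    return intercept + slope * normal_at_cell
```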

Fig. 7.

Western Oregon (a) PRISM daily precipitation analysis for 3 Jan 2021 using CAI, and (b) PRISM January 1981–2010 average precipitation climatology. Orographic enhancement and rain-shadowing patterns produced by the Coast and Cascade Ranges on 3 Jan 2021 are similar to those of the long-term climatology for January and are typical of most storms during January.

The second interpolation method uses radar-aided interpolation (RAI). In this case, the 24-h ST4 4-km gridded radar–rain gauge product from AHPS serves as the predictor grid. The ST4 analysis does not incorporate radar data over the Rocky Mountains and westward, instead using a form of CAI that employs long-term PRISM precipitation averages (https://water.weather.gov/precip/about.php). Since we are already applying CAI with PRISM long-term averages as described above, we use the ST4 product east of the Rockies only, with a narrow graduated zone along the periphery (Fig. 8a). The ST4 analysis is provided at 4-km grid resolution, which is coarser than the native PRISM 800-m operating resolution. Therefore, a modified Barnes (1964) Gaussian filter is used to downscale the ST4 grids to 800 m. ST4 can also be termed an RAI product, because it assimilates station observations into the analysis.
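The general idea of Gaussian distance weighting, as used in Barnes-type analyses, can be sketched as follows. This is a single-pass weighted average and does not include the modifications the authors made, which are not described in the text. The smoothing parameter value is hypothetical.

```python
import math


def gaussian_weighted_value(x_km, y_km, coarse_cells, kappa_km2=16.0):
    """Estimate a fine-grid value from coarse cell centers.

    coarse_cells: list of (x_km, y_km, value) for nearby 4-km cell centers.
    Weights fall off as exp(-r^2 / kappa), where r is the distance to the
    cell center; kappa_km2 is a hypothetical smoothing parameter.
    """
    num = den = 0.0
    for cx, cy, value in coarse_cells:
        r2 = (x_km - cx) ** 2 + (y_km - cy) ** 2
        weight = math.exp(-r2 / kappa_km2)
        num += weight * value
        den += weight
    return num / den
```

Evaluating this function at each 800-m grid point yields a smooth field that passes gradually between the 4-km cell values instead of repeating them in blocks.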

Fig. 8.

Conterminous U.S. maps showing (a) the green-shaded region east of the Rockies where RAI is applied, (b) daily average RAI weighting factor for July 2010, and (c) daily average RAI weighting factor for December 2010. RAI is most effective in improving the analysis in summer, when precipitation is spotty and convective, and when precipitation is in the form of rainfall rather than snowfall.

Once the two grid analyses (CAI and RAI) have been applied on a given day, a “besting” process is conducted on a pixel-by-pixel basis to determine how to weight each analysis in the final product. This involves assessing the predictive skill of each analysis by comparing PRISM’s local regression correlation coefficients between the station precipitation data and each predictor grid (CAI and RAI). However, we cannot use the correlation coefficients directly from the ST4 analysis for besting purposes because that analysis incorporates station data, which would produce artificially high correlations with our station data. Instead, we run a second RAI analysis using the AHPS stage-2 unbiased (ST2un) analysis as the predictor grid (ST2un was recently discontinued; see section 7b for details). ST2un does not have individual station observations incorporated, which makes for a fairer comparison with CAI. Based on the comparison of correlation coefficients between the CAI and ST2un analyses, an RAI weighting factor (0–1) grid is calculated (Fig. 8). The weighting factor is then applied to the original ST4 RAI grid when averaging it with the CAI grid, to form a hybrid estimate for the day. In flat terrain where radar beam blockage is minimal, use of RAI often improves the mapping of daily precipitation relative to CAI. This is especially true in summer, when rainfall patterns are dominated by isolated, convective cells that often fall between weather stations and are not explained well by long-term normal precipitation patterns (Figs. 8b and 9). CAI and RAI are more evenly effective in winter, when precipitation patterns are broader in scale and weather radar returns are less effective at identifying frozen precipitation (Fig. 8c).
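The final blending step for one pixel can be sketched as a weighted average. The text does not give the formula that converts the CAI and ST2un correlation coefficients into the 0–1 weighting factor, so the weight function below is a placeholder invented for illustration, not the authors' method; only the blend itself follows directly from the description.

```python
def placeholder_rai_weight(r_cai, r_st2un):
    """PLACEHOLDER mapping from two correlation coefficients to a 0-1 weight.

    Hypothetical: the share of squared correlation attributed to the radar
    predictor. The actual mapping is not specified in the text.
    """
    cai2, rai2 = r_cai ** 2, r_st2un ** 2
    if cai2 + rai2 == 0.0:
        return 0.5
    return rai2 / (cai2 + rai2)


def hybrid_estimate(cai_mm, st4_rai_mm, rai_weight):
    """Weighted average of the ST4-based RAI value and the CAI value."""
    return rai_weight * st4_rai_mm + (1.0 - rai_weight) * cai_mm
```

Note that the weight is derived from the ST2un-based analysis but applied to the ST4-based grid, which is the separation the paragraph above describes.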

Fig. 9.

Daily accumulated precipitation on 9 Jul 2013 over Alabama and Georgia modeled with PRISM using (a) CAI and (b) RAI. Station locations are denoted by black dots. The RAI analysis more clearly delineates convective cells that are not resolved by the station data.

5. Challenge: Temporal consistency

a. Daily/monthly consistency

In addition to daily runs, PRISM is also run with station data summarized at a monthly time step using CAI and RAI in the same way as described above. Internal consistency requires that the daily grids add up to the monthly grids. This should be a simple matter of summing the daily grids to produce a monthly product, but there is an advantage to first determining which temporal product (daily or monthly) is more robust, and where, and adjusting the other product to match it. In the western United States, precipitation interpolated at the monthly time step better incorporates the repeatability of orographic signatures that can be lost at the daily time step because of timing and data-precision issues and thus has a stronger relationship with the long-term climatology. This is especially useful where SNOTEL stations are involved; the precision of the precipitation data is 2.54 mm (rather than the standard 0.254 mm), which can cause timing issues when precipitation amounts are low. The usefulness of the monthly time step in guiding daily precipitation interpolation has been recognized by other developers (e.g., Di Luzio et al. 2008). Therefore, in the western United States, the PRISM dailies are adjusted to match the monthly accumulations. To match the monthly values at the end of a month, all daily values in a grid cell are increased or decreased by the same ratio. In the case in which a monthly gridcell precipitation value is measurable (≥0.254 mm) but the monthly sum of the daily precipitation values for that grid cell is below measurable but nonzero, precipitation is added to the days for which there was nonzero precipitation until the monthly value is reached. The result is that some of these daily values may now exceed the measurable threshold, thus creating one or more “wet” days with measurable precipitation in that month. 
However, if the daily gridcell values sum to exactly zero (i.e., all days in the month are exactly zero), the daily values remain unchanged and the monthly value is set to zero; this improves the field in cases in which the monthly analysis “smears” precipitation into dry areas.
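For a single grid cell in the western United States, the adjustment of dailies to the monthly value might be sketched as follows. The text does not say how precipitation is distributed among wet days in the below-measurable case; this sketch assumes the same proportional ratio is used there as well.

```python
def match_dailies_to_monthly(dailies_mm, monthly_mm):
    """Scale one cell's daily values so that they sum to the monthly value.

    Returns (adjusted_dailies, adjusted_monthly). When every day is exactly
    zero, the dailies are kept and the monthly value is set to zero.
    """
    daily_sum = sum(dailies_mm)
    if daily_sum == 0:
        return list(dailies_mm), 0.0
    ratio = monthly_mm / daily_sum  # same ratio applied to every day
    return [d * ratio for d in dailies_mm], monthly_mm


print(match_dailies_to_monthly([1.0, 0.0, 3.0], 8.0))  # ([2.0, 0.0, 6.0], 8.0)
```

Because zero days are multiplied by the ratio, they stay at zero, so the adjustment never creates precipitation on a day that was dry in the daily analysis.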

East of the Rockies, in flat terrain where there are few orographic signatures, the smearing of nonzero monthly precipitation totals into areas that should be dry is sometimes seen. This is most noticeable before January 2002, when national radar-based analyses are not available and CAI alone is the sole interpolation method. Therefore, the monthly total precipitation grids east of the Rockies are adjusted to match the sum of the dailies.

In an attempt to treat the resulting daily and monthly gridded precipitation values in a way that is consistent with how precipitation is observed and reported, a final postprocessing step sets any monthly or daily gridcell value that falls below 0.254 mm to zero. Since trace amounts are not measurable quantitatively, they are not included in the grid values.
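The postprocessing step amounts to a simple threshold, sketched here for a list of gridcell values (the function name is hypothetical):

```python
def zero_below_measurable(values_mm, measurable_mm=0.254):
    """Set any value below the measurable threshold (0.254 mm) to zero."""
    return [0.0 if v < measurable_mm else v for v in values_mm]
```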

b. Long-term consistency

Another challenge in mapping precipitation is maintaining temporal consistency over long periods of time. Some users of gridded precipitation datasets want to be able to analyze trends and variations over periods spanning from decades to as long as a century. This goal is, in many ways, at cross purposes with the goal of developing datasets that are as accurate as possible. For example, as discussed above, the use of RAI increases the accuracy of our precipitation datasets significantly, but national radar-based analyses do not begin until 2002. Adding RAI to the CAI-only approach in 2002 may produce nonclimatic shifts in precipitation statistics east of the Rockies, such as wet day frequency and frequency of intense precipitation. Further, station networks are continually changing, and there is no source of highly dense station data that persists over a long time period. Many important station networks we ingest today were established within the past 20–30 years. For example, CoCoRaHS, currently our largest network, was established in 1998 and has grown steadily since then, creating a relatively short period of record and a continually changing data density. It may be possible to extend station periods of record by infilling observations with estimates, but these estimates would not be independent and therefore would be subject to assumptions inherent in the estimation algorithm.

Our approach to partly address the issue of long-term consistency is to develop a second precipitation dataset that incorporates only those networks that have stations with periods of record of at least 20–30 years. These include AgriMet, COOP, WBAN, Mexico, Environment and Climate Change Canada, the H. J. Andrews Long-Term Ecological Research (LTER) site, MNGage, North Dakota State Water Commission, RAWS, SNOTEL, and stations operated by the Western Regional Climate Center. This dataset spans 1895–present and is modeled at a monthly time step only. Detailed descriptions of the two datasets, termed all-network (AN) and long term (LT), are available in our PRISM Datasets document (https://prism.oregonstate.edu/documents/PRISM_datasets.pdf). Note that our public web portal (https://prism.oregonstate.edu) disseminates the AN daily and monthly datasets only. Neither dataset has been subjected to homogenization routines, which attempt to identify and adjust breaks in station time series. For our purposes it is most useful to improve station metadata first before attempting such adjustments (see section 7).

6. Challenge: Operational considerations

The schedule on which daily precipitation data become available and are QCed, mapping is performed, and products are disseminated is a function of data availability, computing resources, and QC scheduling. Our goal is to produce the first release of a daily precipitation grid within 24 h following the end of that day. For example, gridded precipitation for the day ending on 1200 UTC 15 July is normally available by 1200 UTC 16 July. By this time, PRISM modeling and the besting process are done, graphics are produced, and the PRISM public portal (https://prism.oregonstate.edu) and FTP site and web services (https://prism.oregonstate.edu/downloads) are populated. However, this is far earlier than the latency time of many important station data sources and their QC procedures. Therefore, we produce multiple releases of a precipitation grid for a given day until most data have arrived and have been QCed and edited by both the originator’s and our in-house processes; only then can we safely call the grid “final.” On our schedule, a second release of the 15 July grid is done on 20 July. The 5-day window accommodates the arrival times of many COOP observations. Successive releases are then prepared during the middle of each of the following six months, for a total of eight releases. Each release overwrites the previous one. In our example, the precipitation grid for 15 July would not become “final” until the middle of January. Once a grid is final, it is not revisited until a new version of the time series is developed, which typically occurs every few years as methods improve, or as errors are identified. A calendar of release dates is available at https://prism.oregonstate.edu/calendar.
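The eight-release sequence can be laid out with a short sketch. The text says only that later releases fall in "the middle" of each month; taking that to be the 15th is an assumption, and the authoritative dates are those in the published calendar.

```python
from datetime import date, timedelta


def release_dates(day_end, mid_month_day=15):
    """Approximate release dates for the grid of the day ending on day_end.

    Release 1 is one day later, release 2 five days later, and releases 3-8
    fall in the middle (assumed: the 15th) of the following six months.
    """
    releases = [day_end + timedelta(days=1), day_end + timedelta(days=5)]
    year, month = day_end.year, day_end.month
    for _ in range(6):
        month += 1
        if month > 12:
            month, year = 1, year + 1
        releases.append(date(year, month, mid_month_day))
    return releases


for n, d in enumerate(release_dates(date(2021, 7, 15)), start=1):
    print(n, d.isoformat())
```

For the 15 July example, the sequence runs from 16 July through the middle of the following January, at which point the grid is considered final.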

The native resolution of the PRISM grids is 30 arc s (~800 m). These high-resolution time series datasets amount to several terabytes in size, which often exceeds the capacity of users to download and manipulate what can be thousands of daily grids. The public website receives about a million grid downloads each month, which also puts pressure on our ability to provide the necessary bandwidth for such large downloads. To resolve those issues, we filter the 800-m version of the AN daily and monthly grids to 2.5 arc min (~4 km) resolution for distribution on the public web portal. The native 800-m grids are available to users but are typically delivered on portable hard drives, and a fee is charged to cover processing and delivery costs. The LT dataset is available only at 800-m resolution for a fee. The monthly long-term normals (currently 1981–2010, moving to 1991–2020 in late 2021) are available on the public portal at both 800-m and 4-km resolutions.

7. Further work

a. Station data quality

In this paper, we advocate for ingesting station data from a wide range of providers and sources. However, there is a trade-off between station data density and quality that must be considered when deciding whether to include a new data source. PRISM datasets are used in many applications, including some that involve litigation. Their quality must therefore be very high and our methods well documented. To include a data source, it is useful to have knowledge of the measuring equipment and its siting and maintenance, as well as observational protocols. This has, for example, kept us from using precipitation data from the thousands of personal weather stations across the country, for which we do not have such information. However, early results from an in-house study indicate that our QC process may be sufficient to screen out the poor quality data from these networks. We also continue to learn lessons about precipitation gauge and shielding suitability at our field sites within the H. J. Andrews Experimental Forest in the Oregon Cascades, which spans a range of conditions from mainly rainfall at lower elevations to heavy winter snowfalls at higher elevations (Daly et al. 2010).

We subject the station data to a lengthy series of QC steps that include temporal adjustments, equipment screening, range checks, radar-based checks for zeros and spikes, spatial consistency screening using the PRISM interpolation algorithm, and a manual QC review by expert analysts. However, there still is much room for improving decision-making and adding still more sources of information. The use of machine learning presents a promising opportunity to improve QC decision-making. We have a large set of observations that have been QCed by expert analysts that could serve as a training dataset for these algorithms. Remotely sensed data, such as cloud cover, could at a minimum be used to screen out nonzero precipitation observations in areas with no cloud cover in the past 24 h.

One underappreciated QC issue is how to improve station metadata, such as location coordinates and the observation times of once-per-day observers. Attention to station metadata can be tedious but may be more important than the quality of the data themselves. Incorrect observation times or reporting on the wrong day can result in temporal discrepancies in the data that can dampen precipitation extremes by decreasing the spatial autocorrelation among adjacent station measurements (Timmermans et al. 2019) and cause long-term temporal inhomogeneities in the station data. Our efforts to minimize temporal inconsistencies have relied on published observation times, but results suggest that the observation times reported by observers are not always accurate. We are currently conducting a multiyear effort to improve observation time information for COOP stations dating back to the 1800s. Imprecise or inaccurate station locations also lower the quality of the interpolated field. In a study in the southern Appalachians, we found that station mislocation was the largest source of uncertainty in the precipitation interpolation process (Daly et al. 2017). Location metadata are not always consistent among different sources of the same data. The precision to which they are given also varies and is often poor for older stations that predate accurate location methods. For some networks we have used Google Earth satellite imagery to locate stations, but this method is only useful for observations recorded in more recent years.

b. Interpolation methods

Work is also under way to improve our CAI and RAI methods. Our current application of CAI uses monthly climatologies that reflect precipitation totals summed over all days in each month, averaged over a number of years. Long-term means are inherently powerful predictors, because storm configurations (e.g., upper-level flow directions) that produce the largest contribution to the average monthly total precipitation, either because they are wetter per storm or occur more frequently (or both), are weighted more highly than those producing little precipitation or those that occur infrequently. However, storm patterns that deviate from the mean condition do occur and can have important water balance implications (Lundquist et al. 2015). To better represent the range of potential storm conditions, we are using PRISM to develop a set of conditional CAI (CCAI) climatological grids that represent mean total precipitation for a subset of days in each month that have a variety of 700-hPa flow directions. In operation, the CCAI grid that best matches the day’s local flow direction would be used as the predictor grid, rather than the climatology that includes all days in the month. Mesoscale numerical models may provide an important source of information for differentiating these orographic signatures where station data are lacking, such as in remote mountainous regions (Lundquist et al. 2019). Such models, if applied operationally at appropriate grid resolutions and extents, could provide an additional predictor grid in the PRISM interpolation process, and ultimately be added to our CAI/RAI besting process as a third alternative. Over longer time scales, measurements of streamflow also provide valuable information on average areal precipitation in watersheds where observations are sparse (Henn et al. 2018a,b).

The ST2un analysis used to weight the RAI analyses in the besting process was discontinued in July 2020. At that time, we replaced ST2un with the 24-h radar-only Multi-Radar Multi-Sensor (MRMS; Zhang et al. 2016) product from the National Centers for Environmental Prediction for the period 1200–1200 UTC. We have been archiving MRMS products since late 2014 in anticipation of eventually also replacing the AHPS ST4 product with MRMS. MRMS grids are provided at a higher resolution than the AHPS grids (~1 vs ~4 km), and thus reveal more detailed precipitation patterns. However, the higher resolution also reveals visible artifacts in the radar fields that need to be resolved if we are to use MRMS as predictor grids in our RAI mapping. Much work has been done to identify and remove many of these artifacts (Tang et al. 2020), but some important artifacts remain. One type is “washboarding” in the fields (Zhang et al. 2016), which appears to be caused by oversampling fast-moving systems at 2-min intervals; that is, more frequently than the typical radar refresh rate. This causes some radar field snapshots to be repeated and others skipped in an irregular sequence. Another type of artifact is linear ray-shaped gaps in the field, created by radar beam blockage by obstructions, that have been partly, but not completely, rectified (Zhang et al. 2016). We have been developing methods to minimize washboarding that will be reported in a future paper.

c. Uncertainty estimation

Our ability to provide accurate and defensible estimates of the uncertainty in the daily precipitation products remains limited. We have attempted to ascribe regional uncertainties to our long-term monthly average datasets using cross-validation exercises (as described in section 3e) and model-generated regression prediction intervals (as described in Daly et al. 2008). These two methods are not strictly comparable; cross validation provides information only at station locations, while the model prediction interval is given at each grid cell. However, these methods produced similar results when aggregated to regional scales (Daly et al. 2008). Both have significant weaknesses: cross-validation errors can appear misleadingly small when low-density station datasets are used in the evaluation (e.g., Daly 2006), and the accuracy of the prediction interval depends on the veracity of the model-based assumptions (which is partly what we are trying to evaluate in the first place). Another approach we have taken more recently is to use a local network of rain gauges that is sufficiently dense so as to closely approximate “ground truth,” and perform evaluation exercises to estimate the sources and magnitudes of uncertainties in the PRISM 800-m national datasets (Daly et al. 2017). We are in the process of identifying other such networks to help gain further insights into the true uncertainties of our datasets, and in the future be able to generalize the results across larger regions. Evaluating differences between gauge-based estimates and other independent sources, such as numerical model-based estimates, also represents a promising path to assessing and quantifying uncertainties.
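
The station-based cross-validation idea can be sketched as a leave-one-out exercise: each station is withheld in turn, its value is estimated from the remaining stations, and the difference is recorded. The Python sketch below uses inverse-distance weighting purely as a stand-in interpolator; it is not the PRISM model, and the function names and the `power` parameter are assumptions for illustration.

```python
import numpy as np

def idw_estimate(xy_known, values, xy_target, power=2.0):
    """Inverse-distance-weighted estimate at one target point."""
    d = np.hypot(*(xy_known - xy_target).T)
    if np.any(d == 0):
        # Target coincides with a known station: return its value directly.
        return float(values[d == 0].mean())
    w = 1.0 / d**power
    return float(np.sum(w * values) / np.sum(w))

def loo_cross_validation(xy, values, power=2.0):
    """Leave-one-out errors (estimate minus observation) at each station."""
    n = len(values)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # withhold station i
        errors[i] = idw_estimate(xy[keep], values[keep], xy[i], power) - values[i]
    return errors
```

Because the errors are defined only where stations exist, a sparse network says little about the unsampled terrain between stations, which is one reason such statistics can appear misleadingly small.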

8. Summary

This paper has attempted to highlight some of the challenges faced by groups developing daily gridded precipitation datasets from surface observations on an operational basis, using examples of some of the issues we have been confronting throughout the product development chain. We have found that generating spatial precipitation datasets of high quality requires careful attention to many steps in a lengthy and detailed process. For the purposes of discussion, we grouped the challenges into five categories that roughly follow the production sequence: station density, data quality control, spatial interpolation, temporal consistency, and operational considerations.

The main takeaway from this discussion is that the quality of the product is directly related to its information content, and that each step in the process provides an opportunity to maximize that content. The most basic opportunity is to maximize station observation density and quality, which includes finding and accessing data from as many different sources as possible and conducting extensive quality screening. The latency of many important station data sources may be weeks to months, however, so in the interest of both timeliness and completeness, we release eight versions of a day’s gridded precipitation data: the first 24 h after the day’s end and the last after 6 months have elapsed. The second opportunity to maximize content is to distribute the station data to a grid in a way that adds information to each grid cell and minimizes the smoothing effect of the interpolation process. We currently use two competing precipitation interpolation methods, each based on a different source of information, and each best suited to different situations. CAI uses the spatial precipitation patterns embedded in a long-term climatology to guide the interpolation, while RAI uses radar-based products to provide the spatial precipitation patterns for that day. Also, maintaining consistency among different time scales (i.e., monthly vs daily) affords the opportunity to exploit information available at each scale. We have found that monthly time step interpolation provides more information than does daily in the western United States where persistent orographic patterns predominate, while the opposite is true in the east. Maintaining temporal consistency over longer (i.e., from decadal to century) time scales is unfortunately at cross purposes with maximizing the information content of a dataset, because only long-term data sources can be considered. Our approach has been to produce two datasets, one that maximizes information content (AN) and a second that includes only networks that maintain stations with long periods of record (LT). Neither dataset has been subjected to statistical homogenization routines, which attempt to identify and adjust breaks in station time series; our priority is to improve station metadata first before attempting such adjustments.
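
The idea shared by CAI and RAI, in which a predictor grid supplies the spatial pattern while station observations supply the day's magnitudes, can be sketched with a single linear regression. This is a deliberate simplification and not the PRISM algorithm, which fits locally weighted regressions for individual grid cells; the function name and its inputs are hypothetical.

```python
import numpy as np

def predictor_aided_interpolation(predictor_at_stations, station_daily, predictor_grid):
    """Map daily precipitation by regressing station values on a predictor grid.

    predictor_at_stations: predictor values (e.g., climatological precipitation)
        sampled at the station locations.
    station_daily: observed daily precipitation at the same stations.
    predictor_grid: 2D array of the predictor over the whole domain.
    """
    # One global least-squares line; a local, weighted fit is assumed away here.
    slope, intercept = np.polyfit(predictor_at_stations, station_daily, deg=1)
    estimate = slope * np.asarray(predictor_grid, dtype=float) + intercept
    # Precipitation cannot be negative, so clip the regression output at zero.
    return np.clip(estimate, 0.0, None)
```

Passing a long-term climatology as the predictor corresponds loosely to CAI, while passing a radar-based daily grid corresponds loosely to RAI; in both cases the predictor contributes spatial detail that the stations alone cannot resolve.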

Precipitation is just one of several climate elements for which we and others produce gridded datasets, such as temperature, dewpoint, vapor pressure deficit, and variables derived from these elements. While each element presents its own unique problems, challenges faced in the development process—station density, data quality control, spatial interpolation, temporal consistency, and operational considerations—apply to all, and each step in the process also provides an opportunity to maximize information content in the gridded product.

Acknowledgments

We thank David Rupp and three anonymous reviewers for their reviews of the paper. The authors acknowledge the thousands of volunteers who collect precipitation observations every day and the operators of mesonets and other national, regional, and state networks, who are dedicated to keeping their systems running regardless of the weather. We also acknowledge those who develop and maintain radar-based precipitation analyses, which are essential to improving the detail and accuracy of spatial precipitation datasets. All authors were supported by the U.S. Department of Agriculture Risk Management Agency, under Cooperative Agreement 2019-2363. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the U.S. Department of Agriculture.

Data availability statement

Daily and monthly time series precipitation grids from the AN dataset at 4-km resolution, and long-term normals at 4-km and 800-m resolutions, are available online (https://prism.oregonstate.edu). These datasets may be reused and disseminated freely, with visible attribution requested (https://prism.oregonstate.edu/terms). Daily and monthly time series precipitation grids from the AN dataset at 800-m resolution and the 800-m-resolution-only monthly LT dataset are available for a fee, with reuse and dissemination restrictions. Details are available online (https://prism.oregonstate.edu/orders). Station data are available from the sources listed in Table S1 in the online supplemental material.

REFERENCES

  • Abatzoglou, J. T., 2013: Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol., 33, 121–131, https://doi.org/10.1002/joc.3413.

  • Barnes, S. L., 1964: A technique for maximizing details in numerical weather map analysis. J. Appl. Meteor., 3, 396–409, https://doi.org/10.1175/1520-0450(1964)003<0396:ATFMDI>2.0.CO;2.

  • Daly, C., 2006: Guidelines for assessing the suitability of spatial climate data sets. Int. J. Climatol., 26, 707–721, https://doi.org/10.1002/joc.1322.

  • Daly, C., R. Neilson, and D. Phillips, 1994: A statistical–topographic model for mapping climatological precipitation over mountainous terrain. J. Appl. Meteor., 33, 140–158, https://doi.org/10.1175/1520-0450(1994)033<0140:ASTMFM>2.0.CO;2.

  • Daly, C., W. P. Gibson, G. H. Taylor, G. L. Johnson, and P. Pasteris, 2002: A knowledge-based approach to the statistical mapping of climate. Climate Res., 22, 99–113, https://doi.org/10.3354/cr022099.

  • Daly, C., W. P. Gibson, G. H. Taylor, M. K. Doggett, and J. I. Smith, 2007: Observer bias in daily precipitation measurements at United States cooperative network stations. Bull. Amer. Meteor. Soc., 88, 899–912, https://doi.org/10.1175/BAMS-88-6-899.

  • Daly, C., M. Halbleib, J. I. Smith, W. P. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris, 2008: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol., 28, 2031–2064, https://doi.org/10.1002/joc.1688.

  • Daly, C., D. R. Conklin, and M. H. Unsworth, 2010: Local atmospheric decoupling in complex topography alters climate change impacts. Int. J. Climatol., 30, 1857–1864, https://doi.org/10.1002/joc.2007.

  • Daly, C., M. P. Widrlechner, M. D. Halbleib, J. I. Smith, and W. P. Gibson, 2012: Development of a new USDA plant hardiness zone map for the United States. J. Appl. Meteor. Climatol., 51, 242–264, https://doi.org/10.1175/2010JAMC2536.1.

  • Daly, C., M. Slater, J. A. Roberti, S. Laseter, and L. Swift, 2017: High-resolution precipitation mapping in a mountainous watershed: Ground truth for evaluating uncertainty in a national precipitation dataset. Int. J. Climatol., 37, 124–137, https://doi.org/10.1002/joc.4986.

  • Di Luzio, M., G. L. Johnson, C. Daly, J. Eischeid, and J. G. Arnold, 2008: Constructing retrospective gridded daily precipitation and temperature datasets for the conterminous United States. J. Appl. Meteor. Climatol., 47, 475–497, https://doi.org/10.1175/2007JAMC1356.1.

  • Ensor, L. A., and S. M. Robeson, 2008: Statistical characteristics of daily precipitation: Comparisons of gridded and point datasets. J. Appl. Meteor. Climatol., 47, 2468–2476, https://doi.org/10.1175/2008JAMC1757.1.

  • Fassnacht, S. R., 2004: Estimating Alter-shielded gauge snowfall undercatch, snowpack sublimation, and blowing snow transport at six sites in the coterminous USA. Hydrol. Processes, 18, 3481–3492, https://doi.org/10.1002/hyp.5806.

  • Fiebrich, C. A., C. R. Morgan, A. G. McCombs, P. K. Hall, and R. A. McPherson, 2010: Quality assurance procedures for mesoscale meteorological data. J. Atmos. Oceanic Technol., 27, 1565–1582, https://doi.org/10.1175/2010JTECHA1433.1.

  • Funk, C., and Coauthors, 2000: The Collaborative Historical African Rainfall Model: Description and evaluation. Int. J. Climatol., 1, 47–66, https://doi.org/10.1002/joc.866.

  • Gervais, M., L. B. Tremblay, J. R. Gyakum, and E. Atallah, 2014: Representing extremes in a daily gridded precipitation analysis over the United States: Impacts of station density, resolution, and gridding methods. J. Climate, 27, 5201–5218, https://doi.org/10.1175/JCLI-D-13-00319.1.

  • Goodison, B. E., P. Y. T. Louie, and D. Yang, 1998: WMO solid precipitation intercomparison. WMO Rep. WMO/TD-872, 212 pp., https://library.wmo.int/index.php?lvl=notice_display&id=6441#.YW7EBhrMKUk.

  • Hamlet, A. F., and D. Lettenmaier, 2005: Production of temporally consistent gridded precipitation and temperature fields for the continental United States. J. Hydrometeor., 6, 330–336, https://doi.org/10.1175/JHM420.1.

  • Hamlet, A. F., and Coauthors, 2010: Final report for the Columbia Basin Climate Change Scenarios Project. University of Washington Climate Impacts Group Rep., 221 pp., http://warm.atmos.washington.edu/2860/.

  • Henn, B., M. P. Clark, D. Kavetski, A. J. Newman, M. Hughes, B. McGurk, and J. D. Lundquist, 2018a: Spatiotemporal patterns of precipitation inferred from streamflow observations across the Sierra Nevada mountain range. J. Hydrol., 556, 993–1012, https://doi.org/10.1016/j.jhydrol.2016.08.009.

  • Henn, B., A. J. Newman, B. Livneh, C. Daly, and J. D. Lundquist, 2018b: An assessment of differences in gridded precipitation datasets in complex terrain. J. Hydrol., 556, 1205–1219, https://doi.org/10.1016/j.jhydrol.2017.03.008.

  • Herrera, S., S. Kotlarski, P. M. M. Soares, R. M. Cardoso, A. Jaczewski, J. M. Gutiérrez, and D. Maraun, 2019: Uncertainty in gridded precipitation products: Influence of station density, interpolation method and grid resolution. Int. J. Climatol., 39, 3717–3729, https://doi.org/10.1002/joc.5878.

  • Hofstra, N., M. New, and C. McSweeney, 2009: The influence of interpolation and station network density on the distributions and trends of climate variables in gridded daily data. Climate Dyn., 35, 841–858, https://doi.org/10.1007/s00382-009-0698-1.

  • Horel, J. D., and Coauthors, 2002: MesoWest: Cooperative mesonets in the western United States. Bull. Amer. Meteor. Soc., 83, 211–225, https://doi.org/10.1175/1520-0477(2002)083<0211:MCMITW>2.3.CO;2.

  • Kochendorfer, J., and Coauthors, 2020: Undercatch adjustments for tipping-bucket gauge measurements of solid precipitation. J. Hydrometeor., 21, 1193–1205, https://doi.org/10.1175/JHM-D-19-0256.1.

  • Lin, Y., and K. E. Mitchell, 2005: The NCEP stage II/IV hourly precipitation analyses: Development and applications. 19th Conf. on Hydrology, San Diego, CA, Amer. Meteor. Soc., 1.2, https://ams.confex.com/ams/Annual2005/techprogram/paper_83847.htm.

  • Livneh, B., T. J. Bohn, D. W. Pierce, F. Munoz-Arriola, B. Nijssen, R. Vose, D. R. Cayan, and L. Brekke, 2015: A spatially comprehensive, hydrometeorological data set for Mexico, the U.S., and southern Canada 1950–2013. Sci. Data, 2, 150042, https://doi.org/10.1038/sdata.2015.42.

  • Lundquist, J. D., M. Hughes, B. Henn, E. D. Gutmann, B. Livneh, J. Dozier, and P. Neiman, 2015: High-elevation precipitation patterns: Using snow measurements to assess daily gridded datasets across the Sierra Nevada, California. J. Hydrometeor., 16, 1773–1792, https://doi.org/10.1175/JHM-D-15-0019.1.

  • Lundquist, J. D., M. Hughes, E. D. Gutmann, and S. Kapnik, 2019: Our skill in modeling mountain rain and snow is bypassing the skill of our observational networks. Bull. Amer. Meteor. Soc., 100, 2473–2490, https://doi.org/10.1175/BAMS-D-19-0001.1.

  • Ma, Y., Y. Zhang, D. Yang, and S. B. Farhan, 2015: Precipitation bias variability versus various gauges under different climatic conditions over the Third Pole Environment (TPE) region. Int. J. Climatol., 35, 1201–1211, https://doi.org/10.1002/joc.4045.

  • Maurer, E. P., A. W. Wood, J. C. Adam, D. P. Lettenmaier, and B. Nijssen, 2002: A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States. J. Climate, 15, 3237–3251, https://doi.org/10.1175/1520-0442(2002)015<3237:ALTHBD>2.0.CO;2.

  • Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily database. J. Atmos. Oceanic Technol., 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1.

  • Merino, A., E. Garcia-Ortega, and A. Navarro Martinez, 2021: Evaluation of gridded rain-gauge-based precipitation datasets: Impact of station density, spatial resolution, altitude gradient and climate. Int. J. Climatol., 41, 3027–3043, https://doi.org/10.1002/joc.7003.

  • Reges, H., N. Doesken, J. Turner, N. Newman, A. Bergantino, and Z. Schwalbe, 2016: CoCoRaHS: The evolution and accomplishments of a volunteer rain gauge network. Bull. Amer. Meteor. Soc., 97, 1831–1846, https://doi.org/10.1175/BAMS-D-14-00213.1.

  • Sevruk, B., and W. R. Hamon, 1984: International comparison of national precipitation gauges with a reference pit gauge. WMO Rep. WMO/TD-38, 20 pp., https://library.wmo.int/pmb_ged/wmo-td_38.pdf.

  • Tang, L., J. Zhang, M. Simpson, A. Arthur, H. Grams, Y. Wang, and C. Langston, 2020: Updates on the radar data quality control in the MRMS quantitative precipitation estimation system. J. Atmos. Oceanic Technol., 37, 1521–1537, https://doi.org/10.1175/JTECH-D-19-0165.1.

  • Thornton, P. E., S. W. Running, and M. A. White, 1997: Generating surfaces of daily meteorological variables over large regions of complex terrain. J. Hydrol., 190, 214–251, https://doi.org/10.1016/S0022-1694(96)03128-9.

  • Timmermans, B., M. Wehner, D. Cooley, T. O’Brien, and H. Krishnan, 2019: An evaluation of the consistency of extremes in gridded precipitation data sets. Climate Dyn., 52, 6651–6670, https://doi.org/10.1007/s00382-018-4537-0.

  • Willmott, C. J., and S. M. Robeson, 1995: Climatologically aided interpolation (CAI) of terrestrial air temperature. Int. J. Climatol., 15, 221–229, https://doi.org/10.1002/joc.3370150207.

  • Xia, Y., and Coauthors, 2012a: Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products. J. Geophys. Res., 117, D03109, https://doi.org/10.1029/2011JD016048.

  • Xia, Y., and Coauthors, 2012b: Continental-scale water and energy flux analysis and validation for North American Land Data Assimilation System project phase 2 (NLDAS-2): 2. Validation of model-simulated streamflow. J. Geophys. Res., 117, D03110, https://doi.org/10.1029/2011JD016051.

  • Yang, D., D. Kane, Z. Zhang, D. Legates, and B. Goodison, 2005: Bias corrections of long-term (1973–2004) daily precipitation data over the northern regions. Geophys. Res. Lett., 32, L19501, https://doi.org/10.1029/2005GL024057.

  • Zhang, J., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) quantitative precipitation estimation: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 621–638, https://doi.org/10.1175/BAMS-D-14-00174.1.

