1. Introduction and background
Digitized records of daily maximum and minimum temperature (TMAX and TMIN) and daily total precipitation (PCP) have been available for some time in electronic form from the National Climatic Data Center’s (NCDC) Cooperative Observer (Co-op) network for the period from 1948 to a few months prior to the present time. These data have been gridded to produce long-term daily time step forcings for hydrologic models (e.g., Maurer et al. 2002) that in turn have been used in land surface water and energy balance studies (e.g., Maurer et al. 2001; Roads et al. 2003). These data have recently been extended in electronic form to the beginning of the archival record (NCDC DSI-3206 product), resulting in the potential to produce relatively high resolution gridded datasets for most of the last century. Unfortunately, gridding methods like those described by Maurer et al. (2002), or variations used by Cosgrove et al. (2003), are not generally appropriate for long-term trend analysis because of changes in stations and station locations over time. In the western United States, for instance, changes in station locations imply changes in the representation of topographic effects on precipitation and temperature and can result in spurious trends in precipitation and temperature, and in any variables derived from them.
The motivation for this study came from problems initially encountered in attempting to use the data-processing methods developed by Maurer et al. (2002) to produce longer precipitation and temperature records. Despite good results for recent decades, our first attempts to create longer datasets back to 1915 using the DSI-3206 datasets revealed serious problems with temporal inhomogeneities. Simulated streamflows prior to 1950, for example, produced by essentially the same hydrologic models used by Maurer et al. (2002) were often strongly biased in comparison with those after 1950 (see, e.g., Fig. 2), and it was clear that long-term trends in simulated hydrologic variables would be strongly influenced (if not dominated) by the temporal inhomogeneities in the driving data.
Several issues must be considered in producing gridded meteorological datasets for hydrologic modeling studies. These issues include the following:
quality control of the raw station data (including filling of voids, flagging of implausible values, corrections for measurement error, or adjustments for temporal inconsistencies in the data),
criteria for accepting or rejecting a particular set of station records for use as input to the gridding process,
choice of gridding techniques,
techniques for adjusting precipitation and temperature for topographic variations not present in the raw data,
and verification and evaluation of the final gridded datasets.
We address some of these specific issues in this paper; however, for the most part we take as our starting point the data-processing techniques developed by Maurer et al. (2002). These specific techniques, although not necessarily optimal in every regard, have been found to produce high quality macroscale hydrologic simulations from 1950 to 2000 over the continental United States (Maurer et al. 2002). On the basis of these consistently good results, we argue that these specific data-processing methods are an appropriate and well-tested foundation for the research described here. There is no loss of generality in this choice, however, because the methods for making temporal adjustments to gridded datasets that we develop here can be applied equally well to other gridded datasets produced by other procedures.
This technical note describes data-processing methods to produce temporally and topographically adjusted meteorological driving data for hydrologic models that can be used to produce derived variables (e.g., soil moisture, snow water equivalent, runoff, and streamflow, among others) that are amenable to long-term trend analysis. The methods we have developed are efficient in the sense that they do not require detailed station metadata (which are frequently unavailable) or labor-intensive, station-by-station corrections. The temporal adjustment procedures are intended to maintain as much spatial information from the relatively high density Co-op station network as possible, while adjusting the time series characteristics of the gridded data so that they have long-term trends consistent with gridded datasets derived from a smaller number of carefully quality controlled stations from the U.S. Historical Climatology Network (HCN; Karl et al. 1990) and the Historical Canadian Climate Database (HCCD; Mekis and Hogg 1999; Vincent and Gullett 1999). The HCN and HCCD data archives are composed of long, continuous station records that have been corrected for changes in station location, instrumentation, time of observation, land use, etc.
2. Data-processing methods
a. Sources of data
Three primary sources of meteorological data are used in the data-processing sequence. The first is the daily time step NCDC Co-op data, supplemented by station data from Environment Canada (ECAN) for southern British Columbia in the Pacific Northwest (PNW). The second is monthly time step HCN and HCCD data, which are used as a “standard” in making temporal adjustments to the gridded Co-op data. The third is the set of monthly precipitation maps produced by the Precipitation-elevation Regressions on Independent Slopes Model (PRISM) of Daly et al. (1994), which are used for topographic adjustments to the precipitation data.
b. Preprocessing, quality control, and gridding
Raw Co-op station data for TMAX, TMIN, and PCP from 1915 to 2003 were first extracted from NCDC archives. For quality control purposes, upper limits were set on daily precipitation based on approximate climatological limits (from regional observations). Values less than zero or above the upper limit were removed and replaced with a missing data flag (−99). The same was done for TMAX and TMIN for values less than −50°C or greater than 55°C. These simple quality control checks were primarily intended to screen out implausible values due to, for instance, data entry errors or misinterpretation of data fields in written records. Stations that did not have at least 365 consecutive days of data with no gaps, or did not have at least 1825 total days (about 5 yr) of data, were also removed. This procedure typically retains more stations in the gridding process than were used by Maurer et al. (2002), who limited the stations to those with more than 20 yr of data from 1950 to 2000.
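The screening and station-acceptance rules above can be illustrated with a minimal sketch. The function names and the list-based data layout are hypothetical and are not taken from the processing codes used in this study; the precipitation upper limit must be supplied by the user because no single value is given in the text.

```python
MISSING = -99.0  # missing-data flag quoted in the text


def screen_values(values, lower, upper):
    """Replace values outside [lower, upper] with the missing-data flag."""
    return [v if lower <= v <= upper else MISSING for v in values]


def longest_gap_free_run(values):
    """Length of the longest run of consecutive non-missing days."""
    best = run = 0
    for v in values:
        run = run + 1 if v != MISSING else 0
        best = max(best, run)
    return best


def accept_station(values, min_run=365, min_total=1825):
    """Apply the two record-length criteria described in the text."""
    total = sum(1 for v in values if v != MISSING)
    return longest_gap_free_run(values) >= min_run and total >= min_total


# Temperature limits from the text: values below -50 C or above 55 C are flagged.
tmax = screen_values([20.0, 60.0, -55.0, 10.0], -50.0, 55.0)
```

Under these assumptions a record is screened first and then passed to `accept_station`, so that flagged values count as gaps when the record-length criteria are evaluated.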
The data were then gridded to 1/8° latitude–longitude resolution with the Symap algorithm (Shepard 1984, as applied by Maurer et al. 2002) using four nearest neighbors. The target grid points were the centers of the 1/8° grid cells. During the gridding process TMAX and TMIN were both lapsed by the pseudoadiabatic lapse rate (6.1°C per kilometer) to account for differences between the elevation of the target grid point and the elevations of the nearest stations used in the interpolation. These steps are identical to those described by Maurer et al. (2002). Exactly the same gridding process was applied to the monthly time step HCN and HCCD data, except that the number of nearest neighbors for PCP gridding was increased to 15 in the Symap algorithm to prevent sharp discontinuities in the gridded data as a result of the relatively low station density.
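The combination of nearest-neighbor interpolation and elevation lapsing can be sketched as follows. This sketch uses plain inverse-distance-squared weighting as a stand-in and does not attempt to reproduce the Symap algorithm itself; the weighting scheme, the function names, and the station tuple layout are assumptions made for illustration. Only the lapse rate and the default number of neighbors come from the text.

```python
import math

LAPSE_RATE_C_PER_KM = 6.1  # lapse rate quoted in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers on a spherical earth."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def grid_temperature(stations, cell_lat, cell_lon, cell_elev_m, k=4):
    """Interpolate temperature to a grid cell center.

    stations: list of (lat, lon, elev_m, temp_c) tuples (hypothetical layout).
    Each station value is first lapsed to the cell elevation, then the k
    nearest stations are combined with inverse-distance-squared weights.
    """
    ranked = sorted(
        stations,
        key=lambda s: great_circle_km(s[0], s[1], cell_lat, cell_lon),
    )[:k]
    num = den = 0.0
    for lat, lon, elev_m, temp_c in ranked:
        # Temperature decreases with height: a cell above the station is colder.
        lapsed = temp_c - LAPSE_RATE_C_PER_KM * (cell_elev_m - elev_m) / 1000.0
        dist = great_circle_km(lat, lon, cell_lat, cell_lon)
        if dist < 1e-6:  # station sits on the grid point
            return lapsed
        weight = 1.0 / dist ** 2
        num += weight * lapsed
        den += weight
    return num / den
```

For precipitation the same weighting would apply without the lapse step, and `k` would be raised to 15 for the sparser HCN and HCCD networks as described above.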
c. Temporal adjustments
It should be noted that the objective of the procedures described above is to remove gross temporal inconsistencies in the daily Co-op data at the time scale of the temporally filtered data. Spatial information at the daily time step is retained from the Co-op data, and even the monthly Co-op data are not required to exactly match the equivalent HCN/HCCD monthly values. Rather, only the two smoothed time series are forced to match each other. In this way trends in the final data closely match those in the more temporally homogeneous HCN and HCCD data, without forcing a perfect match between the gridded HCN/HCCD data and the gridded Co-op data that would effectively remove the increased spatial information from the Co-op data (Fig. 1).
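The idea of matching only the smoothed series can be illustrated with a minimal sketch for a single grid cell and calendar month (one value per year). The centered moving average, its window width, and the additive versus multiplicative form of the correction are assumptions made here for illustration and should not be read as the filter or correction used in this study; the function names are hypothetical.

```python
def moving_average(series, half_width):
    """Centered moving average; the window shrinks near the ends of the record."""
    n = len(series)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def adjust_monthly(coop, reference, half_width=2, multiplicative=False):
    """Shift the low-frequency part of `coop` toward that of `reference`.

    The year-to-year departures of the Co-op series from its own smoothed
    curve are retained; only the smoothed curve is exchanged.
    """
    s_coop = moving_average(coop, half_width)
    s_ref = moving_average(reference, half_width)
    adjusted = []
    for value, sc, sr in zip(coop, s_coop, s_ref):
        if multiplicative:
            # Assumed form for precipitation: rescale by the ratio of smoothed values.
            adjusted.append(value * sr / sc if sc > 0 else value)
        else:
            # Assumed form for temperature: add the difference of smoothed values.
            adjusted.append(value + (sr - sc))
    return adjusted
```

In such a scheme the daily Co-op values within each month would then be shifted or rescaled so that their monthly aggregate equals the adjusted monthly value, which leaves the daily spatial detail of the Co-op data in place.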
d. Topographic adjustments
Once the temporal adjustments were applied to the daily time series, the topographic adjustments to precipitation described in Maurer et al. (2002) were carried out. These adjustments, which were indexed to the period 1961–90, forced the long-term mean of the temporally adjusted Co-op data for each grid cell to match those in the PRISM precipitation maps (Daly et al. 1994). Although here we followed the simple procedure outlined by Maurer et al. (2002) for adjusting temperature data for topography (discussed above), similar methods to those used for topographic adjustment of precipitation could also be used to adjust temperature data.
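The precipitation rescaling can be sketched for one grid cell and one calendar month as follows. The function names and the per-calendar-month data layout are hypothetical; the 1961–90 base period is the one stated above.

```python
def prism_scale_factor(years, monthly_totals, prism_mean, start=1961, end=1990):
    """Ratio of the PRISM mean to the gridded mean over the base period.

    years, monthly_totals: parallel lists for one calendar month at one cell.
    prism_mean: PRISM climatological value for the same month and cell.
    """
    base = [p for y, p in zip(years, monthly_totals) if start <= y <= end]
    if not base:
        raise ValueError("no data in the base period")
    gridded_mean = sum(base) / len(base)
    if gridded_mean <= 0.0:
        return 1.0  # nothing to rescale in a cell with no base-period precipitation
    return prism_mean / gridded_mean


def apply_scale(daily_precip, factor):
    """Rescale every daily value in the month by the same factor."""
    return [p * factor for p in daily_precip]
```

Because a single factor per cell and calendar month is applied to the whole record, the rescaling changes the long-term mean without altering the relative variability or the trends established by the temporal adjustment.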
e. Wind data
In addition to precipitation and temperature and other variables derived from these data (e.g., downward solar and longwave radiation, and specific humidity) hydrologic models (and in particular the Variable Infiltration Capacity model described below) also use estimates of daily 10-m wind speed, which Maurer et al. (2002) took from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis (Kalnay et al. 1996). The reanalysis products are available from 1949 on; prior to 1949, a daily wind climatology derived from the post-1949 period was used.
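The substitution of a daily climatology for the years before the reanalysis begins can be sketched as follows. The record layout and function names are hypothetical, and averaging by calendar day is an assumption about how the climatology is formed.

```python
from collections import defaultdict

FIRST_REANALYSIS_YEAR = 1949  # first year of reanalysis wind data, from the text


def daily_wind_climatology(records):
    """Mean wind speed for each (month, day) over the reanalysis period.

    records: iterable of (year, month, day, wind_speed) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for year, month, day, speed in records:
        if year >= FIRST_REANALYSIS_YEAR:
            acc = sums[(month, day)]
            acc[0] += speed
            acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def wind_for_date(year, month, day, reanalysis, climatology):
    """Reanalysis value when available, otherwise the calendar-day climatology."""
    if year >= FIRST_REANALYSIS_YEAR:
        return reanalysis[(year, month, day)]
    return climatology[(month, day)]
```

A consequence of this choice is that the pre-1949 wind forcing carries no interannual variability, so any trends in simulated variables for that period arise from precipitation and temperature alone.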
3. Results
a. Macroscale evaluation of results using naturalized streamflow data
We simulated daily time step streamflow (subsequently aggregated to monthly) using the Variable Infiltration Capacity (VIC) model (Liang et al. 1994; Cherkauer and Lettenmaier 2003) driven by the adjusted gridded forcing data. The VIC model was implemented at 1/8° spatial resolution over the western United States and was run in daily water balance mode with a 1-h snow model time step [see Hamlet et al. (2004, manuscript submitted to J. Climate) for additional VIC implementation details]. The model had been calibrated using data from 1950 to 2000 and was not recalibrated for the period prior to 1950.
To evaluate the effect of the adjustments to the driving data described above we compared VIC simulations for “adjusted” and “unadjusted” driving datasets for the Pacific Northwest. Both driving datasets were derived from exactly the same station records and were processed in an identical manner, except that the unadjusted dataset did not include the temporal corrections. Simulated streamflow records from VIC were compared with naturalized streamflow observations (water management effects removed) for 16 locations in the Columbia River basin. All of these sites are snowmelt dominant, and most of the annual runoff occurs from April to September. After about 1950, characteristic differences between simulated streamflows derived from the adjusted and unadjusted datasets are relatively small in comparison with the pre-1950 period. As an example, Fig. 2 (upper panel) shows a time series of April–September average streamflows for the South Fork Flathead River at Hungry Horse Dam from 1931 to 1989 for unadjusted VIC simulations, adjusted VIC simulations, and naturalized observations. Figure 2 (lower panel) shows a time series of the absolute errors for the two VIC simulations relative to the observed data. Prior to about 1950, the simulations based on the unadjusted meteorological data are very strongly biased in comparison with the post-1950 period, and the time series of absolute errors (Fig. 2, lower panel) displays an obvious downward trend with time due to spurious trends in precipitation in the unadjusted gridded dataset. By contrast, the simulations associated with the adjusted meteorological data show relatively stationary error characteristics over time. Note that this particular basin is fairly small (∼4280 km²), and the loss of spatial information associated with the temporal adjustments appears to reduce the accuracy of the simulations in a few years in the post-1950 period. In larger basins, loss of spatial information is less important, and the simulations derived from adjusted and unadjusted datasets are typically in very close agreement after about 1950.
Figure 3 shows a scatterplot of the root-mean-square error (rmse) for the time series of April–September average streamflow derived from the adjusted (x axis) and unadjusted (y axis) VIC simulations from 1931 to 1950 for 16 sites in the Columbia basin. Note that the adjustment procedure tends to produce robust improvements prior to 1950: either reducing the rmse or leaving it essentially unchanged.
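The quantities plotted in such a comparison can be computed as in the following minimal sketch. The function names and the dictionary layout are hypothetical, and the numbers in the usage example are made up purely to show the calling convention.

```python
import math


def rmse(simulated, observed):
    """Root-mean-square error between two equal-length series."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return math.sqrt(squared / len(simulated))


def compare_sites(sites):
    """Return (rmse_adjusted, rmse_unadjusted) per site, i.e., (x, y) points.

    sites: dict mapping a site name to a tuple
           (observed, adjusted_simulation, unadjusted_simulation).
    """
    return {
        name: (rmse(adjusted, observed), rmse(unadjusted, observed))
        for name, (observed, adjusted, unadjusted) in sites.items()
    }


# Illustrative values only (not data from this study).
points = compare_sites({"site_A": ([1.0, 2.0], [1.0, 2.0], [2.0, 3.0])})
```

A site whose point lies above the one-to-one line in the scatterplot has a smaller rmse with the adjusted data than with the unadjusted data.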
Although a long unadjusted driving dataset for testing was only produced for the Pacific Northwest, VIC streamflow simulations in the Colorado and Sacramento/San Joaquin basins were also compared with naturalized observations and were found to have stationary error characteristics throughout the time series using the temporally adjusted meteorological data. In larger subbasins, the agreement with observations over the entire time series was quite good for the most part. These evaluations demonstrate that the temporal corrections to the driving datasets result in temporally homogeneous monthly water balance simulations for a wide range of topographic and climatic conditions throughout the West without any recalibration of the hydrologic model for the earlier parts of the record.
b. Macroscale evaluation of results using observed snow water equivalent
Comparisons with relative trends in 1 April snow water equivalent (SWE) from snow course observations also show excellent broadscale agreement between the VIC simulations and observations over the West for the last 50 yr [see Mote et al. (2005) and Hamlet et al. (2004, manuscript submitted to J. Climate) for details]. For the model to be able to reproduce these spatial and temporal features of snow accumulation and melt, the meteorological forcing data must accurately represent both the time history and topographic characteristics of both temperature and precipitation over a wide range of conditions.
c. Data availability and archiving
At the time of this writing the forcing datasets and associated hydrologic testing were complete for river basins west of the continental divide, and are available at the Web site www.hydro.washington.edu/Lettenmaier/gridded_data/index.html. The forcing dataset for the continental United States will be completed over the next six months to a year and will be made publicly available at the same URL. Archival details have not yet been decided, but the data will be available in a comparable format to the previous forcing datasets described by Maurer et al. (2002).
4. Summary and conclusions
Without temporal adjustments, long-term gridded meteorological datasets frequently contain significant temporal inhomogeneities that prevent meaningful trend analysis of simulated hydrologic variables. The data-processing procedures described in this paper are shown to produce meteorological driving datasets that are temporally consistent over long periods of record. Based on VIC hydrologic simulations from 1916 to 2003 and comparison to observations of streamflow and snow water equivalent, we conclude that the meteorological forcing data produced by these methods (and the associated hydrologic simulations) are broadly suitable for trend analysis of simulated hydrologic variables at the macroscale.
Although these data-processing methods were implemented to produce meteorological forcing datasets in the continental United States and southern British Columbia, they are equally suitable for other areas of the globe for which there are long, temporally consistent station records and appropriate data upon which to base topographic adjustments to precipitation. Use of the data from the Global Historical Climatology Network, for example, would permit the temporal corrections described here to be applied to many developed areas outside the United States (e.g., in Europe and Asia).
Acknowledgments
This publication was funded by the Joint Institute for the Study of the Atmosphere and Ocean (JISAO) under NOAA Cooperative Agreement NA17RJ1232. We appreciate the assistance of Dr. Ned Guttman and colleagues at NCDC who produced the DSI-3206 dataset and facilitated timely access to the data for this research. Hyo Seok Park, Marketa McGuire, Andy Wood, Nathalie Voisin, and Kostas Andreadis (all of the Department of Civil and Environmental Engineering, University of Washington) assisted with various aspects of this study.
REFERENCES
Cherkauer, K. A., and D. P. Lettenmaier, 2003: Simulation of spatial variability in snow and frozen soil. J. Geophys. Res., 108, 8858, doi:10.1029/2003JD003575.
Cosgrove, B. A., and Coauthors, 2003: Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res., 108, 8842, doi:10.1029/2002JD003118.
Daly, C., R. Neilson, and D. Phillips, 1994: A statistical-topographic model for mapping climatological precipitation over mountainous terrain. J. Appl. Meteor., 33, 140–158.
Hamming, R. W., 1989: Digital Filters. Prentice Hall, 284 pp.
Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. Bull. Amer. Meteor. Soc., 77, 437–471.
Karl, T. R., C. N. Williams Jr., F. T. Quinlan, and T. A. Boden, 1990: United States Historical Climatology Network (HCN) serial temperature and precipitation data. Environmental Science Division, Publication 3404, Carbon Dioxide Information and Analysis Center, Oak Ridge National Laboratory, Oak Ridge, TN, 389 pp.
Liang, X., D. P. Lettenmaier, E. F. Wood, and S. J. Burges, 1994: A simple hydrologically based model of land surface water and energy fluxes for general circulation models. J. Geophys. Res., 99, 14415–14428.
Maurer, E. P., G. M. O’Donnell, D. P. Lettenmaier, and J. O. Roads, 2001: Evaluation of the land surface water budget in NCEP/NCAR and NCEP/DOE reanalyses using an off-line hydrologic model. J. Geophys. Res., 106 (D16), 17841–17862.
Maurer, E. P., A. W. Wood, J. C. Adam, D. P. Lettenmaier, and B. Nijssen, 2002: A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States. J. Climate, 15, 3237–3251.
Mekis, É., and W. D. Hogg, 1999: Rehabilitation and analysis of Canadian daily precipitation time series. Atmos.–Ocean, 37, 53–85.
Mote, P. W., A. F. Hamlet, M. P. Clark, and D. P. Lettenmaier, 2005: Declining mountain snowpack in western North America. Bull. Amer. Meteor. Soc., 86, 39–49.
Roads, J., and Coauthors, 2003: GCIP water and energy budget synthesis. J. Geophys. Res., 108, 8609, doi:10.1029/2002JD002583.
Shepard, D. S., 1984: Computer mapping: The SYMAP interpolation algorithm. Spatial Statistics and Models, G. L. Gaile and C. J. Willmott, Eds., Reidel, 133–145.
Vincent, L. A., and D. W. Gullett, 1999: Canadian historical and homogeneous temperature datasets for climate change analyses. Int. J. Climatol., 19, 1375–1388.
Fig. 1. Example of temporal corrections for Jan precipitation at a single grid location. (top) Temporally smoothed monthly time step Co-op and HCN/HCCD data. (bottom) HCN/HCCD data, unadjusted monthly Co-op data, and the adjusted monthly Co-op data.
Citation: Journal of Hydrometeorology 6, 3; 10.1175/JHM420.1
Fig. 2. Comparison of “adjusted” and “unadjusted” VIC simulations and naturalized observations for the South Fork Flathead River at Hungry Horse Dam, MT. (top) A time series of Apr–Sep average streamflow from 1931 to 1989. (bottom) A time series of absolute error for the adjusted and unadjusted streamflow simulations from 1931 to 1989.
Fig. 3. Root-mean-square error for simulated Apr–Sep average streamflow from 1931 to 1950 relative to naturalized flow observations for 16 locations in the Columbia River basin, shown as a scatterplot with the rmse associated with “adjusted” data on the x axis and the rmse associated with “unadjusted” data on the y axis.