1. Introduction
Observed meteorological forcing data used in land surface and hydrologic modeling contain large uncertainties (Beven 2002; Carroll et al. 2006; Clark and Slater 2006; DeChant and Moradkhani 2012; Liu and Gupta 2007; Montanari and Koutsoyiannis 2012). Estimates of forcing data uncertainty can strengthen a wide range of land surface and hydrologic modeling and application practices. For instance, in hydrologic model calibration, uncertainty estimates in meteorological data are used to represent forcing data uncertainty in hierarchical Bayesian model calibration methods (Kavetski et al. 2003; Renard et al. 2011). In ensemble data assimilation, uncertainty estimates of meteorological data are used to generate ensemble state estimates (Clark et al. 2008; Moradkhani et al. 2012; Thiboult et al. 2016).
Despite the importance of quantifying uncertainty in forcing data, few ensemble historical meteorological datasets are available; most existing meteorological datasets are deterministic. Table 1 lists some example retrospective gridded meteorological datasets. To avoid an exhaustive list, Table 1 restricts attention to the datasets in which vast amounts of historical observations are used to derive at least one meteorological field, especially precipitation, and avoids datasets that are generated purely from climate models or remote sensing. Output fields vary among datasets, but typically include precipitation and temperature, and may also include radiation, wind, humidity, and surface pressure.
Table 1. Example retrospective gridded meteorological datasets.
Gridded deterministic datasets are generated by reanalysis methods, interpolation methods, and multisource merging. Reanalyses are produced by numerical weather prediction (NWP) models, which assimilate observations to provide the best possible estimate of the state of the atmosphere (the analysis) at the start of the forecast period. Time series of operational model analyses are highly valuable for climate studies but are limited by frequent changes in the NWP model. Reanalyses use a fixed NWP model and redo the “analysis” of the state of the atmosphere (i.e., produce a “reanalysis”) for an extended time period; they are now routinely available from many different modeling groups (Ailliot et al. 2015; Gelaro et al. 2017; Kobayashi et al. 2015). Alternatively, interpolation methods are used to interpolate station data to a regular grid, forming a different type of surface analysis. Because interpolation methods need not rely on an NWP model, they typically produce finer spatial resolution datasets than reanalyses (Livneh et al. 2013; Maurer et al. 2002; Xia et al. 2012). Multisource merged products, e.g., the satellite-based precipitation products IMERG (Huffman et al. 2019) and CMORPH (Joyce et al. 2004) or ground radar-based systems (e.g., Kirstetter et al. 2015), are deterministic but offer error estimates derived from the retrieval uncertainties of the inversions required to infer precipitation from remotely sensed platforms. The fields produced by reanalysis, interpolation, and multisource merging are all uncertain; the uncertainty arises from incomplete and inaccurate observations, imperfect NWP models, and errors in the data assimilation and interpolation methods.
It is uncommon for historical meteorological datasets to include uncertainty estimates, but recent years have seen progress in this area. Among the datasets listed in Table 1, N15, EMDNA, EM-Earth, E-OBS, ERA5, and CaPA provide uncertainty estimates for their output fields. N15, EMDNA, and EM-Earth use spatially correlated random fields to resample residuals from the spatial regression used in their terrain-assisted interpolation, creating multiple ensemble members (Newman et al. 2015; Tang et al. 2021, 2022). E-OBS also uses Gaussian random field stochastic simulation and produces multiple ensemble members of daily temperature and precipitation (Cornes et al. 2018). ERA5 uses an ensemble data assimilation (EDA) system and produces ten members (one control and nine perturbed members) to represent the reanalysis data uncertainty; because the EDA does not account for systematic errors in the reanalysis, the ensemble only provides a relative estimate of uncertainty in terms of its spatiotemporal distribution (Hersbach et al. 2020). CaPA quantifies the uncertainty of its precipitation estimates with a confidence index varying between zero and one, but this index is not straightforward to interpret and has not been used in applications (Fortin et al. 2018).
From the users’ perspective, it is not easy to understand and assess the uncertainties in a meteorological dataset when they are not provided. Most users do not have access to information such as the spatiotemporal uncertainties of the observations, the degree to which the observations constrain the target fields, and the accuracy with which the NWP model or interpolation method represents the target fields (e.g., Dee et al. 2016). However, users may need uncertainties for a particular forcing dataset that they already use in an application (e.g., a land model that has been previously calibrated to the forcings). The most common empirical solution is to apply an additive or multiplicative error term to the forcings. The error term is assumed to follow a certain probability distribution (e.g., normal distribution, gamma distribution) or stochastic model (e.g., hierarchical model) and can be encoded in parameter estimation or assimilation algorithms (Blazkova and Beven 2009; DeChant and Moradkhani 2012; Del Giudice et al. 2016; Fuentes-Andino et al. 2017; Huard and Mailhot 2008; Kavetski et al. 2006; Leisenring and Moradkhani 2011; McMillan et al. 2011; Montanari 2005; Rasmussen et al. 2015; Renard et al. 2011; Salamon and Feyen 2009). The drawback of these empirical approaches is that the assumed error models can be unrealistic or static. A more rigorous estimation of such error models is a challenging task that requires expert statistical knowledge and intensive data analysis (Del Giudice et al. 2016).
There is a need for better approaches that help users augment deterministic gridded meteorological datasets with realistic uncertainty estimates that can be easily employed in applications such as parameter estimation, data assimilation, and forecasting. The purpose of this paper is to demonstrate an adaptation of the spatial regression method of Newman et al. (2015) to quantify the uncertainty in the precipitation and temperature fields of existing deterministic gridded meteorological datasets. We investigate the viability of the spatial regression method by comparing the resulting uncertainty estimates to those obtained when the technique is applied directly to meteorological station observations, as in the N15 dataset in Table 1. The work also demonstrates a new use of the Gridded Meteorological Ensemble Tool (GMET; https://github.com/NCAR/GMET), publicly available software that implements the spatial regression method.
The remainder of this paper is organized as follows. Section 2 describes the North American Land Data Assimilation System 2 (NLDAS-2) precipitation and temperature dataset, the spatial regression method, and the evaluation method. Section 3 shows the experimental results. Section 4 discusses the advantages and limitations of the ensemble dressing method and its application to other meteorological datasets. Section 5 concludes this paper with suggestions for future research.
2. Data and methods
a. NLDAS-2 precipitation and temperature dataset
The NLDAS-2 forcing dataset is available over the domain of the conterminous United States (CONUS), the southern part of Canada, and the northern portion of Mexico (125°–67°W, 25°–53°N) (Xia et al. 2012). It has records from 1 January 1979 to present and features 1/8° spatial and hourly temporal resolution. The dataset has nine primary meteorological fields and six secondary meteorological fields. Among the 15 fields, precipitation and temperature are derived using different methods.
The temperature field is derived from the North American Regional Reanalysis (NARR). The NARR has 32-km spatial resolution, 3-hourly temporal resolution, and six forcing fields including 2-m air temperature. To obtain the NLDAS-2 temperature field, the NARR 2-m air temperature is first downscaled to the NLDAS-2 grid using standard interpolation methods and then adjusted to the NLDAS-2 grid elevation using a lapse rate of −6.5 K km−1 (Cosgrove et al. 2003).
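For readers less familiar with this type of adjustment, the sketch below applies a constant lapse-rate correction when moving an interpolated temperature from the source-terrain elevation to the target-grid elevation. It is a minimal illustration with placeholder array names, not the operational NLDAS-2 processing code.

```python
import numpy as np

LAPSE_RATE = -6.5e-3  # K per m (i.e., -6.5 K per km; Cosgrove et al. 2003)

def adjust_temperature(t_interp, elev_source, elev_target):
    """Adjust interpolated 2-m air temperature for the elevation difference
    between the source (e.g., NARR) terrain and the target (e.g., NLDAS-2) terrain.

    t_interp    : temperature already interpolated to the target grid (K)
    elev_source : source-model terrain height interpolated to the target grid (m)
    elev_target : target-grid terrain height (m)
    """
    # Moving from elev_source to elev_target changes temperature by
    # LAPSE_RATE * (elev_target - elev_source).
    return t_interp + LAPSE_RATE * (elev_target - elev_source)

# Toy example: a grid cell 500 m higher than the source terrain is ~3.25 K cooler.
print(adjust_temperature(np.array([290.0]), np.array([1000.0]), np.array([1500.0])))
```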
The precipitation field is derived from daily gauge-based precipitation analyses with an adjustment for orographic precipitation effects using monthly PRISM data. These precipitation analyses have the same spatial resolution as the NLDAS-2 forcing dataset and are temporally disaggregated to hourly time steps using the hourly Stage-II Doppler radar precipitation analyses (Cosgrove et al. 2003). In parts of Canada and Mexico, gauge-based precipitation is sparse, so the NARR precipitation is used instead.
From the NLDAS-2 dataset, we focus on daily precipitation, daily mean air temperature, and the diurnal temperature range, as in Newman et al. (2015). Throughout this paper, these three fields are referred to as Precip, Tmean, and DTR, respectively.
b. Methodological rationale
The spatial regression method for generating ensemble geophysical fields was first proposed by Clark and Slater (2006) and has since been improved by Newman et al. (2015, 2019), Tang et al. (2021, 2022), and Bunn et al. (2022). For every grid cell, the method fits a locally weighted regression between the observations at a surrounding sample of stations and the terrain features or attributes at their locations, and then predicts the grid cell’s precipitation and temperature using the same types of terrain attributes at the grid cell location. The regression uncertainty serves as the basis for generating an ensemble of meteorological fields. The regression varies in both space and time, providing a dynamic and weather-dependent uncertainty estimate.
Applying the spatial regression method to an existing gridded dataset is necessarily different from the original station-based application. Instead of training the regression on station observations and their terrain features, it must be trained on “pseudo-observations”—that is, a sample of gridded values from the neighboring area of each grid cell. The resulting regression uncertainty consequently differs in nature from that of the original application, because the gridded values arise from whatever statistical or physical model was used to generate the dataset.
The employed spatial regression method effectively “dresses” a deterministic dataset with ensembles, hereafter referred to as “ensemble dressing” (Roulston and Smith 2003; Wang and Bishop 2005). Ensemble dressing posits a spatial structure for the gridded observations based on terrain and location factors that are known to influence the spatial distribution of surface climate. When this relationship is weak, the uncertainty in the gridded observations is estimated to be higher than when it is strong. High uncertainty implies that the inputs (e.g., gauge observations and NWP analyses) used to generate the gridded observations do not have an organized structure that can be linked to the explanatory variables of terrain and location. The ensemble dressing method thus offers a computationally tractable way to generate usable time–space varying uncertainty estimates to support applications such as data assimilation and ensemble forecasting.
c. Theory
In summary, to generate probabilistic estimates of Precip, Tmean, and DTR, seven space- and time-varying parameters must be estimated: μ and σ for each of the three meteorological fields, plus the probability of precipitation (PoP). The ensemble dressing method for developing meteorological ensembles proceeds in three steps: pseudo-observation station sampling, locally weighted regression, and ensemble generation. The workflow of our method is summarized in Fig. 1, and the three steps are described in detail below. The latter two steps largely follow Newman et al. (2015).
1) Pseudo-observation station sampling
The first step samples grid cells from the meteorological dataset (Fig. 1, step 1). The sampled grid cells serve as pseudo-observation stations in the locally weighted regression. Any user-preferred sampling method can be applied here, for instance, random sampling of a certain percentage of the dataset grid cells or sampling at a specified grid interval.
In this study, we used a specified grid interval to sample pseudo-observation stations from the 1/8° NLDAS-2 grid, which yields a uniform distribution of samples over the domain. To explore the impact of the sampling interval on the uncertainty estimates, we tried nine different sampling intervals ranging from 10 to 2 (Fig. 2). For instance, an interval of 2 means that we sample every second grid cell in both the latitude and longitude dimensions; the smaller the interval, the denser the sampling. After determining the sampled grid cells, we perturbed each sampled location to avoid nearly singular matrices in the least squares regression, which would result from linearly related predictors. Each sampled point was shifted by up to one cell in one of eight possible directions (i.e., to an adjacent cell) or left unshifted. The implementation is simple: for each sampled point, randomly choose an integer from zero to eight, where each value indicates the direction of the shift (0, unshifted; 1, north; 2, northeast; 3, east; 4, southeast; 5, south; 6, southwest; 7, west; 8, northwest).
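A minimal sketch of this sampling-and-perturbation step is given below. It assumes the gridded field is stored as a two-dimensional latitude–longitude array with rows increasing northward; the function name and random-number handling are illustrative and are not taken from the GMET preparation codes.

```python
import numpy as np

# Offsets (row, col) for the nine possible shifts, with rows increasing northward:
# 0 unshifted, 1 N, 2 NE, 3 E, 4 SE, 5 S, 6 SW, 7 W, 8 NW
SHIFTS = [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1),
          (-1, 0), (-1, -1), (0, -1), (1, -1)]

def sample_pseudo_stations(nrow, ncol, interval, seed=0):
    """Sample grid indices at a regular interval and randomly perturb each
    sampled point by up to one cell in one of eight directions (or not at all)."""
    rng = np.random.default_rng(seed)
    stations = []
    for i in range(0, nrow, interval):
        for j in range(0, ncol, interval):
            di, dj = SHIFTS[rng.integers(0, 9)]   # random integer from 0 to 8
            ii = min(max(i + di, 0), nrow - 1)    # keep the point inside the domain
            jj = min(max(j + dj, 0), ncol - 1)
            stations.append((ii, jj))
    return stations

# Sampling strategy with an interval of 2 on the 1/8-degree NLDAS-2 grid (224 x 464 cells)
pseudo_stations = sample_pseudo_stations(224, 464, interval=2)
```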
2) Locally weighted regression
Locally weighted regression is iteratively performed for all dataset grids and time steps requested (Fig. 1, step 2). Note that for precipitation, the locally weighted regression is applied to the transformed precipitation in normal space.
(i) Locally weighted linear regression
In this study, the spatial attributes of the NLDAS-2 dataset are taken from https://ldas.gsfc.nasa.gov/nldas/elevation. Note that this attribute dataset has a slightly different spatial coverage from the NLDAS-2 forcing dataset; our experiments are therefore implemented on the grid cells that are common to the NLDAS-2 topography and forcing datasets.
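The regression equations are given in Newman et al. (2015) and are not reproduced here. As a rough illustration of the idea, the sketch below fits a distance-weighted least-squares regression of a variable at the nearest pseudo-observation stations on their spatial attributes and evaluates it at the target grid cell, returning a mean and a residual-based uncertainty estimate. The inverse-distance weight function and attribute handling are simplified assumptions rather than the GMET formulation; for precipitation, the y values would be the transformed amounts described above.

```python
import numpy as np

def local_weighted_regression(attr_stations, y_stations, attr_target, n_neighbors=30):
    """Weighted least-squares estimate of a variable at a target grid cell.

    attr_stations : (n_stations, n_attr) spatial attributes of the pseudo-stations
                    (e.g., lat, lon, elevation, E-W slope, N-S slope)
    y_stations    : (n_stations,) values of the variable at the pseudo-stations
    attr_target   : (n_attr,) attributes of the target grid cell
    Returns the regression mean (mu) and the weighted residual standard
    deviation (sigma), the latter serving as the uncertainty estimate.
    """
    # Distances based on the first two attributes (lat, lon); the weight
    # function below is illustrative only.
    dist = np.hypot(attr_stations[:, 0] - attr_target[0],
                    attr_stations[:, 1] - attr_target[1])
    idx = np.argsort(dist)[:n_neighbors]          # nearest pseudo-stations
    w = 1.0 / (dist[idx] + 1e-6) ** 2             # inverse-distance-squared weights
    X = np.column_stack([np.ones(len(idx)), attr_stations[idx]])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y_stations[idx])
    mu = np.concatenate([[1.0], attr_target]) @ beta
    resid = y_stations[idx] - X @ beta
    sigma = np.sqrt(np.sum(w * resid ** 2) / np.sum(w))
    return mu, sigma
```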
(ii) Locally weighted logistic regression
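As a companion to the linear-regression sketch above, the following locally weighted logistic regression estimates the probability of precipitation (PoP) at a target grid cell from the binary precipitation occurrence at the neighboring pseudo-observation stations. The generic iteratively reweighted least-squares loop below is an approximation for illustration, not the GMET code.

```python
import numpy as np

def local_logistic_pop(attr_stations, occ_stations, attr_target, weights, n_iter=20):
    """Estimate PoP at a target grid cell with a locally weighted logistic regression.

    occ_stations : (n,) binary precipitation occurrence (0/1) at the nearest
                   pseudo-observation stations
    weights      : (n,) distance-based station weights (same form as in the
                   linear-regression sketch)
    """
    X = np.column_stack([np.ones(len(occ_stations)), attr_stations])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):                       # iteratively reweighted least squares
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        irls_w = weights * p * (1.0 - p) + 1e-8   # station weights x IRLS weights
        z = X @ beta + (occ_stations - p) / np.maximum(p * (1.0 - p), 1e-8)
        beta = np.linalg.solve(X.T @ (irls_w[:, None] * X), X.T @ (irls_w * z))
    return 1.0 / (1.0 + np.exp(-np.concatenate([[1.0], attr_target]) @ beta))
```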
3) Ensemble generation
Ensemble generation is iteratively performed for each grid cell and time step, over the user-specified number of ensemble members (Fig. 1, step 3).
(i) Spatiotemporally correlated random fields
Ensemble generation is realized by sampling from the probabilistic distributions of Precip, Tmean, and DTR at each grid cell and time step. The sampling is done using spatiotemporally correlated random fields (SCRFs), which are generated in two steps. The first step generates spatially correlated random fields (R) using the conditional distribution method described by Johnson (1987). The second step generates the SCRF by adding temporal correlation to R.
The implementation of Eqs. (12)–(15) is cumbersome for high-dimensional grids. To increase efficiency, we applied a nested simulation strategy similar to that proposed by Fang and Tacher (2003).
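The nested strategy itself is not detailed here, but the basic construction of an SCRF can be sketched as follows: draw a spatially correlated standard-normal field from a distance-decay correlation model, then blend consecutive fields with a first-order autoregressive coefficient to add temporal correlation. The exponential correlation model, correlation length, AR(1) coefficient, and brute-force Cholesky factorization below are illustrative placeholders; factorizing the full grid is only practical for small domains, which is precisely why a nested approach is used in practice.

```python
import numpy as np

def spatial_random_field(lat, lon, corr_len, rng):
    """Draw one spatially correlated standard-normal field on a set of points,
    using an exponential correlation model and a Cholesky factorization."""
    pts = np.column_stack([lat.ravel(), lon.ravel()])
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    corr = np.exp(-d / corr_len) + 1e-8 * np.eye(len(pts))   # correlation matrix
    L = np.linalg.cholesky(corr)
    return (L @ rng.standard_normal(len(pts))).reshape(lat.shape)

def scrf_series(lat, lon, n_steps, corr_len=1.0, rho=0.6, seed=0):
    """Spatiotemporally correlated random fields: AR(1) blending in time of
    spatially correlated fields, where rho is the lag-1 temporal correlation."""
    rng = np.random.default_rng(seed)
    fields = [spatial_random_field(lat, lon, corr_len, rng)]
    for _ in range(n_steps - 1):
        new = spatial_random_field(lat, lon, corr_len, rng)
        fields.append(rho * fields[-1] + np.sqrt(1.0 - rho ** 2) * new)
    return np.stack(fields)

# Tiny demo grid (the full-grid Cholesky scales as O(N^3), hence the nested strategy)
lat, lon = np.meshgrid(np.linspace(40, 41, 8), np.linspace(-105, -104, 8), indexing="ij")
scrf = scrf_series(lat, lon, n_steps=5)
```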
(ii) Probabilistic estimation
Finally, the generated precipitation amounts are transformed back to observation space using the inverse of the transformation in Eq. (3).
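To make the sampling step concrete, the sketch below converts the SCRF value at one grid cell and time step into ensemble values. For precipitation it compares the implied cumulative probability with PoP to decide occurrence, maps the remaining probability mass through the regression-estimated normal-space distribution, and back-transforms the result; for temperature it samples directly in physical units. The Box-Cox-type power transform (and its exponent) is a stand-in for Eq. (3), which is not reproduced here, and the probability-splitting recipe is a simplified reading of the Clark and Slater (2006) approach rather than the exact GMET implementation.

```python
import numpy as np
from scipy.stats import norm

def backtransform(y, lam=1.0 / 3.0):
    """Inverse of an illustrative power transform t(x) = (x**lam - 1) / lam;
    the actual transform is Eq. (3) in the text."""
    return np.maximum(lam * y + 1.0, 0.0) ** (1.0 / lam)

def precip_member(rf, pop, mu_t, sigma_t):
    """One precipitation ensemble value at one grid cell and time step.

    rf            : SCRF value (standard normal) at this cell/time
    pop           : probability of precipitation from the logistic regression
    mu_t, sigma_t : regression mean and std dev of transformed precipitation
    """
    p = norm.cdf(rf)                      # uniform probability implied by the SCRF
    if p <= 1.0 - pop:                    # dry outcome
        return 0.0
    p_wet = (p - (1.0 - pop)) / pop       # rescale the remaining mass to (0, 1)
    y = norm.ppf(p_wet, loc=mu_t, scale=sigma_t)
    return backtransform(y)               # back to precipitation units

def temperature_member(rf, mu, sigma):
    """Tmean and DTR ensemble values are drawn directly in physical units."""
    return mu + sigma * rf

print(precip_member(rf=1.2, pop=0.6, mu_t=1.5, sigma_t=0.5))
```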
d. Evaluation and selection of probabilistic estimates
To explore the impacts of sampling density on the probabilistic estimates, we experimented with nine different sampling strategies from coarse to dense (Fig. 2). For each strategy, we performed the locally weighted regression steps for a 5-yr period from 2012 to 2016 and estimated μ and σ for the three meteorological fields and PoP. The regression results are evaluated to help identify the most appropriate sampling density from two perspectives.
The first perspective is to compare the regression mean estimates (μ) of the three meteorological fields with the original NLDAS-2 values. The second perspective is to investigate the magnitude of the standard deviation estimates (σ), using the station-based N15 dataset as a point of comparison.
Through the above comparisons, we identify a sampling strategy that gives relatively accurate mean field estimates and reasonable standard deviation estimates, which is then used to generate the final ensemble dataset.
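For the first perspective, the MAE is computed from the 2012–16 climatological fields. A short sketch of this comparison, with illustrative array names, is given below.

```python
import numpy as np

def climatological_mae(regression_mean, nldas2):
    """MAE of the climatological comparison between the regression mean
    estimates and the original NLDAS-2 values.

    regression_mean, nldas2 : arrays of shape (ntime, nrow, ncol) covering the
    evaluation period (here 2012-16).
    """
    clim_reg = np.nanmean(regression_mean, axis=0)   # time average at each grid cell
    clim_ref = np.nanmean(nldas2, axis=0)
    return np.nanmean(np.abs(clim_reg - clim_ref))   # average over all grid cells

# Toy example with random fields standing in for the real data
rng = np.random.default_rng(1)
print(climatological_mae(rng.random((100, 20, 30)), rng.random((100, 20, 30))))
```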
3. Results
a. Evaluation of sampling strategies
1) Evaluation of mean estimates (μ)
This section compares the field mean estimates with the original NLDAS-2 field values for all sampling strategies. The comparison is performed over all NLDAS-2 grid cells, using the MAE of the climatological 2012–16 comparison between NLDAS-2 and the regression estimates based on the pseudo-observation stations. Overall, the mean estimates of all three meteorological fields are consistent with NLDAS-2 for all sampling strategies (Figs. 3–5). For Precip and DTR, denser sampling strategies provide more accurate field mean estimates, which is also reflected in the MAE evaluation results (Figs. 3 and 5). For Tmean, the mean estimates are highly consistent with NLDAS-2, and the MAE values are small for all sampling strategies (Fig. 4).
In Fig. 3, low precipitation amounts are overestimated. This behavior comes from the locally weighted linear regression (see section 2), particularly when the observed precipitation is very small or zero. Recall that the linear regression coefficients are derived from the 30 nearest pseudo-observation stations; when nonzero precipitation is observed at some of those stations (which occurs often in reality), the precipitation estimate at the target grid cell may be nonzero regardless of the spatial predictors.
However, this overestimation issue is overcome in ensemble generation through the use of the PoP estimate. In the logistic regression (see section 2), PoP is quantified as a zero-to-one probability rather than a zero-or-one value. As a result, there is a high chance that precipitation does not occur at the target grid cell when PoP is very small, even with a nonzero mean precipitation estimate from the linear regression. In the remaining cases where the ensemble still overestimates light precipitation, the small nonzero amounts in the ensemble product account for uncertainty in the precipitation occurrence estimates and are a positive product characteristic (e.g., Carpenter et al. 2001; Liu et al. 2019; Slater and Clark 2006).
2) Evaluation of estimated standard deviation (σ)
This section compares the standard deviation estimates of the nine sampling strategies with those of the N15 dataset. The comparison is performed over all NLDAS-2 grid cells using the average standard deviation estimate for the period 2012–16. Overall, the standard deviations of all three meteorological fields decrease as the sampling becomes denser (Fig. 6). This is expected, because as we add high-quality (nearby) information, the regression estimates become more accurate and the uncertainty estimates decrease. For precipitation, strategy 8 has the standard deviation estimates most similar to those of the N15 dataset. For temperature, all strategies have lower uncertainty estimates than the N15 dataset.
In Fig. 7, we examine the spatial distribution of the uncertainty estimates for one of the sampling strategies (strategy 8 in Fig. 2). For precipitation, the NLDAS-2-based regression has uncertainty estimates similar to the station-based regression in the United States and lower uncertainty estimates than the station-based regression in Canada and northwest Mexico. The difference stems from the different precipitation generation methods used in NLDAS-2 for the United States versus Canada and Mexico. In the United States, the NLDAS-2 precipitation comes from a daily gauge product disaggregated into hourly data based on hourly radar data, whereas in Canada and Mexico the NLDAS-2 precipitation comes from the NARR (Xia et al. 2012). The NARR assimilates precipitation gauge data into the atmospheric analysis using an atmospheric numerical model and the Regional Data Assimilation System. Precipitation in atmospheric numerical models is usually smoother and has less spatial variability than in reality (e.g., gauge data), especially in coarse-resolution models. When the samples used in the regression have less variability, the regression results tend to be more accurate and less uncertain.
For temperature, the NLDAS-2-based uncertainty is lower than the station-based uncertainty except near water bodies (Fig. 7). This is also due to the way the NLDAS-2 temperature is generated. The NLDAS-2 temperature is adjusted using a lapse rate of −6.5 K km−1 to account for the effect of elevation on temperature (Cosgrove et al. 2003). Because the NLDAS-2 product already uses elevation in generating temperature, the elevation-related uncertainty in the regression is marginal relative to that associated with location (i.e., latitude and longitude). In contrast, the station-based regression uses station observations that are not adjusted for elevation, so it yields higher uncertainty than the NLDAS-2-based regression. In addition, relatively high temperature uncertainty exists along the West Coast and lake areas owing to large temperature gradients near the coast and one-sided sampling of primarily inland points (Fig. 7b).
b. Evaluation of probabilistic estimates of one sampling strategy
This section evaluates the probabilistic estimates of sampling strategy 8 in Fig. 2. We picked strategy 8 because its precipitation standard deviations are the most similar to those of the N15 dataset, especially in the United States, and it has good MAEs for mean precipitation and temperature. Based on strategy 8, we generated 100 ensemble members at the same spatiotemporal resolution as the NLDAS-2 dataset for the years 1979–2019, termed the Ensemble Dressing of NLDAS-2 (EDN2). As noted in section 2d, the ensemble is scaled to center on the original NLDAS-2 fields, since the objective of the ensemble dressing method is to provide uncertainty estimates around the deterministic dataset values rather than alter them.
1) Spatial patterns
This section investigates the spatial distributions of the ensemble members and the standard deviation estimates of Precip, Tmean, and DTR (Fig. 8). Overall, the distributions of two example members are very consistent with NLDAS-2 for June 2016. For precipitation, areas with higher precipitation (e.g., the southeastern United States) have higher estimated uncertainty. For temperature, the West Coast of the United States, the Great Lakes, and Hudson Bay (Canada) have larger ensemble uncertainty.
The PoP from the NLDAS-2 dataset and the regression estimates are compared for the same period (June 2016) in Fig. 9. The regression shows PoP estimates that are very consistent with NLDAS-2, and the average PoP difference between the regression and NLDAS-2 is only −0.01. Moreover, the PoP estimates are smoother than the NLDAS-2 dataset because the logistic regression draws on the precipitation occurrence at surrounding pseudo-observation stations and quantifies PoP as a zero-to-one probability, whereas the NLDAS-2 dataset implies a binary zero-or-one occurrence at each grid cell and time step.
2) Temporal patterns
Here we examine the seasonal patterns of the standard deviation estimates of Precip, Tmean, and DTR (Fig. 10). The average standard deviations of Precip, Tmean, and DTR are 0.38 mm day−1, 0.34°C, and 0.59°C, respectively, for all grids over the years 1979–2019. For precipitation, when precipitation does not occur in NLDAS-2 (Fig. 10a), the regression provides a small amount of precipitation uncertainty, in effect providing a precipitation occurrence measurement error, which is a positive feature of our method. When precipitation occurs (Fig. 10b), the standard deviation has a positive correlation with the NLDAS-2 precipitation amount (Pearson correlation coefficient = 0.77).
Tmean standard deviation estimates are larger during the cold season than the warm season (Fig. 10c). This is expected because warm-season temperatures have higher spatial correlation, which results in lower estimated uncertainty (better regression predictions). In contrast, in winter there are sharp gradients along fronts that are difficult to resolve, as well as complex inversions related to terrain and snow cover, all leading to more spatially heterogeneous temperatures. DTR standard deviation estimates are relatively stable through time, although a small portion of grid cells show high DTR uncertainty in summer (light blue in Fig. 10d).
4. Discussion
The primary assumption in the ensemble dressing methodology applied here is that the spatial variability of the gridded precipitation and temperature fields is a good proxy for their uncertainty. In making this assumption, we do not quantify the individual sources of uncertainty directly, e.g., the uncertainties associated with measurement errors at individual stations (Humphrey et al. 1997), the uncertainties associated with the spatial representativeness of the station network (Habib et al. 2001), the uncertainties in remotely sensed radar and satellite precipitation estimates (Nijssen and Lettenmaier 2004; Schiemann et al. 2011), and the uncertainties in numerical weather prediction model output (Ailliot et al. 2015; Semenov and Stratonovitch 2010). The justification for connecting uncertainty to spatial variability in our ensemble dressing method is that uncertainty is connected to the spatial correlation length scales of meteorological fields (e.g., uncertainty may be higher for convective precipitation, where the correlation length scale is much shorter). While the use of spatial variability as a proxy for uncertainty is clearly a limitation of this work, our ensemble dressing method provides more meaningful uncertainty estimates than current approaches that impose error distributions on meteorological fields with a variance that is constant or constantly proportional in both space and time (Beven and Freer 2001; Blazkova and Beven 2009; Kavetski et al. 2003; Moradkhani et al. 2005).
The innovation of the ensemble dressing method lies in its use of a regression model and a few selected spatial attributes to depict meteorological fields and their uncertainty. The regression model is therefore limited when the employed explanatory variables (i.e., spatial attributes) provide a poor representation of the uncertainty in the meteorological fields. It is possible to develop improved methods to represent the uncertainty. For example, one possibility is to use ensemble postprocessing methods to adjust the variance of the dressed ensembles and improve the statistical reliability of the probabilistic estimates. Candidate methods include quantile mapping and Bayesian methods (Gneiting et al. 2005; Hamill 2001; Krzysztofowicz and Evans 2008; Li et al. 2017; Piani et al. 2010; Scheuerer and Hamill 2015; Van Schaeybroeck and Vannitsem 2015; Wilks and Hamill 2007; Wu et al. 2011). Further research is necessary to understand the advantages and limitations of alternative ensemble dressing methods.
The ensemble dressing method introduced here can be applied to other deterministic gridded meteorological datasets, such as the deterministic datasets in Table 1. When applying the ensemble dressing method, only two types of input are required: the deterministic gridded meteorological data and the topography data over the targeted domain. The topography data include elevation, slope, and aspect, of which slope and aspect can be computed from elevation (a minimal computation is sketched below). The topography data need to be regridded to the same spatial resolution as the meteorological input for use in the regression. The output is gridded ensemble meteorological data with the same coverage and spatial–temporal resolution as the input deterministic meteorological data. The ensemble dressing method can also be applied to estimate the uncertainties of meteorological fields other than precipitation and temperature, for example, radiation, wind, humidity, and surface pressure. In this case, the locally weighted regression needs to incorporate explanatory variables that represent the sources of variability in those fields. It is crucial for users of this methodology to understand their deterministic gridded meteorological dataset before applying the presented ensemble dressing method: the derived ensemble will inherit the general characteristics of the input meteorological dataset and is explicitly designed to match its climatology.
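A minimal sketch of the slope computation mentioned above is shown below, using centered differences on the elevation grid to obtain the east–west and north–south slope components used as regression predictors. The grid-spacing values are approximate placeholders for a 1/8° grid at midlatitudes.

```python
import numpy as np

def terrain_slopes(elev, dx, dy):
    """East-west and north-south slope components from a gridded DEM using
    centered differences.

    elev   : (nrow, ncol) elevation in meters, rows ordered south to north
    dx, dy : grid spacing in meters in the east-west and north-south directions
    """
    dz_dy, dz_dx = np.gradient(elev, dy, dx)   # gradients along rows, then columns
    return dz_dx, dz_dy                        # rise per meter toward east, toward north

# Approximate 1/8-degree spacing at midlatitudes (values illustrative)
demo_dem = np.random.default_rng(0).random((224, 464)) * 1000.0
slope_ew, slope_ns = terrain_slopes(demo_dem, dx=10500.0, dy=13900.0)
```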
5. Conclusions
Given the prevalence of deterministic meteorological datasets and the need in many applications to quantify uncertainty in meteorological fields, in this paper we evaluate and demonstrate the use of a locally weighted spatial regression method to “dress” deterministic datasets with ensembles, and hence estimate the uncertainty in the precipitation and temperature fields of existing deterministic gridded meteorological datasets. The ensemble dressing method introduced in this paper enables users to generate a user-specified number of ensemble members relatively quickly and to customize the uncertainty estimates to a dataset already in use.
Taking the widely used NLDAS-2 precipitation and temperature dataset as an example, we generated a large-domain ensemble dataset of daily precipitation (Precip), daily mean temperature (Tmean), and the diurnal temperature range (DTR). Nine sampling strategies were tested to reveal the impacts of sampling density on meteorological mean and standard deviation estimates. Though it is not the primary objective of the study to reproduce the NLDAS-2 values with the ensemble means (versus the uncertainty around the values), the method also performed relatively well in this regard, with increasing agreement as more grid samples are used in the regression (as expected). The intended use of the ensemble dressing method is to center the uncertainty on the original gridded values (in this case NLDAS-2) to provide uncertainty information for applications of gridded meteorological forcing data. The GMET software has been applied in real time and is suitable for operational application, thus current community applications that rely on NLDAS-2 and need uncertainty estimates (such as for data assimilation) could make use of this technique and software.
The final ensemble dataset, termed the Ensemble Dressing of NLDAS-2 (EDN2), was produced based on sampling strategy 8 of Fig. 2. The EDN2 dataset is freely available online at https://doi.org/10.5065/karj-0e19 (Liu et al. 2021a). It includes 100 members of Precip, Tmean, and DTR data in netCDF format. The ensemble dataset is at 1/8° and daily resolution for the period 1979–2019 and covers the same domain as the NLDAS-2 forcing dataset. Compared with the Newman et al. (2015) station-based ensemble dataset (N15), the uncertainty patterns of EDN2 are reasonable for precipitation from both spatial and temporal perspectives. For precipitation, our method shows increasing uncertainty with increasing daily accumulation and introduces some nonzero ensemble members for grid cells with zero precipitation in the deterministic dataset. The magnitudes of the temperature field uncertainties are considerably lower than those found in N15. To some extent, the relatively lower temperature uncertainty is expected because of the longer spatial correlation lengths of temperature anomalies (compared to precipitation) and the inclusion of terrain-related predictors (mainly elevation) in the creation of the gridded NLDAS-2 field. Estimated temperature uncertainties are smaller in the warm season and larger in the cold season, reflecting the less complex and more complex spatial temperature patterns in those seasons, respectively.
The uncertainty information available through this methodology enables applications such as land surface and hydrologic modeling and forecasting to account for spatiotemporally varying forcing data uncertainty without the need to adopt a new forcing dataset, which could necessitate recalibrating application models or workflows. For example, the meteorological ensembles can be used in ensemble-forcing-based hydrologic model parameter estimation (Liu et al. 2021b). The meteorological ensembles can also be used in ensemble data assimilation, which reduces the need to tune the hyperparameters of assumed statistical error models for the forcing variables (Liu et al. 2019). The ensemble dressing method could be improved by modifications that can be assessed in future research, including applying different transformation operators for precipitation, adopting different spatial attributes as predictors, and, where possible, aligning the grid sampling strategy with the input station locations, if known.
Acknowledgments.
The authors thank Dr. Naoki Mizukami for assistance in aggregating NLDAS-2 forcing data from hourly to daily time steps and Dr. Guoqiang Tang for assistance in revising the paper. This research has been supported by the Bureau of Reclamation under Cooperative Agreement R16AC00039, and by the U.S. Army Corps of Engineers under two projects: Project 462032 and Contract 1254557. We would like to acknowledge high-performance computing support from Cheyenne (doi:10.5065/D6RX99HX) provided by the Computational and Information Systems Laboratory of the National Center for Atmospheric Research (NCAR), sponsored by the National Science Foundation. The authors declare that they have no conflict of interest.
Data availability statement.
This study requires two sets of input data: the NLDAS-2 forcing dataset and the NLDAS-2 topography data. Both are openly available from the Goddard Earth Sciences Data and Information Services Center (GES DISC). The NLDAS-2 forcing dataset is at https://www.emc.ncep.noaa.gov/mmb/nldas/LDAS8th/forcing/forcing_narr.shtml. The NLDAS topography data are at https://ldas.gsfc.nasa.gov/nldas/elevation. Aggregating the NLDAS-2 forcing data from hourly to daily was done using the open-source Climate Data Operators (CDO) and netCDF Operator (NCO) utilities. In addition, the N15 dataset from Newman et al. (2015) serves as the comparison benchmark for this research.
The output of this study is the Ensemble Dressing of NLDAS-2 (EDN2). The EDN2 dataset is available from the NCAR Research Data Archive (RDA) at https://doi.org/10.5065/karj-0e19 in netCDF format (Liu et al. 2021a). It has 100 members at a daily increment for the period 1979–2019 and for the same domain as the NLDAS-2 dataset. Individual ensemble members, the ensemble mean and standard deviation, and the 5th and 95th percentiles of all members of Precip, Tmean, and DTR are provided. In addition, to enable the generation of a user-specified number of ensemble members, or to avoid downloading the large ensemble dataset, we also provide all the input files required for the ensemble generation step of GMET (see section 2), including the regression-estimated mean and standard deviation of Precip, Tmean, and DTR and the topography attributes for all NLDAS-2 grid cells (i.e., latitude, longitude, elevation, and east–west and north–south slopes). In sum, the total size of the EDN2 dataset is approximately 991 GB. The NCAR RDA provides subsetting and technical support services.
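As an illustration of accessing the archived files, the short sketch below opens one ensemble member with xarray; the file and variable names are hypothetical placeholders and should be replaced with those documented in the RDA entry.

```python
import xarray as xr

# File and variable names are hypothetical; consult the RDA entry
# (https://doi.org/10.5065/karj-0e19) for the actual file organization.
ds = xr.open_dataset("EDN2_member_001.nc")

# Subset daily precipitation for 2012-2016 and print the domain-mean value
precip = ds["pcp"].sel(time=slice("2012-01-01", "2016-12-31"))
print(float(precip.mean()))
```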
The GMET software and the associated grid resampling and input preparation codes (written in FORTRAN and Python, respectively) used in this paper are available in a public repository at https://github.com/NCAR/GMET/. In the course of this work, a number of notable upgrades to the GMET software used in Newman et al. (2015) were made, including bug fixes related to the calculation of uncertainty, usability upgrades, and changes to the normalization and standardization methods. Other GMET upgrades that were not used in this work are described in Bunn et al. (2022).
REFERENCES
Abatzoglou, J. T., 2013: Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol., 33, 121–131, https://doi.org/10.1002/joc.3413.
Ailliot, P., D. Allard, V. Monbet, and P. Naveau, 2015: Stochastic weather generators: An overview of weather type models. J. French Stat. Soc., 156, 101–113.
Beven, K., 2002: Towards an alternative blueprint for a physically based digitally simulated hydrologic response modelling system. Hydrol. Processes, 16, 189–206, https://doi.org/10.1002/hyp.343.
Beven, K., and J. Freer, 2001: Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology. J. Hydrol., 249, 11–29, https://doi.org/10.1016/S0022-1694(01)00421-8.
Blazkova, S., and K. Beven, 2009: A limits of acceptability approach to model evaluation and uncertainty estimation in flood frequency estimation by continuous simulation: Skalka catchment, Czech Republic. Water Resour. Res., 45, W00B16, https://doi.org/10.1029/2007WR006726.
Bunn, P. T. W., A. W. Wood, A. J. Newman, H.-I. Chang, C. L. Castro, M. P. Clark, and J. R. Arnold, 2022: Improving station-based ensemble surface meteorological analyses using numerical weather prediction: A case study of the Oroville Dam crisis precipitation event. J. Hydrometeor., 23, 1155–1169, https://doi.org/10.1175/JHM-D-21-0193.1.
Carpenter, T. M., K. P. Georgakakos, and J. A. Sperfslagea, 2001: On the parametric and NEXRAD-radar sensitivities of a distributed hydrologic model suitable for operational use. J. Hydrol., 253, 169–193, https://doi.org/10.1016/S0022-1694(01)00476-0.
Carroll, R. J., D. Ruppert, L. A. Stefanski, and C. M. Crainiceanu, 2006: Measurement Error in Nonlinear Models: A Modern Perspective. 2nd ed. CRC Press, 484 pp.
Clark, M. P., and A. G. Slater, 2006: Probabilistic quantitative precipitation estimation in complex terrain. J. Hydrometeor., 7, 3–22, https://doi.org/10.1175/JHM474.1.
Clark, M. P., D. E. Rupp, R. A. Woods, X. Zheng, R. P. Ibbitt, A. G. Slater, J. Schmidt, and M. J. Uddstrom, 2008: Hydrological data assimilation with the ensemble Kalman filter: Use of streamflow observations to update states in a distributed hydrological model. Adv. Water Resour., 31, 1309–1324, https://doi.org/10.1016/j.advwatres.2008.06.005.
Cornes, R. C., G. van der Schrier, E. J. M. van den Besselaar, and P. D. Jones, 2018: An ensemble version of the E-OBS temperature and precipitation data Sets. J. Geophys. Res. Atmos., 123, 9391–9409, https://doi.org/10.1029/2017JD028200.
Cosgrove, B. A., and Coauthors, 2003: Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res., 108, 8842, https://doi.org/10.1029/2002JD003118.
DeChant, C. M., and H. Moradkhani, 2012: Examining the effectiveness and robustness of sequential data assimilation methods for quantification of uncertainty in hydrologic forecasting. Water Resour. Res., 48, W04518, https://doi.org/10.1029/2011WR011011.
Dee, D., J. Fasullo, D. Shea, J. Walsh, and NCAR Staff, 2016: The Climate Data Guide: Atmospheric reanalysis: Overview and comparison tables. UCAR, accessed 20 July 2020, https://climatedataguide.ucar.edu/climate-data/atmospheric-reanalysis-overview-comparison-tables.
Del Giudice, D., C. Albert, J. Rieckermann, and P. Reichert, 2016: Describing the catchment-averaged precipitation as a stochastic process improves parameter and input estimation. Water Resour. Res., 52, 3162–3186, https://doi.org/10.1002/2015WR017871.
Fang, J., and L. Tacher, 2003: An efficient and accurate algorithm for generating spatially-correlated random fields. Commun. Numer. Methods Eng., 19, 801–808, https://doi.org/10.1002/cnm.621.
Fortin, V., G. Roy, N. Donaldson, and A. Mahidjiba, 2015: Assimilation of radar quantitative precipitation estimations in the Canadian Precipitation Analysis (CaPA). J. Hydrol., 531, 296–307, https://doi.org/10.1016/j.jhydrol.2015.08.003.
Fortin, V., G. Roy, T. Stadnyk, K. Koenig, N. Gasset, and A. Mahidjiba, 2018: Ten years of science based on the Canadian Precipitation Analysis: A CaPA system overview and literature review. Atmos.–Ocean, 56, 178–196, https://doi.org/10.1080/07055900.2018.1474728.
Fuentes-Andino, D., K. Beven, A. Kauffeldt, C.-Y. Xu, S. Halldin, and G. Di Baldassarre, 2017: Event and model dependent rainfall adjustments to improve discharge predictions. Hydrol. Sci. J., 62, 232–245, https://doi.org/10.1080/02626667.2016.1183775.
Gelaro, R., and Coauthors, 2017: The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2). J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1.
Gneiting, T., A. E. Raftery, A. H. Westveld, and T. Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Wea. Rev., 133, 1098–1118, https://doi.org/10.1175/MWR2904.1.
Habib, E., W. F. Krajewski, and A. Kruger, 2001: Sampling errors of tipping-bucket rain gauge measurements. J. Hydrol. Eng., 6, 159–166, https://doi.org/10.1061/(ASCE)1084-0699(2001)6:2(159).
Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560, https://doi.org/10.1175/1520-0493(2001)129<0550:IORHFV>2.0.CO;2.
Haylock, M. R., N. Hofstra, A. M. G. Klein Tank, E. J. Klok, P. D. Jones, and M. New, 2008: A European daily high-resolution gridded data set of surface temperature and precipitation for 1950-2006. J. Geophys. Res., 113, D20119, https://doi.org/10.1029/2008JD010201.
Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.
Huard, D., and A. Mailhot, 2008: Calibration of hydrological model GR2M using Bayesian uncertainty analysis. Water Resour. Res., 44, W02424, https://doi.org/10.1029/2007WR005949.
Huffman, G. J., and Coauthors, 2019: NASA Global Precipitation Measurement (GPM) Integrated Multi-satellitE Retrievals for GPM (IMERG). Algorithm Theoretical Basis Doc., version 06, 34 pp., https://gpm.nasa.gov/sites/default/files/document_files/IMERG_ATBD_V06.pdf.
Humphrey, M. D., J. D. Istok, J. Y. Lee, J. A. Hevesi, and A. L. Flint, 1997: A new method for automated dynamic calibration of tipping-bucket rain gauges. J. Atmos. Oceanic Technol., 14, 1513–1519, https://doi.org/10.1175/1520-0426(1997)014<1513:ANMFAD>2.0.CO;2.
Johnson, M. E., 1987: Multivariate Statistical Simulation. Wiley Series in Probability and Statistics, John Wiley and Sons, 240 pp., https://doi.org/10.1002/9781118150740.
Joyce, R. J., J. E. Janowiak, P. A. Arkin, and P. Xie, 2004: CMORPH: A method that produces global precipitation estimates from passive microwave and infrared data at high spatial and temporal resolution. J. Hydrometeor., 5, 487–503, https://doi.org/10.1175/1525-7541(2004)005<0487:CAMTPG>2.0.CO;2.
Kavetski, D., S. W. Franks, and G. Kuczera, 2003: Confronting input uncertainty in environmental modelling. Calibration of Watershed Models, Q. Duan et al., Eds., Water Science and Application Series, Vol. 6, Amer. Geophys. Union, 49–68.
Kavetski, D., G. Kuczera, and S. W. Franks, 2006: Bayesian analysis of input uncertainty in hydrological modeling: 1. Theory. Water Resour. Res., 42, W03407, https://doi.org/10.1029/2005WR004368.
Kirstetter, P.-E., J. J. Gourley, Y. Hong, J. Zhang, S. Moazamigoodarzi, C. Langston, and A. Arthur, 2015: Probabilistic precipitation rate estimates with ground-based radar networks. Water Resour. Res., 51, 1422–1442, https://doi.org/10.1002/2014WR015672.
Kobayashi, S., and Coauthors, 2015: The JRA-55 reanalysis: General specifications and basic characteristics. J. Meteor. Soc. Japan, 93, 5–48, https://doi.org/10.2151/jmsj.2015-001.
Krzysztofowicz, R., and W. B. Evans, 2008: Probabilistic forecasts from the national digital forecasts database. Wea. Forecasting, 23, 270–289, https://doi.org/10.1175/2007WAF2007029.1.
Leisenring, M., and H. Moradkhani, 2011: Snow water equivalent prediction using Bayesian data assimilation methods. Stochastic Environ. Res. Risk Assess., 25, 253–270, https://doi.org/10.1007/s00477-010-0445-5.
Li, W., Q. Duan, C. Miao, A. Ye, W. Gong, and Z. Di, 2017: A review on statistical postprocessing methods for hydrometeorological ensemble forecasting. Wiley Interdiscip. Rev.: Water, 4, e1246, https://doi.org/10.1002/wat2.1246.
Liu, H., A. Thiboult, B. Tolson, F. Anctil, and J. Mai, 2019: Efficient treatment of climate data uncertainty in ensemble Kalman filter (EnKF) based on an existing historical climate ensemble dataset. J. Hydrol., 568, 985–996, https://doi.org/10.1016/j.jhydrol.2018.11.047.
Liu, H., A. W. Wood, A. J. Newman, and M. P. Clark, 2021a: Ensemble Dressing of North American Land Data Assimilation version 2 (EDN2). NCAR Research Data Archive, accessed 20 August 2021, https://doi.org/10.5065/KARJ-0E19.
Liu, H., B. A. Tolson, A. J. Newman, and A. W. Wood, 2021b: Leveraging ensemble meteorological forcing data to improve parameter estimation of hydrologic models. Hydrol. Processes, 35, e14410, https://doi.org/10.1002/hyp.14410.
Liu, Y., and H. V. Gupta, 2007: Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework. Water Resour. Res., 43, W07401, https://doi.org/10.1029/2006WR005756.
Livneh, B., E. A. Rosenberg, C. Lin, B. Nijssen, V. Mishra, K. M. Andreadis, E. P. Maurer, and D. P. Lettenmaier, 2013: A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States: Update and extensions. J. Climate, 26, 9384–9392, https://doi.org/10.1175/JCLI-D-12-00508.1.
Livneh, B., T. J. Bohn, D. W. Pierce, F. Munoz-Arriola, B. Nijssen, R. Vose, D. R. Cayan, and L. Brekke, 2015: A spatially comprehensive, hydrometeorological data set for Mexico, the U.S., and southern Canada 1950–2013. Sci. Data, 2, 150042, https://doi.org/10.1038/sdata.2015.42.
Maurer, E. P., A. W. Wood, J. C. Adam, D. P. Lettenmaier, and B. Nijssen, 2002: A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States. J. Climate, 15, 3237–3251, https://doi.org/10.1175/1520-0442(2002)015<3237:ALTHBD>2.0.CO;2.
McMillan, H., B. Jackson, M. Clark, D. Kavetski, and R. Woods, 2011: Rainfall uncertainty in hydrological modelling: An evaluation of multiplicative error models. J. Hydrol., 400, 83–94, https://doi.org/10.1016/j.jhydrol.2011.01.026.
Montanari, A., 2005: Large sample behaviors of the generalized likelihood uncertainty estimation (GLUE) in assessing the uncertainty of rainfall-runoff simulations. Water Resour. Res., 41, W08406, https://doi.org/10.1029/2004WR003826.
Montanari, A., and D. Koutsoyiannis, 2012: A blueprint for process-based modeling of uncertain hydrological systems. Water Resour. Res., 48, W09555, https://doi.org/10.1029/2011WR011412.
Moradkhani, H., S. Sorooshian, H. V. Gupta, and P. R. Houser, 2005: Dual state–parameter estimation of hydrological models using ensemble Kalman filter. Adv. Water Resour., 28, 135–147, https://doi.org/10.1016/j.advwatres.2004.09.002.
Moradkhani, H., C. M. Dechant, and S. Sorooshian, 2012: Evolution of ensemble data assimilation for uncertainty quantification using the particle filter-Markov chain Monte Carlo method. Water Resour. Res., 48, W12520, https://doi.org/10.1029/2012WR012144.
Newman, A. J., and Coauthors, 2015: Gridded ensemble precipitation and temperature estimates for the contiguous United States. J. Hydrometeor., 16, 2481–2500, https://doi.org/10.1175/JHM-D-15-0026.1.
Newman, A. J., M. P. Clark, R. J. Longman, E. Gilleland, T. W. Giambelluca, and J. R. Arnold, 2019: Use of daily station observations to produce high-resolution gridded probabilistic precipitation and temperature time series for the Hawaiian Islands. J. Hydrometeor., 20, 509–529, https://doi.org/10.1175/JHM-D-18-0113.1.
Nijssen, B., and D. P. Lettenmaier, 2004: Effect of precipitation sampling error on simulated hydrological fluxes and states: Anticipating the Global Precipitation Measurement satellites. J. Geophys. Res., 109, D02103, https://doi.org/10.1029/2003JD003497.
Papalexiou, S. M., 2018: Unified theory for stochastic modelling of hydroclimatic processes: Preserving marginal distributions, correlation structures, and intermittency. Adv. Water Resour., 115, 234–252, https://doi.org/10.1016/j.advwatres.2018.02.013.
Piani, C., J. O. Haerter, and E. Coppola, 2010: Statistical bias correction for daily precipitation in regional climate models over Europe. Theor. Appl. Climatol., 99, 187–192, https://doi.org/10.1007/s00704-009-0134-9.
PRISM Climate Group, 2016: Descriptions of PRISM spatial climate datasets for the conterminous United States. Oregon State University, 29 pp., https://prism.oregonstate.edu/documents/PRISM_datasets.pdf.
Rasmussen, J., H. Madsen, K. H. Jensen, and J. C. Refsgaard, 2015: Data assimilation in integrated hydrological modeling using ensemble Kalman filtering: Evaluating the effect of ensemble size and localization on filter performance. Hydrol. Earth Syst. Sci., 19, 2999–3013, https://doi.org/10.5194/hess-19-2999-2015.
Renard, B., D. Kavetski, E. Leblois, M. Thyer, G. Kuczera, and S. W. Franks, 2011: Toward a reliable decomposition of predictive uncertainty in hydrological modeling: Characterizing rainfall errors using conditional simulation. Water Resour. Res., 47, W11516, https://doi.org/10.1029/2011WR010643.
Roulston, M. S., and L. A. Smith, 2003: Combining dynamical and statistical ensembles. Tellus, 55A, 16–30, https://doi.org/10.3402/tellusa.v55i1.12082.
Salamon, P., and L. Feyen, 2009: Assessing parameter, precipitation, and predictive uncertainty in a distributed hydrological model using sequential data assimilation with the particle filter. J. Hydrol., 376, 428–442, https://doi.org/10.1016/j.jhydrol.2009.07.051.
Scheuerer, M., and T. M. Hamill, 2015: Statistical postprocessing of ensemble precipitation forecasts by fitting censored, shifted gamma distributions. Mon. Wea. Rev., 143, 4578–4596, https://doi.org/10.1175/MWR-D-15-0061.1.
Schiemann, R., R. Erdin, M. Willi, C. Frei, M. Berenguer, and D. Sempere-Torres, 2011: Geostatistical radar-raingauge combination with nonparametric correlograms: Methodological considerations and application in Switzerland. Hydrol. Earth Syst. Sci., 15, 1515–1536, https://doi.org/10.5194/hess-15-1515-2011.
Semenov, M., and P. Stratonovitch, 2010: Use of multi-model ensembles from global climate models for assessment of climate change impacts. Climate Res., 41, 1–14, https://doi.org/10.3354/cr00836.
Sheffield, J., G. Goteti, and E. F. Wood, 2006: Development of a 50-year high-resolution global dataset of meteorological forcings for land surface modeling. J. Climate, 19, 3088–3111, https://doi.org/10.1175/JCLI3790.1.
Slater, A. G., and M. P. Clark, 2006: Snow data assimilation via an ensemble Kalman filter. J. Hydrometeor., 7, 478–493, https://doi.org/10.1175/JHM505.1.
Tang, G., M. P. Clark, S. M. Papalexiou, A. J. Newman, A. W. Wood, D. Brunet, and P. H. Whitfield, 2021: EMDNA: An Ensemble Meteorological Dataset for North America. Earth Syst. Sci. Data, 13, 3337–3362, https://doi.org/10.5194/essd-13-3337-2021.
Tang, G., M. P. Clark, and S. M. Papalexiou, 2022: EM-Earth: The ensemble meteorological dataset for planet Earth. Bull. Amer. Meteor. Soc., 103, E996–E1018, https://doi.org/10.1175/BAMS-D-21-0106.1.
Thiboult, A., F. Anctil, and M.-A. Boucher, 2016: Accounting for three sources of uncertainty in ensemble hydrological forecasting. Hydrol. Earth Syst. Sci., 20, 1809–1825, https://doi.org/10.5194/hess-20-1809-2016.
Thornton, P. E., M. M. Thornton, B. W. Mayer, Y. Wei, R. Devarakonda, R. S. Vose, and R. B. Cook, 2016: Daymet: Daily surface weather data on a 1-km grid for North America, version 3. ORNL DAAC, accessed 20 August 2021, https://doi.org/10.3334/ORNLDAAC/1328.
Van Schaeybroeck, B., and S. Vannitsem, 2015: Ensemble post-processing using member-by-member approaches: Theoretical aspects. Quart. J. Roy. Meteor. Soc., 141, 807–818, https://doi.org/10.1002/qj.2397.
Wang, X., and C. H. Bishop, 2005: Improvement of ensemble reliability with a new dressing kernel. Quart. J. Roy. Meteor. Soc., 131, 965–986, https://doi.org/10.1256/qj.04.120.
Wilks, D. S., and T. M. Hamill, 2007: Comparison of ensemble-MOS methods using GFS reforecasts. Mon. Wea. Rev., 135, 2379–2390, https://doi.org/10.1175/MWR3402.1.
Wu, L., D.-J. Seo, J. Demargne, J. D. Brown, S. Cong, and J. Schaake, 2011: Generation of ensemble precipitation forecast from single-valued quantitative precipitation forecast for hydrologic ensemble prediction. J. Hydrol., 399, 281–298, https://doi.org/10.1016/j.jhydrol.2011.01.013.
Xia, Y., and Coauthors, 2012: Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products. J. Geophys. Res., 117, D03109, https://doi.org/10.1029/2011JD016048.