1. Introduction
Hydrologic models are commonly calibrated to accurately represent streamflow, with research over the past decade focusing on “regionalization” to estimate streamflow in ungauged basins (e.g., Mizukami et al. 2017; Yang et al. 2019). However, there has been less of a focus on calibrating and regionalizing functionally similar land surface models (LSMs) that can be coupled with and/or driven by meteorological forcing data produced by regional climate models (RCMs; Holtzman et al. 2020). While reanalysis data, or historical forcing data, downscaled by RCMs are not constrained by point station observations like most gridded hydrologic forcing datasets (e.g., PRISM; Livneh et al. 2013), downscaled reanalysis data can provide alternative benefits such as the ability to physically resolve meteorology in space and time (Keller and Wahl 2021). This contrasts with the type of gridded products derived from point-location data that are commonly used as a forcing for hydrologic models. Such gridded products may be accurate at the stations they are derived from, but are statistically, rather than physically, interpolated to ungauged regions, which can lead to inaccurate trends, unrealistic variability in space and time, and deviations from climatology in ungauged regions (Walton and Hall 2018). Furthermore, unlike a hydrologic model calibrated by a gridded observational dataset, an LSM calibrated by an RCM’s reanalysis data can then be tightly coupled with the RCM to accurately represent runoff (e.g., Holtzman et al. 2020) and evaluate land surface–atmosphere interactions (e.g., Kerandi et al. 2018).
In the hydrologic community, calibration is an important practice to ensure streamflow can be accurately represented. While sometimes overlooked in the atmospheric and regional climate modeling community, calibration of an LSM’s streamflow can provide a more accurate representation of annual water availability and flood hazards (e.g., Holtzman et al. 2020). This study attempts to calibrate runoff in gauged and ungauged regions using downscaled fifth-generation ECMWF Reanalysis (ERA5) data produced from a previous study, where we used the Weather Research and Forecasting (WRF) RCM to reduce downscaled ERA5 precipitation biases via a series of sensitivity tests (Rahimi et al. 2022). While simulations are performed offline and tight coupling is not performed in this study, this calibration prepares the Noah-Multiparameterization (Noah-MP) LSM to accurately represent runoff in subsequent studies where it can additionally be tightly coupled with its corresponding RCM (WRF). Furthermore, we develop two calibrated LSM parameter sets, one that improves the representation of runoff for water availability studies at a monthly time resolution and another focused on flood hazards at a daily time resolution. In many respects, this is a follow-on paper from our downscaling of ERA5, where we found wet biases remained in our streamflow despite improving precipitation via WRF sensitivity tests (Rahimi et al. 2022). While we do not modify WRF’s parameterizations, which we evaluated in detail in Rahimi et al. (2022), or perform a tight coupling between WRF and its underlying LSM in this manuscript, we perform calibration of streamflow across a hydroclimatically diverse region (the western United States). Aside from producing a model capable of accurately representing runoff, our intention is to take advantage of the physical representation of spatial meteorology provided by WRF’s downscaling of ERA5 and prepare the calibrated LSM for tight coupling with WRF in subsequent studies.
More specifically, we explore and provide details on the three major steps needed to accurately represent streamflow in an LSM: (i) physical parameterization selection, (ii) land surface parameter calibration, and (iii) regionalization of calibrated parameters to ungauged areas. The first component, parameterization selection of the land surface model, is more commonly modified in the regional climate modeling community to provide an accurate representation of snowpack (e.g., Rahimi et al. 2022). However, biases in streamflow, even after optimizing parameterization selection, can be large (Holtzman et al. 2020; Rahimi et al. 2022). Thus, the second step is required, where land surface parameters are calibrated. This second step is applied where there is observational streamflow data to compare against for a basin of interest. By comparing simulated and observed streamflow data, soil and other land surface properties can be adjusted to reduce streamflow biases. The third step, where calibrated land surface parameters are regionalized to ungauged basins, is an active area of research. Here we evaluate a regionalization approach called the donor-basin method. We focus on this regionalization method given its success compared to other approaches (e.g., Arsenault and Brissette 2014; Poissant et al. 2017; Razavi and Coulibaly 2017; Yang et al. 2018; Qi et al. 2021) and its use in the National Water Model (NWM; Gochis et al. 2019), which provides streamflow forecasts across the continental United States (CONUS). A drawback of this approach is uncertainty regarding its performance when applied with unique methodological choices (designs) (e.g., Poissant et al. 2017; Yang et al. 2018), which can lead to suboptimal performance if a single donor-basin design is selected. In this vein, we evaluate the donor-basin method in detail via several sensitivity tests to determine its performance with unique designs (e.g., with or without nonlinear clustering of basins).
In the following section, we discuss the forcing dataset and our land surface modeling setup. In section 3, we discuss how offline Noah-MP’s domain-wide physical parameterizations were selected, describe the calibration process employed for gauged basins, and evaluate the model’s performance (validation) in gauged basins for the 1954–2021 time period. Section 4 provides a review of the sensitivity and performance of the donor-basin regionalization approach when applied with unique designs, and conclusions are provided in section 5.
2. Forcing dataset and land surface model overview
The forcing dataset used in this study was obtained from ERA5 dynamically downscaled with the regional climate model WRF to 9-km resolution across the western United States from September 1950 to August 2021 (Rahimi et al. 2022). Henceforth, we refer to the forcing dataset as ERA5-WRF, and we define a study year as running from September through August; for example, the study year 1989 represents September 1988–August 1989. Sensitivity tests were performed to reduce precipitation biases for ERA5-WRF, as discussed in Rahimi et al. (2022). However, even after adjusting RCM biases, streamflow simulated by the Noah-MP LSM (Niu et al. 2011) coupled to ERA5-WRF was wet biased by as much as 28% (Rahimi et al. 2022). Given this bias, here we take the LSM Noah-MP, which was used with WRF to downscale ERA5, offline for calibration. This Noah-MP model has the same grid and resolution as the WRF model used to downscale ERA5. While the reanalysis dataset has remaining precipitation biases, such as a wet summer precipitation bias across the southwestern United States, we purposely avoid bias correction of the downscaled forcing data since doing so would preclude the ability to tightly couple the calibrated LSM with WRF in subsequent studies.
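The September–August study-year convention described above can be expressed as a one-line rule. The following is a minimal sketch; the function name `study_year` is hypothetical and is not taken from the study's code.

```python
from datetime import date


def study_year(d: date) -> int:
    """Return the September-August study year that contains date d."""
    # September through December belong to the study year that is
    # labeled with the following calendar year.
    return d.year + 1 if d.month >= 9 else d.year


print(study_year(date(1988, 9, 1)))   # 1989
print(study_year(date(1989, 8, 31)))  # 1989
```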
In the offline simulations in this study, a time step of 3 h was used. The years 1951–2021 were simulated, but the first three years were discarded as spinup (1951–53). We determined three years to be a sufficient spinup based on negligible changes to streamflow or snowpack if a longer spinup time were used. The model’s land use data are aggregated from USGS 30-arc-s 24-land-use categories (USGS 2018), and the soil type is similarly aggregated from 30-arc-s hybrid State Soil Geographic Database (STATSGO) soil texture datasets (NCAR 2022). Initial or default land surface parameters in Noah-MP such as vegetation conditions that control the rate of transpiration and soil properties that influence streamflow are based on these datasets. Given the intention to assess historical and future climate change impacts on hydrology in future studies, the hydrologic model is calibrated to full natural flow (henceforth referred to as streamflow), which represents streamflow conditions without human infrastructure interventions (i.e., reservoirs or diversions). This is a common practice for large-scale streamflow studies since full natural flow can then be applied to separate models capable of representing local and time-varying conditions associated with human infrastructure such as water supply/demand and flood mitigation.
Due to the different observational streamflow datasets available at a daily and monthly time resolution, we develop two unique land surface parameter sets, one more appropriate for evaluating flood hazards, and another more appropriate for water availability analysis. The flood hazard model was compared against daily natural flow data from (i) GagesII reference stations, which are minimally human-influenced basins scattered across the continental United States (Falcone 2011), (ii) California’s Sierra Nevada basins from California’s Department of Water Resources (California Department of Water Resources 2021), and (iii) California’s Eastern Sierra Nevada basins received from the Los Angeles Department of Water and Power (LADWP). The water availability model was compared against monthly natural flow data from the previous data sources, in addition to that from the Bureau of Reclamation for the Upper and Lower Colorado River basin (Bureau of Reclamation 2021). Also, for the monthly data, California’s Sierra Nevada basins include larger drainage areas and additional basins that are not included in DWR’s daily dataset. Figure 1 shows the location of basins with available daily and monthly natural flow data used in the study. Note, while we include California’s Eastern Sierra Nevada as a part of our analysis, given that the observed data we received from LADWP is the best available data for this region, we limit our interpretation for this particular region. This is because of possible uncertainties in the observed streamflow data for the eastern Sierra Nevada which may not represent the full drainage area associated with this basin, an ongoing question we are working with LADWP to address.
Fig. 1. Basins with available (a) daily and (b) monthly natural streamflow data and the (c) ungauged Hydrologic Unit Code (HUC)-10 basins where calibrated parameters are regionalized. Note, the larger blue outline represents major HUC-2 basins in the western United States.
Citation: Journal of Hydrometeorology 24, 2; 10.1175/JHM-D-22-0047.1
The WRF-Hydro modeling platform was used to drive Noah-MP. To obtain streamflow, gridded runoff from Noah-MP was aggregated across a given watershed and weighted by the fraction of a grid cell’s area within a basin. While overland routing and channel routing can be implemented as a part of the WRF-Hydro modeling platform, we did not do so, for a variety of reasons. These include the significant increase in computation time required to calibrate each routing component’s additional parameters, and issues in associating the location of digital elevation model (DEM) delineated streams to the correct location of several hundred basin outlets. For the size of the basins calibrated in this study we found that routing had a small impact on daily streamflow performance, which agrees with previous studies (e.g., Li et al. 2019). However, if a finer time resolution than daily streamflow or larger basins are of interest, the runoff from the calibrated Noah-MP in this study can be applied to routing models.
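The area-weighted aggregation of gridded runoff to a basin total can be sketched as follows. This is an illustration only: the function name, the dictionary-based inputs, and the unit conversion from runoff depth (mm per time step) to discharge (m^3/s) are assumptions, not the study's implementation.

```python
def basin_streamflow(runoff_mm, frac_in_basin, cell_area_km2, dt_seconds):
    """Aggregate gridded runoff depth to a basin-mean discharge in m^3/s.

    runoff_mm     : {cell_id: runoff depth in mm over one time step}
    frac_in_basin : {cell_id: fraction of the cell's area inside the basin}
    cell_area_km2 : area of one grid cell (assumed uniform)
    dt_seconds    : length of the time step in seconds
    """
    volume_m3 = 0.0
    for cell, frac in frac_in_basin.items():
        depth_m = runoff_mm[cell] * 1e-3            # mm -> m
        area_m2 = frac * cell_area_km2 * 1e6        # km^2 -> m^2, weighted
        volume_m3 += depth_m * area_m2
    return volume_m3 / dt_seconds


# Hypothetical two-cell basin on a 9-km grid with a 3-h time step.
q = basin_streamflow(
    runoff_mm={"a": 1.0, "b": 2.0},
    frac_in_basin={"a": 1.0, "b": 0.5},
    cell_area_km2=81.0,
    dt_seconds=3 * 3600,
)
print(round(q, 3))
```

Cells that only partly overlap the basin contribute in proportion to the overlapping area, which is the weighting described in the text.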
3. Model parameterization, calibration, and validation
a. Noah-MP parameterization
The first step in setting up Noah-MP for the western United States involved selecting an appropriate set of LSM parameterizations for the domain. Previous studies collectively suggest that optimal parameterizations differ from region to region and should be selected via sensitivity tests for a given domain (You et al. 2020). To determine the best set of parameterizations, sensitivity tests were performed for the years 1966–69 and 2004–07. These 8 years were selected because they maximize the number of basins with available data for two distinct time periods, resulting in a total of 586 basins with daily streamflow data across the ERA5-WRF domain, with at least 10 months of data in each study year. As later discussed, these years are additionally used for calibrating the LSM and for evaluating regionalization performance to ungauged basins. In this split-sample calibration approach (e.g., Newman et al. 2015), we use the same land use conditions for both time periods and allow for a 3-yr spinup prior to each set of years. The main purpose of the two time periods was to increase the number of basins across the western United States available for regionalizing to ungauged basins, while fine-tuning the calibrated parameter sets was a secondary benefit. The location of the basins available for each time period (1966–69 and 2004–07) and the hydroclimate characteristics associated with the basins in each time period are shown in Fig. S1 in the online supplemental material. The precipitation, based on rainfall across major western U.S. basins (the hydrologic unit codes outlined in Fig. 1), for these 8 years ranges from 86% to 110% of normal (normal represented by the mean from 1954 to 2021) with an average equal to 101% of normal.
For reference, the possible parameterizations in Noah-MP are listed in Table S1. To reduce the total number of Noah-MP parameterizations from its 20 736 possible combinations (You et al. 2020), we only evaluate the most sensitive parameterizations, in a two-step process for snowpack and then streamflow. First, a set of LSM parameterizations was selected that led to the best representation of snow water equivalent (SWE), through comparisons against 795 Snowpack Telemetry (SNOTEL) stations and gridded SWE (PRISM-UA; Zeng et al. 2018) (Fig. 2). SWE solutions were found to be heavily influenced by the choice of rain–snow partitioning (ptp) and radiation transfer (rad) parameterizations. Unique parameterizations for the lower boundary condition of soil temperature (tbot), snow surface albedo (alb), and the first-layer snow/soil temperature time scheme (temp) were found to have a negligible impact on SWE compared to the parameterization options for ptp and rad.
Fig. 2. Performance of unique parameterizations for representing snowpack and streamflow with baseline Noah-MP land surface parameters. (a) Mean monthly SWE at SNOTEL sites shown in (b) for the parameterizations SWE is most sensitive to (ptp, rad). (c) Streamflow performance of different runoff parameterizations when compared against observed data across the entire domain and for each HUC-2 basin that falls within the domain. Performance of streamflow was evaluated for the calibration period (1966–69 and 2004–07) and SWE was evaluated from 2004 to 2007 based on data availability. Plots are shown for the daily flood hazard model, but the monthly water availability model has the same performance for SWE and experiences a similar impact on streamflow performance for the runoff parameterizations.
Figure 2a illustrates a wide range of uncertainty associated with the two observation datasets (SNOTEL and PRISM-UA). While the combination of rad3 and ptp3 (rad3/ptp3) best represents mean monthly SWE compared to PRISM-UA, other sets of parameterizations (rad1/ptp1) better match the SNOTEL springtime peak SWE or provide an intermediate estimate (rad3/ptp1) between the range of PRISM-UA and SNOTEL observations. Due to this observational uncertainty (these biases between the observational datasets exist for longer time periods as well), and the relatively small difference in the baseline model’s streamflow performance between rad3/ptp1 and rad3/ptp3 (Fig. S2a), a full calibration followed by a 1954–2021 validation was deemed necessary. We selected between the ptp and rad parameterization combinations by comparing against calibrated (Fig. S2b) and subsequently validated streamflow (Fig. S2c). From our analysis (details outlined in caption of Fig. S2), we found that the default parameterization (rad3/ptp1), which provides an estimate of SWE that falls between PRISM-UA and SNOTEL without favoring either observational SWE dataset like the other rad/ptp options, provides the best domain-wide streamflow performance. We suspect that ptp1 results in a more accurate representation of snowpack and streamflow since it is based on more complex, physically based functions (Jordan 1991), while the ptp3 parameterization uses a relatively simple assumption that all precipitation falls as snow when the surface air temperature is less than freezing. On the other hand, for the rad parameterizations which control the radiation transfer through vegetation canopy, rad1 uses a three-dimensional representation of canopy geometry (Niu and Yang 2004) that may not be accurately represented in the Noah-MP model by its default vegetation conditions, possibly explaining why rad3 results in more accurate streamflow than rad1. Overall, the optimal combination of parameterizations that influence snowpack was not an obvious choice, and calibration/validation to streamflow was needed to select the best option for our hydroclimatically diverse domain.
For the daily flood hazard model, the SIMGM runoff parameterization (rnf1) performed noticeably better than the default parameterization (rnf3) in representing streamflow across the study domain (a median NSE for all basins of 0.12 versus −0.06) (Fig. 2c). Furthermore, rnf1 led to more accurate streamflow across most of the western U.S. HUC-2s except California, where rnf3 performed slightly better than rnf1, and the Lower Colorado, where rnf2 performed slightly better than rnf1. As a result, based on our sensitivity experiments, the only change we made from default parameterizations was for the runoff parameterization (Table S1, selected rnf1 rather than rnf3). However, we note that flexible parameterization across the domain, a capability not currently employed in Noah-MP or WRF-Hydro, could lead to more accurate streamflow representation for large, hydroclimatically diverse study domains such as that in this study.
b. Calibration of gauged basins
After selecting the best set of domain-wide parameterizations for representing SWE and streamflow, spatially varying LSM parameters were optimized to further improve streamflow performance. Based on available computational resources, four parameters were selected for calibration of the hydrologic model: saturated hydraulic conductivity (Ksat), maximum soil moisture content (MAXSMC) representing porosity, the pore-size distribution index (Bexp) which impacts how the hydraulic function relates water potential and volumetric water content, and the melt factor for snow depletion curve (MFSNO) which controls snowmelt rate. These parameters were selected from an evaluation of streamflow sensitivity in this and previous studies (Cai et al. 2014; Cuntz et al. 2016; Gochis et al. 2019). While the default values for Ksat, MAXSMC, and Bexp in Noah-MP are based on STATSGO soil types, the values associated with these parameters are typically uncertain, and for the MFSNO parameter the Noah-MP/WRF-Hydro model uses a domain-wide default value of 2.5. A range of physically realistic modifications were considered for each parameter (Table 1) based on values used in previous studies (Cai et al. 2014; Mendoza et al. 2015; Gochis et al. 2019; Hussein 2020), leading to a total of 800 simulations. Like the parameterization selection, the years 1966–69 and 2004–07 were used for the calibration. The calibration was performed using the NSE as an objective function, with daily streamflow for the flood hazard model and monthly streamflow for the water availability model. In determining the best parameter set and NSE, we combine the two time periods as continuous datasets, rather than taking a mean of the optimized basin parameters for the two time periods. The 800 simulations listed in Table 1 were performed in parallel. 
We considered using an algorithmic calibration approach (e.g., dynamically dimensioned search; Tolson and Shoemaker 2007); however, such an approach was not employed given the requirement to make simulations in series which leads to long computation times.
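The objective function and the selection of the best parameter set over the two calibration periods treated as one continuous record can be sketched as below. This is a minimal illustration: the function names and the `sims_by_set` layout are hypothetical, and the toy series stand in for the 800 simulations of Table 1.

```python
def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for s, o in zip(sim, obs))
    var = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / var


def best_parameter_set(sims_by_set, obs_p1, obs_p2):
    """Pick the parameter set with the highest NSE over both periods.

    sims_by_set : {set_id: (simulated period 1, simulated period 2)}
    The two periods are concatenated before scoring, rather than
    scoring each period separately and averaging the optimal parameters.
    """
    obs = list(obs_p1) + list(obs_p2)
    scores = {
        set_id: nse(list(s1) + list(s2), obs)
        for set_id, (s1, s2) in sims_by_set.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy example with two candidate parameter sets.
sims = {
    "set_a": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "set_b": ([2.0, 2.0, 2.0], [5.0, 5.0, 5.0]),
}
print(best_parameter_set(sims, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Because every candidate simulation is independent of the others, an exhaustive search of this kind can be run in parallel, which is the trade-off against a serial search algorithm noted in the text.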
Table 1. Parameters and modifications to their initial default values evaluated in the calibration.
The calibration improves the median NSE for all basins, from 0.12 to 0.45 for daily streamflow in the flood hazard model, and from 0.18 to 0.64 for monthly streamflow in the water availability model (Table 2). Additionally, the streamflow bias is reduced from 23.9% to 8.1% for the daily flood hazard model, and from 24.2% to 5.4% for the monthly water availability model. Based on spatial plots of NSE, streamflow is qualitatively improved from the baseline to calibrated parameters across the western United States for both the flood hazard and water availability models (Figs. 3a,b). However, the Lower Colorado River basin and the southwestern portion of the Upper Colorado basin include several basins with low NSE values even after calibration. Poorer performance of hydrologic models in arid regions is a common problem since small errors in the forcing dataset can lead to large biases in streamflow (Mizukami et al. 2017; Oubeidillah et al. 2014; Yang et al. 2019). This is clearly an issue in this study due to ERA5-WRF’s summertime positive precipitation bias in the Lower Colorado basin and the southwestern portion of the Upper Colorado basin, which have an average June–August bias greater than 0.5 mm day−1 (see Fig. 4 in Rahimi et al. 2022). We find that for both the daily and monthly calibrated models there is a weak to moderate correlation (Pearson’s r ranging from roughly 0.25 to 0.35) between a higher NSE and wetter conditions (as indicated by the mean rainfall during the spring, winter, and fall, extreme maximum 1-day annual rainfall or Rx-1d, and higher latitude), greater forest coverage, and colder mean summer temperature (Table S1). Also, while we optimize streamflow for the two time periods as continuous datasets, the distributions of optimal parameter modifications for the separate time periods (1966–69 and 2004–07) are largely consistent with each other (Fig. S3).
Fig. 3. Baseline, calibrated, and regionalized performance for the calibration time period (1966–69, 2004–07) with NSE of each basin for the (a) daily flood hazard model and (b) monthly water availability model.
Table 2. Streamflow performance based on all basins.
While SWE was not explicitly optimized, it was modified in the calibration via the selection of a spatially distributed snowmelt factor (MFSNO) which maximized streamflow performance for each basin. However, to avoid changes to SWE that would not be physically realistic, the MFSNO parameter was only modified if it led to a >0.05 increase in streamflow NSE. Not employing this requirement led to a less realistic representation of SWE (a negative SWE bias compared to both SNOTEL and PRISM-UA) with marginal improvements to streamflow.
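The acceptance rule for MFSNO can be illustrated with a short sketch. It assumes the NSE values for candidate MFSNO settings are already available for a basin; the function name and the candidate values used in the example are hypothetical, and how the rule interacts with the other three parameters is not specified here.

```python
def choose_mfsno(nse_default, nse_by_mfsno, default_mfsno=2.5, min_gain=0.05):
    """Keep the default MFSNO unless a candidate raises NSE by > min_gain.

    nse_default  : streamflow NSE obtained with the default MFSNO
    nse_by_mfsno : {candidate MFSNO value: streamflow NSE}
    """
    best = max(nse_by_mfsno, key=nse_by_mfsno.get)
    if nse_by_mfsno[best] - nse_default > min_gain:
        return best
    return default_mfsno


# A gain of 0.04 is below the threshold, so the default is retained.
print(choose_mfsno(0.40, {1.5: 0.44, 3.5: 0.42}))
# A gain of 0.10 exceeds the threshold, so the candidate is adopted.
print(choose_mfsno(0.40, {1.5: 0.50, 3.5: 0.42}))
```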
c. Validation for 1954–2021
After calibrating the daily flood hazard and monthly water availability models, results were validated over 1954–2021 with the 8 years used for calibration removed. The ability of the monthly water availability model to represent mean monthly streamflow and total annual water availability is shown in Fig. 4. We also include performance for additional streamflow metrics not explicitly calibrated to, but relevant for understanding the model’s performance, in Table 2. This table includes the Kling–Gupta efficiency (KGE; Kling et al. 2012) metric, which measures streamflow performance based on a weighting of the bias, correlation, and variability of streamflow, and varies from negative infinity (poor performance) to 1 (perfect prediction) similar to the NSE. Based on all of the basins’ monthly streamflow, the median NSE for the validation period is reduced to 0.56 compared to an NSE of 0.64 for the calibration period. However, the KGE (0.65) and bias (4.43%) are nearly the same as the calibration period (0.66 and 5.42%, respectively) (Table 2). Considering the calibration was performed based on ∼10% of the study period, this generally demonstrates consistency over time in the performance of the optimized parameters. In Fig. 4a, we demonstrate that the mean monthly streamflow is accurately represented for basins with mean winter temperatures less than −1°C (assumed to be snowmelt influenced) and greater than 1°C (little to no snowmelt influence). For different regions across the western United States, streamflow is well captured for the monthly model across the Columbia River basin and California’s central Sierra Nevada, moderate performance is observed for the Upper Colorado basin, and poor performance is observed for the Lower Colorado basin. While more accurate streamflow performance can be obtained by increasing the range of MAXSMC (porosity) modification factors considered in the calibration (Table S3, Fig. S4), we elected to limit the range of our modifications since using a wider range of parameter modifications in the calibration leads to wetter soil moisture biases in the Colorado River basin (later discussed in section 4e). In other words, we chose to maintain more physically realistic soil properties that avoid deviating too far from the initial or default soil texture and property conditions from STATSGO used in setting up the model; however, a study primarily interested in streamflow performance may consider using the wider range of parameter modifications outlined in Table S3. Finally, we reiterate that the poorer performance across the Lower Colorado basin and southwestern portion of the Upper Colorado basin is primarily driven by a wet bias (roughly 0.5 mm day−1) in the forcing dataset across this region, which we chose not to bias correct to allow for tight coupling of the LSM with WRF in subsequent studies.
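For readers unfamiliar with the KGE, a minimal sketch of the Kling et al. (2012) form is given below, combining correlation, the ratio of means (bias), and the ratio of coefficients of variation (variability). The percent-bias definition shown alongside it is an assumed, commonly used form; the exact formula used for the bias values in Table 2 is not stated in this section.

```python
import math
from statistics import mean, pstdev


def kge_2012(sim, obs):
    """Modified Kling-Gupta efficiency (Kling et al. 2012)."""
    ms, mo = mean(sim), mean(obs)
    ss, so = pstdev(sim), pstdev(obs)
    cov = sum((s - ms) * (o - mo) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (ss * so)                 # linear correlation
    beta = ms / mo                      # bias ratio
    gamma = (ss / ms) / (so / mo)       # variability (CV) ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


def percent_bias(sim, obs):
    """Assumed definition: total simulated minus observed, in percent."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)


obs = [1.0, 2.0, 3.0, 5.0]
sim = [1.2, 1.9, 3.3, 4.6]
print(round(kge_2012(sim, obs), 3), round(percent_bias(sim, obs), 2))
```

A simulation that doubles every observed value keeps perfect correlation and the same coefficient of variation, so only the bias term departs from 1 and the KGE falls to 0.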
Fig. 4. Evaluation of monthly water budget model across the western U.S. domain from 1954 to 2021. (a) Mean monthly streamflow and (b) annual water availability for each gauged region.
To understand the flood hazard model’s ability to represent high-flow conditions, the observed and simulated annual-maximum and return-period streamflow were evaluated (Fig. 5). The return period streamflow was obtained by fitting a generalized extreme value (GEV) Type 1 (Gumbel) distribution to the total annual maximum 1-day streamflow for each region evaluated (Gumbel 1958). The Columbia and Colorado River basins have roughly a −12.5% and −9% dry bias, respectively, in the annual maximum streamflow and return periods ranging from 2 to 100 years, while California’s central Sierra Nevada has a 7.7% wet bias in the annual maximum streamflow and roughly a 10% wet bias in the 2–100-yr return period streamflow. We additionally include simulated hydrographs of the most severe winter and spring/summer events on record in the supplemental material (Fig. S5).
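A return-period estimate from a Gumbel fit can be sketched as follows. The fitting method is not stated in this section, so the method of moments used here is an assumption (maximum likelihood is an equally common choice), and the annual maxima in the example are invented for illustration.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(annual_maxima):
    """Estimate Gumbel location and scale by the method of moments."""
    scale = stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    loc = mean(annual_maxima) - EULER_GAMMA * scale
    return loc, scale


def return_level(loc, scale, period_years):
    """Flow with annual non-exceedance probability 1 - 1/T."""
    p_non_exceed = 1.0 - 1.0 / period_years
    return loc - scale * math.log(-math.log(p_non_exceed))


# Hypothetical annual maximum 1-day flows (m^3/s).
maxima = [100.0, 150.0, 120.0, 300.0, 180.0, 90.0, 210.0]
loc, scale = fit_gumbel_moments(maxima)
for period in (2, 10, 100):
    print(period, round(return_level(loc, scale, period), 1))
```

Applying the same fit to observed and simulated annual maxima and differencing the resulting return levels gives the percent bias per return period.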
Fig. 5. (a) Annual 1-day maximum streamflow and (b) percent bias in return period streamflow based on a GEV (type 1) distribution for observed and simulated conditions from the daily flood hazard model from 1954 to 2021.
4. Regionalization to ungauged basins
a. Selection of regionalization method
In the hydrologic modeling community, two primary regionalization approaches exist for estimating streamflow in ungauged basins. The first involves calibration of a hydrologic model to gridded runoff metrics, which can simplify the calibration process significantly. Spatially continuous gridded runoff metrics have been derived using artificial neural networks (ANNs) trained on the relationship between gauged basins and their hydroclimate characteristics (Beck et al. 2015; Yang et al. 2019). The second approach involves calibration to individual streamflow gauges, followed by the regionalization of parameters to ungauged basins. The regionalization of parameters for the second approach is based either on a statistical or mathematical measure of the similarity between the gauged and ungauged basins (e.g., Arsenault and Brissette 2014; Razavi and Coulibaly 2017).
We chose the second approach since there are a few issues with using an already developed gridded runoff dataset such as that incorporated by Yang et al. (2019). First, such an approach is limited to calibrating against runoff metrics (e.g., long-term mean flow, high/low flow percentile) rather than an actual streamflow time series. Second, the approach depends on the native resolution of the gridded runoff dataset which may be coarser in resolution than the forcing dataset and hydrologic model (e.g., Yang et al. 2019). Since we are interested in basins with observational data that are relevant to stakeholder needs (i.e., subbasins within California’s Sierra Nevada and the Colorado River basin), this study follows the second approach where regionalization is performed after calibrating to individual streamflow gauges. This ensures high accuracy is achieved for these gauged basins while allowing for high fidelity regionalization in ungauged basins that might be hydrologically distinct. Several regionalization options exist within the second approach:
- interpolation of calibrated parameters using inverse distance weighting (IDW);
- a donor-basin approach where an ungauged basin inherits calibrated parameters from its most similar gauged basin(s);
- using multiple linear regression or ANNs to learn the relationship between a basin’s characteristics and its optimized hydrologic parameters, then applying that relationship to ungauged basins;
- multiscale parameter regionalization (MPR), where functions are defined to spatially adjust land surface properties (Mizukami et al. 2017).
Given the success of the donor-basin approach when compared to other methodologies (e.g., Arsenault and Brissette 2014; Poissant et al. 2017; Razavi and Coulibaly 2017; Yang et al. 2018; Qi et al. 2021) and the use of the method in NOAA/NCAR’s National Water Model, we incorporate the donor-basin approach for this study.
b. Donor-basin method setup
This section describes the general setup of the donor-basin methodology used to regionalize optimized land surface parameters from gauged basins to the rest of the western United States (e.g., its ungauged basins). Like other regionalization methods, the donor-basin approach requires information on the hydroclimate characteristics (or features) of the gauged and ungauged basins. The basin features used in this study are outlined in Table S4.
To evaluate the performance of the donor-basin method, a leave-one-basin-out cross-validation technique was applied. In this technique, a single gauged basin is left out and treated as a pseudo-ungauged basin. The hydrologic parameters (Ksat, porosity, Bexp, and MFSNO) of the pseudo-ungauged basin are then inherited from its most similar gauged basin based on the similarity index outlined in Eq. (2). In this process, the similarity index for each gauged basin is unique to the pseudo-ungauged basin being compared against, and the gauged basin with the smallest similarity index is taken as the donor basin for the pseudo-ungauged basin. Repeating this procedure so that, in a given simulation, every basin is treated in turn as a pseudo-ungauged basin, with its parameters inherited from the most similar of all the other gauged basins, allows the regionalization performance of the donor-basin method to be evaluated.
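The leave-one-basin-out donor assignment can be sketched as below. Eq. (2) is not reproduced in this section, so the similarity index used here, a sum of absolute feature differences normalized by each feature's range across basins, is an assumed stand-in; the function names, basin names, and feature values are hypothetical.

```python
def similarity_index(a, b, ranges):
    """Assumed index: range-normalized absolute feature differences."""
    return sum(abs(a[f] - b[f]) / ranges[f] for f in ranges)


def leave_one_out_donors(features, feature_names):
    """Map each basin to its most similar other basin (its donor).

    features : {basin_id: {feature_name: value}}
    """
    ranges = {}
    for f in feature_names:
        values = [features[b][f] for b in features]
        ranges[f] = (max(values) - min(values)) or 1.0  # avoid divide-by-zero
    donors = {}
    for target in features:
        # The target is withheld, so it cannot donate to itself.
        candidates = [b for b in features if b != target]
        donors[target] = min(
            candidates,
            key=lambda b: similarity_index(features[b], features[target], ranges),
        )
    return donors


basins = {
    "A": {"precip": 1000.0, "elev": 2000.0},
    "B": {"precip": 950.0, "elev": 2100.0},
    "C": {"precip": 300.0, "elev": 500.0},
}
print(leave_one_out_donors(basins, ["precip", "elev"]))
```

Each pseudo-ungauged basin would then be simulated with its donor's calibrated parameters and scored against its own withheld observations.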
c. Feature selection for regionalization
More than 20 features describing each basin were considered in the donor-basin method. These features are outlined in Table S4. Meteorological data such as precipitation and temperature variables were obtained from the ERA5-WRF forcing dataset, elevation and slope for each basin were obtained using the USGS National Hydrography Dataset (NHDPlus) v2 digital elevation model (DEM), and land use and soil texture data for each basin were obtained from the same USGS and STATSGO data used to set up the Noah-MP model (NCAR 2022). To determine which hydroclimate features to use in the similarity index of the donor-basin method, an iterative approach was employed, where a single feature was added to the similarity index equation in every iteration (similar to Poissant et al. 2017). If there are 28 basin features (e.g., the case for the daily flood hazard model, Table S4), the first iteration includes 28 simulations, one for each feature. The feature that leads to the greatest increase in the median NSE across all basins, based on leave-one-out cross validation, is then retained. In the following iteration, 27 simulations are performed where each simulation includes the feature retained from the first round of simulations plus one of the remaining features. From these 27 simulations, the feature that leads to the greatest further increase in median NSE is retained. This process is repeated until the median NSE ceases to increase, which determines the best features for regionalization. Only basins with an NSE or KGE greater than 0.0 were included to determine the best features for regionalization; however, to evaluate the final regionalization performance, which is listed in Table 2, all basins were included. Using basins with an NSE or KGE greater than 0.0 follows logic from previous studies implementing the donor-basin method that indicate using poorly performing basins can reduce the regionalization performance (e.g., Oudin et al. 2008). 
While somewhat arbitrary, we intentionally use a threshold for KGE that is higher than its minimum benchmark of −0.41 (Knoben et al. 2019) since such basins may provide added value in regionalization despite being below the minimum benchmark for NSE (0.0). Similarly, in our final regionalization to ungauged basins we only use basins with an NSE or KGE greater than 0.0; however, all ungauged basin parameters are estimated. These iterations yielded the basin longitude centroid, the 85th percentile precipitation, and max elevation of the basins as the most important features for the daily flood hazard model (Fig. 6a). The monthly water availability model regionalization improves with the mean fall precipitation, longitude centroid, mean winter temperature, and the latitude centroid (Fig. 6b).
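The iterative feature selection described above is a greedy forward selection, which can be sketched as follows. This is a sketch under stated assumptions rather than the code used in the study: the `score` callable stands in for a full leave-one-out regionalization run that returns the median NSE, and the toy scoring function, its gain values, and the feature names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(features: Sequence[str],
                   score: Callable[[List[str]], float]) -> Tuple[List[str], float]:
    """Greedy forward selection: add one feature per round while the score improves."""
    selected: List[str] = []
    best = float("-inf")
    remaining = list(features)
    while remaining:
        # One trial per remaining feature (e.g., 28 in round one, 27 in round two).
        trials = [(score(selected + [f]), f) for f in remaining]
        round_best, round_feature = max(trials)
        if round_best <= best:
            break  # the score has stopped increasing
        selected.append(round_feature)
        remaining.remove(round_feature)
        best = round_best
    return selected, best


def toy_score(subset: List[str]) -> float:
    """Hypothetical stand-in for the median NSE from leave-one-out cross validation."""
    gains = {"lon": 0.20, "p85": 0.05, "max_elev": 0.02, "lat": -0.01}
    return sum(gains[f] for f in subset)


selected, best = forward_select(["lat", "lon", "p85", "max_elev"], toy_score)
print(selected, round(best, 2))
```

With 28 candidate features, the first round costs 28 regionalization runs and each later round one fewer, so the total cost depends on how many features are retained before the score plateaus.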
Fig. 6. Best regionalization features for (a) the daily flood hazard model and (b) the monthly water availability model. The final regionalization to ungauged regions of the western United States incorporated all features up until the red line since the NSE stops improving as additional features are added. Several metrics are shown based on all the basins (median NSE, median KGE, mean bias, mean Pearson’s r).
Citation: Journal of Hydrometeorology 24, 2; 10.1175/JHM-D-22-0047.1
d. Sensitivity to unique designs of the donor-basin method
In addition to evaluating the best features for regionalization, 20 sensitivity tests were performed for the flood hazard and water availability models (total of 40) to maximize the performance of the donor-basin technique (Fig. 7). These tests were performed due to conflicting findings regarding the impact of different methodological choices on the donor-basin method’s performance. Major changes in the tests performed include the following:
-
Using several donor basins rather than a single donor basin. This is achieved via IDW of the transferred hydrologic parameters, weighted by the similarity index rather than distance. Previous research has found that using multiple basins with IDW in the donor-basin approach can either increase (e.g., Poissant et al. 2017) or decrease performance (Yang et al. 2018);
-
Clustering basins via self-organizing maps (SOMs). A SOM is a nonlinear classifying technique that groups basins based on their hydroclimate features. For the donor-basin approach it functions such that an ungauged basin can only receive parameters from gauged basins that fall within its cluster. Some studies have found that the clustering of basins can improve regionalization performance (e.g., Di Prinzio et al. 2011; Razavi and Coulibaly 2017) while others have not (Yang et al. 2018);
-
Rather than using the best features selected for regionalization as discussed in section 4c, using all available hydroclimate features with dimensionality reduction via principal components analysis (PCA). A similar PCA technique was employed to regionalize parameters in the NWM (Gochis et al. 2019).
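The multiple-donor option in the list above can be sketched as a weighted average of donor parameters, with weights taken as the inverse of the similarity index instead of the inverse of distance. The exponent of 1 on the inverse weight, the small `eps` guard, and the parameter values are assumptions made for illustration; the study does not specify these details in this section.

```python
from typing import Dict, List, Tuple

Params = Dict[str, float]


def weighted_parameters(donors: List[Tuple[float, Params]], k: int = 3,
                        eps: float = 1e-9) -> Params:
    """Blend the parameters of the k most similar donors with weights 1 / similarity index."""
    # Smaller similarity index means a more similar basin, so sort ascending.
    nearest = sorted(donors, key=lambda d: d[0])[:k]
    weights = [1.0 / (s + eps) for s, _ in nearest]  # eps avoids division by zero
    total = sum(weights)
    names = nearest[0][1].keys()
    return {
        n: sum(w * p[n] for w, (_, p) in zip(weights, nearest)) / total
        for n in names
    }


# Hypothetical (similarity index, parameters) pairs for three gauged basins.
candidates = [
    (0.5, {"porosity": 0.40}),
    (1.0, {"porosity": 0.46}),
    (5.0, {"porosity": 0.60}),
]
blended = weighted_parameters(candidates, k=2)
single = weighted_parameters(candidates, k=1)
print(blended, single)
```

Setting `k=1` recovers the single-donor case, in which the parameters of the most similar basin are transferred unchanged.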
Fig. 7. Sensitivity of the (a) daily flood hazard and (b) monthly water availability model to unique donor-basin designs. When IDW is not included on the y axis, a single donor basin is used. The donor-basin setup that was selected to regionalize parameters is highlighted.
We provide a detailed overview of the donor-basin method’s performance when applied with different combinations of the possible choices listed. As with determining the best features for regionalization, the performance of each scheme was evaluated using leave-one-out cross validation. Also, only basins with an NSE or KGE greater than 0.0 were used to evaluate the different donor-basin designs; however, all the basins were used to evaluate the performance of the design that was selected (Table 2).
Performance of the donor-basin sensitivity tests in terms of their median NSE for all basins is shown in Fig. 7. For both the daily and monthly models, the tests indicate that unique donor-basin designs can influence the median NSE for all of the basins in the study domain by 0.06. We generally found that the regionalization performance improved when basins were clustered with SOMs. For example, a 0.02 NSE increase occurs between the best donor-basin method employed without SOMs and with SOMs. The NSE was generally insensitive to the number of clusters used, but a 3 × 3 SOM (9 clusters, indicated by SOM9) is considered optimal because it performs as well as or better than any of the other SOMs and avoids assigning few basins to a single group. We found reduced performance (0.03 NSE for the daily model, 0.04 for the monthly model with SOMs) when using all available features represented via PCA, with 95% variance of the features retained (indicated by PCA95), compared to using the best features found in the iterative experiments outlined in Fig. 6. This suggests that despite PCA’s ability to extract the most important features that represent variation between the basins, some of the features provide information that is less relevant to predicting streamflow. We also found that using multiple donor basins via IDW (e.g., indicated by IDW3 if using the three most similar basins) led to reduced performance (0.04 NSE for the daily model, 0.06 for the monthly model when using SOMs) compared to applying parameters from the single most similar donor basin. However, when SOMs are not used to cluster basins, using IDW results in similar performance to using a single donor basin. This suggests that SOMs are more effective at nonlinearly clustering similar basins from which the single best donor basin can be selected compared to incorporating information from the x-most similar basins using IDW. 
In summary, we ultimately chose to use a single donor basin, with the best set of regionalization features outlined in Fig. 6 (rather than PCA with all basin features), and with basins clustered into a 3 × 3 SOM (highlighted in Fig. 7). Note that we use this setup based on its performance for all of the basins across our western U.S. domain; however, unique donor-basin methods may be more or less accurate for unique hydroclimate regions in the domain.
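To make the clustering step concrete, the following is a minimal sketch of a 3 × 3 self-organizing map with a Gaussian neighborhood, followed by a helper that restricts donor candidates to basins in the same cluster. The SOM implementation and hyperparameters used in the study are not given in this section, so the learning rate, neighborhood radius, number of epochs, seed, and the two-feature input data are all assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence


def best_matching_unit(weights: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Index of the node whose weight vector is closest to x (squared Euclidean distance)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - xi) ** 2 for w, xi in zip(weights[i], x)))


def train_som(data: Sequence[Sequence[float]], rows: int = 3, cols: int = 3,
              epochs: int = 200, lr0: float = 0.5, sigma0: float = 1.5,
              seed: int = 0) -> List[List[float]]:
    """Train a rows x cols SOM and return one weight vector per node."""
    rng = random.Random(seed)
    # Initialize every node at a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in range(rows * cols)]
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighborhood radius shrinks
        for x in rng.sample(list(data), len(data)):
            bmu = best_matching_unit(weights, x)
            for node, (r, c) in enumerate(grid):
                # Nodes near the best-matching unit on the grid move more.
                d2 = (r - grid[bmu][0]) ** 2 + (c - grid[bmu][1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                weights[node] = [w + lr * h * (xi - w)
                                 for w, xi in zip(weights[node], x)]
    return weights


def donor_candidates(target: int, clusters: List[int]) -> List[int]:
    """Indices of other basins that share the target basin's SOM node."""
    return [i for i, c in enumerate(clusters) if c == clusters[target] and i != target]


# Hypothetical standardized features for six basins.
features = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
weights = train_som(features)
clusters = [best_matching_unit(weights, x) for x in features]
print(clusters, donor_candidates(0, clusters))
```

With only a handful of basins, a 3 × 3 map can leave a basin alone in its node, in which case `donor_candidates` returns an empty list; how such cases are handled is a design choice that this sketch leaves open.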
e. Regionalized LSM performance
The median NSE of the selected regionalization approach compared against the baseline simulation using default LSM parameters is 0.42 versus 0.18 for the monthly water availability model and 0.27 versus 0.12 for the daily flood hazard model during the calibration time period (Table 2, Fig. 3). We also outline how the regionalization performance may vary for unique hydroclimate conditions (Table S2). We generally found that basins with the same type of hydroclimate characteristics that perform more accurately for calibration (e.g., wetter basins and those with greater forest coverage) perform more accurately for the regionalization (Table S2).
When comparing our regionalized daily NSE (right column of Fig. 3a) to that of Yang et al. (2019), who calibrated to a machine learning–derived dataset of gridded runoff (Beck et al. 2015), the employed regionalization approach performs qualitatively similarly. Also, the MPR technique applied by Mizukami et al. (2017) across CONUS resulted in a median daily NSE of 0.3. While difficult to compare directly against other studies given the use of unique basin datasets, this suggests that the regionalization method applied in this study is on par with other state-of-the-art approaches.
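For readers less familiar with the skill scores quoted above, the following sketch computes the NSE (Nash and Sutcliffe 1970) and a KGE from paired simulated and observed flows. The KGE variant used in the study is not stated in this section; the sketch assumes the modified form of Kling et al. (2012), which uses the ratio of coefficients of variation as its variability term. The flow values are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def nse(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus squared error relative to the variance of obs."""
    obs_mean = mean(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - num / den


def kge_modified(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Modified Kling-Gupta efficiency (assumed variant, after Kling et al. 2012)."""
    ms, mo = mean(sim), mean(obs)
    ss, so = pstdev(sim), pstdev(obs)
    cov = mean([(s - ms) * (o - mo) for s, o in zip(sim, obs)])
    r = cov / (ss * so)            # linear correlation
    beta = ms / mo                 # bias ratio
    gamma = (ss / ms) / (so / mo)  # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


obs = [1.0, 2.0, 3.0, 4.0]            # hypothetical observed flows
perfect = nse(obs, obs)               # a perfect simulation scores 1
mean_only = nse([2.5] * 4, obs)       # simulating the observed mean scores 0
doubled = kge_modified([2.0 * o for o in obs], obs)  # timing and variability right, volume doubled
print(perfect, mean_only, doubled)
```

The second case illustrates why 0.0 serves as the NSE benchmark: a simulation that always returns the observed mean scores exactly zero, so positive values indicate skill beyond the mean flow.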
After optimizing the features and specific design to use in the donor-basin method, parameters were regionalized to ungauged USGS Hydrologic Unit Code (HUC) 10 basins across the western United States (Fig. 1). HUCs are delineated and quality controlled by USGS using high-resolution DEMs. HUC10s were selected (e.g., rather than HUC8s) since their average drainage area of 525 km² is similar to the 521-km² average drainage area of the gauged basins in the daily flood hazard model. The final hydrologic parameters in every HUC10 basin in the western United States are shown in Fig. S6 and are made available as indicated in the data availability statement.
Using the regionalized parameter sets, we additionally evaluate how our models’ simulated snowpack (SWE), soil moisture, and latent heat flux (the energy form of evapotranspiration) perform when compared against observational data. We compare against these datasets after regionalizing parameters since the spatial availability of these datasets does not typically align with basins that have observed streamflow data. For October 1981–September 2019, which is the time period when observational data are available for both gridded and station SWE, the simulated SWE across the domain accurately reflects observed seasonal patterns and falls within the range of observed peak SWE conditions (Fig. S7). To compare against soil moisture observations, we use the Soil Climate Analysis Network (SCAN) station network (USDA 2022) (Fig. S8). We focus on the top 10 cm due to greater observed data availability (across time) for soil moisture data closer to the surface. A total of 62 stations have data across major western U.S. basins (HUC-2s) during our validation period, with data generally available from 2008 onward. The simulated soil moisture generally shows greater agreement with observations where streamflow performance is more accurate (e.g., Columbia River basin and central Sierra Nevada); however, wet biases are apparent in the Lower Colorado basin and southwestern end of the Upper Colorado basin where soil moisture stations are located (Fig. S8). Finally, to compare against latent heat flux observations, we use AmeriFlux tower micrometeorological evapotranspiration measurements (AmeriFlux 2022), which undergo an extensive quality control process resulting in a reliable product for land surface model evaluation (e.g., Cai et al. 2014). However, only five tower sites are available across the HUC-2 basins that fall within our western U.S. domain (Fig. S9). 
Our calibrated models generally provide a reliable representation of evapotranspiration across three Columbia basin tower sites. However, our simulated results show a low evapotranspiration bias for a single site along the northern California coast, suggesting a possible low bias in precipitation at this particular location. Finally, for the single tower site in the Lower Colorado basin, evapotranspiration is significantly higher than observations, indicating a wet bias in the ERA5-WRF precipitation data likely exists for this southeastern section of the Lower Colorado since our simulated results also have a wet streamflow bias and soil moisture bias at a collocated soil moisture station.
5. Conclusions
In this study, we improved the representation of runoff in the Noah-MP LSM when driven by a dynamically downscaled ERA5 dataset. In addition to providing a general overview with respect to the calibration process for an LSM (model parameterization selection, land surface parameter calibration, and regionalization to ungauged basins), we provide important findings at each step.
For example, we found that performing sensitivity tests helps in selecting an appropriate set of parameterizations for the given domain and forcing but may not provide a clear choice unless a full calibration/validation is performed. After selecting the best set of parameterizations for our western U.S. domain, we developed two parameter sets in our calibration: one more appropriate for water availability analysis at a monthly time resolution and one more appropriate for flood hazard analysis at a daily time resolution. The calibrated model was validated for streamflow over 1954–2021 with the 8 calibration years removed, resulting in a median NSE of 0.56 and 0.36 for the monthly and daily models, respectively, based on all of the basins in the study domain. The monthly water availability model was found to be reliable across the Columbia River basin, California’s central Sierra Nevada, and the Upper Colorado River basin. However, a substantial wet bias in the ERA5-WRF forcing dataset (0.5 mm day⁻¹ from June to August) across the southwestern United States led to moderate performance in the Upper Colorado basin and poor performance in the Lower Colorado basin. The daily flood hazard model has a dry bias of 10% for 2–100-yr return period events in the Columbia and Colorado River basins, while California’s central Sierra Nevada has a 10% wet bias for such events.
After selecting appropriate parameterizations for the given forcing/domain and calibrating gauged basins, we extended the calibrated parameters to ungauged basins in the western United States. This was done via the donor-basin regionalization method, in which ungauged basins inherit parameters from gauged basins with similar hydroclimatic characteristics. We found that using a single donor basin that is most similar [as defined by the similarity index; Eq. (2)] to an ungauged basin improves the regionalization performance over using multiple basins weighted by their similarity. In addition, performance is enhanced when a select subset of the most relevant basin hydroclimate features for regionalization is used, rather than using all features with dimensionality reduction. Finally, we found that performance is slightly enhanced when basins are clustered according to their hydroclimate characteristics using a self-organizing map. This clustering limits an ungauged basin to only inheriting parameters from gauged basins that are hydroclimatically similar and are thus grouped into the same cluster. With our selected donor-basin regionalization method, we found qualitatively similar performance to results from recent studies employing other state-of-the-art regionalization methods. While the optimal donor-basin regionalization setup may differ for other domains, our results provide an estimate of how sensitive the donor-basin approach is to unique methodological choices (designs), which varied by a median NSE of 0.06 for all of our domain’s basins. We believe such analysis should be particularly useful given the use of this regionalization method for CONUS streamflow forecasts made by the National Water Model.
This study is unique in that it calibrates streamflow and regionalizes parameters to ungauged areas using an LSM forced by dynamically downscaled reanalysis data. This contrasts with most hydrologic studies that calibrate hydrologic models using gridded products derived from point-location data (e.g., PRISM; Livneh et al. 2013). The calibrated LSM and its downscaled reanalysis forcing provide the potential to better represent spatial and temporal atmospheric data and thus runoff, particularly in ungauged areas. In subsequent studies the calibrated LSM can be tightly coupled with its corresponding RCM to accurately represent runoff and explore land surface–atmosphere interactions. However, a wet summertime precipitation bias in the downscaled reanalysis forcing data in the Lower Colorado basin results in wet soil moisture, evapotranspiration, and streamflow biases in this region that could lead to nonrealistic land surface–atmosphere interactions in a tightly coupled framework. We note that users primarily interested in the accurate representation of streamflow, rather than in preserving more realistic soil properties (and thus soil moisture conditions) that would be relevant to a tightly coupled framework, may consider using the wider range of parameters outlined in the supplemental material. The wider range of parameters falls within the typical range of values used in hydrologic calibration and leads to an overall improvement in streamflow performance across the western United States, particularly across the Colorado basin. Whether to use the wider range of parameters typically used in hydrologic calibration or the more realistic range of parameters discussed in this manuscript depends on the end use of the model. 
The hydrologic community may require a more accurate representation of streamflow for flood or water supply analysis, while the atmospheric community can benefit more from a model that improves runoff but does so in a manner that preserves more realistic soil properties relevant to land surface–atmosphere interactions.
Finally, while outside the scope of this manuscript, we believe a comparison against our results when using a traditional gridded forcing dataset derived from station data would provide additional insight. First, such a comparison would inform whether the calibrated parameters from this study can be used in other hydrological applications or if the calibrated parameters are limited to use-cases relevant to the downscaled reanalysis dataset. Second, based on the regionalization performance, this comparison would indicate if the downscaled reanalysis dataset can truly provide a better representation, compared to traditional gridded datasets, of meteorology in space and thus runoff in ungauged areas. Ultimately, we believe the findings in this study provide a useful reference for the land–atmosphere modeling community and their ability to investigate stakeholder questions associated with runoff conditions.
Acknowledgments.
We would like to acknowledge funding from the U.S. Department of Energy, Office of Science, project “An Integrated Evaluation of the Simulated Hydroclimate System of the Continental US” (Award DE-SC0016605) and the California Energy Commission grant “Development of Climate Projections for California and Identification of Priority Projections” (Agreement EPC-20-006). We would also like to thank the WRF-Hydro team and the computing staff at UCAR for their response to inquiries. Finally, we thank the reviewers for their detailed and useful feedback throughout the revision process.
Data availability statement.
All datasets used in this study are publicly available: GagesII streamflow data are at https://water.usgs.gov/GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml; DWR Natural Flow data are at https://cdec.water.ca.gov/reportapp/javareports?name=FNF; Bureau of Reclamation’s Colorado River Basin natural flow data are at https://www.usbr.gov/lc/region/g4000/NaturalFlow/current.html; PRISM-UA SWE data are accessible from https://nsidc.org/data/nsidc-0719; SNOTEL data are available via the Natural Resources Conservation Service at https://www.wcc.nrcs.usda.gov/snow/snotel-wedata.html; soil moisture SCAN station data are available via the United States Department of Agriculture at https://www.nrcs.usda.gov/wps/portal/wcc/home/snowClimateMonitoring/soilClimateConditions/; AmeriFlux tower site data are at https://ameriflux.lbl.gov/data/data-availability/#/FLUXNET. Additionally, the ERA5-WRF forcing dataset is available on Amazon S3 and can be accessed with Amazon Web Services Command Line Interface (AWS CLI) at s3://wrf-cmip6-noversioning/downscaled_products/reanalysis/era5. The default and calibrated parameter files for the Noah-MP LSM are available at https://zenodo.org/record/6983670.
REFERENCES
AmeriFlux, 2022: AmeriFlux network data. AmeriFlux, accessed 1 July 2022, https://ameriflux.lbl.gov/data/data-availability/#/FLUXNET.
Arsenault, R., and F. P. Brissette, 2014: Continuous streamflow prediction in ungauged basins: The effects of equifinality and parameter set selection on uncertainty in regionalization approaches. Water Resour. Res., 50, 6135–6153, https://doi.org/10.1002/2013WR014898.
Beck, H. E., A. de Roo, and A. I. J. M. van Dijk, 2015: Global maps of streamflow characteristics based on observations from several thousand catchments. J. Hydrometeor., 16, 1478–1501, https://doi.org/10.1175/JHM-D-14-0155.1.
Bureau of Reclamation, 2021: Colorado River basin natural flow and salt data. Bureau of Reclamation, accessed 1 October 2021, https://www.usbr.gov/lc/region/g4000/NaturalFlow/current.html.
Burn, D. H., and D. B. Boorman, 1993: Estimation of hydrological parameters at ungauged catchments. J. Hydrol., 143, 429–454, https://doi.org/10.1016/0022-1694(93)90203-L.
Cai, X., Z.-L. Yang, C. H. David, G.-Y. Niu, and M. Rodell, 2014: Hydrological evaluation of the Noah-MP land surface model for the Mississippi River Basin. J. Geophys. Res. Atmos., 119, 23–38, https://doi.org/10.1002/2013JD020792.
California Department of Water Resources, 2021: California data exchange center: Daily full natural flow for December 2022. California Department of Water Resources, accessed 1 October 2021, https://cdec.water.ca.gov/reportapp/javareports?name=FNF.
Cuntz, M., J. Mai, L. Samaniego, M. Clark, V. Wulfmeyer, O. Branch, S. Attinger, and S. Thober, 2016: The impact of standard and hard-coded parameters on the hydrologic fluxes in the Noah-MP land surface model. J. Geophys. Res. Atmos., 121, 10 676–10 700, https://doi.org/10.1002/2016JD025097.
Di Prinzio, M., A. Castellarin, and E. Toth, 2011: Data-driven catchment classification: Application to the pub problem. Hydrol. Earth Syst. Sci., 15, 1921–1935, https://doi.org/10.5194/hess-15-1921-2011.
Falcone, J., 2011: GAGES-II: Geospatial attributes of gages for evaluating streamflow. U.S. Geological Survey, accessed 1 April 2021, https://water.usgs.gov/GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml.
Gochis, D., and Coauthors, 2019: Overview of National Water Model Calibration: General strategy and optimization. National Center for Atmospheric Research, accessed 15 January 2021, 30 pp., https://ral.ucar.edu/sites/default/files/public/9_RafieeiNasab_CalibOverview_CUAHSI_Fall019_0.pdf.
Gumbel, E. J., 1958: Statistics of Extremes. Columbia University Press, 373 pp.
Holtzman, N. M., T. M. Pavelsky, J. S. Cohen, M. L. Wrzesien, and J. D. Herman, 2020: Tailoring WRF and Noah-MP to improve process representation of Sierra Nevada runoff: Diagnostic evaluation and applications. J. Adv. Model. Earth Syst., 12, e2019MS001832, https://doi.org/10.1029/2019MS001832.
Hussein, A., 2020: Process-based calibration of WRF-hydro model in unregulated mountainous basin in Central Arizona. M.S. thesis, Ira A. Fulton Schools of Engineering, Arizona State University, 110 pp., https://keep.lib.asu.edu/_flysystem/fedora/c7/224690/Hussein_asu_0010N_19985.pdf.
Jordan, R., 1991: A one-dimensional temperature model for a snow cover: Technical documentation for SNTHERM.89. Special Rep. 91-16, Cold Regions Research and Engineering Laboratory, U.S. Army Corps of Engineers, 61 pp.
Keller, J. D., and S. Wahl, 2021: Representation of climate in reanalyses: An intercomparison for Europe and North America. J. Climate, 34, 1667–1684, https://doi.org/10.1175/JCLI-D-20-0609.1.
Kerandi, N. J., J. Arnault, P. Laux, S. Wagner, J. Kitheka, and H. Kunstmann, 2018: Joint atmospheric-terrestrial water balances for East Africa: A WRF-Hydro case study for the upper Tana River basin. Theor. Appl. Climatol., 131, 1337–1355, https://doi.org/10.1007/s00704-017-2050-8.
Kling, H., M. Fuchs, and M. Paulin, 2012: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. J. Hydrol., 424–425, 264–277, https://doi.org/10.1016/j.jhydrol.2012.01.011.
Knoben, W. J. M., J. E. Freer, and R. A. Woods, 2019: Technical note: Inherent benchmark or not? Comparing Nash-Sutcliffe and Kling-Gupta efficiency scores. Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019.
Li, D., D. P. Lettenmaier, S. A. Margulis, and K. Andreadis, 2019: The role of rain-on-snow in flooding over the conterminous United States. Water Resour. Res., 55, 8492–8513, https://doi.org/10.1029/2019WR024950.
Livneh, B., E. A. Rosenberg, C. Lin, B. Nijssen, V. Mishra, K. M. Andreadis, E. P. Maurer, and D. P. Lettenmaier, 2013: A long-term hydrologically based dataset of land surface fluxes and states for the conterminous United States: Update and extensions. J. Climate, 26, 9384–9392, https://doi.org/10.1175/JCLI-D-12-00508.1.
Mendoza, P. A., and Coauthors, 2015: Effects of hydrologic model choice and calibration on the portrayal of climate change impacts. J. Hydrometeor., 16, 762–780, https://doi.org/10.1175/JHM-D-14-0104.1.
Mizukami, N., M. P. Clark, A. J. Newman, A. W. Wood, E. D. Gutmann, B. Nijssen, O. Rakovec, and L. Samaniego, 2017: Towards seamless large-domain parameter estimation for hydrologic models. Water Resour. Res., 53, 8020–8040, https://doi.org/10.1002/2017WR020401.
Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models part I — A discussion of principles. J. Hydrol., 10, 282–290, https://doi.org/10.1016/0022-1694(70)90255-6.
NCAR, 2022: Noah-Multiparameterization Land Surface Model (Noah-MP LSM). Accessed 1 January 2022, https://ral.ucar.edu/solutions/products/noah-multiparameterization-land-surface-model-noah-mp-lsm.
Newman, A. J., and Coauthors, 2015: Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: Data set characteristics and assessment of regional variability in hydrologic model performance. Hydrol. Earth Syst. Sci., 19, 209–223, https://doi.org/10.5194/hess-19-209-2015.
Niu, G.-Y., and Z.-L. Yang, 2004: The effects of canopy processes on snow surface energy and mass balances. J. Geophys. Res., 109, D23111, https://doi.org/10.1029/2004JD004884.
Niu, G.-Y., and Coauthors, 2011: The community Noah land surface model with multiparameterization options (Noah-MP): 1. Model description and evaluation with local-scale measurements. J. Geophys. Res., 116, D12109, https://doi.org/10.1029/2010JD015139.
Oubeidillah, A. A., S. C. Kao, M. Ashfaq, B. S. Naz, and G. Tootle, 2014: A large-scale, high-resolution hydrological model parameter data set for climate change impact assessment for the conterminous US. Hydrol. Earth Syst. Sci., 18, 67–84, https://doi.org/10.5194/hess-18-67-2014.
Oudin, L., V. Andréassian, C. Perrin, C. Michel, and N. Le Moine, 2008: Spatial proximity, physical similarity, regression and ungaged catchments: A comparison of regionalization approaches based on 913 French catchments. Water Resour. Res., 44, W03413, https://doi.org/10.1029/2007WR006240.
Poissant, D., R. Arsenault, and F. Brissette, 2017: Impact of parameter set dimensionality and calibration procedures on streamflow prediction at ungauged catchments. J. Hydrol. Reg. Stud., 12, 220–237, https://doi.org/10.1016/j.ejrh.2017.05.005.
Qi, W. Y., J. Chen, L. Li, C.-Y. Xu, Y.-h. Xiang, S.-b. Zhang, and H.-M. Wang, 2021: Impact of the number of donor catchments and the efficiency threshold on regionalization performance of hydrological models. J. Hydrol., 601, 126680, https://doi.org/10.1016/j.jhydrol.2021.126680.
Rahimi, S., W. Krantz, Y.-H. Lin, B. Bass, N. Goldenson, A. Hall, Z. J. Lebo, and J. Norris, 2022: Evaluation of a reanalysis-driven configuration of WRF4 over the western United States from 1980 to 2020. J. Geophys. Res. Atmos., 127, e2021JD035699, https://doi.org/10.1029/2021JD035699.
Razavi, T., and P. Coulibaly, 2017: An evaluation of regionalization and watershed classification schemes for continuous daily streamflow prediction in ungauged watersheds. Can. Water Resour. J., 42, 2–20, https://doi.org/10.1080/07011784.2016.1184590.
Tolson, B. A., and C. A. Shoemaker, 2007: Dynamically dimensioned search algorithm for computationally efficient watershed model calibration. Water Resour. Res., 43, W01413, https://doi.org/10.1029/2005WR004723.
USDA, 2022: Soil conditions: Soil climate analysis network. Accessed 1 July 2022, https://www.nrcs.usda.gov/wps/portal/wcc/home/snowClimateMonitoring/soilClimateConditions/.
USGS, 2018: USGS EROS archive - Land Cover Products - Global Land Cover Characterization (GLCC). Accessed 1 October 2021, https://doi.org/10.5066/F7GB230D.
Walton, D., and A. Hall, 2018: An assessment of high-resolution gridded temperature datasets over California. J. Climate, 31, 3789–3810, https://doi.org/10.1175/JCLI-D-17-0410.1.
Yang, X. J., J. Magnusson, C. Y. Rizzi, and C. Y. Xu, 2018: Runoff prediction in ungauged catchments in Norway: Comparison of regionalization approaches. Hydrol. Res., 49, 487–505, https://doi.org/10.2166/nh.2017.071.
Yang, Y., M. Pan, H. E. Beck, C. K. Fisher, R. E. Beighley, S.-C. Kao, Y. Hong, and E. F. Wood, 2019: In quest of calibration density and consistency in hydrologic modeling: Distributed parameter calibration against streamflow characteristics. Water Resour. Res., 55, 7784–7803, https://doi.org/10.1029/2018WR024178.
You, Y., C. Huang, Z. Yang, Y. Zhang, Y. Bai, and J. Gu, 2020: Assessing Noah-MP parameterization sensitivity and uncertainty interval across snow climates. J. Geophys. Res. Atmos., 125, e2019JD030417, https://doi.org/10.1029/2019JD030417.
Zeng, X., P. Broxton, and N. Dawson, 2018: Snowpack change from 1982 to 2016 over conterminous United States. Geophys. Res. Lett., 45, 12 940–12 947, https://doi.org/10.1029/2018GL079621.