## Abstract

Rainfall estimated from the polarimetric prototype of the Weather Surveillance Radar-1988 Doppler [WSR-88D (KOUN)] was evaluated using a dense Micronet rain gauge network for nine events on the Ft. Cobb research watershed in Oklahoma. The operation of KOUN and its upgrade to dual polarization were completed by the National Severe Storms Laboratory. Storm events included an extreme rainfall case from Tropical Storm Erin that had a 100-yr return interval. Comparisons with collocated Micronet rain gauge measurements indicated that all six rainfall algorithms that used polarimetric observations had lower root-mean-squared errors and higher Pearson correlation coefficients than the conventional algorithm that used reflectivity factor alone when considering all events combined. The reflectivity-based relation *R*(*Z*) was the least biased with an event-combined normalized bias of −9%. The bias for *R*(*Z*), however, was found to vary significantly from case to case and as a function of rainfall intensity. This variability was attributed to different drop size distributions (DSDs) and the presence of hail. The synthetic polarimetric algorithm *R*(syn) had a large normalized bias of −31%, but this bias was found to be stationary.

To evaluate whether polarimetric radar observations improve discharge simulation, recent advances in Markov Chain Monte Carlo simulation using the Hydrology Laboratory Research Distributed Hydrologic Model (HL-RDHM) were used. This Bayesian approach infers the posterior probability density function of model parameters and output predictions, which allows us to quantify HL-RDHM uncertainty. Hydrologic simulations were compared to observed streamflow and also to simulations forced by rain gauge inputs. The hydrologic evaluation indicated that all polarimetric rainfall estimators outperformed the conventional *R*(*Z*) algorithm, but only after their long-term biases were identified and corrected.

## 1. Introduction

Weather radars sample the atmosphere at high spatial resolution over contiguous regions, whereas operational rain gauge networks collect rainfall nearly continuously at points. Despite the opportunity to capture the spatial variability of rainfall, a number of studies have identified and quantified errors in radar rainfall estimation. Relevant literature reviews of radar-based rainfall errors can be found in Wilson and Brandes (1979), Austin (1987), and Joss and Waldvogel (1990). The advent of dual-polarization capability with weather radar offers opportunities in improving radar data quality and product generation through the identification and removal of nonweather targets, classifying hydrometeors, and improving rainfall rate estimation (e.g., Straka and Zrnić 1993; Ryzhkov and Zrnić 1995; Vivekanandan et al. 1999; Zrnić and Ryzhkov 1999; Bringi and Chandrasekar 2001). For these reasons, the U.S. National Weather Service (NWS) and other agencies will upgrade the nationwide network of the Weather Surveillance Radar-1988 Doppler (WSR-88D) radar with polarimetric capability. A reasonable expectation is that improvements in rainfall rate estimation will lead to better skill in hydrologic simulation of stream discharge. This is of particular importance in the context of flooding, the second deadliest of all weather-related hazards in the United States; heat is the number one killer (Ashley and Ashley 2008).

Accurate forcing data are a prerequisite for streamflow simulation and are also a subject of particular interest within the context of uncertainty quantification in hydrologic modeling. In hydrology, input uncertainty has been treated as an extension to a model parameter estimation problem (Beven and Binley 1992; Gupta et al. 1998; Vrugt et al. 2005). Errors in rainfall, typically derived from rain gauge networks, are assumed to be consistent and uniform within individual storms, and multipliers are used to perturb storm depth on an event-by-event basis (Kavetski et al. 2006; Vrugt et al. 2008). We anticipate results from our study will help guide the hydrologic community toward proper treatment of rainfall uncertainty, with particular consideration of the instrument(s) used to estimate rainfall. Instrument-specific errors are important in the context of prediction in ungauged basins owing to the recent availability of rainfall from remote sensing systems. This article is also meant to target the radar meteorology community, which desires answers to the following questions: Will dual-polarized radar improve hydrologic modeling skill, including for extreme events such as flash floods? Do improvements in the accuracy, the precision, or both, of polarimetric rainfall estimates result in better hydrologic simulations?

Studies of rainfall estimates from polarimetric radar as inputs to hydrologic models have been performed for significant flash flooding events in the Buffalo Creek watershed near Denver, Colorado, and in the Spring Creek watershed in Ft. Collins, Colorado (Yates et al. 2000; Ogden et al. 2000). Yates et al. (2000) used rainfall estimates from the National Center for Atmospheric Research (NCAR) S-band, dual-polarization radar (S-pol) to calibrate the U.S. Geological Survey (USGS) Precipitation–Runoff Modeling System (Leavesley and Stannard 1995). Runoff sensitivity tests were conducted with different rainfall forcings, which highlighted the complex, nonlinear response of the watershed. Simulated discharge was found to be sensitive to the different rainfall forcings (i.e., dual polarimetric versus single polarimetric). Ogden et al. (2000) calibrated the Cascade Two-Dimensional (CASC2D) model (Julien et al. 1995) with forcing from Colorado State University (CSU)–University of Chicago–Illinois State Water Survey (CHILL) S-band, dual-polarization radar. A single parameter—soil saturated hydraulic conductivity—was calibrated from rainfall–runoff observations of a storm that occurred earlier in the day that saturated soils and raised water levels in the nearby Horsetooth reservoir. These designated parameter settings were then transferred to the Spring Creek watershed for the flooding event, and reference simulations were created with inputs from CHILL rainfall estimates. Sensitivity studies indicated errors in the spatiotemporal distribution of rainfall estimation impacted runoff simulations more significantly than details represented in the hydrologic model’s land surface parameters. 
Both studies used indirect peak discharge estimates from the USGS for comparison with hydrologic model results, and both derived model parameter settings using event-based forcing from the polarimetric radar-rainfall estimates; that is, the model parameter settings were dependent on the polarimetric rainfall forcing. These single-storm studies demonstrated the potential applicability of dual-polarization radar in hydrologic modeling, but the transferability of the results to calibrated, continuous hydrologic simulation of multiple events remains an open question. The present study evaluates dual-polarization rainfall estimates as inputs to a calibrated, distributed hydrologic model for a comprehensive dataset of storm events.

The study capitalizes on data collected from KOUN in Norman, Oklahoma, over a densely instrumented basin near Ft. Cobb, Oklahoma. Rain gauges installed in the U.S. Department of Agriculture Agricultural Research Service (USDA–ARS) research watershed were used along with USGS stream gauge data to calibrate the NWS Hydrology Laboratory Research Distributed Hydrologic Model (HL-RDHM) (Koren et al. 2004) for a 3-yr period. Candidate KOUN rainfall estimates were then substituted as HL-RDHM inputs for nine events that had a complete archive of KOUN observations. One of the events making up the storm dataset includes an extreme rainfall and deadly flooding event from a rare overland reintensification of Tropical Storm Erin. Details of this remarkable event are discussed in Arndt et al. (2009).

This paper is organized as follows. Section 2 describes the ARS Micronet instrumentation, the physical characteristics of the Ft. Cobb research watershed, and the storm event database. Details of KOUN polarimetric rainfall estimation are described in section 3. Section 4 presents a more traditional radar–gauge-based analysis by comparing KOUN-based rainfall to collocated rain gauges within the ARS Micronet. A novel approach to evaluating rainfall estimates from the perspective of hydrologic simulations is introduced in section 5, which also presents results from the hydrologic evaluation. Because this article is intended to reach both the hydrological and radar meteorology communities, we anticipate readers from the former audience will focus on sections 2 and 5, while the latter will be most interested in sections 2–4 (and perhaps 5). A summary and concluding remarks are provided in section 6.

## 2. Study domain and instrumentation

The Ft. Cobb watershed was added to the USDA ARS watershed research network in 2005 to address research objectives related to constituents that impair water quality and wildlife habitat in Oklahoma. The basin was selected for this study because it contains an ARS Micronet of 15 in situ instruments that measure atmospheric and soil properties, is situated 83–120 km from the KOUN polarimetric radar, and has three USGS-gauged subbasins. These sources of data are beneficial for radar rainfall evaluation and assessing the impacts of polarimetric rainfall estimation on hydrologic simulation. The Ft. Cobb basin is 59% cropland according to the USGS National Land Cover Database; the second largest classification being grassland at 31% (Homer et al. 2007). Soils are classified by the U.S. Natural Resources Conservation Service State Soil Geographic database (STATSGO) as being predominantly silt loam/loam and are considered to be deep and well drained (Soil Survey Staff 1994; USDA 1994; Soil Survey Staff 1996). A 30-yr climatology of precipitation in Caddo County from the Oklahoma Climatological Survey indicates the months of May and June are the wettest, while the winter months of January and February are the driest. The annual average rainfall is 800 mm with a monthly maximum of 150 mm occurring in May. The annual mean temperature is 16.2°C and May temperatures average 21.1°C, which results in a vast majority of the precipitation falling as rain. Contributions to runoff from frozen precipitation and frozen soils were negligible in this basin for the events collected.

The entire basin is 813 km^{2} in area, with 15 ARS Micronet stations that record air temperature, rainfall, relative humidity, solar radiation, and soil temperature at 5, 10, 15, and 30 cm below ground and soil water content at 5, 25, and 45 cm below ground (Fig. 1). Micronet observations were averaged over 5–30 min depending on the instrument and then tested for validity using range, step, persistence, spatial, and like-instrument comparisons developed for the Oklahoma Mesonet (Shafer et al. 2000; Fiebrich et al. 2006). ARS Micronet tipping-bucket rain gauges were used in this study to evaluate the accuracy of rainfall estimates and to calibrate a distributed hydrologic model. The gauges, manufactured by Met One, are neither heated nor shielded. The instrument collects rainfall at 0.6 m above ground in a 30-cm orifice, funnels the rain to one of two buckets mounted on a balance pivot, and fills the bucket until it tips and registers 0.254 mm of rain. The number of tips is transmitted, quality controlled, and archived on a 15-min basis.
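The gauge arithmetic described above can be illustrated with a minimal sketch. It assumes the archive provides one tip count per 15-min interval; the function names and the handling of a trailing partial hour are illustrative choices, not part of the Micronet processing chain.

```python
TIP_DEPTH_MM = 0.254  # rain depth registered by one bucket tip (value from the text)


def tips_to_depth_mm(tip_counts):
    """Convert a sequence of 15-min tip counts to 15-min rain depths in mm."""
    return [n * TIP_DEPTH_MM for n in tip_counts]


def hourly_accumulations(depths_15min):
    """Sum consecutive groups of four 15-min depths into hourly totals.

    A trailing partial hour is dropped (an assumption made here so that
    every reported value covers a full hour).
    """
    n_hours = len(depths_15min) // 4
    return [sum(depths_15min[4 * h:4 * h + 4]) for h in range(n_hours)]
```

Because every accumulation is an integer multiple of 0.254 mm, light hourly totals fall on a few discrete values.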

There are three USGS stream gauges in the basin (USGS site numbers 07325800, 07325850, and 07325860) with contributing drainage areas of 342, 154, and 75 km^{2}, respectively (Fig. 1). The stream gauges report stream discharge on a 15-min basis. Streamflow measurements from USGS 07325800 are used in this study to evaluate hydrologic simulations forced by polarimetric rainfall estimates.

From 2005 to 2008, nine events that produced significant rainfall and were observed over their complete life cycle on Ft. Cobb by KOUN were identified for study (see Table 1). This latter criterion is necessary when using rainfall datasets as inputs to hydrologic models; all rainfall contributing to basin streamflow must be collected continuously. The storm database consists of severe convective storms, mesoscale convective systems (MCS), and extreme rainfall from a strengthening TS Erin. The Oklahoma Climatological Survey reported that TS Erin produced 187 mm of rain in three hours at the Ft. Cobb Mesonet site; this rainfall was determined to have a return period of 500 years. The same rain gauge collected a total of 235 mm of rain from TS Erin in 24 h. The flooding impacts from TS Erin in Caddo County were devastating. Four people lost their lives. Three persons perished as their van was swept off the road and washed downstream, and an elderly person drowned in a flooded basement. There were dozens of rescues of motorists and residents. The property damage in Caddo County was estimated at $110 000, while the damage across the entire state of Oklahoma was estimated at $4 960 000 with two additional fatalities (NWS 2007).

## 3. KOUN polarimetric rainfall estimation

The prototype polarization upgrade of the WSR-88D radar network sited in Norman, Oklahoma, has been collecting data in simultaneous transmission and reception mode at S-band frequency for selected events since 2002. The resolution of the data, 0.250–0.267 km in range and 1° in azimuth, corresponds to a short dwell time (48 samples) to satisfy the Next Generation Weather Radar (NEXRAD) antenna rotation rate of three revolutions per minute. The accuracy of derived rain products depends on the quality of the raw polarimetric variables. Data quality procedures have been developed and applied to KOUN data as described in Ryzhkov et al. (2005a) and Ryzhkov et al. (2005c); a brief summary is provided herein. Calibration of radar reflectivity *Z* to within 1 dB was achieved through comparison to the nearby WSR-88D radar in Twin Lakes, Oklahoma, KTLX, which was shown to be well calibrated in Ryzhkov et al. (2005a) and Giangrande and Ryzhkov (2005). Differential reflectivity *Z*_{DR} was manually calibrated to within 0.2 dB for each event by examining dry aggregated snow above the melting layer at elevation angles between 4.5° and 6°. Here, the true or intrinsic *Z*_{DR} is known to be approximately 0.3 dB, which was used for each case to manually calibrate *Z*_{DR} values. Range profiles of the differential propagation phase shift Φ_{DP} were used to correct for attenuation losses in *Z* and *Z*_{DR} (Ryzhkov and Zrnić 1995). The range derivative of Φ_{DP}, specific differential phase shift *K*_{DP}, was computed as the slope of a least squares fit over pathlengths of 9 or 25 successive gates (corresponding to 2.25 or 6.25 km). Light filtering over nine gates was used if all bins had *Z* > 40 dB*Z* in the path; otherwise, the heavily filtered *K*_{DP} was used (Ryzhkov and Zrnić 1996). Smoothed *Z*_{DR} data were created by averaging *Z*_{DR} values along three radials and five adjacent range gates centered on each pixel, totaling a 15-gate average.
Experimental results from Ryzhkov et al. (2005a) compared to the theoretical analysis of Bringi and Chandrasekar (2001) found the standard deviation in the KOUN *Z*_{DR} measurements to be between 0.2 and 0.3 dB, corresponding to 48 samples and measured values of copolar cross-correlation coefficient *ρ*_{HV} in rain. This smoothing was meant to reduce the noisiness in *Z*_{DR} measurements stemming from the short dwell time to within 0.1–0.2 dB.
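The *K*_{DP} filtering described above can be sketched as follows. This is a hypothetical, single-ray illustration and not the KOUN processing code: the function names, the centered windows, and the treatment of gates near the ends of the ray are assumptions. The factor of one-half reflects the common convention that Φ_{DP} is a two-way phase; the text does not state it explicitly, so it is treated here as an assumption.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def kdp_profile(phi_dp_deg, z_dbz, gate_km=0.25, light=9, heavy=25, z_thresh=40.0):
    """Estimate K_DP (deg/km) along one ray from Phi_DP (deg) and Z (dBZ).

    The 9-gate (light) window is used when every gate in it exceeds
    z_thresh; otherwise the 25-gate (heavy) window is used. Gates where
    the required window does not fit inside the ray are left as None.
    """
    n = len(phi_dp_deg)
    kdp = [None] * n
    for i in range(n):
        for width, needs_high_z in ((light, True), (heavy, False)):
            half = width // 2
            lo, hi = i - half, i + half + 1
            if lo < 0 or hi > n:
                continue  # window extends beyond the ray
            if needs_high_z and not all(z > z_thresh for z in z_dbz[lo:hi]):
                continue  # fall through to the heavily filtered window
            ranges_km = [k * gate_km for k in range(lo, hi)]
            # Assumed convention: K_DP is half the range slope of Phi_DP.
            kdp[i] = 0.5 * least_squares_slope(ranges_km, phi_dp_deg[lo:hi])
            break
    return kdp
```

The short window preserves gradients in heavy rain, where the phase signal is strong, while the long window suppresses noise in lighter rain at the cost of range resolution.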

The quality controlled polarimetric variables collected at 0.5° elevation angle were used to compute rainfall rates. In this study, we investigated the skill of seven different rainfall algorithms. Prior to the computation of rainfall rates, a fuzzy logic hydrometeor classification algorithm (HCA) described in Giangrande and Ryzhkov (2008) and Park et al. (2009) was used to discriminate meteorological and nonmeteorological echoes. The HCA uses *Z*, *Z*_{DR}, *ρ*_{HV}, and a texture parameter that describes the standard deviation of small-scale fluctuations of *Z* [SD(*Z*)]. These variables are useful in identifying hydrometeor type (e.g., graupel, hail, aggregates) and adjusting the estimation scheme accordingly. Precipitation rates were computed at grid points having *ρ*_{HV} > 0.85 and signal-to-noise ratio (SNR) > 8 dB so as to minimize the effect of noisy measurements. This study examined the skill of the following relations:

and

In (1)–(4), *Z* is expressed in linear units (mm^{6} m^{−3}), *Z*_{dr} is in linear units as denoted by the lowercase “dr” subscript, *K*_{DP} is in degrees per kilometer, and *R* is in millimeters per hour. Note that (1) is the standard NEXRAD *Z*–*R* relation (*Z* = 300*R*^{1.4}) (Fulton et al. 1998). The parameters for (2) and (3) were derived using KOUN radar and rain gauge data collected during the Joint Polarization Experiment (JPOLE) (Ryzhkov et al. 2005c). It is possible to arrive at negative values of *R* using (2) due to noisiness in *K*_{DP} in light rain and nonuniform beam filling effects. In this study, negative values of *R*(*K*_{DP}) were set to 0 mm h^{−1} because negative values of *R* are nonphysical and not tolerated by the hydrologic model. The coefficients in (4) were computed by independently varying the parameters of a gamma drop size distribution (DSD) and simulating *R*, *Z*, and *Z*_{DR} data as described in Bringi and Chandrasekar (2001, section 8.1.1). The efficacy of the HCA on precipitation estimation was evaluated by considering (3) and (4) but using (2) for each pixel that was classified as hail. These relations are hereafter referred to as *R*(*Z*, *Z*_{DR}|JPOLE, HCA) and *R*(*Z*, *Z*_{DR}|BC, HCA), respectively. The final algorithm evaluated in this study was the “synthetic algorithm,” hereafter *R*(syn), described in Ryzhkov et al. (2005b) with slight modifications. This algorithm was found to be the optimum relation during JPOLE in terms of lowest rms error (RMSE) and bias using rain gauges at distances from 50 to 80 km of KOUN (e.g., the ARS network in the Little Washita watershed). Here *R*(syn) uses an *R*(*Z*, *Z*_{DR}) relation tuned for light rain if *R*(*Z*) from (1) is less than 10 mm h^{−1} [note the original synthetic algorithm used a *R*(*Z*) threshold of 6 mm h^{−1}]. For 10 < *R*(*Z*) < 50 mm h^{−1}, *R*(syn) uses an *R*(*K*_{DP}, *Z*_{DR}) relation tuned for moderate precipitation rates.
For intense precipitation possibly mixed with hail, *R*(syn) reverts to (2) for *R*(*Z*) > 50 mm h^{−1} due to the assumption of *K*_{DP} being less susceptible to hail contamination at S band. All rainfall algorithms described above have been capped so that their maximum computed values were 100 mm h^{−1}. Precipitation rates computed from each algorithm were aggregated to hourly accumulations and resampled on a 200-m resolution Cartesian grid.
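The regime logic of *R*(syn) and the post-processing of rain rates can be sketched as below. Only the inversion of *Z* = 300*R*^{1.4}, the 10 and 50 mm h^{−1} thresholds, the floor at 0, and the 100 mm h^{−1} cap come from the text. The three regime relations are passed in as placeholder callables because their coefficients are not reproduced here, and the handling of values exactly at a threshold is an assumption.

```python
def dbz_to_linear(dbz):
    """Convert reflectivity from dBZ to linear units (mm^6 m^-3)."""
    return 10.0 ** (dbz / 10.0)


def rain_rate_from_z(z_linear):
    """Invert Z = 300 R^1.4 for R in mm/h, with Z in mm^6 m^-3."""
    return (z_linear / 300.0) ** (1.0 / 1.4)


def finalize(rate, cap=100.0):
    """Set negative rates to 0 and cap the maximum at 100 mm/h."""
    return min(max(rate, 0.0), cap)


def synthetic_rate(z_linear, light_rain, moderate_rain, heavy_rain):
    """Select a rain-rate relation based on R(Z), then floor and cap it.

    light_rain, moderate_rain, and heavy_rain are hypothetical
    zero-argument callables standing in for the R(Z, Z_DR),
    R(K_DP, Z_DR), and R(K_DP) relations, respectively.
    """
    rz = rain_rate_from_z(z_linear)
    if rz < 10.0:
        rate = light_rain()
    elif rz < 50.0:  # assumption: a value of exactly 10 is treated as moderate
        rate = moderate_rain()
    else:            # assumption: a value of exactly 50 is treated as intense
        rate = heavy_rain()
    return finalize(rate)
```

Using *R*(*Z*) only as a regime selector means that its own bias affects which relation is chosen rather than the rain rate directly.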

## 4. Rain gauge evaluation

The high-density Micronet rain gauge network shown in Fig. 1 was used to evaluate all seven KOUN precipitation algorithms. In this section, rain gauges are considered ground truth. However, numerous studies have pointed out errors with rain gauge accumulations of rainfall (Zawadzki 1975; Wilson and Brandes 1979; Marselek 1981; Legates and DeLiberty 1993; Nystuen 1999; Ciach 2003). Radar-derived rainfall estimates and rain gauge accumulations are both prone to error, so the following analysis should be interpreted as instrument comparisons. Figure 2 shows scatterplots of the KOUN hourly precipitation amounts to collocated rain gauge accumulations for all events listed in Table 1. Each panel indicates the normalized bias (NB), the RMSE computed after the radar bias was removed, or the standard error (SE), and Pearson correlation coefficient (CORR), defined as follows:

and

where

Here NB is dimensionless, SE is in millimeters, and CORR is dimensionless. The summations in (5) and (6) are over each *i*th radar–gauge pair, yielding a sample size *N* of 1299. In (6) and (8) *E*(·) refers to the expected value. In the computation of CORR, *σ* is the standard deviation; NB, when multiplied by 100, is the same as the fractional bias (FB in percent) used in Ryzhkov et al. (2005b).
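One plausible reading of these three statistics is sketched below for paired lists of radar and gauge hourly accumulations. The exact forms are assumptions: NB is taken as the summed radar-minus-gauge difference divided by the summed gauge amount, and the bias removal in SE is taken to be subtraction of the mean difference. Function names are illustrative.

```python
import math


def normalized_bias(radar, gauge):
    """Sum of (radar - gauge) divided by the sum of gauge amounts."""
    return sum(r - g for r, g in zip(radar, gauge)) / sum(gauge)


def standard_error(radar, gauge):
    """RMSE of radar-gauge differences after subtracting their mean.

    Assumption: the bias is removed additively (mean difference).
    """
    diffs = [r - g for r, g in zip(radar, gauge)]
    mean_diff = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / len(diffs))


def pearson_corr(radar, gauge):
    """Pearson correlation: covariance divided by the product of std devs."""
    n = len(radar)
    mean_r = sum(radar) / n
    mean_g = sum(gauge) / n
    cov = sum((r - mean_r) * (g - mean_g) for r, g in zip(radar, gauge)) / n
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in radar) / n)
    std_g = math.sqrt(sum((g - mean_g) ** 2 for g in gauge) / n)
    return cov / (std_r * std_g)
```

Separating NB from SE in this way distinguishes a systematic offset, which can be corrected, from scatter, which cannot.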

The first notable feature of Fig. 2 is the binning of light rain gauge accumulations into 0.254 mm amounts corresponding to individual tips. Caution must be exercised in evaluating KOUN skill for hourly gauge accumulations less than 1 mm. In comparison to the conventional estimator *R*(*Z*) in Fig. 2a, the polarimetric algorithms reduced the SE by 0.70–1.56 mm, improved the CORR from a baseline of 0.83 to 0.87–0.91, but lowered the NB by 1%–33%, resulting in underestimation. Overall, these results are quite consistent with those of Ryzhkov et al. (2005b) and Giangrande and Ryzhkov (2008), who also examined KOUN polarimetric rainfall algorithms using ARS Micronet rain gauges. The primary difference between the findings in Fig. 2a and those from the aforementioned studies is the NB. The former KOUN rainfall study shows *R*(*Z*) overestimated hourly rain gauge amounts by 19.4%, while the latter indicates *R*(*Z*) had a positive bias of 1.5 mm at a range of 100 km. In those studies, a reduction in the NB with the polarimetric algorithms resulted in values closer to 0. In our case, *R*(*Z*) already underestimated gauge accumulations by 9%. Thus, the expected and observed trend of negatively biasing precipitation amounts with polarimetric algorithms results in more underestimation. This prompted us to examine the behavior of NB as a function of rainfall intensity and storm event.

Figure 3 shows the NB with the conventional estimator is more sensitive to rainfall intensity than the polarimetric algorithms. The *R*(*Z*) NB decreases from −0.09 to −0.32 with increasing rainfall intensity. The NB values with the polarimetric algorithms, on the other hand, show much less dependence on rainfall intensity. Independence of NB on rainfall rate suggests lack of sensitivity to DSD variability and is thus a favorable feature. However, when considering the entire dataset in Fig. 3 (G > 0 mm h^{−1}), the NB closest to 0 occurs with *R*(*Z*).

Figure 4 shows the NB for each event from all seven algorithms. The right-hand ordinate indicates the basin-averaged rainfall estimated from rain gauge accumulations. Here *R*(*Z*) had the “best” average NB when all events were combined, as mentioned previously, but is shown to have the greatest variability from case to case; *R*(*Z*) demonstrates an NB as low as −0.39 for event 5 (TS Erin, refer to Table 1) and as high as 0.82 for event 7. A plausible explanation for the *R*(*Z*) NB variability is sensitivity to DSD and presence of hail. With tropical rain characteristics in event 5, *R*(*Z*) is known to underestimate precipitation with the standard NEXRAD relation because of the high concentrations of relatively small-diameter drops (e.g., Smith et al. 1996; Petersen et al. 1999). If hail is present, then *R*(*Z*) is known to overestimate rainfall due to the high reflectivity values relative to their liquid water content (Ryzhkov et al. 2005b). This factor evidently was predominant in events 6 and 7. In fact, Vulpiani et al. (2009) discussed “pronounced polarimetric hail signatures” over Ft. Cobb during event 7. Here *R*(syn) shows the least variability in NB from event to event with a standard deviation in NB of 0.11 compared to a standard deviation of 0.40 with *R*(*Z*). The HCA components added to both *R*(*Z*, *Z*_{DR}|JPOLE) and *R*(*Z*, *Z*_{DR}|BC) algorithms resulted in lower standard deviations of NB from event to event. Apparently, the HCA successfully identified the presence of different hydrometeors (e.g., hail, graupel, aggregates) and adjusted the precipitation estimation schemes accordingly. Impacts of the identified radar-based precipitation errors and their temporal stability on hydrologic simulation are subjects addressed in the following section.

## 5. Hydrologic evaluation

The second component of this study is an evaluation of KOUN precipitation estimators in the context of their use as inputs to a distributed parameter hydrologic model. The primary focus is to answer the following question: will polarimetric radar upgrades to the WSR-88D network improve hydrologic simulation and flash-flood forecasting? As shown in the previous section, *R*(*Z*) is the least biased when considering all events, but this bias is nonstationary and the algorithm was shown to be the least precise; *R*(syn), on the other hand, was shown to have a significant normalized bias at −31%, but it is stationary and thus readily amendable. The latter algorithm was also shown to be more precise with a lower SE. The hydrologic evaluation undertaken herein has been adapted from the original technique of Gourley and Vieux (2005) to evaluate the effects of polarimetric radar observations on streamflow simulation.

### a. HL-RDHM

The model concepts used in this study originate from the Sacramento model (Burnash et al. 1973). This model was subdivided into grid cells having 4.76-km resolution, in accordance with the NWS Hydrologic Rainfall Analysis Project (HRAP) grid. Each grid cell has a water balance component as well as kinematic overland and channel routing components (Koren et al. 2004). The water balance component of the model, referred to as the Sacramento Soil Moisture Accounting model (SAC-SMA), considers forcing from spatially variable rainfall inputs and monthly potential evaporation (PE) demand when transferring water between the surface, upper and lower soil zones, and channels. SAC-SMA utilizes a total of 16 parameters and six state variables characterizing the water contents and fluxes between the upper and lower soil zones. Eleven of the 16 parameters are spatially distributed with a priori values supplied by empirical relations to observed or inferred soil properties and depths (Koren et al. 2000). Precipitation excess is computed at each grid cell and then kinematically routed downhill/downstream based on cell connectivity derived from a digital elevation model. The channel density and overland roughness values are set to constant values. The channel routing component assumes a power-law relation between the cross-sectional area of the stream and discharge. Two parameters defining the relation were found empirically using available observations from USGS stream gauging locations in the Ft. Cobb watershed (see Fig. 1). These routing parameters were then redistributed upstream from each USGS station to contributing grid points and as such are spatially distributed. In total, HL-RDHM uses 18 SAC-SMA and channel routing parameters, 5 of which are lumped.

### b. Assessment of rainfall inputs using DREAM (ARID)

The method of evaluating model inputs by comparing hydrologic simulations to observed streamflow is subject to the parameter settings of the model. For instance, a rainfall input that is biased low can appear to be a skillful input if the model parameters have been tuned over a sufficiently long calibration period to expect erroneously low rainfall amounts. This model “self-adjustment process” is particularly achievable with a highly parameterized model, such as HL-RDHM, which has 18 parameters. To separate the effects of errors in model inputs and subsequent adjustment by model parameters, we have carefully designed a methodology that begins with automatic parameter estimation.

Vrugt et al. (2009) demonstrated a model parameter estimation framework based on an adaptive Markov Chain Monte Carlo (MCMC) algorithm, called Differential Evolution Adaptive Metropolis (DREAM). The method runs multiple chains simultaneously to explore the global parameter space, and it automatically tunes the scale and orientation of the proposal distribution in estimating the posterior distribution of parameters. Once the marginal posterior probability density functions (pdfs) of the parameters are estimated, ensemble streamflow prediction and appropriate estimation of uncertainty bounds ensue by sampling the joint parameter pdfs. The DREAM method was developed for complex, highly nonlinear, multimodal target distributions, while maintaining ergodicity and good efficiency.
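The multi-chain idea can be illustrated with a much simpler relative of DREAM, a basic differential evolution Metropolis sampler, in which each chain proposes a jump along the difference between two other chains. The sketch below is not DREAM itself: it omits subspace sampling, outlier handling, and adaptive crossover, and it targets a toy one-dimensional Gaussian rather than a hydrologic model. All names and settings are illustrative assumptions.

```python
import math
import random


def log_target(x):
    """Toy log-density: a unit-variance Gaussian centered at 3 (illustrative)."""
    return -0.5 * (x - 3.0) ** 2


def de_mc(n_chains=8, n_iter=2000, seed=0):
    """Differential evolution Metropolis sampling of log_target in 1-D."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2.0)  # common jump scale, 2.38 / sqrt(2 d) with d = 1
    chains = [rng.uniform(-10.0, 10.0) for _ in range(n_chains)]
    logp = [log_target(x) for x in chains]
    samples = []
    for it in range(n_iter):
        for i in range(n_chains):
            # Jump along the difference of two other randomly chosen chains,
            # so the proposal scale adapts to the spread of the population.
            a, b = rng.sample([j for j in range(n_chains) if j != i], 2)
            proposal = chains[i] + gamma * (chains[a] - chains[b]) + rng.gauss(0.0, 1e-3)
            lp = log_target(proposal)
            # Metropolis acceptance with probability min(1, p_new / p_old).
            if rng.random() < math.exp(min(0.0, lp - logp[i])):
                chains[i], logp[i] = proposal, lp
        if it >= n_iter // 2:  # discard the first half as burn-in
            samples.extend(chains)
    return samples
```

In a calibration setting, `log_target` would be replaced by a log-likelihood built from the misfit between simulated and observed streamflow, which is far more expensive to evaluate than this toy density.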

In the case of the Ft. Cobb watershed, the “true,” unbiased rainfall is supplied by the 15 ARS Micronet tipping-bucket rain gauges (see Fig. 1). Hourly rainfall fields were computed from the gauge observations using a two-parameter inverse-distance weighting scheme with a “leave one out” cross-validation step. The gauge-based rainfall fields were then sampled on the hydrologic model’s HRAP grid, which has a nominal grid spacing of 4.76 km. The gauge rainfall fields *R*(gag) were input into HL-RDHM over the DREAM calibration period June 2005 through June 2008. The length of time considered in this study corresponds to the period of record of ARS Micronet rain gauge data availability in Ft. Cobb.
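The interpolation and cross-validation steps can be sketched as follows. The text does not name the two parameters of the weighting scheme, so this sketch assumes they are the distance exponent and a search radius; the function names and the station tuple layout are also illustrative.

```python
import math


def idw_estimate(x, y, stations, power=2.0, radius_km=30.0):
    """Inverse-distance-weighted rainfall at (x, y).

    stations is a list of (x_km, y_km, rain_mm) tuples. Returns None when
    no station lies within radius_km of the target point.
    """
    num = 0.0
    den = 0.0
    for sx, sy, rain in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return rain  # target coincides with a station
        if d <= radius_km:
            w = d ** -power
            num += w * rain
            den += w
    return num / den if den > 0.0 else None


def leave_one_out_rmse(stations, power, radius_km):
    """RMSE obtained by predicting each station from all of the others."""
    sq_errors = []
    for k, (sx, sy, rain) in enumerate(stations):
        others = stations[:k] + stations[k + 1:]
        est = idw_estimate(sx, sy, others, power, radius_km)
        if est is not None:
            sq_errors.append((est - rain) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors)) if sq_errors else None
```

Evaluating `leave_one_out_rmse` over a small grid of candidate parameter pairs and keeping the pair with the lowest error is one simple way to select the two parameters for each hourly field.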

The HL-RDHM was then run in the DREAM parameter estimation framework using rainfall forcing from *R*(gag). The sum of squared errors, computed by comparing simulated to observed streamflow at a USGS stream gauging location (07325800, refer to Fig. 1), was the objective used to define simulation skill. DREAM converged on final parameter settings by optimizing scalars applied to 18 parameter fields, 5 of which were spatially uniform, requiring 647 425 function evaluations. The calibrated hydrograph is shown with observed streamflow in Fig. 5. Figure 5b shows that there was a tendency for the model to overforecast, especially from August 2006 through March 2007, when there was little or no observed streamflow. Also, Fig. 5c indicates some larger peak discharges were underestimated in October 2007 and May–June 2008. Overall, the DREAM automated parameter estimation methodology yielded a Nash–Sutcliffe coefficient of efficiency (NSCE) (Nash and Sutcliffe 1970) of 0.83 and NB (defined as the sum of simulated minus observed runoff, divided by the sum of observed runoff, in %) of −24.56%. While these skill scores during calibration are acceptable, perhaps some improvements could have been made with a different approach toward parameter estimation, such as manual adjustment. However, for the purposes of this hydrologic evaluation, we required that the parameter estimation be automatic and thus completely objective.
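The two calibration scores quoted above can be computed as in the following minimal sketch, which assumes paired lists of simulated and observed streamflow at matching time steps. The function names are illustrative.

```python
def nash_sutcliffe(sim, obs):
    """Nash-Sutcliffe coefficient of efficiency (1 is a perfect match)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def normalized_bias_percent(sim, obs):
    """Sum of (simulated - observed) divided by the sum of observed, in %."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)
```

A simulation equal to the mean observed flow at every step scores an NSCE of 0, which is why values near 1 indicate that the model explains most of the observed variability.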

The method to evaluate rainfall algorithms as inputs to hydrologic models, ARID, is particularly useful for rainfall algorithms derived from new sensors or experimental technologies (e.g., dual-polarized radar) for which there may not be the availability of a continuous record to be used for calibration, which was the case with the KOUN polarimetric radar observations. Otherwise, it would be possible to directly use the evaluated rainfall algorithms, even if bias is present, during the calibration period; this latter approach is recommended for instruments that operate and are maintained continuously by agencies such as the NWS. The fundamental requirement of ARID having a hydrologic model whose parameters were estimated using unbiased rainfall forcing has been achieved at this point. The next step in ARID replaces the *R*(gag) inputs with each of the KOUN-based precipitation algorithms described in section 3 for the events listed in Table 1. The calibrated HL-RDHM model is then run continuously over the 3-yr period. Here *R*(gag) rainfall is input to the model for the times at which KOUN data were not available so that the model states are representative leading up to the events listed in Table 1.

Simulated and observed discharge for all nine events using inputs from the seven KOUN-based precipitation algorithms and “true rainfall” from *R*(gag) are shown in Fig. 6. At first glance, it appears that the error in simulating peak discharge from event to event can be directly anticipated based on the rainfall biases shown in Fig. 4. Specifically, the majority of the simulated hydrographs tended to underestimate observed streamflow owing to the negatively biased rainfall inputs (Figs. 6a,c,d,e,h). On the other hand, overestimation of streamflow occurred with the cases that were noted to contain hail in the rain gauge evaluation (Figs. 6f,g). Curiously, there were two events in which many of the rainfall inputs were negatively biased according to Fig. 4, yet simulated streamflow still overestimated observed flow significantly (Figs. 6b,i). In fact, overestimation of streamflow occurred with all events that had peak discharge <30 m^{3} s^{−1}. It is quite plausible that the combination of model error and parametric uncertainty dominated the simulated streamflow response for these weaker events where the streamflow response was less influenced by the rainfall forcing. The same errors occurred with simulated hydrographs that used *R*(gag) forcing; thus, it is important to cast the results from the hydrologic evaluation relative to the calibrated simulations.

To facilitate interpretation of the results in Fig. 6 in a condensed format, we computed statistical measures summarizing the precision and accuracy of the simulations for the 3-yr period of study. Here *R*(gag) was used as input for a majority of the study period, and the KOUN-based precipitation algorithms were substituted for the events listed in Table 1; we thus define the following statistics to evaluate simulations from each of the KOUN precipitation algorithms *relative* to the skill of the DREAM-based *R*(gag) calibration run (see Fig. 5):

$$
\mathrm{GRE} = 1 - \frac{\sum_{i}\left(Q_i^{R} - Q_i^{\mathrm{obs}}\right)^{2}}{\sum_{i}\left(Q_i^{R(\mathrm{gag})} - Q_i^{\mathrm{obs}}\right)^{2}}, \tag{9}
$$

where GRE is the gauge-relative efficiency for simulations corresponding to each of the KOUN-based precipitation inputs *R*, *Q* is the streamflow at each *i*th 15-min time step, and the superscript “obs” represents the observed streamflow. The GRE is very similar in formulation to the more common NSCE. The difference is that the mean observed streamflow present in the denominator of NSCE is replaced with the *Q*_{*i*}^{*R*(gag)} simulation that had DREAM-optimized parameters with unbiased rainfall forcing from *R*(gag) shown in Fig. 5. The GRE score thus casts simulation skill in relation to the skill achievable by the calibration run. A score of 0 indicates the *R* rainfall input resulted in the same efficiency that was obtained using true *R*(gag) rainfall in the model calibration run. A score of 1 indicates the simulation skill exceeded that produced by model calibration and agreed perfectly with observations. GRE scores worsen as they become more negative, down to −∞, indicating the least skill. The second statistic computed is defined as follows:

where GRB (%) is the gauge-relative bias for simulations corresponding to each *R*. GRB = 0% indicates the simulation bias was the same as that achieved with *R*(gag) inputs, or −24.56%. The two statistics shown in (9) and (10) have been designed with the expectation that simulation skill will generally worsen when the rainfall forcing deviates from the true rainfall used in the calibration step.
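As a rough illustration of how the two scores behave, the sketch below computes GRE directly from its verbal description (the NSCE with the mean observed flow in the denominator replaced by the *R*(gag) simulation). The GRB expression is not recoverable from the text here, so the form used in `grb`, the difference between the percent bias of the *R*-forced run and that of the *R*(gag) run, is an assumption chosen only because it returns 0% when the two biases match. All function names and sample flows are hypothetical.

```python
import numpy as np


def gre(q_r, q_gag, q_obs):
    """Gauge-relative efficiency: 0 matches the R(gag) run, 1 is perfect."""
    q_r, q_gag, q_obs = (np.asarray(x, dtype=float) for x in (q_r, q_gag, q_obs))
    return 1.0 - np.sum((q_r - q_obs) ** 2) / np.sum((q_gag - q_obs) ** 2)


def percent_bias(q_sim, q_obs):
    """Total simulated minus observed flow, as a percentage of observed."""
    q_sim = np.asarray(q_sim, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    return 100.0 * np.sum(q_sim - q_obs) / np.sum(q_obs)


def grb(q_r, q_gag, q_obs):
    """ASSUMED form of gauge-relative bias (%), not confirmed by the text."""
    return percent_bias(q_r, q_obs) - percent_bias(q_gag, q_obs)


# Hypothetical streamflow series (m^3 s^-1) at three time steps.
q_obs = [10.0, 20.0, 30.0]
q_gag = [8.0, 18.0, 26.0]   # calibration run forced by gauges
q_r = [6.0, 16.0, 22.0]     # run forced by a radar algorithm

print(gre(q_r, q_gag, q_obs), grb(q_r, q_gag, q_obs))
```

In this toy case the radar-forced run has twice the error of the calibration run at every step, so its squared error is four times larger and GRE is negative, which is the behavior the text describes for inputs that deviate from the true rainfall.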

Figure 7 shows a two-dimensional plot of GRE and GRB for each of the KOUN precipitation algorithms evaluated in this study. Simulation skill equivalent to that achieved by *R*(gag) inputs with optimized model parameters will have GRE and GRB values of 0. First, we see inputs from *R*(*Z*, *Z*_{DR}|BC) and *R*(*Z*, *Z*_{DR}|BC, HCA) yielded streamflow simulations that were more skillful than those achieved using the conventional *R*(*Z*) algorithm. The *R*(*K*_{DP}), *R*(*Z*, *Z*_{DR}|JPOLE), *R*(*Z*, *Z*_{DR}|JPOLE, HCA), and *R*(syn) algorithms, on the other hand, all had worse hydrologic skill than those simulations forced by *R*(*Z*) inputs. The rain gauge evaluation in section 4 indicated these algorithms had the largest negative biases. Apparently, hydrologic simulations according to the GRB and GRE scores are rather sensitive to rainfall bias (accuracy). Regarding the magnitude of the NB found in the rain gauge evaluation in comparison to the GRB found using ARID, we see the biases with each of the KOUN precipitation algorithms have been approximately halved after propagating through the hydrologic model as streamflow.

The sensitivity of hydrologic simulation to rainfall bias combined with the differing behaviors of KOUN rainfall algorithm biases as a function of rainfall intensity and storm event (see section 4) prompted us to reexamine ARID findings, this time with long-term, event-combined rainfall biases corrected (i.e., removed). To do this, we applied a single mean field bias adjustment to each KOUN precipitation algorithm based on the NB values shown in Fig. 2. Then, we performed ARID as before but using the unbiased rainfall estimators. The simulations for each event are shown in Fig. 8. In comparing Figs. 6 and 8, improvements in skill are readily visible after rainfall biases have been removed. However, overestimation of streamflow is still present for the events with observed peak discharges <30 m^{3} s^{−1} (Figs. 8b,f,g,i). This finding confirms that the hydrologic model error dominates the streamflow simulations when the events are weak and are thus not heavily driven by the rainfall forcing, even after the rainfall biases were removed. Figure 7 quantifies the overall hydrologic skill based on GRE and GRB scores. Significant improvements are noted in both GRE and GRB scores for all KOUN algorithms following bias correction (denoted by an asterisk superscript), with the notable exception of *R*(*Z*)*. In fact, all precipitation algorithms using polarimetric observations with bias correction outperformed *R*(*Z*) and *R*(*Z*)*. Evidently, the high event-to-event variability of the *R*(*Z*) bias means a long-term bias adjustment yields no improvement in hydrologic simulation. The biases associated with the polarimetric algorithms, on the other hand, have a more stationary behavior and, when the long-term bias has been removed, significant improvement in hydrologic skill is realized.
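The reasoning above, that a single long-term adjustment helps only when the bias is stationary, can be sketched numerically. The example assumes NB is the summed radar-minus-gauge difference divided by the summed gauge total and that the adjustment is a single multiplicative factor 1/(1 + NB); both are assumptions of this sketch, as are the function names and the two made-up events.

```python
import numpy as np


def normalized_bias(radar, gauge):
    """Fractional bias: sum(radar - gauge) / sum(gauge) (assumed definition)."""
    radar = np.asarray(radar, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    return float(np.sum(radar - gauge) / np.sum(gauge))


def mean_field_adjust(radar, nb):
    """Scale radar rainfall by one factor that removes a fractional bias nb."""
    if nb <= -1.0:
        raise ValueError("NB must exceed -1 for a finite correction factor")
    return np.asarray(radar, dtype=float) / (1.0 + nb)


def residual_event_biases(radar_events, gauge_events):
    """Per-event NB left after one adjustment based on the pooled NB."""
    pooled_nb = normalized_bias(np.concatenate(radar_events),
                                np.concatenate(gauge_events))
    return [normalized_bias(mean_field_adjust(r, pooled_nb), g)
            for r, g in zip(radar_events, gauge_events)]


# Two hypothetical events of gauge rainfall (mm).
gauge_events = [np.array([10.0, 20.0]), np.array([30.0, 40.0])]
# Stationary estimator: 30% low in both events.
stationary = [0.7 * g for g in gauge_events]
# Event-varying estimator: 40% low in one event, 20% high in the other.
varying = [0.6 * gauge_events[0], 1.2 * gauge_events[1]]

print(residual_event_biases(stationary, gauge_events))
print(residual_event_biases(varying, gauge_events))
```

For the stationary estimator the single factor removes the bias in every event, whereas the event-varying estimator has a small pooled bias that masks large per-event errors the adjustment cannot touch, which mirrors the contrast drawn between the polarimetric algorithms and *R*(*Z*).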

The best hydrologic skill is associated with *R*(syn)* and *R*(*Z*, *Z*_{DR}|BC, HCA)*. Figure 3 shows these two algorithms in particular had significant biases prior to correction, but the biases were virtually independent of rainfall intensity. Here *R*(syn)* and *R*(*Z*, *Z*_{DR}|BC, HCA)* are the least susceptible to DSD variability and have biases with a stationary behavior, both of which are evidently positive attributes when considering them as rainfall inputs for hydrologic simulation. We also see improvements in *R*(*Z*, *Z*_{DR}|BC, HCA)* over *R*(*Z*, *Z*_{DR}|BC)* and in *R*(*Z*, *Z*_{DR}|JPOLE, HCA)* over *R*(*Z*, *Z*_{DR}|JPOLE)*, indicating the HCA component yielded improvement to precipitation algorithms employing polarimetric observations. Of all polarimetric algorithms with bias adjustment, *R*(*K*_{DP})*, which used the least amount of polarimetric information, had the lowest hydrologic skill. This analysis indicates information provided by all polarimetric observations [i.e., use of *Z*, *Z*_{DR}, and *K*_{DP} in the *R*(syn)* estimator] and complex utilization of the data, such as in the HCA, are justified. However, the bulk of the improvement in hydrologic simulation was only realized following event-combined bias correction.

## 6. Summary and conclusions

A heavily instrumented USDA ARS research watershed near Ft. Cobb, Oklahoma, was used to 1) evaluate precipitation estimates from the KOUN polarimetric prototype of the WSR-88D and 2) assess the impacts of polarimetric precipitation estimation on hydrologic simulation using the calibrated HL-RDHM. Hourly precipitation estimates from seven KOUN-based algorithms were compared to 15 ARS Micronet rain gauge accumulations totaling 1299 radar–gauge pairs over nine separate events from 2005 to 2008. The rain gauge evaluation yielded the following points:

- In terms of cumulative skill metrics, all algorithms that used polarimetric observations yielded better precision by reducing the SE and increasing the CORR over the conventional *R*(*Z*) estimator.
- When events were considered altogether, the least biased (most accurate) precipitation algorithm was the conventional *R*(*Z*) estimator.
- The NB of *R*(*Z*) had the greatest variability from event to event, due to different DSDs and the presence of hail, and tended to underestimate increasingly with higher rainfall intensities.
- At a distance of 83–120 km from KOUN, the synthetic polarimetric precipitation algorithm *R*(syn) had the most significant underestimation with an NB of −31% (least accurate), but this bias was virtually independent of rainfall intensity and storm events characterized by different DSDs and presence of hail.
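The three measures referred to in these points can be computed from radar–gauge pairs along the following lines. This is only a sketch: it assumes SE denotes the root-mean-squared error and CORR the Pearson correlation coefficient, as the abstract suggests, and that NB is the summed difference normalized by the gauge total. The function name `pair_stats` and the sample accumulations are hypothetical.

```python
import numpy as np


def pair_stats(radar, gauge):
    """Precision and accuracy measures for collocated radar-gauge pairs."""
    r = np.asarray(radar, dtype=float)
    g = np.asarray(gauge, dtype=float)
    rmse = float(np.sqrt(np.mean((r - g) ** 2)))    # precision (assumed SE)
    corr = float(np.corrcoef(r, g)[0, 1])           # Pearson correlation
    nb_percent = float(100.0 * np.sum(r - g) / np.sum(g))  # accuracy
    return {"rmse": rmse, "corr": corr, "nb_percent": nb_percent}


# Hypothetical hourly accumulations (mm): radar is 1 mm low at every gauge.
gauge = [2.0, 4.0, 6.0, 8.0]
radar = [1.0, 3.0, 5.0, 7.0]
stats = pair_stats(radar, gauge)
print(stats)
```

The toy case shows why precision and accuracy are reported separately: the radar values are perfectly correlated with the gauges yet still carry a negative normalized bias.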

The second component of this study evaluated the polarimetric precipitation algorithms as inputs to a distributed hydrologic model so as to assess the impacts of polarimetric upgrades to the WSR-88D network on hydrologic simulation and flash flood forecasting. A methodology called Assessment of Rainfall Inputs using Differential Evolution Adaptive Metropolis (DREAM; ARID) was developed and implemented to evaluate polarimetric precipitation algorithms. The fundamental requirements of ARID were ARS Micronet rain gauge data [*R*(gag)] to represent the “true,” unbiased rainfall and an automated parameter estimation methodology. The *R*(gag) data were input into HL-RDHM for the 3-yr calibration period from 2005 to 2008, and then 647 425 simulations were performed within the DREAM framework to automatically locate the optimized parameter scalar multipliers. These parameter settings were then fixed and the calibrated model was rerun with rainfall forcing from the seven KOUN algorithms. Continuous simulations were performed for the 3-yr period where *R*(gag) data were used in between events (i.e., when polarimetric data were not available) to provide accurate model states. Two new statistical measures, gauge-relative bias (GRB) and gauge-relative efficiency (GRE), quantified the hydrologic skill of each algorithm relative to the simulation forced by *R*(gag) used in the model calibration step. Limitations of applying the ARID methodology were noted with four weakly forced events that had peak discharges <30 m^{3} s^{−1}. Streamflow was overpredicted in these cases regardless of rainfall forcing, which indicates that the combination of model error and parametric uncertainty, rather than uncertainty in the rainfall inputs, dominated the uncertainty in the streamflow response. The following points summarize the hydrologic skill of the evaluated polarimetric algorithms:

- Rainfall biases found with each of the polarimetric precipitation algorithms were approximately halved after propagating through the hydrologic model as streamflow.
- When the precipitation algorithms were evaluated “as is” with no bias correction, *R*(*Z*, *Z*_{DR}|BC) and *R*(*Z*, *Z*_{DR}|BC, HCA) were the only polarimetric precipitation estimators to improve hydrologic skill over conventional *R*(*Z*) inputs.
- Following correction of event-combined bias based on the rain gauge evaluation, all simulations forced by polarimetric precipitation estimators outperformed the conventional *R*(*Z*) and the bias-corrected *R*(*Z*)*.
- The hydrometeor classification algorithm (Giangrande and Ryzhkov 2008; Park et al. 2009) added to both *R*(*Z*, *Z*_{DR}|BC)* and *R*(*Z*, *Z*_{DR}|JPOLE)*, subject to bias correction, yielded improvements in hydrologic simulation.
- The best hydrologic skill, relative to parameter settings from *R*(gag) forcing, was achieved with forcing from bias-corrected *R*(syn)* and *R*(*Z*, *Z*_{DR}|BC, HCA)*.

This study found that polarimetric precipitation algorithms were more precise than the conventional *R*(*Z*) estimates, as denoted by lower SE values, but were subject to significant underestimation (NB as low as −31%) at ranges greater than 80 km; range-dependent errors of polarimetric precipitation algorithms have been explored in more detail in Giangrande and Ryzhkov (2008). It is possible that polarimetric rainfall biases on other radars will also result from miscalibrated *Z* and *Z*_{DR} values, which were carefully corrected in this study. The mitigation of these biases, however, was found to be very effective in continuous hydrologic simulation because the biases in the polarimetric precipitation algorithms were relatively stationary in behavior; they were virtually independent of rainfall intensity and storm events characterized by different DSDs and the presence of hail. It is thus imperative that long-term biases of polarimetric precipitation algorithms are identified and corrected before the algorithms are implemented in hydrologic models. We recommend future studies focused on polarimetric precipitation errors as a function of range, as in Giangrande and Ryzhkov (2008), but for other geographic areas and storm types. Following bias correction, improvement in hydrologic simulation was found to correspond to better precision of the precipitation algorithms. This precision resulted from increasing complexity in the algorithm formulations in terms of the number of polarimetric parameters used, inclusion of a hydrometeor classification algorithm, and adaptation of the utility of all three polarimetric observations based on precipitation rate as in the *R*(syn) algorithm.

Areas inviting future research include the estimation of parameters in distributed hydrologic models with continuous datasets of polarimetric observations and the examination of polarimetric precipitation inputs in hydrologic simulations in diverse geographic regimes, such as the intermountain West, and in basins with smaller catchment areas. Another topic worth pursuing is the simultaneous estimation of parameters used in polarimetric precipitation relations along with hydrologic model parameters using the difference between simulated and observed discharge.

## Acknowledgments

Funding was provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA–University of Oklahoma Cooperative Agreement NA17RJ1227, U.S. Department of Commerce. Support from NSSL and University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies personnel who maintain and operate the KOUN WSR-88D polarimetric radar is gratefully acknowledged. The second author would like to acknowledge the support of Isztar Zawadzki and Pavlos Kollias at McGill University. The last author is supported by a J. Robert Oppenheimer Fellowship from the Los Alamos National Laboratory postdoctoral program. Computer resources for the calibration of the model were made available from the University of Oklahoma’s Supercomputing Center for Education and Research (OSCER).

## Footnotes

*Corresponding author address:* Jonathan J. Gourley, National Weather Center, 120 David L. Boren Blvd., Norman, OK 73072-7303. Email: jj.gourley@noaa.gov

This article is included in the State of the Science of Precipitation special collection.