The NEXRAD program has recently upgraded the WSR-88D network observational capability with dual polarization (DP). In this study, DP quantitative precipitation estimates (QPEs) provided by the current version of the NWS system are evaluated using a dense rain gauge network and two other single-polarization (SP) rainfall products. The analyses are performed for the period and spatial domain of the Iowa Flood Studies (IFloodS) campaign. It is demonstrated that the current version (2014) of QPE from DP is not superior to that from SP mainly because DP QPE equations introduce larger bias than the conventional rainfall–reflectivity [i.e., R(Z)] relationship for some hydrometeor types. Moreover, since the QPE algorithm is based on hydrometeor type, abrupt transitions in the phase of hydrometeors introduce errors in QPE with surprising variation in space that cannot be easily corrected using rain gauge data. In addition, the propagation of QPE uncertainties across multiple hydrological scales is investigated using a diagnostic framework. The proposed method allows us to quantify QPE uncertainties at hydrologically relevant scales and provides information for the evaluation of hydrological studies forced by these rainfall datasets.
Single-polarization (SP) radars have been used to estimate precipitation quantities for several decades (e.g., Wilson and Brandes 1979). During this period, the research community has extensively examined the limitations of these instruments and developed methods to improve quantitative precipitation estimation (QPE; see, e.g., Austin 1987; Smith et al. 1996; Pereira Fo. et al. 1998; Krajewski and Smith 2002; Villarini and Krajewski 2010; Kitzmiller et al. 2013). The importance of radars for weather analysis, warnings, and forecasting has been proven throughout the years. However, as shown by various studies, the accuracy of SP radar QPE is limited, and applying this information in hydrology requires careful attention (Smith et al. 1996; Baeck and Smith 1998; Borga 2002; Cunha et al. 2012; Berne and Krajewski 2013). With the goal of improving radar QPE, the NWS has implemented a few initiatives, one of the most recent ones being the upgrade of the Next Generation Weather Radar (NEXRAD) network with dual-polarization (DP) capabilities.
DP radars present many advantages over SP radars, including better characterization of hydrometeor types; identification of nonweather targets; differentiation of rain, snow, and melting layer; and detection of hail and heavy rain. These features allow the improvement of data quality control (QC) and QPE (e.g., Chandrasekar et al. 1990; Liu and Chandrasekar 2000; Illingworth et al. 1986; Chandrasekar et al. 2013). The ability of DP to identify hydrometeor type has been extensively explored. However, it is still unclear how hydrometeor type information and DP variables should be combined to improve QPE (Sachidananda and Zrnić 1987; Seliga and Bringi 1976; Chandrasekar et al. 2008; Giangrande and Ryzhkov 2008; Cifelli et al. 2011). Although some studies demonstrate improvements in QPE based on DP radars, other results are contradictory (e.g., Ryzhkov et al. 2005b; Tabary et al. 2011), and improvements are not always consistent (Cunha et al. 2013). The current lack of consensus and consistency among DP algorithms arises partly because DP technology has only recently been implemented in operational practice and is still being advanced and tested. As with the development of SP technology, many years of research will be required until we learn how to optimally use the measurements obtained by DP radars.
The NEXRAD program completed the DP upgrade in June 2013, and NOAA has implemented a system to ingest, process, and distribute DP radar data and derived products. The new products have been recently added to the NEXRAD operational product stream. Istok et al. (2009) provide a description of the variables directly and indirectly obtained by DP radar and the current methods used for QPE. In this study, we evaluate the DP rainfall estimates provided by the current version of the WSR-88D systems using a dense rain gauge network and two other SP rainfall products. We focus our analyses on the period and location of the Iowa Flood Studies (IFloodS) campaign (http://pmm.nasa.gov/ifloods). The IFloodS campaign was conducted in the northeastern part of Iowa in the spring of 2013 to support ground validation program activities of the international Global Precipitation Measurement (GPM) satellite mission (Petersen and Krajewski 2013). Our goal is to evaluate the new WSR-88D QPE in the context of IFloodS and hydrologic applications. We also attempt to identify strengths and weaknesses of the current DP system to guide future research on the topic. Because the system was only recently implemented and the definitions of QPE algorithms and parameters are based on a limited dataset (Giangrande and Ryzhkov 2008), one of the first steps for further development is to assess how the method performs for different areas, different radars, and under different meteorological conditions.
This paper assesses the quality of DP QPE by comparing it with rain gauge observations and two operational SP QPE products: Stage IV and Iowa Flood Center (IFC; IFC-SP). We investigate DP uncertainties for different flood events, radar, and hydrometeor types and explore how these uncertainties propagate across scales. The paper is organized as follows. In section 2, we describe the study area, the datasets, and the methods used to evaluate DP QPE. In section 3, we present our main results. We summarize flood events that occurred in Iowa during the late spring and early summer of 2013, present rain gauge versus radar data comparisons, and demonstrate how rainfall uncertainties propagate through the river network. In the last section, we present our conclusions and a discussion on the main technical challenges we foresee in DP QPE that should be resolved in the coming years.
2. Study area, data, and methods
We focus on the IFloodS spatial domain shown in Fig. 1 and the period from 1 April to 30 July. Iowa was chosen as a GPM (Hou et al. 2014; Tapiador et al. 2012) field campaign site [for more details on the campaign, see Petersen and Krajewski (2013)] because of its uniform land cover, relatively flat topography, absence of coastal effects, the high frequency of floods from May to June, and the availability of preexisting rainfall measurement instruments, including seven NEXRAD radars that collectively cover the state. The site also enables studies of floods across a wide range of spatial scales. During the campaign, several additional instruments that measure rainfall, including the NASA polarimetric S-band radar (NPOL), NASA disdrometers, and four X-band radars (XPOL), were installed in the area. In this study, we evaluate QPE for three of the seven radars: Des Moines (KDMX), Davenport (KDVN), and La Crosse (KARX). In Fig. 1, we show the 230-km range for these radars. The IFloodS campaign focused on three watersheds, delineated in Fig. 1: the Cedar River, the Iowa River, and the Turkey River basin. The hydrological analyses we present in this study focus on the Turkey River basin, since 20 extra rain gauges with soil moisture probes and two Iowa Flood Center X-band polarimetric mobile weather radars were deployed in this basin during the campaign [see Mishra et al. (2015, manuscript submitted to J. Hydrometeor.) and Petersen and Krajewski (2013)]. This basin is completely covered by the KARX radar and partially covered by the KDMX and the KDVN radars.
In the following sections, we describe the datasets used in this study. We start by describing SP and DP radar QPE products and conclude by describing the rain gauge data used as ground reference in the analyses.
a. Radar rainfall data
1) IFC-SP
The IFC-SP product is generated by the Iowa Flood Center at the University of Iowa and is used as the main input to real-time hydrological forecast models for the state of Iowa. The product is generated using a real-time implementation of Hydro-NEXRAD (Krajewski et al. 2011, 2013; Kruger et al. 2011; Seo et al. 2011). The IFC rainfall processing system uses super-resolution (0.5° × 250 m) level II SP radar volume data (Torres and Curtis 2007; Seo and Krajewski 2010) obtained from the Unidata Local Data Manager (LDM) delivery software (Sherretz and Fulker 1988; Fulker et al. 1997). The system uses predefined sets of quality control and QPE algorithms and relevant parameters, such as space and time resolution, spatial domain, and rainfall conversion [e.g., rainfall–reflectivity; R–Z or R(Z)] parameters [for more detailed information, refer to Krajewski et al. (2013)]. The real-time product occasionally misses individual volume scan data from one or more of the radars covering the Iowa domain since it is susceptible to network losses or delays in the distribution of the data. The overall fraction of the missing data is small, on the order of 0.1% and most likely to occur during periods of precipitation when the files are larger. To avoid such problems, in this study we evaluate a product that was reprocessed after the IFloodS campaign using the same algorithms as the real-time system and including missing files. The product covers the entire state and is based on reflectivity measured by seven radars: KDVN in Davenport, Iowa; KDMX in Des Moines, Iowa; KARX in La Crosse, Wisconsin; KMPX in Minneapolis, Minnesota; KOAX in Omaha, Nebraska; KFSD in Sioux Falls, South Dakota; and KEAX in Kansas City, Missouri.
The steps used to generate the IFC rainfall product involve radar reflectivity data quality control (detection of ground echoes due to anomalous propagation and ground clutter), time synchronization of volume scans, merging radar reflectivity from multiple radars onto predefined common spatial and temporal domains, and transforming radar-measured reflectivity to rainfall amounts (rain rate/accumulation). Note that radar merging is performed at the reflectivity level. Details about each procedure are presented in Krajewski et al. (2013). The final product is rainfall rate (mm h−1) at 5-min temporal resolution and is accumulated over different time intervals (e.g., 15 min or hourly) in the postprocessing. The IFC-SP presents the highest spatial resolution (approximately 0.5 km) of all products evaluated in this study. A minimum rain-rate threshold of 1.59 mm h−1 is used in the generation of the 5-min data to address the effects of anomalous propagation and ground clutter, mostly close to the radars. This rather high threshold is justified by the negligible effects of what is essentially drizzle (below the threshold) on the development of flooding.
2) Stage IV
Since the mid-1990s, the NWS has provided multisensor precipitation products for hydrological operations. The product consists of real-time estimates of liquid precipitation at the ground and snow depth and snow water equivalent estimates (Super and Holroyd 1997). Seo et al. (2010) and Kitzmiller et al. (2013) presented a detailed description of the NWS system. Initially, the NWS rainfall estimation method consisted of three steps denoted Stage I, II, and III. Stage I generated single-radar, radar-only product; Stage II produced multisensor (radar and rain gauge data) for a single radar (Fulton et al. 1998); and Stage III consisted of the regional mosaicking of the Stage II radar–gauge products with interactive quality-control procedures performed at the NWS River Forecast Centers (RFCs). In 2002 this system was replaced by the Multisensor Precipitation Estimator (MPE) algorithm, which incorporated significant improvements gained since the implementation of the system (Breidenbach et al. 1999; Breidenbach and Bradberry 2001; Lawrence et al. 2003). Since 2002, improvements have been performed to correct numerical truncation errors in accumulation (Fulton et al. 2003; Zhang et al. 2011) and to improve operational QC (O’Bannon and Ding 2003; Glaudemans et al. 2009).
Stage IV refers to the nationwide mosaic of manually edited regional maps produced on an hourly basis. In this work, we use the hourly Stage IV product on the Hydrologic Rainfall Analysis Project (HRAP) grid (Reed and Maidment 1999) of approximately 4 km × 4 km generated and maintained by the National Centers for Environmental Prediction (NCEP) that is produced by combining Stage III products from the regional multisensor analyses obtained from the 12 RFCs in the contiguous United States. Lin and Mitchell (2005) describe the methods used by NCEP to generate this product. The hydrological and meteorological communities have been using this product as reference for gridded rainfall estimates because of its national coverage, high spatial and temporal resolutions, and lack of significant overall bias (e.g., Wu et al. 2012; Seo et al. 2013; Price et al. 2014; Brun and Barros 2014).
According to D. Kitzmiller (2015, personal communication), the upgrade of the Advanced Weather Interactive Processing System (AWIPS) MPE software to ingest, bias adjust, and incorporate DP QPE products into the operational Stage III and Stage IV processes is underway. The ability to incorporate real-time gauge information would help mitigate some of the time- and space-dependent biases demonstrated in the paper.
3) DP products: Level III NCDC
In this study, we evaluate level III DP products produced by the NWS for the three radars shown in Fig. 1. NEXRAD DP QPE algorithms were developed and tested based on the data collected during the multiyear Joint Polarization Experiment (JPOLE) that took place in central Oklahoma (see Ryzhkov et al. 2005c; Giangrande et al. 2008; Giangrande and Ryzhkov 2008; Vulpiani and Giangrande 2009). The algorithm first identifies the most likely type of hydrometeor sampled by the radar (Ryzhkov et al. 2005a; Park et al. 2009) using fuzzy logic, as well as the top and bottom melting-layer boundaries (Giangrande et al. 2008). The boundaries of the melting layer are defined based on lower- and upper-limit thresholds for reflectivity and correlation coefficient typically found in a melting layer. In the current system the following classes are defined: 1) biological, 2) ground clutter and anomalous propagation, 3) ice crystal, 4) dry snow, 5) wet snow, 6) light–moderate rain, 7) heavy rain, 8) big drops, 9) graupel, and 10) hail mixed with rain. Once the hydrometeor type and the melting-layer position are identified, they are used to specify the most appropriate relationship to convert DP variables into rainfall. For example, for light–moderate rain and heavy rain, rainfall rate is a function of reflectivity Z and differential reflectivity Zdr. For wet snow, graupel, and dry snow, rainfall is estimated based on the conventional R(Z) relationship multiplied by an empirical factor selected to remove average bias. Giangrande and Ryzhkov (2008) empirically estimated the equation and its parameters for each hydrometeor type using JPOLE data.
We evaluate two NEXRAD DP products: 1) the hybrid hydrometeor class (HHC) and 2) the instantaneous precipitation rate (IPR). Both IPR and HHC fields are provided at 1-km spatial resolution for each volume scan. The HHC is the hydrometeor classification obtained from the best/lowest available scan at each location (www.ncdc.noaa.gov/oa/radar/radarproducts.html). We processed IPR and HHC to generate hourly maps of rainfall rate (mm h−1) and most frequent hydrometeor class. The NEXRAD DP information is provided for a maximum radar range of 230 km (Fig. 1). To enable evaluation of overlapping radars and the comparison with different rainfall products, we first converted the data from polar coordinates to geographical coordinates with a spatial resolution of approximately 1 km × 1 km. In precipitation mode, NEXRAD radars typically collect a volume scan every 4–5 min, and the level III IPR dataset is provided at the same intervals. To deal with the irregular time intervals of the input dataset, we linearly interpolated radar scans to 1-min intervals and then aggregated the 1-min data to obtain 60-min rain rates. We adopted a similar procedure to generate hourly maps of hydrometeor types. In that case, we interpolated the hybrid hydrometeor class to 1-min resolution using a nearest-neighbor approach, and we assigned a specific hydrometeor type to an hour if it is present for at least 80% of that hour (48 min).
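The temporal processing described above can be illustrated with a minimal sketch for a single pixel. It assumes scan times expressed in minutes from the start of the hour and fills minutes outside the scan range with the nearest scan value; the function names and the `min_minutes=48` persistence parameter are illustrative choices, not part of the operational processing.

```python
from collections import Counter

def interp_to_minutes(times_min, rates, n_minutes=60):
    """Linearly interpolate irregular scan rain rates (mm/h) to 1-min steps.

    times_min must be increasing. Minutes before the first scan or after
    the last scan take the nearest scan value (an assumption of this sketch).
    """
    out = []
    j = 0
    for t in range(n_minutes):
        # Advance to the last scan at or before minute t.
        while j + 1 < len(times_min) and times_min[j + 1] <= t:
            j += 1
        if t <= times_min[0]:
            out.append(rates[0])
        elif j + 1 >= len(times_min):
            out.append(rates[-1])
        else:
            t0, t1 = times_min[j], times_min[j + 1]
            w = (t - t0) / (t1 - t0)
            out.append((1 - w) * rates[j] + w * rates[j + 1])
    return out

def hourly_rate(times_min, rates):
    """Mean of the 1-min rates, i.e., the 60-min rain rate in mm/h."""
    minute_rates = interp_to_minutes(times_min, rates)
    return sum(minute_rates) / len(minute_rates)

def nearest_class_per_minute(times_min, classes, n_minutes=60):
    """Assign to each minute the class of the scan closest in time."""
    idx = range(len(times_min))
    return [classes[min(idx, key=lambda i: abs(times_min[i] - t))]
            for t in range(n_minutes)]

def hourly_class(minute_classes, min_minutes=48):
    """Return the dominant class if it persists for min_minutes, else None."""
    label, count = Counter(minute_classes).most_common(1)[0]
    return label if count >= min_minutes else None
```

With scans every 5 min, an hour in which the last two scans switch from rain to wet snow still has 48 rain minutes and is labeled rain, whereas a switch over the last three scans leaves the hour unclassified.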
As previously described, both IFC-SP and Stage IV are based on SP data from multiple NEXRAD radars that are mosaicked to generate a regional (IFC-SP) or national (Stage IV) product. The IFC-SP is mosaicked at the reflectivity level (reflectivity based), while the Stage IV is mosaicked at the rainfall level (accumulation based) using the final product of the MPE created at the RFCs. To compare DP products with IFC-SP and Stage IV, we mosaicked 60-min data from KDMX, KDVN, and KARX using an accumulation-based procedure. We generated a regional merged product, Merged-DP, that covers the entire study area using IFC-SP as a mask (resolution, domain, and spatial grid). The entire IFloodS study area is covered by at least one radar, and some regions are covered by two or three radars. Note that the Turkey River basin is where the three radars overlap, but it is at a relatively far range from all three radars. We spatially combined different rainfall fields using range-dependent weights estimated based on the double exponentially decaying function (e.g., Zhang et al. 2005; Seo et al. 2011).
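The accumulation-based merging can be sketched for one grid cell as a range-weighted average of the hourly accumulations from the radars that cover it. The exact weighting function and its parameters are not given in the text; the form `exp(-(r/L)^2)` and the 100-km scale used below are assumptions chosen only to illustrate a weight that decays with range, and the function names are hypothetical.

```python
import math

def range_weight(range_km, scale_km=100.0, max_range_km=230.0):
    """Weight that decays with distance from the radar.

    The functional form and scale_km are assumptions of this sketch;
    beyond max_range_km the radar provides no data, so the weight is zero.
    """
    if range_km > max_range_km:
        return 0.0
    return math.exp(-((range_km / scale_km) ** 2))

def merge_pixel(values_mm, ranges_km, scale_km=100.0):
    """Merge hourly accumulations (mm) from several radars at one cell.

    values_mm holds None where a radar has no coverage. Returns None when
    no radar contributes.
    """
    num = 0.0
    den = 0.0
    for value, rng in zip(values_mm, ranges_km):
        if value is None:
            continue
        w = range_weight(rng, scale_km)
        num += w * value
        den += w
    return num / den if den > 0 else None
```

Because the weights fall off quickly with range, a cell at 20 km from one radar and 200 km from another takes almost all of its value from the closer radar.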
b. Rain gauge network
In this study, we use rain gauge data from six different networks: 1) the NWS Automated Surface Observing System (ASOS; McKee et al. 2000), 2) the Automated Weather Observing System (AWOS; Milewska and Hogg 2002), 3) the Iowa State University AgClimate (ISUAG) stations located at experimental farms throughout Iowa, 4) the IFC–NASA rain gauges (Petersen and Krajewski 2013), 5) the NASA disdrometer network deployed specifically for IFloodS (D’Adderio et al. 2015), and 6) the NCEP U.S. gauge dataset (Kim et al. 2009). All the networks provide rainfall data with at least hourly resolution. The ASOS and AWOS networks utilize all-weather gauges that provide the liquid equivalent of all precipitation forms (i.e., liquid, freezing, frozen, or combinations), while the other networks utilize gauges that only measure liquid precipitation. (These datasets are freely available at http://mesonet.agron.iastate.edu/ for AWOS, ASOS, and ISUAG; at http://iowafloodcenter.org/projects/ifloods/ for IFC–NASA and NASA disdrometers; and at http://data.eol.ucar.edu/codiac/dss/id=21.004 for NCEP.)
Gauge and disdrometer rainfall data are commonly used as ground reference to evaluate radar rainfall products. However, the measurements provided by these instruments are also susceptible to errors (Ciach 2003; Frasson et al. 2011). Some of the networks do not include a data quality or missing data flag, which prevents the identification of periods of no rain or missing data or periods of gauge malfunctioning. To enhance the data reliability, we established procedures to quality control the rain gauge data and eliminated sites that did not pass the quality control. The goal of the quality control was to identify missing data or periods when gauges and disdrometers malfunctioned, especially for the networks that do not include QC flags. Quality-control procedures were based on the comparison of the gauge data with Stage IV rainfall products. We exclude from our analyses gauges that present total accumulation of less than 20% of the total accumulation obtained with Stage IV. This procedure eliminates data from sites in which rain gauges were malfunctioning (e.g., clogged) or were fully or partially inoperative over the study period. We then checked if temporal correlation, at an hourly time scale, with Stage IV and IFC-SP is larger than 0.5. The goal of this procedure is to remove gauges with unsynchronized clocks in the gauge platform that introduce shifts in the time series. A total of 179 sites (42 from AWOS, 15 from ASOS, 10 from ISUAG, 65 from IFC–NASA, 13 from NASA disdrometers, and 34 from NCEP) passed the quality control and were used in the analyses. In Fig. 1, we present the locations of the gauges and disdrometers used in the study.
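The two quality-control checks can be sketched as follows for a single site, given hourly series that are already aligned in time. The sketch assumes that a gauge must exceed the 0.5 correlation threshold against both Stage IV and IFC-SP; the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def gauge_passes_qc(gauge, stage4, ifc_sp, min_ratio=0.2, min_corr=0.5):
    """Apply the accumulation-ratio check, then the hourly correlation check."""
    total_ref = sum(stage4)
    # Check 1: reject gauges whose total is below 20% of the Stage IV total.
    if total_ref <= 0 or sum(gauge) < min_ratio * total_ref:
        return False
    # Check 2: reject gauges poorly correlated with the radar products,
    # which typically indicates a clock shift in the gauge platform.
    return (pearson(gauge, stage4) > min_corr
            and pearson(gauge, ifc_sp) > min_corr)
```

A gauge whose series is shifted by one hour relative to the radar products can pass the accumulation check and still be rejected by the correlation check.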
1) Radar and rain gauge comparison
To evaluate dual-polarization radar rainfall estimates, we compared them to rain gauge data and two single-polarization radar QPEs. Rainfall estimates are characterized in terms of relative bias (dimensionless), correlation (dimensionless), and mean-square error (MSE; mm2 h−2), where R denotes radar data, G is gauge data, Cov is covariance, and σ is standard deviation. To better understand the contribution of bias and random errors, we decompose the mean-square error following the methodology proposed by Nelson et al. (2010), where
$$\mathrm{MSE} = (\bar{R} - \bar{G})^2 + (\sigma_R - \sigma_G)^2 + 2\sigma_R\sigma_G(1 - \rho),$$
with overbars denoting means and ρ the correlation between R and G.
In this equation, the first two terms represent bias in the mean and in the univariate variability of the estimated precipitation. The third term measures the strength of covariation between radar and gauge rain rates. Summary statistics are calculated conditioned on the rain gauge value based on different thresholds rt. This approach allows us to better understand how uncertainties change with rain-rate intensity. To avoid the inclusion of periods for which the rain gauge was not operational, which cannot be easily detected since some networks do not present flags for missing data, we apply the minimum threshold of 0.01 mm h−1. Thresholds vary from 0.01 to 10 mm h−1.
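As a check on the three-term decomposition, the sketch below computes the terms from paired hourly values and confirms that they sum to the MSE. It assumes the standard form with terms (mean R − mean G)², (σ_R − σ_G)², and 2σ_Rσ_G(1 − ρ), population (not sample) statistics, and conditioning on gauge values at or above the threshold rt; the function name is hypothetical.

```python
import math

def mse_decomposition(radar, gauge, rt=0.01):
    """Decompose MSE into mean bias, variability bias, and covariation terms.

    Only pairs with gauge >= rt are used (assumed form of the conditioning).
    Population statistics make the three terms sum exactly to the MSE.
    """
    pairs = [(r, g) for r, g in zip(radar, gauge) if g >= rt]
    n = len(pairs)
    mean_r = sum(r for r, _ in pairs) / n
    mean_g = sum(g for _, g in pairs) / n
    sd_r = math.sqrt(sum((r - mean_r) ** 2 for r, _ in pairs) / n)
    sd_g = math.sqrt(sum((g - mean_g) ** 2 for _, g in pairs) / n)
    cov = sum((r - mean_r) * (g - mean_g) for r, g in pairs) / n
    return {
        "n": n,
        "mse": sum((r - g) ** 2 for r, g in pairs) / n,
        "bias_mean": (mean_r - mean_g) ** 2,        # term 1
        "bias_variability": (sd_r - sd_g) ** 2,     # term 2
        # Term 3 equals 2*sd_r*sd_g*(1 - rho); written with cov to avoid
        # dividing by zero when one series is constant.
        "covariation": 2.0 * (sd_r * sd_g - cov),
    }
```

The identity follows from writing the MSE as the squared mean difference plus the variance of R − G, so any mismatch between the sum of the terms and the MSE signals inconsistent use of sample and population statistics.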
We investigate the ability of radar products to correctly detect rainfall by estimating the probability of successful detection POD, the probability of false detection POFD, and the relative biases due to missed rain MB and due to false detection FB.
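One common way to compute these detection statistics from paired hourly values is sketched below. The definitions are assumptions of this sketch rather than the formulas used in the study: an hour counts as rainy when the value is at or above the threshold rt for both radar and gauge, MB is the negative of the missed gauge rain divided by the total gauge rain, and FB is the falsely detected radar rain divided by the total gauge rain.

```python
def _ratio(num, den):
    """Return num/den, or None when the denominator is zero."""
    return num / den if den else None

def detection_stats(radar, gauge, rt):
    """Contingency-based detection statistics for one rain-rate threshold."""
    hits = misses = false_alarms = correct_neg = 0
    missed_rain = 0.0
    false_rain = 0.0
    for r, g in zip(radar, gauge):
        radar_wet = r >= rt
        gauge_wet = g >= rt
        if gauge_wet and radar_wet:
            hits += 1
        elif gauge_wet:
            misses += 1
            missed_rain += g      # rain seen by the gauge only
        elif radar_wet:
            false_alarms += 1
            false_rain += r       # rain seen by the radar only
        else:
            correct_neg += 1
    total_gauge = sum(gauge)
    return {
        "POD": _ratio(hits, hits + misses),
        "POFD": _ratio(false_alarms, false_alarms + correct_neg),
        "MB": _ratio(-missed_rain, total_gauge),   # negative by construction
        "FB": _ratio(false_rain, total_gauge),     # positive by construction
    }
```

Under these definitions MB and FB have opposite signs, so a product can show a small overall bias while missing and falsely detecting substantial amounts of rain.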
2) Multiscale rainfall analyses
One of the goals of the IFloodS campaign is to understand how rainfall estimates and their uncertainties affect the outcome of flood forecast models. A first step in achieving this goal is to understand how rainfall uncertainties propagate through the river network. Although radar versus rain gauge comparisons are commonly applied to assess the accuracy of radar rainfall estimates, this type of analysis does not address how the spatial distribution of QPE uncertainties affect the estimation of important hydrological variables across multiple scales. To better understand the implication of radar error on hydrological studies, we apply a diagnostic framework. We refer to the method as “diagnostic” because our goal is not to specify which product provides the best estimates for each event or basin region, but to evaluate uncertainties based on the comparison of different state-of-the-art QPE estimates and how they differ. This approach allows us to spatially describe the scale and pattern of QPE uncertainty for different rainfall events. This information can guide the use of QPE for hydrological studies (e.g., water balance) or as input to hydrological forecasting models.
We evaluate uncertainties in the estimation of two hydrologically relevant variables that are the main input to the hydrological models: 1) storm total ST and 2) maximum rain rate MaxRR, across multiple spatial scales (from the hillslope to the catchment scale). Storm total is one of the major factors that control flood generation and peak discharge at medium to large scales [see Ayalew et al. (2014) for scaling-based insights], while maximum rain rate controls flood generation at small scales with fast response.
We perform the analyses for the Turkey River basin (4370 km2). The basin boundaries are shown in Fig. 1, and daily rainfall and streamflow time series for the Turkey River at Garber stream gauging station are shown in Fig. 2. The method to estimate multiscale storm totals and maximum rain rates consists of the following steps:
extraction of the basin river network using 30-m DEM (USGS) and CUENCAS, a geographical information system developed for river network analyses (Mantilla and Gupta 2005) and hydrological simulation (Cunha et al. 2011, 2012);
estimation of hourly rain-rate time series for each hillslope using hillslope mask and different rainfall datasets (Stage IV, IFC-SP, and Merged-DP);
estimation of hourly mean areal rain-rate time series for each river link by averaging the rainfall for all upstream links;
estimation of storm total and maximum hourly rain rate for each link in the network; and
calculation of the relative difference between storm total and maximum rain-rate estimate based on different products (IFC-SP and Merged-DP) using Stage IV rainfall as reference.
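The aggregation steps above can be sketched on a toy network. The sketch assumes that each link has one hillslope with a known area, that upstream averaging is area weighted, and that the network is a tree described by a link-to-downstream-link mapping; none of these data structures come from CUENCAS, and the function names are hypothetical. The recursion is adequate for small examples only.

```python
def upstream_mean_rain(downstream, area_km2, hillslope_rain):
    """Area-weighted mean areal rain-rate series for every link.

    downstream: link id -> downstream link id (None at the outlet).
    area_km2: link id -> hillslope area.
    hillslope_rain: link id -> hourly rain-rate series (mm/h).
    """
    children = {k: [] for k in downstream}
    for link, down in downstream.items():
        if down is not None:
            children[down].append(link)
    result = {}

    def visit(link):
        # Returns the upstream area and the area-weighted rain per hour.
        area = area_km2[link]
        volume = [area * x for x in hillslope_rain[link]]
        for child in children[link]:
            child_area, child_volume = visit(child)
            area += child_area
            volume = [a + b for a, b in zip(volume, child_volume)]
        result[link] = [v / area for v in volume]
        return area, volume

    for link, down in downstream.items():
        if down is None:
            visit(link)
    return result

def storm_metrics(hourly_rates):
    """Storm total (mm, hourly steps) and maximum hourly rain rate (mm/h)."""
    return sum(hourly_rates), max(hourly_rates)

def relative_difference(value, reference):
    """Relative difference of a product with respect to the reference."""
    return (value - reference) / reference
```

Applying `storm_metrics` to each link for a given product and for Stage IV, and then `relative_difference` to the resulting pairs, yields one value per link that can be examined against upstream area.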
We use Stage IV as reference, since it is bias corrected and it presents the best results in comparison with rain gauge data. However, this dataset has the coarsest spatial resolution and might not capture rainfall spatial variability at small scales.
3. Results
a. Summary of rainfall events
Iowa experienced an extremely wet summer in 2013, and the IFloodS campaign collected an invaluable database to support GPM validation and flood studies (Petersen and Krajewski 2013). In Fig. 2, we present streamflow measured at Turkey River at the Garber USGS streamflow gauge station (USGS 05412500) with drainage area of 4001 km2 (see basin boundaries in Fig. 1) and the daily mean areal precipitation for the period from 1 April to 1 July 2013. A total of 685 mm of rainfall was observed in this basin during that period (based on Stage IV data), and rain (>1 mm day−1) was observed for approximately 55% of days. In this study we evaluate four representative mixed snow and rainfall events, all with 5 days duration: 1) from 4 to 11 April (accumulation of 59.5 mm), 2) from 29 April to 4 May (70.3 mm), 3) from 26 to 31 May (97.6 mm), and 4) from 20 to 25 June (82.0 mm). Even though the IFloodS campaign focused on the period from 1 May to 15 June (Petersen and Krajewski 2013), we extend our period of analyses to include significant events that occurred in early April and in late June (event 4) for which flood warnings were issued by the NWS for the area.
b. Radar–rain gauge analyses
1) Radar versus gauge comparison
In Fig. 3, we show scatterplots of radar products versus surface reference rainfall. At the hourly scale, we observe significant scatter between radar and rain gauge for all the products. To explore how uncertainties change with rain rate, we include a power-law relationship to parameterize error as a function of rain rate (systematic conditional bias). We calculated the power-law parameters (exponent and coefficient) using the least squares nonlinear fitting technique, and we include the parameters in Fig. 3. To avoid bias toward small rain-rate values, we estimated the equation for rain rates larger than 1.0 mm h−1. The power-law exponent describes the degree of nonlinearity in the dependency of error on rain amount. Power-law relationships are commonly used to represent radar rainfall systematic conditional bias (Ciach et al. 2007), but Fig. 3 shows that power-law relationships do not always provide a good representation of errors, especially for large rain values that are infrequent and thus result in small sample size. To identify the cases and regions for which the power-law relationship describes the systematic conditional bias well, we also included a regression line (black solid line) estimate based on the Nadaraya–Watson nonparametric kernel regression estimator. Since this method performs a local fit, results are not affected by the higher frequency of small precipitation values. Note that, on average, all products have the tendency to underestimate small, frequent rain rates and overestimate large, infrequent rain rates when compared to the surface reference.
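The two conditional-bias curves can be sketched as follows. The Nadaraya–Watson estimator is shown with a Gaussian kernel and a user-chosen bandwidth, both of which are assumptions since the text does not specify them. For the power law, the sketch swaps in a linear regression in log–log space, which is simpler than the nonlinear least squares fit described in the text and generally gives somewhat different parameters; it also assumes the form R = a G^b with gauge rain rate as the predictor.

```python
import math

def nadaraya_watson(x, y, x0, bandwidth):
    """Kernel-weighted local average of y at x0 (Gaussian kernel)."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

def fit_power_law_loglog(gauge, radar, min_rate=1.0):
    """Fit radar = a * gauge**b by ordinary least squares on logarithms.

    Only pairs with gauge > min_rate (mm/h) and positive radar values are
    used. This is a stand-in for a nonlinear least squares fit.
    """
    pts = [(math.log(g), math.log(r))
           for g, r in zip(gauge, radar) if g > min_rate and r > 0]
    n = len(pts)
    mean_x = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in pts)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b
```

Evaluating `nadaraya_watson` on a grid of gauge rain rates and overlaying `a * g**b` shows where the parametric curve departs from the local fit, which is the comparison the text uses to judge the power law.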
Figure 4 presents correlation, MSE, bias in the mean (MSE term 1), and bias in the variance (MSE term 2) for all radar rainfall products and multiple rain-rate thresholds. DP products (single radars and Merged-DP) present the highest correlation among all products and thresholds, while IFC-SP presents the lowest correlation. The IFC-SP product presents the highest spatial resolution, and as discussed by Miller et al. (2013), the results for radar–gauge comparison can be affected by the increased sampling error/noise in smaller versus larger sample bins. Among DP products, the correlation is higher for the Merged-DP than for the single-radar products. Stage IV presents similar correlation to Merged-DP for small thresholds (rt < 1 mm h−1), but correlation decreases as rain-rate thresholds increase.
DP products present larger MSE than Stage IV and IFC-SP, with the exception of KDVN, which presents smaller MSE than IFC-SP for large thresholds. Bias in the mean is negligible for DP products for all thresholds, while bias in the mean for IFC-SP and Stage IV is noticeable for thresholds larger than 5 mm h−1. The low bias for Stage IV product for small threshold values is expected since this dataset has been bias corrected using gauge data and manually quality controlled at the RFCs. Some of the gauge data used to correct the Stage IV product is also used in this study (NCEP). On the other hand, DP products present high bias in the variance, while bias in the variance for SP products is negligible. Even though IFC-SP presented the worst performance in terms of correlation, it performs similarly to Stage IV, a bias-corrected product, in terms of bias in the mean and in the variance.
We also observe large discrepancies among the three single radars, with KDVN showing the best performance in terms of MSE and bias and KARX the worst. The three radars evaluated in this study cover different spatial domains, and the difference in performance might be due to differences in the characteristics of the rainfall observed by each radar or the location of the radar in relation to the gauges included in the study. Note that data for KARX are evaluated at ranges larger than 75 km. In general, the merging procedure improves the correlation of DP products in relation to single radars. The increase in correlation is expected since the merging procedure assigns higher weight to close-range values, mitigating uncertainties caused by range effects (e.g., brightband contamination and heterogeneous vertical profile), and because of the multi-QPE averaging effect. However, merging data from different radars did not result in lower MSE. Merged-DP and KDMX present similar MSEs. KDMX is located in the middle of the domain and has a major impact on the generation of the merged product using distance weighting.
We also evaluate radar skill in detecting rain and no-rain periods based on hourly data. In Fig. 5, we present POD, POFD, the relative missed rain bias, and the relative false rain bias for all products and multiple thresholds. Values of POD and POFD range from 0 to 1, where 1 means a perfect POD and 0 means a perfect POFD. Low POD or high POFD might arise from 1) problems with the gauge rainfall estimates, such as errors caused by a tipping bucket’s inability to detect the beginning or the end of the rain or gauge malfunction (clogging); or 2) problems with the radar rainfall estimates (difficulties in identifying rain vs no-rain periods). The comparative evaluation of different radar rainfall products allows us to isolate errors that are due to gauge limitations from those due to the radar rainfall algorithm. To guarantee that we do not count missing rain gauge periods as no-rain periods, we only included gauges that present flags for missing data (ASOS and AWOS).
IFC-SP presents a low probability of rainfall detection for small rain rates, but POD becomes similar to the other products for thresholds equal to or larger than 3 mm h−1. This is partially explained by the deliberate choice of a fairly high rain intensity threshold (1.59 mm h−1), justified by the purpose of the IFC to provide flood-relevant information to the Iowa public. The threshold is applied to suppress the strong ground clutter effect that is often observed around radars. Another reason for the low detection rate of IFC-SP is the use of a fixed standard R–Z equation (Z = 300R^1.4) that is more appropriate for convective-type storms (see, e.g., Crosson et al. 1996). This standard equation tends to produce smaller rainfall values at low reflectivity ranges identified as stratiform or snow/mixed type precipitation than other forms of the R–Z equation [for more discussion, see Seo et al. (2014)]. Stage IV and DP products present similar POD for all thresholds. On the other hand, IFC-SP presents a better POFD than Stage IV and DP products for thresholds lower than 1 mm h−1. In Fig. 5, we also present the relative contribution of missed and false rainfall detection in the overall bias. Note that these biases have opposite signs and partly offset each other when evaluating average bias. For a small threshold, POD for IFC-SP results in a relative bias of −11% and POFD results in a relative positive bias of 3%. All products present similar statistics for thresholds larger than 3 mm h−1.
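The effect of the fixed R–Z relationship and the minimum rain-rate threshold can be illustrated with a short sketch. It inverts Z = 300R^1.4 for reflectivity given in dBZ and assumes that rates below 1.59 mm h−1 are set to zero; how the IFC system actually treats sub-threshold values is not stated in the text, and the function name is hypothetical.

```python
def rain_rate_from_dbz(dbz, a=300.0, b=1.4, min_rate=1.59):
    """Convert reflectivity (dBZ) to rain rate (mm/h) using Z = a * R**b.

    Rates below min_rate are returned as 0.0 (assumed thresholding rule).
    """
    z_linear = 10.0 ** (dbz / 10.0)          # reflectivity in mm^6 m^-3
    rate = (z_linear / a) ** (1.0 / b)
    return rate if rate >= min_rate else 0.0
```

With these parameters a 40-dBZ echo maps to roughly 12 mm h−1, whereas weak echoes that would correspond to about 1 mm h−1 are removed entirely, which is consistent with the low POD reported for IFC-SP at small thresholds.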
DP radars have the potential to reduce uncertainties caused by the variability of the drop size distribution (DSD) since they are able to characterize the size, shape, and form of the hydrometeors. However, some of the known radar SP QPE sources of uncertainties (e.g., nonuniform beam filling and wet radome) also affect DP estimates. As shown in Fig. 4, the current version of DP products presents higher bias than SP products. In the existing system, an algorithm based on fuzzy logic (Park et al. 2009) is used to identify the most likely hydrometeor type, and the hydrometeor and melting-layer position define the equation used to estimate rainfall intensity.
To better understand DP products and explain the features observed in Figs. 3 and 4, we evaluate the accuracy of DP rain rate as a function of hydrometeor type. In Fig. 6, we present correlation, MSE, and total bias (sum of bias in the mean and in the variance) for different rain-rate thresholds. Rain-rate values are separated according to the following hydrometeor type identified by DP radars: light–moderate rain, graupel, dry snow, and wet snow. Light–moderate rain occurs below the melting layer, graupel and dry snow above the melting layer, and wet snow in the mixed-phase region. For single-radar DP, we combined data from three different radars: KDMX, KDVN, and KARX. We did not include plots for the remaining hydrometeors (ice crystal, heavy rain, and big drops and ice crystals) since they do not occur frequently and the sample size is small.
Figure 6 shows that single-radar DP products present a higher correlation than IFC-SP and Stage IV for light–moderate rain and wet snow for all thresholds, and a similar correlation to Stage IV for graupel. The high correlation of the single-radar DP product demonstrates the potential of DP QPE. However, in terms of MSE, SP products (IFC-SP and Stage IV) perform better than single-radar and Merged-DP products, and Stage IV performs better than IFC-SP. A significant part of MSE for DP products is due to bias, pointing to issues in the equations used to convert DP measurements into rainfall. The good performance of Stage IV compared to IFC-SP and Merged-DP in terms of MSE and total bias demonstrates the effectiveness of the bias correction and the manual QC procedures. However, the manual QC procedure constrains the use of this dataset in real time.
For light–moderate rain, SP radars use the conventional R–Z relationship, while DP radar estimates are based on reflectivity and differential reflectivity [see Istok et al. (2009) for details]. The parameters of the R(Z, Zdr) equations were estimated by Ryzhkov et al. (2005b) based on data collected in Oklahoma. Differential reflectivity is a good measure of the median drop diameter. Giangrande and Ryzhkov (2008) showed that rainfall uncertainties that arise from DSD variability could be minimized by the use of reflectivity and differential reflectivity to estimate rainfall. However, this equation is not immune to hail contamination and is not effective in situations of melting-layer contamination and precipitation overshooting (e.g., Ryzhkov and Zrnić 1996; Ryzhkov et al. 2005b). Differential reflectivity is also sensitive to radar calibration (vertical and horizontal), attenuation, and depolarization (Zrnić et al. 2010). For light–moderate rain, DP products present higher correlation than SP products for all thresholds, but DP products present significant total bias. The IFC-SP product presents similar correlation and MSE to the bias-corrected Stage IV product, highlighting the good performance of this product for this hydrometeor type. Note that Merged-DP does not present significant improvement for this hydrometeor type, since it usually occurs in a short range from the radar.
The relationships between DP measurements and precipitation for hydrometeor types that occur in the mixed-phase region or above the melting layer (graupel, dry snow, and wet snow) are still not well known. For these hydrometeors, there is no consensus in terms of which DP measurements should be used and what is the expected gain from using them (Giangrande and Ryzhkov 2008; Ryzhkov and Zrnić 1996). For some operational DP algorithms, for example, the Colorado State University–Hydrometeor Identification Rainfall Optimization (CSU-HIDRO; Cifelli et al. 2011), rainfall is not estimated when dry snow, graupel, hail, or wet snow with specific differential phase Kdp larger than 0.3° km−1 is detected. Specific differential phase is the range derivative of differential propagation phase shift (Chandrasekar et al. 1990; Istok et al. 2009). This threshold is used to identify hail. For these situations, Giangrande and Ryzhkov (2008) recommended the use of the conventional R(Z) relationship, scaled by a constant to correct for expected bias. In Fig. 6, we present results for graupel, dry snow, and wet snow. For graupel, the conventional R(Z) relationship is scaled by a factor of 0.8 to minimize bias. The QPE for graupel presents the lowest correlation and the highest MSE for all radar products. For this hydrometeor, Merged-DP presents significant bias for all thresholds, while bias for Stage IV and IFC-SP is significant for thresholds larger than 7 mm h−1. Note that bias for the Merged-DP product is higher than the bias for the single-radar product.
For dry snow, two formulations are considered depending on the melting-layer position. When dry snow is identified above the melting layer, the conventional R(Z) equation is used and the rainfall is multiplied by 2.8 [R = 2.8R(Z)]. When dry snow is identified below the melting layer, the conventional R(Z) relationship is used without scaling. A scaling of the R(Z) relationship is also applied for wet snow [R = 0.6R(Z)]. While the high correlation demonstrates the potential of DP radars, the high bias for graupel, dry snow, and wet snow demonstrates that a simple scaling of the R(Z) relationship does not improve rainfall estimation for these hydrometeor types. Future research should focus on determining better relationships for these types of hydrometeors.
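The scaled R(Z) estimates for graupel, wet snow, and dry snow can be sketched as a lookup of the factors quoted in the text (0.8, 0.6, and 2.8 or 1). This is an illustration and not the operational algorithm: the function names are hypothetical, the conventional R(Z) is assumed to be Z = 300R^1.4, and the flag `dry_snow_scaled` stands in for the melting-layer test, which the caller is assumed to have evaluated. Light–moderate rain is omitted because the R(Z, Zdr) coefficients are not given here.

```python
def conventional_rain_rate(dbz, a=300.0, b=1.4):
    """Conventional R(Z) in mm/h, assuming Z = a * R**b (illustrative coefficients)."""
    return (10.0 ** (dbz / 10.0) / a) ** (1.0 / b)

def scaled_rain_rate(dbz, hydrometeor, dry_snow_scaled=False):
    """Apply the hydrometeor-dependent scaling of R(Z) described in the text.

    `dry_snow_scaled` selects the 2.8 factor for dry snow; the decision
    depends on the melting-layer position and is left to the caller.
    """
    if hydrometeor == "graupel":
        factor = 0.8
    elif hydrometeor == "wet_snow":
        factor = 0.6
    elif hydrometeor == "dry_snow":
        factor = 2.8 if dry_snow_scaled else 1.0
    else:
        raise ValueError(f"no scaled R(Z) rule for {hydrometeor!r}")
    return factor * conventional_rain_rate(dbz)

# The same reflectivity yields very different rates across adjacent classes.
for label, kwargs in (("wet_snow", {}), ("graupel", {}),
                      ("dry_snow", {"dry_snow_scaled": True})):
    print(label, round(scaled_rain_rate(30.0, label, **kwargs), 2))
```

Because the factors differ by up to 2.8/0.6 (about 4.7), a change of class between neighboring range gates produces a step in rain rate even when reflectivity is unchanged.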
For all hydrometeor types, correlation for Merged-DP is higher than for single-radar DP. Correlation improved significantly for dry snow and graupel. MSE improved significantly for dry snow and slightly for wet snow. This demonstrates that merging data from different radars using inverse distance weights is an effective way to mitigate range-dependent errors in radar rainfall and can improve the representation of spatial and temporal patterns in rainfall. However, this procedure is not successful in mitigating bias. Both DP single- and merged-radar product estimates are biased for all hydrometeor types. We further explore this aspect in the following section.
2) Range effects and hydrometeor classification
The results presented in Fig. 6 show that DP QPE uncertainties vary with hydrometeor type. In this section, we demonstrate how errors change with radar range. In Figs. 7a, 8a, and 9a, we present hydrometeor frequency for KDMX-DP, KDVN-DP, and KARX-DP, respectively. Light–moderate rain dominates ranges up to 100 km, while dry snow dominates ranges greater than 175 km. In the range of 100–175 km we see a transition, where both light–moderate rain and dry snow occur at a similar frequency. The observed range-dependent changes in hydrometeor phase are explained by the radar beam sampling altitude at a given distance from the radar and the vertical characteristics of the melting layer. For KDMX and KARX, we observe dry snow at close range, due to cold events that occurred in April. DP hydrometeor type classification algorithms successfully identified the presence of snow at closer ranges for these events. KDVN did not detect dry snow at closer ranges. The remaining hydrometeor types were not detected as often. In Figs. 7b–d, 8b–d, and 9b–d, we present radar rainfall relative bias (zero corresponds to no bias), correlation, and MSE as a function of range for KDMX, KDVN, and KARX. Since in this study we include gauges located in the state of Iowa, we do not have gauges located near the KARX radar, which is located in Wisconsin. For comparison, we also present statistics for Stage IV, IFC-SP, and Merged-DP. The range for these products is calculated based on the location of the radar shown in Fig. 1. Since these rainfall products are based on the data from multiple radars, the adopted range is artificial. However, this range definition allows us to directly compare the merged- and single-radar products.
We calculate the statistics based on the period from 1 April to 1 June. To avoid problems due to the inability of gauges to detect the beginning or the end of the rain period and the high uncertainty for small rain-rate values, we calculate these statistics using only hours with gauge rain rate G > 1 mm h−1. We included gauges from all the networks, and we represent different networks by different colors. To facilitate visual inspection, we included moving-average lines in all the plots.
For Stage IV and IFC-SP, relative bias is close to zero for all ranges. For single-radar DP products, we see a clear pattern in relative bias: small underestimation for ranges up to 50 km and overestimation for ranges of 50–230 km. Maximum relative bias is observed at ranges close to 150 km, which is the location where light–moderate rain and dry snow occur with similar frequency. In general, single-radar DP bias increases as the frequency of dry snow increases, demonstrating that the QPE equation for dry snow has significant bias. Note that overall Merged-DP exhibits statistics similar to single radars up to 125-km range, and for larger ranges the bias decreases considerably because of the inclusion of data from other radars. However, Merged-DP bias is still larger than bias for Stage IV and IFC-SP.
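The link between range and hydrometeor type follows from the increase of beam altitude with distance. As a rough illustration, the beam-center height can be sketched with the standard 4/3 effective Earth radius model. The 0.5° elevation angle, the neglect of antenna height, and the standard-refraction assumption are ours; the hybrid scan used operationally may combine several elevation angles.

```python
import math

def beam_height_km(range_km, elevation_deg=0.5, antenna_height_km=0.0,
                   earth_radius_km=6371.0, k=4.0 / 3.0):
    """Height of the radar beam center above the antenna level, in km.

    Uses the effective Earth radius model (factor k) for standard
    atmospheric refraction.
    """
    effective_radius = k * earth_radius_km
    theta = math.radians(elevation_deg)
    return (math.sqrt(range_km ** 2 + effective_radius ** 2
                      + 2.0 * range_km * effective_radius * math.sin(theta))
            - effective_radius + antenna_height_km)

for r in (50.0, 100.0, 150.0, 200.0):
    print(f"{r:5.0f} km  ->  {beam_height_km(r):.2f} km")
```

Under these assumptions the beam center reaches roughly 2.6 km at a range of 150 km, so a melting layer near that altitude would be sampled around the range where the largest relative bias is observed.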
In terms of correlation, on average, Stage IV and Merged-DP present a higher correlation than IFC-SP. The correlation for single-radar DP products tends to decrease for ranges larger than approximately 100 km. However, KDMX-DP and KDVN-DP show a very low correlation with some gauges located at relatively close range (<75 km). A closer look at the time series of those gauges reveals that the low correlation is caused by a few periods for which the DP radar dramatically overestimates rainfall compared to gauges. For these ranges, rainfall is usually identified as light–moderate rain, and errors in QPE for this hydrometeor are likely to be caused by the use of differential reflectivity as one of the parameters to estimate rainfall. As mentioned before, differential reflectivity is sensitive to attenuation, miscalibration, and depolarization (Zrnić et al. 2010).
Figure 10 shows an example of DP QPE errors for light–moderate rain. Stage IV gives the best match with the points for which we have gauge data. Merged-DP considerably overestimates rainfall in the area that is dominated by KARX. IFC-SP rainfall estimates match gauge values at ranges close to that of KDMX (around 100 km). We also present the hybrid hydrometeor classification represented by the colors in the background of the figure. KDMX and KARX overestimate rainfall where light–moderate rain occurred. The problem is more pronounced for the KARX-DP products, where rainfall is overestimated by more than 300% at a range of 70–100 km. During the same period, an XPOL radar was operating in range–height indicator (RHI) mode, azimuthally aligned with KARX. Based on observations obtained by this radar (lower correlation and change in differential reflectivity), the melting layer is located at an altitude of approximately 2.5 km. We include the height of the radar beam on the left y axis for reference. For altitudes above 2.5 km, hydrometeors are classified as dry snow or graupel, which is consistent with the location of the melting layer identified by the XPOL radar. However, in the transition zone (height between 1 and 2.5 km), KDMX observed wet snow and graupel, while KARX mainly observed graupel, with wet snow detected only in a few spots.
A relevant aspect shown in Fig. 10 is the irregular and noisy transition between different hydrometeor classes. For example, between ranges of 140 and 208 km, the KDMX-DP shows sharp and unrealistic transitions among light–moderate rain, wet snow, graupel, and finally into dry snow. Since different equations are used for different hydrometeor types, these abrupt changes introduce abrupt changes in the error structure of QPE. Highly variable errors in space and time are not easily corrected using gauge data, unless data from an extremely dense gauge network are available. These analyses illustrate that errors in DP QPE might also arise from uncertainties in defining hydrometeor types.
3) Event-based comparison
In this section, we evaluate the storm total rainfall μ spatial distribution for four major rainfall events based on Stage IV, IFC-SP, and Merged-DP (Fig. 11). The Stage IV data are used as reference for comparison since they are bias corrected, have manual QC, and presented the most consistent performance in the comparison with gauge data for all four events. We focus on evaluating the difference between the products instead of absolute values, since all products contain errors. Overall, the rainfall spatial patterns of the three products are quite similar, but the IFC-SP and Merged-DP show some over- and underestimation tendency compared to the Stage IV. The IFC-SP captures the rainfall spatial structure for events 3 and 4, which are characterized by a mesoscale convective system. However, the IFC-SP shows relatively inaccurate estimates for events 1 and 2. These are identified as cold cases associated with snow/mixed precipitation. As discussed earlier in section 3b(1), the R–Z conversion equation in the IFC-SP does not sufficiently represent these snow and stratiform cases and produces very low rainfall rate values. The observed difference between the Stage IV and IFC-SP for event 2 (Fig. 11) indicates that Stage IV is corrected for the area of snow/mixed rain that developed in the south of the domain, passed over the NPOL area (in the middle of the Cedar and Iowa River basins) and proceeded toward the north. This demonstrates the difficulty of detecting/estimating snow/mixed cases using a single R–Z relationship in the SP algorithm and the necessity of applying DP algorithms. Significant overestimation by the Merged-DP was observed for events 2 and 4. Notably, a border that indicates a radar calibration issue between the KDMX and KDVN radar is detected.
In Fig. 12, we present statistics for the gauge versus radar comparisons for the four events. This figure shows striking differences in the performance of different products depending on the characteristics of the event. DP products present the highest correlation for all events and the highest MSE and bias for events 1, 3, and 4, while IFC-SP presents the highest MSE and bias for event 2. However, IFC-SP presents similar performance to Stage IV for event 3 and better performance for event 4.
4) Radar versus radar: Implications for hydrological analyses across scales
Radar rainfall products are often evaluated at the point scale using radar–gauge comparison. These types of analyses that focus on QPE uncertainties at the local scale are important to highlight data and instrument limitations and to guide required improvements. However, for certain hydrological applications, rainfall data are averaged over multiple spatial scales; therefore, it is essential to understand how rainfall uncertainties change with scale. Previous studies have shown that small-scale random rainfall uncertainties are filtered out by the aggregative effect of the river network (see, e.g., Carpenter and Georgakakos 2004; Mandapaka et al. 2009; Seo and Krajewski 2010; Cunha et al. 2013). In practice, that means that rainfall products, even when containing random errors at the point scale, can still be adequate for medium- to large-scale hydrological applications. Consequently, hydrologists should focus on evaluating rainfall products at relevant scales (Berne and Krajewski 2013).
In this section, we investigate how radar rainfall uncertainties affect the estimation of flood-relevant hydrological variables across multiple spatial scales. Multiscale radar rainfall evaluations cannot be performed in an absolute sense, since ground-based reference rainfall is not available for sufficiently large areas (Krajewski and Smith 2002; Villarini and Krajewski 2010). However, this type of analysis can be performed in a diagnostic/relative sense by evaluating the relative differences between estimates provided by multiple rainfall products. Our analyses are based on the comparison of multiscale storm total (ST) and maximum rainfall rate (MaxRR) estimates based on different rainfall products. We chose these variables since storm total dominates flood generation for medium to large basins, while maximum rainfall rate shapes floods in small basins with fast response. In the methodology section, we described the procedures used to estimate storm total and maximum rainfall rate for every river link, across multiple spatial scales. It is important to note that the spatial resolutions of the rainfall products investigated in this study differ: 4 × 4 km2 for Stage IV, 1 × 1 km2 for Merged-DP, and 0.5 × 0.5 km2 for IFC-SP. These differences in resolution affect the estimation of storm total and maximum rainfall rate for small watersheds. However, the effect of resolution decreases as basin drainage area increases, especially for storm totals.
In Fig. 11, we presented storm total maps based on Stage IV, IFC-SP, and Merged-DP for the four rainfall events. Based on Stage IV, events 1 (μ = 60 mm) and 2 (μ = 68 mm) present lower storm total basin averages than events 3 (μ = 79 mm) and 4 (μ = 92 mm) over the Turkey River basin. In Figs. 13 and 14, we use the river network to characterize storm total and maximum rainfall rate relative differences for the Turkey River basin for scales varying from 0.01 (hillslope) to 4370 km2 (basin outlet). To facilitate the visual evaluation of multiscale rainfall, we plot it using the river network as reference. The hillslope areas are represented by a gray background. In Fig. 13, we show the relative difference between the storm total estimated from the IFC-SP and Merged-DP rainfall products and that estimated from Stage IV. These maps allow us to identify regions and scales for which IFC-SP and Merged-DP underestimate (from light to dark blue) or overestimate (from yellow to red) storm total compared to the bias-corrected product, Stage IV. IFC-SP underestimates the storm total for event 1 with an average difference in the basin μb equal to 39% and for event 2 with an average difference equal to 47%. This confirms the results presented in previous analyses. However, IFC-SP presents very similar storm totals for events 3 (μb = 1% underestimation) and 4 (μb = 9% underestimation). IFC-SP underestimation is caused by the inaccurate estimation of rainfall during snow or for events that are not well developed vertically, as previously discussed. Errors for the same product and event are not homogeneous in space. For example, for IFC-SP event 4, we see regions where rainfall is underestimated (blue, and minimum value of 36% for a link) and others where it is overestimated (yellow, and maximum of 113% for a link). However, the river network filters out these uncertainties as basin scale increases, and the differences approach zero for the main channel.
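The filtering effect of the river network can be illustrated with a toy example. The sketch below accumulates area-weighted storm totals downstream through a small tree of links and expresses the difference from a reference product as a percentage. The data layout (a `downstream` index per link, with −1 at the outlet), the function names, and the definition of relative difference as 100 × (product − reference) / reference are assumptions made for illustration, not the procedure described in the methodology section.

```python
def upstream_average(local_area, local_value, downstream):
    """Area-weighted mean of `local_value` over everything upstream of each link.

    `downstream[i]` is the index of the link that link i drains into,
    or -1 for the basin outlet.
    """
    n = len(local_area)
    area = list(local_area)
    volume = [a * v for a, v in zip(local_area, local_value)]
    pending = [0] * n  # number of upstream neighbors not yet accumulated
    for d in downstream:
        if d >= 0:
            pending[d] += 1
    ready = [i for i in range(n) if pending[i] == 0]  # headwater links
    while ready:
        i = ready.pop()
        d = downstream[i]
        if d >= 0:
            area[d] += area[i]
            volume[d] += volume[i]
            pending[d] -= 1
            if pending[d] == 0:
                ready.append(d)
    return [v / a for v, a in zip(volume, area)]

def relative_difference(product, reference):
    """Percent difference of `product` with respect to `reference`, link by link."""
    return [100.0 * (p - r) / r for p, r in zip(product, reference)]

# Two headwater links (0 and 1) drain into the outlet link (2).
areas = [1.0, 1.0, 2.0]                      # km2, hypothetical
downstream = [2, 2, -1]
reference = upstream_average(areas, [50.0, 50.0, 50.0], downstream)  # mm
product = upstream_average(areas, [75.0, 25.0, 50.0], downstream)    # mm
print(relative_difference(product, reference))
```

In this example the two headwater links differ from the reference by +50% and −50%, while the outlet differs by 0%, because the local errors have opposite signs. A difference with the same sign everywhere would not cancel in this way.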
Merged-DP does a better job in the estimation of storm total for events 1 (μb = 18% overestimation) and 2 (μb = 13% underestimation), but overestimates storm total for events 3 (μb = 47%) and 4 (μb = 53%). The absolute value of the bias is an important feature, but it is more important to evaluate how bias changes in space and across events. If bias is constant and computable, we can correct the rainfall fields. Note that although bias varies significantly in space for both datasets, Merged-DP exhibits higher variability. For example, Merged-DP storm total differences for a link for event 1 can be as high as 283% overestimation and as low as 31% underestimation.
Another important variable that controls flood generation, especially at small scales, is maximum rainfall rate (Fig. 14). We generated this figure using the same procedure adopted in Fig. 13, where we used the river network to spatially characterize multiscale rainfall estimates. When we evaluated storm total, we averaged out errors in space and in time. Maximum rainfall rate is an instantaneous value; consequently, errors are averaged out in space but not in time. Therefore, we expect higher differences and more spatial heterogeneities for maximum rainfall rate than for storm total. For IFC-SP, there is no clear tendency to over- or underestimate maximum rainfall rate. For event 2, IFC-SP underestimates maximum rainfall rate for the upper part of the basin and overestimates it in the lower part of the basin. As with storm total, IFC-SP performs better for events 3 and 4 (also see Fig. 12), but differences for a link are as large as 328% overestimation (event 2) and as low as 80% underestimation (event 1). Basin-average Merged-DP values are 38% (event 1) to 80% (event 4) higher than the ones observed by Stage IV, but locally we observed differences as high as 570% (event 2) and as low as 64% (event 3).
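The different behavior of the two variables can be shown with a short sketch, in which storm total is the time integral of an upstream-averaged rain-rate series and maximum rainfall rate is its largest value. The series, the hourly time step, and the function name are hypothetical.

```python
def storm_total_and_max_rate(rates_mm_per_h, step_hours=1.0):
    """Return (storm total in mm, maximum rain rate in mm/h) for one link."""
    total = sum(rate * step_hours for rate in rates_mm_per_h)
    return total, max(rates_mm_per_h)

# Hypothetical upstream-averaged series for a reference and a second product.
reference = [0.0, 10.0, 20.0, 10.0, 0.0]
product = [0.0, 14.0, 16.0, 10.0, 0.0]
print(storm_total_and_max_rate(reference))
print(storm_total_and_max_rate(product))
```

The errors of the second series cancel in the accumulation, so both storm totals are equal, while the maximum rate is 20% lower than the reference. Time averaging therefore hides errors that remain visible in the maximum rainfall rate.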
Results presented in Figs. 13 and 14 demonstrate that even with the advance of radar QPE methods, radar rainfall estimates still contain significant errors, and applying this information to hydrological simulation or prediction, especially at small scales, requires careful attention. The information provided in Figs. 13 and 14 can support the validation of hydrological simulations. If those datasets are used as input to hydrological models, the figures allow the identification of regions for which uncertainties in model results are likely to be caused by high uncertainties in rainfall estimation. Those maps highlight parts of the basins for which rainfall estimation is less (or more) certain and the type of uncertainty expected in each basin region (under- or overestimation) for simulations based on different rainfall products. These maps can be used to flag regions for which we should question (or trust) model results. For example, when storm total differences for the outlet of the basin are as high as 53% (Merged-DP event 4) or as low as 80% underestimation (IFC-SP event 2), we should question our ability to provide good predictions for these events based on this rainfall forcing.
The diagnostic evaluation presented in this study is only possible when we have different datasets for comparison. It is common practice in hydrology to use one of those datasets as ground reference and manipulate model parameters to compensate for errors and obtain results that match observed streamflow at the outlet of the basin. Results shown in this section highlight the importance of understanding rainfall error in a hydrological context before using a dataset as input for hydrological models. Moreover, they demonstrate the importance of ground instrumentation to evaluate and correct remote sensing rainfall measurements, since none of the radar-only products provided accurate estimates for all four events.
4. Conclusions and discussion
The NEXRAD program recently upgraded the NEXRAD observational capability with dual polarization, and NOAA has added a system to ingest, process, and distribute dual-polarization radar data and derived products, including instantaneous rain-rate fields and hybrid hydrometeor types to the operational NEXRAD product stream. In this study, we evaluate DP radar rainfall estimates provided by the current version of the WSR-88D systems using a dense rain gauge network and two single-polarization rainfall products: the National Weather Service Stage IV and the Iowa Flood Center product. We investigate DP uncertainties for different flood events, radars, and hydrometeor types and show how these uncertainties propagate across hydrologically relevant scales.
The principal conclusions of the paper are summarized as follows.
Based on the comparison of radar and rain gauge data, the current version of the NWS DP rainfall estimation algorithm produces higher correlation than the SP rainfall algorithm products, suggesting that the hydrometeor classification improves the spatial and temporal characterization of rainfall. However, NWS DP QPEs are not superior to SP QPEs in terms of bias and mean-square error.
Radar rainfall uncertainties are a function of hydrometeor type. DP rainfall estimates have higher correlation with rain gauges than Stage IV for light–moderate rainfall and wet snow, but not for graupel and dry snow. DP estimates for all hydrometeor types analyzed (light–moderate rain, graupel, dry snow, and wet snow) are considerably biased.
Bias for DP rainfall estimates is range dependent, reaching the maximum value at the range where both light–moderate rain and dry snow occur with similar frequency, demonstrating that DP measurements are affected by brightband contamination.
At close range (<75 km), DP QPE presents a large bias for a few gauges, which arises from bias in the estimation of light–moderate rain. The bias is likely caused by the sensitivity of differential reflectivity to radar calibration, attenuation in the presence of large drops, and depolarization.
The hydrometeor classification algorithm was able to identify the melting-layer position during the 1400 UTC 26 April rainfall event. However, hydrometeor types change abruptly with range among dry snow, wet snow, graupel, and light–moderate rain. Since different equations are used for different hydrometeor types, these abrupt transitions lead to sudden changes in the radar rainfall error structure. This type of error cannot be corrected using rain gauge data, unless an extremely dense rain gauge network is available.
The process of merging data from multiple radars using inverse distance weighting considerably increases the DP estimate correlation with rain gauges, but it does not have a strong effect on MSE. Merged SP products are less biased than Merged-DP products. In this study, we adopted the traditional inverse distance weighting method, but future studies should focus on developing methods that take advantage of multiple DP measurements and hydrometeor type information.
DP QPE and Stage IV products have higher rainfall detection probability than IFC-SP products. However, the low probability of detection from the IFC-SP product seems to be related to inaccurate precipitation estimates for snow at lower elevations and stratiform events. The IFC-SP algorithm uses a single R–Z relationship that works better for heavy or convective-type storms (events 3 and 4).
Storm total comparisons for four events demonstrate that DP products often overestimate rainfall when compared to Stage IV and IFC-SP products. However, Stage IV and Merged-DP products present similar rainfall spatial patterns for the four events analyzed in this study.
IFC-SP and Stage IV products present similar spatial patterns for two of the events investigated in this study (events 3 and 4). For events 1 and 2, IFC-SP underestimates rainfall. The difference in accumulation observed for IFC-SP events 1 and 2 occurs because the conventional R–Z equation does not represent snow and stratiform cases well and produces very low rain-rate values. This demonstrates the difficulty of detecting snow/mixed cases using a single R–Z relationship in the SP algorithm and the necessity of applying DP algorithms.
We investigate how rainfall uncertainties propagate across hydrologically relevant spatial scales using a diagnostic approach. The method proposed in this study provides valuable uncertainty information to support the use of rainfall datasets in hydrological studies and hydrological simulations. Based on the hydrological analyses presented in this study, we verified that the river network filters out random rainfall uncertainties, but as expected, bias remains. DP products overestimate storm total and maximum rainfall rate for all scales and events, while IFC-SP products do a good job estimating storm total and maximum rainfall rate for events 3 and 4, but fail to represent events 1 and 2 well.
No radar-only product (Merged-DP or IFC-SP) was effective in estimating rainfall intensity and spatial distribution for all four events. While DP products perform better in representing events 1 and 2, IFC-SP products perform similarly to (event 3) or better than (event 4) Stage IV. Even with the recent advances in radar QPE methods, ground data are still essential for validation and for bias removal.
In conclusion, previous studies have shown the potential of DP radars to improve QPE, but additional research is required before we understand how DP measurements can effectively be used to mitigate known SP radar QPE uncertainties for all possible hydrometeor types, ranges, and climatological conditions. In this study, we identified several points that should be further evaluated and explored. Should we continue adopting different equations for different hydrometeor types? Should we focus on identifying the DP measurements that provide better QPE for each hydrometeor type? For example, at the moment, differential reflectivity is used for QPE for light–moderate rain even though it is known to be sensitive to miscalibration, attenuation, and depolarization. Ryzhkov et al. (2014) recently proposed a methodology based on specific attenuation that is less susceptible to the variability of drop size distributions than QPE algorithms based on radar reflectivity, differential reflectivity, and specific differential phase. Once we understand the weaknesses and strengths of each DP measurement, we have to work on estimating optimal parameters for QPE equations under different hydrometeorological conditions. A second important point is how to improve hydrometeor classification algorithms to avoid abrupt or artificial transitions among different hydrometeor types. These abrupt changes in hydrometeor types introduce sudden changes in QPE error structures that cannot be easily corrected using current bias-correction methods and rain gauge data.
Funding for this work was provided by NASA Grants NNX13AD83G and NNX13AG94G, the Willis Research Network, the Iowa Flood Center, and the NOAA Cooperative Institute for Climate Science.
This article is included in the IFloodS 2013: A Field Campaign to Support the NASA-JAXA Global Precipitation Measurement Mission Special Collection.