## Abstract

A procedure for the estimation of rainfall rate, capitalizing on a radar-based raindrop size distribution (RSD) parameter retrieval and neural network (NN) inversion techniques, is validated using an extensive and quality-controlled archive. The RSD retrieval algorithm utilizes polarimetric variables measured by the polarimetric prototype of the Weather Surveillance Radar-1988 Doppler (WSR-88D) in Norman, Oklahoma (KOUN), through an ad hoc regularized neural network method. Evaluation of rainfall estimation from the NN-based method is accomplished using a large radar data and surface gauge observation dataset collected in central Oklahoma during the multiyear Joint Polarization Experiment (JPOLE) field campaign. Point estimates of hourly rainfall accumulations and instantaneous rainfall rates from NN-based and parametric polarimetric rainfall relations are compared with dense surface gauge observations. Rainfall accumulations from RSD retrieval-based methods are shown to be sensitive to the choice of a raindrop fall speed model. To minimize the impact of this choice, a new “direct” neural network approach is tested. Proposed NN-based approaches exhibit bias and root-mean-square error characteristics comparable with those obtained from parametric relations, specifically optimized for the JPOLE dataset, indicating an appealing generalization capability with respect to the climatological context. All tested polarimetric relations are shown to be sensitive to hail contamination as inferred from the results of automatic polarimetric echo classification and available storm reports.

## 1. Introduction

After decades of scientific research following the first pioneering paper (Seliga and Bringi 1976), weather radar polarimetry is increasing in popularity within operational meteorological services. Several recent studies and field campaigns have demonstrated the advantages of an operational radar upgrade to polarimetric capabilities. Benefits of weather radar polarimetry at S band include improved data quality and hydrological products, self-consistency techniques for assessment of system miscalibration, and automatic echo classification for weather and forecasting applications (e.g., Zrnić and Ryzhkov 1999; Vivekanandan et al. 1999; Straka et al. 2000; Liu and Chandrasekar 2000; Ryzhkov et al. 2005b,c; Marzano et al. 2007). Polarimetric methods to correct for the effects of attenuation in rain through the use of differential phase measurements (e.g., Testud et al. 2000; Vulpiani et al. 2008) have also renewed interest in quantitative precipitation estimation at C- and X-band frequencies.

For radar-based hydrologic products, it is known that the natural variability of the raindrop size distribution (RSD) introduces large uncertainty. Conventional reflectivity–rainfall (*Z*–*R*) estimates are prone to significant error due to microphysical processes including evaporation, coalescence, and wind/advection (e.g., Doviak and Zrnić 1993, their section 8.4). A primary justification for an operational upgrade to polarimetric capabilities is that polarimetric rainfall relations have been shown to be more robust with respect to RSD variations and the presence of hail than conventional *Z*–*R* relations (e.g., Ryzhkov et al. 2005b,c).

Several polarimetric rainfall relations have been investigated using extensive datasets collected over the past few decades (e.g., Ryzhkov and Zrnić 1996; Brandes et al. 2002; Ryzhkov et al. 2005b; Giangrande and Ryzhkov 2008). These parametric relations utilize a combination of polarimetric measurements of the reflectivity factor *Z*, differential reflectivity *Z*_{DR}, and specific differential phase *K*_{dp}. Polarimetric rainfall relations, capitalizing on measurements of differential reflectivity *Z*_{DR}, have been found less sensitive to RSD variability, although not immune to hail and/or melting layer contamination (e.g., Aydin et al. 1990; Ryzhkov and Zrnić 1995; Brandes et al. 2002; Giangrande and Ryzhkov 2008). Rainfall algorithms, based on *K*_{dp} computations, are found robust in the presence of hail and/or a partial beam blockage, although these relations are nonoptimal for light rain at S band and may produce nonphysical results due to nonuniform beam filling (e.g., Chandrasekar et al. 1990; Ryzhkov and Zrnić 1995; Ryzhkov 2007).

An alternate approach to reduce the effects of RSD variability on radar-based rainfall estimates is provided through the possibility of automatically retrieving the parameters of the RSD using polarimetric radar observations (e.g., Gorgucci et al. 2002; Brandes et al. 2002). From the retrieved parameters of the RSD, it is then possible to compute the corresponding rainfall rate (e.g., Bringi et al. 2004). Following this path, Vulpiani et al. (2006) proposed a neural network technique for the estimation of RSD parameters and the subsequent computation of the corresponding rainfall rates. In a simulated framework, the neural network (NN) methodology has shown solid performance when compared with physically based parametric approaches (Gorgucci et al. 2002; Brandes et al. 2002). However, experimental validation for NN-based methods for rainfall estimation has been limited outside of the simulated framework.

In this study, the Vulpiani et al. (2006) NN-based technique for polarimetric rainfall estimation is evaluated alongside conventional Next Generation Weather Radar (NEXRAD) and polarimetric parametric relations. A validation study is performed using the extensive event dataset collected within the well-observed central Oklahoma region during the Joint Polarization Experiment (JPOLE) over a multiyear period (Ryzhkov et al. 2005c). Radar measurements are obtained from the polarimetric prototype of the Weather Surveillance Radar-1988 Doppler (WSR-88D) (KOUN, S-band system), assumed to be well calibrated to a NEXRAD standard. Rainfall rates and accumulations from the dense Agricultural Research Service (ARS) Micronet rain gauge networks operated by the Oklahoma Climatological Survey are used as “ground truth” to validate NN-based and parametric polarimetric radar rainfall estimations (e.g., Brock et al. 1995; Shafer et al. 2000). The ARS gauges are well calibrated and located at distances between 50 and 115 km from the KOUN radar (e.g., Shafer et al. 2000; Fiebrich et al. 2006; McPherson et al. 2007).

The paper is organized as follows: a description of the neural network algorithm is provided in section 2. The JPOLE validation dataset and details on the standard KOUN radar and ARS rain gauge processing methods are described in section 3. Results of the validation study for various parametric and NN-based rainfall estimates are presented in section 4. Since the RSD retrieval-based method, in the absence of direct observations (i.e., disdrometers or vertically pointing radars), requires an assumption for a raindrop fall speed model, section 4 includes a sensitivity analysis with respect to several fall speed models. A second neural network technique enabling the “direct” estimation of rainfall rate (without passing through the preliminary estimation of the RSD) is also evaluated in section 4. Discussion and conclusions are provided in the final section.

## 2. Background

After giving some basic definitions of rain microphysics and radar observables, this section provides a description of NN fundamentals and the algorithms designed to retrieve rainfall rates from radar measurements.

### a. Raindrop size distribution and polarimetric measurements

The normalized Gamma raindrop size distribution is commonly considered the function able to describe most of the variability occurring in the naturally observed RSD (Bringi and Chandrasekar 2001). The number of raindrops per unit volume per unit size (mm^{−1} m^{−3}) can be written as

where *D* is the volume-equivalent drop diameter, *f*(*μ*) is a function of *μ* only, the parameter *D*_{0} is the median volume drop diameter, *μ* is the shape of the drop spectrum, and *N*_{w} (mm^{−1} m^{−3}) is a normalized drop concentration that can be calculated as a function of liquid water content *W* and *D*_{0} (e.g., Bringi and Chandrasekar 2001). Given a known RSD, the rainfall intensity *R* (mm h^{−1}) can be computed as a flux of raindrop volume at a terminal fall velocity *υ*(*D*) (m s^{−1}) in still air, usually parameterized as a power law of *D*:

where *V*(*D*) is the raindrop volume, assumed here equivalent to a sphere of diameter *D*, and the vertical wind speed component is assumed negligible.
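As a rough illustration of how a rain rate follows from a known RSD, the sketch below evaluates a normalized Gamma distribution and integrates the raindrop volume flux numerically. It assumes the form commonly given in Bringi and Chandrasekar (2001), *N*(*D*) = *N*_{w} *f*(*μ*)(*D*/*D*_{0})^{*μ*} exp[−(3.67 + *μ*)*D*/*D*_{0}], and a power-law fall speed with the coefficients usually attributed to Atlas and Ulbrich (1977). The function names, integration limits, and trapezoidal scheme are illustrative choices rather than part of the algorithm described here.

```python
import math

def f_mu(mu):
    """Normalization factor f(mu) of the normalized Gamma RSD (assumed form)."""
    return (6.0 / 3.67**4) * (3.67 + mu) ** (mu + 4) / math.gamma(mu + 4)

def rsd(d, nw, d0, mu):
    """Drop concentration N(D) in mm^-1 m^-3 for a diameter d in mm."""
    return nw * f_mu(mu) * (d / d0) ** mu * math.exp(-(3.67 + mu) * d / d0)

def fall_speed(d, a=3.78, b=0.67):
    """Power-law terminal velocity in m s^-1 (coefficients are an assumption)."""
    return a * d ** b

def rain_rate(nw, d0, mu, d_min=0.5, d_max=8.0, n=2000):
    """Rain rate in mm h^-1 by trapezoidal integration of the volume flux."""
    step = (d_max - d_min) / n
    total = 0.0
    for i in range(n + 1):
        d = d_min + i * step
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * fall_speed(d) * d**3 * rsd(d, nw, d0, mu)
    # 0.6e-3 * pi converts (pi/6) mm^3 * m s^-1 * m^-3 into mm h^-1
    return 0.6e-3 * math.pi * total * step
```

For example, `rain_rate(8000.0, 1.5, 2.0)` returns the rain rate for *N*_{w} = 8000 mm^{−1} m^{−3}, *D*_{0} = 1.5 mm, and *μ* = 2 over the 0.5–8.0-mm diameter range; swapping `fall_speed` for another model changes the result, which is the sensitivity examined in section 4.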

The copolar radar reflectivity factors *Z*_{hh} and *Z*_{υυ} (mm^{6} m^{−3}) at *H* and *V* polarization state and the differential reflectivity *Z*_{DR} (here DR indicates decibel units) are defined as follows:

where *Z*_{dr} is dimensionless and *S*_{hh,υυ} (mm) are the backscattering copolar components of the complex scattering matrix 𝗦 of a raindrop. Here, *K* depends on the complex dielectric constant of water estimated as a function of wavelength *λ* (mm) and temperature (e.g., Bringi and Chandrasekar 2001). In (3), *D*_{min} and *D*_{max} stand for the minimum and maximum drop diameter of the actual RSD, respectively.

The specific differential phase shift *K*_{dp} (° km^{−1}), due to the forward propagation phase difference between *H* and *V* polarization, can be obtained in terms of the forward scattering amplitude *f* (mm) as

where the symbol ℜ represents the real part of a complex number.

Recently, it has been shown that *D*_{0}, *N*_{w}, and *μ* from (1) can be retrieved from polarimetric radar measurements of *Z*_{hh}, *Z*_{dr}, and *K*_{dp} (e.g., Brandes et al. 2002; Gorgucci et al. 2002; Bringi et al. 2002). It was following that approach that Vulpiani et al. (2006) proposed a neural network technique for the RSD parameter retrieval and rainfall estimation. The following sections briefly summarize the neural network foundations and the suggested retrieval algorithm.

### b. Neural network inversion approach

Most retrieval problems in remote sensing are ill-posed nonlinear problems. This means that the related inverse problem can be addressed only by resorting to statistical analysis and by adding a priori information. Within this framework, the neural network technique represents a powerful approach to design a retrieval algorithm in a more flexible and robust way than conventional methods such as linear regression (Haykin 1995). The selection of a neural network topology, very often thought to be a “black box,” is theoretically equivalent to the choice of either a regression analytical model or a Bayesian probability model.

Adopting the definition provided by Haykin (1995), one may define a neural network as “a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use.” Knowledge is acquired by the network by means of a learning process and is stored through the interneuron connection strengths, known as *synaptic weights*. A neuronal model is composed of three main elements: a set of synaptic weights, an *adder* to sum the input signals, and an activation function for limiting the output of a neuron. Mathematically, each *k*th processing unit (*neuron*) can be described by the following equation:

where *υ*_{k} = *u*_{k} + *b*_{k}, while *x*_{1}, *x*_{2}, …, *x*_{m} are the input signals; *w*_{k1}, *w*_{k2}, …, *w*_{km} are the synaptic weights of neuron *k*; *u*_{k} is the linear combiner output; *b*_{k} is the bias; *ϕ*() is the *activation function*; and *y*_{k} is the output signal of the neuron. The function of the bias is to apply an affine transformation to the output *u*_{k}, increasing or decreasing the input of the activation function. A commonly used activation function is the *sigmoid function* *ϕ*(*υ*) = 1/[1 + exp(−*aυ*)], with *a* being the slope parameter.
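The neuron model described above can be sketched in a few lines. This is a minimal illustration assuming the standard Haykin-style form, in which the linear combiner output *u*_{k} is the weighted sum of the inputs and the output is *y*_{k} = *ϕ*(*u*_{k} + *b*_{k}); the function names are hypothetical.

```python
import math

def sigmoid(v, a=1.0):
    """Logistic activation phi(v) = 1 / (1 + exp(-a v)) with slope parameter a."""
    return 1.0 / (1.0 + math.exp(-a * v))

def neuron_output(x, w, b, a=1.0):
    """Output y_k of a single neuron for inputs x, weights w, and bias b."""
    u = sum(wi * xi for wi, xi in zip(w, x))  # linear combiner output u_k
    v = u + b                                 # bias shifts the activation input
    return sigmoid(v, a)
```

With all weights and the bias set to zero the neuron returns 0.5, the midpoint of the sigmoid; a positive bias moves the output toward 1 and a negative bias toward 0.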

Generalizing (6) and (7), an artificial neural network can be viewed as a nonlinear parameterized mapping from an input *x* to an output *y* = NN(*x*; **w**, *M*), where **w** is the vector of parameters (weights and biases) relating the input *x* to the output *y*, while the functional form of the mapping (i.e., the architecture of the net) is denoted as *M*.

The multilayer perceptron architecture (MLP), considered here, is a mapping model composed of several layers of parallel processors. It has been theoretically proven that one–hidden layer MLP networks may represent any nonlinear continuous function (Haykin 1995), while a two–hidden layer MLP may approximate any function to any degree of nonlinearity taking also into account discontinuities (Sontag 1992). A crucial step for setting up the neural network is the training, or learning process. During this step the synaptic weights of the networks are modified by applying a set of labeled training samples or task samples. Each example *D* consists of a unique input signal *x* and a corresponding desired target *t*. Denoting with *a* the neural network response to the input *x*, the weights and biases are determined by minimizing the cost or performance function usually defined as the mean-square difference *E*_{D} between the targets *t* and the NN response *a*. The minimization of the performance function is accomplished by means of the back-propagation learning rule (i.e., performing the computations backward through the network; Rumelhart et al. 1986). The steepest gradient descent method has been used as the minimization technique. The weights are updated according to the *delta rule* or *Widrow and Hoff rule* (Widrow and Hoff 1960), which is defined in terms of the connection hidden (output) weight *w*_{jk} and characterized by the *j*th hidden (output) node, the *k*th input (hidden) node, its updated value, and the so-called learning rate *η*_{0}.
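To make the delta rule concrete, the sketch below applies it to a single linear neuron, which is the classic Widrow and Hoff setting. It is a deliberate simplification: the networks discussed here are multilayer perceptrons trained by back-propagation, whereas this example has no hidden layer, and the function names, learning rate, and epoch count are assumptions chosen for illustration.

```python
def delta_rule_step(w, b, x, t, eta):
    """One Widrow-Hoff update for a linear neuron with response a = w.x + b."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    err = t - a                                   # target minus response
    w_new = [wi + eta * err * xi for wi, xi in zip(w, x)]
    return w_new, b + eta * err

def train(samples, eta=0.05, epochs=500):
    """Sweep repeatedly over (input, target) pairs, updating after each one."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, t in samples:
            w, b = delta_rule_step(w, b, x, t, eta)
    return w, b

# Noise-free samples of t = 2x + 1; the weights should approach (2, 1).
samples = [([x], 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
w, b = train(samples)
```

Each update moves the weights along the negative gradient of the squared error for one sample, scaled by the learning rate, which is the same principle that back-propagation extends to hidden layers.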

To reduce the sensitivity of the steepest descent method with respect to the learning rate, Battiti’s bold driver technique (Battiti 1989) has been used to adaptively determine the learning rate *η*_{0}. Furthermore, a *momentum* term is added to the mean-square difference *E*_{D} to avoid the minimization technique getting stuck in local minima (Hagan et al. 1996; Vulpiani et al. 2006). The desired generalization capability of the neural network can be obtained through a *regularization* technique accomplished either by perturbing the input (Aires et al. 2002) or including, within the performance function, the additional term (1 − *γ*)*E*_{W}, where *E*_{W} is the sum of squares of the network’s weights and biases. The neural network architecture and regularization parameter *γ* have been determined according to a heuristic monitoring of the generalization capability on test data, the root-mean-square error having been used as a metric. Consistent with the suggestion in Aires et al. (2002), it has been found that the one–hidden layer configuration improves the generalization capability of the NNs. The training is repeated for many examples in the set until the network reaches a steady state where there are no further changes in the synaptic weights.
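The two ingredients described above, an adaptive learning rate and a weight-penalty term, can be sketched as follows. The combination *γE*_{D} + (1 − *γ*)*E*_{W} is an assumption about how the additional term enters the performance function, and the growth and shrink factors of the bold driver (1.1 and 0.5) are typical illustrative values, not settings reported in the text.

```python
def regularized_cost(errors, weights, gamma=0.7):
    """Assumed performance function gamma * E_D + (1 - gamma) * E_W."""
    e_d = sum(e * e for e in errors) / len(errors)  # mean-square error E_D
    e_w = sum(w * w for w in weights)               # sum of squared weights E_W
    return gamma * e_d + (1.0 - gamma) * e_w

def bold_driver(eta, cost_new, cost_old, up=1.1, down=0.5):
    """Return (new learning rate, accept_step) after one trial update."""
    if cost_new <= cost_old:
        return eta * up, True    # cost decreased: grow the step slightly
    return eta * down, False     # cost increased: shrink the step, reject update
```

In a training loop, a rejected step would restore the previous weights before retrying with the smaller learning rate, so the cost never increases between accepted updates.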

### c. NN-based retrieval of raindrop size distribution and rainfall rate

Radar reflectivity factor at horizontal polarization *Z*_{hh} and differential reflectivity *Z*_{dr} are commonly used in RSD retrieval (Gorgucci et al. 2002; Brandes et al. 2002). Specific differential phase shift *K*_{dp} is another potential predictor for RSD retrieval. However, *K*_{dp} computations are often noisy and/or negative, which may perturb the results. Consequently, *K*_{dp} computations may be considered the most reliable after applying a lower threshold of 0.2° km^{−1} (e.g., Gorgucci et al. 2002). Notwithstanding, Vulpiani et al. (2006) found that the proposed algorithm may perform well even for very low values of *K*_{dp}. Moreover, in the case of unreliable or unavailable measurements of *K*_{dp} (i.e., radars that do not measure differential phase), a two-input neural network algorithm can also be successfully applied.

The median volume drop diameter *D*_{0} and the intercept parameter *N*_{w} are independently estimated using distinct NNs with three (i.e., *Z*_{hh}, *Z*_{dr}, *K*_{dp}) or two inputs (i.e., *Z*_{hh}, *Z*_{dr}), according to the availability and reliability of *K*_{dp}. The shape parameter *μ* is estimated from *Z*_{dr} and the retrieved values of *D*_{0} [as suggested in Zhang et al. (2001)] using a two-input NN (i.e., *Z*_{dr}, *D*_{0}). Thus, the estimate of the shape parameter is indirectly dependent on *K*_{dp} through *D*_{0}. The proposed RSD retrieval technique can be formalized in the following way:

when *K*_{dp} is available, *NN*_{D0}(*Z*_{hh}, *Z*_{dr}) and *NN*_{Nw}(*Z*_{hh}, *Z*_{dr}) being used otherwise. Note, the hat shows the estimated quantity and *NN* indicates the neural network operator. The number of nodes in the hidden layer has been fixed to 6 for *NN*_{D0} and *NN*_{Nw}, and 12 for *NN*_{μ} [see Vulpiani et al. (2006) for details on the optimization of the neural network configuration]. A choice of *γ* = 0.7 has been found to be suitable for the retrieval of all the RSD parameters. The input perturbation technique has also been adopted to increase the generalization of *NN*_{μ}(*Z*_{dr}, *D*_{0}).
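The structure of the retrieval described above can be summarized as a small dispatch routine. The sketch assumes the trained networks are available as plain callables stored in a dictionary; the keys, the function name, and the use of `None` to flag an unavailable or unreliable *K*_{dp} are hypothetical conventions introduced only for illustration.

```python
def retrieve_rsd(zhh, zdr, kdp, nets):
    """Estimate (D0, Nw, mu), using three-input networks when Kdp is usable."""
    if kdp is not None:
        d0 = nets["d0_3"](zhh, zdr, kdp)
        nw = nets["nw_3"](zhh, zdr, kdp)
    else:
        d0 = nets["d0_2"](zhh, zdr)   # two-input fallback without Kdp
        nw = nets["nw_2"](zhh, zdr)
    mu = nets["mu"](zdr, d0)          # mu depends on Kdp only through D0
    return d0, nw, mu
```

Because the shape parameter network takes the retrieved *D*_{0} as an input, any error in the *D*_{0} estimate propagates into *μ*, which is one reason the two stages are worth keeping explicit in code.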

It is worth mentioning that the algorithm has been set up by using simulations of polarimetric radar variables, computed through the 𝗧-matrix scattering model (e.g., Mishchenko 2000) and corresponding rain-rate calculations using (2). Because of intrinsic constraints of the scattering model, the radar parameters are computed [through (3) and (5)] from a simulated scattering matrix by integration over the raindrop size distribution in the range of drop diameters between 0.5 (*D*_{min}) and 8.0 mm (*D*_{max}). Regarding the raindrop microphysical parameterization, the following assumptions have been made:

- the axis ratio follows Brandes et al. (2002);
- the temperature range is 5° < *T* < 20°C;
- the raindrop size distribution is as in (1) with 0.5 ≤ *D*_{0} ≤ 3.5 mm, 2 ≤ log(*N*_{w}) ≤ 5, −1 < *μ* ≤ 5; and
- the canting angle is a Gaussian distribution with zero mean and 10° standard deviation.
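One way to picture the construction of the synthetic training set is to draw microphysical states at random within the ranges listed above. The sketch below assumes independent uniform sampling for *D*_{0}, log(*N*_{w}), *μ*, and temperature, and a base-10 logarithm for *N*_{w}; neither the sampling distribution nor the logarithm base is stated explicitly in the text, and the dictionary keys are hypothetical.

```python
import random

def sample_training_state(rng):
    """Draw one microphysical state within the ranges listed above."""
    return {
        "d0_mm": rng.uniform(0.5, 3.5),
        "log10_nw": rng.uniform(2.0, 5.0),
        "mu": -1.0 + 6.0 * (1.0 - rng.random()),  # half-open interval (-1, 5]
        "temperature_c": rng.uniform(5.0, 20.0),
        "canting_deg": rng.gauss(0.0, 10.0),      # zero mean, 10 degree std dev
    }

rng = random.Random(42)  # fixed seed only to make the sketch reproducible
states = [sample_training_state(rng) for _ in range(1000)]
```

Each sampled state would then be passed to a scattering model to simulate the radar variables, with the sampled RSD parameters (or the corresponding rain rate) serving as the training target.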

Once the RSD is retrieved, the rain rate can be simply estimated as

where *υ*(*D*) is the fall speed relationship, while the subscript RSD is used to emphasize that *R* is computed from the retrieved RSD parameters, estimated via (7).

As shown in subsequent sections, the choice of *υ*(*D*) has a nonnegligible impact on the retrieval of *R* when using the “indirect” NN-based approach in (8). Moreover, the relation between *R*_{RSD} and the parameters *N*_{w}, *D*_{0}, and *μ* is strongly nonlinear and this may affect the overall NN-based retrieval accuracy. To minimize the importance of the choice of *υ*(*D*), a new direct (without passing through the RSD estimate) neural network rainfall algorithm is also evaluated. Formally, we can write this algorithm as

During training, the known neural network output (i.e., *D*_{0}, *N*_{w}, and *μ* for *R*_{RSD} and *R* for *R*_{NN}) has been randomly generated assuming, for the latter, the Atlas and Ulbrich (1977) terminal velocity relationship. This indicates that both *R*_{RSD} and *R*_{NN} are trained without any a priori knowledge of the specific climatology and/or radar measurements of the considered site. It is worth noting that for the indirect NN-based *R*_{RSD}, the choice of the fall speed model only affects the retrieval phase (e.g., rainfall rate estimates following RSD retrievals), with the NN training dealing only with RSD parameter estimation. In contrast, the training of the direct NN-based *R*_{NN} could potentially be sensitive to the assumed raindrop terminal velocity. Nevertheless, as briefly discussed later in the text, *R*_{NN} was found largely insensitive to that assumption used for training.

## 3. JPOLE dataset description

Validation of the NN methods outlined in the previous section is accomplished using the JPOLE polarimetric radar dataset collected in central Oklahoma (e.g., Ryzhkov et al. 2005c). A total of 43 events observed by the KOUN radar between the years of 2002 and 2005 have been selected for analysis (as in Giangrande and Ryzhkov 2008, their Table 1). Concurrent gauge observations were available from the densely spaced ARS network stations located at ranges of 50–115 km from the KOUN radar. For this study, the ARS Little Washita watershed is the primary location for rain gauges, a basin of about 611 km^{2} depicted in Fig. 1 (southwest of the radar). The total number of ARS gauges in the Little Washita watershed during JPOLE was 42 (24 after 2004 when some gauges were decommissioned and/or reassigned) with an average spacing of about 5 km. Over the ARS watershed, comparisons between the performances of radar-based rainfall retrievals are mainly affected by RSD variability and the possible presence of hail rather than ground clutter or contamination from the melting layer or frozen hydrometeors (e.g., Ryzhkov et al. 2005b; Giangrande and Ryzhkov 2008). As noted by Giangrande and Ryzhkov (2008), the dataset includes various precipitation types including warm-season convective storms containing hail, mesoscale convective systems (MCS) with intense squall lines and trailing stratiform precipitation, widespread cold-season stratiform rain, and select tropical storm remnants. ARS gauges [shielded Met One tipping-bucket (TB) type] used in the study are unheated; therefore the dataset does not include events with frozen and/or mixed-phase contamination at radar measurement level (as inferred from the results of radar echo classification).

To complement the JPOLE dataset, a recent KOUN case study (2 March 2008) has been analyzed to help assess the sensitivity of the considered algorithms with respect to the radar echo type and possible hail contamination. Data from this event include the results of KOUN automatic echo classification used to infer regions of hail and/or a rain/hail mixture from the KOUN radar [e.g., Liu and Chandrasekar 2000; as performed over gauges in Giangrande and Ryzhkov (2008)]. Here, the ARS Fort Cobb watershed gauge network (commissioned summer 2005) located westward of the radar (Fig. 1) and comprising 15 gauges distributed over about 800 km^{2} was considered as a reference. Similarly, these gauges are all located at distances within 115 km of the KOUN radar and are not subject to melting layer contamination for this event.

During JPOLE, KOUN radar variables were measured at a radial resolution of 0.250–0.267 km using a short dwell time (48 radar samples) to satisfy NEXRAD antenna rotation rate (3 rpm) and azimuthal resolution (1°) requirements. Radar rainfall estimates and automatic echo classification results were obtained using data collected at the 0.5° elevation scan with an update time within 6 min. Radar reflectivity measured by KOUN was matched with *Z* obtained from the nearby Oklahoma City Twin Lakes (KTLX) WSR-88D radar, which was assumed to be well calibrated (e.g., Ryzhkov et al. 2005a; Giangrande and Ryzhkov 2005). The quantity *Z*_{dr} was calibrated using polarimetric signatures of dry aggregated snow above the melting level following Ryzhkov et al. (2005a). Attenuation correction of *Z*_{hh} and *Z*_{dr} was performed using differential phase Φ_{dp} and the relations Δ*Z*_{hh} (dB) = 0.04Φ_{dp} (°) and Δ*Z*_{DR} (dB) = 0.004Φ_{dp} (°) (Ryzhkov and Zrnić 1995). A minimum *ρ*_{hυ} = 0.85 threshold was applied to filter echoes of nonmeteorological origin. Radar reflectivity was capped at 53 dB*Z* to mitigate hail contamination, in keeping with NEXRAD operations. Additional details of data processing can be found in Ryzhkov et al. (2005b,c).
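The gate-level processing steps described above (correlation-coefficient filtering, attenuation correction from differential phase, and the reflectivity cap) can be sketched as a single function. The order in which the steps are applied, the function name, and the use of `None` for a rejected gate are assumptions; only the numerical coefficients and thresholds are taken from the text.

```python
def quality_control(zhh_dbz, zdr_db, phidp_deg, rhohv,
                    rhohv_min=0.85, z_cap_dbz=53.0):
    """Return corrected (Zhh, Zdr) in dB, or None for a rejected gate."""
    if rhohv < rhohv_min:
        return None                       # likely nonmeteorological echo
    zhh = zhh_dbz + 0.04 * phidp_deg      # Delta Zhh (dB) = 0.04 * Phi_dp (deg)
    zdr = zdr_db + 0.004 * phidp_deg      # Delta Zdr (dB) = 0.004 * Phi_dp (deg)
    return min(zhh, z_cap_dbz), zdr       # cap reflectivity at 53 dBZ
```

For a gate with 40 dB*Z*, 1.0 dB of differential reflectivity, and 50° of accumulated differential phase, the corrections amount to 2 dB and 0.2 dB, respectively.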

For the validation study, we compare hourly gauge and radar rainfall accumulations over gauge locations. In agreement with previous JPOLE studies, hourly radar accumulations are defined as an hourly rainfall estimate centered on a gauge. Radar measurements are averaged using five gates centered over the gauge location and the two closest azimuths separated by 1°. Such averaging produces a radial resolution of 1.0 km and transverse resolution that varies with range. To establish the quality of the radar rainfall algorithms and NN-based methods, absolute differences between radar and gauge estimates (expressed in mm) are examined rather than standard fractional errors, which are heavily weighted toward small accumulations. Rainfall estimates are characterized by the bias *B* = 〈Δ*T*〉, standard deviation STD = 〈|Δ*T* − *B*|^{2}〉^{1/2}, and RMSE = 〈|Δ*T*|^{2}〉^{1/2}, where Δ*T* = *T*_{R} − *T*_{G} is the difference between radar hourly rain-rate totals *T*_{R} and gauge hourly rain-rate totals *T*_{G} for any given radar–gauge pair, and the angle brackets denote averaging over all such pairs.

When comparing radar and gauge rain estimates, one must be mindful of the errors of tipping-bucket gauge measurements (e.g., Zawadzki 1975; Wilson and Brandes 1979; Austin 1987; Ciach 2003). Errors in gauge accumulations, associated with high-wind undercatch and/or splashing, may exceed 12% for intense MCS events in central Oklahoma, as is expected for the 2 March 2008 example (Duchon and Essenberg 2001). Quality assurance meteorologists at the Oklahoma Climatological Survey perform regular gauge maintenance and event-based analysis to detect and remove accumulation reports from malfunctioning and apparently biased gauges. As in Giangrande and Ryzhkov (2008), we suggest that the intrinsic gauge errors in the hourly ARS rain totals are well below the expected errors of radar rainfall measurements.
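The error statistics defined earlier in this section (bias, STD, and RMSE of the radar-minus-gauge hourly totals) translate directly into code. The following is a minimal sketch with a hypothetical function name, assuming the two input sequences are already matched radar–gauge pairs in millimeters.

```python
import math

def error_stats(radar_totals, gauge_totals):
    """Bias, STD, and RMSE (mm) of radar-minus-gauge hourly totals."""
    diffs = [r - g for r, g in zip(radar_totals, gauge_totals)]
    n = len(diffs)
    bias = sum(diffs) / n
    std = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, std, rmse
```

With these definitions the three quantities are not independent: RMSE^{2} = *B*^{2} + STD^{2}, so a small bias implies that STD and RMSE are nearly equal.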

## 4. Results

In this section the sensitivity of the proposed NN retrieval algorithm with respect to an assumed raindrop fall speed relationship and neural network configuration is first evaluated. Then, a comparison between the NN-based retrieval algorithm and the optimal JPOLE estimator is carried out. Finally, hail contamination effects are analyzed using a specific case study.

### a. Sensitivity of *R*_{RSD} to raindrop fall speed model and number of inputs

The following three terminal fall velocity relationships have been considered in this work:

For the assumed *υ*(*D*) relations above, the NN-based algorithm is tested using two (i.e., *Z*_{hh}, *Z*_{dr}) and three (i.e., *Z*_{hh}, *Z*_{dr}, *K*_{dp}) inputs from the KOUN radar, respectively, as in section 2.
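To give a feel for how much fall speed models can differ, the sketch below evaluates three relations from the literature. The functional forms and coefficients are those commonly cited for Atlas et al. (1973), Atlas and Ulbrich (1977), and Brandes et al. (2002); that they correspond exactly to the A73, A77, and B02 relations used in this study is an assumption, as are the function names. Diameters are in millimeters and velocities in meters per second.

```python
import math

def v_a73(d):
    """Exponential form commonly attributed to Atlas et al. (1973)."""
    return 9.65 - 10.3 * math.exp(-0.6 * d)

def v_a77(d):
    """Power law commonly attributed to Atlas and Ulbrich (1977)."""
    return 3.78 * d ** 0.67

def v_b02(d):
    """Fourth-order polynomial commonly attributed to Brandes et al. (2002)."""
    return (-0.1021 + 4.932 * d - 0.9551 * d**2
            + 0.07934 * d**3 - 0.002362 * d**4)

# Tabulate the three models at a few diameters for a side-by-side comparison.
for d in (0.5, 1.0, 2.0, 4.0):
    print(d, round(v_a73(d), 2), round(v_a77(d), 2), round(v_b02(d), 2))
```

Because the rain rate weights the fall speed by *D*^{3}*N*(*D*), differences between the models at larger diameters matter most for heavy rain.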

Comparisons between hourly accumulations from ARS rain gauges and NN-based rainfall methods for the raindrop fall speed models are shown in Figs. 2–4. For each *υ*(*D*) model, the performance of the two- and three-input NN configuration is displayed in the upper and lower panels, respectively. The results, summarized in Table 1 in terms of error bias, error STD, and RMSE, indicate that the assumption of a fall speed relationship plays a nonnegligible role for estimating hourly rainfall accumulations from the retrieved RSD. Among the considered relationships, A77 is best matched (bias = −0.17 mm, STD = 3.44 mm, RMSE = 3.45 mm) to the observed rainfall accumulations, although we caution that this result may be fortuitous for minimizing errors in rainfall estimation and does not necessarily imply that A77 is the best fall velocity model. The performance of *R*_{RSD} is similar when considering A73 and B02.

Regarding the performance of the NN-based algorithm with respect to the neural network input configuration, the sensitivity analysis on the observed radar dataset confirms the findings from the simulation environment by Vulpiani et al. (2006); there is a nonnegligible benefit to using *K*_{dp} jointly with *Z*_{hh} and *Z*_{dr} in estimating rainfall rate. For example, with the A77 fall speed model the performance of the two-input neural network is inferior compared to the three-input configuration in terms of standard errors (STD = 3.98 mm, RMSE = 3.98 mm), with the impact on the bias found to be negligible (bias = −0.09 mm).

### b. Comparisons with optimal JPOLE rainfall relations

The following parametric retrieval algorithms, based on empirical regression of measured gauge/disdrometer and radar data, have been chosen for comparison with the proposed neural network methodology:

where *Z*_{hh} and *Z*_{dr} are expressed in linear units. Relation (13) is the inversion of the standard NEXRAD rainfall formula for continental (nontropical) application (e.g., Fulton et al. 1998), whereas (14) and (15) are selected because of their optimal performance in rain in central Oklahoma during the JPOLE field campaign (e.g., Ryzhkov et al. 2005b; Giangrande and Ryzhkov 2008).
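As an illustration of what inverting a reflectivity–rainfall formula involves, the sketch below assumes the widely used NEXRAD default relation *Z* = 300*R*^{1.4}, with *Z* in linear units (mm^{6} m^{−3}) and *R* in mm h^{−1}. Whether the coefficients match relation (13) exactly is an assumption, and the function name is hypothetical; the 53-dB*Z* cap follows the processing described in section 3.

```python
def rain_rate_nexrad(zhh_dbz, a=300.0, b=1.4, z_cap_dbz=53.0):
    """Invert Z = a * R**b for R (mm h^-1), with Z in mm^6 m^-3."""
    z_lin = 10.0 ** (min(zhh_dbz, z_cap_dbz) / 10.0)  # dBZ to linear units
    return (z_lin / a) ** (1.0 / b)
```

The conversion from decibel to linear units is the step most easily overlooked, and the same conversion applies to *Z*_{dr} in the polarimetric relations.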

Furthermore, as proposed in Ryzhkov et al. (2005b), we have applied a *synthetic algorithm* (*R*_{SYN}). According to the synthetic algorithm, the choice between various polarimetric rainfall relations is determined solely by the radar reflectivity *Z*_{hh} or *R*(*Z*_{hh}). Such a selection criterion may act as a proxy in the rain medium for rainfall relations contingent on the results of polarimetric echo classification, as outlined in Giangrande and Ryzhkov (2008).

We emphasize that the polarimetric algorithms (14), (15), and the synthetic algorithm have been optimized for Oklahoma climatology and the JPOLE dataset. It is noted that “JPOLE-matched” conventional relations have also been tested (e.g., as in Ryzhkov et al. 2005b), but have not shown a significant improvement over optimal polarimetric relations. As recently outlined in Schuur et al. (2008), relationships (13)–(15) may still not be sufficient estimators if “tropical-like” events hit the region. It is worth noting that both *R*_{RSD} and *R*_{NN} have been constructed in a simulated framework through a general a priori microphysical parameterization. The consequence is that the estimators *R*_{RSD} and *R*_{NN} are potentially robust and may be equally suitable in other precipitation regimes.

An advantage for all polarimetric rainfall methods is confirmed for this study, wherein all of the polarimetric algorithms are found to outperform the single-parameter conventional NEXRAD *R*(*Z*_{hh}) relation. In the hourly radar–gauge accumulation comparisons plotted in Figs. 5a–d and 6 and summarized in Table 2, the conventional NEXRAD *R*(*Z*_{hh}) relation is characterized by a large bias and the highest standard deviation and RMS errors when compared with NN-based and JPOLE polarimetric relations. The *R*(*K*_{dp}) relation slightly outperforms the *R*(*Z*_{hh}, *Z*_{dr}) and the *R*_{RSD} NN method. Contingent on the choice for raindrop fall speed relation (i.e., A77), the RSD-retrieval-based neural network *R*_{RSD} may perform slightly better than *R*(*Z*_{hh}, *Z*_{dr}) in terms of STD (STD = 3.44 mm, RMSE = 3.45 mm) and is comparable with *R*(*K*_{dp}) for this dataset. The JPOLE optimal synthetic algorithm *R*_{SYN} ostensibly outperformed *R*(*Z*_{hh}, *Z*_{dr}), *R*(*K*_{dp}), and the *R*_{RSD} method in terms of bias and RMS errors. Results obtained by applying the direct neural network *R*_{NN} are improved compared to previous *R*_{RSD} NN-based approaches in all configurations. For the best-matched fall speed relation (as based on indirect method testing) and three-input configuration, bias and standard errors are comparable to the published synthetic methodology that we emphasize was optimized for the JPOLE dataset (bias = −0.23 mm, STD = 2.92 mm, RMSE = 2.93 mm). During sensitivity testing for various fall speed models in the direct method training, it was determined that the results do not significantly change and even improve slightly, for example, when using the B02 model instead of A77.

### c. Hail contamination sensitivity, case study for Ft. Cobb watershed 2 March 2008

The proposed NN-based *R*_{RSD} or *R*_{NN} algorithms have been conceived for rain situations and trained with a synthetic dataset derived from uncorrelated raindrop size distribution parameters. It is reasonable to anticipate a nonnegligible sensitivity to hail or mixed-phase hydrometeor contamination from NN-based algorithms as previously observed with other polarimetric and conventional relations. Although the results of the previous section indicate that NN-based polarimetric rainfall estimates may be less sensitive to the presence of hail, additional factors including RSD variability and noisiness in differential phase measurements could have contributed to the relative performance.

On 2 March 2008, a squall line passed over the Ft. Cobb watershed and collocated ARS network gauges. The event was associated with several small hailstorm and scattered severe hailstorm reports in the vicinity of the watershed. Observed KOUN polarimetric signatures and the results of automatic echo classification over the watershed region were consistent with the presence of hail and rain/hail mixtures. The automatic echo classification is obtained by applying a fuzzy-logic technique (e.g., Liu and Chandrasekar 2000; Heinselman and Ryzhkov 2006), as performed over gauges in Giangrande and Ryzhkov (2008). Although these storm reports and polarimetric signatures do not provide a definitive confirmation, the 2 March event offers an opportunity to explore the sensitivity to hail contamination as intense cells are passing over the densely packed gauges of the ARS network. As it is difficult to isolate hail contamination in hourly radar–gauge accumulation comparisons, we also investigate instantaneous rainfall rate estimates over selected network gauges. To facilitate identification of the occurrences of potential hail contamination, time series of radar rainfall rate estimates over gauge locations have been merged with the results of KOUN polarimetric echo classification for the same measurement times.

Figure 7 summarizes several hours of radar–gauge hourly rainfall accumulation comparisons from the 2 March event for select parametric relations and the direct NN-based method. A listing of bias and errors for the tested relations is found in Table 3. Overall, the performance of differential phase–based relations is consistent with previous studies suggesting phase measurements are less sensitive to the presence of hail.

The JPOLE-tuned synthetic *R*_{SYN} method is optimal for this rain/hail event in terms of both bias and RMS errors, whereas the results for the *R*(*K*_{dp}) relation are nearly identical. The similar behavior can be explained with an understanding that the *R*_{SYN} approach capitalizes exclusively on measurements of *K*_{dp} in situations of high *Z*_{hh}, as in rain/hail. For this event, the performance of the direct NN-based method did not match identically with the previous findings of the cumulative JPOLE validation study. However, neural network configurations that included *K*_{dp} measurements also exhibit lower bias and RMS errors when compared with algorithms and network configurations that solely capitalized on *Z*_{hh} and *Z*_{dr} measurements biased by the presence of hail. Rainfall estimation that heavily capitalized on the reflectivity factor was found to significantly overestimate gauge accumulations despite provisions to cap *Z*_{hh} values and to remove spurious rainfall rates associated with low correlation coefficient *ρ*_{hυ} consistent with hail. Note that relations based on both *Z*_{hh} and *Z*_{dr} may not provide consistent performance in rain/hail mixtures if the increase in *Z*_{hh} for hail is not compensated by a proportional increase of *Z*_{dr}.
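The two provisions mentioned above, capping the reflectivity and discarding rates where the correlation coefficient is low, can be sketched in a few lines of Python. The sketch assumes the default WSR-88D power law *Z* = 300*R*^{1.4} as the conventional relation and a 53-dB*Z* cap; the correlation coefficient threshold of 0.85 and all names are hypothetical and chosen only for illustration.

```python
import math

Z_CAP_DBZ = 53.0    # reflectivity cap (dBZ)
RHO_HV_MIN = 0.85   # hypothetical threshold; not a value taken from the paper


def rain_rate_from_dbz(dbz, rho_hv, a=300.0, b=1.4):
    """Return rain rate (mm/h) from reflectivity, or None if rejected.

    Assumes Z = a * R**b with Z in mm^6 m^-3 (a=300, b=1.4 is the default
    WSR-88D relation, used here as an assumption). Samples with a low
    correlation coefficient are treated as non-rain and dropped.
    """
    if rho_hv < RHO_HV_MIN:
        return None
    capped_dbz = min(dbz, Z_CAP_DBZ)        # limit hail-inflated reflectivity
    z_linear = 10.0 ** (capped_dbz / 10.0)  # dBZ -> linear units
    return (z_linear / a) ** (1.0 / b)


print(rain_rate_from_dbz(60.0, 0.98))  # same rate as for 53 dBZ
print(rain_rate_from_dbz(50.0, 0.70))  # None: rejected on low rho_hv
```

Under these assumptions the cap limits the estimated rate to roughly 100 mm h^{−1}, which bounds but does not remove the overestimation when hail keeps the reflectivity near the cap for several scans.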

Figure 8 displays time series of instantaneous radar-based rainfall rates and the results of automatic echo classification for the 2 March event over three Fort Cobb watershed gauges—102 (top panel), 103 (middle panel), and 114 (bottom panel). The 5-min ARS gauge accumulation values converted into rainfall rate measurements are included for comparison. On these images, the results of KOUN bulk echo classification have been identified as in Giangrande and Ryzhkov (2008) in the following manner: “R” stands for “light/moderate” rain; “B” represents a “big drops” category associated with large *Z*_{DR} values (e.g., as in the presence of melting particles or the absence of smaller drops due to evaporation at the leading edge of storms); “H” represents “heavy rain” classifications; “A” is used to symbolize hail or a “rain/hail” mixture; and an “N” indicates that no echo or an ambiguous echo has been identified. It is noted that since bulk classifications were trained over gauges corresponding to an approximate 2 km × 2 km areal average, polarimetric signatures and identification of rain/hail mixture or pure hail echo classifications may be masked because of this spatial averaging.
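Converting a 5-min gauge accumulation into an equivalent rainfall rate amounts to multiplying by 12 (60 min divided by 5 min). The Python sketch below shows this conversion together with one possible way of pairing a rate time series with echo classification labels for the same measurement times. The data structures, the function names, and the choice of labeling unmatched times as “N” are assumptions made for illustration.

```python
# Single-letter classes as described in the text for Fig. 8.
CLASS_LABELS = {
    "R": "light/moderate rain",
    "B": "big drops",
    "H": "heavy rain",
    "A": "hail or rain/hail mixture",
    "N": "no echo or ambiguous echo",
}


def accumulation_to_rate(accum_mm, interval_min=5.0):
    """Convert an accumulation over interval_min minutes to mm/h."""
    return accum_mm * 60.0 / interval_min


def merge_rates_with_classes(rates, classes):
    """Pair each rate with the class label observed at the same time.

    rates:   dict mapping a sortable timestamp -> rain rate (mm/h)
    classes: dict mapping a timestamp -> one-letter class label
    Times without a classification are labeled "N" (an assumption).
    """
    return [(t, rates[t], classes.get(t, "N")) for t in sorted(rates)]


def flag_possible_hail(merged):
    """Return the timestamps whose class is hail or rain/hail ("A")."""
    return [t for t, _, label in merged if label == "A"]


# Invented sample: minutes since the start of the event.
rates = {0: accumulation_to_rate(2.5), 5: accumulation_to_rate(0.4)}
merged = merge_rates_with_classes(rates, {0: "A", 5: "H"})
print(flag_possible_hail(merged))
```

Timestamps should be true time objects or elapsed minutes rather than UTC strings such as “2315” and “0045”, since the latter do not sort correctly across midnight.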

Considering gauge 102, all methods provide poor performance for the first time period (around 2315 UTC), at which times the classification algorithm has detected heavy rain and the relations largely overestimate the gauge observation. Assuming the peak of *R*(*Z*_{hh}) as synonymous with high reflectivity values and considering the relatively small observed rain rate, we suggest that the large radar rainfall overestimation originates in part from hail contamination and/or additional gauge undersampling. During the second sequence (0045 UTC), the increased observed rainfall was reasonably well reproduced by *R*_{NN} with *R*(*Z*_{hh}) and *R*(*K*_{dp}) overestimating and underestimating, respectively, by about 30 mm h^{−1} (i.e., relative error of about 50%) the observed maximum rain rate.

Gauge 103 was hit by one of the stronger parts of the convective line with an embedded region of pronounced polarimetric hail signatures. It is noted that the NN method failed at the same time as these hail signatures were observed. In this case, *R*(*K*_{dp}) provided the best performance, but also exhibited negative rainfall rates behind the hail core. The large negative values could be the consequence of nonuniform beam filling in the vicinity of a strong squall-line precipitation gradient (e.g., Ryzhkov 2007). During the second wave of the convective line over this gauge, all of the tested methods performed reasonably well.

Over gauge 114, all rainfall methods overestimate the gauge-based rainfall. However, the indication is that the *R*_{NN} method performs slightly better than the single parameter relations. This is especially true in the presence of big drops that mainly affect the performance of *R*(*Z*_{hh}). During the passage of the first convective line when hail is likely present, *R*(*K*_{dp}) and *R*_{NN} perform well whereas *R*(*Z*_{hh}) required a capping threshold at 53 dB*Z*. During the second line passage, similar results were once again observed; again, a possible explanation for the negative *K*_{dp} values is nonuniform beam filling and, according to (15), the negative rain rates lead to the worsening of *R*(*K*_{dp}) performance. Note that negative and unphysical rain rates should not statistically endanger *R*(*K*_{dp}) and will also often result in favorable accumulations when other methods are largely overestimating.
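The effect of negative *K*_{dp} on accumulations can be illustrated with a sign-preserving power law of the form *R* = *a*|*K*_{dp}|^{*b*} sign(*K*_{dp}). In the Python sketch below, this functional form and the coefficients *a* = 44.0 and *b* = 0.822 are assumptions that should be checked against (15); the sample *K*_{dp} values are invented.

```python
import math


def rate_from_kdp(kdp, a=44.0, b=0.822):
    """Signed rain rate (mm/h) from K_dp (deg/km).

    Assumed form: R = a * |K_dp|**b * sign(K_dp). The coefficients are
    illustrative placeholders, not values confirmed by this section.
    """
    return math.copysign(a * abs(kdp) ** b, kdp)


def accumulate(rates_mm_h, step_min=5.0):
    """Sum rates sampled every step_min minutes into an accumulation (mm)."""
    return sum(rates_mm_h) * step_min / 60.0


# Invented series: a positive core followed by negative K_dp behind it.
kdp_series = [2.0, 1.0, -0.5, -0.2]
signed = accumulate([rate_from_kdp(k) for k in kdp_series])
clipped = accumulate([max(rate_from_kdp(k), 0.0) for k in kdp_series])
print(signed < clipped)  # negative rates offset part of the total
```

In this sketch the negative instantaneous rates are unphysical, yet they reduce the summed accumulation. Clipping them at zero before summation would remove that offset and leave a larger total.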

## 5. Conclusions

The variability of the raindrop size distribution represents one of the main physical factors affecting radar-based estimation of rainfall. The use of polarimetric methodologies has been previously found to reduce the impact of such variability. This study evaluates one such polarimetric rainfall technique that uses a neural network algorithm for raindrop size distribution and rainfall retrieval. Previously, this technique had been tested only with simulations (Vulpiani et al. 2006).

For the JPOLE dataset, the “indirect” NN-based *R*_{RSD} methodology, based on RSD estimation, has shown a nonnegligible sensitivity to the assumed fall speed velocity. As expected, polarimetric algorithms have all outperformed the single-parameter *R*(*Z*_{hh}) in terms of bias, RMS errors, and standard deviation. Among the polarimetric algorithms, *R*_{RSD} has shown a performance slightly better than *R*(*Z*_{hh}, *Z*_{dr}) and comparable with *R*(*K*_{dp}) over the entire JPOLE dataset. The best results as compared with gauge accumulations were obtained through use of the “direct” NN-based *R*_{NN} algorithm and JPOLE optimal *R*_{SYN} method with a slightly lower bias provided by the former. It is again worth noting that both NN-based *R*_{RSD} and *R*_{NN} have been constructed in a simulated framework without any climatologically driven optimization. The sensitivity of *R*_{NN} as it pertains to changes in fall speed relation and highlighted in this study is still a subject of ongoing research.

These results, if confirmed by future validation studies, might open the design of a general class of rainfall retrieval algorithms relatively independent of the climatology of the specific measurement site. The further analysis of a recently observed case study event highlights the potential sensitivity of the proposed neural network techniques, conceived for rain only, to hail contamination. Among them, the RSD-retrieval-based method did not take advantage of the use of multiple radar parameters. Consequently, a further refinement of the explored approaches would be, as recently proposed by Giangrande and Ryzhkov (2008), the use of a preliminary echo classification as a way to potentially overcome the contamination by frozen and/or mixed-phase hydrometeors.

## Acknowledgments

We express our gratitude to the staff of the Oklahoma Climatological Survey for providing high-quality gauge data. The support from NSSL and CIMMS, University of Oklahoma staff who maintain and operate the KOUN WSR-88D polarimetric radar is also acknowledged. Funding for the collection, processing, and distribution of the KOUN radar dataset was provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA-University of Oklahoma Cooperative Agreement NA17RJ1227, U.S. Department of Commerce. Availability of the KOUN radar dataset was also supported by the National Weather Service, the Federal Aviation Administration, and the NEXRAD Product Improvement Program. This work was also supported by the National Department of Civil Protection, Rome, Italy, under the IDRA project.

## Footnotes

*Corresponding author address:* Gianfranco Vulpiani, Prime Ministry, Department of Civil Protection, Via Vitorchiano 4, 00189 Roma, Italy. Email: gianfranco.vulpiani@protezionecivile.it