## 1. Introduction

### a. Z–R relations and their problems

Remote measurement of the rainfall rate is an important product provided by the Next Generation Weather Radar (NEXRAD) network. Radars have a large coverage area that is not practical to achieve with a dense network of rain gauges. Given the importance of rainfall-rate estimation to understanding the hydrology of a region (e.g., flash flood warnings), one must know the limitations of current methods of estimating rainfall rates from radar reflectivities, and then determine how they can be improved.

In the *Z*–*R* relationship, the rainfall rate *R* (mm h^{−1}) is estimated from the radar reflectivity factor *Z* (mm^{6} m^{−3}) using a simplified relationship of the form

*Z* = *aR*^{b},     (1)

with static parameters *a* and *b*, which can be derived from an assumed drop size distribution (DSD) after several simplifying assumptions. Despite those simplifications, Eq. (1) does model the basic observed behavior that radar reflectivity increases with heavier precipitation. It has been noted by Doviak (1983) that these static parameters can be tuned to the climatology of the region to improve the estimate. However, Doviak (1983) goes on to say that from an operational standpoint, this is typically not practical.

Estimating rainfall rates from the returned power is particularly difficult because of variations in DSD (Ulbrich 1983). Equation (1) assumes that the precipitation is composed of spherical drops of liquid water and that the vertical wind velocity is zero. If the precipitation is ice, snow, or mixed phase, then the returned power differs significantly from that of liquid water, and the model is rendered invalid for a given set of parameters.

### b. Overview of rainfall-rate estimation

The rainfall rate *R* in a unit volume is known to be a function of the drop size distribution *N*(*D*) and the fall speed of the precipitation *w*_{p}(*D*) relative to the ground (i.e., terminal fall speed minus the vertical component of the wind), integrated over the drop diameter *D*:

*R* = (π/6) ∫ *D*^{3} *N*(*D*) *w*_{p}(*D*) d*D*.     (2)

For a given *Z*–*R* relation, the DSD is assumed and the precipitation fall speed is derived from a simplified model. Advances in rainfall-rate estimation will come from a better representation of the DSD and the precipitation fall speed.
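The rate integral above can be evaluated numerically for any assumed DSD and fall-speed model. The sketch below uses an exponential (Marshall–Palmer-style) DSD and a power-law fall speed; both are illustrative assumptions, not the representations discussed in this study.

```python
import math

def rainfall_rate(n_of_d, fall_speed, d_max=8.0, n_steps=4000):
    """Numerically evaluate R = 6*pi*1e-4 * integral of D^3 N(D) w_p(D) dD.

    D is in mm, N(D) in m^-3 mm^-1, and w_p in m/s; the leading constant
    converts the water-volume flux into a rainfall rate in mm/h.
    """
    dd = d_max / n_steps
    total = 0.0
    for i in range(n_steps):
        d = (i + 0.5) * dd                        # midpoint rule
        total += d ** 3 * n_of_d(d) * fall_speed(d) * dd
    return 6.0 * math.pi * 1.0e-4 * total

# Illustrative (assumed) choices, not those used in the study:
mp_dsd = lambda d: 8000.0 * math.exp(-2.9 * d)   # exponential DSD, m^-3 mm^-1
power_law_speed = lambda d: 3.78 * d ** 0.67     # power-law fall speed, m/s

print(round(rainfall_rate(mp_dsd, power_law_speed), 1))
```

Changing either assumed function changes the resulting *Z*–*R* pair, which is why a single static relation cannot cover all precipitation regimes.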

### c. Precipitation representation

Representing precipitation DSD and fall speed is challenging. As noted by Rogers and Yau (1989), the DSD is highly dependent upon the conditions exhibited by the precipitating environment. These properties change over time and location, even for a single precipitating event (Waldvogel 1974). Therefore, any improved representation of the DSD and the precipitation fall speed must be available at nearly the same spatial and temporal scale of the precipitation event.

Dual-polarization radars would be an excellent source of data to satisfy those requirements (Seliga and Bringi 1976). Some of the dual-polarization radar parameters are directly affected by the shape and type of the precipitation, so these parameters can represent the DSD of the precipitation. Unfortunately, until they are upgraded, the single-polarization Weather Surveillance Radar-1988 Doppler (WSR-88D) radars cannot determine the DSD, and can only partially characterize the precipitation fall speed, from the radar signal. Even after the upgrade, dual-polarization parameters will still be absent from the historical record. An alternative source of data that has been extensively archived is still needed for climatological and reanalysis studies.

Observables that have been extensively and reliably recorded include temperature, humidity, pressure, and wind. It is known that DSD and precipitation phase are both highly dependent upon the concentration of cloud condensation nuclei, the available water vapor, and temperature (Bohren and Albrecht 1998; Rogers and Yau 1989). Also, the wind is indicative of the source of the air (i.e., tropical versus continental air masses) that feeds the precipitation event, which would, among other things, influence the concentration of the cloud condensation nuclei. The terminal fall speed depends upon not only the diameter of the precipitation, but also its shape and density (which factor into the Reynolds number), which are tied to the DSD and the phase. Last, pressure can be a useful indicator of storm severity because pressure deepening is linked to the generation of vorticity.

As examined in van den Dool (1994) and also in Root et al. (2007), different types of weather events have their own atmospheric signature. So, if DSD and fall speed are tied to atmospheric variables, and a precipitating event can have an atmospheric signature, then it follows that an atmospheric signature can be tied to a characteristic DSD and precipitation fall speed. Therefore, atmospheric variables could serve as a source of information in the absence of dual-polarization parameters. Thus, this study used observed atmospheric variables as supplemental input to a rainfall-rate model.

### d. Why artificial neural networks?

Traditional studies to measure an input parameter’s significance use linear or polynomial regression. However, such relationships are already known to be of insufficient complexity. Therefore, a different model-fitting approach is needed. This work used an artificial neural network (ANN) to produce a model that adapted itself to observed data, and yet was still generalizable. While the artificial intelligence model itself may or may not be useful for gaining a theoretical understanding of the physical relationships involved, it is still useful for discovering relationships in data (Haupt et al. 2009).

Indeed, the use of an ANN in the field of hydrology is not new. Some noteworthy studies used satellite data as inputs into an ANN model (Tsintikidis et al. 1997; Hsu et al. 1997; Rivolta et al. 2006). Other studies focused upon the prediction of rainfall (French et al. 1992; Chow and Cho 1997; Venkatesan et al. 1997; Kuligowski and Barros 1998a), while still others used ANN models to do quality control for rainfall (Kuligowski and Barros 1998b) and reflectivity measurements (Lakshmanan et al. 2007).

More relevant to this work were the studies to improve estimates of rainfall accumulation (Xiao and Chandrasekar 1997; Liu et al. 2001; Trafalis et al. 2002). However, those models used only radar moments for input, with no consideration for the atmospheric conditions. Most hydrologic studies using an ANN model focused on prediction as opposed to estimation. There has been little, if any, study regarding the usability of atmospheric observables for rainfall-rate estimation. A survey of how ANN models have been applied in the field of hydrology can be found in Maier and Dandy (2000).

## 2. Methods

Creating ANN models for estimating rainfall rate involves planning and design. The following subsections describe how the models and the datasets were created and evaluated.

### a. WEKA

The Waikato Environment for Knowledge Analysis (WEKA) is a software “workbench” for experimenting with and learning different machine-learning techniques (Witten and Frank 2005). WEKA is capable of performing many different data analysis tasks, which makes it ideal for comparing analysis techniques. For this work, version 3.5.8 was used.

### b. Multilayer perceptron

A multilayer perceptron (MLP) is a feed-forward ANN composed of *M* layers of nodes. The output *y*_{nm} of the *n*th node of the *m*th layer can be described by Eq. (3) as follows:

*y*_{nm} = *f*_{m}(Σ_{p=1}^{P_{m}} *w*_{nmp} *y*_{p(m−1)} + *T*_{nm}),     (3)

where the nodes of the input layer (*m* = 1) simply pass through the inputs. In a hidden layer, where 2 ≤ *m* < *M*, a node is an activation function *f*_{m} that takes the form of a logistic function controlled by a threshold *T*_{nm}. For an output layer node (*m* = *M*), the activation function is a weighted sum of the *P*_{m} nodes of the previous layer plus a threshold *T*_{nm}. Each node has its own array of weights **w** to be applied to the nodes of the previous layer. Section 2e goes into the details on the structure of the implemented MLP.
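A minimal forward pass matching this description might be sketched as follows, assuming logistic hidden nodes and a linear output node. The (4, 2) hidden structure of section 2e is used for the example, with randomly initialized weights and thresholds standing in for trained values.

```python
import math, random

def logistic(s, threshold):
    # Hidden-layer activation: a logistic function controlled by a threshold.
    return 1.0 / (1.0 + math.exp(-(s - threshold)))

def forward(x, layers):
    """Forward pass through an MLP given as a list of layers.

    Each layer is a list of (weights, threshold) pairs, one per node.
    Hidden layers apply the logistic activation; the final layer is a
    plain weighted sum plus threshold.
    """
    for depth, layer in enumerate(layers):
        is_output = depth == len(layers) - 1
        nxt = []
        for weights, threshold in layer:
            s = sum(w * xi for w, xi in zip(weights, x))
            nxt.append(s + threshold if is_output else logistic(s, threshold))
        x = nxt
    return x

# Randomly initialized (4, 2) hidden structure with 6 inputs and 1 output.
random.seed(0)
def rand_layer(n_nodes, n_in):
    return [([random.uniform(-1, 1) for _ in range(n_in)],
             random.uniform(-1, 1)) for _ in range(n_nodes)]

net = [rand_layer(4, 6), rand_layer(2, 4), rand_layer(1, 2)]
print(forward([0.1] * 6, net))
```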

### c. MLP input/output

When designing an MLP, one must first consider the inputs and outputs of the model. There was one output: the estimated rainfall rate. For inputs, radar reflectivity was chosen for its primary relationship with rainfall rates. Temperature, relative humidity, meridional and zonal wind, and air pressure were also chosen. Section 1c explained the rationale for these atmospheric observables.

Unfortunately, obtaining in situ measurements of these atmospheric variables at the density and frequency of radar observations is not feasible. Therefore, the conditions at the same volumes scanned by the radar were instead represented by surface measurements. The surface measurements served as a proxy for the conditions aloft. Admittedly, using proxy data is not optimal, but a level of abstraction between the inputs and outputs already exists in ANN models. Having an additional level of abstraction between variables that are highly coupled should only have a minimal impact.

### d. Training dataset

The training dataset was compiled from two sources. The first data source was level-II radar data from the WSR-88D station KTLX, which is near Oklahoma City, Oklahoma. These data were obtained from the National Climatic Data Center (NCDC) using the Hierarchical Data Storage System (HDSS) Access System (HAS). The radar reflectivity data for the lowest elevation angle were interpolated onto a 0.01° latitude × 0.01° longitude grid and a light spatial smoothing was applied by the Java NEXRAD Data Exporter. The second data source was the automated surface observations from the Oklahoma Mesonet, which is a statewide network of over 100 automated weather observing stations (with at least one per county), providing a relatively dense spatial coverage (Brock et al. 1995). Only 66 stations within a 160-km radius of KTLX were used, so that range effects (beam spreading and beam height above surface) would be minimized. Each Mesonet observation contained the accumulation since 0000 UTC, surface temperature, relative humidity, pressure, and wind. Both the radar data and the Mesonet data were recorded at approximately 5-min intervals.

The times for the data records were obtained by using, as a guide, the major precipitation events in central Oklahoma listed in the Severe Thunderstorm Event Index from the Storm Prediction Center and the Storm Events Database from the NCDC. The events span 1995 to 2008, occur in all seasons, and represent the gamut of precipitation types. To mitigate selection bias, the data obtained did not exclude light-to-moderate precipitation events.

For the following, *Z*(*x*, *y*, *t*) denotes radar reflectivity, indicating its spatial and temporal dimensions. The Mesonet data containing all variables recorded by a station are denoted as **M**(*s*, *t*), where *s* is an index for the station. The location (*x*, *y*) of each station *s* is denoted as the array **x**(*s*).

Because the Mesonet records accumulation, the rainfall rate has to be calculated by finite difference using two observations **M**(*s*, *t* − *δt*) and **M**(*s*, *t*), taken before and after the radar observation *Z*[**x**(*s*), *t*]. The other observables (temperature, pressure, etc.) for a station were averaged between those two observations. These processed data (rainfall rate, averaged temperature, averaged pressure, etc.) are represented as **M̂**. Therefore, a data record in the training dataset is {*Z*[**x**(*s*), *t*], **M̂**(*s*, *t*)}.
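The construction of a processed record **M̂** from two bracketing station observations can be sketched as follows; the field names here are illustrative, not the actual Mesonet variable names.

```python
def rate_and_means(obs_before, obs_after):
    """Build a processed record from two observations bracketing a radar scan.

    Each observation is a dict with 'time' (s), 'accum' (mm since 0000 UTC),
    and other surface variables. The rainfall rate is a finite difference of
    the accumulations; every other field is averaged between the two times.
    """
    dt_hours = (obs_after["time"] - obs_before["time"]) / 3600.0
    record = {"rate": (obs_after["accum"] - obs_before["accum"]) / dt_hours}
    for key in obs_before:
        if key not in ("time", "accum"):
            record[key] = 0.5 * (obs_before[key] + obs_after[key])
    return record

before = {"time": 0, "accum": 10.0, "temp": 20.0, "pressure": 970.0}
after = {"time": 300, "accum": 10.5, "temp": 21.0, "pressure": 969.0}
# 0.5 mm over a 5-min interval corresponds to a 6.0 mm/h rate.
print(rate_and_means(before, after))
```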

The initial dataset was filtered so that the density of observations in the *Z*–*R* plane tended toward being uniform. Observations were binned in the *Z*–*R* plane, and a percentage *p*_{i} of the data in the *i*th bin were retained as

*p*_{i} = min[1, *C* *N*_{obs}/(*N*_{bins} *n*_{i})],     (4)

where *N*_{obs} is the total number of observations, *N*_{bins} is the total number of bins, and *n*_{i} is the number of observations for the *i*th bin. Here, *C* is a scaling parameter to control the amount of data that are retained. This effectively “balanced” the training dataset by eliminating redundant records, thereby protecting the models from being swamped during the training process. Using *C* = 5.0, there were 38 839 observations retained to train and test the models.
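A thinning pass that keeps a fraction *p*_{i} = min[1, *C·N*_{obs}/(*N*_{bins}·*n*_{i})] of each bin — one plausible form of the retention rule described above — can be sketched as:

```python
import random
from collections import Counter

def balance(binned_records, c=5.0, seed=42):
    """Thin records so the retained count per bin tends toward uniform.

    `binned_records` is a list of (bin_index, record) pairs. Each bin keeps
    a fraction p_i = min(1, c * n_obs / (n_bins * n_i)) of its records, so
    crowded bins are thinned hard while sparse bins survive intact.
    """
    rng = random.Random(seed)
    counts = Counter(b for b, _ in binned_records)
    n_obs, n_bins = len(binned_records), len(counts)
    kept = []
    for b, rec in binned_records:
        p = min(1.0, c * n_obs / (n_bins * counts[b]))
        if rng.random() < p:
            kept.append((b, rec))
    return kept

# A crowded bin (0) is thinned; a sparse bin (1) is kept in full.
data = [(0, i) for i in range(1000)] + [(1, i) for i in range(10)]
print(len(balance(data, c=0.5)))
```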

It is unnecessary for this study to use an independent testing dataset that closely represents the climatology of the region. A testing dataset would be needed if we were trying to determine the best model for operational use. This study is a data analysis to determine how much value surface data could contribute to a rainfall-rate estimation model.

### e. MLP configuration

This study used a supervised training approach for the MLP with gradient-descent back propagation to find the optimal weights and thresholds for the nodes. The Rumelhart et al. (1986) text provides an in-depth exploration of this method and its parameters. This training method is parameterized by a “learning rate,” which was set to 0.05, and a “momentum,” which was set to 0.3. The values for these parameters were determined during a sensitivity study of the model. The model’s cost function to minimize was the mean-squared error between the observed rainfall rate and the model-estimated rainfall rate. Last, the neural network was initialized with random values for its weights and thresholds.
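The gradient-descent update with momentum described above amounts to Δ*w* = −η ∂*E*/∂*w* + α Δ*w*_{prev}, with learning rate η = 0.05 and momentum α = 0.3. A minimal sketch of one such weight update:

```python
def momentum_step(weights, grads, velocity, lr=0.05, momentum=0.3):
    """One gradient-descent update with momentum:
    dw = -lr * grad + momentum * previous dw."""
    new_w, new_v = [], []
    for w, g, v in zip(weights, grads, velocity):
        dv = -lr * g + momentum * v
        new_v.append(dv)
        new_w.append(w + dv)
    return new_w, new_v

w, v = [1.0, -0.5], [0.0, 0.0]
# Two steps against a constant gradient; momentum compounds the second step.
w, v = momentum_step(w, [0.2, -0.4], v)
w, v = momentum_step(w, [0.2, -0.4], v)
print([round(x, 4) for x in w])
```

The momentum term damps oscillations in the error surface and speeds convergence along consistent gradient directions.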

The training process iterated for 3000 epochs, which was found to be sufficient for convergence. The results were fairly insensitive to training time: longer training yielded only diminishing improvements, while significantly shorter training (<1000 epochs) did not guarantee that the MLP achieved the desired function.

For the hidden portion of the MLP, the choice of structure was determined by searching for the MLP structure with the fewest nodes while still achieving the desired behavior. The key behaviors necessary for the model to exhibit were to estimate a zero precipitation rate for low reflectivity and to exponentially increase that rate for higher reflectivities. Starting with an overly complex configuration, the number of nodes was reduced until the model could no longer exhibit that behavior. This procedure resulted in a (4, 2) hidden structure. In other words, the first hidden layer had four nodes while the second hidden layer had two nodes. This structure is diagrammed in Fig. 1.

### f. Testing and evaluation

The focus of this study was to determine if surface data have an added value as input to a rainfall-rate estimation model. Four variations of the MLP model were examined. These models were named FullSet, SansWind, JustWind, and Reflect. The best fit (least sum of squares sense) of the traditional *Z*–*R* relation, named ZRBest, was used to put the MLP model results into context of traditional models. The models are listed in Table 1.

For each of the models, 50 training–testing iterations were performed. Each training–testing iteration consisted of training against a random 66% of the dataset and then testing against the remaining portion. The correlation, root-mean-square error (RMSE), and mean absolute error (MAE) were recorded for each model instance. These statistics were then bootstrapped to estimate the mean of those performance measures. A 90% confidence interval for the bootstrapped means was also calculated using an accelerated bias-corrected method (Efron and Tibshirani 1993). The change in the absolute error from one model to another model for each data record was also used to analyze model performance. Finally, a permutation test (Efron and Tibshirani 1993) was performed for the null hypothesis that the distribution of a performance measure for a model is the same as the distribution of the same measure for the Reflect model.
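The bootstrap and permutation procedures can be sketched as follows. For brevity, the bootstrap interval here is a plain percentile interval rather than the accelerated bias-corrected interval used in the study, and the sample values are invented for illustration.

```python
import random, statistics

def bootstrap_mean_ci(sample, n_boot=2000, alpha=0.10, seed=1):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_boot))
    lo = means[int(n_boot * alpha / 2)]
    hi = means[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

def permutation_p_value(a, b, n_perm=2000, seed=1):
    """Two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled, n_a = a + b, len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a])
                   - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Invented per-instance RMSE values for two hypothetical models.
rmse_full = [7.8, 8.0, 8.1, 7.9, 8.0, 7.95]
rmse_reflect = [8.4, 8.5, 8.3, 8.45, 8.4, 8.5]
print(bootstrap_mean_ci(rmse_full))
print(permutation_p_value(rmse_full, rmse_reflect))
```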

## 3. Results and discussion

The rainfall rates estimated by the FullSet model (aggregated over 50 instances) are plotted as a function of reflectivity in the left panel of Fig. 2. For comparison, the rainfall rates as calculated by the Reflect model are shown in the right panel. The gray points in these graphs are the observed rainfall rate as a function of observed radar reflectivity. The black dots are the models’ rainfall-rate estimate as a function of the observed radar reflectivity. Figure 2b is a good exemplar of the insufficiency of the one-dimensional *Z*–*R* relationship, even when using an ANN. While the fit is good, it could never represent what a multidimensional model could represent, as shown in Fig. 2a.

It is useful to visualize how rainfall-rate estimation changes while augmenting the model with input data. By noting the rainfall-rate estimate for each data record by each model, the change in the estimation error between two models can be calculated. Figure 3 is a plot of the improvement in error as a function of reflectivity for each data record between the Reflect and FullSet models. The FullSet model experienced an average error improvement of 0.17 mm h^{−1} per data record over the Reflect model. Note that these average error improvements are highly damped by the minuscule changes in estimates for reflectivities below 20 dB*Z*.

Figure 4 shows how the FullSet and Reflect MLP models compare with the observed rainfall rate. The line in both graphs is the model’s goal. For all MLP models fitted to the training data (including those not shown), there is a characteristic underestimation of high rainfall-rate cases, and some overestimation of low rainfall-rate cases.

Figure 5 depicts the bootstrapped mean of the performance measures, along with its 90% accelerated bias-corrected confidence interval for each model. The mean correlation coefficient for the FullSet model was 0.73, while the mean correlation for the Reflect model was 0.69. The FullSet model also did better with a mean root-mean-square error of 7.97 mm h^{−1} versus the Reflect model’s 8.43 mm h^{−1}. According to the root-mean-square error and the correlation, the higher-dimensional models performed better than the lower-dimensional models. However, for mean absolute error, the difference between the FullSet and Reflect models was considerably smaller, at 4.06 and 4.23 mm h^{−1}, respectively. Examining further, the distribution of the mean absolute error statistic for the MLP models had a long right tail toward high values. This would cause the mean of this statistic and its error bars to be overestimated for the MLP models compared to the ZRBest model.

The *p* values for the permutation test comparing the performance measures of the models to those of the Reflect model are shown in Table 2. Values closer to zero indicate greater confidence in rejecting the null hypothesis that the performance measures are statistically the same. In other words, larger *p* values indicate that there is little statistical difference between the model and the Reflect model. Because the mean correlations for the augmented MLP models were greater than for the Reflect model, the low *p* values indicate not only that they were statistically different, but that those models were better than the Reflect model. The same was also true for the root-mean-square error, with the possible exception of the JustWind model. The mean absolute error left a nontrivial amount of doubt as to whether the FullSet model is statistically different from the Reflect model, and it is likely that the JustWind model is the same as the Reflect model in the MAE sense.

## 4. Conclusions

The attempt to use surface-based information together with radar reflectivity in a neural network resulted in an improvement in rainfall-rate estimation over an unaugmented ANN. All of the MLP models, though, tended to underestimate the rainfall rate at the mid-to-upper reflectivity range. This appears to be due to the several records of low rainfall rate at around 50 dB*Z* and above (visible in Fig. 2b), which would cause difficulties for error minimization routines. Given that there were so few cases of high rainfall rate available for training, and that there were more cases of low rainfall rate than not at the highest reflectivities, it should be expected for a function to be “pulled down” by these data points during the training process.

In addition to the problems with the high-reflectivity cases, there was also an issue with the overwhelming number of low rainfall-rate cases spread out across the reflectivity domain. Using the entire dataset is not practical because of time and hardware constraints, so the dataset was filtered. Early attempts overfiltered the dataset, which resulted in so many cases of low rainfall rate being removed that the MLP models were not “anchored” to zero rainfall rate. Overfiltering the data points also resulted in unrealistic coefficients for the best fit *Z*–*R* model. It was then found that with moderate filtering, which allowed more cases of low rainfall rates to remain, both the best fit *Z*–*R* model and the MLP models began having the expected behaviors in the *Z*–*R* plane. Unfortunately, retaining predictable data records swamps the performance measures, which depresses the differences in performance measures between the models.

Because of the number of high-reflectivity but light-rainfall observations in the dataset, a quality control assessment should be performed. There was no quality control performed beyond that done by NEXRAD and Mesonet. Possible explanations for these observations are melting precipitation near the surface, ground clutter, and a small, isolated storm passing near, but not over, a Mesonet station. Additional study of the dataset is needed to determine the cause of these cases.

To improve the training of the MLP, more cases of heavy precipitation (reflectivity greater than 40 dB*Z*) are needed. In addition, the dataset needs better quality control to remove potential contamination of the training process. Given that the training data could only use ground-based information to act as a proxy for conditions that would affect the DSD, the amount of improvement that the full MLP produced in its rainfall-rate estimates is remarkable. Augmenting the MLP models with surface parameters demonstrably improved the rainfall-rate estimation.

## Acknowledgments

This work was primarily supported by NOAA/NSSL under Cooperative Agreement NA17RJ1227. We greatly appreciate the discussions and inputs from George Young, Valliappa Lakshmanan, and Kim Elmore. We also thank the reviewers for their valuable input and help while revising this paper.

## REFERENCES

Bohren, C. F., and B. A. Albrecht, 1998: *Atmospheric Thermodynamics*. Oxford University Press, 402 pp.
Brock, F. V., K. C. Crawford, R. L. Elliott, G. W. Cuperus, S. J. Stadler, H. L. Johnson, and M. D. Eilts, 1995: The Oklahoma Mesonet: A technical overview. *J. Atmos. Oceanic Technol.*, **12**, 5–19.
Chow, T. W. S., and S. Y. Cho, 1997: Development of a recurrent Sigma-Pi neural network rainfall forecasting system in Hong Kong. *Neural Comput. Appl.*, **5**, 66–75.
Doviak, R. J., 1983: A survey of radar rain measurement techniques. *J. Climate Appl. Meteor.*, **22**, 832–849.
Efron, B., and R. Tibshirani, 1993: *An Introduction to the Bootstrap*. 1st ed. Chapman & Hall/CRC, 436 pp.
French, M. N., W. F. Krajewski, and R. R. Cuykendall, 1992: Rainfall forecasting in space and time using a neural network. *J. Hydrol.*, **137** (1–4), 1–31.
Haupt, S. E., A. Pasini, and C. Marzban, 2009: *Artificial Intelligence Methods in the Environmental Sciences*. 1st ed. Springer, 424 pp.
Hsu, K., X. Gao, S. Sorooshian, and H. V. Gupta, 1997: Precipitation estimation from remotely sensed information using artificial neural networks. *J. Appl. Meteor.*, **36**, 1176–1190.
Kuligowski, R. J., and A. P. Barros, 1998a: Localized precipitation forecasts from a numerical weather prediction model using artificial neural networks. *Wea. Forecasting*, **13**, 1194–1204.
Kuligowski, R. J., and A. P. Barros, 1998b: Using artificial neural networks to estimate missing rainfall data. *J. Amer. Water Resour. Assoc.*, **34**, 1437–1447.
Lakshmanan, V., A. Fritz, T. Smith, K. Hondl, and G. Stumpf, 2007: An automated technique to quality control radar reflectivity data. *J. Appl. Meteor. Climatol.*, **46**, 288–305.
Liu, H., V. Chandrasekar, and G. Xu, 2001: An adaptive neural network scheme for radar rainfall estimation from WSR-88D observations. *J. Appl. Meteor.*, **40**, 2038–2050.
Maier, H. R., and G. C. Dandy, 2000: Neural networks for the prediction and forecasting of water resources variables: A review of modelling issues and applications. *Environ. Modell. Software*, **15**, 101–124.
Rivolta, G., F. S. Marzano, E. Coppola, and M. Verdecchia, 2006: Artificial neural-network technique for precipitation nowcasting from satellite imagery. *Adv. Geosci.*, **7**, 97–103.
Rogers, R. R., and M. K. Yau, 1989: *A Short Course in Cloud Physics*. 3d ed. Butterworth-Heinemann, 304 pp.
Root, B., P. Knight, G. Young, S. Greybush, R. Grumm, R. Holmes, and J. Ross, 2007: A fingerprinting technique for major weather events. *J. Appl. Meteor. Climatol.*, **46**, 1053–1066.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams, 1986: Learning internal representations by error propagation. *Parallel Distributed Processing: Explorations in the Microstructure of Cognition*, Vol. 1, D. E. Rumelhart and J. L. McClelland, Eds., MIT Press, 318–362.
Seliga, T. A., and V. N. Bringi, 1976: Potential use of radar differential reflectivity measurements at orthogonal polarizations for measuring precipitation. *J. Appl. Meteor.*, **15**, 69–76.
Trafalis, T. B., M. B. Richman, A. White, and B. Santosa, 2002: Data mining techniques for improved WSR-88D rainfall estimation. *Comput. Ind. Eng.*, **43**, 775–786.
Tsintikidis, D., J. Haferman, E. Anagnostou, W. Krajewski, and T. Smith, 1997: A neural network approach to estimating rainfall from spaceborne microwave data. *IEEE Trans. Geosci. Remote Sens.*, **35**, 1079–1093.
Ulbrich, C. W., 1983: Natural variations in the analytical form of the raindrop size distribution. *J. Climate Appl. Meteor.*, **22**, 1764–1775.
van den Dool, H. M., 1994: Searching for analogues, how long must we wait? *Tellus*, **46A**, 314–324.
Venkatesan, C., S. D. Raskar, S. S. Tambe, B. D. Kulkarni, and R. N. Keshavamurty, 1997: Prediction of all India summer monsoon rainfall using error-back-propagation neural networks. *Meteor. Atmos. Phys.*, **62**, 225–240.
Waldvogel, A., 1974: The *N*_{0} jump of raindrop spectra. *J. Atmos. Sci.*, **31**, 1067–1078.
Witten, I. H., and E. Frank, 2005: *Data Mining: Practical Machine Learning Tools and Techniques*. 2d ed. Morgan Kaufmann, 525 pp.
Xiao, R., and V. Chandrasekar, 1997: Development of a neural network based algorithm for rainfall estimation from radar observations. *IEEE Trans. Geosci. Remote Sens.*, **35**, 160–171.

Table 1. List of models and the input variables used: temperature (*T*), relative humidity (RH), pressure (*P*), wind (*W*) (both components), and reflectivity (*Z*).

Table 2. The *p* values for testing whether the mean of a performance measure for a model is statistically the same as that for the Reflect model. Lower *p* values indicate a higher degree of confidence for rejecting that hypothesis.