## 1. Introduction

Operational and experimental measurements of rainfall are commonly based on tipping-bucket (TB) rain gauges. They are convenient and reliable tools; however, like all other field sensors, they are subject to systematic and random instrumental errors. The systematic errors, including flaws in the gauge calibration, wind-induced undercatch, or wetting–evaporation losses, have been studied extensively and are fairly well recognized (Sevruk 1985; Humphrey et al. 1997; Habib et al. 1999). In this study, we focus on uncertainties that manifest themselves as random differences between closely collocated TB rain gauges. We call these discrepancies “local random errors” to distinguish them from the area-point representativeness problems that are important in areal rainfall estimation based on rain gauges. These local differences can originate from several sources, such as the time-sampling effect caused by the discrete character of the TB measurements, hydrodynamic water flow instabilities in the gauge funnels, and differences in the wind effect caused by turbulent airflow around the rain gauges. The time-sampling errors were analyzed by Habib et al. (2001) using a data-driven simulation model of the TB measurements. They demonstrated considerable magnitudes of these errors at short timescales as well as their strong dependence on the data collection and preprocessing strategies. Here, we apply a fully empirical approach based on the data from a cluster of 15 collocated TB rain gauges. We assume that averaging of these 15 measurements sufficiently smoothes out the random errors so that the averages can be treated as fairly good approximations of the true local rainfall rates. We define the local random error as a measure of the discrepancies between single-gauge measurements and the 15-gauge average. We investigate the dependences of these errors on rainfall intensity and timescale (rainfall averaging, or accumulation time). 
An analytical formula is fitted to describe the error standard deviation as a function of these two variables. Two commonly used data collecting/processing strategies are considered. In the first, we construct the interval-averaged rain rates using linear interpolation between the tip times. The other is more common in operational practice and simply counts the number of tips in each interval. We demonstrate differences between the two schemes and their dependence on the rainfall rate and accumulation interval.

Information about the level and the structure of the local errors in rain gauges can be important for several applications. In hydrometeorological practice, the point rain gauge measurements often have to be extended over the entire area covered by a gauge network. Different interpolation schemes are used for this purpose. Those that have a sound statistical basis can use the information on the local errors to improve the interstation rainfall estimates (Seo 1990a,b). At the small spatial scales characteristic of high-resolution radar rainfall measurements, the local uncertainties combine with the rain gauge representativeness errors and contribute to the discrepancies between the available surface measurements and the desired true area-averaged rainfall. Fundamental problems in the radar–rain gauge comparisons that arise from the uncertainties of the ground reference are discussed in several studies (e.g., Zawadzki 1975; Kitchen and Blackall 1992; Ciach and Krajewski 1999a,b). Possible solutions to these problems need to be based on a thorough insight into all aspects of the rain gauge errors. A discussion about possible cost-efficient improvements of local rainfall measurements is provided in the last section of this study.

## 2. Experimental setup and data sample

The data sample that we use in this study was collected in Oklahoma during the period from 28 September through 20 December 1999. The cluster of 15 TB rain gauges was deployed over an area of about 8 m × 8 m within the field station operated by the U.S. Department of Agriculture (USDA) Agricultural Research Service in Chickasha, Oklahoma. The main purpose of this setup was to evaluate the rain gauge that is used in the Tropical Atmosphere–Ocean (TAO) automatic weather stations deployed by the National Oceanic and Atmospheric Administration (NOAA)/Pacific Marine Environmental Laboratory (PMEL) (McPhaden 1995). Our rain gauge cluster was used as a reliable reference for this comparison. Here we use the data collected in this experiment to evaluate uncertainties in our rain gauges. A picture of the cluster together with the TAO station is presented in Fig. 1. The accumulated depth of the rainfall received during the above collection period was about 138 mm. The gauges shown in Fig. 1 were the beginning of the “PicoNet,” a semimobile system designed for precise small-scale rainfall measurements (Ensworth et al. 1999; see also the PicoNet Web page at www.evac.ou.edu/piconet) developed and operated by the Environmental Verification and Analysis Center (EVAC) of the University of Oklahoma. The rain gauges (model 380C manufactured by MetOne, Inc.) used in the PicoNet are the same as those used in the Oklahoma Mesonet (Brock et al. 1995). The diameter of their orifice is 30.5 cm (12 in.) and their resolution is 0.254 mm (0.01 in.) of rainfall depth. Static and dynamic calibrations of the gauges were performed with the help of Oklahoma Climatological Survey personnel who allowed us to use the equipment in the Mesonet Laboratory.

The rain gauges are equipped with individual small dataloggers (HOBO Event loggers manufactured by the Onset Computer Corporation) that can record the bucket tip times with a resolution of 0.5 s. Each logger can save up to 2000 tip times, and these records can be downloaded to a personal computer via a standard serial interface. The high time resolution of the HOBO loggers, however, is not matched by the accuracy of the time measurements. According to our tests, all the dataloggers show systematic positive time drifts ranging from about 17 s to about 30 s week^{–1}, depending on the logger. Our data sample was collected during two periods approximately 6 weeks long, and the logger timers were set correctly at the start of each period. In this scenario, the differences in the individual drift rates accumulate to time differences of more than 1 min between loggers by the end of a period. If uncorrected, this would add substantially to the local random errors at short timescales. Fortunately, the individual time errors per week are fairly constant for each datalogger, and we correct them assuming that the errors increase linearly in time at the rate characteristic of each datalogger. These rates were computed from the time drifts that we measured at the end of each data collection period while downloading the data from the loggers to a personal computer.
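The linear drift correction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, argument names, and the convention that all times are counted in seconds from the moment the logger clock was set are hypothetical and do not come from the study.

```python
import numpy as np

def correct_linear_drift(logged_times_s, drift_at_download_s, logged_period_s):
    """Remove a clock error that grows linearly from zero at logger start.

    logged_times_s      : tip times read from the logger, in seconds since
                          the logger clock was set (assumed convention).
    drift_at_download_s : logger time minus reference time at download
                          (positive when the logger clock runs fast).
    logged_period_s     : logger-clock time elapsed between setting the
                          clock and the download.
    """
    logged = np.asarray(logged_times_s, dtype=float)
    # Drift accumulated per second of logger time, characteristic of the logger.
    drift_rate = drift_at_download_s / logged_period_s
    return logged - drift_rate * logged

# Example: a logger that gained 150 s over a 6-week period (25 s per week).
week_s = 7 * 24 * 3600
corrected = correct_linear_drift([0.0, 3 * week_s + 75.0], 150.0, 6 * week_s + 150.0)
```

In this example the second tip, logged 75 s late at mid-period, is moved back to exactly 3 weeks after the start.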

The quasi-instantaneous rain rate between two consecutive tips is computed as

$$r(t) = \frac{V_b}{t_{i+1} - t_i}, \qquad t_i < t \le t_{i+1}, \tag{1}$$

where $V_b$ is the 1-tip bucket volume resolution equal to 0.254 mm in our case, and $t_i$ and $t_{i+1}$ are the times of the two tips. Then, we used this piecewise-constant function of quasi-instantaneous rain rates to compute rainfall accumulations as a function of time for each rain gauge. From these accumulation functions we could compute interval-averaged rainfall intensities for any required timescale by dividing the accumulation in an interval by the interval length. In Fig. 2, an example of the results of this data preprocessing is presented for a fragment of a storm that occurred on 9 December 1999. The plot shows the time series of 1-min rainfall rates averaged over all 15 rain gauges together with all the single-gauge rain rates. One can see that the relative differences between the rain gauges are especially pronounced for low rainfall intensities at the end of the presented time period. Figure 3 presents the tip times for each of the 15 gauges over the last 30 min of the above example. This schematic illustrates qualitatively the relation between the tips of different gauges in the cluster and the interpolation results shown in Fig. 2.
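A minimal sketch of this interpolation scheme follows. A piecewise-constant rain rate between tips is equivalent to an accumulation curve that rises linearly by one bucket depth from each tip to the next, so `numpy.interp` can be used directly. The function names and the treatment of times outside the tip record (no rain is attributed before the first tip or after the last one) are assumptions of this sketch, not details taken from the study.

```python
import numpy as np

V_B_MM = 0.254  # one-tip depth resolution quoted in the text

def accumulation_mm(tip_times_s, query_times_s, v_b_mm=V_B_MM):
    """Accumulated depth at the query times, linear between consecutive tips."""
    tips = np.asarray(tip_times_s, dtype=float)   # must be sorted
    depth_at_tips = v_b_mm * np.arange(len(tips))  # zero at the first tip
    # np.interp holds the end values constant outside the tip record.
    return np.interp(query_times_s, tips, depth_at_tips)

def interval_rates_mm_per_h(tip_times_s, t_start_s, interval_s, n_intervals,
                            v_b_mm=V_B_MM):
    """Interval-averaged rain rates: accumulation in each interval / its length."""
    edges = t_start_s + interval_s * np.arange(n_intervals + 1)
    acc = accumulation_mm(tip_times_s, edges, v_b_mm)
    return np.diff(acc) / interval_s * 3600.0

# One tip per minute corresponds to 0.254 mm/min, i.e. 15.24 mm/h.
rates = interval_rates_mm_per_h([0, 60, 120, 180], 0, 60, 3)
```

Because the accumulation is evaluated at arbitrary interval edges, the same two functions serve any timescale, which mirrors the "any required timescale" property noted in the text.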

## 3. Analysis of the local random errors

For a given timescale $T$, we express the local rain gauge errors using the formula that quantifies relative departures of single-gauge measurements from the 15-gauge average:

$$e_{1,T} = \frac{R_{1,T} - R_{15,T}}{R_{15,T}}, \tag{2}$$

where $e_{1,T}$ is a random variable representing the local errors of single rain gauges, variable $R_{1,T}$ denotes interval-averaged rainfall rates measured by single gauges in any interval $T$, variable $R_{15,T}$ is an average of the 15 single-gauge rain rates in the same interval that approximates the true local rainfall, and the subscript $T$ indicates the timescale equivalent to the length of the averaging (or accumulation) interval. The error analyses in this section are applied to single-gauge rain rates $R_{1,T}$ computed using the interpolation scheme based on Eq. (1). These $R_{1,T}$ values were subsequently used to produce the 15-gauge average denoted by $R_{15,T}$. From Eq. (2) and the definition of $R_{15,T}$, the average of $e_{1,T}$ values for each given $R_{15,T}$ has to be equal to zero, which means that the local random errors as defined here are bias free. In Fig. 4, we present an example scattergram of the error values given by Eq. (2) as a function of the true local rain rates for the 5-min timescale. The plot shows qualitatively a large scatter of the relative errors for small rain rates and a fairly rapid drop with increasing rainfall intensities. However, even for the highest 5-min rain rates in our sample, some residual scatter still persists. To describe these features in a more quantitative manner, we performed a nonparametric regression estimation of the standard deviations of the local random errors $e_{1,T}$ as a function of the true local rainfall intensities.
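The relative error definition can be illustrated with a short sketch. The array layout (one row per time interval, one column per gauge) and the choice to drop intervals whose cluster average is zero are assumptions made here for illustration only.

```python
import numpy as np

def relative_errors(rates):
    """Relative departures of single-gauge rates from the cluster average.

    rates : array of shape (n_intervals, n_gauges) with interval-averaged
            rain rates. Intervals with a zero cluster average are dropped
            because the relative error is undefined there.
    Returns (r_ref, errors): the cluster averages and the matching errors.
    """
    rates = np.asarray(rates, dtype=float)
    mean = rates.mean(axis=1)
    wet = mean > 0.0
    r_ref = mean[wet]
    errors = (rates[wet] - r_ref[:, None]) / r_ref[:, None]
    return r_ref, errors

# Three gauges, three intervals (the second interval is dry).
r_ref, errors = relative_errors([[9.0, 10.0, 11.0],
                                 [0.0, 0.0, 0.0],
                                 [4.0, 6.0, 8.0]])
```

Within each retained interval the errors average to zero by construction, which is the bias-free property stated in the text.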

For each timescale $T$, the nonparametric estimate of the local standard error (standard deviation of the error $e_{1,T}$) at a given local rain rate value $R_T$ was computed using the Nadaraya–Watson kernel regression estimator. Below we apply it to the second statistical moment of $e_{1,T}$:

$$\sigma_k^2(R_T) = \frac{\sum_i k\!\left(\dfrac{R_T - R_{15,T}^{(i)}}{h}\right) \left(e_{1,T}^{(i)}\right)^2}{\sum_i k\!\left(\dfrac{R_T - R_{15,T}^{(i)}}{h}\right)}, \tag{3}$$

where the nonnegative function $k(\cdot)$ is called the smoothing kernel, and $h$ is called the estimation bandwidth. The subscript in $\sigma_k$ denotes the kernel regression value of the standard error, which is just the square root of the result based on Eq. (3). The summations in the numerator and denominator are over all the observed data points (indexed here by $i$) for which the kernel function $k(\cdot)$ assumes nonzero values. In the above expression, we also took advantage of the already stated fact that, for each $R_{15,T}$, conditional averages of the $e_{1,T}$ values are equal to zero. The kernel regression used here is the oldest and the most popular nonparametric regression method. It has been used in applied statistics since 1964 (Hardle 1990) and we chose it for this study because it is intuitive and easy to implement. Conceptually, it follows the formulation of the regression problem, in which a regression curve is defined by conditional averages of the response variable conditioned on the values of the prediction variable (Bickel and Doksum 1977). In practice, to realize this idea for finite data samples, the conditional averaging has to be performed over some neighborhood of each predictor value. The relative contribution of the closer versus farther neighbors is determined by the shape of the kernel $k(\cdot)$, whereas the size of the neighborhood is determined by the bandwidth $h$. In contrast to parametric regression, the nonparametric methods enable estimating the shape of a regression function (curve, in the univariate case considered here) without any assumptions about its specific functional model, or about the distribution of the errors. This makes them very efficient tools for exploratory data analysis when little prior information is available. The details of the kernel regression are described in many textbooks (e.g., Hardle 1990; Simonoff 1996). In Eq. (3), we applied this method to regress the squares of the single-gauge errors $e_{1,T}$ on the true local rainfall intensities.

In our analysis, we used the Epanechnikov kernel:

$$k(x) = \begin{cases} \tfrac{3}{4}\left(1 - x^2\right), & |x| \le 1, \\ 0, & |x| > 1. \end{cases} \tag{4}$$

In Eq. (3), this kernel is scaled to the $[-h, h]$ interval by the bandwidth $h$, and its center is shifted to the given rain-rate value $R_T$; this way it defines the weighted averaging neighborhood around this rain rate. Hardle (1990) shows that the Epanechnikov kernel has asymptotic optimality properties that make it a reasonable choice for our purposes. The selection of the bandwidth is always a compromise between the degree of smoothing of the data scatter and the wish to preserve fine features of the regression curve. It is also dependent on the size of the available data sample because, for the kernel-determined neighborhood of each considered rain-rate value, one needs to average a considerable number of data points to make the results statistically stable. Technical details about various systematic bandwidth selection methods are discussed in the above-mentioned monographs, as well as in numerous statistical papers. In this study, we applied a more heuristic approach based on visual inspection of the regression outcomes for different selections of the bandwidth. As a result, we used a variable bandwidth proportional to the rainfall intensity, $h = aR_T$, to partially compensate for the drop in the number of data points with increasing rain rate. The value of $a = 0.2$ was selected as a satisfactory trade-off between oversmoothing and undersmoothing of the estimated regression curves.
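The kernel estimate of the standard error with a rain-rate-proportional bandwidth can be sketched as follows. The function names and the flat arrays of paired values (one cluster-average rain rate per single-gauge error) are illustrative assumptions, and the kernel is written in its standard textbook form; the normalizing constant cancels in the ratio in any case.

```python
import numpy as np

def epanechnikov(u):
    """Standard Epanechnikov kernel, nonzero only for |u| < 1."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def kernel_std_error(r_eval, r_ref, errors, a=0.2):
    """Nadaraya-Watson estimate of the error standard deviation.

    r_eval : positive rain rates at which the estimate is wanted.
    r_ref  : cluster-average rain rate paired with each error value.
    errors : relative single-gauge errors (zero conditional mean assumed).
    a      : bandwidth factor, so that h = a * r at each evaluation point.
    """
    r_eval = np.atleast_1d(np.asarray(r_eval, dtype=float))
    r_ref = np.asarray(r_ref, dtype=float)
    sq_err = np.asarray(errors, dtype=float) ** 2
    sigma = np.full(r_eval.shape, np.nan)
    for j, r in enumerate(r_eval):
        weights = epanechnikov((r - r_ref) / (a * r))
        total = weights.sum()
        if total > 0.0:  # leave NaN where no data fall inside the window
            sigma[j] = np.sqrt(np.sum(weights * sq_err) / total)
    return sigma

# Synthetic check: errors of constant magnitude 0.1 give sigma = 0.1 everywhere.
r_ref = np.linspace(1.0, 20.0, 200)
errors = 0.1 * np.where(np.arange(200) % 2 == 0, 1.0, -1.0)
sigma = kernel_std_error([5.0, 10.0], r_ref, errors)
```

Returning NaN where the window is empty is a design choice of this sketch; it keeps sparse high-intensity regions from producing misleading values.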

Figure 5 shows the estimated standard errors $\sigma_k$ as a function of the true local rainfall intensities for three averaging intervals: 5 min, 15 min, and 1 h. As expected, the standard errors are more substantial for shorter averaging times and smaller rainfall intensities. For example, they reach about 10% at 5 mm h^{–1} rain rates and about 6% at 10 mm h^{–1}, for the 5-min scale. To describe these results in a concise mathematical way, we searched for the simplest analytical model that can approximate them with sufficient accuracy for all considered timescales. After examining several monotonically decreasing functions, we concluded that the standard errors $\sigma_k$ can be represented using the following formula:

$$\sigma_m(T, R_T) = e_0(T) + \frac{R_0(T)}{R_T}, \tag{5}$$

where $e_0$ and $R_0$ are the model coefficients that depend on the timescale $T$, and the subscript in $\sigma_m$ denotes the analytical model values of the standard errors. Parameter $R_0$ in this model determines the scale at which the standard error drops with increasing $R_T$, whereas $e_0$ is the residual standard error at high rainfall rates. We estimated the coefficients $e_0$ and $R_0$ through minimization of the mean absolute differences between the model values $\sigma_m$ and the nonparametric regression values $\sigma_k$. We confirmed that using models with more parameters than in Eq. (5) does not improve the accuracy of the approximation significantly. The solid lines in Fig. 5 represent the analytical approximations $\sigma_m$ for the three corresponding timescales. The estimated values of the coefficients in Eq. (5), as a function of timescale, are shown in the upper panels of Fig. 6. Both of them drop gradually with increasing $T$, and the residual error $e_0$ seems to be worth consideration only for timescales of 5 min and below. The results in the upper panels of Fig. 6, together with Eq. (5), can be used to estimate the standard deviations of the local random errors in TB rain gauges in situations in which the instruments have basic technical specifications similar to those used in our study and when the collected tip times are processed using the interpolation scheme based on Eq. (1).
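One simple way to carry out a least-absolute-deviation fit of this two-coefficient model is sketched below. The text does not say which minimization procedure was used, so the grid search over the scale coefficient is an assumption of this sketch, as are the function name and the synthetic test values. For a fixed scale coefficient, the offset that minimizes the mean absolute difference is the median of the residuals, which reduces the problem to a one-dimensional search.

```python
import numpy as np

def fit_error_model(rain_rates, sigma_k, r0_grid):
    """Fit sigma = e0 + R0 / R by minimizing the mean absolute difference.

    rain_rates : rain rates at which sigma_k was estimated (positive).
    sigma_k    : nonparametric standard-error estimates at those rates.
    r0_grid    : candidate values of the scale coefficient R0.
    Returns (e0, R0) for the best candidate on the grid.
    """
    rain_rates = np.asarray(rain_rates, dtype=float)
    sigma_k = np.asarray(sigma_k, dtype=float)
    best = None
    for r0 in r0_grid:
        resid = sigma_k - r0 / rain_rates
        e0 = np.median(resid)              # optimal offset for this R0
        cost = np.mean(np.abs(resid - e0))
        if best is None or cost < best[0]:
            best = (cost, e0, r0)
    return float(best[1]), float(best[2])

# Synthetic check with made-up coefficients (not values from the study).
R = np.linspace(1.0, 30.0, 50)
e0_hat, r0_hat = fit_error_model(R, 0.01 + 0.5 / R, np.linspace(0.0, 2.0, 2001))
```

The resolution of the result is limited by the grid spacing; a finer grid or a bracketing search could be substituted if higher precision is needed.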

## 4. Errors of the tip-counting scheme

In the tip-counting scheme, the interval-averaged rain rate is obtained directly from the number of tips recorded in each interval:

$$R_{1,T}(t) = \frac{V_b\, N(t, t+T)}{T}, \tag{6}$$

where $t$ is an arbitrary time point during the observation period, $N(t, t+T)$ is the number of tips in the interval $[t, t+T]$, and $V_b$ is the same one-tip bucket volume resolution as in Eq. (1). Such a simple tip-counting scheme must result in much larger instrumental errors than the more precise interpolation between tip times that we discussed in the previous sections. This loss in rainfall measurement accuracy can be evaluated empirically using our local random error approach. First, we computed the new single-gauge rain rates $R_{1,T}$ based on Eq. (6) and substituted them into Eq. (2) to obtain the new values of the relative errors. In this scenario, the 15-gauge averages $R_{15,T}$ in Eq. (2) were still based on the time-interpolated data so that the maximum accuracy of the approximation of the true local rainfall was maintained. Next, the nonparametric estimates of the standard deviations of the errors were obtained based on Eq. (3) and the coefficients in Eq. (5) were estimated. Examples of these results are presented in Fig. 7 for the same three averaging intervals as in Fig. 5. For short timescales, the errors are a few times larger than in the case in which the rain rates were obtained using the interpolation scheme. For example, they reach about 16% at around 10 mm h^{–1} rain rates, for the 5-min scale. The estimated values of the coefficients in Eq. (5), as a function of timescale, are shown in the lower panels of Fig. 6. We can see that the estimates of $e_0$ are fairly close to the values in the upper panels. However, the estimates of the $R_0$ coefficient are now much larger than for the rainfall rates computed with the interpolation scheme. Using the coefficient values in Fig. 6 and the standard error model given by Eq. (5), one can evaluate the advantages of the interpolation scheme over the crude tip-counting approach for all rainfall intensities and accumulation intervals. An example of such a comparison is presented in Fig. 8 for the interval-averaged rain rates of 10 mm h^{–1}. For this moderate rainfall intensity, the tip-counting errors are larger than the interpolation errors by a factor of about 3 for the 5-min averaging time, and by a factor of about 2 for the 15-min averaging time. At the hourly timescale both errors become practically negligible. One striking feature of the two standard errors in Fig. 8 is that in the log–log scale they drop almost perfectly linearly, with break points at about a 15-min timescale. This is evidence of temporal self-similarity (between break points) that is sometimes apparent in the observed rainfall (Lovejoy and Schertzer 1990). Here it is clearly a characteristic feature of the uncertainties in rainfall measurements. Such scaling properties can probably be applied to construct a better model of the error structure that could allow accounting for the measurement uncertainties in the rainfall analyses; however, further exploration of this issue is beyond the scope of this study.
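For comparison with the interpolation scheme, the tip-counting rain rate can be sketched in a few lines. The function name and the fixed, contiguous interval layout are assumptions of this sketch. Note also that `numpy.histogram` treats every bin except the last as half-open, which differs slightly from the closed interval used in the text when a tip falls exactly on an interval boundary.

```python
import numpy as np

def tip_count_rates_mm_per_h(tip_times_s, t_start_s, interval_s, n_intervals,
                             v_b_mm=0.254):
    """Interval-averaged rain rates from the number of tips in each interval."""
    edges = t_start_s + interval_s * np.arange(n_intervals + 1)
    # Tips outside [edges[0], edges[-1]] are ignored by np.histogram.
    counts, _ = np.histogram(tip_times_s, bins=edges)
    return counts * v_b_mm / interval_s * 3600.0

# Four tips, three 1-min intervals starting at t = 0 (the tip at 200 s is
# outside the covered period and is not counted).
rates_tc = tip_count_rates_mm_per_h([10, 70, 80, 200], 0, 60, 3)
```

Each interval can only take rates that are whole multiples of one bucket depth per interval (15.24 mm h^{–1} for 1-min intervals with a 0.254-mm bucket), which illustrates why this scheme is much coarser than the interpolation at short timescales.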

## 5. Conclusions

In this study we presented an analysis of the empirical estimates of local random errors in TB rain gauges. We showed that these errors are substantial in the cases in which precise measurements of short-time rainfall intensities are required. The magnitude of the rain gauge errors is highly dependent on the local rainfall intensity and the timescale. Nonparametric regression was applied to obtain the estimates of the standard errors as a function of these two variables. We also demonstrated a strong dependence of the local errors on the way in which the TB data are collected and processed. A scheme based on rainfall interpolation between the tip times was compared with the simple tip counting used in many operational setups. We showed that the latter scheme results in much larger errors, especially for shorter timescales. A simple analytical model of the standard deviations of the errors, described by Eq. (5), can be used to obtain estimates of the standard errors for any averaging time and rainfall intensity. The values of the model coefficients for the two data collecting/processing schemes considered here are given in Fig. 6. However, these results are directly applicable only to the TB rain gauges that have basic technical characteristics similar to those used in our study, and their generalization to other instruments might not be straightforward. Comparison of our results with the TB sampling error estimates in Habib et al. (2001) shows a high degree of consistency between the two studies. For example, at moderate rainfall intensities of 10 mm h^{–1}, Habib et al. (2001) report relative standard errors of 6.4% for the 5-min and 2.3% for 15-min timescales, whereas we obtained 4.9% and 2.9%, respectively. The discrepancies can be attributed to the differences in the analysis methods, as well as to the estimation errors due to limited data samples. 
This rather small magnitude of differences between the two studies might indicate that the empirical local random errors in the TB rain gauges are mostly caused by the bucket sampling effect and that the contributions of other possible sources of the local errors are much smaller.

For most of the currently available collections of rainfall data, the random errors analyzed here are unknown and are usually ignored by the users. As a rule, the operational networks use single rain gauges at the stations and their rainfall measurements have to be taken for granted, unless severe systematic anomalies in the data are noticed. In this situation the quality and credibility of the surface rainfall data are not high, which has serious negative consequences both in research and operational applications. For example, it is one of the factors prohibiting conclusive verification and validation of remote sensing rainfall products (Ciach and Krajewski 1999b; Steiner et al. 1999). A natural and inexpensive solution that could improve this situation is to build measurement networks that have two or more rain gauges on each station. This solution has already been postulated, for example, by Ciach and Krajewski (1999b) and by Steiner et al. (1999), and it is slowly gaining popularity in parts of the hydrometeorological community. The possibility of comparing two independent and collocated measurements dramatically improves early detection of the instrument failures. It is also efficient in the most common situations in which there is only a partial deterioration of the measurement accuracy that can pass unnoticed in the traditional single-gauge systems. Regarding the local random errors demonstrated in this study, averaging of two or more collocated rainfall measurements is the simplest way to reduce them. For example, using the four-gauge design would result in 2 times smaller standard errors. Additionally, comparison of the rain gauges would provide information on these errors, which could then be used by many modern applications that can account for the uncertainties.
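The factor of 2 quoted for the four-gauge design follows from a standard variance argument. As a sketch, assume (this is a simplification, not something tested in this study) that the local random errors $e_j$ of $n$ collocated gauges are mutually independent and share the same standard deviation $\sigma_1$; the symbols $e_j$ and $\sigma_n$ are notation introduced here for illustration. The standard error of the $n$-gauge average is then

$$\sigma_n = \sqrt{\operatorname{Var}\!\left(\frac{1}{n}\sum_{j=1}^{n} e_j\right)} = \sqrt{\frac{n\,\sigma_1^2}{n^2}} = \frac{\sigma_1}{\sqrt{n}}, \qquad \text{so that} \qquad \sigma_4 = \frac{\sigma_1}{2}.$$

If the errors of neighboring gauges were positively correlated, the reduction would be smaller than this estimate.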

## Acknowledgments

This work was supported by Oklahoma EPSCoR Grant NCC 5171 and NSF Grant EAR00-01249. This support is gratefully acknowledged. We also thank Dr. Mark Morrissey for his help in organizing the PicoNet system, John Ensworth for his work on the PicoNet, the personnel of the Oklahoma Climatological Survey for their help with preparation and calibration of the PicoNet rain gauges, and Dr. Witold Krajewski for his helpful remarks.

## REFERENCES

Bickel, P. J., and K. A. Doksum, 1977: *Mathematical Statistics: Basic Ideas and Selected Topics*. Holden-Day, 492 pp.

Brock, F. V., K. C. Crawford, R. L. Elliott, G. W. Cuperus, S. J. Stadler, H. L. Johnson, and M. D. Eilts, 1995: The Oklahoma mesonet: A technical overview. *J. Atmos. Oceanic Technol.*, **12**, 5–19.

Ciach, G. J., and W. F. Krajewski, 1999a: Radar-raingage comparisons under observational uncertainties. *J. Appl. Meteor.*, **38**, 1519–1525.

Ciach, G. J., and W. F. Krajewski, 1999b: On the estimation of radar rainfall error variance. *Adv. Water Resour.*, **22**, 585–595.

Ensworth, J. D., G. J. Ciach, M. L. Morrissey, and W. F. Krajewski, 1999: A pico-scale raingauge network for small spatial scale rainfall variability and its use to verify radar and satellite data. *Extended Abstracts, Fall-99 AGU Meeting,* San Francisco, CA, Amer. Geophys. Union.

Habib, E., W. F. Krajewski, V. Nespor, and A. Kruger, 1999: Numerical simulation studies of rain gauge data correction due to wind effect. *J. Geophys. Res.*, **104**, 723–734.

Habib, E., W. F. Krajewski, and A. Kruger, 2001: Sampling errors of tipping-bucket rain gauge measurements. *J. Hydrol. Eng.*, **6**, 159–166.

Hardle, W., 1990: *Applied Nonparametric Regression*. Cambridge University Press, 333 pp.

Humphrey, M. D., J. D. Istok, J. Y. Lee, J. A. Hevesi, and A. L. Flint, 1997: A new method for automated dynamic calibration of tipping-bucket rain gauges. *J. Atmos. Oceanic Technol.*, **14**, 1513–1519.

Kitchen, M., and R. M. Blackall, 1992: Representativeness errors in comparisons between radar and gauge measurements of rainfall. *J. Hydrol.*, **134**, 13–33.

Lovejoy, S., and D. Schertzer, 1990: Fractals, raindrops, and resolution dependence of rain measurements. *J. Appl. Meteor.*, **29**, 1167–1170.

McPhaden, M. J., 1995: The Tropical Atmosphere Ocean (TAO) array is completed. *Bull. Amer. Meteor. Soc.*, **76**, 739–741.

Seo, D-J., W. F. Krajewski, and D. S. Bowles, 1990a: Stochastic interpolation methods used for multiple sensor rainfall estimation. 1. Experimental design. *Water Resour. Res.*, **26**, 469–478.

Seo, D-J., W. F. Krajewski, D. S. Bowles, and A. Azimi-Zonooz, 1990b: Stochastic interpolation methods used for multiple sensor rainfall estimation. 2. Results. *Water Resour. Res.*, **26**, 915–924.

Sevruk, B., 1985: Correction of precipitation measurements. *Correction of Precipitation Measurements: WMO/IAHS/ETH Workshop,* B. Sevruk, Ed., Zurcher Geographische Schriften, No. 23, Swiss Federal Institute of Technology, 13–23.

Simonoff, J. S., 1996: *Smoothing Methods in Statistics*. Springer-Verlag, 338 pp.

Steiner, M., J. A. Smith, S. J. Burges, C. V. Alonso, and R. W. Darden, 1999: Effect of bias adjustment and rain gauge data quality control on radar rainfall estimation. *Water Resour. Res.*, **35**, 2487–2503.

Zawadzki, I., 1975: On radar-raingage comparison. *J. Appl. Meteor.*, **14**, 1430–1436.