## Introduction

The goal of radar rainfall (RR) estimation is to provide approximate rainfall values that are as close to the physical truth as possible. Here, the *true rainfall* is defined as the amount of rainwater that falls on a specified area of the earth’s surface in a specified time interval. The question is how to formalize rigorously the requirement of “closeness to the truth.” The abundance of the existing reflectivity–rainfall rate (*Z–R*) relationships (Battan 1973) suggests that an unambiguous way to fulfill this apparently obvious demand might not exist. In this paper, it is shown that indeed there are at least two RR estimation performance criteria that are in opposition to each other. One is the commonly used mean-square error (MSE), and the other is a conditional bias (CB).

Generally, CB (it can also be called conditional mean error) can be described as a quantity that shows how the radar rainfall estimates differ, on the average, from a given true rainfall. Formally, we define it as a statistic that quantifies discrepancies between a true rain rate and the conditional average of its estimates, where the averaging is done only for the cases close to the given true rain rate (the conditioning). Large CB can manifest itself in the severe underestimation of strong rainfalls. Demonstrating the behavior of this new RR error component and the role it plays in RR estimation is the main purpose of this study. We apply a conceptual model proposed by Ciach and Krajewski (1999) to illustrate how CB affects the relationship between the RR estimates and the truth and how CB and MSE compete with each other.

The CB problem is fairly well recognized in the statistical literature that discusses the “error-in-variable” effects (Carroll et al. 1995; Seber 1989; Fuller 1987). It appears when least squares regression is used directly to build a predictive relationship in a situation where the predictors are measured with substantial errors. The resulting predictions (by definition) minimize MSE but also systematically underestimate the outcomes in their higher range and overestimate them in their lower range. This systematic discrepancy is known in the statistical literature as an “attenuation” effect, because it can be viewed as a significant reduction of the dynamic range of the response. Recently, Rosenfeld and Amitai (1998) encountered the conditional bias problem in the RR context. In their paper, they demonstrate the effect using disdrometer data and argue that their “window probability matching method” removes it. In our study, we attempt to formalize the CB description and to develop an analytical insight into the problem.

This paper is organized as follows. In section 2, we briefly define and discuss the model and the RR estimation formulas used in this study. In section 3, we define CB and demonstrate its effect on the RR estimates. In section 4, we compare and discuss the CB and MSE errors. A summary and conclusion section ends the study.

## The model framework

Let us assume that the relationship between the true rain rate *R*_{t} and the true near-surface radar reflectivity *Z*_{t} can be described by a one-to-one functional dependency of power-law type:

$$Z_t = A R_t^{b}, \tag{1}$$

where *A* and *b* are called the relationship multiplier and exponent, respectively. The true reflectivity factor *Z*_{t} is a physical descriptor of the precipitation volume, as defined by Battan (1973). Both the true rain rate *R*_{t} and reflectivity *Z*_{t} are averaged over the same area over which the radar reflectivity is measured. Also, let the measured radar reflectivity *Z*_{m}, corrupted with errors, be associated with *Z*_{t} through the following equation:

$$Z_m = Z_t E_z, \tag{2}$$

where *E*_{z} is the reflectivity measurement error. From Eqs. (1) and (2), we can obtain the “observation equation” of the modeled system,

$$Z_m = A R_t^{b} E_z, \tag{3}$$

which relates the measured reflectivity *Z*_{m} to the true rain rates and the errors. We define *R*_{t} and *E*_{z} as the following independent and lognormally distributed random variables:

$$R_t \sim \mathrm{LN}(1, \mathrm{CV}_r), \tag{4a}$$

$$E_z \sim \mathrm{LN}(1, \mathrm{CV}_{ez}), \tag{4b}$$

where LN(1, CV) denotes a lognormal distribution with mean 1 and coefficient of variation CV, and CV_{r} and CV_{ez} are the coefficients of variation of the rain-rate process and the reflectivity measurement errors, respectively. Note that, in the above definitions, to simplify further notation and without any significant loss of generality, we assume that the ensemble means of the two random variables are equal to 1. For the *R*_{t} values, it is equivalent to converting them into dimensionless quantities by scaling the rain rates with their climatological average. This idealized model is designed to be analytically tractable and to describe qualitatively only those elements of reality that are crucial for the conditional bias question. As a consequence, its solutions can be obtained in a synthetic form of closed mathematical expressions. For a discussion of the model assumptions, see Ciach and Krajewski (1999).
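The observation model is simple enough to simulate directly. The following is a minimal sketch, assuming the unit-mean lognormal parameterization described above; the numerical values of `A`, `b`, `cv_r`, and `cv_ez`, and the helper names, are illustrative choices and do not come from the text.

```python
import math
import random

def lognormal_params(mean, cv):
    """Return (mu, sigma) of ln X for a lognormal X with the given mean and CV."""
    sigma2 = math.log(1.0 + cv * cv)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def sample_model(n, A=300.0, b=1.4, cv_r=2.0, cv_ez=1.0, seed=0):
    """Draw n (R_t, Z_m) pairs from Z_m = A * R_t**b * E_z.

    A, b, cv_r and cv_ez are hypothetical example values.
    """
    rng = random.Random(seed)
    mu_r, s_r = lognormal_params(1.0, cv_r)   # true rain rate, mean 1
    mu_e, s_e = lognormal_params(1.0, cv_ez)  # measurement error, mean 1
    pairs = []
    for _ in range(n):
        r_t = rng.lognormvariate(mu_r, s_r)
        e_z = rng.lognormvariate(mu_e, s_e)
        pairs.append((r_t, A * r_t ** b * e_z))
    return pairs

pairs = sample_model(200_000)
mean_rt = sum(r for r, _ in pairs) / len(pairs)
print(f"sample mean of R_t: {mean_rt:.3f}")  # should be close to 1
```

Because the rain-rate distribution is strongly skewed for large CV_{r}, sample moments converge slowly; the closed-form expressions are more convenient than simulation for anything beyond a sanity check.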

So far, we have assumed a deterministic *Z*_{t}–*R*_{t} relationship such as Eq. (1) between the true reflectivity and rain rate. However, our analytical framework is, in fact, more flexible, and this assumption can be relaxed. We can incorporate a simple *Z–R* variability model into the multiplicative factor *A* in Eq. (1), assuming that the *A* parameter is also a lognormal random variable:

$$A = A_o E_a, \tag{5a}$$

where *A*_{o} is the average value of the *Z*_{t}–*R*_{t} relationship multiplier *A*, and the independent lognormally distributed random variable *E*_{a} describes the variability of this multiplier in a real precipitation system. Substituting definition (5a) into Eq. (3), one obtains the modified observation equation

$$Z_m = A_o R_t^{b} E_a E_z, \tag{6}$$

in which the *Z–R* variability factor *E*_{a} and the reflectivity measurement error *E*_{z} are mixed in a single error factor. The product of *E*_{a} and *E*_{z} is also a lognormally distributed random variable, independent of *R*_{t}, describing the combined effect of the two uncertainty sources in our rainfall observation system. This suggests that the natural *Z–R* variability can be considered in a simplified way as an increment of the radar reflectivity measurement error, with all its consequences for the CB error discussed in the next sections.

The goal of RR estimation is to convert the measured reflectivities *Z*_{m} of a precipitation system into radar-estimated rain rates *R*_{r}. For this conversion, we need to choose a *Z*_{m}–*R*_{r} relationship adjusted to a specific situation. The *R*_{r} values obtained this way are approximations of the true rain rates *R*_{t} averaged over the area over which the radar reflectivities are measured. Within our model framework, this predictive *Z*_{m}–*R*_{r} relationship is also a power-law function preserving the lognormality of the system. It can be represented in a traditional way as a *Z–R* relationship:

$$Z_m = \alpha R_r^{\beta}, \tag{7}$$

or, equivalently, in the inverted form

$$R_r = \alpha^{-1/\beta} Z_m^{1/\beta}, \tag{8}$$

which is applied directly in the *Z*_{m}-to-*R*_{r} conversion.
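As a concrete illustration of the inverted power law, the conversion can be written in a few lines. This is only a sketch: the parameter values *α* = 200 and *β* = 1.6 are the familiar Marshall-Palmer-type numbers, used here as an assumed example rather than as values adopted in this study, and the function names are hypothetical.

```python
def dbz_to_z(dbz):
    """Convert reflectivity from dBZ to linear units (mm^6 m^-3)."""
    return 10.0 ** (dbz / 10.0)

def rain_rate(z_m, alpha, beta):
    """Invert Z_m = alpha * R_r**beta for the radar-estimated rain rate R_r."""
    return (z_m / alpha) ** (1.0 / beta)

# Example parameters only (alpha=200, beta=1.6); not taken from the paper.
for dbz in (20.0, 35.0, 50.0):
    r_r = rain_rate(dbz_to_z(dbz), alpha=200.0, beta=1.6)
    print(f"{dbz:.0f} dBZ -> {r_r:.2f} mm/h")
```

A larger *β* compresses the range of the resulting rain rates for the same range of reflectivities, which is the mechanism behind the conditional bias discussed in the following sections.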

Substituting the observation equation [Eq. (3)] into the estimator [Eq. (8)], we obtain

$$R_r = \left(\frac{A}{\alpha}\right)^{1/\beta} R_t^{\gamma} E_z^{1/\beta} = c R_t^{\gamma} E_z^{1/\beta}, \tag{9}$$

where the exponent *γ* = *b*/*β* and multiplier *c* are introduced to simplify further derivations. This equation is a basis for our further analysis of the behavior of the RR estimator [Eq. (8)]. Our focus is on the exponent parameters *b* and *β* that, as Eq. (9) shows, govern the way the radar estimates *R*_{r} are related to different intensities *R*_{t} of the true rain rate. First, however, let us consider the multiplier *α*. For any fixed *β*, *α* can be adjusted so that the overall bias of the RR estimation is removed, and the climatological averages of *R*_{r} and *R*_{t} are equal (Ciach et al. 1997). Representing these averages in our model with statistical expectations, we obtain:

$$E\{R_r\} = E\{R_t\}. \tag{10}$$
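The unbiasedness condition fixes the multiplier *c* once *γ* is chosen. A short derivation is sketched below under the model's lognormal, unit-mean, and independence assumptions; the shorthand *s*² = ln(CV² + 1) is introduced here for convenience. The result is presumably the expression that the later sections cite as Eq. (11).

```latex
% Moments of a lognormal X with mean 1 and coefficient of variation CV:
%   ln X ~ N(-s^2/2, s^2),  with  s^2 = ln(CV^2 + 1)
\begin{align*}
E\{X^{k}\} &= \exp\!\left[\tfrac{1}{2}k(k-1)s^{2}\right]
            = (\mathrm{CV}^{2}+1)^{k(k-1)/2},\\
E\{R_r\} &= c\,E\{R_t^{\gamma}\}\,E\{E_z^{1/\beta}\}
          = c\,(\mathrm{CV}_r^{2}+1)^{\gamma(\gamma-1)/2}\,
               (\mathrm{CV}_{ez}^{2}+1)^{(1-\beta)/(2\beta^{2})},\\
E\{R_r\} &= E\{R_t\} = 1
\;\Longrightarrow\;
c = (\mathrm{CV}_r^{2}+1)^{\gamma(1-\gamma)/2}\,
    (\mathrm{CV}_{ez}^{2}+1)^{(\beta-1)/(2\beta^{2})}.
\end{align*}
```

For *γ* = 1 the first factor equals 1, so in that case *c* depends only on the measurement-error variability and on *β* = *b*.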

## Definition of CB and its effect on RR estimates

We define the conditional bias of the RR estimates in two equivalent ways:

$$\mathrm{CB}_m(r_t) = \frac{E\{R_r \mid R_t = r_t\}}{r_t}, \tag{12a}$$

$$\mathrm{CB}_a(r_t) = E\{R_r \mid R_t = r_t\} - r_t, \tag{12b}$$

where *r*_{t} is a given *R*_{t} value (the conditioning), CB_{m} is the multiplicative conditional bias, and CB_{a} is its additive equivalent. Both expressions tell us how the average RR for a given *R*_{t} differs from this fixed true rain rate. Equation (12a) expresses it as a dimensionless coefficient, and Eq. (12b) expresses it in absolute rain-rate units. Both statistics can be useful in different situations, and we will utilize them in our further analysis. The CB_{a} averaged over all the *r*_{t} values is equal to zero. This fact is due to the unbiasedness condition [Eq. (10)] and the “double expectation theorem” [see, e.g., p. 6 in Bickel and Doksum (1977)]. Note, however, that the respective average CB_{m} is not equal to 1, because the expectation of a ratio is not the same as the ratio of expectations.

Substituting Eq. (9) into Eq. (12a), with the multiplier *c* fixed by the unbiasedness condition [Eq. (10)], gives

$$\mathrm{CB}_m(r_t) = (\mathrm{CV}_r^{2} + 1)^{\gamma(1-\gamma)/2}\, r_t^{\gamma-1}. \tag{13}$$

For the *Z*_{m}–*R*_{r} conversion exponent *β* equal to the model-assumed *b*, that is, for *γ* = 1, the exponents in Eq. (13) vanish and CB_{m}(*r*_{t}) = 1 for all *r*_{t} values. In this, and only this, case we obtain conditionally unbiased rain-rate estimates

$$R_r = c R_t E_z^{1/b}. \tag{14}$$

For any other *β*, the average RR estimate conditioned on a specified rain rate *r*_{t} differs from this truth. This nonlinear departure is a function of *r*_{t} and can be described by CB_{m}(*r*_{t}) not only within this model but also in analysis of results based on real data, provided that we can estimate it.

Equation (9) and the overall nonbias condition Eq. (10) imply that, if we choose *β* > *b* (i.e., *γ* < 1), then the *R*_{r} values obtained based on Eq. (8) underestimate high rain rates and overestimate low rain rates. Statistical literature on the error-in-variable problem (e.g., Fuller 1987) calls this effect attenuation. Note that the statistical meaning of the term attenuation (pertaining to the excessive decrease of the estimation’s dynamic range) must not be confused with its use in radar meteorology, where it means the extinction of electromagnetic waves in heavy precipitation. For *β* < *b* it works in the opposite direction, causing on the average an “enhancement” of the dynamic range of RR estimates in comparison with the truth. Equation (13) shows that the magnitude of the effect depends on the actual rain-rate intensity *r*_{t}, the climatological rain-rate variability CV_{r}, and the difference between the predictive and the physical *Z–R* exponents.

In Fig. 1, we demonstrate the behavior of the multiplicative conditional bias defined by Eq. (12a), as a function of the true rain rate, for a few cases of the assumed model parameters. One can see that using a *β* that is significantly larger than the physical *Z–R* exponent *b* (the case of *γ* = 0.6) leads, on the average, to underestimation of the high rainfalls by a factor of about 2. To maintain zero overall bias, this underestimation is compensated by a 2 to 3 times overestimation of the weak rainfalls. As a result, the dynamic range of the intensities is much smaller for the RR estimates than for the true rain rates. A comparison of the left and right panels in Fig. 1 shows that this result is not affected much by the climatological variability of the rain rates. If the predictive exponent is smaller than *b* (e.g., *γ* = 1.2), the opposite effect occurs.
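The size of the effect is easy to reproduce numerically. The sketch below assumes the closed form CB_{m}(*r*_{t}) = (CV^{2}_{r} + 1)^{*γ*(1−*γ*)/2} *r*_{t}^{*γ*−1}, which follows from Eq. (9), the unbiasedness condition, and the unit-mean lognormal assumption; the value CV_{r} = 2 and the three rain rates (in units of the climatological mean) are illustrative choices.

```python
def cb_m(r_t, gamma, cv_r):
    """Multiplicative conditional bias for the lognormal model.

    Assumes CB_m = (CV_r^2 + 1)**(gamma*(1-gamma)/2) * r_t**(gamma-1),
    with r_t expressed in units of the climatological mean rain rate.
    """
    return (cv_r ** 2 + 1.0) ** (0.5 * gamma * (1.0 - gamma)) * r_t ** (gamma - 1.0)

# Illustrative case: gamma = 0.6 and CV_r = 2 (assumed example value).
for r_t in (0.1, 1.0, 10.0):
    print(f"r_t = {r_t:>4}: CB_m = {cb_m(r_t, gamma=0.6, cv_r=2.0):.2f}")
# prints CB_m = 3.05, 1.21 and 0.48 for r_t = 0.1, 1.0 and 10.0
```

Under these assumptions, a rain rate ten times the climatological mean is underestimated by about a factor of 2 on average, and a rain rate one-tenth of the mean is overestimated by about a factor of 3.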

## Comparison of CB and MSE

We want the predictive *Z*_{m}–*R*_{r} relationship [Eq. (7)] to result in RR estimates that are as close to the true rain rates as possible. This goal can be achieved through an appropriate adjustment of its parameters *α* and *β*. An obvious component of this closeness is the overall nonbias requirement [Eq. (10)] that can be fulfilled by an adjustment of the *α* multiplier, as shown in section 2. The remaining differences between *R*_{r} and *R*_{t} can be evaluated in different ways. One meaningful criterion is the multiplicative conditional bias CB_{m}(*r*_{t}) defined by Eq. (12a). We used it to describe the systematic differences between the RR estimators for low versus high rainfalls. Using Eq. (12b), one can construct a statistic that can be applied to compare CB with MSE. The MSE is a traditional statistic, often used to characterize the random estimation errors. Within our model framework, MSE can be simply defined as follows.

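$$\mathrm{MSE} = \mathrm{Var}\{R_r - R_t\} = E\{(R_r - R_t)^{2}\}. \tag{15}$$

As a global measure of the conditional bias, we define the total CB error as the variance of the additive conditional bias:

$$\mathrm{CB}_{tot} = \mathrm{Var}\{\mathrm{CB}_a(R_t)\} = E\{[\mathrm{CB}_a(R_t)]^{2}\}, \tag{16}$$

where the second equality holds because the expectation of CB_{a}(*R*_{t}) is equal to zero, as stated in the previous section. The CB_{tot} measures the conditional bias effect synthetically for all of the *r*_{t} values, is an analytically tractable quantity, and, being also a variance, is the same kind of statistic as MSE. Formally, it is possible to define many global measures of the CB error. However, the above advantages make the CB_{tot} a good choice for our purposes.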

To compare MSE and CB_{tot}, we first need to derive explicit expressions for the two statistics as functions of the *Z*_{m}–*R*_{r} exponent *β* and other parameters of our model. For MSE, we expand Eq. (15) and substitute Eq. (9) to obtain

$$\mathrm{MSE} = c^{2} E\{R_t^{2\gamma} E_z^{2/\beta}\} - 2c\,E\{R_t^{\gamma+1} E_z^{1/\beta}\} + E\{R_t^{2}\}. \tag{17}$$

Evaluating the expectations for the lognormal variables, replacing *c* with Eq. (11), and several rearrangements yield

$$\mathrm{MSE}(\gamma) = \left[(\mathrm{CV}_r^{2} + 1)(\mathrm{CV}_{ez}^{2} + 1)^{1/b^{2}}\right]^{\gamma^{2}} - 2(\mathrm{CV}_r^{2} + 1)^{\gamma} + \mathrm{CV}_r^{2} + 1, \tag{18}$$

which, for given CV_{r} and CV_{ez} and exponent *b*, is a function of the argument *γ* = *b*/*β*.

A similar derivation for the total conditional bias [Eq. (16)] gives

$$\mathrm{CB}_{tot}(\gamma) = (\mathrm{CV}_r^{2} + 1)^{\gamma^{2}} - 2(\mathrm{CV}_r^{2} + 1)^{\gamma} + \mathrm{CV}_r^{2} + 1. \tag{20}$$

Analysis of the functions MSE(*γ*) and CB_{tot}(*γ*) shows that they are both concave and have single minima. For *γ* = 0 (i.e., in the limit of large *β* values), they have the same value and MSE(0) = CB_{tot}(0) = CV^{2}_{r}. For large *γ*, both functions go to infinity. The argument values minimizing MSE(*γ*) and CB_{tot}(*γ*), however, differ significantly. They can be found by differentiating Eqs. (18) and (20) and setting the derivatives equal to zero. The MSE has its minimum for the following value of the *β* parameter:

$$\beta_{ms} = b\left[1 + \frac{\ln(\mathrm{CV}_{ez}^{2} + 1)}{b^{2}\,\ln(\mathrm{CV}_r^{2} + 1)}\right],$$

which agrees with the result in Ciach and Krajewski (1999) obtained based on different considerations. On the other hand, the minimum of the total CB error [Eq. (20)] is for *β*_{cb} = *b* (*γ* = 1), and, at this point, the conditional bias simply vanishes, that is, CB_{tot}(1) = 0. The minimum for the total conditional bias function is, of course, in perfect agreement with the results in the previous section regarding the multiplicative bias.
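The location of the two minima can be checked with a few lines of algebra. The derivation below is a sketch that takes MSE(*γ*) and CB_{tot}(*γ*) in their closed lognormal forms; the shorthand symbols *a* and *B* are introduced here only to shorten the expressions.

```latex
% Shorthand (ours): a = CV_r^2 + 1,  B = a (CV_ez^2 + 1)^{1/b^2}
\begin{align*}
\mathrm{MSE}(\gamma) &= B^{\gamma^{2}} - 2a^{\gamma} + a,\\
\frac{d\,\mathrm{MSE}}{d\gamma} &= 2\gamma\,\ln B\;B^{\gamma^{2}} - 2\ln a\;a^{\gamma} = 0 .
\end{align*}
% Try gamma = ln a / ln B. Then gamma^2 ln B = gamma ln a, so B^{gamma^2} = a^{gamma},
% and the derivative reduces to 2 a^{gamma} (gamma ln B - ln a) = 0, which holds.
\begin{align*}
\gamma_{ms} &= \frac{\ln a}{\ln B}
 = \frac{\ln(\mathrm{CV}_r^{2}+1)}
        {\ln(\mathrm{CV}_r^{2}+1) + b^{-2}\ln(\mathrm{CV}_{ez}^{2}+1)},\\
\beta_{ms} &= \frac{b}{\gamma_{ms}}
 = b\left[1 + \frac{\ln(\mathrm{CV}_{ez}^{2}+1)}{b^{2}\ln(\mathrm{CV}_r^{2}+1)}\right],\\
\frac{d\,\mathrm{CB}_{tot}}{d\gamma}
 &= 2\ln a\,\left(\gamma\,a^{\gamma^{2}} - a^{\gamma}\right) = 0
 \quad\text{at}\quad \gamma = 1 .
\end{align*}
```

Since the logarithms are positive whenever CV_{ez} > 0, the bracket in the expression for *β*_{ms} always exceeds 1, and it reduces to 1 only when the reflectivity measurement error vanishes.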

Note that the *β*_{ms} value that optimizes MSE is always bigger than *b.* For given physical parameters *b* and CV_{r} of the modeled precipitation system, the difference between *β*_{ms} and *β*_{cb} = *b* minimizing the two error variances depends on the level of the reflectivity measurement errors described by CV_{ez}. Figure 2 demonstrates the relative behavior of the MSE and CB error expressions derived above, as a function of the *γ* parameter, for a few cases of the model parameters. In the left panel, one can see that, for moderate reflectivity measurement errors (CV_{ez} = 1), *β*_{ms} is about 22% larger than the *β*_{cb} exponent value, whereas, for large errors (CV_{ez} = 2), the difference is almost 40%. For the model with larger climatological rain-rate variability (CV_{r} = 4 in the right panel), the effect is slightly smaller.

From Fig. 2 one can immediately see the dilemma that arises when the *β* exponent of the predictive *Z*_{m}–*R*_{r} conversion is tuned to make the RR estimates as close as possible to the truth. Inherently, one cannot simultaneously minimize both the MSE and CB_{tot} errors. Minimization of MSE (the most common statistical optimization method) leads to substantial conditional bias distortions. On the other hand, choosing an RR estimator that has no conditional bias (CB_{tot} = 0 for *γ* = 1) can only be done at the cost of a significant increase of MSE. For moderate reflectivity measurement errors, MSE is larger than its minimum by a factor of about 1.7. However, for higher measurement errors, MSE is about 3 times larger for the conditionally unbiased estimates than at the minimum point and is even bigger than MSE at *γ* = 0 (climatological average prediction).
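The cost of removing the conditional bias can be quantified with a short calculation. The sketch below assumes the closed-form MSE(*γ*) for the lognormal model and its analytical minimizer *γ*_{ms} = ln(CV^{2}_{r} + 1) / [ln(CV^{2}_{r} + 1) + ln(CV^{2}_{ez} + 1)/*b*^{2}]; the values *b* = 1.4 and CV_{r} = 2 are assumed for illustration and are not stated in the text, and the grid search is included only as an independent cross-check of the minimizer.

```python
import math

def mse(gamma, b, cv_r, cv_ez):
    """Closed-form MSE(gamma) for the lognormal model (assumed form)."""
    a = cv_r ** 2 + 1.0
    e = cv_ez ** 2 + 1.0
    return (a * e ** (1.0 / b ** 2)) ** (gamma ** 2) - 2.0 * a ** gamma + a

def gamma_ms(b, cv_r, cv_ez):
    """Analytical minimizer of mse() with respect to gamma."""
    ln_a = math.log(cv_r ** 2 + 1.0)
    ln_e = math.log(cv_ez ** 2 + 1.0)
    return ln_a / (ln_a + ln_e / b ** 2)

def grid_argmin(f, lo=0.0, hi=2.0, steps=20_000):
    """Brute-force minimizer on a uniform grid, used only as a cross-check."""
    best = min(range(steps + 1), key=lambda i: f(lo + (hi - lo) * i / steps))
    return lo + (hi - lo) * best / steps

B_EXP, CV_R = 1.4, 2.0  # illustrative values, not taken from the text
ratios = {}
for cv_ez in (1.0, 2.0):
    g_closed = gamma_ms(B_EXP, CV_R, cv_ez)
    g_grid = grid_argmin(lambda g: mse(g, B_EXP, CV_R, cv_ez))
    # MSE of the conditionally unbiased estimator (gamma = 1) relative to the minimum
    ratios[cv_ez] = mse(1.0, B_EXP, CV_R, cv_ez) / mse(g_closed, B_EXP, CV_R, cv_ez)
    print(f"CV_ez={cv_ez}: gamma_ms={g_closed:.3f} (grid {g_grid:.3f}), "
          f"MSE(1)/MSE_min={ratios[cv_ez]:.2f}")
```

With these assumed parameters, the ratio comes out near 1.7 for CV_{ez} = 1 and near 3 for CV_{ez} = 2, and for CV_{ez} = 2 the value MSE(1) also exceeds MSE(0).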

Note that, in this framework, there is no conflict between the overall unbiasedness condition and CB_{tot} (or MSE). This fact is most likely also true in real data setups. Not all performance criteria compete, even if they pertain to different aspects of a product. The knowledge of which criteria are in conflict and which can be adjusted independently is necessary to control the optimization process properly.

## Summary and conclusions

We investigated a new radar rainfall estimation performance criterion, called the conditional bias error, using a simple analytical model. After a short presentation of the model, we formally defined two useful descriptors of CB. We then demonstrated the behavior of this quantity and showed how it affects the relationship between the RR estimates and the true rain rates. We also compared the conditional bias with the commonly used mean square estimation error, pointing out an acute dilemma between reduction of CB and minimization of MSE. We showed that the natural variability of the physical *Z–R* relationship can add to the CB effect in the same way as the reflectivity measurement errors.

Our study shows that nonzero CB in the RR products indicates that the RR values are not linearly related to the corresponding true rainfall values and that CB_{tot} is a general and convenient global measure of the magnitude of this distortion that can also be used for real data analysis. It is an undesirable systematic effect that one would like to remove from the RR products. However, the cost of removing CB from the RR estimates can be high in terms of the MSE increase. On the other hand, minimizing MSE can result in strong underestimation of heavy rainfalls and overestimation of weak rain rates. This dilemma is an inevitable consequence of the uncertainties involved in the RR estimation process and cannot be solved with the uncertainties present. Finding a compromise between the two extreme RR estimation strategies most likely requires careful recognition of the prevailing priorities in a specific application of the hydrological radar–rainfall products. Such a discussion is beyond the scope of this paper, and the tradeoff between MSE and CB is considered here only in a descriptive way as a fact of nature. However, in our opinion, one should be aware of the conditional bias effects and be able to quantify them in order to make conscious decisions about the RR estimation schemes used in practice. Practical problems of efficient estimation of the CB errors using uncertain reference data will be one of the topics of our future research.

We hope that this study will promote better understanding and more rigorous definitions of the optimality criteria for the RR estimation/validation procedures. It may also be viewed as the next step toward a parsimonious model of the RR error structure and the efficient estimation of its parameters. Pursuing these difficult goals will also be a subject of our future research effort.

## Acknowledgments

This work was supported by Oklahoma EPSCoR Grant NCC 5-171, and NASA Grant NAG 6-2084. This support is gratefully acknowledged. We also acknowledge an anonymous reviewer of the paper by Ciach and Krajewski (1999) whose remarks inspired this study.

## REFERENCES

Battan, L. J., 1973: *Radar Observation of the Atmosphere.* University of Chicago Press, 324 pp.

Bickel, P. J., and K. A. Doksum, 1977: *Mathematical Statistics: Basic Ideas and Selected Topics.* Holden–Day, 492 pp.

Carroll, R. J., D. Ruppert, and L. A. Stefanski, 1995: *Measurement Error in Nonlinear Models.* Chapman and Hall, 305 pp.

Ciach, G. J., and W. F. Krajewski, 1999: Radar–rain gauge comparisons under observational uncertainties. *J. Appl. Meteor.,* **38,** 1519–1525.

Ciach, G. J., W. F. Krajewski, E. N. Anagnostou, M. L. Baeck, J. A. Smith, J. R. McCollum, and A. Kruger, 1997: Radar rainfall estimation for ground validation studies of the Tropical Rainfall Measuring Mission. *J. Appl. Meteor.,* **36,** 735–747.

Fuller, W. A., 1987: *Measurement Error Models.* John Wiley and Sons, 440 pp.

Rosenfeld, D., and E. Amitai, 1998: Comparison of WPMM versus regression for evaluating *Z–R* relationships. *J. Appl. Meteor.,* **37,** 1241–1249.

Seber, G. A. F., 1989: *Nonlinear Regression.* John Wiley and Sons, 768 pp.