## Abstract

The spatial variability and temporal variability of precipitation are widely recognized. In particular, rainfall rates can fluctuate widely in regions where the raindrops are clustered and where mean conditions are changing (statistical heterogeneity). Indeed, at times, the ambiguity associated with an estimated average rainfall rate may become very large. Therefore, in quantitative measurements of precipitation, it would be useful to identify where this occurs. In this work a technique is proposed and applied to quantify the variability in rainfall rates introduced by statistical heterogeneity and raindrop clustering using deviations from Rayleigh statistics of intensity fluctuations. This technique separates the Rayleigh contributions to the observed relative dispersion from those arising from clustering and statistical heterogeneities. Applications to conventional meteorological radar measurements are illustrated using two scans. Often, but not always, the greatest ambiguities in estimates of the average rainfall rate occur just where the rainfall rates are the largest and presumably where accurate estimates are most important. This ambiguity is not statistical; rather, it indicates the presence of important sub-beam-scale fluctuations. As a consequence, no single average value can be applied uniformly to the entire domain. The examples provided here also demonstrate that the appropriate observations are feasible using current conventional meteorological radars with adequate processing capabilities. However, changes in radar technology that improve and increase pulse-to-pulse statistical independence will permit such observations to be gathered more routinely at finer spatial resolution and with enhanced precision.

## 1. Introduction

The spatial variability of precipitation is widely acknowledged. To be specific, rainfall varies the most where the raindrops are clustered and where mean conditions are changing (statistical heterogeneity). For purposes that require radar measurements that are as quantitative as the physics and the measurement process allow, this variability may frustrate attempts to make precise measurements of mean values. Therefore, it would be useful, at times, to identify those locations where this variability occurs. This paper presents one possible method using radars.

Radars measure the reflectivity factor *Z* derived from observations of backscattered intensities *I* in sample volumes. For stationary antennas, the signal fluctuations are determined by random coherent summations of the electromagnetic waves reflected by the individual particles as described by Rayleigh statistics. The variance of the intensity is then equal to the square of its mean (i.e., the relative dispersion of the intensity, *σ*_{I} divided by the mean of *I*, is unity). For scanning antennas, however, additional deviations occur as clustering and statistical heterogeneities extend the tail of the distributions of the relative dispersion.

That is, Rayleigh statistics are based upon the central limit theorem applied to each of the two components of the complex amplitude when conditions are “near” statistical stationarity. Jameson and Kostinski (1996) explored the meaning of near stationarity and concluded, for the simple case of drops of one size, that, whenever the number of drops within the beam fluctuated from sample to sample by more than about 15% of the mean, non-Rayleigh effects could be detected. Why? Because now the measurement not only depends upon the constructive and destructive interference of the waves scattered off all the drops, it also depends upon the doubly stochastic nature of the process in which the number of scatterers themselves becomes a random variable largely because of the motion of the observation volume between successive radar samples. [A more complete discussion of the origin of non-Rayleigh signal statistics may be found in Jameson and Kostinski (1996).] What causes the number of drops to vary?
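The doubly stochastic effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis of Jameson and Kostinski (1996): identical drops are modeled as unit-amplitude phasors with uniformly random phases, and the drop count per sample is drawn from a normal distribution (a hypothetical choice made purely for illustration). The function names are likewise hypothetical.

```python
import numpy as np

def squared_relative_dispersion(intensity):
    """Return variance / mean**2 of a 1-D array of intensities."""
    intensity = np.asarray(intensity, dtype=float)
    return intensity.var() / intensity.mean() ** 2

def simulate_intensities(n_samples, mean_drops, frac_fluct, rng):
    """Intensity of the coherent sum of unit phasors with random phases.

    The number of drops in each sample is drawn from a normal distribution
    (an assumption for illustration) whose standard deviation is
    frac_fluct * mean_drops; frac_fluct = 0 gives a fixed drop count.
    """
    counts = rng.normal(mean_drops, frac_fluct * mean_drops, n_samples)
    counts = np.clip(np.rint(counts), 1, None).astype(int)
    out = np.empty(n_samples)
    for j, n in enumerate(counts):
        phases = rng.uniform(0.0, 2.0 * np.pi, n)
        out[j] = np.abs(np.exp(1j * phases).sum()) ** 2
    return out

rng = np.random.default_rng(0)
# Fixed drop count: pure Rayleigh statistics, ratio close to unity.
rd_fixed = squared_relative_dispersion(simulate_intensities(20_000, 200, 0.0, rng))
# Drop count fluctuating by 30% of the mean: ratio exceeds unity.
rd_fluct = squared_relative_dispersion(simulate_intensities(20_000, 200, 0.3, rng))
print(rd_fixed, rd_fluct)
```

With a fixed drop count the ratio stays near unity, whereas letting the count fluctuate from sample to sample pushes it above unity, which is the signature exploited in the remainder of the paper.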

A central source of drop concentration fluctuations in a moving radar sample volume is clustering. Clustering is the enhanced concentration (and dilution) of particles associated with increased (decreased) correlations of scatterers in neighboring volumes. That is, the numbers of drops in neighboring volumes are not statistically independent. In physics and other fields, counting is usually treated using Poisson statistics. However, a Poisson process is characterized by three assumptions: 1) the probability of detecting more than one drop in a given volume *dV* is vanishingly small for sufficiently small *dV*, 2) drop counts in nonoverlapping volumes are statistically independent random variables (at any length scale), and 3) the process is statistically homogeneous. With regard to rain, the first point can usually be satisfied. The second assumption, however, is usually found not to be true. That is, the presence of a drop enhances (or in some cases decreases) the likelihood that there are other drops in the neighboring volume. The correlations in natural rain arise because rain appears to consist of “patches” of different dimensions. That is, there are locations that are rich in drops interspersed with regions in which drops are scarcer [see discussion in Jameson and Kostinski (1999a, p. 3921)].
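To make the contrast with Poisson counting concrete, the sketch below compares Poisson drop counts with counts from a doubly stochastic Poisson process whose local mean is itself random. The gamma-distributed local mean, the parameter values, and the helper name `excess_dispersion` are all assumptions chosen for illustration; the sketch reproduces only the excess variance associated with clustering, not the spatial correlation between neighboring volumes.

```python
import numpy as np

def excess_dispersion(counts):
    """Variance-to-mean ratio minus one: near zero for Poisson counts,
    positive when the counts are overdispersed (clustered)."""
    counts = np.asarray(counts, dtype=float)
    return counts.var() / counts.mean() - 1.0

rng = np.random.default_rng(0)
mean_count = 50.0      # hypothetical mean number of drops per volume
n_volumes = 100_000

# Poisson case: variance equals the mean.
poisson_counts = rng.poisson(mean_count, n_volumes)

# Clustered case: the local mean varies from volume to volume
# (gamma distributed here, an assumption), so drop-rich and drop-poor
# patches both occur.
shape = 4.0
local_mean = rng.gamma(shape, mean_count / shape, n_volumes)
clustered_counts = rng.poisson(local_mean)

print(excess_dispersion(poisson_counts), excess_dispersion(clustered_counts))
```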

Another source of drop concentration variability is statistical heterogeneity, the result of changing conditions such that the statistics of the observations depend upon the location of the measurements. Radar intensity measurements consequently exhibit increased variance (and relative dispersion) because of the bunching of drops produced by correlation (i.e., clustering) and because of systematic changes in the observed longer-term or larger-scale mean values of drop concentrations associated with statistical heterogeneity. As antennas scan during sampling, radars “see” the effects of all of this variability (clustering in conjunction with heterogeneity) in addition to the usual Rayleigh signal coherency fluctuations. This drop concentration variability is coincidentally and, perhaps, fortuitously also the main source of fluctuations in the rainfall rate (Jameson and Kostinski 1999a, 2001a). Non-Rayleigh signal fluctuations consequently offer a potential tool for measuring the variability in the rainfall rate.

So how can the normal Rayleigh statistical fluctuations be separated from the potentially meteorologically useful signals? As will be seen below, the key is to increase the numbers of statistically independent samples per estimate to restrict Rayleigh statistical fluctuations to as narrow a region around unity as possible while, at the same time, enhancing the detectability of the intrinsic total variability. This can be achieved in several ways. One is to design a special-purpose radar (Jameson 2005). An alternative that is better suited to existing conventional meteorological radars is to collect as many statistically independent samples as possible using frequency chirp, phase coding, or whitening [decorrelation of the backscattered signals as discussed in Koivunen and Kostinski (1999)]. Most conventional meteorological radars, however, do not yet have such capabilities, and therefore in the preliminary studies here a different strategy is required. One such approach is considered in section 4, but first it is necessary to develop the meteorological understanding of such measurements.

## 2. The relation between the relative dispersions of *Z* and *R*

As just discussed, the radar parameter of interest is the relative dispersion of the measured intensities, which is exactly equivalent to the relative dispersion of the radar reflectivity factor *Z* (because *Z* = *CI*, where *C* is the radar constant). What one would like to be able to do, however, is to convert the relative dispersion of *Z* into the relative dispersion of quantities of meteorological interest such as the rainfall rate *R*. As can be seen below, this can be accomplished precisely when the rain is statistically homogeneous and less precisely when the rain is statistically heterogeneous.

### a. Statistically homogeneous, clustered rain

In statistically homogeneous rain, there is only one drop size distribution (Jameson and Kostinski 2001a), because otherwise there would be more distributions contributing and the rain would be statistically heterogeneous. The drop size distribution, then, can be viewed as expressing how frequently a drop of diameter *D* occurs. The frequencies of occurrence over a range of different drop sizes then define the drop size distribution. However, any particular realization of this distribution over a measurement interval will depend upon two factors—namely, the total number of drops as well as which drop diameters happen to be observed during that interval. Thus, the observed drop size distributions fluctuate as the total number of drops fluctuates (in accordance with the statistical distribution of the counts) and as different drop sizes are sampled. However, in statistically homogeneous rain, after a sufficient number of observations, the observed distributions will eventually converge to the one (and only) steady drop size distribution (Jameson and Kostinski 2001b).

The rate at which this convergence proceeds is determined by the values of the pair correlation functions (closely related to the usual correlation function) among drop counts in neighboring volumes. That is, drops are said to be correlated when the number of drops occurring in an observation interval at one time or location is significantly correlated with the number of drops occurring at a different time or location. [Detailed discussions may be found in Kostinski and Jameson (1997) and subsequent articles and are not repeated here.] In such cases, the drops (or rain) are said to be clustered. Rain can be both statistically homogeneous and clustered simultaneously. This clustering can lead to substantial fluctuations even in statistically homogeneous rain (e.g., see Jameson and Kostinski 1999a, their Fig. 3a). However, if the number of drops per unit sample volume is Poisson distributed, for example, there is no such correlation by definition. The rain is then said to be unclustered or “steady” (Jameson and Kostinski 2002a) and the convergence occurs as rapidly as statistically possible (Kostinski and Jameson 1999, their section 5).

Even in statistically homogeneous conditions, however, there can be correlations among drops (i.e., the rain is not steady). These fluctuations then reduce the rate of convergence toward the constant drop size distribution (Kostinski and Jameson 1999). These fluctuations, then, also produce increased variance of the rainfall rate and intrinsic radar reflectivity factor (Jameson and Kostinski 1999a, their Figs. 2 and 3; Jameson and Kostinski 2001b).

Assuming that the number of drops and the drop size distribution itself are statistically independent (see appendix A, section a), Jameson and Kostinski (2001a, 527–530) then show that

and

where *E* denotes the expected value, *D* is the drop diameter, *V* is the terminal fall speed of the drop of diameter *D*, *Z* is the radar reflectivity factor (for Rayleigh scatterers, *Z* is the sum over all of the drops per unit volume of *D*^{6}), and *n* is the total number of drops per sample volume. Factors *F*_{R} and *F*_{Z} account for drop correlation [see Jameson and Kostinski (2001a, 528–529) for further explanation]. They are unity when there is no correlation, but they may reach values approaching 4 when drop correlation is significant.

These expressions differ from those of earlier landmark work (e.g., Joss and Waldvogel 1969). That work, however, is incomplete because it did not fully appreciate the conditional probabilities involved. To be specific, the variance term resulting from fluctuations in the observed drop size distributions themselves was not included. Furthermore, at that time no one was aware of the effect of raindrop clustering on the variance of *n* so that only Poisson statistics were used. As a consequence, one cannot simply extrapolate those results to the dimensions of radar sample volumes nor can one use them to understand the physics of non-Rayleigh signal fluctuations.

This limitation is important because, for a typical radar sampling volume, *E*(*n*) is on the order of several billion particles and (1) and (2) can be simplified. In that case and when drop clustering is present, the second terms in (1) and (2) are on the order of unity and larger (e.g., Jameson and Kostinski 1999a, p. 3924) while the first terms go as 1/*E*(*n*) and, therefore, quickly become negligible. Thus, for most radar observations in statistically homogeneous, clustered rain, it follows that

(Remember, however, that the *Z* here is the *intrinsic* radar reflectivity factor observed *without* Rayleigh signal fluctuations. In observations, unity must be subtracted from the observed relative dispersion of *Z* before estimating the intrinsic relative dispersion of *R*.) If, then, in statistically homogeneous cases of clustered rain one can observe the intrinsic (i.e., in the absence of Rayleigh fluctuations) relative dispersion of *Z*, it is then equivalent to the relative dispersion of *R* (and of *n*).

What happens in the much more common and likely case of statistically heterogeneous rain? That is considered next.

### b. Statistically heterogeneous, clustered rain

As the absence of any discussion in statistical textbooks indicates, the subject of statistical heterogeneity has been a difficult problem to address, particularly with any generality. However, Jameson (2007) recently showed that statistically heterogeneous rain can apparently be decomposed into a half-dozen or so statistically homogeneous components (i.e., steady drop size distributions). The statistically heterogeneous rain event, then, can be described (and, indeed, reconstituted) using linear, weighted combinations of these different statistically homogeneous components. If it is assumed that these results are generally applicable, they then provide a conceptual framework for the meteorological interpretation of non-Rayleigh signal measurements. As shown in appendix A, section c, it is found that

and

Thus, when the *w*_{i} are narrowly distributed, the squared relative dispersions of *R* and *Z* are determined by the summation over the squares of the relative dispersions of *n* for each statistically homogeneous component of the statistically heterogeneous rain multiplied by the square of the fractional contribution that each component mean *E*_{i}(*Z*) [or *E*_{i}(*R*)] makes to the overall average *Z* (or *R*). Hence, one may think of clustering being the leftmost term in the summations in (4) and (5) while the second multiplicative term represents much of the effect of statistical heterogeneities that act as weighting factors. Those components making the biggest contribution will carry the biggest weights, but only if there is clustering. If the clustering happens to be null, then they may make no contribution at all to the sum regardless of the size of their fractional contribution to the overall average *Z* (*R*). However, they will still reduce the measured overall clustering, as was seen in the first example above. Thus, for most radar measurements in statistically heterogeneous rain one obtains the important conclusions that *σ*^{2}(*R*)/*E*^{2}(*R*) and *σ*^{2}(*Z*)/*E*^{2}(*Z*) provide a measure of the clustering and represent a weighted clustering index when the distributions of *w*_{i} are sufficiently narrow. Note also that if there were only one component there would be no statistical heterogeneity so that (4) and (5) would reduce to (3) for the statistically homogeneous case with clustering.
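The weighted sum described in words above can be sketched numerically. Because the sketch is built only from that verbal description, it should be read as one possible interpretation rather than as a transcription of (4) or (5). The function name, the argument names, and the assumption that the overall mean is the weighted sum of the component means are all hypothetical.

```python
import numpy as np

def weighted_clustering_index(rel_disp_sq_n, weights, component_means):
    """Sum over components of [sigma^2(n_i)/E^2(n_i)] * [w_i E_i / <E>]^2.

    rel_disp_sq_n   : squared relative dispersion of n for each component
    weights         : w_i, the weight of each homogeneous component
    component_means : E_i(Z) or E_i(R) for each component
    The overall mean <E> is taken as sum(w_i * E_i), which is an assumption.
    """
    rd = np.asarray(rel_disp_sq_n, dtype=float)
    w = np.asarray(weights, dtype=float)
    m = np.asarray(component_means, dtype=float)
    fraction = w * m / np.sum(w * m)   # fractional contribution to the mean
    return float(np.sum(rd * fraction ** 2))

# One component: the index reduces to the homogeneous, clustered case.
single = weighted_clustering_index([1.5], [1.0], [1000.0])

# Two components, one clustered and one steady (null clustering): the
# steady component adds nothing to the sum but still lowers the total.
mixed = weighted_clustering_index([2.0, 0.0], [0.5, 0.5], [3000.0, 1000.0])
print(single, mixed)
```

In the two-component case the clustered component supplies 75% of the mean, so its squared relative dispersion of 2.0 is weighted by 0.75 squared, which illustrates how an unclustered component dilutes the measured clustering without contributing to it.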

One final consideration before moving on to the next section, however, concerns the magnitudes of the two relative dispersions in (4) and (5). Differences will appear because the weighting distributions of [*w*_{i}*E*_{i}(*Z*)/〈*E*(*Z*)〉]^{2} will not usually be identical to [*w*_{i}*E*_{i}(*R*)/〈*E*(*R*)〉]^{2}. In the examples considered above, the relative dispersion of *R* appears to be 50%–90% of that for *Z* [a value of 0.83 was also found using the disdrometer data in Jameson (2007)], although the representativeness of these numbers is, of course, unknown. In the subsequent data analyses (dominated by convective rain and larger values of dispersions of *Z* as in the second example above), it is assumed that the squared relative dispersion of *R* is approximately 0.8 that of *Z*, as argued in appendix A section b. Nevertheless, one can say with reasonable certainty that the *intrinsic* relative dispersion of *Z*, if it can be measured, will provide useful information about the relative dispersion of *R*.

Moreover, if one were bold enough to claim a knowledge of the mean *R*, one could then even estimate the variance of the rainfall rate itself from such observations. Regardless, however, observations of the relative dispersions of *Z* likely provide useful estimates of where there is significant clustering and, therefore, where any estimates of the mean rainfall rate are likely to be particularly ambiguous.

The trick to all of this, of course, is that in real measurements one must account for the effects of Rayleigh signal statistical fluctuations. That is considered next. After that, real data are used to extract estimates of the intrinsic relative dispersion of *Z* and then *R*. It is fortunate that it appears that a separation of non-Rayleigh from Rayleigh fluctuations is possible, provided that there are enough statistically independent samples.

## 3. A brief but necessary consideration of signal statistics

To assess whether an observation of the relative dispersion of *Z* originates from the meteorological conditions or simply from Rayleigh signal fluctuations, one must first understand the statistics of the relative dispersion for pure Rayleigh intensity fluctuations. (Here, the reader is reminded that the relative dispersions of the intensity *I* and of the radar reflectivity factors are interchangeable because *I* and *Z* are related through the radar constant. For the remainder of this work, then, sometimes *I* is used and sometimes *Z* is used.)

Although the mean value of the relative dispersion is known to be unity for Rayleigh signals, its probability density function (pdf) is unknown. Now it follows that

$$\frac{\sigma^{2}(I)}{\bar{I}^{2}} = \frac{\overline{I^{2}}}{\bar{I}^{2}} - 1, \tag{6}$$

so what one needs to know are the statistics (pdf) of $\overline{I^{2}}$ because the pdf of the mean intensity $\bar{I}$ is already well known (viz., the Erlang distribution) as a function of *k* independent samples [e.g., Marshall and Hitschfeld (1953), based upon the work of Lord Rayleigh (Rayleigh 1877, 35–42)].

The distribution of *I*^{2} can be readily derived from the Rayleigh distribution for *I*. However, the difficulty comes when one tries to determine the pdf corresponding to a sum of *I*^{2} values for *k* independent samples. (For a very large number, the distribution approaches the normal distribution, of course, but such large numbers of independent samples are rarely available.) As a consequence, in lieu of a closed-form solution, the pdf for *k* independent samples of *σ*^{2}(*I*)/*I*^{2} was determined by using a huge number of pseudorandom draws from an exponential distribution for the intensity *I* (applicable to Rayleigh statistics) and then by computing the statistics directly for various *k*. Results from these calculations are shown in Fig. 1. Particularly noteworthy is the large number of values of *σ*^{2}(*I*)/*I*^{2} less than unity when *k* is small, for reasons discussed in appendix B.
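The Monte Carlo procedure just described can be sketched as follows. The details are assumptions rather than a record of the original computation: intensities are drawn from a unit-mean exponential distribution, the variance is the unbiased sample variance (`ddof=1`), and the function name and trial counts are hypothetical.

```python
import numpy as np

def simulate_ratio(k, n_trials, rng):
    """Squared relative dispersion estimated from k independent
    exponentially distributed intensities, repeated n_trials times."""
    samples = rng.exponential(1.0, size=(n_trials, k))
    mean = samples.mean(axis=1)
    var = samples.var(axis=1, ddof=1)   # unbiased sample variance (assumption)
    return var / mean ** 2

rng = np.random.default_rng(1)
for k in (5, 20, 100):
    ratio = simulate_ratio(k, 50_000, rng)
    # Report the mean, the spread, and the fraction of estimates below unity.
    print(k, ratio.mean(), ratio.std(), np.mean(ratio < 1.0))
```

Under these assumptions the estimates cluster below unity when *k* is small and their spread narrows as *k* grows, which is why more independent samples make non-Rayleigh values easier to distinguish.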

These computations then allow one to calculate confidence limits so that when an observed *σ*^{2}(*I*)/*I*^{2} exceeds the appropriate threshold, one can claim the measurement of non-Rayleigh effects to a specific degree of confidence. These curves are illustrated in Fig. 2. For example, to have 99% confidence that an observed value of 2 or larger was *not* simply due to Rayleigh fluctuations, one would need a minimum of 24 independent samples, about 5 times what is normally used to estimate many observed mean intensities. For convenience, these curves are well represented using a parametric equation of the form

where *T* is the threshold, *k* is the number of independent samples, and the coefficients *a* and *b* are listed for the 90%, 95%, and 99% certainties in Table 1. These fits are valid from *k* = 2 up to approximately *k* = 4000, after which the distribution can be assumed to be Gaussian.
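Where the fitted coefficients are not at hand, a threshold can instead be estimated directly as a quantile of simulated Rayleigh-only values. The sketch below takes that route; it is not the parametric fit (7). The function names, the default trial count, and the use of the unbiased sample variance are assumptions, so the resulting numbers need not reproduce the thresholds quoted in this paper.

```python
import numpy as np

def rayleigh_threshold(k, confidence, n_trials=100_000, seed=0):
    """Monte Carlo estimate of the value that pure Rayleigh fluctuations
    stay below with the given probability, for k independent samples."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(1.0, size=(n_trials, k))
    ratio = x.var(axis=1, ddof=1) / x.mean(axis=1) ** 2
    return float(np.quantile(ratio, confidence))

def is_non_rayleigh(observed_ratio, k, confidence=0.99):
    """True when the observed squared relative dispersion exceeds the
    simulated Rayleigh-only threshold."""
    return observed_ratio > rayleigh_threshold(k, confidence)

for k in (13, 24, 50):
    print(k, rayleigh_threshold(k, 0.99))
```

The threshold falls toward unity as *k* increases, so a given observed value becomes significant at a higher confidence level when more independent samples are available.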

From these results, it is clear that the key to minimizing the effects of Rayleigh fluctuations, and, therefore, to enhancing the detection of meteorologically useful non-Rayleigh signal fluctuations, is to use as large a number of independent samples as possible. Most conventional meteorological radars are unfortunately not equipped to increase the number of independent samples. In the current study, then, one is restricted to what nature provides. As will be seen in the next section, to access a sufficient number of independent samples, averaging is required. It is fairly typical, for a 1°-beam, 10-cm-wavelength radar, to assume on the order of 4–6 independent samples per 1° in azimuth. Samples in successive range bins, on the other hand, are statistically independent. Therefore, when using conventional radars, one must average over several range bins and a few azimuths to get enough independent samples. Nevertheless, despite this reduction in spatial resolution, some interesting results are found, as is illustrated in the next section.

## 4. Some example observations

Two azimuthal-range scans are considered in this section. The first, henceforth referred to as scan 1, was measured on 9 September 2005. Scan 2 was observed on 26 July 2006. These data were kindly provided for this preliminary research by D. Brunkow of the Colorado State University–University of Chicago and Illinois State Water Survey (CSU-CHILL) National Radar Facility at Colorado State University in Fort Collins, Colorado. In both cases the observations were made in rain. The CHILL radar has both a nominal 1° beam and a nominal wavelength of 10 cm.

In each scan, over a set of 150-m range bins and over several degrees of azimuth, the in-phase (*I*) and quadrature (*Q*) data were given for each pulse. This permitted a detailed inspection of data quality (e.g., the *I* and *Q* were found to be completely balanced). It was also then possible to compute the intensity (*I*^{2} + *Q*^{2}) for each pulse and range bin. Data over a number of pulses and range bins were then used to estimate the mean intensity and the mean squared intensity and, therefore, the variance *σ*^{2}(*I*), where *I* now denotes the intensity. The number of data combined was selected to preserve as high a spatial resolution as possible while still yielding a reasonable number of statistically independent samples. The number of statistically independent samples was estimated assuming that there were five independent samples per 1° of azimuth and that the range bins along each azimuthal radial separated by 1° were also statistically independent. This means that the numbers of independent samples for each estimate of the variance and mean of *I* were, on average, on the order of 13 and 50 for scans 1 and 2, respectively. The corresponding 99% confidence thresholds that the relative dispersions (squared) were not Rayleigh in origin were computed using (7). They were 2.24 and 1.80 for scans 1 and 2, respectively. That is, any squared relative dispersions greater than 2.24 and 1.80 were 99% likely *not* to be due to a Rayleigh fluctuation, or, to put it more positively, they were 99% *likely* to be due to clustering weighted by statistical heterogeneity.
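The per-block processing described above can be sketched as follows. This is a minimal illustration rather than the processing chain used with the CSU-CHILL data: the array layout (pulses by range bins), the function names, and the simple product used to count independent samples are hypothetical. The synthetic test signal uses independent complex Gaussian pulses, which real pulse-to-pulse data generally are not.

```python
import numpy as np

def block_ratio(iq_block):
    """Squared relative dispersion of intensity for one averaging block.

    iq_block : complex array of shape (n_pulses, n_range_bins), with the
               in-phase part as the real component and the quadrature
               part as the imaginary component (assumed layout).
    """
    power = np.abs(iq_block) ** 2        # I^2 + Q^2 per pulse and bin
    return power.var() / power.mean() ** 2

def independent_samples(azimuth_span_deg, n_range_bins, per_degree=5.0):
    """Rough count of independent samples: per_degree samples per degree
    of azimuth for each independent range bin (hypothetical bookkeeping)."""
    return per_degree * azimuth_span_deg * n_range_bins

# Synthetic Rayleigh-only block: the ratio should be close to unity.
rng = np.random.default_rng(0)
shape = (2000, 5)
synthetic = rng.normal(size=shape) + 1j * rng.normal(size=shape)
print(block_ratio(synthetic), independent_samples(1.0, 10))
```

An observed block value would then be compared with the confidence threshold for the corresponding number of independent samples.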

The frequency distributions of the ratio *σ*^{2}(*Z*)/*Z*^{2} for both scans are illustrated in Fig. 3, along with the two thresholds. It is obvious that, in both cases, there were frequent, statistically significant non-Rayleigh values of *σ*^{2}(*Z*)/*Z*^{2}.

To see where these occur, the shaded contours of radar reflectivity *Z* are shown for scan 1 in Fig. 4, along with the contours of *σ*^{2}(*Z*)/*Z*^{2}. The most significant values of the relative dispersion are often found in or near the highest radar reflectivity factors *Z*, such as in the core of a small, convective shower at about 156° azimuth and 36-km range (156°, 36 km). This is repeated as well in showers at (158°, 42 km) and (171°, 34 km). This is not a hard and fast rule, however. For example, the maximum *Z* at (171°, 37 km) does not show a corresponding maximum in the relative dispersion of *Z*, and there are other locations at which *σ*^{2}(*Z*)/*Z*^{2} is significant even though *Z* is only between 25 and 30 dB*Z* (e.g., at 159° and 33 km). It appears, then, that the two fields [i.e., *Z* and *σ*^{2}(*Z*)/*Z*^{2}] exhibit a large amount of statistical independence (correlation coefficient = 0.45) except, perhaps, when *Z* > 40 dB*Z*.

This is even more apparent in the second scan (Fig. 5), in which *Z* and the relative dispersion are actually slightly anticorrelated largely because the greatest relative dispersion of *Z* is found adjacent to the only significant maximum of *Z* near (65°, 24 km). These two figures, then, suggest that *Z* and its relative dispersion are, for all practical purposes, independent variables.

The patterns for the relative dispersion in *R* will essentially be identical to those for *Z* for the reasons discussed in the previous section. However, because some of the locations where significant relative dispersion for *R* occurs are also where *R* is small (as suggested by the presence of some significant relative dispersions in *Z* where *Z* is small), the variability is not likely to be meteorologically important at those locations. Hence, a measure of greater meteorological relevance, perhaps, would be the standard deviation of *R* estimated from its relative dispersion using an estimate or measurement of *R*. This standard deviation *σ*_{R} is then a measure of the physical (*not signal statistical*) uncertainty or variability in the mean value. [Note that for somewhat smaller relative dispersions of *R* the standard deviation of *R* can still become very large when the mean value of *R* is large, thus changing the contour pattern, as a comparison of Figs. 5 and 7 (described below) demonstrates.]

So how does one estimate *R*? It can be done, for example, by using dual-polarization radar measurements. However, without that capability, one can at least attempt to get a qualitative feeling for what is happening (for purposes of illustration and discussion) by using a *Z*–*R* relation with which one feels comfortable in a particular setting. However, this can only be done with the full understanding that such relationships are fraught with substantial uncertainties and potential errors (e.g., see Jameson and Kostinski 2001a, b), especially because the selected *Z*–*R* relation is arbitrary in this case. Because the rain in Figs. 4 and 5 occurs during convection, the Sekhon and Srivastava (1971) *Z*–*R* relation *Z* = 300*R*^{1.35} is used to estimate *R* using the *Z* plotted in Figs. 4 and 5. The intrinsic standard deviation in the estimate of *R* is then calculated using

as discussed in appendix A section b. Remember, this estimate of the standard deviation is that arising solely from clustering weighted by the statistical heterogeneity and does not include the errors associated with the selected *Z*–*R* relation or radar calibration (which does not affect the relative dispersions themselves, of course).
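The chain of steps described in the text can be assembled into a short numerical sketch. It rests on assumptions rather than on a transcription of (8): unity is subtracted from the observed squared relative dispersion of *Z* to obtain the intrinsic value, the squared relative dispersion of *R* is taken to be 0.8 times that intrinsic value, and *R* comes from *Z* = 300*R*^{1.35}. Clipping negative intrinsic values to zero and the function names are hypothetical choices.

```python
import numpy as np

def rain_rate_from_dbz(dbz, a=300.0, b=1.35):
    """Invert Z = a * R**b, with Z in mm^6 m^-3 and R in mm/h."""
    z_linear = 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)
    return (z_linear / a) ** (1.0 / b)

def sigma_rain_rate(dbz, observed_ratio_z, factor=0.8):
    """Intrinsic standard deviation of R from the observed squared
    relative dispersion of Z.

    Unity is subtracted to remove the Rayleigh contribution; negative
    results are clipped to zero (an assumption, not from the text).
    """
    intrinsic_z = np.maximum(np.asarray(observed_ratio_z, dtype=float) - 1.0, 0.0)
    return rain_rate_from_dbz(dbz) * np.sqrt(factor * intrinsic_z)

r = rain_rate_from_dbz(40.0)
s = sigma_rain_rate(40.0, 2.25)
print(r, s)
```

For an observed squared relative dispersion of 2.25, the intrinsic value is 1.25, and 0.8 times 1.25 is unity, so under these assumptions the estimated standard deviation equals the mean rainfall rate itself.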

The precise numbers should not be taken too seriously because, after all, an arbitrary *Z*–*R* relation is being used. Nevertheless, the results make an important point. Figures 6 and 7 illustrate the field of estimated *R* with superposed contours of the estimated standard deviation of the mean rainfall rate *σ*_{R} arising from clustering weighted by statistical heterogeneities. It is clear that, in Figs. 6 and 7, just where *R* is largest, the meteorological interpretation of the mean value becomes the most ambiguous.

What does that mean? One interpretation is that these contours represent an “uncertainty” in the mean value. This is true, but one must understand that the uncertainty in this case is not because of statistical signal fluctuations. Rather, it is a reflection of the intrinsic uncertainty and, thus, ambiguity in the very meaning of an average value at those locations because of sub-beam-scale variability. The second and alternative interpretation is that the standard deviation means that a wide range of mean values are occurring simultaneously so that no one mean value applies uniformly to the entire domain. After a moment’s consideration, this is, perhaps, not a big surprise. However, in almost all studies of radar rainfall measurements, the mean value is taken to be uniformly applicable to the entire radar volume, and it certainly reveals the inherent difficulty in any comparison of, say, a rain gauge measurement with that estimated using a radar.

Perhaps more sobering is that unknown deviations of the radar estimate from the true mean value are possible, if not likely, where *σ*_{R} is most significant because one has no idea about the actual pdf of *R* at those locations. That is, for example, if a large *σ*_{R} is associated with a skewed distribution of *R*, the radar could be observing one value (e.g., that part associated with larger drops), say, more toward the tail of the distribution, while, in reality, a watershed may be experiencing a different *R* more akin to the modal value. In particular, then, note that in Fig. 6 the standard deviation in the mean rainfall rate appears to be about 2 times the calculated mean value for a few of the small convective showers. In those circumstances, what does an average value really tell us? Even in Fig. 7, there are several locations at which the variability in *R* is greater than or of a magnitude that is comparable to the estimated mean value, and that is only for one standard deviation! {Recall that had this same rain been “steady” [i.e., Poissonian numbers of drops and a constant (steady) drop size distribution; Jameson and Kostinski 2002a], there would have been no such ambiguity from clustering and statistical heterogeneity. The mean would represent the actual mean, assuming that the *Z*–*R* relation were applicable. However, this is the assumption implicit in most analyses and for all operational applications, even where the rain is not steady.} Remember, too, that had dual-polarization measurements of rainfall been available, this same ambiguity would apply to those rainfall estimates of the means as well. There is simply an intrinsic ambiguity in the meaning of an average rainfall rate where there is raindrop clustering and statistical heterogeneity. 
When this ambiguity is called to an investigator’s attention, the usual response is “of course.” What is less often acknowledged, however, is that the ambiguity is routinely ignored, in part because no measure of it has been readily available. What this paper offers is a method for making such measurements.

It is also interesting to consider that at locations of significant relative dispersions of *Z* and, therefore, of *n*, it is possible that Bragg scatter may, at times, be making a significant contribution to the radar signals [see discussion of Bragg scatter in Kostinski and Jameson (2000, 907–909)]. With regard to rainfall estimation, this is important because, whether using *Z* or polarization variables, it is always assumed that the radar parameters are proportional to *n*, the number of particles in the sample volume. However, for Bragg scatter, the signal is proportional to *σ*^{2}(*n*), which, where there is clustering, is of order *n*^{2}. Thus, if there happens to be a significant Fourier component [i.e., a component of the Fourier transform of the pair correlation function along the direction of propagation (Kostinski and Jameson 2000)] of the Bragg scatter at the radar wavelength at those locations at which the relative dispersions are large, then Bragg scatter may lead, at times, to an overestimation of *R*. For now, of course, this is pure speculation and the subject of future research, but such concerns do provide yet one more reason for observing and using non-Rayleigh signal statistics.

What can be done? At the very least it should now become a routine expectation to see bars of uncertainty arising from raindrop clustering weighted by the statistical heterogeneity associated with each estimate of a mean rainfall rate. However, it is important to remember always that this uncertainty is *not* statistical but physical and that it will not disappear with increased averaging.

## 5. Brief summary and a few suggestions

In this work, I have explored the existence and potential application of non-Rayleigh signal statistics to studies of rain. Drop clustering and statistical heterogeneity are intrinsic properties of most rain. These, in turn, lead to variability in the intrinsic radar reflectivity factor *Z* and rainfall rate *R*. From the usual perspective of independent scatterers, for a radar having a stationary antenna, this intrinsic variability is invisible because the backscattered radar signals are described completely by the fluctuations prescribed by Rayleigh statistics; that is, the measured amplitudes are simply the net result of random complex amplitudes from all of the illuminated scatterers. However, once the antenna begins moving, this is no longer true because the variability in the intrinsic *Z* modulates the Rayleigh signal fluctuations, as discussed in Jameson and Kostinski (1996). This means that, in principle, one can use these modulations to probe the variability of the intrinsic *Z* itself; that is, one can use deviations from Rayleigh signal fluctuations to explore the variability of the rain occurring over scales smaller than a beam dimension.

It is obvious that radars usually measure a mean backscattered intensity *I* related to the intrinsic mean *Z* through the radar constant. However, the mean intensity alone is insufficient for quantifying non-Rayleigh signal fluctuations that act to broaden the pdf of *I* beyond that expected from normal Rayleigh fluctuations. Measurements of the higher moments of the pdf of *I* are, therefore, required. Because the difficulty (i.e., the number of independent samples required for statistically meaningful estimates) increases rapidly as the power of the moment increases, the focus here is on using the variance of the pdf of *I*. (There are other important physical reasons as well for using the variance, as is apparent in section 2 and appendix A.) To be specific, the square of the relative dispersion *σ*^{2}(*I*)/*I*^{2} = *σ*^{2}(*Z*)/*Z*^{2} is considered.

It is shown that, in conditions of raindrop clustering and statistical heterogeneity, if one could somehow measure the intrinsic relative dispersion of *Z*, its square would be related to the raindrop clustering index (in essence *σ*^{2}(*n*)/*n*^{2} when *n* is large) appropriately weighted by the square of the fractional contribution that each of the components of the heterogeneous rain makes to the mean overall reflectivity factor, as discussed in section 2 and appendix A.

In section 3, the statistics of the square of the relative dispersion are presented and confidence limits are derived numerically, for want of a closed-form expression, as a function of the number of independent samples. These confidence limits are necessary to exclude normal Rayleigh amplitude fluctuations. It is found that, when there are only relatively few statistically independent samples, most values of the square of the relative dispersion of *I* (and *Z*) are less than unity. This is good news for most conventional radar observations, which usually have only a few independent samples per estimate of the mean. Except under unusual circumstances, non-Rayleigh signal statistics will have little effect on the reliability of most of those estimates. This condition, however, may change if radars in the future are equipped to have nearly every pulse be independent and estimates are then formed using a large number of independent samples.
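The numerical derivation of such limits can be sketched with a short Monte Carlo experiment. The following is a minimal illustration and not the procedure used in this work: it assumes that pure Rayleigh signals give exponentially distributed intensities with unit mean, that the *k* samples are fully independent, and that the variance is estimated with the (*k* − 1) normalization. The function name `rayleigh_rd2_limits`, the number of trials, and the seed are arbitrary choices made for this sketch.

```python
import numpy as np


def rayleigh_rd2_limits(k, n_trials=5000, level=0.99, seed=0):
    """Monte Carlo quantiles of sigma^2(I)/mean(I)^2 for k independent,
    exponentially distributed intensities (pure Rayleigh statistics).

    Returns (lower limit, median, upper limit) of the central `level` interval.
    """
    rng = np.random.default_rng(seed)
    intensity = rng.exponential(scale=1.0, size=(n_trials, k))
    mean_i = intensity.mean(axis=1)
    var_i = intensity.var(axis=1, ddof=1)  # assumed estimator (k - 1 normalization)
    rd2 = var_i / mean_i**2
    tail = (1.0 - level) / 2.0
    lower, median, upper = np.quantile(rd2, [tail, 0.5, 1.0 - tail])
    return lower, median, upper


if __name__ == "__main__":
    for k in (8, 64, 1024):
        lower, median, upper = rayleigh_rd2_limits(k)
        print(f"k={k:5d}  lower={lower:.3f}  median={median:.3f}  upper={upper:.3f}")
```

Under these assumptions, an observed squared relative dispersion above the upper limit for the appropriate *k* would be a candidate for a non-Rayleigh contribution, whereas values inside the interval cannot be distinguished from ordinary Rayleigh fluctuations.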

Some preliminary observations in rain were then considered. Pulse-to-pulse time series *I* and *Q* observations over several range bins and azimuths of data were analyzed by computing the intensities and then by combining observations both in range and azimuth, as discussed in the text. The number of independent samples was estimated for the two scans, and the 99% confidence limits were calculated. Measurable and significant dispersions of *Z* and, therefore, *R* were found in both scans. These observations suggest that often just where the rainfall is likely to be most intense the “average” is also likely to be the most variable and its meteorological interpretation is likely to be the most ambiguous. At the very least, investigators should be aware of this ambiguity and perhaps should even begin bounding their estimates by assigning standard deviations arising from raindrop clustering weighted by the statistical heterogeneity.

Furthermore, locations of large relative dispersions also indicate where Bragg scatter could potentially lead to inflated estimates of *R* regardless of the radar technique used. This effect, of course, remains to be quantified in future research and is well beyond the intended scope of this work.

Nevertheless, these results suggest that routine observations of the relative dispersion of *I* in excess of those arising from Rayleigh fluctuations could be quantitatively useful and could be readily achieved even using existing meteorological radars with an adequate processing capability. There is really only one radar requirement: the ability to compute the variance (or, to be more precise, the average *I*^{2}) of the intensity at every range bin from a series of intensity measurements.
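As a rough illustration of how small that processing requirement is, the sketch below computes the squared relative dispersion at every range bin from the average intensity and the average squared intensity. The array layout (pulses along the first axis, range bins along the second), the function name, and the synthetic test signal are assumptions made for this example; note also that the in-phase voltage is written `i_comp` to keep it distinct from the intensity *I*.

```python
import numpy as np


def squared_relative_dispersion(i_comp, q_comp):
    """Per-range-bin sigma^2(I)/mean(I)^2 from pulse time series.

    i_comp, q_comp: in-phase and quadrature voltages with shape
    (n_pulses, n_range_bins); this layout is an assumption of the sketch.
    """
    intensity = np.asarray(i_comp) ** 2 + np.asarray(q_comp) ** 2
    mean_i = intensity.mean(axis=0)          # average intensity per bin
    mean_i2 = (intensity ** 2).mean(axis=0)  # average squared intensity per bin
    return mean_i2 / mean_i ** 2 - 1.0       # variance divided by mean squared


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic Rayleigh signal: independent Gaussian I and Q voltages.
    i_comp = rng.normal(size=(20000, 4))
    q_comp = rng.normal(size=(20000, 4))
    print(squared_relative_dispersion(i_comp, q_comp))  # each value close to 1
```

The estimate is only meaningful when enough statistically independent samples enter each average, so in practice the pulses would have to be combined over range and azimuth as described in the text.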

The effect of the angular separation Δ between statistically independent single-pulse samples and of the width *B* over which the ensemble average is measured was considered in Jameson and Kostinski (1999b). That study showed that Δ/*B* should be ≪1, which is just another way of saying that the more statistically independent samples there are, the better. This can be achieved either by combining samples, each displaced one beamwidth, from a radar having a narrow beam or, for larger beams, by using chirp, whitening, or some other technique to increase the statistical independence among successive pulses, each of which is usually collected over a fraction of the beam dimension. Thus, the detection of non-Rayleigh fluctuations depends upon factors such as the rotation rate, the pulse-repetition frequency, the beam size, and the wavelength because they determine how many independent samples the investigator will have available for estimating *σ*^{2}_{I}.

There is, then, somewhat of a paradox. If one wants to minimize non-Rayleigh effects, one should use a small number of independent samples, which is currently the situation with most conventional meteorological radars. However, these days there is an increased interest in rapid scanning. To measure the mean intensities more quickly, though, one needs more independent samples. The cost is that non-Rayleigh signal effects will become increasingly important; they will actually enhance the uncertainty associated with each estimate. (One can, then, at least measure these effects, in principle, of course.) On the other hand, if one wants to perform research such as that presented here, there is no choice but to increase the number of independent samples.

## Acknowledgments

This work was supported by the National Science Foundation (NSF) under Grant ATM05-31996. The author thanks Dave Brunkow and the NSF-funded National CHILL Radar Facility for so conscientiously supplying the high-quality data used in this work.

## REFERENCES

### APPENDIX A

#### Topics Concerning the Relation between the Relative Dispersions of Z and R

##### On the apparent correlations among parameters of drop size distributions

Another consideration is whether the mean drop diameter *D* [or, in the more general case, the pdf of drop sizes itself: *p*(*D*)] and the total number *n* of drops of all sizes observed per unit sample volume are correlated or are statistically independent. Over years of observations, a consensus in the literature has evolved that *D* (and other powers of *D*) and *n* are correlated. [Indeed, for example, the Marshall and Palmer (1948) relations claim that the slope of the distribution Λ and the multiplicative concentration term *N*_{o} are correlated. As shown in Kostinski and Jameson (1999a), Λ = 1/*D* while *N*_{o} = *n*/*D* so that Marshall and Palmer (1948) require that *n* and *D* also be correlated. In addition, they also mandate that even the pdf of the distribution of the drop sizes, *p*(*D*) = *N*(*D*)/*n*, must be correlated with *n*.] There are physical arguments both in favor of and against the existence of such correlations. However, Jameson and Kostinski (2002b, their Fig. 1) show that the observations upon which most of these “correlations” are based are flawed because of poor sampling inherent in past measurements. They also show that the poor sampling leads to an *apparent* but *fictitious* correlation among the parameters. Even radars, with much-improved sampling volumes, are so strongly biased toward large drop sizes that they too may introduce apparent but fictitious correlations. Whether or not such correlations exist, it is the opinion of this author that they have not yet been convincingly demonstrated. Therefore, for the purposes here, it will be assumed that *D*, *p*(*D*), and *n* are all statistically independent. (As shown in section b of this appendix, however, if, in fact, such correlations exist, the estimates of the relative dispersions of *Z* and *R* will be little affected anyway because they are so overwhelmingly dominated by the relative dispersions of *n* so that drop size distributions play a very minor role.)

##### Simple models of statistically heterogeneous rain and the associated relative dispersions

In the case of statistically homogeneous rain, Jameson and Kostinski (2001a) argue that the terms involving the variance of *n* quickly overwhelm terms involving the drop size distributions. To see whether this is true in statistically heterogeneous conditions, look at the ratio of these two terms, namely,

and

for some simple models of statistical heterogeneity. For *R* and *Z*, these quantities, in essence, measure the relative importance of the variability in the total number of drops in the sample volume and the variability caused by changes and fluctuations in the drop size distributions.

Suppose one now considers three statistically homogeneous components, one representing stratiform rain, one representing convective precipitation, and the third representing a transition regime between the first two. Furthermore, suppose each component is described by a nontruncated exponential diameter distribution characterized by mean drop sizes of 0.25, 0.3, and 0.4 mm for the stratiform, mixed, and convective rains, respectively. [From the more traditional perspective these mean diameters correspond to slopes of 40, 33.3, and 25 cm^{−1} as explained in Kostinski and Jameson (1999).] It will also be assumed that a radar is being used so that *E*_{i}(*n*) = 10 billion drops in the sample volume. It is also initially assumed that the stratiform rain is unclustered [*σ*^{2}(*n*) = *E*(*n*) and *F*_{R} = *F*_{Z} = 1] and the mixed rain has a modest clustering index [the clustering index is given by *σ*^{2}(*n*)/*E*^{2}(*n*) − 1/*E*(*n*) (Jameson and Kostinski 1999; Shaw et al. 2002; Jameson 2005); for the large sampling volumes of radars, this reduces to *σ*^{2}(*n*)/*E*^{2}(*n*)] of 1.3 [i.e., *σ*^{2}(*n*) = 1.3*E*^{2}(*n*) and *F*_{R} = *F*_{Z} = 1.5]. Insofar as turbulence is at least partially responsible for the clustering of rain, convective rain is often likely to be more clustered, and therefore the clustering index is set to 2 [*σ*^{2}(*n*) = 2*E*^{2}(*n*) and *F*_{R} = *F*_{Z} = 2]. Also, initially it is assumed that the data are dominated by the unclustered stratiform rain so that *w*_{s} = 0.6, *w*_{m} = 0.3, and *w*_{c} = 0.1, where the subscripts refer to stratiform, mixed, and convective, respectively.

After inserting all of these into (7)–(8) and (A1)–(A2), it is found that the square of the relative dispersion of the rain is 0.34 while that for *Z* is 0.66. Moreover, the ratios of clustering to drop size variability for both *R* and *Z* are on the order of 200–300 million. That is, the first terms in (7) and (8) are, for all practical purposes, negligible.

Now reverse the importance of the stratiform and convective components so that *w*_{s} = 0.1, *w*_{m} = 0.3, and *w*_{c} = 0.6 with all else kept the same. In this case, it is found that the square of the relative dispersion of the rain is now 1.38 while that for *Z* is 1.66, which is fairly typical of observations discussed below. Thus, in the analyses, it is assumed that the square of the relative dispersion of the rain is about 0.8 times that of the radar reflectivity factor. Moreover, the ratios (A1) and (A2) are both once again on the order of 300 million.
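The values quoted for *Z* under the two weightings can be checked with a few lines of arithmetic. The sketch below assumes the clustering-dominated form described in section 5, in which the squared relative dispersion of *Z* is the sum over components of the clustering index multiplied by the square of the fractional contribution of each component to the mean *Z*. It also uses *E*(*D*^{6}) = 720*D*^{6} for an exponential distribution with mean diameter *D* and the same *E*(*n*) for every component, so that both factors cancel. The function and variable names are illustrative only, and *R* is omitted because it would require an assumed fall speed law.

```python
import numpy as np

MEAN_D_MM = np.array([0.25, 0.30, 0.40])      # stratiform, mixed, convective
CLUSTERING_INDEX = np.array([0.0, 1.3, 2.0])  # unclustered rain ~ 1/E(n), taken as 0


def rd2_z_clustering(weights, mean_d=MEAN_D_MM, clustering=CLUSTERING_INDEX):
    """Clustering-dominated squared relative dispersion of Z (assumed form)."""
    weights = np.asarray(weights, dtype=float)
    # E(D^6) = 720 * mean_d**6 for an exponential pdf; the factor 720 and the
    # common E(n) cancel when the fractional contributions are formed.
    contribution = weights * mean_d ** 6
    fraction = contribution / contribution.sum()
    return float(np.sum(clustering * fraction ** 2))


if __name__ == "__main__":
    print("slopes (cm^-1):", 1.0 / (MEAN_D_MM / 10.0))   # about 40, 33.3, 25
    print(round(rd2_z_clustering([0.6, 0.3, 0.1]), 2))   # 0.66
    print(round(rd2_z_clustering([0.1, 0.3, 0.6]), 2))   # 1.66
```

Because the convective component has the largest mean diameter, its share of the mean *Z* rises from roughly one-half to more than 90% when the weights are reversed, which is why the dispersion of *Z* approaches the convective clustering index of 2.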

Last, consider the unrealistic case of the above condition, except now suppose that all clustering is eliminated. It is then found that the squared relative dispersion of *Z* is about 77 × 10^{−9} while that for *R* is less than about 2.6 × 10^{−9}. [They are small because they are dominated by a factor of 1/*E*(*n*).] The ratios given by (A1) and (A2) are now about 0.03. That is, when there is no clustering, the drop size distributions do play the dominant role (but then the relative dispersions are all approximately zero anyway).
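The unclustered value for *Z* can be reproduced in the same spirit. This sketch assumes Poisson drop counts with independent, identically distributed diameters within each component (a compound Poisson sum), for which the variance of the summed *D*^{6} equals *E*(*n*)*E*(*D*^{12}), and an exponential pdf, for which the *m*th moment is *m*! times the *m*th power of the mean diameter. The names are hypothetical, and the calculation is an illustration rather than the author's code.

```python
import math

import numpy as np


def rd2_z_unclustered(weights, mean_d_mm, expected_n):
    """Squared relative dispersion of Z for Poisson (unclustered) drop counts."""
    weights = np.asarray(weights, dtype=float)
    mean_d = np.asarray(mean_d_mm, dtype=float)
    # Exponential pdf: E(D^12) / E(D^6)^2 = 12! / (6!)^2 = 924 for each component.
    moment_ratio = math.factorial(12) / math.factorial(6) ** 2
    contribution = weights * mean_d ** 6
    fraction = contribution / contribution.sum()
    return moment_ratio * float(np.sum(fraction ** 2)) / expected_n


if __name__ == "__main__":
    value = rd2_z_unclustered([0.1, 0.3, 0.6], [0.25, 0.30, 0.40], 1.0e10)
    print(f"{value:.2e}")  # about 7.7e-08, i.e., about 77 x 10^-9
```

The result scales as 1/*E*(*n*), which makes explicit why the dispersion is negligible for radar-sized sample volumes once clustering is removed.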

The conclusion is that the variability caused by drop size distributions is not important relative to the effects of clustering, at least over the dimensions of most radar volumes.

##### Derivation of the relation between the relative dispersions of R, Z, and n in statistically heterogeneous conditions

Therefore, let it be supposed that one has *M* statistically homogeneous components of the rainfall occurring with weights *w*. If it is assumed for the moment that the pdfs of all the *w*_{i} are narrowly distributed about their respective means, then over a set of observations the expected value of the rainfall rate *E*(*R*), for example, is then given by

where *E*(*R*_{i}) is the expected value of the rainfall rate for the *i*th component. On the other hand, it is straightforward to show that the expected value of *R*^{2} is also given by

so that the variance can be written as

where cov denotes the covariance function.
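The algebra behind these three steps is the standard result for the mean and variance of a weighted sum. As a sketch, assuming that the weights are narrow enough to be treated as constants so that *R* is the weighted sum of the component rainfall rates, and writing the relations in generic notation rather than as a transcription of (A3)–(A5):

```latex
\begin{aligned}
E(R) &= \sum_{i=1}^{M} w_i\,E(R_i),\\
E(R^2) &= \sum_{i=1}^{M}\sum_{j=1}^{M} w_i w_j\,E(R_i R_j),\\
\sigma^2(R) &= E(R^2) - E^2(R)
  = \sum_{i=1}^{M} w_i^2\,\sigma^2(R_i)
  + \sum_{i \ne j} w_i w_j\,\operatorname{cov}(R_i R_j).
\end{aligned}
```

If the components are uncorrelated, the double sum over unequal indices vanishes and the variance reduces to the weighted sum of the component variances.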

Now the statistically homogeneous components are functionally independent (consistent with their statistical homogeneity). Furthermore, it is easy to show that cov(*R*_{i}*R*_{j}) is proportional to cov(*n*_{i}*n*_{j}), where *n*_{i} and *n*_{j} are the number of drops associated with the *i*th and *j*th components, respectively. By assuming that there is no reason to expect these to be correlated, the covariance term in (A5) can be ignored. Now, using the discussion in the previous section and the expressions (17), (19), and (20) from Jameson and Kostinski (2001a), one finally arrives at

and

where the summation is over the statistically homogeneous components and *V* is the fall velocity of the drop of size *D*. As mentioned previously, *F*_{R} and *F*_{Z} are factors taking into account any statistical dependence among the drops such that they are unity when the drops are completely statistically independent, but they may reach values of up to 4, based upon observations. [For more details the reader is directed to the discussion in Jameson and Kostinski (2001a, 528–529).]

The important feature of (A6) and (A7) is that they allow one to model statistical heterogeneity along with drop clustering (appendix A, section b). What is important here, however, is that, in most realistic radar measurements, clustering (variance of *n*) will almost always dominate the relative dispersions of both *R* and *Z* (see appendix A, section b). From that perspective, then, just consider the second term in (A7) so that

After noting that *E*_{i}(*Z*) ∝ *E*_{i}(*n*)*E*_{i}(*D*^{6}), it is found that

so that

and

where the summations are over all of the statistically homogeneous components of the statistically heterogeneous data and the angle brackets denote the average over the weights of the components *w*_{i}, that is, the average over all of the statistically heterogeneous data. It should be remembered, however, that (A10) and (A11) are based upon the assumption that the pdfs of the *w*_{i} are sufficiently narrow. When that is not the case, other terms related to the statistical heterogeneity enter the analysis. However, in that case it can be shown that the clustering still accounts for 63% (for uniform distributions of *w*_{i}) to nearly 100% (for Dirac-like distributions of *w*_{i}) of the observed variability. Such considerations, however, do not impact the work or results presented here.

### APPENDIX B

#### A Discussion of Fig. 1

The large number of *σ*^{2}(*I*)/*I*^{2} less than unity when *k* is small occurs because, for the exponential distribution of *I* (derived from the Rayleigh distribution of amplitudes), over 86% of the instantaneous values are of magnitude less than 2*I* so that it takes many, many observations to sample adequately the tails (and variance) of the distribution. As a consequence, when *k* is small, most values of the variance will be underestimated, as Fig. 1 illustrates. Moreover, for small *k*, the distribution also has a long tail. This is important because meteorological radars typically use only a few (4–8) independent samples to form an estimate of the mean intensity or radar reflectivity. The result is that when *k* is small any contributions to the relative dispersions arising from clustering and statistical heterogeneities would be very difficult to distinguish from those due to Rayleigh fluctuations. On one hand, in such cases (fairly typical of current meteorological radar observations), non-Rayleigh signals are, then, not likely to affect the radar measurements significantly except under extreme conditions such as those used in early studies of non-Rayleigh signal statistics (e.g., Schaffner et al. 1980). However, on the other hand, if a radar were equipped so that every pulse sampled were independent, then, as Fig. 1 illustrates, any relative dispersion from Rayleigh fluctuations would be narrowly confined to a region around unity, and contributions from clustering and statistical heterogeneities would then be more readily detected. This may well be one of the benefits (or costs, depending upon one’s perspective) of increasing the number of independent samples using whitening (Koivunen and Kostinski 1999), phase coding, or chirp techniques to improve sample independence.
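The 86% figure follows directly from the exponential distribution, and the tendency to underestimate the relative dispersion at small *k* can be made explicit. In the sketch below, `fraction_below` is the exponential cumulative probability, and `expected_rd2` is the exact mean of the sample estimate for *k* independent exponential intensities, *k*/(*k* + 1), when the variance uses the (*k* − 1) normalization. That closed form is a standard property of normalized exponential samples rather than a result stated in this paper, and the function names are hypothetical.

```python
import math

import numpy as np


def fraction_below(multiple):
    """P(I < multiple * mean) for exponentially distributed intensity."""
    return 1.0 - math.exp(-multiple)


def expected_rd2(k):
    """Exact mean of var(I, ddof=1)/mean(I)^2 for k independent exponential samples."""
    return k / (k + 1.0)


if __name__ == "__main__":
    print(f"{fraction_below(2.0):.4f}")  # 0.8647
    rng = np.random.default_rng(3)
    for k in (4, 8):
        x = rng.exponential(size=(200000, k))
        rd2 = x.var(axis=1, ddof=1) / x.mean(axis=1) ** 2
        # Columns: k, exact mean, simulated mean, fraction of estimates below unity.
        print(k, expected_rd2(k), round(rd2.mean(), 3), np.mean(rd2 < 1.0))
```

For the 4–8 independent samples typical of meteorological radars, the mean estimate is therefore already 10%–20% below unity, and because the distribution is skewed toward a long upper tail, the majority of individual estimates fall lower still.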

## Footnotes

*Corresponding author address:* A. R. Jameson, RJH Scientific, 5625 N. 32nd St., Arlington, VA 22207-1560. Email: arjatrjhsci@earthlink.net