Introduction
Investigators frequently acquire observations of raindrop sizes and seek to describe the drop size distributions (DSDs) of the underlying populations from which the samples were taken by analytical expressions, with the exponential or gamma function being most common. Although moment methods to estimate parameters for the DSD functions have become more or less traditional, Haddad et al. (1996, 1997) pointed out that such methods are biased. Hydrologists have long been aware of this bias (e.g., Wallis et al. 1974; Maidment 1993), and Smith and Kliche (2003) provided examples of the bias for the case of raindrop observations. Nevertheless, the intuitive appeal of the moment approach seems almost irresistible, and the associated mathematical manipulations lend a convincing aura. Yet the methods are indeed biased, in the statistical sense that the expected values of the “fitted” parameters differ from the parameters of the underlying raindrop populations, and so can lead to erroneous inferences about the characteristics of the DSDs being sampled.
The bias in the moment methods can be demonstrated by testing their ability to recover parameters of known DSDs from which samples are taken. This must be done by computer simulation, because DSDs in nature are inherently unknown. The simulations herein use a Monte Carlo simulation procedure similar to that described in Smith et al. (1993) and outlined below. This paper gives results for samples taken from a hypothetical exponential DSD fitted with various moment-based procedures; corresponding results for samples taken from gamma DSDs will be presented later.
Simulation of raindrop sampling
The simulation proceeds from a selected value of NT by first drawing from a Poisson distribution with a mean value NT to determine the actual total number of drops C in a given sample. Then, C values of y drawn from the exponential PDF establish the (normalized) sizes of those drops. Normalized values for the six sample moments M1S through M6S are next calculated for each sample, and then various moment-based calculations (discussed in section 4 and summarized in the appendix) are applied to estimate the DSD parameters. For purposes of the present work, we classified the drop sizes into intervals of Δy = 0.02, representing the size classification procedure that is common to drop-measuring instruments, and truncated the exponential PDF at y = 3.0. Repetition of the sampling and fitting process yields the desired distributions. We used about 1 000 000 drops (e.g., 50 000 samples with NT = 20 and 5000 samples with NT = 200) in the simulations; because the probability of a drop in an exponential PDF being larger than y = 3.0 is 6 × 10⁻⁶, each simulation lacks a total of about six larger drops relative to a full exponential DSD.
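The sampling procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the paper: the function names are hypothetical, the normalized size is taken as y = D/Dm so that the exponential PDF is 4 exp(−4y) (consistent with Λ = 4/Dm), draws beyond y = 3.0 are simply rejected, each drop is assigned the midpoint of its size class, and the sample moments are plain sums of y^i with no sample-volume normalization.

```python
import math
import random

LAMBDA_NORM = 4.0   # assumed slope of the normalized PDF (y = D/Dm, Lambda = 4/Dm)
Y_MAX = 3.0         # truncation point quoted in the text
DY = 0.02           # size-class width quoted in the text


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def draw_sample(rng, nt):
    """Return the binned, normalized drop sizes of one simulated sample."""
    c = poisson_draw(rng, nt)              # actual number of drops C
    sizes = []
    while len(sizes) < c:
        y = rng.expovariate(LAMBDA_NORM)
        if y < Y_MAX:                      # assumed: out-of-range draws are rejected
            # assumed: each drop takes the midpoint of its size class
            sizes.append((math.floor(y / DY) + 0.5) * DY)
    return sizes


def sample_moments(sizes, orders=range(1, 7)):
    """Sample moments M1S..M6S as sums of y**i over the drops in the sample."""
    return {i: sum(y ** i for y in sizes) for i in orders}


rng = random.Random(1)
sizes = draw_sample(rng, nt=100)
moments = sample_moments(sizes)
print(len(sizes), moments[3], moments[6])
```

Repeating `draw_sample` many times and fitting each sample gives the sampling distributions of the estimated parameters.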
Characteristics of sampling distributions
Inspection of (6) shows that the skewness increases with the order i of the moment Mi, and decreases with increasing sample size NT. Figure 1 illustrates the former property for the sample moments M3S (LWC) and M6S (Z), for the case NT = 100. The general tendency is for the sample moments to be lower than the corresponding population values; this behavior is the ultimate cause of the bias in the moment methods for estimating DSD parameters. As shown in section 4, the increase of the skewness with the moment order leads to greater biases when higher moments are employed. Figure 2 illustrates how the median sample moments approach the population values as the sample size increases. The skewness of the sampling distributions, and the resulting biases, decrease in a similar manner.
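The tendency of the sample moments to fall below the population values can be checked numerically. The sketch below is an illustration only, with hypothetical function names. It assumes the normalized exponential PDF 4 exp(−4y), omits the truncation and size binning for brevity, and uses the fact that the population value of the i-th moment is then NT · i!/4^i.

```python
import random
import statistics
from math import factorial

LAMBDA_NORM = 4.0  # assumed normalized slope (y = D/Dm, Lambda = 4/Dm)


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def moment_ratio_medians(nt, n_samples, orders=(3, 6), seed=0):
    """Median of (sample moment / population moment) for each requested order."""
    rng = random.Random(seed)
    ratios = {i: [] for i in orders}
    for _ in range(n_samples):
        c = poisson_draw(rng, nt)
        sizes = [rng.expovariate(LAMBDA_NORM) for _ in range(c)]
        for i in orders:
            population = nt * factorial(i) / LAMBDA_NORM ** i   # NT * E[y**i]
            ratios[i].append(sum(y ** i for y in sizes) / population)
    return {i: statistics.median(r) for i, r in ratios.items()}


medians = moment_ratio_medians(nt=100, n_samples=2000)
print(medians)
```

A median ratio below 1 means that more than half of the samples underestimate the population moment. The ratio for M6S should come out well below that for M3S.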
Sampling the numerous small drops can be a major instrumental problem for exponential DSDs, and adequately sampling the relatively rare large drops is also a concern. Fewer than one drop in 400 in an exponential DSD is larger than D = 1.5Dm, and only about one in 3000 is larger than D = 2Dm. However, the drops larger than 1.5Dm contribute more than 15% of the LWC and more than 60% of the reflectivity factor. Consequently, the relatively large but relatively rare drops tend to be important in determining the moments of physical significance. The sample values of these moments are, therefore, correlated with the size of the largest drop in each sample (e.g., Fig. 3). Table 1 demonstrates that these correlations are stronger for higher-order moments, and remain appreciable even for fairly large sample sizes. The sample moments are, in turn, correlated with each other—an artifact of the sampling variability as discussed in Smith et al. (1993).
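The fractions quoted in this paragraph follow from the incomplete gamma function, which for integer order reduces to a finite sum. The short check below assumes a full (untruncated) exponential DSD with ΛDm = 4 and takes LWC and Z as proportional to M3 and M6; the function name is hypothetical.

```python
from math import exp, factorial

LAMBDA_NORM = 4.0  # assumed: Lambda * Dm = 4


def fraction_above(order, y):
    """Fraction of the order-th moment of an exponential DSD from drops larger than y = D/Dm."""
    x = LAMBDA_NORM * y
    return exp(-x) * sum(x ** k / factorial(k) for k in range(order + 1))


for y in (1.5, 2.0):
    print(y, fraction_above(0, y), fraction_above(3, y), fraction_above(6, y))
```

Order 0 gives the fraction of drops by number, order 3 the share of the LWC, and order 6 the share of the reflectivity factor.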
Figure 5 of Smith et al. (1993) showed that the maximum drop size in an exponential DSD is rarely approached in samples of even hundreds of drops. The distribution of values along the abscissa in Fig. 3 here demonstrates the same thing. There is clearly no basis for assuming truncation of the underlying DSD at the maximum observed drop diameter, with samples of such sizes.
The bias in moment estimators
The essence of the moment approach for estimating parameters for DSD functions is to use the same number of moments calculated from observed raindrop size distributions as there are parameters in the function to be fitted. Analytical expressions for the selected moments of that function are solved algebraically for the needed parameters, and observed values of the sample moments are then entered into the resulting equations to estimate the parameters. The appendix summarizes the relevant mathematical expressions used here. The use of moment methods for rain DSDs evidently began with Waldvogel’s (1974) paper on the “N0 jump” of DSDs. He used observed values of moments M3S and M6S (i.e., LWC and Z) along with (5), or its equivalent, to determine pairs of parameters for exponential functions like (1) that purportedly represented the observed DSDs. However, most functions that are fitted in this way do not represent well either the samples upon which they are based or the underlying populations from which the samples were taken.
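As a worked illustration of this recipe, the two-moment solution for an exponential function can be written out. The notation below is an assumption: it takes the exponential DSD in the usual form N(D) = n0 exp(−ΛD) and is a sketch of the algebra, not a reproduction of the numbered equations or of the appendix expressions.

```latex
% Assumed exponential DSD and its i-th moment
N(D) = n_0 \exp(-\Lambda D), \qquad
M_i = \int_0^\infty D^i N(D)\, dD = \frac{n_0\, \Gamma(i+1)}{\Lambda^{i+1}}

% Two moments for two parameters: M_3 and M_6
M_3 = \frac{6\, n_0}{\Lambda^{4}}, \qquad
M_6 = \frac{720\, n_0}{\Lambda^{7}}
\;\Longrightarrow\;
\frac{M_3}{M_6} = \frac{\Lambda^{3}}{120}

% Estimators obtained by inserting the sample moments
\hat{\Lambda} = \left( \frac{120\, M_{3S}}{M_{6S}} \right)^{1/3}, \qquad
\hat{n}_0 = \frac{\hat{\Lambda}^{4} M_{3S}}{6}
```

The same steps applied to M2 and M3 give Λ̂ = 3 M2S/M3S and n̂0 = Λ̂³ M2S/2.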
The introductory section pointed out that such moment estimators are biased; the bias in DSD parameters estimated in this way was in fact already indicated in Smith et al. (1993). Figure 4, reproduced from that paper, compares sample estimates of D̂m with the value of Dm for the exponentially distributed population from which the (simulated) samples were drawn. In this example, more than 80% of the values of D̂m are underestimates, and the mean (the expected value) is about 78% of the population value. In terms of the more familiar exponential slope parameter Λ (=4/Dm), this means that Λ̂ is generally overestimated.
Estimators for exponential functions
Figure 5 includes two moment-method fits to the sample for exponential DSD functions. The two are the “Waldvogel fit,” based on moments M3 and M6, as employed in Waldvogel (1974), and one based on M2 and M3. The one using parameters based on M3 and M6 does not represent either the “observed” sample or the original population DSD very well; the smaller discrepancy resulting when the lower moments (M2 and M3) are used in the calculation is evident.
The foregoing discussion and the specific example in Fig. 5 suggest the general nature of the bias in moment estimators for parameters of exponential DSD functions: they tend to overestimate both the concentration (intercept) parameter n0 and the size (slope) parameter Λ. In terms of the parameters of (2), they tend to underestimate Dm and overestimate NT, yielding fits having too many drops that are too small when compared with the original raindrop population. The biases are greater when higher moments are used in the procedure; Fig. 6 illustrates this behavior for n0. Consequently, procedures that use sample values of reflectivity in the moment calculations lead to greater biases than ones employing only lower moments.
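The direction of these biases, and their growth with moment order, can be reproduced with a small simulation. The sketch below is illustrative only. It assumes the exponential form N(D) = n0 exp(−ΛD), normalized sizes with Λ = 4 (so that the population n0 is 4 NT), plain sums of y^i as sample moments, and no truncation or size binning. All function names are hypothetical.

```python
import random
import statistics

LAMBDA_NORM = 4.0  # assumed normalized slope: y = D/Dm with Lambda = 4/Dm


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def fit_m2_m3(m):
    """Exponential fit from M2 and M3: Lambda = 3*M2/M3, n0 = Lambda**3 * M2 / 2."""
    lam = 3.0 * m[2] / m[3]
    return lam, lam ** 3 * m[2] / 2.0


def fit_m3_m6(m):
    """Exponential fit from M3 and M6: Lambda = (120*M3/M6)**(1/3), n0 = Lambda**4 * M3 / 6."""
    lam = (120.0 * m[3] / m[6]) ** (1.0 / 3.0)
    return lam, lam ** 4 * m[3] / 6.0


def median_bias(fit, nt, n_samples, seed=0):
    """Median of (estimate / population value) for Lambda and n0."""
    rng = random.Random(seed)
    lam_ratio, n0_ratio = [], []
    for _ in range(n_samples):
        sizes = [rng.expovariate(LAMBDA_NORM) for _ in range(poisson_draw(rng, nt))]
        if not sizes:
            continue  # an empty sample cannot be fitted
        m = {i: sum(y ** i for y in sizes) for i in (2, 3, 6)}
        lam, n0 = fit(m)
        lam_ratio.append(lam / LAMBDA_NORM)
        n0_ratio.append(n0 / (LAMBDA_NORM * nt))   # population n0 = 4 * NT here
    return statistics.median(lam_ratio), statistics.median(n0_ratio)


low = median_bias(fit_m2_m3, nt=100, n_samples=2000)
high = median_bias(fit_m3_m6, nt=100, n_samples=2000)
print("M2, M3:", low)
print("M3, M6:", high)
```

Ratios above 1 indicate overestimation. Both fits use the same seed and therefore the same simulated samples, so the difference between them reflects only the choice of moments.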
Extension of this argument would appear to suggest that using the lowest moments—M0S, the sample size, and M1S, related to the mean drop diameter—would yield the smallest biases of all. Such would indeed be the case in a simple mathematical exercise like that involved here. However, as noted in Testud et al. (2001), Smith (2003), and elsewhere, instrument responses to very small drops are highly variable and often suspect. Thus, trying to use moments that are lower than M2S in the analysis would introduce another kind of uncertainty into the moment procedures. There may even be problems with using M2S; instruments that do not adequately sense drops as small as 0.2–0.3Dm may significantly underestimate the value of M2 (as illustrated in Table 2). If, say, Dm = 1 mm, this could be a challenging instrumentation problem. For that reason no approach using the lowest-order moments is considered here.
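The sensitivity of the lower moments to undetected small drops can be estimated directly. The sketch below assumes a full exponential DSD with ΛDm = 4 and an idealized instrument with a sharp lower cutoff, below which no drops are sensed. The function name is hypothetical, and the numbers are a direct calculation rather than entries taken from Table 2.

```python
from math import exp, factorial

LAMBDA_NORM = 4.0  # assumed: Lambda * Dm = 4


def fraction_below(order, y_min):
    """Fraction of the order-th population moment carried by drops smaller than y_min = D/Dm."""
    x = LAMBDA_NORM * y_min
    return 1.0 - exp(-x) * sum(x ** k / factorial(k) for k in range(order + 1))


for y_min in (0.2, 0.3):
    print(y_min, fraction_below(2, y_min), fraction_below(3, y_min))
```

Under these assumptions a cutoff at 0.2Dm removes roughly 5% of M2 and a cutoff at 0.3Dm roughly 12%, while M3 is affected much less.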
The skewness in the sampling distributions for the moments diminishes as the sample size increases (e.g., Fig. 2), so the bias in the moment estimators should also decrease with increasing sample size. Figure 7 shows that to be the case; with samples of hundreds or thousands of drops, the bias may become small enough to be negligible for many purposes.
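The dependence of the bias on sample size can be illustrated in the same way. This sketch, with hypothetical names, tracks the median of Λ̂/Λ for the M3, M6 fit at three values of NT. It assumes the estimator Λ̂ = (120 M3S/M6S)^(1/3) that follows from N(D) = n0 exp(−ΛD), normalized sizes with Λ = 4, and no truncation or size binning.

```python
import random
import statistics

LAMBDA_NORM = 4.0  # assumed normalized slope (y = D/Dm, Lambda = 4/Dm)


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def median_lambda_ratio(nt, n_samples, seed=0):
    """Median of Lambda_hat / Lambda for the M3, M6 exponential fit."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_samples):
        sizes = [rng.expovariate(LAMBDA_NORM) for _ in range(poisson_draw(rng, nt))]
        if not sizes:
            continue  # an empty sample cannot be fitted
        m3 = sum(y ** 3 for y in sizes)
        m6 = sum(y ** 6 for y in sizes)
        ratios.append((120.0 * m3 / m6) ** (1.0 / 3.0) / LAMBDA_NORM)
    return statistics.median(ratios)


results = {nt: median_lambda_ratio(nt, n_samples=1000) for nt in (20, 100, 500)}
print(results)
```

The median ratio should move toward 1 as NT grows, although with this high-order moment pair it is still above 1 at NT = 500.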
Estimators for gamma functions
The example in Fig. 8, for the same ideal sample as that in Fig. 5, suggests that not to be the case. The curve for the gamma fit based on moments M2S, M3S, and M4S, as suggested in Smith (2003), has a shape parameter μ = 1.0; the actual calculated value of μ̂ was 1.013, but cumulants are not available for noninteger values of μ. Using the higher moments M3S, M4S, and M6S, as done by several of the aforementioned authors, would yield μ̂ = 3.365—a more strongly biased estimate. The second gamma curve in Fig. 8, for μ = 3.0, provides a close approximation. The gamma fits match part of the sample distribution reasonably well, but in no way do they correspond to the population from which the sample was taken. As noted, the fitted value of the gamma M2, M3, M4 shape parameter μ̂ is about 1.0; the expected value of μ̂ for samples of this size, drawn at random from an exponential DSD (μ = 0), is μ̂ = 1.50. For gamma fits based on moments M3, M4, M6, the median value of μ̂ is about 4.5. Thus, sampling the exponential DSD with samples of this size and attempting to fit a gamma DSD function by moment methods will not reveal that the underlying DSD is exponential.
Figure 9 illustrates the bias in the moment estimators for μ, for different choices of the three moments used in the procedure. The general tendency is to overestimate μ. Here again, the lower moments yield smaller biases; use of the sample reflectivity values (M6S) leads to the strongest bias among the examples illustrated. This bias can be quite misleading. Here, one has sampled from what is actually an exponential DSD, used moment methods in attempts to fit parameters of gamma distributions to the observations, found (biased) high values of μ̂, and probably concluded, quite erroneously, that the population DSD was gamma after all.
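The overestimation of μ when the population is exponential can be reproduced with the M2, M3, M4 moment set. The sketch below is an illustration with hypothetical names. It assumes the gamma form N(D) = n1 D^μ exp(−λD), for which M3²/(M2 M4) = (μ + 3)/(μ + 4), and it draws untruncated, unbinned samples from the normalized exponential PDF 4 exp(−4y).

```python
import random
import statistics

LAMBDA_NORM = 4.0  # assumed normalized slope of the exponential population (mu = 0)


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def mu_hat_m234(sizes):
    """Gamma shape estimate from M2, M3, M4 using eta = M3**2 / (M2*M4) = (mu+3)/(mu+4)."""
    m2 = sum(y ** 2 for y in sizes)
    m3 = sum(y ** 3 for y in sizes)
    m4 = sum(y ** 4 for y in sizes)
    eta = m3 * m3 / (m2 * m4)
    return (4.0 * eta - 3.0) / (1.0 - eta)


def simulate_mu_hat(nt, n_samples, seed=0):
    """Shape estimates for repeated samples drawn from an exponential DSD."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_samples):
        sizes = [rng.expovariate(LAMBDA_NORM) for _ in range(poisson_draw(rng, nt))]
        if len(sizes) < 2:
            continue  # eta equals 1 for a single drop, so mu_hat is undefined
        estimates.append(mu_hat_m234(sizes))
    return estimates


estimates = simulate_mu_hat(nt=100, n_samples=2000)
median_mu = statistics.median(estimates)
frac_positive = sum(e > 0 for e in estimates) / len(estimates)
print(median_mu, frac_positive)
```

An unbiased estimator would give a median near 0 here, because the samples come from an exponential population.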
The hybrid approach used by Testud et al. (2001) does not employ a moment-based calculation to determine μ̂, but the estimators for their other gamma parameters (Dm and NW) are biased. Figure 10 illustrates this bias for values of D̂m, which as noted earlier, tend to be underestimates. Figure 11 shows corresponding results for values of N̂W, which tend to be overestimates. (In both instances, when M3S and M4S are among the estimating moments, the third moment of the group has no effect on these estimates.)
As the sample size increases, the sampling variability and the skewness in the sample moments decrease, and the correlations between the various sample moments weaken (Smith et al. 1993). Consequently, the biases diminish. Figure 12 illustrates this behavior for the shape parameter μ̂. Thus, with very large samples, the moment methods may give approximations of the population parameters that become sufficiently accurate for practical purposes, even when the wrong functional form is assumed.
Related findings
The correlations of the various sample moments with the maximum drop size in a sample (section 3), and the associated correlations between moments, lead to correlation of the fitted parameters with the maximum drop size (e.g., Fig. 13). The parameters are, in turn, correlated with each other (Fig. 14), and such correlations (caused here entirely by sampling variability) could be mistaken for physical relationships. The correlations between estimated parameters are a bit less strong when lower-order moments are used in the fitting process; with NT = 100, the n̂0 − Λ̂ correlation is only 0.927 when sample moments M2S and M3S are used as compared with the 0.946 value with moments M3S and M6S, as illustrated in Fig. 14. The figure also illustrates the general tendency for the moment methods to overestimate both the slope and intercept parameters of an exponential function.
The n̂0 − Λ̂ correlation actually increases as the sample size increases—to values of 0.946 (for moments M2S, M3S) or 0.980 (for moments M3S, M6S) when NT = 1000—though the ranges of the variation of the parameters decrease. In any case, this behavior suggests that one should be wary of inferring physical relationships between such fitted parameters until the effects of the sampling variability have been taken into account.
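A correlation of this kind arises from sampling variability alone, as a short simulation shows. The sketch below uses hypothetical names and assumes the M2, M3 exponential estimators Λ̂ = 3 M2S/M3S and n̂0 = Λ̂³ M2S/2, normalized sizes with Λ = 4, and an ordinary Pearson coefficient computed on the untransformed estimates. That choice of coefficient is an assumption and may differ from the one behind the quoted values.

```python
import math
import random
import statistics

LAMBDA_NORM = 4.0  # assumed normalized slope (y = D/Dm, Lambda = 4/Dm)


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fitted_parameters(nt, n_samples, seed=0):
    """Lists of Lambda_hat and n0_hat from M2, M3 fits to simulated samples."""
    rng = random.Random(seed)
    lams, n0s = [], []
    for _ in range(n_samples):
        sizes = [rng.expovariate(LAMBDA_NORM) for _ in range(poisson_draw(rng, nt))]
        if not sizes:
            continue  # an empty sample cannot be fitted
        m2 = sum(y ** 2 for y in sizes)
        m3 = sum(y ** 3 for y in sizes)
        lam = 3.0 * m2 / m3
        lams.append(lam)
        n0s.append(lam ** 3 * m2 / 2.0)
    return lams, n0s


lams, n0s = fitted_parameters(nt=100, n_samples=2000)
corr = pearson(lams, n0s)
print(corr)
```

Every sample comes from the same population, so any correlation between the fitted parameters reflects the sampling and fitting process rather than a physical relationship.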
Implications for analysis of experimental data
In trying to relate these simulations to actual raindrop observations, one should keep in mind several factors:
The actual population DSDs in nature are not known. There is no assurance that they are exponential, though there are indications that this may be the case (Joss and Gori 1978).
Observations with surface-sampling instruments, such as impact disdrometers, involve sample volumes that increase with the drop size, which tends to mitigate the skewness in the high-order sample moments, and, consequently, the associated biases.
The observations include only the actual sample size (C in these simulations). The mean, or expected, sample size (NT) is not known, though the actual sample size provides a better approximation as it increases.
Very small drops are generally absent from many such observations, either because such small drops are not present, or because the instruments do not respond to those drops. In a full exponential DSD, 55% of the drops would be smaller than 0.2Dm, and 86% would be smaller than 0.5Dm. With typical values of Dm being 1–3 mm, this means the simulations may involve more (and in some cases much more) than twice the total numbers of drops that would be found in corresponding observations. Thus, results given above for NT = 100 might be more applicable to observations with total drop counts of, say, 20–50.
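The drop-count figures in the last item can be verified with the exponential cumulative distribution. The sketch assumes a full exponential DSD with ΛDm = 4 and an idealized sharp detection cutoff; the function names are hypothetical.

```python
from math import exp

LAMBDA_NORM = 4.0  # assumed: Lambda * Dm = 4


def fraction_smaller(y):
    """Fraction of drops in a full exponential DSD smaller than y = D/Dm."""
    return 1.0 - exp(-LAMBDA_NORM * y)


def observed_count(nt, y_min):
    """Expected drop count when drops smaller than y_min go undetected (sharp cutoff assumed)."""
    return nt * exp(-LAMBDA_NORM * y_min)


print(fraction_smaller(0.2), fraction_smaller(0.5))
print(observed_count(100, 0.2), observed_count(100, 0.4))
```

With NT = 100, cutoffs at 0.2Dm and 0.4Dm leave about 45 and 20 drops, which brackets most of the 20–50 range suggested above.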
These caveats notwithstanding, certain broad inferences as follows are applicable:
Values of the parameters for DSD functions as estimated by moment methods will be biased.
The bias will be stronger when higher moments are employed.
The bias will diminish as the sample size increases.
Thus, DSD parameters that are estimated using high-order sample moments (e.g., reflectivity) with small sample sizes (say a few tens of drops) are the most highly suspect. Moreover, as suggested in the preceding section, caution should be used in attempting to impute a physical basis to relationships between DSD parameters fitted by moment methods.
Because the actual DSD parameters (even if the DSD were exponential) are unknown, it is difficult to critique in a quantitative manner any given set of results based on moment-method analyses. The most likely indication of the kind of biases discussed here appears in published frequency distributions of the gamma shape parameter μ̂, such as those reported in Kozu and Nakamura (1991) or Tokay and Short (1996). They extend to values as large as 30, which in view of Fig. 9 herein may well be a consequence of the M3, M4, M6 moment approach that is used by those authors. The fact that the maximum μ̂ values reported by Kozu and Nakamura decrease as the rainfall rate increases (which should be accompanied by increases in sample size) lends further weight to this interpretation. A simple way to test this idea would be to stratify the shape parameter estimates by sample size, to look for a trend similar to that shown in Fig. 12.
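The stratification test suggested above could be set up along the following lines. This is only a sketch: the function name, the class boundaries, and the example records are hypothetical, and the records are made-up numbers, not values from any of the cited studies.

```python
import statistics
from bisect import bisect_right


def stratify_by_sample_size(records, edges):
    """Group (drop_count, mu_hat) pairs into sample-size classes and summarize each class."""
    groups = {}
    for count, mu_hat in records:
        k = bisect_right(edges, count)              # index of the size class
        groups.setdefault(k, []).append(mu_hat)
    summary = {}
    for k in sorted(groups):
        lo = edges[k - 1] if k > 0 else 0
        hi = edges[k] if k < len(edges) else None   # None marks the open-ended top class
        values = groups[k]
        summary[(lo, hi)] = (len(values), statistics.median(values), max(values))
    return summary


# Made-up illustrative records: (drops in the sample, fitted shape parameter)
records = [(30, 9.5), (45, 6.0), (80, 4.2), (150, 2.1), (400, 1.0), (900, 0.4)]
summary = stratify_by_sample_size(records, edges=[50, 200])
print(summary)
```

If the large shape parameters are a sampling artifact, the median and maximum of μ̂ should fall steadily from the smallest to the largest sample-size class.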
Conclusions
Moment estimators for parameters of DSD functions are inherently biased. They tend to give erroneous values of the DSD parameters unless the drop samples are much larger than those commonly available. In particular, estimates of the gamma shape parameter μ tend to be far larger than the shape parameter of the underlying DSD from which the samples are taken. The bias is strongest for small sample sizes, and is also stronger when higher-order moments of the observed DSDs are used in the “fitting” process.
Moment methods may provide estimates of DSD parameters of sufficient accuracy if very large samples (hundreds, perhaps thousands) of drops are available. Failing that, some alternative approach to fitting the observed DSDs must be used. The maximum likelihood approach suggested by Haddad et al. (1996, 1997) may be satisfactory, though the maximum likelihood estimators are not without bias (Choi and Wette 1969). A variant of the L-moment approach used in hydrology (e.g., Maidment 1993) may also be worthy of consideration.
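For orientation, the simplest maximum likelihood estimator can be compared with a moment estimator on the same simulated samples. The sketch below uses the textbook ML estimate for the slope of an untruncated exponential PDF, Λ̂ = C/ΣD. It is not the parameterization of Haddad et al., and it ignores the instrument limitations at small drop sizes discussed earlier. Names are hypothetical and sizes are normalized with Λ = 4.

```python
import random
import statistics

LAMBDA_NORM = 4.0  # assumed normalized slope (y = D/Dm, Lambda = 4/Dm)


def poisson_draw(rng, mean):
    """Poisson variate: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def compare_estimators(nt, n_samples, seed=0):
    """Median Lambda_hat / Lambda for the ML estimate and for the M3, M6 moment estimate."""
    rng = random.Random(seed)
    ml, mom = [], []
    for _ in range(n_samples):
        sizes = [rng.expovariate(LAMBDA_NORM) for _ in range(poisson_draw(rng, nt))]
        if not sizes:
            continue  # an empty sample cannot be fitted
        # ML estimate for an untruncated exponential PDF: number of drops over summed size
        ml.append(len(sizes) / sum(sizes) / LAMBDA_NORM)
        m3 = sum(y ** 3 for y in sizes)
        m6 = sum(y ** 6 for y in sizes)
        mom.append((120.0 * m3 / m6) ** (1.0 / 3.0) / LAMBDA_NORM)
    return statistics.median(ml), statistics.median(mom)


ml_median, mom_median = compare_estimators(nt=100, n_samples=2000)
print(ml_median, mom_median)
```

In this idealized setting the ML estimate is the same as a fit based on M0S and M1S, so it relies on exactly the small-drop counts that real instruments measure least reliably.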
Acknowledgments
This material is based upon work supported by the National Science Foundation under grant ATM-9907812. The authors appreciate the assistance of Prof. S. J. Burges in directing them to references concerning use of moment methods in hydrology.
REFERENCES
Bringi, V N. and V. Chandrasekar. 2001. Polarimetric Doppler Weather Radar. Cambridge University Press, 636 pp.
Choi, S C. and R. Wette. 1969. Maximum likelihood estimation of the parameters of the gamma distribution and their bias. Technometrics 11:683–690.
Gertzman, H S. and D. Atlas. 1977. Sampling errors in the measurement of rain and hail parameters. J. Geophys. Res. 82:4955–4966.
Haddad, Z S., S L. Durden, and E. Im. 1996. Parameterizing the raindrop size distribution. J. Appl. Meteor. 35:3–13.
Haddad, Z S., D A. Short, S L. Durden, E. Im, S. Hensley, M B. Grable, and R A. Black. 1997. A new parameterization of the rain drop size distribution. IEEE Trans. Geosci. Remote Sens. 35:532–539.
Joss, J. and E G. Gori. 1978. Shapes of raindrop size distributions. J. Appl. Meteor. 17:1054–1061.
Kozu, T. and K. Nakamura. 1991. Rainfall parameter estimation from dual-radar measurements combining reflectivity profile and path-integrated attenuation. J. Atmos. Oceanic Technol. 8:259–270.
Maidment, D. 1993. Handbook of Hydrology. McGraw-Hill, 1424 pp.
Marshall, J S. and W M. Palmer. 1948. The distribution of raindrops with size. J. Meteor. 5:165–166.
Smith, J A. 1993. Marked point process models of raindrop-size distributions. J. Appl. Meteor. 32:284–296.
Smith, P L. 1982. On the graphical presentation of raindrop size data. Atmos.–Ocean 20:4–16.
Smith, P L. 2003. Raindrop size distributions: Exponential or gamma—Does the difference matter? J. Appl. Meteor. 42:1031–1034.
Smith, P L. and D V. Kliche. 2003. The bias in moment estimators for parameters of drop size distribution functions. Preprints, 31st Conf. on Radar Meteorology, Seattle, WA, Amer. Meteor. Soc., 47–50.
Smith, P L., Z. Liu, and J. Joss. 1993. A study of sampling-variability effects in raindrop size observations. J. Appl. Meteor. 32:1259–1269.
Testud, J., S. Oury, R A. Black, P. Amayenc, and X. Dou. 2001. The concept of “normalized” distribution to describe raindrop spectra: A tool for cloud physics and remote sensing. J. Appl. Meteor. 40:1118–1140.
Tokay, A. and D A. Short. 1996. Evidence from tropical raindrop spectra of the origin of rain from stratiform versus convective clouds. J. Appl. Meteor. 35:355–371.
Ulbrich, C W. 1983. Natural variation in the analytical form of the raindrop size distribution. J. Climate Appl. Meteor. 22:1764–1775.
Ulbrich, C W. and D. Atlas. 1998. Rainfall microphysics and radar properties: Analysis methods for drop size spectra. J. Appl. Meteor. 37:912–923.
Waldvogel, A. 1974. The N0 jump of raindrop spectra. J. Atmos. Sci. 31:1067–1078.
Wallis, J R., N C. Matalas, and J R. Slack. 1974. Just a moment! Water Resour. Res. 10:211–219.
APPENDIX
Equations for Moment Estimators of DSD Parameters
In this fashion, parameter estimates based on the normalized moments mi can be compared with the population parameters in dimensionless expressions where the actual particle sizes do not enter. Thus, the only population parameter that appears is NT (with one exception, discussed in section 1b below). Expressions for the remaining (normalized) moment estimators follow.
Exponential functions
Moments: M3, M6
Moments: M3, M4 (results not shown)
Gamma functions
Moments: M2, M3, M4
Moments: M2, M4, M6
Moments: M3, M4, M6
The expressions for D̂m and N̂w are identical to those for the M2, M3, M4 moment set; those for n̂1 and λ̂ are algebraically the same but involve different values for μ̂.
Table 1. Correlations between sample moments and maximum drop size in a sample. Population DSD: exponential.
Table 2. Contributions of small drops to some population moments (exponential DSD).