Introduction
This paper is concerned with the randomization distributions of ratio statistics and with the means and standard errors of the asymptotic distributions of these ratios and their logarithms. These distributions are important for the correct application and interpretation of statistics of this type, which have found wide use in the analysis and design of randomized weather modification experiments (Adderley and Twomey 1958; Siliceo and Mosino 1963; Kapoor et al. 1973; Dennis et al. 1975; Smith et al. 1979; Changnon et al. 1979; Miller et al. 1979; Gagin and Neumann 1981; Gabriel and Rosenfeld 1990; List et al. 1999). Nowadays it is not difficult to generate a very large number of rerandomizations of an experiment and obtain a reliable estimate of its P value without recourse to asymptotic formulas. That approach, however, becomes exorbitant and yields little understanding of the design of an experiment or of alternative methods of analysis. These purposes are better served by applying asymptotic variance formulas to hypothetical models of experiments, as illustrated in section 5.
Ratio statistics relate totals, or means, of precipitation at different times and in different areas: under cloud seeding or in the absence of seeding, on a seeding target or a control area, or on two alternate targets. The totals cumulate all seeded or unseeded experimental units at all stations of a particular area. Since daily, or even weekly, precipitation tends to have a highly skewed distribution, such totals emphasize the occasions of extreme precipitation at the expense of the many occasions of small or moderate precipitation. Ratio methods may therefore be suspected of being nonrobust and prone to the effect of outliers. Replacing them by statistics that censor outliers, or that substitute ranks for amounts of precipitation, would increase robustness but at the risk of ignoring large effects of seeding. Not enough is known about cloud seeding to rule out that rejecting extreme precipitation events would discard its largest effects. Current models of the effect of seeding on precipitation are quite speculative, and so it makes sense to proceed conservatively by analyzing the totals of precipitation, especially as that is what weather modifiers usually aim to affect.
This paper does not argue that ratio statistics are best but presents tools for making correct inferences about them, given that they have been much used and are likely to continue being used.
Two experimental designs are discussed: one has a single target, which is randomly seeded or not seeded; the other uses two targets, which are seeded alternately according to random allocation. Either of these designs may be analyzed with or without covariates. In practice, covariates usually consist of precipitation on nearby control areas that are always unseeded. The targets and controls are assumed to be located so that seeding will affect only the target to which seeding is assigned; in practice, there often are serious concerns about contamination (see, e.g., Wurtele 1971), but these are not addressed here. The units of experimentation are assumed to be well defined at the time of seeding assignment and thus usually consist of fixed time intervals, referred to here as “days.” More meteorologically meaningful units, such as individual clouds, would surely enhance the sensitivity of experiments, but defining them (especially the times of their beginning and end) in real time at the moment of seeding assignment, and finding alternate targets and controls for them, have proved quite elusive. Finally, the length of an experiment, as denoted by the number n of days, is assumed to be fixed at the outset of experimentation.
A variety of papers have been concerned with randomization distributions of ratio statistics (Adderley 1961; Moran 1959a,b; Gabriel and Feder 1969; Neumann and Shimbursky 1972; Davis 1979; Petrondas 1983; Gabriel 1988, 1991; Nirel 1994). Other publications have discussed ratio statistics under parametric assumptions (Flueck 1976; Flueck and Holland 1976; Mielke and Flueck 1976; Mielke et al. 1977) and other alternative techniques to avoid normality assumptions (Mielke and Medina 1983).
Definitions and notation
Consider an experiment carried out on days i = 1, . . . , n according to random allocation variables θ1, . . . , θn, such that, on day i, if θi = 1, then one target Y, say, is seeded, whereas if θi = 0, then that target is not seeded. In addition, in an alternating targets experiment there is another target X, say, which is seeded if θi = 0 and not seeded when θi = 1. The allocation variables θ1, . . . , θn are independent, and each has probability ½ for each of θi = 1 and θi = 0.
Precipitations on day i are written yi and xi for targets Y and X, respectively. Data may also be available on covariates, with zi,j denoting the ith day’s observation of covariate Zj, j = 1, . . . , k, and a constant is introduced by setting z̃i,0 ≡ 1 for dummy covariate Z0.
In such experiments, the only part that is considered random is the collection θ1, . . . , θn of allocation variables, whereas the statistical analysis is conditioned on the observations yi and xi of precipitation and zi,j of covariates, which are taken as given. This approach is particularly useful for testing a hypothesis that observations are independent of allocation, that is, that seeding has no effect on precipitation.












Ratio statistics are calculated from partial sums of precipitation and covariate(s), one sum for days seeded on Y, another for days not seeded on Y. These sums are obtained by multiplying the summands by θi or 1 − θi, respectively, as shown in Table 1.
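The partial sums described above can be sketched in a few lines of Python. The data values and the helper name `partial_sums` are hypothetical and serve only to illustrate how multiplying by θi or 1 − θi splits a total into its seeded and unseeded parts.

```python
import numpy as np

def partial_sums(values, theta):
    """Return (sum over days with theta = 1, sum over days with theta = 0)."""
    values = np.asarray(values, dtype=float)
    theta = np.asarray(theta, dtype=float)
    seeded = float(np.sum(theta * values))            # summands multiplied by theta_i
    unseeded = float(np.sum((1.0 - theta) * values))  # summands multiplied by 1 - theta_i
    return seeded, unseeded

# Hypothetical observations for n = 6 days (illustration only).
y = [4.0, 0.0, 12.5, 1.0, 7.5, 3.0]   # target Y precipitation
z = [3.0, 0.5, 10.0, 2.0, 6.0, 2.5]   # covariate Z1 (control) precipitation
theta = [1, 0, 1, 0, 0, 1]            # 1 = target Y seeded on that day

y_seeded, y_unseeded = partial_sums(y, theta)
z_seeded, z_unseeded = partial_sums(z, theta)
```

The two partial sums of any variable always add up to its overall total, which is why the analysis can be conditioned on the observations while only the split between the sums is random.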
Single target experiments
The design
An experiment may have a single target Y, which is seeded or not seeded according to random allocations θ1, . . . , θn, which are independent with Pr(θi = 1) = Pr(θi = 0) = ½, i = 1, . . . , n. The response is target precipitation yi on day i. Covariate observations zi,j, j = 1, . . . , k may also be available; they usually consist of precipitation on upwind control areas.
The single ratio and its approximation


Statistical inferences on seeding effect can be based on the SRY statistic by relating it to its distribution over all possible random seeding allocations, conditional on the y’s. This randomization distribution derives from the randomness of the θ’s.
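A minimal sketch of such a randomization (rerandomization) analysis follows. It assumes that the single ratio is the seeded total of y divided by the unseeded total, that allocations are independent with probability ½, and that a one-sided P value is wanted. The function names, the data, and the number of rerandomizations are illustrative choices, not taken from the paper.

```python
import numpy as np

def single_ratio(y, theta):
    """Seeded total over unseeded total (assumed form of the single ratio)."""
    y = np.asarray(y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.sum(theta * y) / np.sum((1.0 - theta) * y)

def rerandomization_p_value(y, theta_obs, n_rerand=10000, seed=0):
    """One-sided Monte Carlo P value of the observed ratio, conditional on y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = single_ratio(y, theta_obs)
    # Each row is one hypothetical allocation: independent 0/1 with probability 1/2.
    thetas = rng.integers(0, 2, size=(n_rerand, y.size))
    seeded = thetas @ y
    unseeded = y.sum() - seeded
    # Allocations with a zero unseeded total are counted as an infinite ratio.
    ratios = np.full(n_rerand, np.inf)
    np.divide(seeded, unseeded, out=ratios, where=unseeded > 0)
    return observed, float(np.mean(ratios >= observed))

# Hypothetical skewed daily precipitation for 60 days and one observed allocation.
rng = np.random.default_rng(42)
y = rng.gamma(shape=0.7, scale=6.0, size=60)
theta_obs = rng.integers(0, 2, size=60)
sr_obs, p_value = rerandomization_p_value(y, theta_obs)
```

Only the allocations are redrawn; the precipitation values stay fixed, which is what conditioning on the y’s means in practice.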




For experiments with large n, one may use these approximate moments for testing significance and setting confidence bounds on the proportional effect of seeding. For intermediate-sized experiments one should preferably use lnSRY for this purpose rather than SRY itself. Petrondas (1983) carried out a detailed study of such ratio statistics, noting that “for large datasets . . . Taylor approximation of the variance is adequate. For experiments with fewer than one hundred data points . . . normal approximation is adequate for the central part of the permutation distribution of the logarithm” (of the ratio). He further concluded that “for larger experiments” (over 100 data points) “the normal distribution is adequate” (for the entire permutation distribution).
The double ratio and its approximation














Note that the fit and the correlation of the above regression are the same as those of the regression of the precipitation yi onto the covariates zi,j themselves (not their relative values). Hence, one could use the latter multiple regression instead of the former to obtain the same result.
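The invariance noted here can be checked numerically. The sketch below assumes that "relative" values are obtained by dividing each variable by its total over the n days, and it uses simulated, hypothetical data; the helper `r_squared` is an illustrative name, not part of the paper.

```python
import numpy as np

def r_squared(y, Z):
    """R^2 of the least-squares regression of y on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 50
z = rng.gamma(shape=0.8, scale=5.0, size=(n, 2))   # hypothetical controls Z1, Z2
y = 0.6 * z[:, 0] + 0.3 * z[:, 1] + rng.gamma(shape=0.8, scale=2.0, size=n)

Z = np.column_stack([np.ones(n), z])   # dummy covariate Z0 = 1 plus Z1, Z2
y_rel = y / y.sum()                    # relative precipitation (assumed definition)
Z_rel = Z / Z.sum(axis=0)              # each covariate divided by its own total

r2_raw = r_squared(y, Z)
r2_rel = r_squared(y_rel, Z_rel)
```

The two values of R² agree up to rounding because dividing the response and each covariate by a positive constant changes only the regression coefficients, not the fitted proportion of variance.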
Alternating targets experiments
The design
An experiment may have targets Y and X, which are seeded on alternate days where the alternation is according to random allocation θ1, . . . , θn, which are independent with Pr(θi = 1) = Pr(θi = 0) = ½, i = 1, . . . , n. The responses are precipitation yi and xi on alternate targets Y and X on day i. Covariate observations zi,j, j = 1, . . . , k may also be available. The covariates usually are precipitations on several control areas, some upwind of target Y, others upwind of target X.
The root double ratio and its approximation




Comparing this with the single area experiment’s DR(0,1), one notes it differs by introducing the square root. This is justified because DR(0,1) compares SRY to the SR1 for control Z1, which could not be affected by seeding, whereas RDR compares SRY to SRX for the alternate target X, which could be reciprocally affected by seeding; the square root then averages over the effects on the two targets.
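The averaging role of the square root can be illustrated with a small sketch. It assumes that each single ratio is the total on days with θi = 1 divided by the total on days with θi = 0, so that RDR is the square root of SRY/SRX, and it applies a purely hypothetical multiplicative effect δ to whichever target is seeded. The data and function names are illustrative.

```python
import numpy as np

def single_ratio(v, theta):
    """Total on days with theta = 1 over total on days with theta = 0."""
    v = np.asarray(v, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.sum(theta * v) / np.sum((1.0 - theta) * v)

def root_double_ratio(y, x, theta):
    """Square root of SRY / SRX (assumed form of the root double ratio)."""
    return np.sqrt(single_ratio(y, theta) / single_ratio(x, theta))

# Hypothetical data: Y is seeded when theta = 1, X is seeded when theta = 0.
y = np.array([4.0, 0.0, 12.5, 1.0, 7.5, 3.0])
x = np.array([2.0, 1.0, 9.0, 0.5, 8.0, 2.5])
theta = np.array([1, 0, 1, 0, 0, 1])

delta = 0.15                               # hypothetical proportional seeding effect
y_eff = y * (1.0 + delta * theta)          # Y increased on its seeded days
x_eff = x * (1.0 + delta * (1 - theta))    # X increased on its seeded days

rdr_null = root_double_ratio(y, x, theta)
rdr_eff = root_double_ratio(y_eff, x_eff, theta)
effect_factor = rdr_eff / rdr_null
```

Under these assumptions SRY is multiplied by 1 + δ and SRX is divided by 1 + δ, so their quotient changes by (1 + δ)² and the square root returns the factor 1 + δ for a single target.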


The root quadruple ratio and its approximation






The root regression ratio and its approximation








Note that the fit and the correlation of the above regression are the same as those of the regression of the difference of the targets’ relative precipitation ỹi − x̃i onto the covariates zi,j themselves (not their relative values). Hence, one could use the latter multiple regression instead of the former to obtain the same result.
A schematic comparison of designs and ratio statistics
The schema
The formulas developed in the preceding sections allow calculation of the variances of different ratios when the relative variances and covariances of all variables are known. Given, however, that certain common patterns have been observed in many precipitation experiments, we propose a schema of variances and correlations that roughly represent observed patterns. We then derive the ratios’ variances from this schema and make some fairly general comparisons of various ratio statistics in terms of their variances and thus in terms of the length of experimentation they require.
The schema, Fig. 1, concerns one or two targets, Y or Y and X, with corresponding control(s) Z1, or Z1 and Z2, respectively. (Illustrations follow in Tables 2 and 3 below.) It thus includes both one- and two-target designs. It assumes the relative variances of all variables are the same, that is,
Some examples
Examples of alternating targets experiments with controls are Israel II (Gagin and Neumann 1981; Gabriel and Rosenfeld 1990) and Puglia (List et al. 1999), with the correlations and coefficients of variation as shown in Tables 2 and 3, respectively.
For the Israeli data, the coefficient of variation is remarkably constant at around 1.25. The target–control correlation average is 0.753, whereas the between-target and between-control correlation average is 0.738, which is a bit lower, as postulated under the schema. The lowest are the target–other control correlations, which average 0.665, again as expected. The two correlations of each kind, which the schema takes to be equal, are not close to each other, that is, 0.866 is not close to 0.641, 0.668 to 0.808, or 0.745 to 0.584, so the schema approximates the Israeli situation only roughly.
For the Puglia data (Table 3), the coefficient of variation is fairly steady around 1.60. The target–control correlation average is 0.832, whereas the between-target and between-control correlation average is 0.508, which is a bit lower than postulated under the schema. The lowest is the target–other control correlation, with an average of 0.498, again as expected. Again, the two correlations of each kind are not very close together, so the schema is likely to be a rough approximation of the Puglia situation.
Relative variability, that is, the coefficient of variation, is somewhat higher in Puglia than in Israel; however, once adjusted for the control covariate, it does not vary widely, which suggests that important features of experiments in these two areas may not be that different.
Some comparisons and practical conclusions
The variances of the various ratios for single and alternating targets experiments are shown in Fig. 2, both as derived generally in sections 3 and 4, and as calculated under the simplified schema.
In a single target experiment the double ratio DR(0,1) is seen to be more precise, that is, to have smaller variance, than the single ratio SRY, provided the target–control correlation κ exceeds 0.5. Fig. 2 also shows RR to be more precise than either, so it should always be used unless no suitable control is available, in which case only SRY can be computed.
In an alternating targets experiment the root quadruple ratio RQR(0,1,−1) is similarly seen to be more precise than the root double ratio RDR, provided RY−X:1,2 = (κ − γ)/(1 − τ) exceeds 0.5. The RRR is, however, found to be the most precise and should always be used if controls are available; in the absence of controls, only the RDR is available.
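The two thresholds of 0.5 can be recovered by a short calculation. As a sketch only, assume that the variance of the logarithm of each ratio is proportional to the relative variance of the corresponding combination of variables (y alone for SRY, y − z1 for DR(0,1), y − x for RDR, and (y − x) − (z1 − z2) for RQR(0,1,−1)), and that all variables share a common relative variance V as in the schema. These proportionalities are an assumption made here for illustration; they are not copied from the displayed formulas.

```latex
\begin{aligned}
\frac{\operatorname{Var}[\ln DR(0,1)]}{\operatorname{Var}[\ln SR_Y]}
  &= \frac{V + V - 2\kappa V}{V} = 2(1-\kappa) < 1
  \iff \kappa > 0.5, \\[4pt]
\frac{\operatorname{Var}[\ln RQR(0,1,-1)]}{\operatorname{Var}[\ln RDR]}
  &= \frac{2V(1-\tau) + 2V(1-\tau) - 4V(\kappa-\gamma)}{2V(1-\tau)}
   = 2\left(1 - \frac{\kappa-\gamma}{1-\tau}\right) < 1
  \iff \frac{\kappa-\gamma}{1-\tau} > 0.5 .
\end{aligned}
```

In both cases a control helps only when its correlation with the quantity being adjusted exceeds one half; below that, dividing by the control adds more variability than it removes.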
Comparing an alternating targets experiment with a single target experiment, both without controls, the former’s RDR will have a smaller variance, (1/n)


Additional issues
Power and length of experimentation
The above expressions for the expectations and variances of ratio statistics have been computed under the null hypothesis that seeding has no effect and are conditional on the first two moments of target precipitation and the covariates. They can be used to assess the powers of planned experiments and the length of experimentation needed for a given power.
For any ratio statistic R, if seeding were to increase the precipitation every day by a proportion δ, this would add ln(1 + δ) to lnR. The expectation of lnR would increase by ln(1 + δ), whereas the variance would remain the same. Standard formulas for power can therefore be applied provided one has estimates of the relative variation of precipitation and the covariates.
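A sketch of how such a standard power formula translates into a required number of days follows. It assumes a one-sided test, a normal approximation for lnR, and a null variance of the form c/n, where the constant c (called `n_var` below) depends on the design and statistic. The function name, the default levels, and the value of c in the usage line are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist

def days_needed(n_var, delta, alpha=0.05, power=0.90):
    """Days n such that a one-sided level-alpha test of ln R attains the given power.

    n_var is the assumed constant c in Var[ln R] = c / n; delta is the
    proportional increase in precipitation due to seeding.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # critical value of the test
    z_power = NormalDist().inv_cdf(power)         # quantile for the desired power
    shift = log(1.0 + delta)                      # shift of E[ln R] under seeding
    return ceil(n_var * ((z_alpha + z_power) / shift) ** 2)

# Hypothetical constant c = 1.0 with a 15% effect, 5% significance, 90% power.
n_days = days_needed(n_var=1.0, delta=0.15)
```

Because the required n is proportional to c, halving the variance constant of a statistic halves the length of experimentation needed for the same power.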
















Comparisons of different experimental designs




Actual experiments commonly have controls that are more highly correlated with a target than with an alternate target, that is, κ > τ, so the ratio of Var[lnRDR] to Var[lnDR(0,1)] is larger than 0.25. For the Bari target in the Puglia experiment, τ = 0.61 and κ = 0.77 and so the variance of the alternating targets experiment’s RDR would have been 0.42 of the variance of a single target experiment’s DR(0,1).
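The figures quoted here can be reproduced under the schematic assumption that the ratio of Var[lnRDR] to Var[lnDR(0,1)] equals (1 − τ)/[4(1 − κ)]. That expression is an assumption of this sketch, chosen because it gives 0.25 when κ = τ and matches the Bari value; the function name is hypothetical.

```python
def rdr_to_dr_variance_ratio(kappa, tau):
    """Assumed schematic ratio Var[ln RDR] / Var[ln DR(0,1)]."""
    return (1.0 - tau) / (4.0 * (1.0 - kappa))

# When the control is no better correlated with its target than the
# alternate target is (kappa equal to tau), the ratio is one quarter.
ratio_equal = rdr_to_dr_variance_ratio(kappa=0.7, tau=0.7)

# Values quoted in the text for the Bari target of the Puglia experiment.
ratio_bari = rdr_to_dr_variance_ratio(kappa=0.77, tau=0.61)
```

With κ = 0.77 and τ = 0.61 the ratio is 0.39/0.92, which rounds to the 0.42 given in the text.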
Some general comparisons








Taking rough average values from the Israeli II and Puglia experiments, we would put κ = 0.8, τ = 0.65, and γ = 0.6, and coefficients of variation
The data from both experimental regions show that single area experiments with days as units are impractical: certainly so if there is no control, and in practice also in the presence of a control, since it is not realistic to design experiments of 15–20 years’ duration. The data from both experiments also show that alternating targets experiments require some 5–6 years for reasonable power if a 15% seeding effect is tested for, and that the use of control covariates does not significantly reduce the time requirement. In the author’s experience, these seem to be the orders of magnitude for experiments of this type in general.
Single area experiments could therefore be viable only with some drastic change from the designs studied above. Thus, better covariates might be helpful if they can be found, as would better focused experimental units that reduced variability, and, of course, seeding techniques that justified testing for bigger effects than the ones hoped for from AgI seeding. Without any of these, alternate targeting is the only game in town.
Acknowledgments
Dr. B. A. Silverman’s generous advice and critical encouragement of the formulation of this paper are greatly appreciated.
REFERENCES
Adderley, E. E., 1961: Non-parametric methods of analysis applied to large-scale cloud-seeding experiments. J. Meteor., 18, 692–694.
——, and S. Twomey, 1958: An experiment on artificial stimulation of rainfall in the Snowy Mountains of Australia. Tellus, 10, 275–280.
Changnon, S. A., F. A. Huff, and C. F. Hsu, 1979: On the need to evaluate operational weather modification projects. Bull. Amer. Meteor. Soc., 60, 770–774.
Davis, A. W., 1979: On certain ratio statistics in weather modification experiments. Technometrics, 21, 283–289.
Dennis, A. S., J. R. Miller, D. E. Cain, and R. L. Schwaller, 1975: Evaluation by Monte Carlo tests of effects of cloud seeding on growing season rainfall in North Dakota. J. Appl. Meteor., 14, 959–969.
Flueck, J. A., 1976: Evaluation of operational weather modification projects. J. Wea. Modif., 8, 42–56.
——, and B. S. Holland, 1976: Ratio estimators and some inherent problems in their utilization. J. Appl. Meteor., 15, 535–543.
Gabriel, K. R., 1988: More about ratio statistics, especially for weather experiments. Tech. Rep. 88/07, University of Rochester, Rochester, NY, 31 pp. [Available from Department of Biostatistics, University of Rochester, Rochester, NY 14642.]
——, 1991: The use of ratio statistics in rain experiments, with special reference to Puglia and Sardinia. TECNAGRO Report, Corso V. Emanuele 173, TECNAGRO, Rome, Italy, 42 pp.
——, and P. Feder, 1969: On the distribution of statistics suitable for evaluating rainfall stimulation experiments. Technometrics, 11, 149–160.
——, and D. Rosenfeld, 1990: The second Israeli rainfall stimulation experiment: Analysis of precipitation on both targets. J. Appl. Meteor., 29, 1055–1067.
Gagin, A., and J. Neumann, 1981: The second Israeli randomized cloud seeding experiment. J. Appl. Meteor., 20, 1301–1311.
Kapoor, R. K., K. Krishna, U. S. De, K. G. S. Nair, I. C. Talwar, S. K. Sharma, and B. V. R. Murthy, 1973: Results of operational cloud seeding experiment over Rihand catchment in northeast India. Indian J. Meteor. Geophys., 25, 379–384.
List, R., K. R. Gabriel, B. A. Silverman, Z. Levin, and T. Karakostas, 1999: The rain enhancement experiment in Puglia, Italy: Statistical evaluation. J. Appl. Meteor., 38, 281–289.
Mielke, P. W., and J. A. Flueck, 1976: Distributions of ratios for some selected bivariate probability distributions. Proc. Social Statistics Section, Washington, DC, American Statistical Association, 608–618.
——, and J. G. Medina, 1983: A new covariate ratio procedure for estimating treatment differences with application to Climax I and II experiments. J. Climate Appl. Meteor., 22, 1290–1295.
——, J. S. Williams, and S. C. Wu, 1977: Covariance analysis technique based on bivariate log-normal distribution with weather modification applications. J. Appl. Meteor., 16, 183–187.
Miller, A. J., D. E. Shaw, and L. C. Veitch, 1979: Analyzing the results of a cloud-seeding experiment in Tasmania. Comm. Stat. Theory Meth., A8, 1017–1047.
Moran, P. A. P., 1959a: The methodology of rainmaking experiments. Rev. Int. Stat. Inst., 38, 105–119.
——, 1959b: The power of a cross-over test for the artificial stimulation of rain. Aust. J. Stat., 1, 47–52.
Neumann, J., and E. Shimbursky, 1972: On the distribution of a ratio of interest in single-area cloud-seeding experiments. J. Appl. Meteor., 11, 370–375.
Nirel, R., 1994: Bootstrap confidence intervals for the estimation of seeding effect in an operational period. Water-Related Issues, V. Barnett and K. F. Turkman, Eds., Vol. 2, Statistics for the Environment, Wiley, 109–123.
Petrondas, D. A., 1983: Two topics in permutation (re-randomization) inference: Double ratio statistics and multiple comparisons. Ph.D. thesis, University of Rochester, 365 pp. [Available from University of Rochester, Rochester, NY 14627.]
Siliceo, P., and P. A. Mosino, 1963: Twelve years of cloud seeding in the Necaxa watershed, Mexico. J. Appl. Meteor., 2, 311–323.
Smith, E. J., L. G. Veitch, D. E. Shaw, and A. J. Miller, 1979: A cloud-seeding experiment in Tasmania 1964–1970. J. Appl. Meteor., 18, 804–815.
Wurtele, Z. S., 1971: Analysis of the Israeli cloud-seeding experiment by means of concomitant variables. J. Appl. Meteor., 10, 1185–1192.
APPENDIX
Asymptotic Randomization Distributions of Ratio Statistics
Single ratio SRY








Double ratio DR(a)








Regression ratio RR




Root double ratio RDR








Root quadruple ratio RQR(b)








Root regression ratio RRR



Fig. 1. A schema of correlations between the two targets and between the two controls τ, between targets and their own controls κ, and between targets and the other controls γ.
Citation: Journal of Applied Meteorology 38, 3; 10.1175/1520-0450(1999)038<0290:RSFREI>2.0.CO;2

Fig. 2. Variances of different ratio statistics, general and schematic.
Table 1. Totals and means for single target and alternating targets experiments. Note that Σ without an index will be considered to have index i running from 1 to n.


Table 2. Coefficients of variation and correlations of targets and controls for the Israel II experiment. (The corresponding parameters from the model of Fig. 1 are indicated as approximations.)


Table 3. Coefficients of variation and correlations of targets and controls for the Puglia experiment. (The corresponding parameters from the model of Fig. 1 are indicated as approximations.)


Table 4. Number of days (years) needed for different designs and statistics to yield 90% power of detecting a 15% increase with 5% significance; approximations under schematic assumptions for the Israel II and Puglia experiments.

