1. Introduction
Since rainfall is highly variable in both space and time, measurements from existing sparse networks of rain gauges may not be representative of rainfall generally. Satellite rainfall measurements, by contrast, are attractive because they promise information about rainfall rates on a nearly global basis. Several satellites are orbiting to measure precipitation especially in remote areas like the tropical oceans (see Simpson et al. 1988; Wilheit 1991; Theon et al. 1992; Thiele 1992). The Tropical Rainfall Measuring Mission (TRMM), which was launched in November 1997, has already carried out several years of measurements of rainfall. Among the many issues concerning TRMM satellite data processing (Simpson et al. 1988; McConnell and North 1987; Shin and North 1988; Kedem et al. 1990; North et al. 1991), the ground truth problem is one of the most critical because its solution will provide information about comparative accuracy of gauge (or radar) and satellite measurements.
There are various forms of ground validation for precipitation. The ground-based or airplane-based radars will be the ultimate “bearer” of ground truth, but the algorithms for radar backscatter estimation of rain rate remain controversial. The advantage of the point gauge ground truth is that it does not introduce the inevitably controversial algorithms associated with estimating the rain rate from other “ground truth” measurements such as that derived from radar. We thus present here an analysis of validating the satellite estimations with point gauge measurements.
A problem in the satellite estimation of rain rate is that the measurement taken by the satellite sensor is fundamentally different from the point gauge measurement. This is because the satellite measures a snapshot in time (actually about a 5–10-min average) of an area average over its field of view (FOV), while the point gauge measures precipitation nearly continuously in time. For an individual measurement pair of satellite and gauge, there is likely to be a large random difference between the two. North et al. (1994) and Ha and North (1994) developed a ground validation strategy that reduces the random errors by taking enough simultaneous pairs of measurements.
Since the probability distribution of real rain has a large contribution at zero rain rate (usually greater than 90%), the measurement pairs can be (no rain, no rain), (rain, no rain), and (rain, rain), where the first entry is the satellite measurement and the second one is the gauge measurement. For this reason, it is necessary for us to decide whether to throw out no-rain measurements from the stream of data pairs. Based on the method of throwing out the no-rain measurements, Ha and North (1999) proposed the following three ground truth designs.
Design 1 uses all data pairs of the two measurements (satellite, gauge) from all visits.
Design 2 throws out all the data pairs when the FOV has no rain. This design uses only data pairs (rain, no rain) and (rain, rain).
Design 3 throws out all the data pairs when the gauge has no rain. This design uses only data pairs (rain, rain).
Ha and North (1999) derived the error distribution and statistics (mean and mean-square error) with a spatially white noise Bernoulli random field for each of the designs mentioned above. The major finding in Ha and North (1999) was that design 3 cannot be used as a ground truth design due to its large design bias. It was also shown that there is a relationship between the mean-square error of design 1 and design 2 for the Bernoulli random field. Since actual rain is not a Bernoulli random field, the results in Ha and North (1999) may have limited applicability to the real situation.
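The three designs can be illustrated with a small Monte Carlo sketch on a white noise Bernoulli field. This is a toy illustration under stated assumptions, not the analytical derivation of Ha and North (1999): the FOV is taken to consist of `n_boxes` independent grid boxes, each raining (value 1) with probability `p`, the satellite measurement is the FOV average, and the gauge reports the value of one box inside the FOV. The function names and parameter values are hypothetical.

```python
import random

def simulate_pairs(n_visits, n_boxes, p, seed=0):
    """Return (satellite, gauge) pairs for a toy white noise Bernoulli field."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_visits):
        boxes = [1.0 if rng.random() < p else 0.0 for _ in range(n_boxes)]
        satellite = sum(boxes) / n_boxes  # area average over the FOV
        gauge = boxes[0]                  # point gauge sitting in one box (assumption)
        pairs.append((satellite, gauge))
    return pairs

def select_pairs(pairs, design):
    """Keep the data pairs that each ground truth design retains."""
    if design == 1:
        return list(pairs)                            # all visits
    if design == 2:
        return [(s, g) for s, g in pairs if s > 0.0]  # FOV has rain
    if design == 3:
        return [(s, g) for s, g in pairs if g > 0.0]  # gauge has rain
    raise ValueError("design must be 1, 2, or 3")

def error_stats(pairs):
    """Sample mean (bias) and mean-square of the error satellite - gauge."""
    errors = [s - g for s, g in pairs]
    n = len(errors)
    return sum(errors) / n, sum(e * e for e in errors) / n

pairs = simulate_pairs(n_visits=20000, n_boxes=25, p=0.1)
ps_hat = len(select_pairs(pairs, 2)) / len(pairs)  # relative frequency of rain in the FOV
stats = {d: error_stats(select_pairs(pairs, d)) for d in (1, 2, 3)}

for d, (bias, mse) in stats.items():
    print(f"design {d}: bias = {bias:+.4f}, mse = {mse:.4f}")
print(f"Ps = {ps_hat:.3f}, mse(d1)/Ps = {stats[1][1] / ps_hat:.4f}")
```

In this toy setting the sample bias of design 1 and design 2 should be close to zero, while design 3 should show a large negative bias, near −(1 − p)(1 − 1/A) for A boxes, because conditioning on a raining gauge leaves the FOV average diluted by dry boxes. The sample mean-square error of design 2 should equal that of design 1 divided by the relative frequency of rain in the FOV, since every discarded pair has zero error.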
In this study, a spatially non–white noise homogeneous random field having the mixed distribution is used as the rain-rate model. This means that it is raining or not raining at all with prescribed probability. This model is more general in its ability to describe the real rain compared to the simple Bernoulli model in Ha and North (1999). We derive the ensemble mean and the mean-square error for each design and thus we evaluate which scheme is the most appropriate as a ground truth design. At the same time, we examine the relationship between the ground truth designs proposed here.
To check the applicability of the ground truth designs to a non–white noise random field, we evaluate each of the three designs using Global Atmospheric Research Program (GARP) Atlantic Tropical Experiment (GATE) data.
2. Definitions
Consider a random field ψ(r, t) defined in the r = (x, y) plane and along the time axis t. Let the ensemble average of ψ(r, t) be 〈ψ(r, t)〉 and its variance at a point in space be σ². The random variable ψ(r, t) is assumed to be weakly statistically homogeneous in space and time; that is, the lagged covariance is a function only of ξ = |r − r′| and τ = |t − t′|. Since rain rates are inherently patchy, a mixed distribution such as the mixed lognormal distribution is sometimes used as a model for the distribution of rain rates at a point in space and time (Kedem et al. 1990) because this distribution includes the possibility of the no-rain phenomenon characteristic of real rain.
The random variable ψ(r, t) has a positive probability 1 − p for the event {ψ(r, t) = 0}, but otherwise P[
In practice, the microwave radiometer estimates the rain rate by measuring the upwelling radiation from an atmospheric column. In the ideal case this represents a column average of the rain rate over the projection of the FOV onto the ground for the sensor. Such a measurement is not compared with the instantaneous rain rate at the surface but rather some kind of time average at the surface, since it takes raindrops several minutes to fall from the top of the column to the surface. This is very fortunate since instantaneous rain rates are notoriously variable in space and the effect of time averaging is to smooth considerably the spatial variability of the field. Hence, the comparison can be made between the satellite estimate for an FOV and a few-minute time-averaged measurement from a rain gauge. In this paper, the ground and satellite measurements can be instantaneous or time-averaged measurements and are usually denoted by Ψg and Ψs.
Two measurements are taken from the nth visit,
3. Evaluation of some ground truth designs
a. Preliminary remarks
The ground truth design must satisfy two conditions to detect the retrieval bias of the retrieval algorithm we want to check. 1) The error εdi = Ψsi − Ψgi must have no bias; that is, 〈εdi〉 = 〈Ψsi − Ψgi〉 = 0. If the error has this bias, which we call design bias, the retrieval bias and the design bias are combined and thus it is difficult to evaluate the retrieval bias we want to detect. 2) For the ground truth designs that satisfy 〈εdi〉 = 0, the mean-square error 〈
The error statistics for one ground truth design can be obtained from the error statistics of other designs if there is a relationship of the statistics (mean, variance, and mean-square error) among the ground truth designs. We thus here seek the relationship of the statistics among the ground truth designs and provide a method of computing the statistics for design 2 and design 3 using the statistics of design 1.
b. The bias of the error
Since the mean of the design error must be zero to detect the retrieval bias unambiguously, we first compute the ensemble mean of the error for each ground truth design to see which design is suitable for the validation of the satellite measurements. Since the random field is assumed to be homogeneous, the gauge and satellite measurements for design 1 have the same mean 〈Ψs1〉 = 〈Ψg1〉 = 〈ψ(r, t)〉. The ensemble mean of the error εd1 is therefore 〈εd1〉 = 0, and thus design 1 has no bias.
c. Mean-square error
Section 3b showed that design 3 cannot be used as a ground truth design, so henceforth only design 1 and design 2 are considered as ground truth designs, whose mean-square errors will be calculated.
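The relation between the two mean-square errors that is used in section 4 can be sketched in one step. The sketch assumes that the rain rate is nonnegative and that the gauge lies inside the FOV, so that a rain-free FOV average implies a rain-free gauge and hence a zero error. Here P_s denotes the probability that the satellite measurement has rain inside the FOV, and the conditional-average notation is introduced only for this illustration.

```latex
% Assumption: \Psi_s = 0 implies \Psi_g = 0, so the error vanishes on no-rain visits.
\begin{align*}
\mathrm{mse}(d1)
  &= \langle (\Psi_s - \Psi_g)^2 \rangle \\
  &= P_s \, \langle (\Psi_s - \Psi_g)^2 \mid \Psi_s > 0 \rangle
     + (1 - P_s) \cdot 0 \\
  &= P_s \, \mathrm{mse}(d2),
\qquad \text{hence} \qquad
\mathrm{mse}(d2) = \frac{\mathrm{mse}(d1)}{P_s}.
\end{align*}
```

Under these assumptions the mean-square error of design 2 follows from that of design 1 and P_s alone, without working with the conditional random field directly.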
d. Number of visits
Because design 2 throws away all visits where the satellite measurement has no rain, it is worthwhile to ask how many visits, say Nvisits(d2), are needed to obtain N(d2) qualifying data pairs, where N(d2) is the number of measurement pairs (satellite, gauge) used when design 2 is applied. In other words, Nvisits(d2) is the number of data pairs we use plus the number of visits (data pairs) we throw away for design 2. Since a visit qualifies with probability Ps, the probability that the satellite measurement has rain inside the FOV, the total expected number of visits necessary to detect the retrieval bias becomes Nvisits(d2) = N(d2)/Ps. Note that design 1 uses all visits and thus the total expected number Nvisits(d1) is equal to the number of data pairs N(d1). Therefore, if the satellite visits the gauge site once a day, it will take Nvisits(d1) = N(d1) days for design 1 and Nvisits(d2) = N(d2)/Ps days for design 2.
4. Numerical examples
In this section, to check the applicability of the ground truth designs developed in this study, we use GATE data to evaluate each design.
The ensemble means of the satellite measurements, gauge measurements, and errors are provided in Table 1 to check the theoretical results in section 3. In Table 1, we can see that there is no bias for design 1 or design 2. But, as we already showed theoretically, design 3 has a design bias. This result says that design 3 has a serious disadvantage as a ground truth design. The absolute value of the bias for design 3 increases as the width of the FOV increases. It was shown in section 3 that the mse(d2) is equal to the mse(d1) divided by the probability that the satellite measurement has rain inside the FOV. Figure 3 provides the estimated probability (relative frequency) Ps that the satellite measurement has rain inside the FOV based upon GATE data. Ps increases approximately linearly with the width of the FOV, and the linear regression equation is Ps = 0.0555 + 0.00876 × (width of FOV). The coefficient of determination for the regression equation is R² = 0.995. The variances of the gauge measurement for design 1 and design 2 are computed as
Table 2 gives the dimensionless root-mean-square error (drmse, hereafter), which is the square root of the dimensionless mean-square error of design 1 and design 2 for a single visit. The drmse of design 2 is a little greater than the drmse of design 1 for any size of FOV. Table 2 also gives the number of data pairs needed to reduce the error to 10% of the standard deviation of the gauge measurement. We take our nominal FOV to have a size of 20 km × 20 km because the footprint size of the TRMM Microwave Imager (TMI) (19.4-GHz channel) might be thought of as having a nominal 25-km resolution. For the typical 20 km × 20 km FOV, the number of data pairs needed to detect a 10% bias is N(d1) = 50 for design 1 and N(d2) = 56 for design 2. As we explained in section 3, the number of visits Nvisits(d2) necessary to detect 10% of the variability of the gauge measurement is Nvisits(d2) = N(d2)/Ps. With Table 2 and Fig. 3, the number of visits for design 2 of GATE data can be obtained. For example, for the 20-km FOV, the expected number of visits is Nvisits(d2) = N(d2)/Ps = 56.3/0.24 ≈ 234. The expected number of visits to detect the 10% retrieval bias for design 1 is Nvisits(d1) = N(d1) ≈ 50 because design 1 uses the data pairs from all visits. It is notable that the total number of visits of design 1 is smaller than that of design 2 to detect the retrieval bias with the same tolerance level. This result seems to suggest that design 1 may be better than design 2, but this may not be so. Because the rain rate is patchy, the error (=satellite measurement − gauge measurement) has many zeros (see Fig. 2; the probability is about 0.9) and thus the mse(d1) can be very small simply because of the patchiness of the rain. For the extreme rain field that never has rain, the mse(d1) is zero. Even though this extreme case is trivial, it seems to us that the mse(d1) is not an appropriate measure to evaluate the accuracy of the ground truth design of satellite rain rate.
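The arithmetic in this paragraph can be reproduced with a short sketch. The function names are hypothetical. The rule in `pairs_needed` (the drmse of the mean of N independent pairs falls as 1/√N) is our reading of how the number of data pairs follows from the single-visit drmse, and should be treated as an assumption rather than the paper's stated formula. The regression coefficients and the values 56.3 and 0.24 are those quoted in the text.

```python
import math

def ps_from_fov_width(width):
    """Regression line quoted with Fig. 3: Ps as a function of FOV width."""
    return 0.0555 + 0.00876 * width

def pairs_needed(drmse, tolerance=0.1):
    """Pairs N such that drmse / sqrt(N) equals the tolerance (assumed rule)."""
    return (drmse / tolerance) ** 2

def expected_visits(n_pairs, ps=1.0):
    """Expected visits to collect n_pairs qualifying pairs; ps = 1 for design 1."""
    return n_pairs / ps

ps_fit = ps_from_fov_width(20.0)            # regression value for the 20-km FOV
n_d1 = pairs_needed(math.sqrt(0.5))         # a drmse of about 0.71 corresponds to N = 50
visits_d1 = expected_visits(50.0)           # design 1 keeps every visit
visits_d2 = expected_visits(56.3, ps=0.24)  # values quoted in the text

print(f"Ps from regression at 20 km: {ps_fit:.4f}")
print(f"Nvisits(d1) = {visits_d1:.0f}, Nvisits(d2) = {visits_d2:.1f}")
```

The regression line gives a Ps of about 0.23 for the 20-km FOV, slightly below the value 0.24 used in the text, so the expected number of visits for design 2 is sensitive to which estimate of Ps is adopted.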
It is interesting to compare our result to the model study in North et al. (1994). They used the noise-forced diffusive rain model tuned to GATE data and obtained the number of visits N = 60 to detect the retrieval bias at the 10% level. Since the noise-forced diffusive rain model always provides continuous rain fields, this model does not distinguish between design 1 and design 2. That is, for the noise-forced diffusive model, the probability that the satellite has rain is always Ps = 1. The number of visits N = 60 for the noise-forced diffusive model is quite close to N(d1) = 50 and N(d2) = 56 with real GATE data.
For the white noise mixed lognormal random field using the statistics tuned to GATE, we also analytically derived the statistics and evaluated each design proposed in this paper. It was found that design 3 has bias and the number of visits to detect 10% bias is about 96 for both design 1 and design 2 with p = 0.1. Ha and North (1999) found that the number of data pairs to detect 10% bias is about N(d1) = 96 and N(d2) = 97 for the white noise Bernoulli random field with p = 0.1. Remember that the probability that the satellite measurement has rain is Ps = 1 − (1 − p)^A for the white noise random field. Because this probability is quite close to 1 if the probability of rain p is not too small and/or the FOV size A is large, it is expected that the numbers N(d1) and N(d2) are almost the same for the white noise random field. Therefore, the expected number of visits Nvisits(d2) is almost the same as the number of data pairs N(d2) used in design 2. This result says that, under design 2, the white noise random field requires almost twice the number of measurement pairs needed for more realistic rain fields. However, the number of visits for realistic rain fields Nvisits(d2) = 234 is more than twice the number of visits Nvisits(d2) = 96 for white noise random fields.
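A quick numerical check shows how close Ps is to 1 for the white noise field. The sketch assumes that the exponent A counts the independent grid boxes inside the FOV, and that a 20 km × 20 km FOV at 4-km resolution contains A = 25 boxes; both are our assumptions for illustration, and the function name is hypothetical.

```python
def ps_white_noise(p, n_boxes):
    """Probability that at least one of n_boxes independent boxes has rain."""
    return 1.0 - (1.0 - p) ** n_boxes

# Assumed: 20 km x 20 km FOV at 4-km resolution, i.e. 5 x 5 = 25 boxes.
ps_20km = ps_white_noise(p=0.1, n_boxes=25)
print(f"Ps for p = 0.1 and A = 25: {ps_20km:.3f}")

# Ps grows quickly with the number of boxes for a fixed p.
for n_boxes in (1, 4, 9, 25, 100):
    print(n_boxes, round(ps_white_noise(0.1, n_boxes), 3))
```

With these assumed values Ps is above 0.9, far larger than the relative frequency of about 0.24 estimated from GATE for the same FOV, which is why discarding no-rain visits costs little for the white noise field but a great deal for realistic, spatially correlated rain.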
5. Summary and conclusions
In this paper we have considered ground truth designs based on point gauge measurements to validate satellite measurements. Based upon properties encountered with real rain, we modeled the non–white noise homogeneous random field having a mixed distribution as a rain-rate field. Because either or both measurements (satellite, gauge) may have no rain, three ground truth designs based on the method of throwing out the no-rain measurements were proposed. Design 1 uses data pairs from all visits. Design 2 uses data pairs only when the FOV average has rain. Design 3 uses data pairs only when the gauge has rain.
It was theoretically shown that the satellite measurement is an unbiased estimator of the gauge measurement for design 1 and design 2. However, design 3 has a serious disadvantage as the ground truth design because it exhibits a large design bias. The efficiencies of design 1 and design 2 are indexed by the mean-square error (difference) between the satellite and gauge estimates. It was derived that the mean-square error for design 2 is equal to the mean-square error for design 1 divided by the probability that the satellite measurement has rain inside the FOV. This fact gives us a way to compute the mean-square error for design 2 without using the conditional random field. The theoretical results were confirmed with the GATE data. These results generalize what Ha and North (1999) showed for the white noise Bernoulli random field.
With the GATE data having an FOV width of 20 km, we have found that for design 1 and design 2 the number of data pairs necessary to distinguish a bias of 10% is of the order of 50, which is almost half the number of data pairs needed for a white noise random field. Since design 2 rejects data pairs when the satellite measurement has no rain, about 230 overpasses are required. Since a TRMM FOV will include a given gauge about once per day near the equator, this suggests that 8–10 months of data should be adequate. The retrieval of rain rate gives a beam-filling error (Chiu et al. 1990; Ha and North 1995), which is composed of a retrieval bias and a random error with ensemble mean zero. It will thus take more than 230 overpasses to validate the satellite measurements with point gauge measurements due to the beam-filling error. Since GATE data are characteristic of precipitation in the ITCZ where it is most intense, the results may not fully apply outside these areas.
Acknowledgments
The first author (EH) wishes to thank the Yonsei University Research Foundation for its support. The second author (GRNI) thanks the NASA TRMM program for its support.
REFERENCES
Arkell, R., and M. Hudlow, 1977: GATE International Meteorological Radar Atlas. NOAA, Washington, DC, 222 pp.
Bell, T. L., A. Abdullah, R. L. Martin, and G. R. North, 1990: Sampling error for satellite-derived tropical rainfall: Monte Carlo study using a space–time stochastic model. J. Geophys. Res., 95, 2195–2205.
Chiu, L. S., G. R. North, D. A. Short, and A. McConnell, 1990: Rain estimation from satellites: Effect of finite field of view. J. Geophys. Res., 95, 2177–2185.
Ha, E., and G. R. North, 1994: Use of multiple gauges and microwave attenuation of precipitation for satellite verification. J. Atmos. Oceanic Technol., 11, 629–636.
Ha, E., and G. R. North, 1995: Model studies of the beam-filling errors for rain-rate retrieval with microwave radiometers. J. Atmos. Oceanic Technol., 12, 268–281.
Ha, E., and G. R. North, 1999: Error analysis for some ground validation designs for satellite observations of precipitation. J. Atmos. Oceanic Technol., 16, 1949–1957.
Kedem, B., L. S. Chiu, and G. R. North, 1990: Estimation of mean rain rate: Application to the satellite observations. J. Geophys. Res., 95, 1965–1972.
McConnell, A., and G. R. North, 1987: Sampling errors in satellite estimates of tropical rain. J. Geophys. Res., 92 (D8), 9567–9570.
North, G. R., and S. Nakamoto, 1989: Formalism for comparing rain estimation designs. J. Atmos. Oceanic Technol., 6, 985–992.
North, G. R., S. S. P. Shen, and R. B. Upson, 1991: Combining rain gages with satellite measurements for optimal estimates of area-time averaged rain rate. Water Resour. Res., 27, 2785–2790.
North, G. R., J. B. Valdes, E. Ha, and S. P. Shen, 1994: The ground-truth problem for satellite estimates of rain rate. J. Atmos. Oceanic Technol., 11, 1035–1041.
Parzen, E., 1962: Stochastic Processes. Holden-Day, 324 pp.
Patterson, V. L., M. D. Hudlow, P. J. Pytlowany, F. P. Richards, and J. D. Hoff, 1979: GATE radar rainfall processing system. NOAA Tech. Memo. EDIS 26, Washington, DC, 158 pp.
Shin, K. S., and G. R. North, 1988: Sampling error study for rainfall estimates by satellite using a stochastic model. J. Appl. Meteor., 27, 1218–1231.
Simpson, J., R. F. Adler, and G. R. North, 1988: A proposed Tropical Rainfall Measuring Mission (TRMM) satellite. Bull. Amer. Meteor. Soc., 69, 278–295.
Theon, J. S., T. Matsuno, T. Sakata, and N. Fugono, 1992: The Global Role of Tropical Rainfall. A. Deepak, 280 pp.
Thiele, O. W., 1992: Ground truth for rain measurement from space. The Global Role of Tropical Rainfall, J. S. Theon et al., Eds., A. Deepak, 245–260.
Wilheit, T. T., A. T. C. Cheng, and L. S. Chiu, 1991: Retrieval of monthly rainfall indices from microwave radiometric measurements using probability distribution functions. J. Atmos. Oceanic Technol., 8, 118–136.
Fig. 1. Schematic diagram for the FOV of L km on a side.
Citation: Journal of Atmospheric and Oceanic Technology 19, 1; 10.1175/1520-0426(2002)019<0065:EOSGTD>2.0.CO;2
Fig. 2. The histogram of the satellite and gauge measurements, and the errors. The GATE data and the FOV of size 20 km × 20 km (resolution is 4 km) are used.
Fig. 3. The estimated probability Ps that the satellite measurement has rain inside the FOV with GATE data. The linear regression equation is Ps = 0.0555 + 0.00876 × (width of the FOV).
Table 1. Ensemble mean of the satellite measurements 〈ψsi〉, gauge measurements 〈ψgi〉, and error (=satellite measurement − gauge measurement) 〈εdi〉 with GATE data.
Table 2. The number of data pairs N(d1) for design 1 and N(d2) for design 2 to achieve about 10% of the standard deviation of the gauge measurement are given, as well as the dimensionless root-mean-square error W(di) of one data pair. The dimensionless mean-square error is defined as W2(di) = 〈