## Abstract

A procedure is described to estimate bias errors for mean precipitation by using multiple estimates from different algorithms, satellite sources, and merged products. The Global Precipitation Climatology Project (GPCP) monthly product is used as a base precipitation estimate, with other input products included when they are within ±50% of the GPCP estimates on a zonal-mean basis (ocean and land separately). The standard deviation *σ* of the included products is then taken to be the estimated systematic, or bias, error. The results allow one to examine monthly climatologies and the annual climatology, producing maps of estimated bias errors, zonal-mean errors, and estimated errors over large areas such as ocean and land for both the tropics and the globe. For ocean areas, where there is the largest question as to absolute magnitude of precipitation, the analysis shows spatial variations in the estimated bias errors, indicating areas where one should have more or less confidence in the mean precipitation estimates. In the tropics, relative bias error estimates (*σ*/*μ*, where *μ* is the mean precipitation) over the eastern Pacific Ocean are as large as 20%, as compared with 10%–15% in the western Pacific part of the ITCZ. An examination of latitudinal differences over ocean clearly shows an increase in estimated bias error at higher latitudes, reaching up to 50%. Over land, the error estimates also locate regions of potential problems in the tropics and larger cold-season errors at high latitudes that are due to snow. An empirical technique to area average the gridded errors (*σ*) is described that allows one to make error estimates for arbitrary areas and for the tropics and the globe (land and ocean separately, and combined). 
Over the tropics this calculation leads to a relative error estimate for tropical land and ocean combined of 7%, which is considered to be an upper bound because of the lack of sign-of-the-error canceling when integrating over different areas with a different number of input products. For the globe the calculated relative error estimate from this study is about 9%, which is also probably a slight overestimate. These tropical and global estimated bias errors provide one estimate of the current state of knowledge of the planet’s mean precipitation.

## 1. Introduction

Over the last few decades a number of multiyear or climatological precipitation datasets and analyses (hereinafter “climatologies”) have been produced that cover all, or a substantial portion, of the globe. These include climatologies based on conventional surface observations (e.g., Jaeger 1976; Legates and Willmott 1990), combinations of satellite and gauge observations at monthly time resolution (e.g., Adler et al. 2003a; Huffman et al. 2009; Xie and Arkin 1995), and monthly, satellite-only estimates [often ocean only; e.g., Hilburn and Wentz (2008); Klepp et al. (2005)].

Data from the Special Sensor Microwave Imager (SSM/I) on board the U.S. Defense Meteorological Satellite Program series of satellites have been a critical input to many of these datasets since 1987. An intercomparison of relatively early SSM/I-based-algorithm ocean results at the monthly mean level showed a very large range of estimates of mean oceanic rainfall in the tropics (Adler et al. 2001). The availability of a few years of both passive and active microwave rain estimates from the Tropical Rainfall Measuring Mission (TRMM) at the beginning of the twenty-first century and advances in algorithm development led to a narrowing of the range of TRMM estimates (~20% in range of mean values) in the tropics relative to the pre-TRMM estimates (Adler et al. 2003b). The latest version (version 6) of the TRMM products has even smaller ranges of tropical mean values because of improved physics and techniques being used in the retrievals. There still remains a significant variation in mean precipitation values among various satellite estimates over the ocean, however. Over land, there is a similar variation, although in areas of good rain gauge coverage the gauge information is usually accepted as the standard or becomes a strong component of any multiproduct analysis. Satellite information is still valuable over land for discerning patterns and magnitudes in some key land areas where gauges are sparsely distributed or are of questionable quality.

So, with all of these estimates available, what is the correct mean precipitation at a location, or over a large or small area, either in a climatological sense or for a particular month? A better question, however, may be “What is the error bar of a particular estimate?” The error associated with a particular estimate can be thought of as having two parts. One part of the total error is the random error, which could consist of random measurement errors and random errors due to sampling limitations and other processes. Because these errors are “random,” significant spatial or temporal averaging should reduce the mean random error to near zero. The second part of the total error is the systematic, or bias, error. No amount of averaging eliminates this type of error. This component can arise from systematic algorithm or other measurement errors or from sampling biases (e.g., sampling only part of the diurnal cycle). Therefore, for long-term means or climatologies, where the random error should be near zero, “What is the magnitude of the estimated systematic (or bias) error?” is a valid and important question. For example, the monthly precipitation product from the Global Precipitation Climatology Project (GPCP) is a community-based analysis of global precipitation under the auspices of the World Climate Research Program (WCRP) from 1979 to the present (Adler et al. 2003a; Huffman et al. 2009). A significant amount of effort by a relatively large group of people has gone into the technique and the resulting dataset, which has been used in over a thousand scientific journal articles. Although this monthly GPCP dataset is accompanied by gridded estimates of *random* error [combined algorithm and sampling; see Huffman (1997)], there is no estimate of systematic or *bias* error accompanying the product or the climatologies produced from the monthly analyses. 
In the GPCP analysis procedure the developers seek to minimize bias by, for example, adjusting the satellite information by the gauge analysis over land and adjusting infrared (IR) estimates over ocean by the relatively less frequent, but higher-quality, passive microwave estimates. The design goal is to remove apparent biases before combination is done.

One method by which error estimates, including bias error, may be calculated for individual satellite algorithms (e.g., passive microwave retrievals over ocean) is by estimating errors in the input information, for example in the assumed microphysics (Wilheit et al. 2007) or in parameters such as instrument error, errors in ancillary information, and errors in vertical hydrometeor structure. These estimated input errors are then included in the retrieval calculations, interacting with the physical variables and other errors in an attempt to calculate their impact on the final retrieval error. Estimating all of these input errors and how they interrelate to each other and with the physics of the retrieval is very complicated, however, and this approach has not yet resulted in a usable overall bias estimate, although this approach should eventually be useful.

Some information concerning bias error can, of course, be drawn from validation data (e.g., Adler et al. 2003a,b; Bolvin et al. 2009). Because gauge information is used to constrain the bias of the GPCP estimates over land, however, mean values of gauge-based validation fields tend to mirror the GPCP estimates in areas of good gauge information, although gauge validation of the satellite-only intermediate product does give information on the bias errors of the satellite estimates used. Over ocean, validation is problematic, with atolls concentrated in one location (tropical western Pacific Ocean) and with possible island effects making them probably unrepresentative of open-ocean mean rainfall. Buoy rain gauges (Bowman 2005) and atoll-based radars (Wolff et al. 2005) also have difficulty producing mean rain estimates with accuracies much smaller than those of the satellite estimates themselves. Outside of tropical oceans, there is a complete dearth of oceanic, accurate mean precipitation validation data.

The routine estimation of bias error for individual algorithms and for merged precipitation products has, therefore, not been successful to this point. Such error estimates are, however, important—for example, in water budget calculations where the observed precipitation data are being combined with observations or calculations of other components of the water cycle and a balance is mandatory. If a water imbalance occurs, which component or components should be adjusted and what is the limit of adjustment (i.e., what is the estimate of bias error?)? A simpler application of bias error estimates is the validation of global model rainfall calculations. For example, is a global-model-generated mean January ocean precipitation, which is 20% higher than GPCP, still within the estimated bias error of the observed GPCP product and therefore plausibly correct, or is the model mean value clearly outside the estimated bias error and therefore probably incorrect?

With these types of incentives in mind, we have developed an approach to estimate systematic or bias errors for satellite-based precipitation annual and monthly climatologies with an eye on applying it specifically to the GPCP product. The approach eschews an analysis of the detailed errors related to the physics of individual retrievals but instead drops back to examine the variations among different estimates using different satellites, merged products, and specific algorithms. The size of the estimate of the bias error will be directly related to the magnitude of the dispersion or spread of the different datasets. One justification for this type of approach is that each algorithm or merged-product developer does his or her best at taking into account the physics and statistics of the process to make their best estimate, but not all arrive at the same answer. Thus, by examining a set of such estimates (and the spread among them), we are actually indirectly measuring the effect of different physical assumptions in the retrievals, impacts of different sampling strategies or limitations, and the effect of various merger schemes. It is hoped that a dispersion statistic from among these products reflects the spread of estimates that result from the state of knowledge of the process.

The approach used here is similar to that of Smith et al. (2006), in terms of using multiple satellite precipitation estimates, but will include a scheme to screen the input estimates and allow for the calculation of area means of the estimated bias error. Our approach will be applied to the GPCP record so that we can examine maps of errors and then use the results to estimate both global and regional errors on the climatological scale. One goal is to achieve a technique that can be applied directly to the monthly GPCP analysis for use by the user community. Combined with the existing estimate of random error for GPCP monthly values (Huffman 1997), this new bias error would allow for an estimate of total (bias plus random) error.

## 2. Data resources

The monthly precipitation product from the GPCP is a community-based analysis of global precipitation under the auspices of the WCRP from 1979 to the present (Adler et al. 2003a; Huffman et al. 2009). Archived on a 2.5° × 2.5° grid, the data are combined from various information sources: microwave-based estimates from SSM/I, IR rainfall estimates from geostationary and polar-orbiting satellites, estimates from Television and Infrared Observation Satellite (TIROS) Operational Vertical Sounder (TOVS) and Atmospheric Infrared Sounder (AIRS) sensor soundings, and surface rain gauges. Although the data are homogeneous since 1988 in terms of input datasets, the satellite inputs are limited to IR-based estimates during the pre-1988 period. These pre-1988 estimates are trained on the later period to reduce possible differences. We should certainly be cautious of this time inhomogeneity in the analysis in terms of satellite input datasets, even though Smith et al. (2006) showed that the impact of this time inhomogeneity is not a major concern. Detailed procedures and input data information can be found in Adler et al. (2003a). Version 2 of the GPCP dataset has been used in this study. The recent release of version 2.1 (Huffman et al. 2009) will change the results minimally. When the next version is released, the numerical results will be updated.

In addition to the GPCP analysis, other precipitation estimates are used to estimate the bias error. Over ocean these include the Climate Prediction Center Merged Analysis of Precipitation (CMAP; Xie and Arkin 1997), Hamburg Ocean–Atmosphere Parameters and Fluxes from Satellite Data (HOAPS; Klepp et al. 2005), SSM/I *F-13* through the Goddard profiling algorithm [GPROF; Kummerow et al. (2001)], SSM/I *F-13* through the Remote Sensing Systems product [RSS; Hilburn and Wentz (2008)], TRMM-2A12 (Kummerow et al. 2001), TRMM-2A25 (Iguchi and Meneghini 1994), and TRMM-2B31 (Haddad et al. 1997). Over land, four products are used (i.e., GPCP, CMAP, TRMM-2A25, and TRMM-2B31). The others are not used over land because of increased error there (SSM/I *F-13* computed with GPROF and TRMM-2A12) or lack of estimates (HOAPS and SSM/I *F-13* computed with the RSS algorithm).

## 3. Approach

The goal of this work is to produce bias error estimates on three time scales: 1) annual climatology, 2) month-of-the-year climatology, and, eventually, 3) individual months. The basic idea is to use multiple estimates of the monthly (or climatological) precipitation and to use the spread or dispersion among the estimates as a measure of the bias error. The statistic chosen to represent the degree of dispersion, and therefore the bias error estimate in this study, is the standard deviation *σ*. Other statistics could have been chosen, including 2*σ*, or even range, but *σ* is simple and can usually be converted into the other statistics, assuming a normal distribution. Therefore, in the rest of this paper bias error (or absolute bias error) will be defined as the standard deviation (*σ*) among the included products and will have the units of precipitation (e.g., mm day^{−1}). In addition, the relative bias error will be defined as the bias error (*σ*) divided by the mean precipitation *μ* (i.e., *σ*/*μ*). Often one or both of *σ* and *σ*/*μ* will be used, because they each have strengths in assessing the bias errors, especially at different magnitudes. When mean precipitation is near zero, the relative error becomes meaningless and *σ* must be used; when the mean rainfall over an area is relatively large, the relative, or percentage, error becomes much easier to use in comparisons.
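The two definitions above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code used for the study: the function name `bias_error`, the array layout, the `min_mean` threshold below which the relative error is left undefined, and the choice of a population standard deviation (`ddof=0`) are all assumptions made for the example.

```python
import numpy as np

def bias_error(products, ddof=0, min_mean=0.1):
    """Estimate the bias error from the spread among products.

    products : array of shape (n_products, n_lat, n_lon) in mm/day,
               with NaN wherever a product is not included.
    ddof     : delta degrees of freedom for the standard deviation
               (assumption: the text does not state which form is used).
    min_mean : hypothetical threshold (mm/day) below which sigma/mu is
               treated as meaningless and returned as NaN.
    """
    products = np.asarray(products, dtype=float)
    mu = np.nanmean(products, axis=0)               # composite mean
    sigma = np.nanstd(products, axis=0, ddof=ddof)  # absolute bias error
    rel = np.full_like(mu, np.nan)                  # relative bias error
    np.divide(sigma, mu, out=rel, where=mu > min_mean)
    return mu, sigma, rel
```

Returning both *σ* and *σ*/*μ* reflects the point made above: the relative error is only informative where the mean precipitation is not close to zero.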

The spatial scale for this study is 2.5° × 2.5° latitude–longitude. The base period for the study is 1998–2007 to incorporate the TRMM period. The products selected for inclusion in this exercise are many of the standard precipitation products used in a number of applications and studies. In general, they are considered by the community to be of good quality and validated. This assessment of generally good quality, however, does not guarantee their accuracy in general or their accuracy at all locations and/or in all seasons for which the products present estimates.

To be included in the bias error calculations (calculations of *σ*), each product is examined in terms of zonal average (ocean and land separately) for individual months. An example of zonal-averaged, ocean mean values for January and July 2003 is shown in Fig. 1. In the tropics there are eight products having fairly good agreement. At higher latitudes there is greater dispersion of the estimates. Some products are thought to be accurate in one region but perhaps not in another location. To avoid using products in regions for which they are believed to be less accurate, we devised a simple check to be applied to the zonal-averaged data for each month. Because we think the GPCP estimates are reasonable everywhere (but certainly not perfect) and because the focus of the error application is eventually on the GPCP product, we discard products whose zonal-mean value (ocean and land separately) is more than ±50% from the GPCP estimate. We apply this test on data for individual months so that the use of datasets varies as a function of latitude, season, and even year. We tested using smaller and larger ranges (from 25% to 100%) for inclusion, but the results were fairly insensitive to this variation. Limiting the inputs in this objective way, therefore, gives a procedure that includes information at latitudes at which the inclusion is reasonable but that eliminates estimates from the same product in regions where it is far from the expected value. The choice of the GPCP estimate as the base assumes that its value at a particular month and latitude cannot be more than 50% from the real value. This relatively large range of values to be included in the calculation makes the dispersion calculation realistic and yet excludes clearly incorrect or suspect estimates.

To be specific, all products are first regridded to the GPCP grid (2.5° × 2.5°). Then, they are chosen to be included for the bias error estimation based on the following procedure:

1. Compute zonal-mean profiles for the monthly GPCP precipitation [*μ*_{GPCP}(*φ*,*t*)] and the other products [*μ*_{i}(*φ*,*t*)], where *φ* denotes latitude, *t* is time, and *i* represents a specific product.
2. For each month and latitude, include product *i* only if *μ*_{i}(*φ*,*t*) is within ±50% of *μ*_{GPCP}(*φ*,*t*), with ocean and land screened separately, as described above.
3. Calculate the mean rainfall map by averaging the chosen products and GPCP. This is done for the 10-yr climatology, the 10-yr seasonal cycle, and for each month. Then, the standard deviation (*σ*) among the products is calculated at each grid point. Note that a product (with the exception of GPCP) may be included along *some* latitudes but not along *other* latitudes and in *some* months but not in *other* months.

The resulting values of dispersion among the estimates are then considered to be estimates of bias error and to be applicable to GPCP mean rainfall estimates. Results at various spatial and temporal scales are discussed in the following sections.
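The screening and dispersion procedure described in this section can be sketched as follows for a single month and a single surface type. This is an illustration under stated assumptions, not the authors' implementation: the function name `screen_and_spread`, the array layout, and the use of a population standard deviation are hypothetical, and all inputs are assumed to be already regridded to the 2.5° × 2.5° GPCP grid.

```python
import warnings
import numpy as np

def screen_and_spread(gpcp, others, surface_mask, tol=0.5):
    """Screen products against GPCP zonal means, then compute mean and spread.

    gpcp         : (n_lat, n_lon) monthly GPCP field, mm/day
    others       : (n_products, n_lat, n_lon) other products on the same grid
    surface_mask : boolean (n_lat, n_lon), True for the surface type being
                   processed (ocean or land are handled in separate calls)
    tol          : fractional tolerance; 0.5 is the +/-50% test in the text
    """
    with warnings.catch_warnings():
        # Rows or cells with no valid data produce NaN; silence the warnings.
        warnings.simplefilter("ignore", RuntimeWarning)
        gpcp = np.where(surface_mask, gpcp, np.nan)
        others = np.where(surface_mask, others, np.nan)

        # Zonal means: boxes along one latitude row have equal area,
        # so an unweighted mean over longitude is sufficient.
        zm_gpcp = np.nanmean(gpcp, axis=1)       # (n_lat,)
        zm_others = np.nanmean(others, axis=2)   # (n_products, n_lat)

        # Keep a product along a latitude only if it is within tol of GPCP.
        keep = np.abs(zm_others - zm_gpcp) <= tol * np.abs(zm_gpcp)
        screened = np.where(keep[:, :, None], others, np.nan)

        # Composite of GPCP plus retained products, and their dispersion.
        stack = np.concatenate([gpcp[None], screened], axis=0)
        mu = np.nanmean(stack, axis=0)
        sigma = np.nanstd(stack, axis=0)
        n_used = np.sum(~np.isnan(stack), axis=0)
    return mu, sigma, n_used
```

Because the test is applied per latitude row and per call, running it month by month reproduces the behavior noted above: a product can contribute at some latitudes and in some months but not in others, whereas GPCP is always retained.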

## 4. Results

### a. Ten-year climatologies: Means and estimated bias error fields

#### 1) Maps and regional variations

The 10-yr climatologies and parameters from the bias error calculations are shown in Fig. 2. Figure 2a shows the 10-yr GPCP mean precipitation, and Fig. 2b gives the mean precipitation for the composite (or mean) of the products used at each grid point, with Fig. 2f showing the number of products going into the composite. Figure 2c indicates the difference between the two mean fields. As one would expect, the two fields are very similar, with the same major features and only slight differences in magnitude over much of the globe. In mid- to high-latitude oceans, however, the composite is lower than the GPCP estimate, indicating that the non-GPCP products going into the composite are generally lower than those of GPCP in this area. Over tropical oceans there are small areas with composite values that are slightly lower than GPCP values, mainly in the eastern Pacific Ocean and in the eastern Atlantic Ocean. We are mainly concerned, however, with the variation among the input products to the composite. Those variations are given in Figs. 2d and 2e. The simple standard deviation *σ* of the input products in Fig. 2d shows generally higher values of *σ* with higher mean precipitation *μ*, as expected. In the tropical eastern Pacific Ocean, *σ* reaches 1.2 mm day^{−1}, giving a *σ*/*μ* up to 20% (Fig. 2e). Over the tropical oceans in areas of significant rainfall the percentage variation among the estimates is generally lower than this peak, with maximum values of 10%–15% in the western Pacific Ocean, even in areas of significant annual rainfall. In the midlatitude oceanic maxima (off the east coasts of Japan, the United States, and South America) and the midlatitude extension of the South Pacific convergence zone, the variation of the estimates is also about 15%. At higher latitudes over the ocean the percentage variation tends to increase from midlatitudes toward higher latitudes, with values reaching over 50% at 60° latitude in either hemisphere. 
Even higher percentage variations are found farther poleward, but these are in areas with only a few (two or three) contributing products. Over land the percentage variation is about 10% in most areas of significant rain but is higher (up to ~20%) in eastern Africa along the equator.

Figure 3a shows an example of the distribution of variation among estimates as a function of mean rain rate for various parts of the tropical ocean, including the tropical western and eastern Pacific Ocean areas. Fitted straight lines summarize each area. As expected, the *σ*/*μ* values decrease with increasing mean rain rate *μ*, with a majority of values being between 5% and 20%. The fitted lines clearly indicate that the eastern Pacific area has higher variability among the estimates, with mean *σ*/*μ* of about 15% versus 10%–12% at 5 mm day^{−1}. If these measures of dispersion are equivalent to bias errors (or are at least proportional), a conclusion is that we are less certain of our estimates in the eastern part of that ocean. This quantification of estimated bias errors agrees with a number of individual studies of these two areas (Berg et al. 2002, 2006; Shige et al. 2008). The tropical Indian Ocean also has higher bias errors, whereas the tropical Atlantic appears to have the lowest estimated errors.

Figure 3b shows the relations for *σ*/*μ* versus *μ* for midlatitude oceans in the Northern Hemisphere (NH). The fitted lines indicate a surprising slight increase of *σ*/*μ* with *μ*, with the maximum value of error at intermediate rain rates. Whereas it might have been expected that midlatitude ocean bias errors would be larger than in the tropics, the estimated errors at these latitudes (30°–45°N) are similar to those in the tropics at higher rain rates (>5 mm day^{−1}). Although we may think we know less about the magnitude of higher-latitude ocean rainfall as compared with that of the tropics, the dispersion of the available estimates is about the same.

Over tropical land (Fig. 3c), the estimated bias errors for Africa and South America are about the same value as that over the western Pacific Ocean but are less than that over the eastern Pacific. At overlapping rain rates (3–6 mm day^{−1}), South America has higher errors, possibly due to the different structure of rainfall there, with less deep convection.

#### 2) Zonal means of precipitation and associated errors

As discussed in section 4a(1), there are latitudinal variations in the estimated bias errors. Figure 4 shows the mean annual, zonal-mean precipitation values over ocean and the associated estimated errors. Keep in mind that the error values in Fig. 4 are not the zonal mean of the errors in the maps (Fig. 2) but are the standard deviation *σ* of the zonal-mean precipitation estimates. In terms of the mean precipitation, the composite is very close to GPCP values in the tropics and is less than GPCP values in the midlatitude maxima of the two hemispheres, again indicating that the four–five products that go into the composite tend to have lower precipitation than GPCP does. The standard deviation among the zonal means of the input estimates (*σ*; Fig. 4b) has a value of greater than 0.6 mm day^{−1} at 60° latitude in both hemispheres and has a secondary maximum in the tropics that is associated with the zonal maximum of rainfall. The *σ*/*μ* ratios vary from a low point of just below 10% in the tropical rain maximum to ~15% at 45° and ~25% at latitudes above 55°. These estimated errors in the zonally averaged precipitation tend to be smaller than those in specific latitude–longitude locations because of a canceling effect among the products when zonally averaged. These results seem to confirm the notion that we have a better knowledge of mean precipitation over tropical oceans than we do for higher ocean latitudes, even when we do zonal averages.

Over land (Fig. 5) in the tropical rainy belt, the *σ* value is smaller than over ocean, with the estimated percentage error being ~5%. This relatively low value reflects the presence of surface gauge information in two of the products and possibly the use of gauge information as validation during satellite algorithm development. There is a general poleward increase in estimated error, with values up to 20%, but outside of 40° latitude there are typically only two products. At southern latitudes outside 40°, there is also diminishing land. That the errors increase toward the pole over land seems reasonable, considering difficulties in making measurements with both satellite and gauges in high latitudes, especially during the cold season.

### b. Seasonal variations

The climatological means and estimated bias errors are calculated for each month. January and July will be described here as examples. Figure 6 shows the results for the January climatology. First of all, the estimated bias errors for the climatology of an individual month are typically greater than for the annual climatology. Just as with spatial averaging (as in the last section), time averaging over the annual cycle with compensating errors tends to result in lower error values. So, for example, the January mean *σ*/*μ* in the North Pacific Ocean is generally larger than 20% (Fig. 6e), whereas the same area for the annual climatology is mostly below 20% (Fig. 2e).

In the tropics for January the oceanic precipitation maxima are pushed toward the Southern Hemisphere. In the eastern Pacific Ocean the narrow rain maximum lies nearly along the equator, with stronger peak values in the composite mean than with GPCP. This distinct difference (Fig. 6c) is probably related to the GPCP product broadening the rain maximum and therefore underestimating the peak values. This effect can be seen in the reversal of the sign of the GPCP–composite difference as one moves a small distance north and south of the maximum. The estimated bias errors remain higher in the eastern Pacific maximum relative to the western Pacific Ocean feature.

The January midlatitude NH ocean maxima in both the Atlantic and Pacific are located on a zone of tight gradient of estimated error, increasing toward the pole, with larger error values found for January than for the annual climatology. These *σ*/*μ* values of greater than 30% clearly indicate a weakness in estimating precipitation over ocean in the cold season at these latitudes. The estimated errors in July (Fig. 7) in these locations are lower although still high relative to the annual cycle values. These higher errors in midlatitude winter are due to shallower liquid precipitating layers (more difficult for passive-microwave instruments to detect) and greater depth of falling snow (with attendant complex scattering signals). Over land the estimated errors in the tropics are similar for each season and are only a small amount larger than the errors for the annual mean. In higher-latitude areas (e.g., Asia above 50°N), however, estimated errors jump from 10% to greater than 30%. This large error jump is due to having only two estimates included in this area and a use of wind loss adjustment in one (GPCP) being much larger in winter during probable snow conditions.

### c. Averaging errors over large areas

After estimating the bias error across a global grid, one obvious extension is to estimate the error over an arbitrary area of the grid. One can do this calculation simply with a knowledge of the input precipitation fields, taking the areal mean of each input for the selected area and then calculating the *σ* and *σ*/*μ* of that set. Examples of such a calculation, the zonal-mean ocean and land climatological error estimates, were already presented in section 4a(2). Remember, the mean of the estimated errors is not equal to the estimated error of the area means. It is the second parameter that is the desired one, and it is usually smaller because of compensating differences with positive and negative signs. Situations can arise that make the calculations more complicated, however. For example, the area over which one wishes to make the calculations can have a different number of products at different places in the selected area. Computing the zonal average over ocean (and land separately) avoided this issue, because the technique for selection of the input datasets is based on the zonal means and variations. If one selects an area that includes ocean and land areas with different input datasets, however, or an area that goes across latitudinal boundaries (e.g., 40°) with different numbers of input datasets, one cannot simply calculate the areal means of the *n* datasets, because *n* is not a constant over the entire area.
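The distinction between the mean of the estimated errors and the estimated error of the area means can be made concrete with a small sketch. It assumes a constant set of products over the area (no missing values), cosine-of-latitude area weights, and hypothetical function names; none of these details are taken from the original analysis code.

```python
import numpy as np

def area_weights(lat_centers_deg, n_lon):
    """Normalized cos(latitude) weights for a rectangular lat-lon area."""
    coslat = np.cos(np.deg2rad(np.asarray(lat_centers_deg, dtype=float)))
    w = coslat[:, None] * np.ones((1, n_lon))
    return w / w.sum()

def sigma_of_area_means(products, weights):
    """Spread among the area-mean values of each product (the desired error)."""
    # Contract the (lat, lon) axes of each product against the weights.
    means = np.tensordot(products, weights, axes=([1, 2], [0, 1]))
    return float(means.std())

def area_mean_of_sigma(products, weights):
    """Area mean of the gridded spread (generally larger than the above)."""
    return float((products.std(axis=0) * weights).sum())

# Two products that differ by +/-2 mm/day with opposite sign in two boxes:
prods = np.array([[[3.0, 5.0]], [[5.0, 3.0]]])
w = area_weights([0.0], n_lon=2)
print(sigma_of_area_means(prods, w), area_mean_of_sigma(prods, w))
```

In this toy case the differences cancel completely in the area mean, so the error of the area means is zero while the mean of the gridded errors is 1 mm day^{−1}. With a constant set of products and nonnegative weights, the first quantity can never exceed the second.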

To calculate the estimated bias error of an arbitrary area, knowing the estimated errors at each grid location, an empirical approach is developed below that allows us to calculate tropicwide and global estimated errors in the next section. It also provides a tool so that an attached grid of estimated bias errors could accompany the GPCP monthly precipitation dataset, from which arbitrary area-averaged errors can be estimated. Again, the goal is to go from a two-dimensional grid of estimated errors to an estimate of error over an arbitrary area. In other words, we want to estimate the real bias error *σ* over an area from the domain mean of the gridded bias error estimates *σ̄*, which is, of course, not the same. The empirical approach is to calculate the “real” *σ* and the area mean of the gridded *σ*s (i.e., *σ̄*) as a function of area size and type of domain and to compare the two parameters. These calculations are done for areas of the grid for which all grid points have the same number of products. For an area of one grid cell (2.5°), the ratio of the two (*σ̄* to the real *σ*) is, of course, 1. As the area gets larger, this ratio increases above 1 because of the canceling effect of bias errors when averaging precipitation values over areas. Examples of many such calculations over many different areas are shown in Fig. 8 for the *annual climatology* over ocean and land. The domain size is represented by a “pseudosize,” representing its comparable domain size (counted by the corresponding grid boxes of size 2.5° latitude × 2.5° longitude) at the equator. This adjusted size takes into account the changing latitude–longitude grid size. The calculations are done for a large number of rectangular areas (in terms of grid boxes) of various aspect ratios over ocean and land separately, all within the latitude bounds of 37.5°N–S. There is a fair amount of scatter, but a very large number of the calculations cover a narrow range, especially for larger size areas. 
The scatter is fit through a nonlinear relation using the ITT Visual Information Solutions, Inc., Interactive Data Language (IDL) Gaussfit program (see the appendix). The fitted curves (*R*_{o} and *R*_{l}) in Fig. 8 rise sharply from 1.0 at the smallest (one grid) size and then continue to increase at a shallower slope until reaching an asymptote. The fitted curves capture the empirical relation related to the area-averaging process. The asymptote for the ocean is considerably smaller than that for land. This land–ocean difference is due to the larger horizontal variability in climatological precipitation (and related variations among estimates) over land as compared with ocean. This higher variability in mean precipitation (mainly due to orographic and other surface effects) in turn affects the variability in the calculated *σ* (which tends to be proportional to mean precipitation). The more horizontally variable *σ* fields over land then lead to the higher ratio values over land.
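The ratio plotted against pseudosize in Fig. 8 can be illustrated with a minimal sketch. It assumes that the pseudosize is the cosine-of-latitude-weighted count of grid boxes, that area means use the same cosine weights, and that *σ* is the population standard deviation across products; the function names and the toy inputs are hypothetical and are not taken from the paper.

```python
import math
import statistics


def pseudosize(lats_deg):
    """Equivalent number of equatorial grid boxes (assumes cos(latitude) scaling)."""
    return sum(math.cos(math.radians(lat)) for lat in lats_deg)


def ratio_mean_sigma_to_real_sigma(products, lats_deg):
    """Ratio of the area mean of gridded sigmas to the 'real' sigma of the area.

    products: one list per product, each holding precipitation at every grid box.
    lats_deg: latitude (degrees) of each grid box, in the same order.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    wsum = sum(weights)

    # Gridded sigma: spread among products at each grid box.
    grid_sigma = [
        statistics.pstdev([p[k] for p in products]) for k in range(len(lats_deg))
    ]
    mean_sigma = sum(w * s for w, s in zip(weights, grid_sigma)) / wsum

    # "Real" sigma: spread among the area means of the products.
    area_means = [sum(w * v for w, v in zip(weights, p)) / wsum for p in products]
    real_sigma = statistics.pstdev(area_means)

    return mean_sigma / real_sigma


# Two hypothetical products on two equatorial boxes, with partly opposing differences.
toy_products = [[1.0, 3.0], [3.0, 2.0]]
toy_ratio = ratio_mean_sigma_to_real_sigma(toy_products, [0.0, 0.0])
```

Because the same number of products enters both the numerator and the denominator, the ratio does not depend on whether the sample or the population standard deviation is used.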

Figures 9 and 10 show the ratio results for the tropical climatology of each month, with different figures for ocean and land. The results at this time scale (climatological month) have larger ratios than for the annual climatology. The asymptote is about 2.3 over ocean and 3.1 over land, with fairly tight scatter resulting from the exclusion of the extratropics in this case.

To evaluate how this technique performs in different regions, the monthly climatological errors are estimated for different-sized areas by the use of the formulas of *R*_{o} and *R*_{l} (see the appendix) and are compared with estimated bias errors calculated by using the precipitation information, area averaging the information, and then calculating the estimated errors (standard deviations) from the area means of the multiple products. This evaluation can only be done for areas with the same number of products, and therefore we use the tropical band (37.5°S–37.5°N) and a midlatitude area (40°–52.5°N and 40°–52.5°S). The results for the 37.5°S–37.5°N band are essentially a check of the technique on dependent data, whereas results for the higher-latitude band are independent, since the relations were only developed using the tropical (37.5°S–37.5°N) area. These results are shown in Figs. 11 and 12 for ocean and land, respectively. For ocean, Fig. 11 indicates that the technique is fairly accurate over ranges of *σ* (Fig. 11a), area size (Fig. 11b), and mean rain rate (Fig. 11c). The results for the midlatitude areas are slightly biased, because the sample is dominated by the tropics. Over land (tropics; see Fig. 12), the results have a greater variance but should still be useful.

These results indicate that the area mean of the gridded *σ*s can be converted into a *σ* of the precipitation means over an arbitrary area. This conclusion allows the use of the grid of *σ*s for a number of applications. Two of these will be addressed in the next section.
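For a domain of a single surface type, the conversion can be sketched as follows. The sketch assumes cosine-of-latitude area weights and takes the ratio as an input that would come from the fitted curve for the domain's pseudosize; the function name and the numbers are hypothetical.

```python
import math


def area_sigma_from_grid(grid_sigma, lats_deg, ratio):
    """Estimate the area sigma from gridded sigmas and an empirical ratio.

    ratio is the area mean of gridded sigmas divided by the real sigma,
    assumed here to be read off a fitted curve for the domain size.
    """
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_sigma = sum(w * s for w, s in zip(weights, grid_sigma)) / sum(weights)
    return mean_sigma / ratio


# Hypothetical gridded sigmas (mm/day) on two equatorial boxes, with ratio 2.
estimate = area_sigma_from_grid([0.4, 0.6], [0.0, 0.0], ratio=2.0)
```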

### d. Estimating the bias errors of tropical and global climatological precipitation

What is our state of knowledge of the magnitude of total tropical and global precipitation? In other words, what is the total mean precipitation over these areas and what is the error bar on that estimate? We will use the procedure described in this paper to estimate the bias error for these large-area estimates. The following equation will be used to make the estimates when combining estimates over land and ocean. As stated above, *R*_{o} and *R*_{l} are parameters denoting the empirical relations over ocean and land between averaged bias and real bias (real *σ*) over a domain, which are estimated by the fitted red curves in Fig. 8 (see the appendix). They may vary with both size of domain and domain-mean rain rate. The technique is a method for going from a map of gridded *σ*s to an estimate of the area *σ* (i.e., an estimate of bias error over the selected area).

For a domain covering both land and ocean, the adjusted bias error can then be estimated as

where *φ* denotes latitude and *σ*_{oi} and *σ*_{lj} are the bias errors at grid points over ocean and land, respectively. This area weighting is necessary because of the difference in *R* values over ocean (*R*_{o}) and land (*R*_{l}), as seen in Figs. 8–10.
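One expression consistent with these definitions is the cosine-latitude-weighted combination below. It is offered as an assumption, not as the published formula. The symbol $\sigma_{\mathrm{adj}}$, the explicit $\cos\varphi$ weights, and the placement of $1/R_o$ and $1/R_l$ are notational choices made here; the indices $i$ and $j$ run over ocean and land grid points in the domain.

$$
\sigma_{\mathrm{adj}} \approx \frac{\dfrac{1}{R_o}\sum_i \sigma_{oi}\cos\varphi_i + \dfrac{1}{R_l}\sum_j \sigma_{lj}\cos\varphi_j}{\sum_i \cos\varphi_i + \sum_j \cos\varphi_j}
$$

For an all-ocean or all-land domain this reduces to the area mean of the gridded *σ*s divided by the corresponding ratio, which matches the definition of *R* as the area-mean *σ* divided by the real *σ*.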

The results for the tropical case (25°N–25°S) are shown in Table 1. The climatological ocean value (Table 1, bottom row) for GPCP is 3.13 mm day^{−1}, very close to the composite estimate of the multiple products. The estimated bias error for the tropical ocean area is nearly 8%, however, indicating a relatively wide spread among the relevant satellite products. The rightmost column in Table 1 shows bias error estimates from Adler et al. (2009), with a tropical ocean value of only 3%. This lower estimate is based on only three TRMM products, two of which are not independent, and therefore the 3% value is considered to be a lower bound on the error estimate (Adler et al. 2009). The 8% value of this study may be an upper bound, however. Although the three TRMM products have nearly identical sampling and also have uniform sampling of the diurnal cycle, many of the additional products in the collection in this study have various-time-of-day sampling and other factors (e.g., adjustment to atoll gauges) that may widen the spread. Over land, the calculated bias error estimate is about 4%–5%, close to that of the earlier, TRMM-based estimate. For the combined land + ocean tropical error estimate, the 7% value is again considered to be somewhat of an upper bound because the cancellation of errors is ignored when areas with different numbers of inputs are combined (in this case ocean and land). In summary, for the tropics the error estimates calculated for this paper are probably near an upper bound, and the actual errors are in the neighborhood of 6%–7% for ocean and ocean + land and a little lower for land by itself (5%). The lower value over land arises because some included products contain gauge information, even though in some tropical areas the quality of that information may be suspect.

The global calculation takes full advantage of the techniques developed for this paper. There are many different regions, both ocean and land, with different numbers of products accepted into the calculation, and the technique takes these into account. The results are shown in Table 2. As stated before, the error estimates may be an upper bound because we do not consider canceling effects when combining areas (ocean and land) with different numbers of input products. For this global calculation the composite estimates for both ocean and land are lower than the GPCP mean value. Over ocean this is mainly due to differences over the midlatitude ocean, where a number of the passive microwave products are thought to have a negative bias that results from most being developed and tested over tropical oceans. As we have already seen in section 4a(2), the estimated bias errors increase in percentage terms in going from the tropics to the mid- and high latitudes. So, when the global estimates are made, the error magnitudes increase by a few percent; for example, the global value rises to 9%. The land and ocean values also increase to 7% and 10%, respectively. As stated before, these values are considered to be upper bounds because of the necessity of piecing together areas with different numbers of products and taking the sum of the errors instead of the error of the sums. This approach omits the canceling effect of high and low estimates in different regions, which would otherwise reduce the overall bias error estimate. Even if the real *σ*/*μ* error bar is, say, 7% (lower than our calculation), adjustments of global precipitation estimates, especially GPCP, by 5% (e.g., Trenberth et al. 2007) are reasonable within the error estimates calculated here.
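The upper-bound argument can be illustrated with a small numerical sketch. It assumes, purely for illustration, that two regions are seen by the same products so that the error of the sum can be computed at all; the product values are made up, and area weights are omitted for brevity.

```python
import statistics


def sum_of_errors(region_a, region_b):
    """Add the per-region spreads among products (no cancellation allowed)."""
    return statistics.pstdev(region_a) + statistics.pstdev(region_b)


def error_of_sum(region_a, region_b):
    """Spread among products of the combined total (cancellation allowed)."""
    totals = [a + b for a, b in zip(region_a, region_b)]
    return statistics.pstdev(totals)


# Hypothetical regional means (mm/day) from three products.
# Products that are high in region A are low in region B, so errors cancel.
region_a = [2.0, 3.0, 4.0]
region_b = [5.0, 4.0, 3.0]

no_cancellation = sum_of_errors(region_a, region_b)
with_cancellation = error_of_sum(region_a, region_b)
```

In this toy case the combined totals are identical for all three products, so the error of the sum is zero while the sum of the errors is not. The two quantities coincide only when the product differences are perfectly aligned between regions.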

## 5. Summary and concluding remarks

Estimated bias errors are derived for mean precipitation by using a technique that employs multiple estimates from different algorithms, satellite sources, and merged products. The GPCP is used as a base product, and potential input products are screened out when they disagree with monthly GPCP estimates on a zonal-mean basis (ocean and land separately) by more than 50%. The results allow us to examine monthly climatologies and the annual climatology, producing maps of estimated bias errors, zonal-mean errors, and estimated errors over large areas such as ocean and land for both the tropics and for the globe.

For ocean areas, where there is the largest question as to absolute magnitude of precipitation, the analysis shows variations in the estimated errors, indicating some areas where we should be less confident of our mean precipitation estimates. Error estimates over the eastern Pacific Ocean are as large as 20%, as compared with 10%–15% in the western Pacific part of the ITCZ. Our calculations help to quantify this long-known spatial difference with confidence. Examining latitudinal differences over ocean clearly shows an increase in bias error estimate at higher latitudes, reaching up to 50%. Over land the error estimates also indicate potential locations of problems and the general cold-season problems at high latitudes.

The empirical technique to estimate area-average errors allows us to make error estimates for the tropics and for the globe (land and ocean separately, and combined). The estimated bias errors in this paper are considered to be upper bounds because errors of opposite sign are not allowed to cancel when integrating over different areas with different numbers of input products. Over the tropics, this calculation leads to larger error estimates than were found by Adler et al. (2009), using just TRMM data. The estimate for the tropics as a whole is 7% as compared with 3% from the earlier study. These upper and lower bounds indicate that the actual answer may be around 5%. For the globe the calculated error estimate from this paper is about 9%. Again this is considered to be an upper bound, and the actual error estimate may be closer to 7%. Combining this with the GPCP global mean gives a global precipitation estimate of 2.6 mm day^{−1} ± 7%.

As the differences between the two recent studies show, we have not determined a final answer. The procedures described here give one way of calculating estimated errors that are certainly useful in a relative sense (e.g., spatial variations) and give a usable error estimate for various water and energy balance studies. For the future we hope to apply this technique to provide users with the GPCP monthly gridded values and an accompanying estimated bias error. This would complement the random error estimates already in place that are based on Huffman (1997). Research will continue into refining this technique, validating the error estimates over a few locations, and, it is hoped, providing a solid bias error estimate.

## Acknowledgments

This research is supported under the NASA Energy and Water-Cycle Study (NEWS) program.

### APPENDIX

#### Estimation of Fitted Curves

The Gaussfit function in IDL is applied to estimate the fitting curves shown in Figs. 8–10. The function is a linear combination of a Gaussian function and a quadratic function; that is,

where *z* = (*X* + *b*_{1})/*b*_{2}. In Figs. 8–10, *Y* represents the fitted curves for the ratios between domain-mean bias error and real bias *σ* over either land (*R*_{l}) or ocean (*R*_{o}), and *X* denotes the size of domain or pseudosize. This fitting function can provide nonlinear least squares fits. The actual curve equations are given as follows: In Fig. 8,

where

and

where

In Fig. 9,

where

In Fig. 10,

where
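The functional form described at the start of this appendix, a Gaussian plus a quadratic, can be sketched as below. The sketch follows the form documented for the IDL GAUSSFIT function with six terms, written with the sign convention *z* = (*X* + *b*_{1})/*b*_{2} used here. The coefficient names `a0`, `c0`, `c1`, and `c2` are hypothetical, and the numerical values are illustrative only; they are not the fitted coefficients for Figs. 8–10.

```python
import math


def gauss_plus_quadratic(x, a0, b1, b2, c0, c1, c2):
    """Gaussian plus quadratic: a0*exp(-z^2/2) + c0 + c1*x + c2*x^2.

    z = (x + b1) / b2, matching the sign convention of this appendix.
    Coefficient names other than b1 and b2 are placeholders.
    """
    z = (x + b1) / b2
    return a0 * math.exp(-0.5 * z * z) + c0 + c1 * x + c2 * x * x


# Illustrative coefficients: a curve equal to 1 at pseudosize 1 that
# levels off at 2 for large domains (no linear or quadratic trend).
coeffs = dict(a0=-1.0, b1=-1.0, b2=50.0, c0=2.0, c1=0.0, c2=0.0)

ratio_one_box = gauss_plus_quadratic(1.0, **coeffs)
ratio_large_domain = gauss_plus_quadratic(1000.0, **coeffs)
```

With a negative Gaussian amplitude and no quadratic trend, the constant term plays the role of the asymptote, which is one way such a function can reproduce a ratio that starts at 1 and levels off for large domains.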