A new passive microwave rainfall retrieval algorithm for the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) that relies on an a priori database derived from matchups between TMI brightness temperatures and precipitation radar (PR)-derived surface rain rates has been developed. In addition to implementing a fairly conventional Bayesian approach to precipitation estimation, it exploits a dimensional reduction technique designed to increase the effective sample density in the database and also to improve the detectability of precipitation over problem surface types. The details of the algorithm itself are described in a companion paper. In this paper, the algorithm is validated against independent PR–TMI matchups from calendar year 2002. The validation results are benchmarked against results obtained for the same scenes from the current standard (version 7) 2A12 rainfall product for TRMM.
Validation statistics considered include the biases, correlation coefficients, and root-mean-square (RMS) differences for annual precipitation totals on a 1° grid as well as two-threshold Heidke skill scores (HSS) for instantaneous (pixel level) retrievals, determined separately for each of seven surface classes, including ocean, coast, and five other basic land surface types as well as for cold (<275 K) and warm surface skin temperatures.
Overall, the University of Wisconsin (UW) algorithm exhibits markedly reduced RMS error and bias in the annual total rainfall and markedly improved instantaneous skill at delineating light rain rates, especially over land and the coast. To ensure that the improved results were not due to both the training and validation data having been taken from the same calendar year, the validation of the UW algorithm is repeated using 2005 matchup data.
Beginning in the 1970s, algorithms began to be developed to retrieve surface precipitation from the observations of satellite passive microwave radiometers (Wilheit et al. 1977). Some of the literature relevant to the history of these efforts is cited in the companion paper by Petty and Li (2013). Our focus here is on precipitation retrievals using passive microwave data from the Tropical Rainfall Measuring Mission (TRMM; Kummerow et al. 1998, 2000), which was launched on 27 November 1997.
The TRMM mission was designed to measure surface precipitation equatorward of approximately 38° latitude using a combination of two microwave instruments: the TRMM Microwave Imager (TMI) and the precipitation radar (PR). Although the satellite was originally designed for a lifetime of only 3 years, an orbital boost from an altitude of 350 to 402 km in August 2001 reduced its onboard fuel consumption and greatly extended its mission life. More than 13 years later, it is still returning high-quality observations of the tropics and subtropics.
The PR measures microwave backscatter at 13.8 GHz from falling precipitation at 5-km ground resolution while being largely insensitive to cloud water and many other atmospheric variables. It is therefore considered by far the more reliable and direct of the two TRMM microwave instruments for estimating rainfall. But with a swath width of only 247 km, the area coverage and temporal sampling by the PR are not ideal, especially for observations of instantaneous precipitation in tropical cloud systems, including tropical cyclones.
The TMI has a significantly wider swath of 878 km, so that spatial and temporal sampling is improved by more than a factor of 3 relative to the PR. However, the problem of retrieving accurate surface precipitation rates from multifrequency passive microwave observations (nine channels ranging from 10.7 to 85.5 GHz in vertical and horizontal polarization) is less direct, is sensitive to a far wider range of environmental variables, and is especially susceptible to the effects of variable land surface emissivity (Wang et al. 2009).
The current standard TRMM precipitation product from the TMI is known as 2A12 version 7 (V7). The over-ocean component of 2A12 is based on the Goddard profiling algorithm (GPROF). GPROF is the product of a long evolution that began before the launch of TRMM (Kummerow et al. 1996, 2001, 2007, 2011). The history and limitations of the distinct overland component of 2A12 are summarized by Wang et al. (2009); it was long closely linked to the scattering-based retrieval formulations of Grody (1991) and Ferraro (1997) with refinements by McCollum and Ferraro (2003, 2005). Significant revisions to the land component were undertaken with version 7 of 2A12 (Gopalan et al. 2010), in part by incorporating a new PR–TMI matchup encompassing a wider range of climate regimes and precipitation types.
In this paper, we undertake the global validation of the new University of Wisconsin (UW) passive microwave precipitation rate algorithm described by Petty and Li (2013) for the TMI. To put the raw validation results for the new algorithm into a meaningful context, we apply the same validation procedures to the standard 2A12 version 7 product and compare the two products side by side.
Unique features of the UW algorithm include the following:
It uses a conceptually and computationally consistent retrieval framework for all surface types, including the ocean, coastlines, and land surface classes.
It exploits the dimensional reduction algorithm described by Petty (2013) to sharply increase the effective sample density in the a priori database; the same technique is also intended to improve the discrimination between rainfall and background noise (including spatial and temporal variability of surface emissivity).
Like GPROF, it is Bayesian in the sense that it relies on a prior statistical distribution of matchup data as the basis for retrieving the expected precipitation rate in a given scene; unlike GPROF (and 2A12), it also yields the posterior probability density function (PDF) of rain rate.
It is compact and efficient, able to process a complete TMI orbit in less than a minute on a desktop workstation.
A follow-on satellite mission with near-global coverage, the Global Precipitation Measurement (GPM) core observatory (Shepherd and Smith 2002; Smith et al. 2007; Hou et al. 2008), is currently scheduled for launch in 2014. The GPM Microwave Imager (GMI) will bring more channels to bear at similar and higher frequencies (10.65–183 GHz), and the new dual-frequency precipitation radar (DPR) will improve on the TRMM PR by adding the ability to distinguish the effects of variable mean drop size on radar backscatter.
As with TRMM, the GPM's active DPR will provide the highest-quality precipitation estimates over a narrow swath, while the passive GMI will provide the spatial and temporal sampling required for a wide range of meteorological, hydrological, and climatological applications. In addition, a number of passive-only sensors will be launched on other satellites in complementary orbits, with the GPM core observatory serving as a calibration reference.
While both the GPROF and UW algorithms have been developed for and tested on TMI data, both are under continued development for GPM and will eventually utilize extensive matchups between DPR and GMI in the same manner as for the current PR and TMI.
The official rainfall product for TMI, known as 2A12, is based primarily on the GPROF algorithm. Over water, it implements a nominally Bayesian approach to precipitation retrieval—that is, it searches a database of candidate solutions for approximate matches to the observation vector.
Building on work that began prior to the launch of TRMM (e.g., Kummerow et al. 1996), GPROF initially relied on a synthetic database derived from cloud model simulations and radiative transfer calculations (Kummerow et al. 2001). It has evolved since then: the latest version, which is identified with the product 2A12 V7, now relies partly on an observational database derived from matchups between the PR rainfall retrievals and the TMI brightness temperatures (Kummerow et al. 2011; Gopalan et al. 2010). In addition to its status as the official algorithm for TMI, GPROF has also been chosen as the standard algorithm for the constellation of passive microwave imagers that will soon be deployed as part of GPM.
A characteristic of all versions of GPROF for TMI over ocean is that they depend on adequately populating a nine-dimensional channel space with representative solutions and on correctly specifying match tolerances and/or channel weights in this nine-dimensional space. In addition, the current database is stratified by sea surface temperature and total precipitable water so as to reduce the solution subspace that needs to be searched as well as to help constrain the underdetermined retrieval problem.
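The database search described above can be illustrated with a minimal Python sketch that computes a likelihood-weighted mean rain rate over a small database. This is not the GPROF implementation: the function name `bayesian_rain_rate`, the Gaussian weighting, the per-channel tolerances `sigma`, and the two-channel toy values are all assumptions made for illustration.

```python
import math

def bayesian_rain_rate(obs_tb, database, sigma):
    """Likelihood-weighted mean rain rate over a database of candidates.

    obs_tb   : observed brightness temperatures (K), one per channel
    database : list of (brightness_temperatures, rain_rate) candidates
    sigma    : assumed match tolerance (K) for each channel
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for entry_tb, rain_rate in database:
        # Squared distance in channel space, scaled by the tolerances.
        d2 = sum(((o - e) / s) ** 2 for o, e, s in zip(obs_tb, entry_tb, sigma))
        weight = math.exp(-0.5 * d2)
        weighted_sum += weight * rain_rate
        weight_total += weight
    if weight_total == 0.0:
        return None  # no candidate lies close enough to the observation
    return weighted_sum / weight_total

# Toy two-channel database (hypothetical values, not real TMI data).
toy_database = [([280.0, 250.0], 0.0), ([260.0, 230.0], 5.0)]
toy_sigma = [2.0, 2.0]
midpoint_estimate = bayesian_rain_rate([270.0, 240.0], toy_database, toy_sigma)
```

In this toy case the observation lies midway between the two candidates, so both receive equal weight. The sketch also shows why sample density matters: an observation far from every candidate yields weights that underflow to zero and no estimate at all.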
Prior to version 7 of 2A12, the full Bayesian framework of GPROF was employed only over water. Over land, a reduced database was employed that was constrained to satisfy the empirical relationships of Grody (1991), Ferraro and Marks (1995), and McCollum and Ferraro (2003). In version 7, a new empirical database was constructed from 7 years of PR–TMI matchups, though problems were still noted with certain land surface types (Gopalan et al. 2010; Sudradjat et al. 2011).
Recognizing the problems posed by variable land surfaces and the need for a more flexible and physically robust retrieval framework over land, ongoing development of GPROF in preparation for GPM is branching into three variations (C. Kummerow 2012, personal communication):
A surface-blind version that requires no knowledge of local surface emissivity, only a mean and covariance of background brightness temperatures characteristic of a large but reasonably well-defined fraction of Earth's surface. This version seeks to identify precipitation signatures that are approximately orthogonal to the background variability. The methodology for this version is tentatively being based on the “pseudochannel” transformations of Petty (2013) and Petty and Li (2013).
A version that directly utilizes climatological knowledge of the local emissivity, including covariance, to optimize retrievals at that location (Aires et al. 2011).
A version that seeks to explicitly retrieve surface parameters associated with variable multichannel emissivity. This version would be invoked over surface types for which well-tested parametric emissivity models are available (Ferraro et al. 2013).
b. UW algorithm
The UW algorithm was recently developed as a standalone implementation of the surface-blind retrieval strategy described above. It utilizes empirical transformations designed to reduce the original nine TMI channels to three pseudochannels that retain most of the desired signature of precipitation while filtering, normalizing, and decorrelating as much as possible of the geophysical noise due to spatial and temporal background variations in emissivity and other variables. Such variations include land–water boundaries, local emissivity gradients, changes in surface moisture and temperature, and in some cases the presence or absence of snow cover.
The three pseudochannels then index into a preaveraged empirical database derived from a large number of resolution-matched TMI brightness temperatures and PR estimates of surface precipitation rate. The theoretical basis and technical details of the UW algorithm are given by Petty and Li (2013).
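The lookup described above can be sketched in a few lines of Python. The actual transformations and database layout are given by Petty and Li (2013); here the linear coefficients, the bin width, the function names, and the reduced problem size (three channels to two pseudochannels) are hypothetical choices made only to keep the example small.

```python
import math

def to_pseudochannels(tb, coeffs, offsets):
    """Linear transform: each pseudochannel is a weighted channel sum plus an offset."""
    return [sum(c * t for c, t in zip(row, tb)) + off
            for row, off in zip(coeffs, offsets)]

def bin_index(pseudo, bin_width):
    """Quantize pseudochannel values to a tuple of integer bin indices."""
    return tuple(math.floor(p / bin_width) for p in pseudo)

def build_database(samples, coeffs, offsets, bin_width):
    """Pre-average the matched radar rain rates falling in each bin."""
    sums = {}
    for tb, rain_rate in samples:
        key = bin_index(to_pseudochannels(tb, coeffs, offsets), bin_width)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + rain_rate, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}

def retrieve(tb, database, coeffs, offsets, bin_width):
    """Return the pre-averaged rain rate for the bin containing tb, or None."""
    key = bin_index(to_pseudochannels(tb, coeffs, offsets), bin_width)
    return database.get(key)

# Hypothetical coefficients: pseudochannel 1 is channel 1 itself,
# pseudochannel 2 is the difference between channels 2 and 3.
coeffs = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]
offsets = [0.0, 0.0]
samples = [([281.0, 250.0, 240.0], 0.0), ([283.0, 252.0, 241.0], 1.0)]
database = build_database(samples, coeffs, offsets, bin_width=5.0)
estimate = retrieve([282.0, 253.0, 242.0], database, coeffs, offsets, 5.0)
```

Because many matchups fall into each bin of the low-dimensional space, the per-bin average rests on far more samples than a search in the full channel space would provide.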
In the context of TRMM and GPM, the new algorithm serves several distinct purposes:
It is presented as a compact and efficient stand-alone algorithm for retrieving surface precipitation rate from TMI observations.
It serves as a test bed for conceptual improvements that can be incorporated into the surface-blind version of GPROF for GPM.
It offers a capability not offered by any previous operational passive microwave retrieval algorithm: the ability to provide not only a pixel-averaged precipitation rate but also the posterior PDF of possible rain rates, expressed as the fractional likelihood of precipitation exceeding a variety of intensity thresholds, including zero.
In support of the first two functions, the sole purpose of this paper is to evaluate the large-scale empirical performance of the new UW algorithm applied to TMI data in direct comparison to that of the latest available 2A12 product based on GPROF.
In the next section, we describe the datasets required both for the UW algorithm and for its validation and intercomparison with GPROF. Section 3 introduces and justifies the performance metrics that will be utilized in the intercomparison. Validation and intercomparison results are given in section 4, followed by conclusions in section 5.
a. Multiproduct matchups
The foundation for algorithm validation and intercomparison in this paper is time-, space-, and resolution-matched pixel-level rain rates for the full calendar year 2002 from three sources:
the standard TMI-derived surface rain-rate product based on GPROF known as 2A12 (version 7) and documented by Kummerow et al. (2011);
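the new TMI-derived surface rain-rate estimates based on the UW algorithm described by Petty and Li (2013); and
the PR-derived surface rain-rate product known as 2A25, which serves as the reference for both TMI products.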
Note that the express purpose of the UW algorithm is to attempt to reproduce the usually much more reliable radar-derived 2A25 values. It is understood that the radar retrievals are themselves subject to errors and limitations, especially in shallow, light, and/or frozen precipitation. Even if the UW passive microwave algorithm were perfectly successful according to the stated objective, it would reproduce those defects as well.
Half of the data (odd pixels) in the same year-long matchup record were previously used to populate the database for the UW algorithm, as described in the companion paper (Petty and Li 2013). The remainder (even pixels, approximately 189 million total) is utilized in the comparisons that follow for both TMI products. Multichannel brightness temperatures and rain rates are presumed to have sufficiently small spatial autocorrelation that we can treat the training and testing datasets as effectively independent, especially considering the very large numbers of pixels (hundreds to thousands or more) that are typically averaged into a single bin in the UW database.
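The odd/even partition can be expressed compactly. The sketch below assumes that pixels are numbered from 1 in the order they appear in the matchup record; the numbering convention and the function name `split_odd_even` are assumptions, not details taken from the algorithm description.

```python
def split_odd_even(pixels):
    """Split matchup pixels into training (odd) and validation (even) halves.

    Pixels are assumed to be numbered from 1, so the first pixel is odd.
    """
    training = pixels[0::2]    # pixels 1, 3, 5, ...
    validation = pixels[1::2]  # pixels 2, 4, 6, ...
    return training, validation

training, validation = split_odd_even(["p1", "p2", "p3", "p4", "p5"])
```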
As discussed in section 4e, we subsequently added the full calendar year of 2005 to the intercomparison so as to further rule out any possibility that the validation results described below were tainted by a lack of independence between the training and validation data.
b. Surface classification mask
The UW algorithm, while utilizing a common overall strategy for retrievals over both land and water, relies on different databases and different linear transformation coefficients for each of seven different surface classes. The intent is to optimize the detection of precipitation in the presence of the temporal and spatial brightness temperature variability characteristic of each particular surface class.
The classes were derived objectively from the year-long record of resolution-matched TMI brightness temperatures, using PR rain rates to exclude all pixels identified as having nonzero precipitation. Means and covariances of brightness temperatures were computed within each 1° grid box. An ad hoc algorithm was utilized to objectively classify all grid points based on a measure of pairwise similarity between the means and covariances. The threshold of similarity defining each class was adjusted so as to place most grid boxes into one of seven classes. A few residual grid boxes were classified manually.
The seven classes, which are defined on a 1° latitude/longitude grid, are numbered in order of decreasing geographic coverage and are depicted in Fig. 1. Approximate descriptive names, based on the observed geographic distribution of each class, are given in Table 1.
In addition to being necessary for the selection of the correct coefficients and database in the UW algorithm, the surface mask also provides a convenient basis for assessing the performance of both TMI algorithms over different surface types. In particular, coastal regions (Class 2), desert regions (Class 3), and portions of the Tibetan Plateau (Class 5) have all been long known to pose special problems for passive microwave rain-rate algorithms.
In the comparisons that follow, the ocean class (Class 0) is considered to apply not only to TMI pixels falling in grid boxes explicitly tagged with that class but also to pixels within the coastal grid boxes (Class 2) that are more than 50 km from significant land areas. Thus, the true geographic extent of Class 2 is significantly smaller than would be inferred from Fig. 1 alone.
c. Surface skin temperature
In addition to requiring the surface class, the UW algorithm further distinguishes between cold and warm surfaces, where the distinction between the two is defined by a surface skin temperature Tskin less than or greater than 275 K. The latter is determined from the European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim).
The purpose of this threshold is to distinguish between surfaces that might have surface snow or ice cover and those that are unlikely to be affected by snow or ice. The UW algorithm uses different coefficients and databases for these two cases in each surface class except the ocean.
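The selection of coefficients and database by surface class and skin temperature amounts to a simple lookup key. The following sketch is illustrative only: the function name `database_label` is hypothetical, the "w" and "c" suffixes follow the labels used in the figures, and the treatment of a skin temperature of exactly 275 K as warm is an assumption, since the text does not specify the boundary case.

```python
COLD_WARM_THRESHOLD_K = 275.0  # skin temperature separating cold from warm
OCEAN_CLASS = 0                # the ocean class is not split by temperature

def database_label(surface_class, t_skin_k):
    """Return a label such as '3w' or '5c' identifying which database to use."""
    if not 0 <= surface_class <= 6:
        raise ValueError("surface_class must be in the range 0-6")
    if surface_class == OCEAN_CLASS:
        return "0"
    # Assumption: exactly 275 K is treated as warm.
    suffix = "c" if t_skin_k < COLD_WARM_THRESHOLD_K else "w"
    return f"{surface_class}{suffix}"
```

For example, `database_label(3, 300.0)` selects the warm desert case, while `database_label(5, 260.0)` selects the cold case for Class 5.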
Once again, it is useful to evaluate the performance of both TMI products separately for cold and warm surfaces, as cold surfaces and especially snow cover have long been known to pose severe problems for passive microwave algorithms.
3. Basis for validation and intercomparison
We will be concerned here with several metrics of TMI algorithm performance with regard to their ability to mimic the reference 2A25 rain-rate product:
bivariate histograms and associated mean ratios, RMS differences, and correlations for 1° gridded annual totals of precipitation amount;
latitudinal profiles of annual total rainfall; and
the intrinsic (bias independent) skill of each TMI algorithm in delineating instantaneous 2A25 rain rates exceeding an arbitrary intensity threshold.
Each of the above is undertaken separately for each of the seven surface classes and, for land classes, for both “warm” and “cold” surfaces.
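For concreteness, the gridded annual-total statistics can be computed as in the sketch below. It assumes that the mean ratio is the ratio of the mean algorithm total to the mean reference total (rather than the mean of per-box ratios) and that the correlation is the ordinary Pearson coefficient; the function name `gridded_stats` is hypothetical.

```python
import math

def gridded_stats(algorithm, reference):
    """Mean ratio, RMS difference, and correlation for gridded annual totals.

    algorithm, reference : equal-length lists of annual totals (mm),
                           one value per 1-degree grid box
    """
    n = len(reference)
    mean_a = sum(algorithm) / n
    mean_r = sum(reference) / n
    mean_ratio = mean_a / mean_r  # assumption: ratio of means
    rms = math.sqrt(sum((a - r) ** 2 for a, r in zip(algorithm, reference)) / n)
    cov = sum((a - mean_a) * (r - mean_r) for a, r in zip(algorithm, reference))
    var_a = sum((a - mean_a) ** 2 for a in algorithm)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    correlation = cov / math.sqrt(var_a * var_r)
    return mean_ratio, rms, correlation

# Toy example: an algorithm that overestimates every box by 10%.
ratio, rms, corr = gridded_stats([110.0, 220.0, 330.0], [100.0, 200.0, 300.0])
```

The toy example shows why the three statistics are reported together: a purely multiplicative bias leaves the correlation at 1 while still producing a nonzero RMS difference.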
The first two sets of statistics listed above are standard and require no explanation. The third remains relatively unknown despite having been demonstrated by Conner and Petty (1998) and Wolff and Fisher (2009). We will devote some space below to reviewing that method, which entails computing the Heidke skill score (HSS; Heidke 1926; Doswell et al. 1990) for independently varying pairs of rain-rate thresholds.
Detection skill for arbitrary intensity thresholds
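The nominal purpose of the Heidke skill score is to characterize the skill of an algorithm or forecast with respect to a binary (i.e., yes/no) variable. Given a contingency table capturing the number of samples reflecting each of four possible outcomes (Table 2), the HSS is defined as

$$\mathrm{HSS} = \frac{2(AD - BC)}{(A + C)(C + D) + (A + B)(B + D)},$$

where A and D are the numbers of correct "yes" and correct "no" outcomes, respectively, and B and C are the numbers of the two kinds of incorrect outcome.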
The HSS yields a score of +1 for perfect skill, −1 for perfect negative skill (i.e., all incorrect), and 0 for outcomes indistinguishable from random chance after allowing for the relative proportion of “yes” and “no” in the validation dataset.
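A minimal Python sketch of the score follows. It uses the standard form of the HSS for a 2 × 2 contingency table and assumes that A and D count the correct "yes" and "no" outcomes while B and C count the two kinds of error; that assignment of letters to table cells, the function name, and the convention of returning 0 when the score is undefined are assumptions made for this illustration.

```python
def heidke_skill_score(a, b, c, d):
    """Heidke skill score from the four counts of a 2 x 2 contingency table.

    a : algorithm yes, validation yes (correct)
    b : algorithm yes, validation no  (incorrect)
    c : algorithm no,  validation yes (incorrect)
    d : algorithm no,  validation no  (correct)
    """
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    if denominator == 0:
        # The score is undefined when every sample falls in a single
        # correct cell; reporting 0 here is a convention of this sketch.
        return 0.0
    return 2.0 * (a * d - b * c) / denominator
```

With this form, a table containing only correct outcomes in both categories gives +1, a table containing only incorrect outcomes in equal numbers gives −1, and a table with equal counts in all four cells gives 0.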
In the present context, a typical application of the HSS would be to determine the success of an algorithm at determining rain versus no-rain as compared to an independent validation source such as radar. But as demonstrated by Conner and Petty (1998), we may generalize this application to evaluate the skill of an algorithm at distinguishing rain rates exceeding any arbitrary threshold of intensity R1,0 in the validation rain-rate R1. The skill score will be a function of that threshold.
This approach is especially useful for establishing the effective lower limit of detectability of precipitation over troublesome surface types. Since the explicit aim of the UW algorithm formulation is to markedly improve that detectability relative to previous passive microwave rainfall retrieval products, determinations of maximum HSS as a function of the variable 2A25 rain-rate threshold are ideal for evaluating those improvements.
Moreover, it may prove that a different threshold R2,0 applied to the algorithm rain-rate R2 may yield the best skill. Significantly, the maximum skill achievable using any threshold R2,0 for distinguishing validation rain rates greater than R1,0 will be invariant with respect to monotonic (not necessarily linear) scalings of the algorithm's retrieved values. It is therefore independent of calibration bias and can be understood as a measure of intrinsic algorithm skill, unlike RMS error.
Beginning with a scatterplot or bivariate histogram of matchups between algorithm and validation data, any pair of rain rates (R1, R2) divides the cloud of data into four quadrants. The number of points falling within each quadrant corresponds to one of the entries A, B, C, and D in the contingency table.
As illustrated in Fig. 2, every such pair falling within the data cloud thus has an associated HSS. These values may be contoured to depict the consequences of choosing one threshold in the algorithm rain rate to delineate validation rain rates exceeding a different threshold.
For any given validation rain-rate threshold R1, we may then determine two performance metrics for the algorithm:
the maximum skill Hmax(R1) achieved by the algorithm for any threshold R2; and
the algorithm threshold R2,max(R1) for which the above maximum skill is achieved.
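The two metrics can be obtained by scanning candidate algorithm thresholds for a fixed validation threshold, as in the sketch below. The function names, the use of a strict "greater than" comparison, the assignment of A and D to the correct outcomes, and the toy rain rates are assumptions for illustration; an operational implementation would work from the bivariate histogram rather than from individual pixels.

```python
def heidke_skill_score(a, b, c, d):
    """Standard HSS; a and d are the correct outcomes, b and c the errors."""
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return 0.0 if denominator == 0 else 2.0 * (a * d - b * c) / denominator

def contingency(validation, algorithm, r1, r2):
    """Count the four quadrants defined by thresholds r1 (validation) and r2 (algorithm)."""
    a = b = c = d = 0
    for v, g in zip(validation, algorithm):
        if g > r2 and v > r1:
            a += 1  # both exceed their thresholds
        elif g > r2:
            b += 1  # algorithm exceeds, validation does not
        elif v > r1:
            c += 1  # validation exceeds, algorithm does not
        else:
            d += 1  # neither exceeds
    return a, b, c, d

def max_skill(validation, algorithm, r1, candidate_r2):
    """Return (Hmax, R2max): the best HSS and the algorithm threshold achieving it."""
    best_h, best_r2 = None, None
    for r2 in candidate_r2:
        h = heidke_skill_score(*contingency(validation, algorithm, r1, r2))
        if best_h is None or h > best_h:
            best_h, best_r2 = h, r2
    return best_h, best_r2

# Toy matchups (mm/h, hypothetical values).
validation = [0.0, 0.0, 0.5, 1.0, 2.0, 4.0]
algorithm = [0.0, 0.2, 0.1, 0.8, 1.5, 3.0]
thresholds = [0.0, 0.05, 0.15, 0.5]
h_max, r2_max = max_skill(validation, algorithm, 0.0, thresholds)

# A monotonic rescaling of the algorithm values (and of the candidate
# thresholds) leaves the maximum skill unchanged.
rescaled = [10.0 * g * g for g in algorithm]
rescaled_thresholds = [10.0 * r * r for r in thresholds]
h_max_rescaled, _ = max_skill(validation, rescaled, 0.0, rescaled_thresholds)
```

The final lines illustrate the invariance noted above: because a monotonic rescaling preserves the ordering of the algorithm rain rates, every contingency table, and hence the maximum skill, is unchanged.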
a. Global annual rainfall maps
1) Difference maps
Figure 3 depicts the total annual precipitation for 2002 derived from the three products. Visually, they are similar, except that the radar-derived 2A25 product shows regions off the western coasts of South America and Africa as well as over Egypt and Libya that are considerably drier (less than 5-mm annual total) than the passive microwave-based counterparts. Note, however, that there are large portions of the North African desert and the Arabian Peninsula for which annual values reported by 2A12 are exactly zero.
Maps of the differences between the two TMI products and the reference 2A25 product (Fig. 4) reveal other differences in the degree to which 2A12 and UW accurately reproduce the latter. In particular, 2A12 exhibits large areas of overestimation by at least 300 mm in the Pacific intertropical convergence zone (ITCZ) and South Pacific convergence zone (SPCZ) as well as the South Atlantic and North Pacific midlatitude storm tracks. There is also strong overestimation by 2A12 along the Himalayan mountain range and parts of south-central China.
Large negative biases by 2A12 are noted throughout much of the eastern Indian Ocean and over the so-called Maritime Continent composed of Indonesia, the Philippines, and other island landmasses.
These large biases are sharply reduced over large areas in the UW retrievals, especially over ocean areas. At a few specific locations—for example, in North Africa near 15°N, 20°E—localized positive biases of 100–200 mm in the UW retrieval are noted that were not present in the 2A12 or 2A25 retrievals. For most other desert regions, the UW results show reduced bias.
Both passive microwave algorithms exhibit similar positive biases relative to 2A25 in central Africa and similar negative biases in northern South America.
2) Ratio maps
While Fig. 4 depicts absolute biases that naturally tend to be largest in the areas of heaviest rainfall, it gives little insight into relative (i.e., multiplicative) biases in drier regions. For that we examine the ratio of the passive microwave retrievals to the 2A25, as depicted in Fig. 5. Both algorithms tend to exhibit positive biases of at least a factor of 2 in the driest areas of the subtropical high zones over both water and land. The most notable exception is a pronounced negative bias for 2A12 over significant portions of the Sahara Desert and Saudi Arabian peninsula—as noted, 2A12 reported exactly zero for many of these locations (indicated here in white owing to the logarithmic color scale).
It must be emphasized that the above biases are all relative to another remote sensing product that itself undoubtedly has systematic regional biases with respect to the true rainfall.
1) Annual totals
Using the same gridded annual rainfall results as previously depicted in Fig. 3, bivariate histograms of ocean-only values from the two TMI algorithms versus those from 2A25 are shown in Fig. 6. The 2A12 rainfall is seen to have a slight overall positive bias of about 4% over the ocean, while UW has no overall bias. The absence of bias in UW is not surprising, as the a priori database used in the retrievals is statistically similar to that used in the validation.
More striking is the substantially smaller scatter in the UW results. The RMS error for UW is a full factor of 2 lower than that for 2A12, and the correlation coefficient improves from 0.96 to 0.99.
These large differences in performance are especially unexpected over the ocean, as the improvements embodied in UW were largely developed with the special problems posed by land surfaces in mind. Both 2A12 and UW are fundamentally Bayesian algorithms relying on 2A25 rain rates to populate their databases, so major differences in performance are presumably traceable to the most important difference between the two: the reliance of UW on a transformation of the original nine TMI channels to three quasi-independent pseudochannels from which background variability has been largely removed.
Latitudinal profiles of total annual rainfall are shown in Fig. 7. It is apparent that 2A12 exhibits modest latitude-dependent biases relative to 2A25, while the UW results are in almost perfect agreement with 2A25 at all latitudes. Again, the lack of global bias is not unexpected, but the ability to give unbiased results across several climate zones suggests that the algorithm is successfully rejecting extraneous radiometric signatures and responding primarily to those factors most directly related to precipitation.
2) Instantaneous retrievals
We now consider each algorithm's ability to retrieve instantaneous precipitation rates consistent with the 2A25 rain rates. Our starting point lies in the bivariate histograms depicted for both algorithms in Fig. 8.
Both algorithms are seen to be capable of large errors on an instantaneous basis—for example, matchups for which 2A25 occasionally gives rain rates of 10 mm h−1 or more where the TMI algorithm gives zero and vice versa. The differences between the two algorithms do not appear to be large on that count. Otherwise, both algorithms reveal a fairly similar—and symmetric—clustering of points about the 1:1 line.
Contours of HSS help reveal fairly subtle differences between the two instantaneous rainfall products. Both algorithms exhibit nearly identical skill (HSS = 0.66) in delineating all nonzero 2A25 rain rates (Fig. 9), albeit with different optimal thresholds applied to the respective TMI algorithms: 0.13 mm h−1 for 2A12 and 0.08 mm h−1 for UW. The skill diverges at higher rain rates—0.70 at 1.0 mm h−1 for 2A12 and 0.76 for UW.
c. Warm land surfaces
Now we will examine results for each of the six land surface classes under the condition that the ERA-Interim surface skin temperature Tskin is greater than 275 K. The latter condition ensures that the scenes under consideration are likely to be free of surface snow cover, thus eliminating a major source of potential error in the passive microwave estimation of precipitation. In all relevant figures, the class will appear as 1w–6w, with the “w” suffix indicating the warm condition.
1) Annual totals
Figures 10 and 11 depict scatterplots and latitudinal profiles, respectively, of annual total precipitation retrieved by the two TMI algorithms in comparison to the reference 2A25 rainfall product. To facilitate side-by-side comparison, results for 2A12 appear in the left column and for UW in the right column of each figure.
Even a cursory perusal of Fig. 10 reveals that for all six surface classes, there is a significant reduction in the scatter exhibited by UW in comparison to the standard 2A12 product. For example, for Class 1w (Figs. 10a,b), which is by far the largest and most generic land class observed by TMI, there is nearly a factor-of-2 decrease in the RMS error and an increase in the correlation coefficient from 0.73 to 0.91. Moreover, 2A12 overestimates rainfall for this class by an average of 9%.
While for both products in Class 1w there are instances of relatively dry grid boxes (less than 1000-mm precipitation, as inferred from 2A25) for which the TMI algorithm retrieves well in excess of 2000 mm, these cases are vastly more numerous for 2A12. Referring to Figs. 3a and 3b, we can trace many of these cases for 2A12 to a mountainous region on the eastern edge of the Tibetan Plateau in the vicinity of 30°N, 100°E. By comparison, UW agrees well with 2A25 (see also Figs. 4 and 5) at those same locations.
More generally, 2A12 reveals wide variation in the average bias for different classes, ranging from a mean ratio as low as 0.68 in the coastal class (2w) to as high as 8.56 in Class 6w. For UW the range is much smaller, generally hovering around the desirable ratio of 1, with a minor exception (0.67) for Class 6w.
Perhaps the most dramatic differences occur for Classes 5w and 6w, the rarest and most troublesome surface classes, which include portions of the Tibetan Plateau and the Himalayas. For Class 5w, the correlation coefficient for 2A12 is essentially zero (−0.05), while for UW it jumps to 0.83. The corresponding RMS error drops from 676 to only 85 mm. In Class 6w, 2A12 exhibits a reasonable correlation coefficient but also the aforementioned massive overestimate, leading to an RMS error of 4221 mm. For UW, the correlation declines markedly, presumably owing to a couple of outliers in this small sample, but the bias and RMS error are greatly improved.
Note that the PR retrievals themselves may have problems in the above regions (see, e.g., Fu and Liu 2007). We reiterate that the goal of the TMI algorithm is to reproduce the 2A25 retrievals, even if this may mean that it reproduces deficient performance in certain cases.
Figure 11 confirms the previous observations about the mean biases of each algorithm. The 2A12 overestimates slightly for 1w, underestimates more significantly for 2w, and overestimates severely for 5w and 6w. By comparison, the latitudinal profiles for UW are essentially unbiased over the applicable range of latitudes, including traditionally difficult cases such as coasts (Class 2w), deserts (3w), and high terrain (5w and 6w).
2) Instantaneous retrievals
Figure 12 depicts bivariate histograms of TMI versus 2A25 instantaneous (single pixel) rain rates. The left column is the 2A12 product; the right column is UW.
The most obvious property of the 2A12 product for these land surface classes is a strong tendency to retrieve either zero rain rate or else a rain rate in excess of several tenths of a millimeter per hour. This tendency is weakest for Class 2w, for which 2A12 most likely falls back on the ocean retrieval algorithm for many pixels, and strongest for 5w, for which extremely few pixels return rain rates greater than 0 but less than 1 mm h−1. The high threshold for nonzero rain rate in this last case is presumably a by-product of a screening formula (Gopalan et al. 2010) that attempts to minimize false rain signatures. By comparison, the UW algorithm returns a far larger number of low rain rates for all six classes.
The maximum skill of each algorithm at delineating 2A25 precipitation rates exceeding a given threshold is plotted in Fig. 13. For rain rates in excess of a few millimeters per hour, the difference in skill between the two algorithms is minor, but for lower rain rates UW consistently exhibits superior discrimination skill. The difference in skill is smallest for ordinary vegetated surfaces (Classes 1w and 4w) and greatest for coastal zones (2w), deserts (3w), and mountainous regions and plateaus (5w and 6w). Indeed, 2A12 exhibits almost no skill for Class 5w (the very high skill exhibited by UW for rain rates in excess of 1 mm h−1 is a statistical fluke due to an isolated cluster of points on the 1:1 line in Fig. 12j).
Of considerable interest when plotting instantaneous swath-level rain rates is the passive microwave (PMW) algorithm rain-rate threshold Rthresh, which optimally delineates all nonzero rain rates as determined from the radar-derived 2A25 product. These values correspond to the intersection of the heavy black curves in Fig. 12 with the vertical axis on the left. These threshold values for various surface classes, along with the associated skill, are listed in Table 3.
For some classes, the optimum threshold for UW is nonunique; that is, the same delineation skill results from any algorithm threshold between 0.01 and ~0.1 mm h−1. This is a consequence of the algorithm returning very few rain rates within that range.
Otherwise, over warm land surfaces (Classes 1w–6w), the optimum algorithm thresholds range from 0.20 to 0.32 mm h−1 for 2A12 and from 0.08 to 0.16 mm h−1 for UW. Skill scores are in all cases significantly greater for UW. The largest improvement relative to 2A12 is seen for Class 3w, comprising desert surfaces.
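The search for an optimum algorithm threshold can be sketched as below. The example uses the standard 2 × 2 contingency-table form of the Heidke skill score and scans a set of candidate PMW thresholds for the one that best delineates reference rain rates above a fixed reference threshold. The function names, the candidate grid, and the toy data are assumptions for illustration; this is not the authors' code.

```python
import numpy as np

def heidke_skill_score(pred_yes, obs_yes):
    """HSS from boolean arrays of predicted and observed events."""
    pred_yes = np.asarray(pred_yes, dtype=bool)
    obs_yes = np.asarray(obs_yes, dtype=bool)
    a = float(np.sum(pred_yes & obs_yes))     # hits
    b = float(np.sum(pred_yes & ~obs_yes))    # false alarms
    c = float(np.sum(~pred_yes & obs_yes))    # misses
    d = float(np.sum(~pred_yes & ~obs_yes))   # correct negatives
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    if denom == 0.0:
        return 0.0  # degenerate table: no skill can be defined
    return 2.0 * (a * d - b * c) / denom

def best_threshold(pmw_rate, ref_rate, ref_threshold=0.0, candidates=None):
    """Return (threshold, HSS) maximizing skill over the candidates."""
    pmw_rate = np.asarray(pmw_rate, dtype=float)
    ref_rate = np.asarray(ref_rate, dtype=float)
    if candidates is None:
        candidates = np.logspace(-2, 1, 61)   # assumed grid, 0.01-10 mm/h
    obs_yes = ref_rate > ref_threshold
    scores = np.array([heidke_skill_score(pmw_rate > t, obs_yes)
                       for t in candidates])
    i = int(np.argmax(scores))                # first maximum if several tie
    return float(candidates[i]), float(scores[i])

# Toy matchups (mm/h).
ref = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 5.0])
pmw = np.array([0.0, 0.05, 0.05, 0.3, 1.0, 4.0])
r_thresh, skill = best_threshold(pmw, ref, candidates=[0.01, 0.1, 1.0])
```

Because `np.argmax` returns the first maximum, a range of thresholds that yield identical skill is reported as its lowest member; a nonunique optimum would have to be detected by inspecting the full array of scores.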
d. Cold land surfaces
Cold surfaces pose a multitude of challenges, all of which are likely to degrade the performance of any PMW algorithm:
Even a bare surface exhibits reduced brightness temperatures, challenging algorithms that rely on detecting depressed brightness temperatures to identify precipitation in progress from the ice scattering signature aloft.
The surface may also be snow covered, with a further sharp reduction in surface emissivity that is especially difficult to distinguish from precipitation in progress.
At many locations, cold surfaces occur only rarely, so the observational database used in the retrieval may be very sparsely populated even when trained on a year's worth of data.
It was an explicit objective to design the UW algorithm to be as robust as possible in the presence of the above complications. We therefore wish to evaluate algorithm performance separately for cold surfaces and determine whether significant improvements are noted with respect to the standard 2A12 product. In all relevant figures, these classes appear as 1c–6c, with the “c” suffix indicating the cold condition.
1) Annual totals
Because of limited space, scatterplots of the annual totals on a 1° grid are shown in Fig. 14 only for the largest and most important class, that of vegetated land (Class 1c). The corresponding latitudinal profiles are shown in Fig. 15. Key validation statistics for all surface classes (1c–6c) are tabulated in the last six rows of Table 4.
Generally, 2A12 exhibits a tendency toward large biases, large RMS errors, and poor correlations over cold land surfaces. For example, in Class 1c, 2A12 retrieves approximately 3 times as much precipitation as 2A25 and does so with a spatial correlation of only 0.24. The latitudinal profile in Fig. 15 reveals that the bias has a large latitudinal dependence. For UW, the ratio is 1.11, the correlation is 0.87, and there is very little latitudinal dependence.
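The three statistics quoted here can be computed from a pair of gridded annual totals along the following lines. This is a minimal sketch that assumes the ratio is the quotient of the summed totals and that grid boxes outside the class are marked as NaN; the precise definitions behind Table 4 are not restated in this section, and the function name `annual_total_stats` is hypothetical.

```python
import numpy as np

def annual_total_stats(est, ref):
    """Return (ratio, RMS difference, correlation) over valid grid boxes."""
    est = np.asarray(est, dtype=float).ravel()
    ref = np.asarray(ref, dtype=float).ravel()
    ok = np.isfinite(est) & np.isfinite(ref)   # boxes valid in both fields
    est, ref = est[ok], ref[ok]
    ratio = est.sum() / ref.sum()              # assumed: ratio of totals
    rms = np.sqrt(np.mean((est - ref) ** 2))
    corr = np.corrcoef(est, ref)[0, 1]         # Pearson spatial correlation
    return ratio, rms, corr

# Toy annual totals (mm) on four grid boxes; the last is masked in ref.
ref = np.array([100.0, 200.0, 300.0, np.nan])
est = np.array([110.0, 220.0, 330.0, 50.0])
ratio, rms, corr = annual_total_stats(est, ref)
```

In this toy case the estimate is a uniform 10% overestimate, so the ratio is 1.1 and the correlation is 1 even though the RMS difference is nonzero, which illustrates why the three statistics are reported together.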
The worst results for 2A12 are found for Class 4c, corresponding to the seemingly strange combination of land class 4 (nominally near-equatorial forests; see Fig. 1) with ERA-analyzed surface skin temperatures less than 275 K. Closer examination revealed that Class 4 includes a very small number of grid boxes in mountainous, possibly forested, terrain near 30°N, 97°E (south-central China) that were evidently classified together with rain forest based on their mean radiometric characteristics. These locations do experience subfreezing temperatures and possible snowfall and are responsible for the small number of matchups classified as 4c.
For every cold surface class, ratios, RMS errors, and correlations are superior, often markedly so, for the UW retrievals. For all except 3c and 6c, the UW ratios are within 20% of unity, and even for those two cases, the RMS errors and correlations are greatly improved relative to 2A12.
Recall that Classes 2c, 3c, and 4c all involve rather rare combinations of land surface class and cold analyzed surface skin temperature. The validation statistics cited for these classes are therefore not necessarily expected to be robust. Indeed, as discussed in a later section, they change markedly when reevaluated for a different validation year.
2) Instantaneous retrievals
As was done for the warm surface classes, we now examine the instantaneous skill of each TMI algorithm at discriminating precipitation over cold surfaces. Histograms are given for Class 1c only in Fig. 16, skill score plots are given in Fig. 17, and optimum thresholds for discriminating nonzero precipitation rates appear in the final six rows of Table 3.
For the largest surface class, 1c, the 2A12 algorithm is notable for returning essentially no nonzero precipitation rates less than about 1 mm h−1. Moreover, the discrimination skill is uniformly at or below 0.1 for all intensity thresholds (Fig. 17a). Similar or even poorer performance is noted for most other surface classes except 2c. In particular, discrimination skill is almost indistinguishable from zero for Classes 3c (cold desert), 5c (high cold plateau), and 6c (Himalayas).
In all cases the UW algorithm exhibits markedly improved discrimination skill, although it remains significantly poorer than for the warm surface classes (cf. Fig. 13).
e. 2005 results
As noted in section 2a, the database for the UW algorithm was derived from half of the available matchup pixels in calendar year 2002, with the other half being utilized for the above validation and intercomparison. By comparison, 2A12 utilized matchups from 1 June 1999 to 31 May 2000 for its database (Kummerow et al. 2011).
After the intercomparison of both algorithms for 2002 had been completed and had revealed consistent and significant differences in performance as described above, we elected to repeat the validation, for UW only, using the entire calendar year of 2005 so as to address any concerns about the true independence of the training and testing datasets.
Validation statistics for 1° gridded annual rainfall appear as added columns in Table 4. These statistics reveal that for ocean retrievals and for warm land surfaces, there are only very minor changes in mean ratios, RMS errors, and spatial correlation coefficients relative to the 2002 results for UW. Thus, a lack of independence is not responsible for the significant performance differences between 2A12 and UW over these surfaces in 2002.
Over cold surfaces (Classes 1c–6c) it is a different story, with substantial degradations in certain statistics relative to the 2002 results, especially for 2c (coast) and 4c (heavy forest). Recall that the 4c case in particular is associated with a very few isolated grid boxes for which the training data sample was sparse to begin with. Nevertheless, the mean ratios, RMS errors, and, in all but two cases (2c and 5c), correlation coefficients remain significantly better than their counterparts for 2A12.
In this paper, we have undertaken a global validation and intercomparison of two retrieval algorithms for TMI, the first being the standard 2A12 V7 product for TRMM based on GPROF and the second being the new UW algorithm described by Petty and Li (2013). Specifically, using 50% of a full calendar year's worth of matchups (the other 50% having been used to populate the UW database), we evaluated the ability of each algorithm to retrieve (i) annual total precipitation on a 1° grid and (ii) instantaneous pixel-level rain rates, in both cases using the PR-derived 2A25 product as “truth.”
In each of seven surface classes, including ocean, coast, and five other basic land surface types, the UW algorithm showed markedly reduced RMS error and bias in the case of the annual total rainfall and markedly improved instantaneous skill at delineating light rain rates. To ensure that the improved results were not due to both the training and validation data having been taken from the same calendar year of 2002 (albeit from alternating pixels), the validation was repeated for UW only using independent matchups from a completely different calendar year, 2005. In all but a few cases involving anomalous combinations of surface class and temperature, the algorithm performance remained about the same.
Not only does the new algorithm yield significant performance improvements, especially over several different troublesome surface types (Sudradjat et al. 2011), but it does so using a compact, simple, and efficient Bayesian retrieval framework that does not even rely on traditional surface screening logic. Moreover, as discussed by Petty and Li (2013), the UW algorithm provides uniquely detailed information concerning the posterior PDF of rain rate in a given pixel.
In its current form, the UW algorithm retrieves only the surface precipitation rate, unlike GPROF, which attempts to retrieve hydrometeor profiles as well. It would be conceptually straightforward to extend the UW algorithm to retrieve hydrometeor profiles, but this is not currently a priority, in part because of the much smaller number of usable PR–TMI matchups available for this purpose (only near-nadir PR profiles can be easily utilized).
Further development of the UW algorithm will focus on adapting both the methodology and the PR–TMI database to GPM prior to the launch of the core observatory in 2014. Postlaunch development will include rederiving transformations and the a priori database from real matchups between GMI brightness temperatures and the dual-frequency precipitation radar. It seems likely that higher-frequency GMI channels near 166 and/or 183 GHz will provide additional physical information and lead to further improvements in performance relative to the TRMM Microwave Imager.
We thank Dr. F. Joseph Turk and two anonymous reviewers for comments that led to significant improvements in this paper. TRMM 2A12 and 2A25 data were obtained from the GSFC Precipitation Processing System (PPS). This work is supported by NASA Grant NNX10AGAH69G.