The current emphasis on climate change has motivated an interest in lengthy time series of satellite-derived geophysical parameters in order to better understand the degree of variability inherent to the global climate system. One parameter of great interest is rainfall and its global distribution on a month-to-month and even a day-to-day basis. Satellite retrieval of rainfall is now a science over 25 years old, although it has only recently emerged as a widely accepted technique since the organization of the Global Precipitation Climatology Project (WMO/ICSU, 1990). This project followed on the heels of the launch of the first Special Sensor Microwave/Imager (SSM/I) instrument on Defense Meteorological Satellite Program (DMSP) space platforms beginning in July 1987 (CalVal 1989; Hollinger et al. 1990) and the availability of global datasets derived from the Geostationary Precipitation Index (GPI). The GPI is a relatively simple infrared (IR) algorithm that is largely designed for applications with geosynchronous satellite measurements enabling full sampling of the diurnal cycle; see Arkin (1988) and Joyce and Arkin (1997). Today, the two mainstay approaches for estimating global rainfall from satellites are the Arkin and Meisner (1987) GPI algorithm, and various passive microwave (PMW) algorithms such as those included in this study. Smith et al. (1994c), Wilheit et al. (1994), Liberti (1995), and this paper review the design of a number of modern PMW techniques; Petty (1995) and Petty and Krajewski (1996) have conducted focused reviews on current advances in land-based PMW retrieval.
In the last few years, various precipitation algorithm intercomparison projects have been conducted in order to assess the state of the art and the degree of accuracy now possible with satellite-based methods. The Global Precipitation Climatology Project (GPCP), an arm of the World Climate Research Programme (WCRP), has coordinated three such projects, referred to as Algorithm Intercomparison Projects AIP-1, -2, and -3. Arkin and Xie (1994) and Ebert (1996) have provided overviews of the AIP studies. The NASA WetNet Project has also coordinated three projects referred to as Precipitation Intercomparison Projects PIP-1, -2, and -3. Dodge and Goodman (1994) and Barrett et al. (1994a) provide overviews of the WetNet program and the PIP projects.
The goals of PIP-2 (which is the focus of attention in this study) are 1) to improve the performance and accuracy of different SSM/I algorithms at full-resolution, instantaneous scales by seeking a better understanding of the relationship between microphysical signatures in the PMW measurements and physical laws employed in the algorithms; 2) to evaluate the pros and cons of current individual algorithms and algorithm techniques (subsystems) and to seek combination algorithms that exploit the best subsystem features of the many individual algorithms; and 3) to demonstrate that PMW algorithms generate credible instantaneous rain-rate estimates.
Notably, the AIP and PIP intercomparison projects have had a distinct set of goals (with some overlap), while each of the individual intercomparisons conducted by the two projects has had a different focus. Three of the AIP–PIP projects have been completed and published in some form; these are AIP-1, AIP-2, and PIP-1. The calculations and analysis for PIP-2 and AIP-3 have been completed, with the final results for both of these projects being presented in this special issue. PIP-3 is starting into its last phase; the calculations and analysis are complete with the scientific reporting left to be done (R. Adler 1997, personal communication). This paper summarizes the results of PIP-2, while the paper of Ebert and Manton (1998) summarizes the AIP-3 results. Extended reports by Smith et al. (1995), Ebert and Manton (1995), and Ebert (1996) provide more details on the PIP-2 and AIP-3 calculations.1 For those interested in the main findings of PIP-2, it is only necessary to read the abstract and sections 1 (Introduction), 2 (PIP/AIP Overview), and 5 (Conclusions) of this paper. Section 3 discusses the organizational and methodological details of PIP-2, while section 4 discusses the detailed analysis of the results. The remaining papers in this issue address different aspects of PIP-2 and other PMW precipitation retrieval topics.
Although both the AIP and PIP projects have had the common objective of improving satellite algorithms through intercomparison, the underlying goal of the GPCP project has been to use ground-based rain gauge and radar data to directly “validate” satellite algorithms (Allam et al. 1993; Arkin and Xie 1994; Ebert et al. 1996). Here validation denotes quantifying algorithm error in terms of bias and rms differences with respect to the ground measurements. The AIP view is that ground data provide the best possible means to absolutely “calibrate” satellite algorithms (in spite of imperfections) and should be merged with the retrievals to produce optimal global monthly averaged rain maps; see Huffman et al. (1995), Huffman et al. (1997), and Xie and Arkin (1997). In this context, the AIP projects have been conducted over regional areas with differing atmospheric conditions and with readily available wide-area coverage “ground truth” measurements. Although the AIP designers recognize the uncertainties in ground measurements and the ambiguity in referring to ground measurements as truth, they pursue the direct validation approach to ensure consistency in the merged global datasets.
The WetNet PIP projects have taken a different view, concentrating on determining how much and why different PMW algorithms differ among themselves, as well as from conventional ground measurements, while seeking an absolute calibration through physical modeling. It is the ultimate goal of PIP-2 to reduce uncertainty through merger of the best features of individual PMW algorithms, using checks against ground measurements to shed light on the uncertainty problem but without using the associated differences as measures of uncertainty. The PIP projects are designed to examine datasets over the entire globe. PIP-1 and PIP-3 involve retrievals of monthly rain accumulations over global grid meshes, while PIP-2 considers instantaneous rain rates of numerous individual case studies distributed over the globe. Although conventional ground data have been incorporated into the PIP intercomparisons in the context of validation, the PIP philosophy disavows the notion that ground data are accurate enough to serve as a final calibration standard.
In order for ground measurements to be used as an absolute calibration reference, they would have to be specially calibrated and processed along rigorous scientific lines. This is not how the data paths of operational ground rain measuring systems are treated, whether they involve rain gauge accumulations or radar reflectivities; see, for example, Dai et al. (1997). There have been attempts to implement calibration-quality ground radar systems on a regional scale, such as the two TOGA COARE 5-cm shipboard radars discussed in the AIP-3 paper of Ebert and Manton (1998), although major uncertainties remain in those radar retrievals precluding their utility as a calibration reference (see Short et al. 1997). An attempt to acquire calibration-quality ground validation data at multiple radar sites spread over the globe will be carried out in 1998 as part of the ground validation system supporting the United States–Japan Tropical Rainfall Measuring Mission (TRMM). The major components of this system consist of four volume scan radars at Kwajalein Atoll (Marshall Islands), Darwin (Australia), Houston (Texas), and Melbourne (Florida). However, these four radars only view 0.055% of the earth’s surface and are not expected by the TRMM Science Team to provide an absolute calibration standard since conversion of radar reflectivities to rain rates, even with perfect calibration, is in itself an underdetermined remote sensing problem; see Cerro et al. (1997) and Yuter and Houze (1997).
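The underdetermined nature of the reflectivity-to-rain-rate conversion can be illustrated with a short sketch. The two Z–R relations below (Marshall–Palmer, and a commonly cited tropical relation) are standard textbook parameter pairs, not the relations used by any radar referenced in this paper; the point is only that the same echo maps to substantially different rain rates depending on the assumed drop size distribution.

```python
# Commonly cited Z-R relations of the form Z = a * R**b,
# with Z in mm^6 m^-3 and R in mm/h. Parameter values vary by rain regime.
ZR_RELATIONS = {
    "marshall_palmer": (200.0, 1.6),  # classic stratiform-rain relation
    "tropical":        (250.0, 1.2),  # a commonly cited tropical relation
}

def rain_rate_from_dbz(dbz, a, b):
    """Invert Z = a * R**b for R, given reflectivity in dBZ."""
    z_linear = 10.0 ** (dbz / 10.0)   # convert dBZ to linear units
    return (z_linear / a) ** (1.0 / b)

# The same 40-dBZ echo yields quite different rain rates depending on
# which relation is assumed, illustrating the underdetermination.
for name, (a, b) in ZR_RELATIONS.items():
    print(f"{name}: {rain_rate_from_dbz(40.0, a, b):.1f} mm/h")
```

For a 40-dBZ echo these two relations disagree by roughly a factor of 2, which is why even a perfectly calibrated radar does not by itself provide a rain-rate calibration standard.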
The study of Barrett et al. (1994b), which involved all the algorithm participants of PIP-1, also raised the question as to the wisdom of validating satellite algorithms with ground data, although a sense of fatalism enters the accompanying discussion in that there were no other practical options for validation at that time. As this study and that of Ebert and Manton (1998) indicate, the critical issue is that the present agreement between the published PMW algorithms is generally within the degree of uncertainty in the ground measurements, so that it is questionable whether satellite algorithms validated with ground data can be made more accurate and precise than those calibrated against a physical model.
This general consistency among the PMW algorithms has motivated a search for “consensus” or “combined” algorithms, which presumably represent the highest degree of accuracy now possible with the PMW approach. The study of Kniveton and Barrett (1994) has already examined a “back-end” approach for generating an improved consensus algorithm by merging retrieval results from the different algorithms submitted to PIP-1, according to how well they compared to that project’s validation data. However, the improvement in their study was guaranteed because the standards of comparison were the validation datasets themselves.
An alternate approach, or the “front-end” combined algorithm, involves extracting different software modules and rule-based procedures used by the collection of algorithms, according to how well the various rules and assumptions stack up against the actual physical nature of the retrieval problem. In other words, a new algorithm is developed by selecting different parts of prior algorithms, based on physical principles. This approach is directed toward an optimal algorithm design but requires an “error model” to serve as final arbitrator of accuracy. Currently no such model exists, although Wilheit et al. (1995) have initiated the development of an error model for TRMM; see Simpson et al. (1996). The combined radar–radiometer algorithm designed for operational use at launch on TRMM by Smith et al. (1997) and Haddad et al. (1997) is a good example of a front-end combined algorithm, although it contains a mix of passive and active techniques rather than a mix of solely passive techniques (see also Bauer et al. 1998). Notably, the PIP-3 project is now focusing on the performance of various preliminary combined algorithms.
2. Review of precipitation intercomparison projects
The AIP-1 and AIP-2 projects involved a mix of IR and PMW algorithms, although AIP-1 was primarily focused on IR algorithms. The AIP-1 study area extended over the islands of Japan and their surroundings, and considered a dataset over two 1989 time periods (1–30 June, 15 July–15 August). This project focused on estimating the range of differences between primarily infrared algorithms, with little emphasis on explaining why the differences arose. A summary and interpretation of the AIP-1 results for 16 IR algorithms is given by Arkin and Xie (1994); more detailed information on individual algorithm results is given in the report by Lee et al. (1991). They found that the IR algorithms differed by as much as 200% for monthly averaged rainfall over the 30° × 30° study area. No firm conclusions were offered concerning what could be done to reduce the differences, although the study of Negri and Adler (1993) on three of the AIP-1 IR algorithms sheds some light on intrinsic problems with the IR approach.
The AIP-2 study area was a box over the British Isles and surrounding waters plus much of west-central Europe. An ECMWF (European Centre for Medium-Range Weather Forecasts) report by Allam et al. (1993) has described the algorithms used in AIP-2 and has discussed the calculations. Summary descriptions of the algorithms are found in Liberti (1995). The algorithms consisted of a mixture of IR, IR/visible (VIS), IR/PMW, and PMW-only schemes submitted by 16 different groups, some of whom submitted multiple versions. The main preliminary finding about PMW schemes was that they did not outperform the IR algorithms and appeared to do better over land than water. These results are at odds with the recent intercomparison projects, although AIP-2 did not provide a thorough sampling of the microwave algorithms, nor were the climatological conditions and predominantly land background ideal settings for evaluating PMW algorithm performance over water. Because a final refereed analysis of the AIP-2 results has yet to be published, the preliminary conclusion concerning the near-equivalent performance of IR and PMW algorithms remains uncertain.
The AIP-3 project attracted 52 algorithms from 24 research groups, consisting of 22 IR algorithms or mixed IR algorithms (IR/VIS or IR/PMW) and 29 PMW-only SSM/I algorithms. (Of the latter, 20 are candidates for comparison to the PIP-2 algorithms as discussed in section 4c.) The study area was the outer sounding array of the TOGA COARE experiment domain in the tropical western Pacific Ocean; the project focused on both IR and PMW algorithm performance in a “wet” tropical ocean environment. In some ways, the convective nature of the precipitation elements in this environment represents the simplest targets for satellite algorithms; therefore the spread of results of AIP-3 offers an excellent indicator of how well the different algorithm techniques are converging. For purposes of validation, 4 months of 10-min sampled rain rates derived from C-band radar measurements obtained from two ships deployed for the TOGA COARE experiment were used. According to Ebert and Manton’s (1998) interpretation, which provides the summary results of AIP-3, there is a spread in the rain-rate magnitudes of the satellite algorithms, with the satellite results generally overestimating the radar by 30% if the median is considered and 48% if the mean is considered. However, there is consistent agreement among the algorithms vis-à-vis the retrieved patterns of rainfall, with the PMW algorithms outperforming the IR algorithms in spatial correlation tests with radar data. By the same token, the best temporal correlations are obtained from the combined SSM/I–IR algorithms that sample the diurnal cycle.
The main objectives of the PIP-1 study were to seek the best-performing PMW algorithms for land and ocean environments, and to use the results from the different algorithms to form a “consensus” algorithm. This product was then evaluated by a GPCP merged rain gauge dataset over land and a west Pacific atoll rain gauge dataset over ocean; see Morrissey et al. (1994) for a description of the PIP-1 validation datasets. Of the 17 participating algorithms, 16 were designed for application with SSM/I measurements, while the algorithm of Spencer (1993) was designed for measurements from the National Oceanic and Atmospheric Administration (NOAA) satellite microwave sounding units (MSU). The PIP-1 study area covered the entire globe and concentrated on a 4-month dataset from August 1987 through November 1987. The designated calculations for PIP-1 were 0.5° resolution rainfall accumulations for the four 1-month time periods. Although PIP-1 went a long way in documenting the magnitude of differences between the different algorithms at different places for different months, as well as quantifying differences with the validation datasets, the extended space–time averaging scales precluded an in-depth understanding of why the algorithm differences arose and what could be done about obtaining better convergence between the different types of algorithms. A special issue of Remote Sensing Reviews has been devoted to the PIP-1 results. The articles of Adler et al. (1994), Barrett et al. (1994b), Smith et al. (1994c), Spencer (1994), and Wilheit et al. (1994) document some of the recent progress that has been made with SSM/I algorithms. The results interpretation paper by Barrett et al. (1994b) concluded that no given algorithm outperformed all others everywhere, but that individual algorithms tended to do best in certain places at certain times and, if calibrated with ground data, perform optimally in the region over which they are calibrated.
Notably, the latter point does not apply to all SSM/I algorithms, since various physically based algorithms cannot be defensibly calibration-adjusted to ground data.
PIP-2 has had a different focus than the other projects; like PIP-1, it examines only PMW algorithms (all using SSM/I measurements), but it concentrates on explaining why algorithm differences occur in terms of the basic physical methods used in the algorithm designs. Also, this project differentiates the process of precipitation detection from that of brightness temperature (TB) to rain-rate (RR) conversion in order to better distinguish the sources of algorithm-to-algorithm differences. The PMW focus results from the organizational structure of the WetNet project, whose purpose was to foster the application of SSM/I measurements to studies of the global hydrological cycle (Dodge and Goodman 1994). Therefore, there has been a natural inclination to concentrate on SSM/I algorithms to the exclusion of IR algorithms and even other PMW-type algorithms such as those designed for Scanning Multichannel Microwave Radiometer (SMMR) measurements.
3. Description of PIP-2
PIP-2 has focused on individual cases and instantaneous rain rates at full SSM/I resolution. The project has incorporated ground radar and rain gauge measurements to aid the intercomparison analysis. These are referred to as validation data but serve neither as a calibration reference nor as a quantitative measure of uncertainty. The main purpose of these data is to help understand how and why the algorithms agree or disagree, while physical principles are used in gauging the individual strengths and weaknesses and in diagnosing the ultimate cause of differences. Land and ocean results are analyzed separately to elucidate the pros and cons of the various algorithms according to whether the background emits as a radiometrically warm (land) or cold (ocean) surface.
Besides distinguishing between land and ocean backgrounds, the analysis also distinguishes between rainfall detection (referred to as screening) and rain-rate estimation (referred to as TB–RR conversion). Although the distinction between these two processes can become blurred in the context of the various algorithm designs, a distinction can always be made as to what algorithm rule constitutes detection and what algorithm rule constitutes conversion. Screening methods represent a set of algorithm tools that are explicit or implicit to all algorithms. Unfortunately, they have not been given as much scientific emphasis as conversion methods. Screening is important because it is the gateway to successful retrieval. In essence, the most brilliantly conceived retrieval scheme or most cleverly designed variable transform can be rendered impotent if the screening portion of the algorithm cannot distinguish between raining and nonraining pixels. Fortunately, a bridgehead of literature focused on this topic has emerged, beginning with the SMMR-based study of Ferraro et al. (1986), followed by the study of Grody (1991) and other papers, for example, Fiore and Grody (1992), Ferraro et al. (1994), and Ferraro and Marks (1995). It is also possible to find discussion of the screening problem in some of the algorithm description papers (see various references indicated in Table 4 and discussed in the appendix), although the details and physical assumptions behind the screening procedures are not always carefully spelled out. Ferraro et al. (1998) have provided a systematic explanation of how practical screening rules can be implemented over land and ocean surfaces, and have provided the explanation for the “common” screening procedure used in this study to isolate algorithm differences due to the conversion schemes from those due to screening. As will be seen, this is an important issue in attempting to assess algorithm performance.
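To make the screening/conversion distinction concrete, the following minimal sketch separates the two stages. The screen uses the 85.5-GHz polarization-corrected temperature (PCT = 1.818 TBv − 0.818 TBh, after Spencer et al. 1989) with a 255-K rain threshold, while the linear TB–RR conversion coefficient is purely illustrative; neither corresponds to any specific PIP-2 algorithm.

```python
RAIN_PCT_THRESHOLD_K = 255.0  # commonly cited ocean rain threshold for PCT85

def pct85(tb_v, tb_h):
    """85.5-GHz polarization-corrected temperature (K), Spencer et al. (1989)."""
    return 1.818 * tb_v - 0.818 * tb_h

def screen(tb_v, tb_h):
    """Detection step: flag the pixel as possible-rain (True) or no-rain (False)."""
    return pct85(tb_v, tb_h) < RAIN_PCT_THRESHOLD_K

def retrieve_rain_rate(tb_v, tb_h):
    """Screen first; apply the TB-RR conversion only to possible-rain pixels."""
    if not screen(tb_v, tb_h):
        return 0.0                       # no-rain pixels get exactly zero
    depression = RAIN_PCT_THRESHOLD_K - pct85(tb_v, tb_h)
    return 0.2 * depression             # hypothetical linear TB-RR conversion

print(retrieve_rain_rate(250.0, 240.0))  # warm scene: screened out, zero rain
print(retrieve_rain_rate(220.0, 210.0))  # scattering-depressed scene: rain
```

However cleverly the conversion step is designed, the retrieval above returns zero whenever the screen misclassifies a raining pixel, which is the sense in which screening is the gateway to successful retrieval.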
There were 20 algorithms submitted to PIP-2 from 19 groups. The appendix provides brief summaries of the individual algorithms. These descriptions are aided by the papers of Wilheit et al. (1994) and Liberti (1995), who adroitly tackled this subject for the PIP-1 and AIP-2 projects, respectively, by seeking common themes in the various algorithm designs. Here, we synthesize this process even further by placing each algorithm into one of four algorithm categories, conducting the analysis by stratifying intercomparison results according to the various solution methods. Although not all algorithms fit perfectly into the four-definition framework (spelled out in section 3e), the framework provides a philosophically consistent means to identify algorithms according to fundamental principles. As a further refinement to algorithm categorization, the analysis also stratifies according to the channel input used for the TB–RR conversion and according to the severity of the screening approach that is used. The summaries in the appendix seek to identify these distinct algorithm features, to the degree space allows.
It is also worth noting that contained within the selected cases are different types of raining systems, that is, tropical storms, midlatitude cyclones, nonsevere and severe convective storms, squall lines, and stratiform rain areas. To help interpret the intercomparisons, the analysis stratifies the results according to the meteorological nature of the rain systems, using four classifications for both land and ocean rainfall. A final stratification for the analysis discriminates between low and high spatial resolution algorithm results. In the remainder of this section, six subsections describe the PIP-2 analysis procedures and provide details on the cases and overpasses under analysis, as well as descriptions of the various categories of algorithms.
a. Selection of cases and overpasses
The PIP-2 initially focused on 27 individual rainfall cases. Each case consists of one or more SSM/I overpasses during the course of its development. Thus each overpass contains an image of a precipitating weather system, while all overpasses associated with the same weather system constitute a case. The cases are distributed throughout a latitude range of 60°N–17°S, over continental and oceanic surfaces, for every season of the year, and from multiple years. Cases have been selected from each of the first three SSM/I satellite platforms (F8, F10, and F11) since the inaugural launch of the F8 platform in July 1987. The cases extend from July 1987 to February 1993. Table 1 identifies the 27 cases, including information on the names of the individuals who recommended the cases, case names, dates, satellite sources, background types, types of validation data associated with the cases, and total number of overpasses selected for the cases. Cases are numbered up to 28, but case 7 is not included since its data were unusable.
A total of 118 overpasses were selected for the initial analysis. Of these, 18 are completely oceanic, 18 are completely continental, and 82 are combined land–ocean. The rainfall targets for the 27 cases were carefully selected by the individual PIP-2 participants identified in Table 1. The target areas were generally intended to surround a storm or storm system (to the extent possible given the orbit swath coverage), including surrounding areas that contained either additional convective elements, stratiform sheets, or clear areas of interest. The targets are variable in size, with the number of overpasses selected for a case depending on a number of factors involving duration of the storm system, SSM/I coverage, and validation data availability. Table 2 provides details on all 118 overpasses indicating exact dates, times, locations, target sizes, data sources, and the availability and quality of the corresponding radar or rain gauge validation datasets.
The analysis was conducted in two stages: 1) an initial stage and 2) a reprocessing stage in which the algorithm participants resubmitted their results after a group analysis of the initial results held at a WetNet meeting in Durham, New Hampshire, in February 1995. During the initial analysis, a set of 56 of the highest quality overpasses was selected as the basis for the reprocessing. They are indicated by a “#” symbol in the overpass number column of Table 2. Of these, 7 are completely oceanic, 6 are completely land, and 43 are mixed land–ocean (25 of the original 27 cases were retained—no overpasses were retained for cases 3 and 15, each a single overpass case). A high quality overpass signifies that the SSM/I coverage of the precipitation system was relatively complete, that there were minimal bad data problems with the SSM/I pixels, and that the validation data (if available) were reliable over a portion of the SSM/I target. However, validation data coverage is always limited, invariably truncating important features of the rainfall targets.
b. Selection of validation data
For a selection of overpasses, the individual(s) who recommended the cases indicated in Table 1 also provided validation data processed into rain-rate maps. These datasets are mostly used to help consider how well algorithms differentiate between raining and nonraining areas, and for the raining areas, to help evaluate how the algorithms perform in retrieving both the average rain rate over the validation sectors and the pixel-to-pixel variability. However, as discussed, the validation data cannot be used for absolute calibration because of intrinsic uncertainties in representing the intensity of rainfall.
There are six codes used in Table 1 for the ground validation data: 1) Grnd Rad S (C) indicates S- or C-band ground weather radar; 2) A/C Rad indicates C-band aircraft radar from NOAA P3 aircraft; 3) CP-2 Rad indicates the National Center for Atmospheric Research’s dual polarization-dual frequency (S–X band) multiparameter radar; and 4), 5), and 6) LD, MD, and HD gauge indicate low-, medium-, and high-resolution rain gauge networks, where these descriptors denote gauge densities on the order of 1, 10, and 100 per degree square, respectively. A number of the radar validation datasets are from coastal radar installations, which are troublesome for satellite intercomparison in that satellite retrieval algorithms often either cannot retrieve or retrieve only poorly over mixed land–ocean pixels. In addition, there are some validation datasets for which it was clear that the data quality was insufficient, plus a few situations where the validation data were so poor that they are not useful for evaluation purposes.
Not all overpasses had validation datasets available for analysis. Of the 27 cases, 23 had at least one overpass with a corresponding validation dataset. Of the 118 total overpasses, 83 had validation datasets. These included 57 radar datasets and 26 rain gauge datasets. Of these, 4 were oceanic, 18 were continental, and 61 were a combination of land and ocean. The requirements for the validation datasets were that they be mapped into the same pixel format as the algorithm results (i.e., the native SSM/I orbit swath format) and converted to the common rain-rate units used by the algorithms (i.e., mm h−1). As will be discussed, a general problem with the validation datasets is that their areal coverages are almost always much smaller than the rainfall target areas selected by the investigators. It should be recognized that the satellite targets were not limited to areas covered by the ground measuring systems. This was by design since the statistics of the algorithm-to-algorithm intercomparisons are more robust when entire rain systems are considered, regardless of whether the algorithm-to-validation intercomparisons can only be done over partial targets.
Each pixel of a validation dataset is associated with a binary flag set to “reliable” or “unreliable,” according to the investigator who provided the dataset. For example, in a rain gauge–based validation dataset, the investigators defined all pixels whose footprint contained a gauge as reliable, whereas all other pixels generated from objective analysis over the grid mesh were set as unreliable. For a radar-based dataset, the investigators defined unreliable pixels at radar ranges where beam broadening and beam height were large. As it turned out for a few overpasses, a large number of the validation pixels were set as unreliable by the data providers. Moreover, it was found that some of the radar-based rain-rate estimates and rain gauge estimates were clearly beyond the fringe of reality, although repeated checks of the quality control and analysis procedures did not reveal the source of the problems. The clearly deficient validation datasets are marked with a “poor” designation in the data quality indicators in Table 2 (four overpasses were designated this way).
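Schematically, the reliability flag acts as a per-pixel mask on any validation statistic. The sketch below, with invented numbers and hypothetical variable names rather than the actual PIP-2 file format, computes a mean bias only over pixels the data provider marked reliable, which is how unreliable pixels can be excluded without discarding an entire overpass.

```python
def mean_bias(retrieved, validation, reliable):
    """Mean (retrieved - validation) rain-rate difference over reliable pixels only."""
    diffs = [r - v for r, v, ok in zip(retrieved, validation, reliable) if ok]
    if not diffs:
        raise ValueError("no reliable validation pixels in this overpass")
    return sum(diffs) / len(diffs)

retrieved  = [2.0, 0.0, 5.0, 1.0]       # satellite rain rates (mm/h), invented
validation = [1.0, 0.0, 9.0, 4.0]       # radar/gauge rain rates (mm/h), invented
reliable   = [True, True, False, True]  # provider's binary quality flag

# Pixel 2 is excluded by its flag, so it does not contribute to the bias.
print(mean_bias(retrieved, validation, reliable))
```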
As seen in Table 2, a number of the validation datasets are judged “fair,” denoting that the coverage of the rain system captured in the SSM/I overpass was limited and did not capture the focal point of the SSM/I observed rainfall. There are well-understood reasons for these problems. For example, radars have limited range, all but high density rain gauge networks are too sparse to provide robust statistics, and well-distributed gauge networks are not available over the open ocean. In general, the radar-based products provided more uniform coverage than the gauge-based products and generally indicated reasonable appearing rain area coverages over the limited scan domains.
All of the validation pixels were used in the initial intercomparison analysis, including the unreliable pixels. For the reprocessing stage in which a subset of 56 of the best quality overpasses were selected, the associated validation datasets (when present) were made up mostly of reliable pixels in which the validation data coverage was assessed as fair or “good.” It should be emphasized here that the general quality of the validation datasets selected for PIP-2 is not below the data quality of radar–rain gauge data in general. In fact, the overall quality of the PIP-2 validation datasets is likely much higher than a randomly selected group of datasets of this size because of their careful selection and preparation by the investigators. The main issue is that developing a high quality validation dataset at high time and space resolution, concomitant with the type of coverage and data quality that is needed for exacting calibration of satellite algorithms, is a major undertaking. This has been made particularly clear by the ground validation project of TRMM.
The shortcomings with ground validation datasets motivate the use of composite algorithms as an additional means to evaluate algorithm performance through intercomparison. Composite algorithms are formed by assembling the results from a set of algorithms (e.g., according to the solution method, channel input, screening approach, or by simply taking all algorithms together) and assigning to each pixel position the mean or median of all the individual algorithm-derived rain rates for that pixel. A median-value approach is sometimes preferable to a straight averaging approach, since the latter produces nonzero values of rain rate at any pixel position where even a single algorithm reports rain, thus generating unrealistic rain cover statistics.
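A minimal sketch of this pixelwise compositing, with invented rain rates, shows why the median is sometimes preferred: a single algorithm reporting rain at an otherwise dry pixel makes the mean nonzero but leaves the median at zero.

```python
from statistics import mean, median

def composite(pixel_rates_by_algorithm, combine):
    """Combine several per-algorithm rain-rate maps pixel by pixel."""
    return [combine(rates) for rates in zip(*pixel_rates_by_algorithm)]

# Three hypothetical algorithms, four pixels each (mm/h).
alg_a = [0.0, 2.0, 4.0, 0.0]
alg_b = [0.0, 3.0, 5.0, 0.0]
alg_c = [3.0, 2.5, 6.0, 0.0]   # reports rain alone at pixel 0

print(composite([alg_a, alg_b, alg_c], mean))    # pixel 0 becomes 1.0 (spurious rain cover)
print(composite([alg_a, alg_b, alg_c], median))  # pixel 0 stays 0.0
```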
c. Rules of engagement
A total of 19 individuals or groups supplied results for 20 different algorithms to the intercomparison. The authors and names of the 20 algorithms are given in Table 3 (affiliations are found at the front of the paper). Table 4 provides a summary of the algorithms, including which SSM/I channels are used for screening and which are used for rain rate conversion, type descriptors of the algorithms according to five descriptor categories, and the principal references describing the algorithms. All 20 algorithms are applicable to ocean, whereas only 17 are applicable to land.
Detailed instructions were sent to all participants to ensure the algorithms were applied in a consistent fashion. The SSM/I datasets used for the analysis were predominantly in the “Wentz type format” (Wentz 1993), although a few overpasses were derived from the NESDIS format; see Ritchie et al. (1998) for an explanation of the various SSM/I data sources. Each overpass contains an inner target array over which rain-rate results were calculated. These target arrays are at most 105 low-resolution scan lines in length and at most 64 low-resolution pixels wide. SSM/I files for all 118 overpasses were distributed to the different participants via a CD-ROM entitled “PIP-2 SSM/I Brightness Temperatures For Selected Cases.”2 After receiving the CD-ROM and applying their algorithms, the participants transferred the algorithm-derived rain-rate results to FSU by ftp.
There were two submissions of results for the project. The first submission, called initial processing, was used to shake down the intercomparison procedures, identify data problems and obvious errors in the algorithms, and examine ways to improve the overall intercomparison analysis. The second submission, called reprocessing, involved 56 of the original overpasses (as indicated in Table 2), and represented the final set of results used for the major analysis. A component of the final analysis process was a Final PIP-2 Results Participant Workshop held in June 1995 in Huntsville, Alabama.
d. Differentiation between screening and TB–RR conversion
The term “screening” refers to the technique of assigning a given pixel a possible-rain or no-rain flag. Pixels flagged as no-rain are automatically assigned a zero rain rate, whereas pixels flagged as possible-rain may or may not be assigned a nonzero rain rate according to a given algorithm’s specific TB–RR conversion scheme. The results of the initial processing indicated that the various rain-rate screening techniques employed by the different algorithms contributed significantly to algorithm-to-algorithm rain-rate differences. For that reason, the reprocessing included a method of analyzing algorithm-derived rain rates in a manner such that the effects of the different screening techniques could be isolated from the effects of the different conversion techniques. This was accomplished by incorporating two sets of rain-rate results in the reprocessing effort. In the first “principal investigator” set, the algorithm authors used their own specific screens (PI screens), as was done in the initial processing. In the second “common screen” set, the authors used a standardized screen developed for this project, based on the operational screen used at the National Environmental Satellite Data Information Service (NESDIS) and fine-tuned with selected PIP-2 validation data. This scheme is described in detail in Ferraro et al. (1998). The common screen is applied using one of two approaches. In the first approach, the algorithm uses the subroutine for the common screen in place of the code for the algorithm’s original screen. In the second, the algorithm calculates its rain-rate results as always and then, in a back-end fashion, Boolean “ANDs” its result with the common screen by zeroing out each pixel where the common screen reports no rain. The first method is preferred since it allows an algorithm to calculate nonzero rain rates for pixels that the common screen reports as raining but the algorithm’s original screen reports as no-rain.
When the back-end method is used, these pixels must remain zero. However, since not all algorithms could be modified to use the first approach, the back-end approach was applied in various cases.
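The back-end approach amounts to a simple mask operation. A minimal sketch, assuming the common screen is available as a boolean possible-rain mask (names are illustrative, not from the project code):

```python
import numpy as np

def apply_backend_screen(rain_rates, common_screen_rain):
    """Back-end application of the common screen: zero out every pixel the
    common screen flags as no-rain, leaving the algorithm's own retrieval
    untouched elsewhere.

    rain_rates: algorithm-derived rain rates (mm/h), shape (ny, nx)
    common_screen_rain: boolean mask, True where the common screen reports
    possible rain.

    Note the limitation described in the text: where the common screen
    reports rain but the algorithm's own screen already zeroed the pixel,
    this method cannot recover a nonzero rate.
    """
    return np.where(common_screen_rain, rain_rates, 0.0)

rr = np.array([[0.0, 4.2], [1.1, 0.0]])
screen = np.array([[True, False], [True, True]])
masked = apply_backend_screen(rr, screen)
# Pixel (0, 1) is zeroed even though the algorithm retrieved 4.2 mm/h.
```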
Besides assigning pixels as rain or no-rain, the common screen also identifies bad data pixels and pixels over or near coastlines. These pixels are eliminated from the rain calculations, ensuring that the common screen results use standardized bad data and coastline filters. This procedure enables rain rates to be intercompared for a common set of raining pixels, regardless of residual errors in the common screen due to methodology and fine-tuning with validation data. Thus, differences due to rain-area dissimilarities are removed, such that the remaining discrepancies are due only to differences in the TB–RR conversion methods.
The two-pass screening approach is useful in pinpointing the source of algorithm differences, as screening and conversion procedures manifest themselves in entirely different ways in the final rain rate maps and the accumulated intercomparison statistics. If a screening procedure allows nonraining pixels to be processed, then spurious rain rates will be retrieved. Over ocean, these will be low rain rates, whereas over land even high rain rates can be retrieved erroneously. If the screening procedure is too aggressive, then some rain will be missed. Thus errors in the screening procedure can increase or decrease the retrieved rain totals. Similarly, the underlying rainfall retrieval can err in either direction. All combinations of these errors adding or compensating were observed among the PIP-2 algorithms.
e. Stratification procedures
Eight stratifications, briefly described above, are used in the intercomparison analysis to help synthesize the results. The first three stratifications involve discriminating between land and ocean, between screening and TB–RR conversion, and between the PI and common screen results. The fourth stratification assigns the individual algorithms to four categories associated with their solution method. The four methods are as follows: 1) statistical deterministic rain map techniques, which base the relationship between TBs (and/or TB transform variables) and surface rain rates on regression, calibrated empirically with ground measurements; 2) quasi-physical deterministic rain map techniques, which base the relationship between TBs (and/or TB transform variables) and surface rain rates on radiative transfer (RTE) modeling with respect to some type of specified cloud structure, but empirically calibrated with ground measurements; 3) physical deterministic rain map techniques along the same lines as category 2 but without empirical ground calibration; and 4) physical inversion profile techniques, which generate vertical distributions of rain rate by adjusting predefined hydrometeor profiles derived from either aircraft radar measurements or numerical cloud models until calculated TBs from a forward RTE model are consistent with measured TBs at different channel frequencies, without dependency on empirical ground calibration.
The fifth and sixth stratifications represent a reshuffling of the solution method approach, in which for the fifth stratification the algorithms are assigned to different categories according to the channel input used for the TB–RR conversion. For land, two categories are used: 1) scattering and 2) mixed. For ocean, three categories are used: 1) emission, 2) scattering, and 3) mixed. In this nomenclature, emission algorithms use only 19 GHz or a combination of 19 GHz with 22 and/or 37 GHz in their conversion schemes, scattering algorithms use 37 or 85 GHz alone or in combination with each other in the conversion, and mixed algorithms involve both 19 and/or 22 GHz together with 85 GHz in the conversion (37 GHz may or may not be present). In the sixth stratification, the algorithms are assigned to one of three categories based on their screening approach. These categories are: 1) “heavy,” 2) “light,” and 3) intrinsic. These names denote screens that tend to be comprehensive, relatively simple, or intrinsic to the TB–RR conversion scheme, respectively. Table 5 provides a summary of how the 20 algorithms were classified for the fourth, fifth, and sixth stratifications involving solution method, channel input, and screening approach.
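The channel-input nomenclature for the ocean categories can be expressed as a simple classification rule. The function below is our reading of the definitions above, not code from the project; channel sets outside the stated definitions are left unclassified:

```python
def classify_channel_input(channels_ghz):
    """Assign an ocean algorithm to the emission / scattering / mixed
    channel-input categories from the set of SSM/I frequencies (GHz) used
    in its TB-RR conversion, following the nomenclature in the text."""
    ch = set(channels_ghz)
    if ch and 19 in ch and ch <= {19, 22, 37}:
        return "emission"      # 19 GHz alone or with 22 and/or 37 GHz
    if ch and ch <= {37, 85}:
        return "scattering"    # 37 and/or 85 GHz only
    if (ch & {19, 22}) and 85 in ch:
        return "mixed"         # 19 and/or 22 GHz together with 85 GHz
    return "unclassified"
```

For example, an algorithm converting from 19 and 37 GHz falls in the emission category, one using 85 GHz alone in the scattering category, and one combining 19 with 85 GHz in the mixed category.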
The seventh stratification differentiates between types of raining systems, using four meteorological classifications for both land and ocean, with the first three classes being common to both categories. The land classes are 1) tropical cyclones, 2) midlatitude cyclones, 3) convective systems, and 4) squall line systems. Ocean classes 1–3 are the same; the fourth ocean class is stratus systems. Table 6 provides the summary of how the 56 reprocessed overpasses were assigned to the meteorological categories (44 were assigned to land categories, 49 to ocean).
The eighth and final stratification for the analysis discriminates between low and high spatial resolution results. SSM/I imagery is composed of five low-resolution channels, 19V, 19H, 22V, 37V, 37H, and two high-resolution channels, 85V and 85H. Each SSM/I overpass consists of a series of scans called B-scans, each of which is 128 pixels wide. Every pixel includes measurements of the two high-resolution 85-GHz channels, with a ground distance of approximately 12.5 km between adjacent pixels and between successive scans. Each odd-numbered pixel of every odd-numbered scan also includes measurements of the five low-frequency channels (64 pixels per scan). These are the low-resolution pixels (A-scans). Thus, one out of every four pixels is low resolution; see CalVal (1989) for further details on the A/B scan formats.
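The A/B-scan sampling pattern translates directly into array strides. A minimal sketch, assuming 0-based array indices (so the “odd-numbered” scans and pixels of the 1-based convention become the even indices):

```python
import numpy as np

# Full-resolution (B-scan) grid: every pixel carries the 85-GHz channels.
# In 1-based numbering, every odd scan and odd pixel also carries the five
# low-frequency channels; with 0-based arrays that is every even index.
n_scans, n_pixels = 6, 128
tb85v = np.arange(n_scans * n_pixels, dtype=float).reshape(n_scans, n_pixels)

# A-scan (low-resolution) pixel positions: every other scan, every other
# pixel, giving 64 low-resolution pixels per retained scan.
low_res = tb85v[::2, ::2]

# One out of every four pixels is low resolution.
```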
The algorithm participants were asked to submit results at both resolutions if feasible. Of the 20 algorithms, 10 submitted both low and high resolution, whereas 9 were submitted at low resolution only. For one algorithm (GPROF), only high-resolution files were submitted. To enable the comparison of all algorithms at both resolutions, the results for low-resolution-only algorithms were bilinearly interpolated to a high-resolution grid mesh. For the GPROF results, averaging was used to create the necessary low-resolution files. In addition, all validation products were also submitted at both low and high resolution.
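Both regridding directions can be sketched with separable bilinear interpolation (low to high resolution) and 2x2 block averaging (high to low, as was done for the GPROF results). The factor-of-2 grid relationship is an assumption for illustration; the exact PIP-2 grid handling may differ:

```python
import numpy as np

def highres_to_lowres(field):
    """Average non-overlapping 2x2 blocks of a high-resolution field to
    build a low-resolution field (the direction used for GPROF)."""
    ny, nx = field.shape
    trimmed = field[:ny - ny % 2, :nx - nx % 2]
    return trimmed.reshape(ny // 2, 2, nx // 2, 2).mean(axis=(1, 3))

def lowres_to_highres(field):
    """Bilinearly interpolate a low-resolution field onto a grid with
    twice the sampling in each direction (a stand-in for the regridding
    applied to low-resolution-only algorithms)."""
    ny, nx = field.shape
    y = np.linspace(0, ny - 1, 2 * ny - 1)
    x = np.linspace(0, nx - 1, 2 * nx - 1)
    # Interpolate rows, then columns (separable bilinear interpolation).
    rows = np.array([np.interp(x, np.arange(nx), r) for r in field])
    return np.array([np.interp(y, np.arange(ny), c) for c in rows.T]).T

lo = np.array([[0.0, 2.0], [4.0, 6.0]])
hi = lowres_to_highres(lo)   # 3x3 grid; the center value is the mean
```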
f. Intercomparison procedures
Six basic intercomparison procedures explained below are used to evaluate the retrieval results. The schematic diagram given in Fig. 1 is helpful in understanding the calculations described in procedures 2, 3, and 4.
1) Rain rate maps
Qualitative analysis of individual algorithm rain-rate maps for all overpasses considered separately and in relationship to eight additional rain-rate maps available for each overpass. These eight additional maps consist of: the four algorithm category composite rain-rate maps based on the medians of all reporting algorithms at individual pixel locations (the four maps correspond to the statistical rain map, quasi-physical rain map, physical rain map, and physical profile categories); composite rain-rate maps created by combining categories 1 and 2 (for convenience referred to as the statistical algorithms) and categories 3 and 4 (the physical algorithms); the all-algorithm composite rain-rate map made up of all algorithms, that is, all categories (also based on the use of medians); and the validation rain-rate map.
2) Along-diagonal statistics
Quantitative analysis of the statistics of the individual algorithms, the all-algorithm composite, and the validation results, for each overpass considered independently and for all overpasses taken together, differentiated into land and ocean categories (individual overpass results are not shown). Separate sets of calculations are made for the PI and common screen results. Statistical variables consist of the mean rain rate over the target area, percentage of pixels analyzed, and percentage of raining pixels. Means are also calculated for the rain-only pixels. These statistics provide a basic overview of how robust the algorithms are in processing the raw brightness temperature information, how much data are rejected because of individual algorithm features such as coastline filtering or bad data detection, and whether the average rain rates are credible. For example, a result reporting a negative mean rain rate or a positive value exceeding a large number such as 100 mm h−1 would be obviously suspect. In fact, the former situation never occurred, and the latter situation occurred only rarely and did so only for over-land situations in which the algorithm typically confused rain with snow over a wide area. With these independent statistics, it is not necessarily meaningful to compare a given algorithm’s result to another algorithm’s result (whether individual or composite) or to a validation result, if the sizes of the two comparator sets of pixel populations are not of the same order.
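The four along-diagonal variables can be computed per target array as sketched below. Variable and key names are illustrative; the PIP-2 conventions for flagging rejected pixels may differ:

```python
import numpy as np

def along_diagonal_stats(rain, valid):
    """Along-diagonal summary for one algorithm over one target array.

    rain: retrieved rain rates (mm/h), NaN where the algorithm rejected
    the pixel (bad data, coastline); valid marks pixels in the target.
    Returns the four variables described in the text.
    """
    processed = valid & ~np.isnan(rain)
    rr = rain[processed]
    raining = rr > 0.0
    return {
        "pct_calc": 100.0 * processed.sum() / valid.sum(),
        "pct_rain": 100.0 * raining.sum() / max(processed.sum(), 1),
        "average_rain": rr.mean() if rr.size else np.nan,
        "rain_only_ave": rr[raining].mean() if raining.any() else 0.0,
    }

# Toy target: 4 pixels, one rejected (NaN), two raining.
rain = np.array([[0.0, 2.0], [np.nan, 4.0]])
valid = np.ones((2, 2), dtype=bool)
stats = along_diagonal_stats(rain, valid)
# 3 of 4 pixels processed, 2 of 3 raining, mean 2.0, rain-only mean 3.0
```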
3) Off-diagonal statistics—Individual algorithms
Quantitative analysis of individual algorithm intercomparison statistics for all overpasses taken together, in which cross-comparison statistics are calculated between the respective algorithms and both the validation results and the all-algorithm composites, but only where a pair of comparators intersect in terms of areal coverage. The statistics are differentiated into land and ocean categories and stratified according to the PI and common screen calculations. These population-matched intercomparisons allow for the calculation of a number of statistical parameters, six of which are generated for the purposes of this analysis: 1) bias between means; 2) bias-adjusted root-mean-square difference (rms); 3) ratio of bias to mean of comparator; 4) ratio of means; 5) ratio of bias-adjusted rms to mean of comparator; and 6) correlation coefficient. It should be recognized that when an intercomparison involves a validation result as the comparator, the sample population may be reduced beyond the already restricted sampling supplied by a radar system or gauge network, because of the need to mask out SSM/I pixels lying over or near coastlines, that is, pixels for which most algorithms are not designed.
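A sketch of the six matched-population statistics, computed only over pixels where both comparators report valid values. The formulas follow the common definitions; the PIP-2 code itself is not reproduced here, and names are illustrative:

```python
import numpy as np

def off_diagonal_stats(algo, comparator):
    """Six matched-population statistics between an algorithm and a
    comparator (validation or composite); NaN marks missing pixels."""
    both = ~np.isnan(algo) & ~np.isnan(comparator)
    a, c = algo[both], comparator[both]
    bias = a.mean() - c.mean()
    # Bias-adjusted rms: remove each field's mean before differencing.
    rms = np.sqrt(np.mean(((a - a.mean()) - (c - c.mean())) ** 2))
    return {
        "bias": bias,
        "rms_bias_adj": rms,
        "bias_over_comp_mean": bias / c.mean(),
        "ratio_of_means": a.mean() / c.mean(),
        "rms_over_comp_mean": rms / c.mean(),
        "correlation": np.corrcoef(a, c)[0, 1],
    }

# Toy case: the algorithm retrieves exactly twice the comparator.
a = np.array([2.0, 4.0, 6.0, 8.0])
c = np.array([1.0, 2.0, 3.0, 4.0])
s = off_diagonal_stats(a, c)
```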
4) Off-diagonal statistics—Composite algorithms
Quantitative analysis of composite algorithm intercomparison statistics for overpasses accumulated into meteorological categories and for all overpasses taken together. Cross-comparison statistics are calculated between the respective algorithm category composites (considering solution method, screening approach, and channel input stratifications), the all-algorithm composite, and the validation results, but only where a pair of comparators intersect in terms of areal coverage. The statistics are differentiated into land and ocean categories and stratified according to PI and common screen results as well as low- and high-resolution results. In tabulating the results, the statistical rain map and quasi-physical rain map categories in the solution method stratification are not differentiated to ensure more robust statistics. These population-matched intercomparisons allow for the calculation of a number of statistical parameters, 13 of which are generated for the purposes of this analysis: means of the reference and comparator algorithms plus the bias between means; offset of the linear regression line between the reference and comparator algorithms; standard deviations of the reference and comparator algorithms plus the bias-adjusted rms difference; percentages of raining pixels of the reference and comparator algorithms plus the percentage difference; correlation coefficient between the reference and comparator algorithms; Heidke skill score between the reference and comparator algorithms; and slope of the linear regression line between the reference and comparator algorithms. Any statistics text can be referred to for definitions of the above parameters, except possibly for the Heidke skill score, whose definition can be found in the report of Ebert (1996).
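Of the 13 parameters, the Heidke skill score is the least standard. A sketch of its usual 2x2 contingency-table form, applied to rain/no-rain occurrence, is given below; this follows the textbook definition, which we assume matches the one in Ebert (1996):

```python
import numpy as np

def heidke_skill_score(pred_rain, ref_rain):
    """Heidke skill score for rain / no-rain occurrence between a
    comparator (pred) and a reference (ref), both boolean arrays.
    Scores: 1 = perfect, 0 = no skill over chance, negative = worse
    than chance."""
    a = np.sum(pred_rain & ref_rain)        # hits
    b = np.sum(pred_rain & ~ref_rain)       # false alarms
    c = np.sum(~pred_rain & ref_rain)       # misses
    d = np.sum(~pred_rain & ~ref_rain)      # correct negatives
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    return 2.0 * (a * d - b * c) / denom if denom else 0.0

pred = np.array([True, True, False, False])
ref = np.array([True, True, False, False])
# Perfect categorical agreement gives a score of 1.
```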
Additional statistical tests were applied, including the Student’s t test for estimating the probability that samples were drawn from the same population assuming equal variance of samples, a similar test assuming unequal variance of samples, an F test for checking variance equivalence, and a combination of chi-square and Kolmogorov–Smirnov tests for evaluating the probability that the accumulated distribution properties of a pair of results were drawn from the same population. However, these tests did not prove to be useful because of the often nonnormal properties of the distributions. The comment in part 3 above concerning sampling limitations when intercomparing with validation data also applies here.
5) Difference histograms
Quantitative analysis of the histograms of individual algorithm rain-rate differences against the all-algorithm composites and validation results stratified according to land and ocean and according to PI and common screen. These calculations help identify situations in which an algorithm may match the group composite or validation result in a mean sense but produces systematic differences at specific points along the rain-rate scale.
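One way to expose such systematic differences is to bin the per-pixel differences by the comparator’s rain rate, as sketched below; bin edges and names are illustrative choices, not the PIP-2 values:

```python
import numpy as np

def mean_difference_by_rate(algo, reference, edges):
    """Mean (algo - reference) difference within reference rain-rate
    bins: a small overall bias can hide systematic offsets at particular
    points along the rain-rate scale. NaN marks missing pixels and
    empty bins."""
    both = ~np.isnan(algo) & ~np.isnan(reference)
    a, r = algo[both], reference[both]
    idx = np.digitize(r, edges)
    return np.array([
        (a[idx == i] - r[idx == i]).mean() if np.any(idx == i) else np.nan
        for i in range(1, len(edges))
    ])

# Toy case: biased high at low rates, biased low at high rates.
ref = np.array([0.5, 0.5, 5.0, 5.0])
alg = np.array([1.0, 1.0, 4.0, 4.0])
per_bin = mean_difference_by_rate(alg, ref, edges=np.array([0.0, 1.0, 10.0]))
# Low-rate bin is biased high (+0.5), high-rate bin biased low (-1.0).
```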
6) Fan maps
Quantitative analysis of the nature of the relationships between brightness temperatures and rain rates. This analysis is accomplished by producing what are called fan maps between 19–37-GHz and 19–85-GHz TB pairings (taken as unpolarized quantities), for different ranges of rain rates. Fan maps are produced for each algorithm on each of the overpasses, differentiated between land and ocean categories. These calculations associate rain-rate differences with the multispectral TB properties of pixels, regardless of which SSM/I frequencies are used in the screening or conversion portions of the algorithms. They can be used to help identify particular case overpasses and specific areas within overpass target areas where one algorithm’s retrieved rain rates are at odds with those of another, resolved into selectively defined rain-rate intervals, irrespective of either algorithm’s channel inputs.
4. Intercomparison of results
a. Image analysis
The first salient point concerning the PIP-2 intercomparisons is that validation data coverage is rarely consistent with the spatial extent of the rainfall system target. This is highlighted in Fig. 2 with three overpass examples. The first is a land-based squall line over the central United States (case 2/overpass 1), the second a combined land–ocean convective system over the north coast of Australia in the vicinity of Darwin (case 12/overpass 5), and the third a west Pacific tropical storm (Tropical Cyclone Oliver) south of the TOGA COARE study area. The top row of panels shows the all-algorithm rain rate composites from SSM/I (based on median value compositing), while the bottom row of panels shows the validation data maps. The validation data are derived from ground weather radar systems located in Kansas City and Darwin, plus a P3 aircraft used during TOGA COARE. It is evident that radar coverage in each example presents a limited and truncated view of the system of interest. Each of these overpasses is used to emphasize a different weakness with radar-based ground validation intercomparison. In the first example, radar coverage is limited to the central portion of the squall line so that in not covering the northern and southern sectors, the radar does not sample the complete rainfall gradient intrinsic to an elongated multicell squall system. In the second example, since the radar is located on the coast with Melville Island positioned in its northwest sector, much of the coverage area involves coastal pixels insofar as the SSM/I measurements are concerned. Moreover, the radar only provides a truncated view of the convection field that has an intensity range evident in the SSM/I map exceeding that which the radar has sampled.
In the third example, the radar field of view not only truncates the coverage of the storm, it is limited to an outer band region whose rain-rate properties are unlike the focal point of the storm in the inner eyewall area where deep convection and intense rain rates are located.
The three examples presented here are similar to the other radar examples excepting cases 4, 5, and 6, which use radar validation data from the FRONTIERS radar network deployed within the British Isles. For those cases, the spatial sampling is greater, but at the same time much of the radar coverage is in the vicinity of coasts, where intercomparison with SSM/I retrievals is inhibited by the mixed land–ocean pixel problem. Although examples involving rain gauge–based validation data are not shown, the problems are analogous. The limited-scale high- and medium-density gauge networks (cases 25 and 26) truncate rain system coverage, while the spatially extended low-density networks (cases 14, 15, 23, and 24) do not adequately sample the range of rain-rate intensities evident in the SSM/I-based composite analyses.
The above discussion is not intended to castigate ground radar or rain gauge systems, but to point out that there are intrinsic difficulties in using data from such systems to validate satellite rainfall algorithms at high-resolution–instantaneous space–timescales. Although the sampling limitations that PIP-2 experienced at the time the validation data were acquired have been partly mitigated by the installation of the upgraded Next Generation Weather Radar (NEXRAD) network, that system provides coverage only over the continental United States. The main point is that in conducting these intercomparisons, in the context of the SSM/I rainfall targets, in no case did a validation dataset provide representative coverage and sampling. Thus, it remains a problem in statistics to determine whether representative case-mean intercomparison results can be obtained by accumulating the patchwork of limited-view validation scenes. It will be shown that the statistics derived from the validation data intercomparisons do not exhibit stationarity, unlike the more stable algorithm-to-algorithm intercomparison statistics. This leads to the conclusion, even overlooking known problems with accuracy and precision in the validation measurements, that in assessing high-resolution–instantaneous rain rates, differences between the algorithms are below the uncertainty limits inherent in the validation data. This is due to the scale incompatibility between what ground systems “see” and what constitutes the rainfall targets as defined by a collection of scientific observers. Thus the validation data gathered for PIP-2 cannot be used in an objective sense to rank-order satellite algorithms in terms of accuracy and precision.
In Figs. 3a–c and 4a–c, examples of algorithm-derived rain-rate maps are shown for case 11/overpass 2 (winter nor’easter) and case 28/overpass 5 (west Pacific tropical storm–Tropical Cyclone Oliver). The first 20 panels of each of these three-part figures show results from the individual algorithms, whereas the last seven panels show the composite algorithms based on solution method. There are two important features apparent in the individual results. First is the dispersion in the rain-area coverage pertaining to differences in how the various algorithms detect light rain. For example, in Fig. 3a compare the BRIS result to the MSFC result, which demonstrates that although both algorithms tend to agree on the focal point of the storm (in this case an occluding midlatitude winter cyclone), the BRIS algorithm generates an extensive light rain background, whereas the MSFC algorithm generates almost no light rain background. Since both these algorithms calculate the oceanic rain-rate magnitudes from scattering-type methods based on the 85-GHz channels (see Table 4 and the appendix), the rain-area coverage differences arise from differences in how the rain detection (screening) is carried out. Other examples of rain-area coverage differences are found throughout the individual algorithm panels in both Figs. 3 and 4.
The second major feature of the individual algorithm results is that there are differences in the maximum rain rates in the focal regions of the storms, bearing on how the different algorithms convert brightness temperatures to rain rates from different mixes of channel inputs, different calibration procedures, different microphysical underpinnings, and different radiative transfer assumptions. For example, in Fig. 4a, considering an intense tropical cyclone, compare the CALVAL result to the GSCAT result, particularly along the main north–south and east–west aligned feeder bands, as well as in and around the cyclone eyewall where the most intense rain rates occur. Whereas the GSCAT algorithm generates maximum rain rates exceeding 24 mm h−1, the maximum values from CALVAL do not exceed 12 mm h−1, or less than half the GSCAT maxima. This is related to a known weakness with the CALVAL algorithm, which is an empirically calibrated statistical algorithm whose radar training dataset did not contain a sufficient frequency of high rain rates. By the same token, as will be discussed in later sections, the GSCAT algorithm, whose rain rates are derived from the 85-GHz horizontally polarized TBs, is consistently on the high end of the scale vis-à-vis the ocean algorithm group composites. This, by itself, does not denote an accuracy problem with the GSCAT results since the composite results are only a relative measure to gauge the dispersion of the individual algorithms and not an absolute calibration reference. However, it does denote that maximum intensity differences are inherent to these intercomparisons, an issue of algorithm design that needs further examination.
It is also worthwhile to examine the different group composite panels, because there are differences in these panels that relate both to the method of solution and to the mix of screening procedures used for the individual algorithms within the different solution method categories. This is best described by the rain coverage differences evident between the statistical rain map composites (STAT RM maps in upper right-hand panels of Figs. 3c and 4c) and the quasi-physical rain map composites (Q-PHYS RM maps in left-hand middle panels of Figs. 3c and 4c). Note that although both of these solution methods are statistical in nature (the former group of algorithms uses straightforward empirically formulated regression schemes applied to ground measurements, whereas the latter group makes final empirical calibrations of physically formulated algorithms with ground measurements), there are differences in the rain-area coverages. This is independent of the fact that case 11 is a winter cold-core midlatitude cyclone and case 28 is a warm-core tropical cyclone. Part of this is because the regressions intrinsic to the STAT RM algorithms are designed to pass through zero rain rate at the TB–RR conversion stage, whereas a number of the remaining algorithms do not produce continuous rain rates through zero. Note that when the STAT RM and Q-PHYS RM groups are combined (STAT), the associated rain area coverages agree well with the composite of the two physically based composites (PHYS).
The differences and similarities between composites thus have more to do with how specific features of the screening methods used by the various algorithms in different groups impose their individual signatures on the composites; these differences tend to disappear as the smaller groups are combined into larger groups and the individual screening impacts on rain-area coverages become more randomized in the composite maps. By contrast, unlike the individual algorithm results, it is not evident that there are systematic differences in the maximum rain rates within the different solution methods in going from case 11 to case 28 (also borne out by individual analysis of the remaining 54 reprocessed overpasses). Thus there is no qualitative evidence that physical methods are superior to empirical methods, the subject of discussion in Kidd et al. (1998).
b. Along-diagonal statistics
Quantitative summaries of algorithm performance are shown in Figs. 5a,b, illustrating the main along-diagonal statistical results for both ocean and land cases, each stratified according to whether PI or common screening has been applied. The four panels in each figure illustrate four variables: (a) percentages of target pixels processed (%calc), (b) percentages of processed pixels detected as raining (%rain), (c) area-averaged rain rates over the processed target areas (average rain), and (d) rain-only area-averaged rain rates (rain only ave.). Results are presented for the 20 individual algorithms, the composite-all results, and the validation data results. The ocean intercomparison results given in Fig. 5a reveal six important features:
there are variations in the percentages of pixels that individual algorithms process (%calc), related to differences in how individual algorithms detect bad data and impose coastal masks;
there are variations in percentages of processed pixels detected as rain (%rain) by the individual algorithms operating with the PI screens, variations associated mainly with differences among the various screening techniques, and variations that diminish when the common screen is applied;
there is a factor of 5–6 difference between minimum and maximum averaged rain rates of individual algorithms (e.g., GSCAT vs CalVal), regardless of which screening procedure is used and regardless of whether all-target or rain-only area averaging is considered;
although some algorithms exhibit little change between their PI and common screen results in terms of averaged rain rates, other algorithms exhibit significant differences, highlighted by an approximate 3 to 1 increase in the rain-only area average for the composite-all result in going from PI to common screen (although the changes in the all-target area average are small), again emphasizing the influence of screening differences among individual algorithms;
the ratios of rain-only area-averaged rain rates to all-target area-averaged rain rates vary from algorithm to algorithm, but there is a generally dramatic increase in the ratio in going from PI screening to common screening, manifested in the composite-all results in which the ratio goes from less than 2 to 1 for the PI screen results to some 5 to 1 for the common screen results, another facet of the underlying screening differences; and
for average rain rates, the individual algorithm results are nearly all biased high with respect to the validation results, but since the validation data coverages of the cumulative target area for the 50 ocean overpasses are less than 10% for both the PI and common screen results, these biases are not statistically significant.
Overlooking the differences, it is worth considering that there are underlying consistencies in the results, the most important being the general reduction in the percentage of processed pixels assigned to rain in going from PI to common screening, together with the associated small changes in the all-target area averages and the generally large increases in the rain-only area averages when the screening procedure is altered. The explanation for these systematic changes is that there are generally significant differences in how individual algorithms detect light rain, differences that have major impact on the rain-area coverage and rain-only area-averaged rain rates but minor impact on total area-averaged rain rates. Otherwise, with the exception of a few of the algorithms (identified in the next section), the individual algorithms tend to preserve their relative relationships to one another in going from PI to common screening, and in how they would be ranked in going from all-target area-averaging to rain-only area-averaging. It remains to be determined why there is such a large range of variability between the individual algorithms’ averaged rain rates, regardless of the averaging area selected. It is evidently not due to the light rain end of the spectrum, otherwise the all-target area averages would exhibit more dramatic changes. This issue is considered in more detail in sections 4d and 4e in conjunction with the large rain-rate end of the spectrum.
For the land case (Fig. 5b), which had only one fewer overpass considered (49 instead of 50), most of the same features observed in the ocean results remain. Only 17 algorithms report rain since the BERG, RSS, and TAMU algorithms do not contain a land module (see Table 4). The most notable change in the intercomparison relationships is that the amount of validation data that survives after the bad data detection and coastal masking procedures of the common screen are applied represents only about half of the original validation pixels that intersect the targets. This is not surprising since the percentages of pixels processed for the individual algorithms undergo a systematic drop of some 30% after the common screen is employed, related to the conservative masking procedure employed for coastal pixels. However, similar to the ocean case, the percentages of validation pixels processed, regardless of whether the PI or common screen is used, are much smaller than even the smallest percentage value coming from the individual algorithms (in this case the OLSON algorithm). Thus, direct statistical comparisons between the along-diagonal algorithm and validation rain-rate averages are meaningful only to the extent that the validation sample represents the algorithm sample. The other notable difference between land and ocean intercomparison relationships is that the ratio between rain-only and all-target area-averaged rain rates increases from about 2 to 1 to about 8 to 1 in going from PI to common screening (it went from 2 to 1 to 5 to 1 for ocean), indicating that screening differences among land algorithms are even more significant in establishing the light rain threshold.
c. Off-diagonal statistics—Individual algorithm analysis
The off-diagonal intercomparison statistics are summarized in Tables 7a–d. Here it is important to recognize that the individual algorithm results are intercompared to both the validation results and the composite-all results, and that the portions of the target areas incorporated into the intercomparisons are those in which valid pixels of an algorithm and its associated comparator variable intersect. Obviously, given the dearth of validation data over the target areas, the composite-all intercomparisons are more statistically robust in terms of sample sizes than the validation data intercomparisons. Each of the four tables contains two subtables. The upper subtables consider the intercomparisons to the validation data; the lower ones consider the composite-all intercomparisons. Tables 7a and 7b consider the PI and common screen results for the ocean case. Tables 7c and 7d consider the same results for the land case.
In a given subtable, the algorithms are identified under the name column (second column) and ordered (first column) according to their all-target area-averaged rain-rate magnitudes obtained from the along-diagonal analysis. For example, in the upper subtable in Table 7a, the algorithm assigned 1 is the IFA–SAP algorithm, which had indicated the lowest all-target area-averaged rain rate in the top panel of Fig. 5a (i.e., 0.25 mm h−1). The algorithm in the 20th position for the upper subtable is GPROF, based on its along-diagonal all-target area-averaged rain rate of 1.34 mm h−1. In the third column (immediately to the right of the algorithm name) are the numerals of a second ordering based on rain-only area averaging. If an algorithm’s position between the first and second orderings changes by more than four places, an asterisk is placed to the right of the name, denoting that the algorithm is sensitive to this ordering procedure. With the exception of FER–AVE, the algorithms that are sensitive for the PI-screened ocean results (IFA–SAP, MSFC, FSU, and FER–AVE) tend to report the lowest percentages of raining pixels (lowest rain coverages), indicating the tightest screens and thus the greatest susceptibility to changes in going from all-target to rain-only area averages.
In considering the common screen results, the same dual ordering procedure is used, but in this case (as evident in Table 7b), if an algorithm’s original position in the first ordering based on PI screen results changes by more than four positions in the first ordering of the common screen results, a “#” sign is placed to the right of the numeral in the first column. Note for the ocean case, only the IFA–SAP and BAUER algorithms are sensitive to the screening change. In fact, the IFA–SAP algorithm moves from position 1 for the PI screen results to position 20 for the common screen results, indicative of an algorithm whose PI screen is very tight, but whose TB–RR conversion screen produces relatively large rain rates. In terms of sensitivity to the all-target versus rain-only ordering procedure, only the MSFC and OLSON algorithms change more than four positions between the two orderings for the common screen results (denoted by asterisks in Table 7b).
Following the name and position order columns are the actual intercomparison statistics. They include the mean rain rates in mm h−1 for the algorithm pixels and the associated validation or composite-all pixels lying within the intersection areas (Alg and Val or Com All), the bias, the bias-adjusted rms, the bias ratio (bias rat, given by the ratio of the bias to the validation or composite-all average), the means ratio (means rat, given by the ratio of the algorithm average to the validation or composite-all average), the rank of means rat in terms of its absolute difference from 1.0 (1.0 designates perfect agreement), the bias-adjusted rms ratio (rms rat, given by the ratio of the bias-adjusted rms to the validation or composite-all average), the rank of rms rat from smallest to largest (0.0 designates perfect agreement), the correlation coefficient (CC), and the rank of CC from largest to smallest (1.0 designates perfect agreement).
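As an aid to interpretation, the statistics just defined can be sketched as follows (a hypothetical Python implementation; the function and variable names are illustrative and are not part of the PIP-2 processing code):

```python
import numpy as np

def intercomparison_stats(alg, ref):
    """Off-diagonal statistics for one algorithm against a comparator
    (validation or composite-all) over the intersecting pixels.

    alg, ref : 1-D sequences of rain rates (mm/h) on matched pixels.
    """
    alg = np.asarray(alg, dtype=float)
    ref = np.asarray(ref, dtype=float)
    bias = alg.mean() - ref.mean()
    # bias-adjusted rms: rms of the pixel differences after removing the bias
    rms_adj = np.sqrt(np.mean((alg - ref - bias) ** 2))
    return {
        "alg_mean": alg.mean(),
        "ref_mean": ref.mean(),
        "bias": bias,
        "rms_adj": rms_adj,
        "bias_rat": bias / ref.mean(),         # bias ratio
        "means_rat": alg.mean() / ref.mean(),  # means ratio (1.0 = perfect)
        "rms_rat": rms_adj / ref.mean(),       # rms ratio (0.0 = perfect)
        "cc": np.corrcoef(alg, ref)[0, 1],     # correlation coefficient
    }
```

A means ratio of 1.0, an rms ratio of 0.0, and a CC of 1.0 would indicate perfect agreement with the comparator variable.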
The last two columns provide a scoring system in which the rank values for the means ratio, bias-adjusted rms ratio, and correlation coefficient are summed to yield a final score (F Scr), with the final score ranked from smallest to largest and denoted by F Rnk. This scheme gives each algorithm an intercomparison performance factor relative to the other algorithms in terms of how its results compare to the comparator variates (i.e., the validation or composite-all results). This is the “winner–loser” selection procedure adopted by the PIP-1 project, although it does not provide any type of absolute measure of accuracy or precision of the individual algorithms, since neither the validation data nor the composite-all results represent absolute calibration standards. However, the scoring procedure is helpful in quantifying an algorithm’s relative proximity to or distance from a comparator variable.
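The rank-sum scoring can be sketched as follows (again hypothetical code; ties in the rankings are broken arbitrarily here, whereas a production implementation might average tied ranks):

```python
import numpy as np

def final_scores(means_rat, rms_rat, cc):
    """Sum the three ranks into a final score (F Scr); a smaller score
    indicates closer proximity to the comparator variable.

    means_rat is ranked by |means_rat - 1| (smallest = rank 1),
    rms_rat from smallest to largest, and cc from largest to smallest.
    """
    def rank(values):  # rank 1 = best; ties broken arbitrarily
        order = np.argsort(values)
        r = np.empty(len(values), dtype=int)
        r[order] = np.arange(1, len(values) + 1)
        return r

    means_rat = np.asarray(means_rat, dtype=float)
    return (rank(np.abs(means_rat - 1.0))
            + rank(np.asarray(rms_rat, dtype=float))
            + rank(-np.asarray(cc, dtype=float)))  # negate: largest cc first
```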
In considering the ocean case and comparing algorithms to the validation data for the PI screen results (upper subtable of Table 7a), all correlations are small, lying between 0.0 (BERG) and 0.33 (RSS and FER–AVE); the means ratios extend from 1.1 (IFA–SAP) to 4.49 (TAMU); and the bias-adjusted rms ratios (similar to a coefficient of variation) extend from 1.84 (OLSON) to 4.27 (KIDD). The five algorithms with the highest scores consist of one physical profile (OLSON), one statistical rain map (CALVAL), and three physical rain map (GSFC, FER–AVE, LIU–CUR) designs. The five algorithms with the lowest scores consist of one quasi-physical rain map (GSCAT), two statistical rain map (BERG, KIDD), and two physical rain map (PETTY, TAMU) designs. Thus, no single solution method stands out in terms of winners or losers.
The fact that the CALVAL algorithm exhibits a means ratio of 1.74 (with a rank of 5) reveals a serious problem with accepting the validation averages as a measure of ground truth. Note that it was established independently that the CALVAL algorithm underestimates rain rates greater than 5 mm h−1 because its training dataset lacked an appropriate sample of intense rain rates (reinforced by the results in the upper subtable of Table 7b for the common screen intercomparisons, in which the CALVAL algorithm indicates the smallest means ratio of all 20 algorithms). Therefore, it follows that the validation measurements for the ocean case underestimate the actual rain rates, assuming that the radar samples contained within the training dataset for the CALVAL algorithm were accurately calibrated. Given this, all algorithms with means ratios in the vicinity of 1.74 or less, even though they will score high on the basis of their means ratios, are underestimating the true rainfall. For algorithms with means ratios significantly greater than 1.74, nothing can be firmly established since there is no way of determining the true bias of the validation data.
On the premise that the bias error in the validation data must amount to a scale factor between 1.74 and some larger value, a value of 2.0 is assumed to remain consistent with the characteristic bias between the AIP-3 radar validation data and the mean satellite results. Based on this assumption, the adjusted means ratios would then range from 0.55 to 2.25 with 19 of the algorithms indicating adjusted means ratios between 0.55 and 1.9, and 13 of 20 adjusted ratios exceeding 1.0. The significance of this exercise is that 19 of the 20 algorithms would then have bias uncertainties below that of the estimated uncertainty in the validation data, that is, below 100% either high or low.
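For concreteness, the adjustment amounts to dividing each means ratio by the assumed validation scale factor; a minimal sketch using the end points quoted from Table 7a:

```python
# Assumed scale-factor bias of 2.0 in the ocean validation data (see text);
# 1.1 and 4.49 are the raw means-ratio end points from Table 7a.
scale = 2.0
low = 1.1 / scale    # adjusted means ratio of ~0.55
high = 4.49 / scale  # adjusted means ratio of ~2.25
```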
With this adjustment, the adjusted means ratios are highly consistent with the means ratios of the set of 20 comparison algorithms participating in the AIP-3 project. Of these 20, 15 are equivalent to algorithms used in PIP-2 (CALVAL, GPROF, MSFC, OLSON, and RSS from PIP-2 are not represented in the AIP-3 set), while five were used in AIP-3 but not in PIP-2 (BA3, FE4, HA1, PR1, and SM2). In the AIP-3–PIP-2 vernacular, the 15 algorithm matchups are as follows: AD1–GSCAT, AO1–MRI, BA1–BRIS, BA2–KIDD, BE1–BERG, CH1–GSFC, FE1–NESDIS, FR1–FER–AVE, KM1–KUM, LI1–LIU–CUR, MZ1–IFA–SAP, PE1–PETTY, SC1–BAUER, SM1–FSU, and WI1–TAMU. [The nine AIP-3 algorithms excluded from the total set of 29 SSM/I entries consist of seven special case entries from multiple-entry groups (AO2, BA0, BA4, BA5, FE2, FE3, and PE2) that did not survive after AIP-3 and two (IA1, IA2) that were severe outliers because of now-known problems.] Using the “All Cruises Combined Dataset,” the AIP-3 means ratios range from 0.71 to 2.0, with 17 of 20 ratios exceeding 1.0 (see p. 50 of Ebert 1996). It should be emphasized that the assumption that the CALVAL results can be used to show that the PIP-2 ocean radar validation data are low biased is subject to debate.
From examination of the lower subtable of Table 7a, in which the PI screen ocean intercomparisons to composite-all results are given, it is evident that final rankings change, although some of the algorithms that scored high in the validation data intercomparisons continue to score high in the composite-all intercomparisons. However, in this case, their high scores are derived from top rankings in the bias-adjusted rms ratio and correlation coefficient columns (correlation coefficients now generally run high, ranging between 0.60 and 0.94 for all but two algorithms), whereas for the validation comparisons, the high scoring algorithms obtained their top rankings in means ratios and bias-adjusted rms ratios (while exhibiting mid to low correlation coefficient rankings). As expected, because all algorithms exhibit means ratios greater than 1.0 for the validation intercomparisons, the ones that ranked high in means ratios for the validation intercomparisons necessarily indicate mid to low rankings in means ratios for the composite-all case.
A more important facet of the lower subtable in Table 7a is that the set of means ratios ranges from approximately 0.45 to 1.65, with 13 of 20 algorithms indicating means ratios between ∼0.7 and 1.3 (note for AIP-3, 15 of 20 indicate means ratios between ∼0.7 and 1.3). This would indicate that if the composite-all results for ocean are taken as truth, the bias uncertainties for many algorithms are within approximately ±30%, with PIP-2 precisions generally ranging from 35% to upward of 100%. Although the PIP-2 composite-all results do not represent truth, the distributions of both the PIP-2 composite-all and AIP-3 means ratios are consistent with that of the PIP-2 adjusted means ratios (i.e., from the validation intercomparisons assuming the scale factor error of 2.0 applies).
Repeating this analysis for the Table 7b ocean common screen results indicates that with a couple of exceptions (BAUER and IFA–SAP), algorithms that scored high or low in the PI screen case continue to score high or low in the common screen case for both the validation and composite-all intercomparisons, although the exact rankings are not preserved (denoting another outcome of screening differences in algorithm intercomparison analysis). As before, the means ratios for the composite-all intercomparisons range between approximately 0.45 and 1.65, with 11 of 20 algorithms exhibiting means ratios between ∼0.7 and 1.3. Also consistent with the PI screen results, using an estimated scale factor error of 2.0 for the validation data, are the adjusted means ratios ranging from 0.53 to 2.19 (compare to the 0.55–2.25 range from the PI screen results).
In examining the land results, it is evident that different sets of algorithms belong to the winners and losers groups. For the intercomparisons to validation data using the PI screen (upper subtable of Table 7c), the FER–AVE, NESDIS, MSFC, GPROF, and BRIS algorithms exhibit the highest final rankings, while the BAUER, KIDD, LIU–CUR, GSCAT, and GSFC algorithms exhibit the lowest. These groups are mostly unperturbed in going to the common screen results (upper subtable of Table 7d), although the GSCAT algorithm migrates from the winners to the losers group and the IFA–SAP algorithm falls from a middle-ranked position to the bottom-ranked position. Moreover, the algorithms that scored highest in the validation intercomparisons tend to score highest in the composite-all intercomparisons (lower subtables in Tables 7c,d). However, in contrast to the ocean case, where the highest-ranking algorithms achieved that performance by generally obtaining high individual rankings in two of the three ranking categories, the high-ranking land algorithms as a group do not outrank the other algorithms in the individual rankings.
Although a number of additional features noted in the ocean intercomparisons continue to hold for the land intercomparisons, there are various differences. The important ones are:
the ordering is volatile for the PI screen results when switching between all-target area averaging and rain-only area averaging (nine algorithms are marked with asterisks denoting position order changes greater than four), but completely stable for the common screen results (no algorithms change position by more than four places between the two orderings);
there is additional volatility in the ordering when switching between PI screen results and common screen results as four algorithms are marked with a “#” sign (only two for ocean); and
the distribution of means ratios associated with the PI screen validation data intercomparisons ranges from well below 1.0 (0.24) to around 2.0 (2.04), in contrast to the ocean distribution for which all unadjusted ratios were greater than 1.0, with the total range narrowing slightly in moving to the common screen results (0.25–1.68).
These three results lead to two important conclusions about the land intercomparisons. The first is that they are more sensitive to screening differences, a feature consistent with the along-diagonal statistical analysis. The second is that there does not appear to be a systematic bias to the validation data, as found with the ocean results. In this context, the distributions of means ratios for the PI and common screen validation intercomparisons are consistent with the associated composite-all intercomparisons, as the range limits are nearly identical in both cases. Supporting the notion that this is not a fortuitous result, but an outcome of the fact that radar systems are generally better calibrated over land than ocean, is the fact that the CALVAL algorithm, which is the one algorithm with a known property of underestimating rain rate, exhibits the lowest validation means ratios—0.24 and 0.25 for PI and common screen results, respectively.
d. Off-diagonal statistics—Composite algorithm analysis
The off-diagonal intercomparison statistics for the composite algorithm analysis were prepared as four two-part tables. The first of these is presented in Tables 8a,b, containing the results for all reprocessed ocean overpasses (all rain systems) based on the PI screen and using low-resolution SSM/I measurements. Table 8a shows the intercomparisons between the validation data and the various algorithm composites [in terms of the 13 different statistical measures discussed in section 3f(4)]. The 11 columns correspond to the different composite algorithms that are used as comparators to the validation data. The first four columns (1–4) contain the solution method groupings, that is, physical rain map (PHY-RM), physical profile (PHY-PRO), the first two categories combined (PHY), and the combined quasi-physical and statistical rain map categories (STAT). The next three columns (5–7) contain the channel input groupings, that is, emission (EMIS), scattering (SCAT), and mixed emission–scattering (MIXED). The eighth, ninth, and tenth columns contain the screening approach groupings, that is, heavy (HEAVY), light (LIGHT), and intrinsic (NONE). Finally, the 11th and last column contains the all-algorithm composite.
Table 8b shows the intercomparisons of different algorithm composites with the separate solution method, channel input, and screening approach stratifications (using the same 13 statistical variates). As indicated, the sample populations for the composite–composite intercomparisons consist of approximately an order of magnitude more pixels than the validation–composite intercomparisons. Corresponding tables for the common screen ocean case and for both the PI and common screen land cases are discussed but not shown. The common screen produces improvements in the % difference, correlation, skill score, and slope factor statistics. Similar tables have been prepared for each of the different meteorological categories used for ocean and land, with the entire set of tables reproduced using high-resolution results. Results from these additional tables are also referred to in the discussion below but not shown. The main results gleaned from this analysis are discussed in the following sections.
1) Validation data intercomparisons
As noted, there are greater than 2:1 differences between the satellite results and validation results for the ocean case regardless of the screening procedure (Table 8a) but virtually no overall bias for land (not shown). There are four additional points of note: 1) rain-area coverages (% rain variates) for the composites change significantly between the validation–composite intercomparisons and the composite–composite intercomparisons; 2) mean rain rates (mean variates) for the composites also change significantly between the two types of intercomparisons; 3) the percentage differences between rain-area coverages and mean rain rates for the composites found in the validation intercomparisons (i.e., the smaller pixel populations) change significantly in the composite–composite intercomparisons (the larger pixel populations); and 4) the validation–composite intercomparisons are sensitive to retrieval resolution, whereas the composite–composite intercomparisons are not; that is, there are 10%–20% changes between the low-resolution validation–composite differences and the high-resolution differences, but only 0%–4% changes for the composite–composite differences. Taken together, these results demonstrate that regardless of the underlying bias between the satellite results and the validation results (almost negligible for land but approximately 2:1 for ocean), the intercomparison statistics are not preserved in the two sets of populations. This signifies that the validation data, even if perfect measures, do not provide representative and stable intercomparison statistics. Thus, they cannot serve in this type of intercomparison analysis as a final quantitative arbiter of algorithm accuracy.
2) Intercomparison by solution method, channel input, and screening
It is evident from part b of Table 8 and the tables not shown that for the different category comparisons within the three algorithm stratifications (according to solution method, channel input, or screening approach), there is generally some degree of rain-rate difference as quantified by the main intercomparison statistics (bias, bias-adjusted rms, and correlation). However, close inspection of the ocean results for both PI and common screens indicates generally small biases between the various categories for both the solution method and screening approach stratifications but fairly large biases when stratified by channel input. These greater differences are borne out by the generally larger bias-adjusted rms’s, smaller correlations, and greater deviations of the slope factors from 1.0. This is important because it indicates that when considering all types of rainfall, it is the specific channel selection for use in the TB–RR conversion schemes that largely governs differences in rain-rate magnitudes between ocean algorithms, not the philosophy behind the algorithm or the method used to screen between rain and no-rain; see also Janowiak et al. (1997).
This is not so for the land results, where differences in rain-rate magnitudes are generally greatest between the screening approach categories. Note that for land backgrounds, there are no purely emission algorithms, and in essence, all algorithms are governed mostly by the scattering signatures produced at the higher frequencies. Thus for land algorithms the screening scheme not only governs the rain-cover statistics, it also governs the statistically averaged rain rates, unlike ocean algorithms where screening only has a significant effect on rain-cover statistics.
3) Intercomparison by meteorological stratification
In this analysis we focus only on differences between the composite algorithms in the various stratifications, ignoring the validation data since the pixel populations of individual meteorological categories are too small for representative statistics. The intercomparisons duplicate the all-weather system results in that for ocean rainfall, the channel input stratification produces the greatest overall differences for the various meteorological categories, whereas for land rainfall, the composite differences are greatest within the screening approach stratification.
In addition, there are distinct differences in the intercomparisons associated with meteorological category when considering the solution method. In the case of the ocean algorithms, the physical and statistical algorithms compare closely for the midlatitude cyclone and stratus cases, whereas differences crop up for the tropical cyclone and convective system cases. The greatest differences are between the physical profile and statistical algorithms for tropical cyclones. For the land algorithms, there are also two meteorological classes in which physical–statistical algorithm differences are negligible (the midlatitude cyclone and tropical cyclone cases) and two classes in which they are significant (the squall line and convective system cases). Again, it is the physical profile and statistical algorithm composites that exhibit the greatest differences. In general, the intercomparisons are insensitive to which type of screening is used.
These results suggest that in attempting to understand why different algorithms produce different rain-cover and rain-rate magnitudes, it is necessary not only to discriminate between the screening approach and the set of frequencies used for the TB–RR conversion process, but also to pay attention to the solution method for certain types of rainfall, particularly systems with embedded convection. Although not easy to prove with the low-resolution results alone, this may relate to the fundamental difference in the way physical and statistical algorithms treat inhomogeneous beam filling. Notably, the beam-filling problem is more severe for convective situations (where rain rate varies at small space scales and timescales) than for stratus systems, frontal bands, and slope convection, where rain patterns are more uniform in space and time.
The basic difference between physical and statistical algorithms with regard to beam filling is that statistical methods account for beam filling intrinsically through the regression coefficients used to calibrate to ground measurements, whereas for physical methods there is no consistent treatment. Some physical algorithms do not account for beam filling explicitly, some apply spatial enhancement operators to the low-resolution channels to reduce the inhomogeneity problem, and others use theoretically based correction procedures. Therefore, when the spatial scales of the rain cells fall below the beam sizes of the various SSM/I channels, the underlying differences between the various algorithm methods can be amplified by inconsistencies in how beam filling is addressed.
4) Intercomparison by resolution stratification
It was noted above for the all-weather classification that the validation–composite intercomparisons are sensitive to retrieval resolution whereas the composite–composite intercomparisons are largely insensitive. This result supports our conclusion that the validation data do not provide representative statistics for intercomparisons at these scales. The all-weather composite–composite intercomparisons nonetheless change by up to 4% when switching between low and high resolution.
The suggestion here is that the average behavior of the statistical algorithms is less sensitive to resolution since all statistical algorithms intrinsically account for beam-filling effects in deriving regression coefficients. In fact, that is what is found in the results. The means of the statistical algorithm composite for the different oceanic meteorological classes never change by more than 0.01 mm h−1 in switching from low to high resolution (usually not changing at all), whereas the means for the two physical algorithm categories change by up to 0.07 mm h−1. This extends our conclusion that inconsistencies in how physical algorithms handle beam filling contribute to the underlying difference between physical and statistical algorithms.
5) Impact of rain-rate cutoff on intercomparisons
An important element of the differences between algorithms stems from variations in detection of rain area due to different screening techniques and the use of low-end cutoffs by various algorithms (see the appendix). Therefore, it is worth examining the effect of a consistently imposed cutoff on the intercomparison statistics. Figure 6 illustrates the baseline discrepancies in rain area by presenting coverage percentages for each of the 20 algorithms according to the different meteorological classifications for ocean and land, based on all the overpasses used for the reprocessing calculations. It is clear that the variance in rain-area coverage stemming from the different algorithms can be significant and depends on the meteorological classification.
The question arises as to the sensitivity of the intercomparison statistics to the degree of consistency in the minimal level of detection and the assignment of light rain rates. To test for sensitivity to rain-coverage differences, we consider the statistical intercomparison measures as a function of increasing rain-rate cutoff. In general, increasing the cutoff above zero brings the different algorithms into agreement on rain-area coverage once a threshold is reached with which all algorithms comply. Proceeding on this basis, we compare the rain-rate results of different algorithm pairs for different meteorological categories over cutoff values Cv ranging from 0 to 10 mm h−1. When a given cutoff value Cv is applied, every algorithm-derived rain rate less than Cv is set to zero before the results are intercompared. The statistical measures used in the test are 1) correlation coefficient, 2) Heidke skill score, 3) bias, 4) bias-adjusted rms, and 5) difference in percentage of raining pixels between two algorithms.
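A sketch of this cutoff test follows (hypothetical Python; the Heidke skill score is computed here from the standard 2 × 2 rain/no-rain contingency table, which is assumed to be the formulation used):

```python
import numpy as np

def cutoff_stats(a, b, cv):
    """Zero out rain rates below the cutoff cv for both algorithms, then
    compute the five test statistics used in the sensitivity analysis."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = np.where(a < cv, 0.0, a)
    b = np.where(b < cv, 0.0, b)
    rain_a, rain_b = a > 0, b > 0
    # 2x2 rain/no-rain contingency table for the Heidke skill score
    hits = np.sum(rain_a & rain_b)
    miss = np.sum(~rain_a & rain_b)
    false = np.sum(rain_a & ~rain_b)
    nulls = np.sum(~rain_a & ~rain_b)
    n = hits + miss + false + nulls
    expect = ((hits + miss) * (hits + false)
              + (nulls + miss) * (nulls + false)) / n  # chance agreement
    hss = 1.0 if n == expect else (hits + nulls - expect) / (n - expect)
    bias = a.mean() - b.mean()
    rms_adj = np.sqrt(np.mean((a - b - bias) ** 2))
    return {
        "cc": np.corrcoef(a, b)[0, 1],
        "hss": hss,
        "bias": bias,
        "rms_adj": rms_adj,
        "pct_rain_diff": 100.0 * (rain_a.mean() - rain_b.mean()),
    }
```

Sweeping cv over 0–10 mm h−1 and plotting each statistic reproduces the style of analysis summarized in Fig. 7.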
The four panels of Fig. 7 present a typical set of results in which the composite of all physical algorithms is intercompared to the composite of all statistical algorithms. The intercomparisons are stratified into ocean and land categories, considering all 56 overpasses of the reprocessed datasets. For the ocean pixels, the correlation coefficient increases from 0.8908 with no cutoff to a maximum of 0.8911 at a cutoff of 0.15 mm h−1 and then falls off. The Heidke skill score increases from 0.65 at no cutoff to a maximum of 0.86 at a cutoff of 0.5 mm h−1 and then falls off. The bias fluctuates without showing any strong pattern because positive and negative differences continue to cancel as the cutoff increases. The bias-adjusted rms starts with a value of 0.86 mm h−1 at no cutoff and then generally increases with increasing cutoff, because the larger squared differences dominate the rms as the cutoff increases. The difference in the percentage of raining pixels starts with a value of −8% at no cutoff, then approaches zero and remains there once a cutoff is reached that all algorithms can agree on as a minimum detectable rain rate. Each of the five statistics for land follows the same general pattern as for ocean.
Results of these comparisons are very similar regardless of the algorithm pair, meteorological category, or land–ocean classification. In general, the correlations increase slightly or remain constant up to a cutoff value of just under 1.0 mm h−1, then drop off. The Heidke skill scores almost always increase sharply to a maximum at a cutoff value between 0.1 and 3.0 mm h−1, and then drop off. The biases fluctuate in a somewhat random fashion. The bias-adjusted rms’s remain constant or increase as the cutoff is increased. The difference in percentage of raining pixels approaches zero as increasing cutoffs are applied.
It is apparent that imposing a light rain cutoff increases agreement on which pixels are raining; that is, skill scores improve dramatically and correlations improve slightly. In essence, the intercomparison statistics are variably sensitive to the minimum level of detectability, and there is an optimum cutoff near 1 mm h−1 when intercomparing many algorithms.
e. Difference histograms
To understand the distributions of rain-rate differences with respect to either the validation data or the composite-all algorithm, rain-rate difference histograms for the 20 individual algorithms are analyzed separately. An example of a set of these histograms for the FER–AVE algorithm is shown in Fig. 8. Eight panels are prepared for each algorithm, since the histograms consider both validation data (VAL) and the composite-all algorithm (ALL) as comparators, stratify according to ocean and land, and consider both PI and common screen results (PI and CS). The histograms consider only pixel pairs within the intersections of the reprocessed overpasses in which at least one of the rain rates from each pair exceeds 1 mm h−1. In other words, light rain rates have been excluded from the analysis. This is done to prevent the relatively large number of small rain rates from dominating the statistics and to focus specifically on differences between the medium to large rain rates. Presented numerically below each histogram panel are the average values of the pixels from the specific algorithm and the comparator algorithm used to form the differences, as well as the bias, correlation coefficient (corr), and bias-adjusted rms factors between these two sets of pixels. Table 9 summarizes the analysis. The algorithms are grouped into three categories depending on whether the bias is large negative (<−1.5 mm h−1), small to medium (between ±1.5 mm h−1), or large positive (>+1.5 mm h−1). The PI screen results are used for the grouping, with ocean and land categories considered separately. Footnotes indicate changes in grouping assignments that would occur if common screen results were used instead of PI screen results.
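The pixel-pair selection rule just described can be sketched as follows (hypothetical code; the array names are illustrative):

```python
import numpy as np

def difference_sample(alg, comp, threshold=1.0):
    """Keep only pixel pairs in which at least one rain rate exceeds the
    light-rain threshold; return the differences for the histogram."""
    alg = np.asarray(alg, dtype=float)
    comp = np.asarray(comp, dtype=float)
    keep = (alg > threshold) | (comp > threshold)
    return alg[keep] - comp[keep]
```

The returned differences are then binned into the histogram, with the bias, correlation, and bias-adjusted rms computed from the same retained pairs.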
It is evident from Table 9 that 14 of 20 ocean algorithms exhibit a large positive bias with respect to the validation data, but only 4 of 17 land algorithms exhibit a large bias (positive or negative). Both results are consistent with what has been noted previously when all intersecting pixels are considered. For the composite-all results, only 5 of 20 ocean algorithms fall outside the small-medium grouping, while for land algorithms, only 4 of 17 fall outside the same grouping. This denotes a high degree of consistency between the algorithms in terms of bias.
Regardless of the general bias consistency between the individual algorithms, careful inspection of the entire set of histogram differences indicates significant rain-rate differences between individual pixels, up to and exceeding 20 mm h−1. This was alluded to in section 3a when considering various examples of individual overpass rain-rate maps, and it demonstrates the main source of rms difference between algorithms, that is, inconsistency at the large rain-rate end of the scale. It is also evident from this set of figures that the distribution properties of the difference histograms are variable, ranging from Gaussian, to skewed left or right, to multimodal. The kurtosis factors (fourth moments) are also highly variable, with some histograms exhibiting tight clustering near zero difference and others exhibiting large spreads.
The foremost inference from the histogram analysis is that the major cause of algorithm differences with regard to TB–RR conversion stems from discrepancies occurring at large rain rates. It is also clear that in a number of instances, these discrepancies would be missed if the intercomparisons only considered averages and the difference histograms were Gaussian with near-zero means. Therefore, the large rain-rate spectra should be given closer attention in the future in seeking convergence among the algorithms and optimal designs for combined algorithms.
f. Fan maps
To illustrate and explain why algorithms behave differently and why the greatest differences among the algorithm composites occur when the channel input stratification is used, we consider the transform relationships between TBs and RRs on a pixel-by-pixel basis. As described in section 3f, a fan map encompasses the TB behavior of all pixels for a given algorithm and given overpass (stratified by land or ocean) whose rain rates fall within a specified rain-rate range, regardless of which channels are used as input. The mapping consists of one bundle of lines connecting the 19-GHz unpolarized TBs (19Us) on the left-side abscissa to the corresponding 37Us on the left-hand ordinate, and another bundle of lines connecting the 19Us on the right-side abscissa (mirror image of the left abscissa) to the corresponding 85Us on the right-hand ordinate. Seven rain-rate ranges (categories) are used to generate seven separate fan maps, forming the transform relationships for all raining pixels, for a given algorithm, for the land or ocean area of an overpass.
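The binning that underlies a fan map can be sketched as follows. The category edges at 2.5, 7.5, 12.5, 17.5, and 25 mm h−1 follow from the categories quoted in the text (e.g., category 4 is 7.5–12.5 mm h−1, category 7 is >25 mm h−1); the 5.0 mm h−1 split between categories 2 and 3 is an assumption for illustration.

```python
import numpy as np

# Rain-rate category edges (mm/h); the 5.0 split is assumed, the rest
# are implied by the categories quoted in the text.
CATEGORY_EDGES = [0.0, 2.5, 5.0, 7.5, 12.5, 17.5, 25.0]

def fan_map_bundles(rr, tb19u, tb37u, tb85u):
    """Group raining pixels into the seven rain-rate categories and
    collect, per category, the (19U, 37U) and (19U, 85U) endpoint pairs
    that form the left- and right-hand line bundles of one fan map."""
    rr, t19, t37, t85 = (np.asarray(x, dtype=float)
                         for x in (rr, tb19u, tb37u, tb85u))
    cat = np.digitize(rr, CATEGORY_EDGES)  # categories 1..7
    bundles = {}
    for k in range(1, 8):
        m = (cat == k) & (rr > 0)          # raining pixels only
        if m.any():
            bundles[k] = {"19U-37U": list(zip(t19[m], t37[m])),
                          "19U-85U": list(zip(t19[m], t85[m]))}
    return bundles
```

Plotting each bundle as line segments between the abscissa and ordinate positions then reproduces the fan-map construction described above.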
Figure 9 presents the results for nine algorithms based on the ocean area of case 11/overpass 2 (winter nor’easter). Note that 19U–37U and 19U–85U relationships are given for each algorithm, regardless of which channels are used in the associated TB–RR conversion procedure. The foremost result from this selection of fan maps is that the maximum rain-rate categories associated with the different algorithms are different. For example, compare FER–AVE and LIU–CUR, whose rain rates do not exceed category 4 (7.5–12.5 mm h−1), with GSCAT and NESDIS, whose rain rates extend to category 7 (>25 mm h−1), a difference in maximum rain rates exceeding a factor of 2. Next note the varying positions and thicknesses of the bundles of lines originating from the abscissas and terminating at the ordinates. Were two algorithms to perform similarly, they would exhibit similar line bundle properties for each rain-rate category.
For example, compare the scattering algorithm GSCAT, which only uses 85 GHz in its TB–RR conversion scheme, to the emission algorithm TAMU, which is dominated by 19 GHz in its conversion scheme. The GSCAT fan maps indicate, as they should, that as the rain rates increase and scattering depresses the 85-GHz TBs, the right-hand bundles terminate at decreasing values of 85-GHz TB. However, for the first five categories, the originating points of the 19Us on the right-side abscissas show little variation and varying degrees of thickness, indicating little correlation between rain-rate assignment and 19-GHz TB for rain rates up to 17.5 mm h−1. Examination of the corresponding left-hand bundles of lines connecting the 19-GHz TBs with 37-GHz TBs indicates that rain-rate assignments up through the fifth category are unrelated to brightness temperatures at both of the lower frequencies. In the case of TAMU, the dominant feature is the steady rise of TB at 19 GHz as rain rate increases (i.e., the thin spreads of the line bundles emanating from both left- and right-side abscissas), associated with the thick spreads and near-invariance in positions of the bundle terminations on the right-hand 85-GHz ordinates for categories 2–6. Although there is clearly an increase in the 37-GHz TBs between categories 1 and 6 (indicative of the emission signal), the rain rates between 2.5 and 12.5 mm h−1 (categories 2–4) show little relationship to 37-GHz TB.
There are additional examples of differences evident in Fig. 9 and numerous other examples when all the fan maps generated for the entire PIP-2 intercomparison analysis are considered. However, the main issue here is not to compile all the many differences, but to point out that these differences are prevalent and that they can become significant at large rain rates, which ensures that rms differences will become correspondingly large regardless of algorithm-to-algorithm bias.
PIP-2 was conducted as a follow-up to PIP-1 to better understand the performance of current PMW retrieval algorithms at the instantaneous, full-resolution scale. PIP-1 focused on climate scales by considering monthly gridded averages; it provided the first comprehensive test of PMW algorithms implemented at the global scale. PIP-2 has focused on determining the strengths and weaknesses of the current algorithms, accomplished by examining a number of case studies distributed in time and space over the globe. PIP-2 intercompares algorithm results to one another, to their own algorithm-category composites, to all-algorithm composites, and to validation maps. The differences are interpreted in terms of various stratifications, particularly concerning how algorithms are categorized according to their solution method, their basic screening approach, and the channel input used in the TB–RR conversions.
There are six main conclusions that can be offered from this analysis. The first conclusion is that many current SSM/I precipitation algorithms, individually and as a group, are performing credibly in estimating rainfall from space at instantaneous, high-resolution time and space scales. This assertion can be defended on the basis of the consistency of the biases and the correspondence with the AIP-3 results. Clearly, this study and AIP-3 have documented differences among the various algorithms and between the algorithms and validation datasets. Nevertheless, because of the intrinsic uncertainties in operational validation data and the insufficient sampling capabilities of operational radars and gauge networks relative to the scales and organization of rainfall systems, almost all algorithm differences are within the uncertainty limits of the validation data. Although the exactness of this claim will undergo refinement after the launch of TRMM, the PIP-2 study provides concrete evidence that ground-based radar–rain gauge validation data acquired from reputable sources are not reliable as a calibration standard for instantaneous rain rates. This was most evident from the detection of erroneous measurements in the original processed radar datasets, the underestimation found in the ocean radar data in contrast to the land radar data, and the fact that the validation data could not produce stationarity in the intercomparison statistics because coverage for the PIP-2 overpasses was systematically nonrepresentative. Therefore, even if one of the current PIP-2 algorithms were performing optimally, we could not prove it based on what has been found here concerning the reliability of the validation datasets. It should be pointed out that these remarks are primarily aimed at operational radar data, not all ground radar data.
For example, the TRMM experiment is seeking to produce much higher quality ground radar data by requiring 1) frequent calibrations at the selected radar sites using dense gauge networks and disdrometers, along with constant monitoring of electronics; 2) volume scan instead of plan position indicator data acquisition; and 3) refined radar retrieval algorithms tailored for the individual radar sites.
Unfortunately, the trouble with the search for the best algorithm is that it is an endless exercise, since any claim to that effect stemming from an observational ground validation approach can be disputed by other investigators based on alternate sources of validation measurements. In essence, the quantitative differences among current algorithms are small enough that defending the concept of “best” requires a laboratory-standard calibration analysis. This is beyond serious consideration for the AIPs and PIPs; however, a coauthor of this paper (T. Wilheit) has outlined the development of a “calibration-level forward model,” which seeks to get beyond the futile debate over how to determine the best algorithms as we move into the TRMM era. At this stage, the more pressing question for purposes of climate and weather analysis is, given the current state of the art in making precipitation estimates from space, what are the underlying uncertainties? From the analysis given in section 4c, we estimate accuracies of ±30% for 13 of the 20 algorithms participating in PIP-2, a result consistent with the results of AIP-3 (15 of 20). For PIP-2, the precision estimates for instantaneous, full-resolution rain rates range from ∼35% to 100%.
The second conclusion is that as a method of retrieval science, the PMW algorithms should seek convergence in establishing a nonzero rain threshold in conjunction with an optimal screening procedure for defining the area of rain cover. It is clear from comparing the PI and common screen results that there are discrepancies between the algorithms concerning where it is and is not raining, most of which occur at the light-rain end of the scale. Differences arise because of the varying severity of screening methods, the use of different minimum rain-rate cutoffs, inconsistencies in whether regression designs pass through zero, the setting of rain–no-rain thresholds according to different thresholds of retrieved liquid water path, and other more obscure light-rain threshold definitions. The tendency for the statistical rain map algorithms to use regressions that pass through zero actually produced systematic rain-area coverage differences relative to other solution methods. Although over ocean the resultant discrepancies do not significantly affect the area-averaged rain rates (as they do over land), they still give rise to two problems.
First is the obvious dilemma that rain-area coverage differences undermine and confound intercomparison–validation analyses because they directly corrode the correlation, skill score, and rms difference measures. Second is the faulty notion, debated during the PIP-2 workshops, that rain cover should only be determined within the framework of an individual algorithm’s design and not through some type of independent, optimal determination scheme. To an observer on land, it is always possible to determine whether it is raining or not, based on the simple test of whether the ground is getting wet. Over the ocean, an observer must be more circumspect, but the presence or absence of rain is not ambiguous. Thus, eliminating numerous specialized methodologies in favor of a single optimal approach for estimating rain cover appears to be a reasonable path toward progress. Moreover, moving to an optimal screening approach might help identify an optimal means to reduce beam-filling effects, which would directly improve agreement among algorithms and might even bring about better consistency between rain cover and average rain rate, which some investigators maintain is an immutable property of rain; for example, see Oki et al. (1997). It is less certain that such a monolithic view should be extended to the TB–RR conversion problem, since there are legitimate and defensible differences of interpretation, in a physics context, of what constitutes the true rain intensity signature in terms of the various emission, scattering, depolarization, and differential frequency signals generated by multichannel PMW radiometers.
The application of a common screen to help assess the degree of the rain-cover discrepancies in the PIP-2 calculations should not be considered an optimal solution. The common screen employed in PIP-2 was generated within the semiempirical NESDIS screening framework, then empirically tuned with imperfect validation data (Ferraro et al. 1998). Therefore, it is reasonable to conclude that the PIP-2 common screen is imperfect and not optimal. As emphasized, screening has received less attention than the more glamorous end of precipitation retrieval associated with the TB–RR conversion schemes. Therefore, screening deserves greater attention in the future since accurate rain detection is a prerequisite of successful rain retrieval.
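The mechanics of imposing a common screen before intercomparison can be illustrated with a small sketch. This is not the semiempirical NESDIS-based screen actually used in PIP-2; the mask source and the residual light-rain cutoff here are assumptions for illustration.

```python
import numpy as np

def apply_common_screen(rain_maps, screen_mask, cutoff=1.0):
    """Impose a single rain/no-rain mask on several algorithms' rain
    maps before intercomparison, so that differences in individual
    screening do not leak into the statistics.  Pixels flagged as
    non-raining by the common screen are zeroed, as are residual rates
    below an assumed light-rain cutoff."""
    screened = {}
    for name, rr in rain_maps.items():
        rr = np.asarray(rr, dtype=float).copy()
        rr[~screen_mask] = 0.0   # common screen: no rain here
        rr[rr < cutoff] = 0.0    # assumed residual light-rain cutoff
        screened[name] = rr
    return screened
```

With all algorithms masked identically, any remaining differences in the statistics are attributable to the TB–RR conversions rather than to rain-area coverage.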
The third conclusion is that a major source of the remaining rain-rate biases between individual algorithms stems from significant differences at the high rain-rate end of the scale and in the maximum rain rates that a given algorithm will produce. These differences are most evident from the analyses of the difference histograms and fan maps, but they are also evident from the qualitative image analyses. The difference histograms indicate that there is no systematic relationship between the solution method and the maximum rain-rate intensities that can be generated. Differences in the levels of maximum intensity from different algorithms arise from various sources: 1) the use of arbitrary high-end cutoffs; 2) the nature of nonlinear relationships in algorithms using regression formulations (statistical or physical); 3) the relative weights given to high-frequency scattering measures in the scattering and mixed algorithms; 4) the rain-rate probability distribution functions (pdfs) associated with the underlying microphysics of the algorithms, whether they be based on ground data, a conceptual model, or a cloud model; 5) beam-filling correction methods; and 6) additional miscellaneous factors associated with specific algorithm designs. A lesson of PIP-2 is that future intercomparisons such as those for TRMM should heed the differences between the rain-rate pdfs of the various algorithms and those derived from volume scan radar measurements, seeking to resolve discrepancies in how intense rain rates are calculated.
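One way to make such high-end discrepancies explicit is to summarize the tail of each algorithm's retrieved rain-rate pdf; a minimal sketch, with the 10 mm h−1 tail threshold chosen arbitrarily for illustration:

```python
import numpy as np

def high_end_summary(rr, tail_threshold=10.0):
    """Summarize the high-rain-rate tail of an algorithm's retrieved
    distribution: the maximum retrieved rate and the fraction of total
    rain volume carried by rates above an (assumed) tail threshold."""
    rr = np.asarray(rr, dtype=float)
    rain = rr[rr > 0]
    if rain.size == 0:
        return 0.0, 0.0
    frac_above = rain[rain > tail_threshold].sum() / rain.sum()
    return rain.max(), frac_above
```

Comparing these two numbers across algorithms, and against volume scan radar pdfs, isolates the tail behavior that drives the rms differences noted above.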
The fourth conclusion is that the solution method is not the only important factor governing systematic differences among groups of algorithms. We have seen that for certain meteorological categories, there are systematic differences between the solution methods. Furthermore, based on the resolution tests, the statistical algorithms appear less vulnerable to beam filling because the regressions tend to correct for this factor intrinsically. However, for the land algorithms overall, the severity of the screen appears to have the greatest effect on systematic differences. And for the ocean algorithms overall, it is the channel input that dominates group differences insofar as TB–RR conversion is concerned, more so than the philosophical design behind the solution. The latter was the most salient result of the analysis in section 4d(2). Since emission algorithms only apply to the ocean, the significance of this result is that fundamental inconsistencies remain in the algorithms concerning how the intensity of rain is interpreted from emission frequencies versus scattering frequencies. For the land case, where only mixed and scattering algorithms are used, screening has a major effect because these two types of algorithms respond only to the more intense rain rates, and therefore mean values become affected by setting the rain–no-rain threshold at different levels.
The fifth conclusion is that until a calibration model is developed, the use of ground data for validation in intercomparison projects such as PIP and AIP (plus the soon-to-be-conducted TRMM GV intercomparisons) represents an important part of seeking algorithm improvement. Algorithm developers profit from the mix of competition and exchange offered by these projects to uncover flaws and discover more fruitful pathways in their algorithms. This includes the option to empirically calibrate to ground data if the underlying assumptions and formulations cannot be defended in a purely physical framework. Notably, through massive intercomparison calculations, the PIP-3 project now offers a forum to test the notion of whether merged satellite–ground products are more reliable than stand-alone satellite algorithms. PIP-3 has also entrained the modeling community into the process, enabling a serious debate as to the validity of GCM simulations of the global rain field. In the past, unsubstantiated claims have been made that GCMs had superseded the need for rain measurement, claims that influenced the interpretation of GCM rainfall prognoses (e.g., Hastenrath 1990). PIP-3 offers a genuine scientific opportunity to test such claims.
The sixth and final conclusion is that since we have reached a point where ground validation datasets are not effective in arbitrating the absolute accuracy and precision of different satellite algorithms, more emphasis is needed on developing alternate validation strategies, particularly calibration models. The concept of a calibration model is not new. For example, prior to the modern age of radio wave transmissions, the Keplerian model of planetary orbits was used to establish distances between planets from simple measures of orbital recurrence times. As another example, orbital periods and separation distances have been used to accurately establish the earth’s mass, an otherwise immeasurable quantity. For the precipitation retrieval problem, a calibration model would represent the most complete forward RTE model offered by current theory, including the use of detailed microphysical models. The combined modeling system would relate microphysical profiles and their associated rain-rate structures to upwelling TBs at the top of the atmosphere at selected view angles. Such models would incorporate the most advanced single-scatter techniques for arbitrarily shaped hydrometeors, including mixtures of water, air, and ice, and the most detailed three-dimensional, nonsteady-state, multiple-scattering techniques incorporating arbitrary canting geometry. They would include updated H2O–O2 line and continuum absorption data (the dominant absorbing gases in the mm–cm spectrum) and refined data on the dielectric properties of water and ice (including temperature dependence). Such a model would provide a consistent testbed for both statistical and physical retrieval techniques, and for understanding their sensitivity to measurement error, microphysical details, beam filling, and geometry.
Acknowledgments. The authors extend their appreciation to Dr. Jim Dodge of NASA Headquarters and to Mr. Michael Goodman of NASA/Marshall Space Flight Center for developing the WetNet project, which supports the PIPs, and to Mr. Jim Merritt of the Florida State University (FSU) for his assistance with computer graphics. Research support at FSU was provided by NASA Grants NAGW-3970 and NAG5-2672; essential travel support was provided by NATO Grant CG-890894.
REFERENCES
Adler, R. F., H.-Y. M. Yeh, N. Prasad, W.-K. Tao, and J. Simpson, 1991: Microwave simulations of a tropical rainfall system with a three-dimensional cloud model. J. Appl. Meteor.,30, 924–953.
——, A. J. Negri, P. R. Keehn, and I. M. Hakkarinen, 1993: Estimation of monthly rainfall over Japan and surrounding waters from a combination of low-orbit microwave and geosynchronous IR data. J. Appl. Meteor.,32, 335–356.
——, G. J. Huffman, and P. R. Keehn, 1994: Global tropical rain estimates from microwave adjusted geosynchronous IR data. Remote Sens. Rev.,11, 125–152.
Alishouse, J. C., S. Snyder, and R. R. Ferraro, 1990: Determination of oceanic precipitable water from the SSM/I. IEEE Trans. Geosci. Remote Sens.,26, 811–816.
Allam, R., G. Holpin, P. Jackson, and G.-L. Liberti, 1993: Second Algorithm Intercomparison Project of the Global Precipitation Climatology Project: AIP-2. Pre-Workshop Report, 133 pp. [Available from Satellite Image Applications Group, UKMO, Bracknell, Berkshire, United Kingdom.].
Aonashi, K., A. Shibata, and G. Liu, 1996: An over-ocean precipitation retrieval using SSM/I multichannel brightness temperatures. J. Meteor. Soc. Japan,74, 617–637.
Arkin, P. A., 1988: Estimating climatic scale tropical precipitation from satellite observations. Tropical Rainfall Measurements, J. S. Theon and N. Fugono, Eds., A. Deepak Publishing, 151–157.
——, and B. N. Meisner, 1987: The relationship between large-scale convective rainfall and cold cloud over the Western Hemisphere during 1982–84. Mon. Wea. Rev.,115, 51–74.
——, and P. Xie, 1994: The Global Precipitation Climatology Project: First Algorithm Intercomparison Project. Bull. Amer. Meteor. Soc.,75, 401–419.
Barrett, E. C., J. Dodge, H. M. Goodman, J. Janowiak, C. Kidd, and E. A. Smith, 1994a: The first WetNet Precipitation Intercomparison Project (PIP-1). Remote Sens. Rev.,11, 49–60.
——, and Coauthors, 1994b: The First WetNet Precipitation Intercomparison Project (PIP-1): Interpretation of results. Remote Sens. Rev.,11, 303–373.
Bauer, P., and P. Schluessel, 1993: Rainfall, total water, ice water, and water vapor over sea from polarized microwave simulations and Special Sensor Microwave/Imager data. J. Geophys. Res.,98, 20 737–20 759.
——, L. Schanz, R. Bennartz, and P. Schlüssel, 1998: Outlook for combined TMI–VIRS algorithms for TRMM: Lessons from the PIP and AIP projects. J. Atmos. Sci.,55, 1714–1729.
Berg, W., and R. Chase, 1992: Determination of mean rainfall from the Special Sensor Microwave/Imager (SSM/I) using a mixed lognormal distribution. J. Atmos. Oceanic Technol.,9, 129–141.
——, and S. K. Avery, 1994: Rainfall variability over the tropical Pacific from July 1987 through December 1991 as inferred via monthly estimates from SSM/I. J. Appl. Meteor.,33, 1468–1485.
——, W. Olson, R. Ferraro, S. J. Goodman, and F. J. LaFontaine, 1998: An assessment of the first- and second-generation navy operational precipitation retrieval algorithms. J. Atmos. Sci.,55, 1558–1575.
Cavalieri, D. J., P. Gloersen, and W. J. Campbell, 1984: Determination of sea ice parameters with the Nimbus 7 SMMR. J. Geophys. Res.,89, 5355–5369.
CalVal, 1989: DMSP Special Sensor Microwave/Imager calibration/validation. CalVal Final Rep., Vol. I, 176 pp. [Available from Naval Research Laboratory, Washington, DC 20375.].
——, 1991: DMSP Special Sensor Microwave/Imager calibration/validation. CalVal Final Rep., Vol. II, 277 pp. [Available from Naval Research Laboratory, Washington, DC 20375.].
Cerro, C., B. Codina, J. Bech, and J. Lorente, 1997: Modeling raindrop size distribution and Z(R) relations in the western Mediterranean area. J. Appl. Meteor.,36, 1470–1479.
Chang, A. T. C., L. S. Chiu, and T. T. Wilheit, 1993a: Random errors of oceanic monthly rainfall derived from SSM/I using probability distribution functions. Mon. Wea. Rev.,121, 2351–2354.
——, ——, and ——, 1993b: Oceanic monthly rainfall derived from SSM/I. Eos, Trans. Amer. Geophys. Union,74, 505–513.
——, ——, and G. Yang, 1995: Diurnal cycle of oceanic precipitation from SSM/I data. Mon. Wea. Rev.,123, 3372–3380.
Chiu, L. S., G. R. North, D. A. Short, and A. McConnell, 1990: Rain estimation from satellites: Effect of finite field of view. J. Geophys. Res.,95, 2177–2185.
Dai, A., I. Y. Fung, and A. D. Del Genio, 1997: Surface observed global land precipitation variations during 1900–88. J. Climate,10, 2943–2962.
Deirmendjian, D., 1964: Scattering and polarization properties of water clouds and hazes in the visible and infrared. Appl. Opt.,3, 187–196.
Dodge, J. C., and H. M. Goodman, 1994: The WetNet project. Remote Sens. Rev.,11, 5–21.
Ebert, E. E., 1996: Results of the 3rd Algorithm Intercomparison Project (AIP-3) of the Global Precipitation Climatology Project (GPCP). BMRC Research Rep. No. 55, 199 pp. [Available from Bureau of Meteorology Research Centre, Box 1289K, Melbourne, Victoria 3001, Australia.].
——, and M. J. Manton, 1995: Summary report of the third Algorithm Intercomparison Project (AIP-3) of the Global Precipitation Climatology Project (GPCP). WMO Tech. Note WMO/TD No. 714, 30 pp. [Available from World Meteorological Organization, Case Postale 2300, CH-1211 Geneva 2, Switzerland.].
——, and ——, 1998: Performance of satellite rainfall estimation algorithms during TOGA COARE. J. Atmos. Sci.,55, 1537–1557.
——, ——, P. A. Arkin, R. J. Allam, G. E. Holpin, and A. Gruber, 1996: Results from the GPCP Algorithm Intercomparison Programme. Bull. Amer. Meteor. Soc.,77, 2875–2887.
Evans, K. F., J. Turk, T. Wong, and G. L. Stephens, 1995: A Bayesian approach to microwave precipitation profile retrieval. J. Appl. Meteor.,34, 260–279.
Farrar, M. R., and E. A. Smith, 1992: Spatial resolution enhancement of terrestrial features using deconvolved SSM/I microwave brightness temperatures. IEEE Trans. Geosci. Remote Sens.,30, 349–355.
Ferraro, R. R., and G. F. Marks, 1995: The development of SSM/I rain-rate retrieval algorithms using ground-based radar measurements. J. Atmos. Oceanic Technol.,12, 755–770.
——, N. C. Grody, and J. Kogut, 1986: Classification of geophysical parameters using passive microwave satellite measurements. IEEE Trans. Geosci. Remote Sens.,24, 1008–1013.
——, ——, and G. F. Marks, 1994: Effects of surface conditions on rain identification using the DMSP-SSM/I. Remote Sens. Rev.,11, 195–210.
——, F. Weng, N. C. Grody, and A. Basist, 1996: An eight-year (1987–1994) time series of rainfall, clouds, water vapor, snow cover, and sea ice derived from SSM/I measurements. Bull. Amer. Meteor. Soc.,77, 891–905.
——, E. A. Smith, W. Berg, and G. J. Huffman, 1998: A screening methodology for passive microwave precipitation retrieval algorithms. J. Atmos. Sci.,55, 1583–1600.
Ferriday, J. G., and S. K. Avery, 1994: Passive microwave remote sensing of rainfall with SSM/I: Algorithm development and implementation. J. Appl. Meteor.,33, 1587–1596.
Fiore, J. V., and N. C. Grody, 1992: Classification of snow cover and precipitation using the Special Sensor Microwave/Imager (SSM/I). Int. J. Remote Sens.,13, 3349–7435.
Goodberlet, M. A., C. T. Swift, and J. C. Wilkerson, 1989: Remote sensing of ocean surface winds with the Special Sensor Microwave/Imager. J. Geophys. Res.,94, 14 547–14 555.
Grody, N. C., 1991: Classification of snow cover and precipitation using the Special Sensor Microwave/Imager (SSM/I). J. Geophys. Res.,96, 7423–7435.
Haddad, Z. S., E. A. Smith, C. D. Kummerow, T. Iguchi, M. R. Farrar, S. L. Durden, M. Alves, and W. S. Olson, 1997: The TRMM ‘Day-1’ radar/radiometer combined rain-profiling algorithm. J. Meteor. Soc. Japan,75, 799–809.
Hastenrath, S., 1990: Tropical climate prediction: A progress report, 1985–90. Bull. Amer. Meteor. Soc.,71, 819–825.
Hinton, B. B., W. S. Olson, D. W. Martin, and B. Auvine, 1992: A passive microwave algorithm for tropical ocean rainfall. J. Appl. Meteor.,31, 1379–1395.
Hollinger, J., R. Lo, G. Poe, R. Savage, and J. Pierce, 1987: Special Sensor Microwave/Imager User’s Guide. Naval Research Laboratory, 120 pp.
——, J. L. Pierce, and G. A. Poe, 1990: SSM/I instrument evaluation. IEEE Trans. Geosci. Remote Sens.,GE-28, 800–810.
Houze, R. A., Jr., and C. P. Cheng, 1977: Radar characteristics of tropical convection observed during GATE: Mean properties and trends over the summer season. Mon. Wea. Rev.,105, 964–980.
Huffman, G. J., R. F. Adler, B. R. Rudolf, U. Schneider, and P. R. Keehn, 1995: Global precipitation estimates, rain gauge analysis, and NWP model precipitation information. J. Climate,8, 1284–1295.
——, and Coauthors, 1997: The Global Precipitation Climatology Project (GPCP) combined precipitation dataset. Bull. Amer. Meteor. Soc.,78, 5–20.
Janowiak, J. E., P. A. Arkin, P. Xie, M. L. Morrissey, and D. R. Legates, 1995: An examination of the east Pacific ITCZ rainfall distribution. J. Climate,8, 2810–2823.
Joyce, R., and P. A. Arkin, 1997: Improved estimates of tropical and subtropical precipitation using the GOES Precipitation Index. J. Atmos. Oceanic Technol.,14, 997–1011.
Kedem, B., L. S. Chiu, and G. R. North, 1990: Estimation of mean rain rate: Application to satellite observations. J. Geophys. Res.,95, 1965–1973.
Kidd, C., 1998: The rainfall retrievals using the polarization-corrected temperature algorithm. Int. J. Remote Sens., in press.
——, and E. C. Barrett, 1990: The use of passive microwave imagery in rainfall monitoring. Remote Sens. Rev.,4, 415–450.
——, D. Kniveton, and E. C. Barrett, 1998: The advantages and disadvantages of statistically derived–empirically calibrated passive microwave algorithms for rainfall estimation. J. Atmos. Sci.,55, 1572–1582.
Kniveton, D. R., and E. C. Barrett, 1994: Composite algorithms for rainfall estimation using data from the DMSP SSM/I. Preprints, Seventh Conf. on Satellite Meteorology and Oceanography, Monterey, CA, Amer. Meteor. Soc., 140–143.
Kummerow, C., 1993: On the accuracy of the Eddington approximation for radiative transfer in the microwave frequencies. J. Geophys. Res.,98, 2757–2765.
——, and L. Giglio, 1994a: A passive microwave technique for estimating rainfall and vertical structure information from space. Part I: Algorithm description. J. Appl. Meteor.,33, 3–18.
——, and ——, 1994b: A passive microwave technique for estimating rainfall and vertical structure information from space. Part II: Applications to SSM/I data. J. Appl. Meteor.,33, 19–34.
——, R. A. Mack, and I. M. Hakkarinen, 1989: A self-consistency approach to improve microwave rainfall rate estimation from space. J. Appl. Meteor.,28, 869–884.
——, I. M. Hakkarinen, H. F. Pierce, and J. A. Weinman, 1991: Determination of precipitation profiles from airborne passive microwave radiometric measurements. J. Atmos. Oceanic Technol.,8, 148–158.
——, W. S. Olson, and L. Giglio, 1996: A simplified scheme for obtaining precipitation and vertical hydrometeor profiles from passive microwave sensors. IEEE Trans. Geosci. Remote Sens.,34, 1213–1232.
Lee, T. H., J. E. Janowiak, and P. A. Arkin, 1991: Atlas of Products from the Algorithm Intercomparison Project 1: Japan and Surrounding Oceanic Regions. UCAR, 131 pp.
Liberti, G. L., 1995: Review of SSM/I-based algorithms submitted for the GPCP-AIP/2. Microwave Radiometry and Remote Sensing of the Environment, D. Solimini, Ed., VSP Press, 297–306.
Liu, G., and J. A. Curry, 1992: Retrieval of precipitation from satellite microwave measurement using both emission and scattering. J. Geophys. Res.,97, 9959–9974.
——, and ——, 1993: Determination of characteristic features of cloud liquid water from satellite microwave measurement. J. Geophys. Res.,98, 5069–5092.
——, and ——, 1998: An investigation of the relationship between emission and scattering signals in SSM/I data. J. Atmos. Sci.,55, 1628–1643.
Marshall, J. S., and W. M. K. Palmer, 1948: The distribution of raindrops with size. J. Meteor.,5, 165–166.
Marzano, F. S., A. Mugnai, E. A. Smith, X. Xiang, J. E. Turk, and J. Vivekanandan, 1994: Active and passive remote sensing of precipitation storms during CaPE. Part II: Intercomparison of precipitation retrievals from AMPR radiometer and CP-2 radar. Meteor. Atmos. Phys.,54, 29–52.
McFarland, M. J., and C. M. U. Neale, 1991: Land parameter algorithm validation and calibration. DMSP Special Sensor Microwave/Imager Calibration/Validation Final Rep., Vol. II, 277 pp. [Available from Space Sensing Branch, Naval Research Laboratory, Washington, DC 20375.].
Morrissey, M., and J. S. Greene, 1991: The Pacific atoll rain gauge data set. Planetary Geosci. Div. Contribution 648, 45 pp. [Available from the University of Hawaii, Honolulu, HI 96822.].
——, M. A. Shafer, H. Hauschild, M. Reiss, B. Rudolf, W. Reuth, and U. Schneider, 1994: Surface datasets used in WetNet’s PIP-1 from the Comprehensive Pacific Rainfall Data Base and the Global Precipitation Climatology Centre. Remote Sens. Rev.,11, 61–92.
Mugnai, A., E. A. Smith, and G. J. Tripoli, 1993: Foundations for statistical-physical precipitation retrieval from passive microwave satellite measurements. Part II: Emission source and generalized weighting function properties of a time dependent cloud-radiation model. J. Appl. Meteor.,32, 17–39.
——, F. S. Marzano, and N. Pierdicca, 1994: Precipitation retrieval from spaceborne microwave radiometers: Description and application of a maximum likelihood profile algorithm. Proc. CLIMPARA ’94: Climatic Parameters in Radiowave Propagation Prediction, Moscow, Russia, International Union of Radio Science, 2.2.1–2.2.4.
Negri, A. J., and R. F. Adler, 1993: An intercomparison of three infrared rainfall techniques over Japan and surrounding waters. J. Appl. Meteor.,32, 357–373.
Oki, R., A. Sumi, and D. A. Short, 1997: TRMM sampling of Radar-AMeDAS rainfall using the threshold method. J. Appl. Meteor.,36, 1480–1492.
Olson, W. S., F. J. Lafontaine, W. L. Smith, R. T. Merrill, B. A. Roth, and T. H. Achtor, 1991: Precipitation validation. DMSP Special Sensor Microwave/Imager Calibration/Validation Final Rep., Vol. II, 277 pp. [Available from Space Sensing Branch, Naval Research Laboratory, Washington, DC 20375.].
Panegrossi, G., and Coauthors, 1998: Use of cloud model microphysics for passive microwave-based precipitation retrieval: Significance of consistency between model and measurement manifolds. J. Atmos. Sci.,55, 1644–1673.
Petty, G. W., 1994a: Physical retrievals of over-ocean rain rate from multichannel microwave imaging. Part I: Theoretical characteristics of normalized polarization and scattering indices. Meteor. Atmos. Phys.,54, 79–100.
——, 1994b: Physical retrievals of over-ocean rain rate from multichannel microwave imaging. Part II: Algorithm implementation. Meteor. Atmos. Phys.,54, 101–122.
——, 1995: The status of satellite-based rainfall estimation over land. Remote Sens. Environ.,51, 125–137.
——, and K. B. Katsaros, 1990: Precipitation over the South China Sea by the Nimbus-7 Scanning Multichannel Microwave Imager during WMONEX. J. Appl. Meteor.,29, 273–287.
——, and W. Krajewski, 1996: Satellite rainfall estimation over land. Hydrol. Sci. J.,41, 433–451.
Pierdicca, N., F. S. Marzano, G. D’Auria, P. Basili, P. Ciotti, and A. Mugnai, 1996: Precipitation retrieval from spaceborne microwave radiometers based on maximum a posteriori probability estimation. IEEE Trans. Geosci. Remote Sens.,34, 1–16.
Ritchie, A. A., Jr., M. R. Smith, H. M. Goodman, R. L. Schudalla, D. K. Conway, F. J. LaFontaine, D. Moss, and B. Motta, 1998: Critical analyses of data differences between FNMOC and AFGWC spawned SSM/I datasets. J. Atmos. Sci.,55, 1601–1612.
Shibata, A., 1994: Determination of water vapor and liquid water content by an iterative method. Meteor. Atmos. Phys.,54, 173–182.
Short, D. A., and G. R. North, 1990: The beam filling error in ESMR-5 observations of GATE rainfall. J. Geophys. Res.,95, 2187–2194.
——, P. A. Kucera, B. S. Ferrier, J. C. Gerlach, S. A. Rutledge, and O. W. Thiele, 1997: Shipboard radar rainfall patterns within the TOGA COARE IFA. Bull. Amer. Meteor. Soc.,78, 2817–2836.
Simpson, J., C. Kummerow, W.-K. Tao, and R. F. Adler, 1996: On the Tropical Rainfall Measuring Mission (TRMM) satellite. Meteor. Atmos. Phys.,60, 19–36.
Skatskii, V. I., 1965: Some results from experimental study of the liquid-water content in cumulus clouds. Atmos. Ocean. Phys. Ser.,1(8), 833–844.
Smith, D. M., D. R. Kniveton, and E. C. Barrett, 1998: A statistical modeling approach to passive microwave rainfall retrieval. J. Appl. Meteor.,37, 135–154.
Smith, E. A., A. Mugnai, H. J. Cooper, G. J. Tripoli, and X. Xiang, 1992: Foundations for statistical–physical precipitation retrieval from passive microwave satellite measurements. Part I: Brightness temperature properties of a time dependent cloud–radiation model. J. Appl. Meteor.,31, 506–531.
——, X. Xiang, A. Mugnai, and G. J. Tripoli, 1994a: Design of an inversion-based precipitation profile retrieval algorithm using an explicit cloud model for initial guess microphysics. Meteor. Atmos. Phys.,54, 53–78.
——, A. Mugnai, and G. Tripoli, 1994b: Theoretical foundations and verification of a multispectral, inversion-type microwave precipitation profile retrieval algorithm. Proceedings of the ESA/NASA International Workshop on Microwave Radiometry, VSP Science Press, 599–621.
——, C. Kummerow, and A. Mugnai, 1994c: The emergence of inversion-type precipitation profile algorithms for estimation of precipitation from satellite microwave measurements. Remote Sens. Rev.,11, 211–242.
——, J. Chang, and J. E. Lamm, 1995: PIP-2: Intercomparison results report. Tech. Document, 38 pp. and 7 appendices. [Available from NASA WetNet Projects (NAGW-3970), Dept. of Meteorology, Florida State University, Tallahassee, FL 32306.].
——, J. E. Lamm, and H. M. Goodman, 1996: WetNet PIP-2 (Second Precipitation Intercomparison Project): SSM/I data, rainfall estimates and results. Earth System Science Division, NASA/Marshall Space Flight Center, CD-ROM, disk 1.
——, J. Turk, M. Farrar, A. Mugnai, and X. Xiang, 1997: Estimating 13.8-GHz path integrated attenuation from 10.7-GHz brightness temperatures for TRMM combined PR-TMI precipitation algorithm. J. Appl. Meteor.,36, 365–388.
Spencer, R. W., 1993: Global oceanic precipitation from the MSU during 1979–1991 and comparisons to other climatologies. J. Climate,6, 1301–1326.
——, 1994: Oceanic rainfall monitoring with the microwave sounding units. Remote Sens. Rev.,11, 153–162.
——, H. M. Goodman, and R. E. Hood, 1989: Precipitation retrieval over land and ocean with the SSM/I: Identification and characteristics of the scattering signal. J. Atmos. Oceanic Technol.,6, 254–273.
Szoke, E. J., E. J. Zipser, and D. P. Jorgensen, 1986: A radar study of convective cells in mesoscale systems in GATE. Part I: Vertical profile statistics and comparison with hurricanes. J. Atmos. Sci.,43, 182–197.
Tao, W.-K., and J. Simpson, 1989: Modeling study of a tropical squall-type convective line. J. Atmos. Sci.,46, 177–202.
——, and ——, 1993: Goddard cumulus ensemble model. Part I: Model description. TAO,4, 35–72.
——, ——, and S.-Y. Soong, 1987: The statistical properties of a cloud ensemble: A numerical study. J. Atmos. Sci.,44, 3175–3187.
Tesmer, J. R., and T. T. Wilheit, 1998: An improved microwave radiative transfer model for tropical oceanic precipitation. J. Atmos. Sci.,55, 1674–1689.
Todd, M. C., and J. O. Bailey, 1995: Estimates of rainfall over the United Kingdom and surrounding seas from the SSM/I using the polarization corrected temperature algorithm. J. Appl. Meteor.,34, 1254–1265.
Tripoli, G. J., 1992a: A nonhydrostatic model designed to simulate scale interactions. Mon. Wea. Rev.,120, 1342–1359.
——, 1992b: An explicit three-dimensional nonhydrostatic numerical simulation of a tropical cyclone. Meteor. Atmos. Phys.,49, 229–254.
Weng, F., and N. C. Grody, 1994: Retrieval of cloud liquid water using the Special Sensor Microwave Imager (SSM/I). J. Geophys. Res.,99, 25 535–25 551.
Wentz, F., 1993: User’s manual SSM/I antenna temperature tapes—Revision 2. RSS Tech. Rep. 120193, 34 pp. [Available from Remote Sensing Systems, Santa Rosa, CA 95404.].
——, 1997: A well-calibrated ocean algorithm for Special Sensor Microwave/Imager. J. Geophys. Res.,102, 8703–8718.
Wentz, F. J., and R. W. Spencer, 1998: SSM/I rain retrievals within a unified all-weather ocean algorithm. J. Atmos. Sci.,55, 1613–1627.
Wilheit, T. T., A. T. C. Chang, M. S. V. Rao, E. B. Rodgers, and J. S. Theon, 1977: A satellite technique for quantitatively mapping rainfall rates over the oceans. J. Appl. Meteor.,16, 551–560.
——, ——, and L. S. Chiu, 1991: Retrieval of monthly rainfall indices from microwave radiometric measurements using probability distribution functions. J. Atmos. Oceanic Technol.,8, 118–136.
——, and Coauthors, 1994: Algorithms for the retrieval of rainfall from passive microwave measurements. Remote Sens. Rev.,11, 163–194.
——, W. R. Russell, and J. R. Tesmer, 1995: Physical principles of retrieval of rainfall from passive microwave measurements. Proc. Int. Geoscience and Remote Sensing Symp. IGARSS '95, Florence, Italy, IEEE Geoscience and Remote Sensing Society, 64.
Wu, R., and J. A. Weinman, 1984: Microwave radiances from precipitating clouds containing aspherical ice, combined phase, and liquid hydrometeors. J. Geophys. Res.,89, 7170–7178.
WMO/ICSU, 1990: The Global Precipitation Climatology Project—Implementation and data management plan. Tech. Document WMO TD No. 367, 45 pp. and 6 appendices. [Available from World Meteorological Organization, Case Postale 2300, CH-1211 Geneva 2, Switzerland.].
Xiang, X., E. A. Smith, and C. G. Justus, 1994: A rapid radiative transfer model for reflection of solar radiation. J. Atmos. Sci.,51, 1978–1988.
Xie, P., and P. A. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc.,78, 2539–2558.
Ye, H., T. T. Wilheit, and W. R. Russell, 1997: Estimation of monthly rainfall over ocean from truncated rain-rate samples: Application to SSM/I data. J. Atmos. Oceanic Technol.,14, 1012–1022.
Yuter, S. E., and R. A. Houze Jr., 1997: Measurements of raindrop size distributions over the Pacific warm pool and implications for Z-R relations. J. Appl. Meteor.,36, 847–867.
Summary Descriptions of Algorithms
This appendix is divided into four sections corresponding to the four different algorithm classifications by solution method. Each section begins with a brief description of the nature of the method, followed by a summary of the screening procedures of all algorithms in the group, and concludes with summaries of the TB–RR conversion procedures of the individual algorithms. Principal citations are also given for each algorithm. Note that Table 4 contains a useful summary of the algorithms, their classification, identification of the specific SSM/I channels used in both screening and TB–RR conversion schemes, and foremost references.
Statistical rain map (BRIS, BERG, NESDIS, CALVAL, KIDD)
This type of algorithm is developed by deriving a statistical regression between a measured brightness temperature dataset (involving one or more frequencies) and a rainfall dataset obtained from rain gauge or radar ground measurements. As emphasized by Kidd et al. (1998), it is overly simplistic to consider these algorithms as merely a statistical relationship between the passive microwave TBs and the ground measurements. This is because the TB measurements themselves are often cast in the form of a transformed temperature (e.g., Hinton et al. 1992), a TB index, or a set of such variates, related to a distinct physical radiative property of rain such as a scattering signature, a polarization signature, or an emission signature. However, the governing philosophy behind these algorithms is that an empirical calibration is essential in formulating the TB–RR conversion.
Of the five statistical rain-map algorithms used in PIP-2, two do not explicitly screen pixels to detect the presence or absence of rain. These are the BRIS and KIDD algorithms, each of which is applicable to ocean and land. Both of these algorithms use TB indices in their conversion schemes, which are bounded in such a fashion that the regressions pass through zero without necessitating a separate rain detection step (intrinsic screening). However, both algorithms use snow-ice detection over land and sea ice detection over ocean at high latitudes following the Cavalieri et al. (1984), Grody (1991), and Fiore and Grody (1992) screening methods to prevent their algorithms from mistaking frozen surfaces for rain. The ocean-only BERG algorithm uses a light screening procedure, based on a 19-GHz emission warming test and a 37-GHz depolarization test. The ocean–land CALVAL algorithm also uses a light screen over ocean involving a 37-GHz-based discriminant test, but a heavy screen over land involving an extensive land classification scheme that includes differentiating between rain over vegetation or rain over bare soil from flooded ground (McFarland and Neale 1991). Finally, the operational ocean–land NESDIS algorithm uses a fairly comprehensive sequence of tests (a heavy screen) using the 85-GHz scattering index (SI) test developed by Grody (1991) as the basic means to determine whether a scattering surface is present and then, through process of elimination, to determine if the scattering is being produced by rain rather than by snow, ice, or bare ground. In the case of the ocean screen, if the SI test fails to confirm rain, two additional emission tests at 19 and 37 GHz developed by Weng and Grody (1994) are invoked to test for light rain (useful because light rain does not produce significant scattering effects but is detectable over a low-emissivity water background).
The TB–RR conversion schemes of the five statistical rain-map algorithms are developed along similar lines. The BRIS and KIDD algorithms, both of which are scattering-based, use nonlinear regression between radar-derived rainfall measurements from the FRONTIERS radar system deployed over the British Isles and the different TB indices used in the conversion relationships. In the case of BRIS (D. M. Smith et al. 1998), there are separate indices for ocean and land. The land scheme is called the Adjusted Frequency Difference (AFD) method, which uses a temporally adjusted discriminant relationship formed by the slope of rain-free 37V and 85V TBs, in which the AFD index is the distance from the rain-free line toward the rain side of the line. The ocean scheme, called the Oceanic Scattering Index (OSI), is similar to the AFD scheme in that it defines a discriminant line formed by the slope of rain-free 85H TBs and 85-GHz polarization differences (i.e., 85V–85H). The OSI index is the distance from the discriminant line toward the rain side of the line, but instead of regressing this index directly to the radar measurements, it is first adjusted on a spatial grid to account for temperature variations affecting the exact position of the rain-free discriminant line.
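The distance-from-a-discriminant-line idea behind the AFD index can be sketched as follows; the slope and intercept of the rain-free line are hypothetical placeholders, and the use of a vertical offset (rather than a perpendicular distance) is an illustrative assumption, not the published BRIS implementation.

```python
def afd_index(tb37v, tb85v, slope, intercept):
    """Offset of an observed 85V TB from a rain-free discriminant line.

    The rain-free line is modeled as tb85v = slope * tb37v + intercept.
    Scattering by precipitation depresses 85V below the line, so a
    positive index lies on the rain side of the line.
    """
    return (slope * tb37v + intercept) - tb85v
```

For example, with a hypothetical rain-free line of slope 1.0 and intercept 5.0 K, a pixel at (37V, 85V) = (270, 250) K falls 25 K below the line, indicating scattering by rain.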
The scattering index used in the KIDD algorithm is an 85-GHz polarization corrected temperature (PCT), formulated as PCT (85) = 85V + b(85V–85H), where b is an empirically determined constant (Todd and Bailey 1995; Kidd 1998). Correctly formulated, PCTs effectively eliminate the emission signature of the surface, resulting in an index responding to the attenuation properties of rain and insensitive to whether the background is land or ocean.
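The PCT formula above can be written directly in code; b = 0.818 is the widely cited Spencer et al. (1989) value for 85 GHz and is used here as an assumption — the constant adopted by the KIDD algorithm may differ.

```python
def pct85(tb85v, tb85h, b=0.818):
    """85-GHz polarization-corrected temperature: PCT = 85V + b*(85V - 85H).

    b = 0.818 follows Spencer et al. (1989); treat it as illustrative.
    An unpolarized scene (85V == 85H) is left unchanged, while a
    polarized low-emissivity water background is raised, suppressing
    the surface emission signature.
    """
    return tb85v + b * (tb85v - tb85h)
```

Because the surface signature is suppressed, the same rain threshold can be applied over land and ocean.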
The TB–RR conversion scheme of the BERG algorithm is a modification of the original Hughes D-matrix algorithm developed for the United States Navy at the advent of the SSM/I program (see Hollinger et al. 1987). This is a regression scheme using 19-, 22-, and 37-GHz TBs as independent variables, that is, a mixed algorithm (Berg and Chase 1992). This version of the algorithm was designed to estimate monthly ocean rainfall free of noise problems that were embedded in the D-matrix regression coefficients. The original coefficients were developed from a combination of radiative transfer calculations and ground-based climatological data. They were generated as a function of latitude and season but had never been quality controlled to prevent discontinuities across the associated space and time boundaries. The modified algorithm uses an empirical weighting scheme based on rain gauge data (a tropical dataset from the National Climatic Data Center) that is applied to the estimates and that reduces most of the space–time noise, producing credible global rain maps (Berg and Avery 1994).
The second version of the Navy algorithm is the CALVAL algorithm described by Olson et al. (1991) in volume II of the CalVal Final Report (CalVal 1991). The training datasets for this algorithm were built from radar-derived rain rates obtained at Kwajalein Atoll and Darwin, Australia. These were used in a regression expression relating the log of rain rate to five of the seven SSM/I TBs for ocean (a mixed algorithm) but only to the 85-GHz V- and H-Pol TBs for land (a scattering algorithm). Follow-up studies determined that there was a scarcity of high-intensity rain rates in the training datasets, producing a low bias in the algorithm; it does not produce rain rates exceeding about 6 mm h−1. The study of Berg et al. (1998) assesses the first- and second-generation operational Navy algorithms.
The NESDIS algorithm bases its rain rates on nonlinear regression between the 85-GHz scattering index used in the screening procedure and operational radar measurements (Ferraro et al. 1994; Ferraro and Marks 1995). The radar data are obtained from three operational networks worldwide: 1) the 22-platform Japan Meteorological Agency AMeDAS system; 2) the 14-platform U.K. Meteorological Office FRONTIERS system; and 3) the 13-platform U.S. National Weather Service RADAP-II system. In the ocean and land conversion relationships, power laws are used with exponents for the SIs of about 2.0. To prevent unrealistically large values from occurring if an SI becomes too large, the maximum rain rates are arbitrarily bounded at 35 mm h−1.
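The bounded power-law conversion can be sketched as below. The coefficient and exponent shown are illustrative placeholders consistent with the "about 2.0" exponent quoted in the text, not the exact NESDIS regression values; only the 35 mm h−1 cap is taken directly from the description above.

```python
def si_to_rainrate(si, a=0.00513, b=1.9468, cap=35.0):
    """Convert an 85-GHz scattering index SI (K) to rain rate (mm/h).

    Power-law form RR = a * SI**b, with the result bounded at 35 mm/h
    to prevent unrealistically large values for extreme SIs.  The
    coefficients a and b are illustrative assumptions.
    """
    if si <= 0.0:
        return 0.0  # no scattering signal -> no rain
    return min(a * si ** b, cap)
```

A small SI (a few kelvins) maps to light rain, while a very large SI saturates at the cap rather than extrapolating the power law.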
Quasi-physical rain map (GSCAT, MRI, MSFC)
This type of algorithm uses a physically based method (a combination of radiative transfer and some type of conceptual or explicit cloud model) to formulate the functional relationship between rain rates and brightness temperatures but includes in the TB–RR transform an empirical dependency on rain gauge or radar ground data through one or more coefficients. The distinction between these algorithms and the statistical rain map algorithms is that the use of an empirical calibration is more of a convenience than a necessity, in that the calibration coefficients could be specified through physical modeling. The three algorithms making up this group are quite distinct both in terms of screening methodologies and in terms of their TB–RR conversion methodologies.
The ocean–land GSCAT algorithm uses a heavy screening procedure somewhat akin to the NESDIS operational screen (Ferraro and Marks 1995) but without the 19- and 37-GHz emission checks over ocean, and more extensive false detection checks over land, snow, and ice backgrounds. The ocean MRI algorithm uses an intrinsic screen. Instead of detecting the presence or absence of rain, it uses liquid water content (LWC) retrieved from the Shibata (1994) combined water vapor–LWC algorithm (based on 19-, 22-, and 37-GHz TBs) to characterize pixels as greater or less than 50% rain fraction according to the LWC magnitudes (dependent on whether an 85-GHz TB test classifies them as warm or cold rain). The algorithm only considers the greater than 50% rain fraction pixels to be raining, but then under the assumption that these pixels are entirely covered with rain. The raining pixels are also assumed to contain heterogeneously distributed rain rates described by a log-normal distribution, whose standard deviation is taken from GATE-radar statistics (Kedem et al. 1990). The screen of the MSFC algorithm is considered “heavy” because it uses both 37- and 85-GHz PCTs (with separate thresholds) to detect rain over either ocean or land.
The TB–RR conversion scheme of GSCAT is based on a linear regression between RTE model-generated 85H TBs and cloud model-generated rain rates (Adler et al. 1993; Adler et al. 1994). The cloud model database is taken from the simulation of a GATE squall line (Adler et al. 1991) produced by the three-dimensional nonhydrostatic cumulus ensemble model of Tao et al. (1987) and Tao and Simpson (1989). Because the conversion is restricted to 85-GHz TBs, it is a scattering-type algorithm. For ocean pixels, the slope of the regression relationship is modified by a factor of 2.0 so that results better agree with the Morrissey and Greene (1991) West Pacific atoll rain gauge data; thus the ocean component of the algorithm is empirically calibrated.
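A minimal sketch of this linear scattering-type conversion is given below. The no-rain brightness temperature and slope are hypothetical stand-ins for the regression coefficients derived from the cloud model database; only the ocean factor of 2.0 comes from the description above.

```python
def gscat_rr(tb85h, no_rain_tb=273.0, slope=0.1, ocean=True):
    """Linear scattering-type TB-RR conversion at 85H (mm/h).

    Rain rate grows as 85H drops below a no-rain brightness temperature;
    over ocean the slope is doubled, mirroring the empirical factor of
    2.0 applied to match atoll rain gauge data.  no_rain_tb and slope
    are hypothetical placeholders, not the GSCAT coefficients.
    """
    rr = slope * (no_rain_tb - tb85h)
    if ocean:
        rr *= 2.0
    return max(rr, 0.0)  # warm (non-scattering) scenes yield zero rain
```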
Only the ocean component of the MRI algorithm has been published (Aonashi et al. 1996), although an unpublished land version (not described here) was submitted for the PIP-2 calculations. The TB–RR conversion is based on an iterative variational method that minimizes a cost function made up of differences between measured and RTE-modeled TBs at three frequencies (19, 37, 85 GHz); thus it is a mixed-type algorithm. The RTE model of Liu and Curry (1993) is used to relate TBs to rain rates based on the mean precipitation profile structure found in the GATE radar data (Houze and Cheng 1977; Szoke et al. 1986). The empirical nature of the algorithm stems from the GATE radar ground measurements used in making the beamfilling correction and in relating TBs to RRs, as well as the fact that the Shibata LWC algorithm is based on an empirical calibration to radar measurements obtained from a Japanese radar site (Chichijima).
The scattering-type MSFC algorithm linearly relates the depression of the 85-GHz PCT below the rain–no-rain threshold to rain rate (Spencer et al. 1989). This algorithm is based on physical principles in that the PCT coefficients are derived from radiative transfer calculations, but the relationship between PCT depression and rain rate is derived empirically. The coefficient of the relationship stems from a regression in which the zonally averaged retrieved rain rates are forced to agree (statistically) with zonally averaged rain gauge measurements taken from a selected set of globally distributed island and coastal stations. These rain gauge data were also used to calibrate the MSU algorithm developed by Spencer (1993) and used in PIP-1 intercomparisons (Spencer 1994).
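The linear PCT-depression relation can be sketched as follows. The 255-K threshold is a commonly cited Spencer et al. (1989) value and the slope is a hypothetical stand-in for the rain-gauge-calibrated coefficient; both should be treated as assumptions.

```python
def msfc_rr(tb85v, tb85h, threshold=255.0, slope=0.2, b=0.818):
    """Rain rate (mm/h) proportional to the 85-GHz PCT depression
    below a rain/no-rain threshold.

    PCT = 85V + b*(85V - 85H); threshold and slope values here are
    illustrative assumptions, not the published MSFC calibration.
    """
    pct = tb85v + b * (tb85v - tb85h)
    return max(0.0, slope * (threshold - pct))
```

Scenes whose PCT stays above the threshold are treated as rain free; deeper depressions map linearly to heavier rain.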
Physical rain map (BAUER, GSFC, FER-AVE, LIU-CUR, PETTY, RSS, TAMU)
The physical rain map algorithms all use radiative transfer models and some type of conceptual cloud model to establish relationships between surface rain rate and one or more brightness temperatures and/or brightness temperature indices. The distinguishing aspect of these algorithms, relative to the quasi-physical rain map algorithms, is that they do not calibrate the rain-rate relationships to ground measurements. All but one of these algorithms use plane-parallel RTE techniques but widely diverse schemes to represent the microphysical elements of the precipitating cloud (cloud water, rainwater, and ice). All of these algorithms account for heterogeneous beam filling in some fashion but with different methods. Of the seven algorithms in this group, four are noniterative (BAUER, GSFC, FER–AVE, LIU–CUR), while three use an iterative solution approach (PETTY, RSS, TAMU). Note that computational overhead for each of the three iterative schemes is minor since they do not involve explicit RTE model calculations. For ocean, GSFC and TAMU are low-frequency emission-type algorithms, while all the rest use mixed-type conversion schemes. For land, three are scattering-type algorithms (BAUER, GSFC, PETTY), while the rest use mixed-type conversion schemes.
The screening schemes of these algorithms are diverse. The BAUER algorithm consists of both ocean and land components, but since only the ocean component has been published, the land component will not be addressed here. BAUER screens for clear, cloud but no rain, and raining cloud conditions based on a regression estimate of the total column liquid water path (TLWP) derived from RTE modeling using TBs at all four SSM/I frequencies, i.e., a heavy screen. For TLWPs of less than 0.005 g cm−2, a scene is presumed cloud free, whereas for TLWPs greater than this threshold, test rain rates are calculated. If a test rain rate falls below a second threshold of 0.3 mm h−1, the final rain rate is zeroed and conditions are taken as cloudy but without rain. Above the rain-rate threshold, the final rain rates are assigned from the test rain rates. In the ocean–land GSFC algorithm, there is no explicit screen used over ocean. Instead, if the TB–RR conversion scheme produces rain rates less than 0.5 mm h−1, they are zeroed. This is the same intrinsic screening approach used for the ocean-only TAMU algorithm. The GSFC land algorithm is equivalent to that used for GSCAT described above and uses the GSCAT screening approach. The ocean and land screens used for the FER–AVE algorithm are classified as light and heavy, respectively. The ocean screen is based on a 19-GHz polarization test, while the land screen uses a 19V emission warming threshold test followed by polarization threshold tests at both 19 and 37 GHz to ensure the scene scatters enough to indicate rain. The screens for the ocean–land LIU–CUR algorithm are based on threshold tests of 19–85-GHz indices, the same indices used for the TB–RR conversion relationships (see below). A 37-GHz polarization test is also included for ocean. Both ocean and land screens are considered heavy.
The ocean–land PETTY algorithm uses a heavy screening procedure for ocean involving all SSM/I channels but does not screen explicitly over land. The ocean screen involves testing both 85-GHz attenuation and scattering indices against fixed thresholds (the attenuation index is related to liquid water path and thus constitutes a columnar liquid water threshold test). If either test is positive, rain is considered possible and must be confirmed during the iterative TB–RR conversion process, which adjusts an initial-guess rain rate until the modeled TB indices match those derived from the measured TBs. The ocean-only RSS algorithm does not screen explicitly. During the retrieval, when a retrieved total columnar liquid water optical depth at 37 GHz exceeds 0.04, rain is assumed to exist within the scene.
The TB–RR conversion schemes of this group of algorithms generally adopt the strategy of reducing a large set of radiative transfer calculations to a manageable TB–RR relationship either through regression or some type of analytical expression. In the case of BAUER (Bauer and Schluessel 1993), an eight-stream adding-doubling model is used to create a TB–RR database in which nonlinear regression is used to create a relationship between surface rain rate and TBs at 19, 22, and 85 GHz. The vertically structured conceptual cloud model includes cloud water, rainwater, and ice. The cloud vertical structure is based on the work of Skatskii (1965) with drop size spectra for six cloud types based on data reported at the 1970 Weather Modification Conference of the American Meteorological Society. The rain rate intensity is assumed to be height dependent, taken as piecewise linear in two sections with maximum intensity at the freezing level, assuming a Marshall–Palmer drop size distribution (Marshall and Palmer 1948). The structure of the ice profile follows that used by Wu and Weinman (1984). Beamfilling correction is accomplished by randomly defining partial cloud cover within the beam patterns, allowing the regression procedure used with the forward RTE model calculations containing the partial cover effects to produce the statistically best-fit TB–RR equation.
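Several of the conceptual cloud models in this group assume Marshall–Palmer raindrops. The classic exponential form of that distribution (Marshall and Palmer 1948), with its standard parameter values, can be sketched as:

```python
import math

def marshall_palmer_nd(diameter_mm, rain_rate_mm_h):
    """Marshall-Palmer (1948) raindrop size distribution:
    N(D) = N0 * exp(-Lambda * D), returned in m^-3 mm^-1.

    Uses the standard values N0 = 8000 m^-3 mm^-1 and
    Lambda = 4.1 * R**-0.21 mm^-1 (R in mm/h).
    """
    n0 = 8000.0
    lam = 4.1 * rain_rate_mm_h ** -0.21
    return n0 * math.exp(-lam * diameter_mm)
```

Heavier rain flattens the exponential (smaller Lambda), shifting the spectrum toward larger drops, which in turn controls the modeled emission and scattering signatures.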
The GSFC and TAMU ocean algorithms are somewhat similar; the GSFC algorithm also includes a land component. The ocean component of the GSFC algorithm is a modification of the algorithm described in Wilheit et al. (1991), which was designed for monthly averages, whereas the land component is described by Adler et al. (1993) and has already been discussed. In the ocean algorithm, nonlinear analytical TB–RR expressions for different freezing level heights are found from RTE calculations generated with the successive order of scattering model described in Wilheit et al. (1977) for a conceptual cloud model similar to that described in the same publication. The raining cloud consists of a Marshall–Palmer rain layer up to the freezing level with a fixed amount of cloud liquid water and no ice. Because the PIP-2 calculations are instantaneous, the iterative approach assuming a lognormal form of the rain-rate distribution over a time–space grid used in the original Wilheit et al. (1991) algorithm is replaced with a one-step approach in which rain rates are assigned according to a TB–RR relationship based on a derived channel made up from the V-Pol TBs at 19 and 22 GHz (2 × 19V − 22V), that is, an emission index. The appropriate analytical expression (i.e., the freezing level height dependence) is determined by using the climatological freezing level height associated with the location of a pixel scene. A beamfilling factor of 1.5 is used to scale the retrieved rain rates, guided by the studies of Chiu et al. (1990) and Short and North (1990) based on GATE radar statistics. The studies of Chang et al. (1993a,b) and Chang et al. (1995) describe the application and validation of the algorithm.
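The derived emission channel and the beamfilling scaling described above are simple enough to state in code; the freezing-level-dependent TB–RR lookup itself is omitted since its analytical form is not reproduced in this summary.

```python
def gsfc_emission_index(tb19v, tb22v):
    """Derived emission channel 2*19V - 22V (K) used by the GSFC
    ocean algorithm; the 22V term compensates for water vapor."""
    return 2.0 * tb19v - tb22v

def apply_beamfill(rr, factor=1.5):
    """Scale a retrieved rain rate by the beamfilling factor
    (1.5 for GSFC, per Chiu et al. 1990 and Short and North 1990)."""
    return factor * rr
```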
The TAMU algorithm, which shares the foundations described in Wilheit et al. (1977, 1991) and advances the GSFC algorithm, differs from GSFC in four ways: 1) a more advanced RTE model is used (Tesmer and Wilheit 1998); 2) the freezing level height is determined iteratively based on the TB–RR relationships at 19 and 22 GHz, instead of using climatological information; 3) a second rain-rate estimation is made from a 37-GHz TB–RR relationship with the final estimated rain rate taken as the maximum of the 19-GHz estimate and twice the 37-GHz estimate (the factor of 2.0 is a cross-channel calibration factor determined statistically over the TB region where the 37-GHz dynamic range intersects the 19-GHz dynamic range); and 4) the beamfilling correction factor is defined as 1.8, consistent with the value used in Wilheit et al. (1991); see Ye et al. (1997). The modifications are largely described in Wilheit et al. (1994). This algorithm is considered an emission algorithm because the 37-GHz estimates can only be used in the final estimate at the light-rain end of the scale where their rain signature is emission warming.
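Combination rule 3) and the beamfilling factor of 4) can be sketched as below; applying the beamfilling factor to the combined estimate (rather than to the channel estimates individually) is an assumption about the processing order.

```python
def tamu_final_rr(rr19, rr37, cross_cal=2.0, beamfill=1.8):
    """Combine the 19- and 37-GHz rain-rate estimates (mm/h) as
    max(RR19, 2 * RR37) and apply the 1.8 beamfilling factor.

    cross_cal = 2.0 is the cross-channel calibration factor; the
    placement of the beamfill scaling is an illustrative assumption.
    """
    return beamfill * max(rr19, cross_cal * rr37)
```

At the light-rain end the doubled 37-GHz estimate dominates; at moderate rates the 19-GHz emission estimate takes over.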
The ocean–land FER–AVE algorithm, as described by Ferriday and Avery (1994), uses the two-stream Eddington RTE model described in Kummerow (1993), along with a conceptual cloud model similar to that given by Wu and Weinman (1984), to produce a cloud-radiation database. From the database, and considering ocean and land separately, single TB indices are found based on combinations of TBs at different frequencies and polarizations (through straight addition and subtraction), which are linearly related to surface rain rates. The ocean index is given by 19H + 19V + 37H − 22V − 37V − 85H; the land index is given by 19H + 37H − 2 × 85H. Linear regression is then used to determine offset and slope coefficients for the separate ocean–land TB–RR relationships, designed to produce rain rates from 0.5 mm h−1 upward (the land offset is varied according to latitude and season to account for variation of the land surface temperature). Beamfilling correction is accomplished by adjusting the TB indices so that rain rates less than 20 mm h−1 obtained from the unadjusted relationships are doubled for the adjusted relationships (i.e., assumes a beamfilling factor of 2.0 for low or moderate rain rates).
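The two FER–AVE channel combinations quoted above translate directly into code (the offset/slope regression coefficients and the latitude/season adjustment are omitted since their values are not given here):

```python
def fer_ave_indices(tb):
    """Ocean and land TB indices of the FER-AVE algorithm, formed by
    straight addition/subtraction of SSM/I channels.

    tb is a dict keyed by channel name ('19H', '19V', '22V',
    '37H', '37V', '85H'), values in kelvins.
    """
    ocean = tb['19H'] + tb['19V'] + tb['37H'] - tb['22V'] - tb['37V'] - tb['85H']
    land = tb['19H'] + tb['37H'] - 2.0 * tb['85H']
    return ocean, land
```

Each index is then related linearly to surface rain rate with separately regressed ocean and land coefficients.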
The LIU–CUR algorithm defines an index from 19- and 85-GHz TBs, seeking to create a TB variable that responds linearly with respect to surface rain rate (Liu and Curry 1992). The index is derived from analysis of a cloud-radiation dataset generated from the successive-order-of-scattering RTE model of Liu and Curry (1993) and three types of conceptual cloud models. The first cloud model is a liquid-only stratiform structure consisting of cloud drops and raindrops, in which the cloud drop size spectrum follows the formulation used in Deirmendjian (1964) and the raindrops are taken as Marshall–Palmer. The second cloud model (also stratiform) adds a thin ice layer to the top of the liquid-only cloud, taking the ice particles as Marshall–Palmer. The third cloud structure represents deep convection (cloud drops, raindrops, and ice), following the conceptual model developed by Wu and Weinman (1984). The TB index is expressed as δTB = (TB19H − TBo19H) − (TB85H − TBo85H), where the TBo terms are rain threshold values at the two frequencies, ultimately determined from statistical analysis of a large set of SSM/I measurements. The index combines the emission warming response function at 19 GHz with the scattering depression response function at 85 GHz, which, because of the saturation and flattening of the 19-GHz function at around 15 mm h−1, produces a combined function that is somewhat linear with respect to surface rain rate. The actual TB–RR conversion relationship is a nonlinear power-law regression of the form RR = a(δTB)^b, with the coefficients a and b determined from the cloud-radiation dataset.
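The combined emission/scattering index is a direct arithmetic combination of the two channels and their rain thresholds; a minimal sketch, with the TBo threshold values supplied as inputs:

```python
def liu_cur_delta_tb(tb19h, tb85h, tb0_19h, tb0_85h):
    """Combined LIU-CUR index
    dTB = (TB19H - TB0_19H) - (TB85H - TB0_85H), in kelvins.

    Emission warming raises 19H above its threshold and scattering
    depresses 85H below its threshold, so both effects add."""
    return (tb19h - tb0_19h) - (tb85h - tb0_85h)
```

For example, 10 K of 19-GHz warming combined with 10 K of 85-GHz depression yields an index of 20 K.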
The PETTY algorithm used in PIP-2 is designed for both ocean and land applications, but the following description is limited to the ocean component since the land algorithm has not been published (it is similar to the GSCAT land algorithm). The approach taken in the ocean component is to relate transmittances of the rain layer at different frequencies to attenuation indices referred to as normalized polarization differences (NPDs) at 19, 37, and 85 GHz (P19, P37, P85). These are formed by taking ratios of the differences of the measured V-Pol and H-Pol TBs (TV and TH) to the corresponding differences of polarized TBs representing cloud-free conditions (TV,0 and TH,0), in which the effects of atmospheric water vapor emission and elevated surface emission due to wind roughening are accounted for in the TV,0 and TH,0 terms. Empirical expressions are used to formulate the cloud-free TBs based on the SSM/I water vapor and surface wind algorithms of Alishouse et al. (1990) and Goodberlet et al. (1989). A three-dimensional reverse Monte Carlo RTE model is used to formulate a hypothetical rain transmittance relationship, given as a power law relating each Pf to the rain-layer transmittance TRh,f.
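The NPD itself is just a ratio of polarization differences. A minimal sketch, assuming the cloud-free TBs have already been produced by the empirical water-vapor/wind expressions:

```python
def npd(tv, th, tv0, th0):
    """Normalized polarization difference P_f = (TV - TH) / (TV0 - TH0).

    P_f is near 1 in cloud-free conditions and decreases toward 0 as rain
    attenuates the polarized ocean-surface signal at frequency f.
    """
    return (tv - th) / (tv0 - th0)
```

In the full algorithm this quantity is evaluated at 19, 37, and 85 GHz and then inverted through the Monte Carlo-derived power-law transmittance relationship.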
This algorithm estimates surface rain rate based on the retrieval of the mean attenuation due to cloud–rain liquid water over a beam (defined by optical depth τcr), in which a physical beamfilling correction is made in determining τcr (Wentz and Spencer 1998). A seamless integration of the rain algorithm has been performed with a previous algorithm designed for simultaneous retrieval of columnar water vapor (V), columnar cloud liquid water (Lc), and surface wind speed (W); see Wentz (1997). In the new algorithm, atmospheric transmittances due to liquid water at 19 and 37 GHz (TR1,19, TR1,37) along with an effective radiating temperature (TU) are included as parameters along with V and W in a set of TB equations (involving 19V, 22V, 37V, and 37H) that is solved simultaneously using an iterative scheme. Rain is assumed present below a fixed TR1,37 threshold. The transmittance terms are related to the total columnar water for a raining cloud (Lr) while TU is related to the height from which the radiation is emanating and is used in the retrieval to account for scattering effects on the TR1,19 and TR1,37 terms; TB polarization signatures are used to determine TU independent of the transmittances. In inferring the surface RR from TR1,19 and TR1,37, assumptions on beam filling, cloud–rainwater partitioning, and rain column depth (H) are required. The difference between TR1,19 and TR1,37 is used to define a beamfilling correction factor, used along with the 19- and 37-GHz transmittances to formulate the beam-fill-corrected mean optical depth τcr. Rain rate is formulated in terms of τcr, the depth parameter H (related to freezing level height, expressed as a function of SST), and the columnar cloud liquid water Lc (separated from Lr using an ad hoc cloud–rain partitioning given as a function of RR and H).
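The transmittance-to-optical-depth step can be sketched as follows. This is illustrative only: the beamfilling correction function is a hypothetical placeholder, not the published Wentz–Spencer formulation, and 53.1° is the nominal SSM/I incidence angle.

```python
import math

def optical_depth(transmittance, theta_deg=53.1):
    """Beam-average vertical optical depth from a slant-path transmittance,
    using the nominal SSM/I incidence angle."""
    return -math.cos(math.radians(theta_deg)) * math.log(transmittance)

def beamfill_corrected_tau(tr19, tr37):
    """Illustrative beamfilling correction: the 19/37-GHz transmittance
    difference serves as an index of sub-beam rain variability, inflating
    the 37-GHz optical depth when the two frequencies disagree."""
    tau19 = optical_depth(tr19)
    tau37 = optical_depth(tr37)
    correction = 1.0 + max(tau37 - tau19, 0.0)  # placeholder functional form
    return correction * tau37
```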
Physical profile (KUM, GPROF, IFA–SAP, OLSON, FSU)
The distinctive feature of the physical profile algorithms is that they retrieve the vertical structure of one or more hydrometeor categories through multispectral inversion rather than just surface rain rates. Such algorithms may include both precipitating and suspended hydrometeor categories, in both liquid and frozen states. Surface rain rates themselves are diagnosed after the retrieval process, directly from the retrieved profiles. This is done with either a fallout model applied to the precipitating hydrometeors or with regressions relating surface rain rates to modeled upwelling brightness temperatures associated with the profile structures giving rise to those surface rain rates. The five physical profile algorithms used in PIP-2 consist of three iterative methods (KUM, OLSON, and FSU) and two noniterative methods (GPROF and IFA–SAP). Four of the five algorithms use three-dimensional nonhydrostatic cloud models to develop large databases of microphysical profiles (thousands) that are assigned TBs for all seven polarized SSM/I channels. One algorithm (KUM) produces a similar but far less dense database (27 profiles) using microphysical profiles acquired from aircraft radar data. All five algorithms are designed for ocean and land applications, although the KUM and FSU algorithms sidestep the profile process under certain situations over land backgrounds, directly retrieving surface rain rate in those cases.
The KUM, GPROF, and OLSON algorithms use nearly equivalent light screening procedures, consisting of a 37-GHz polarization check over ocean and two 85-GHz scattering checks over land (the first against a fixed threshold, the second against 37 GHz). The screening procedures for both the IFA–SAP and FSU algorithms are heavy, consisting of modified and nonmodified NESDIS operational screens for ocean and land, respectively.
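The light screens can be expressed as two small predicate functions. The thresholds below (`pol_min`, `tb85_max`, `depression_min`) are hypothetical placeholders, since the published algorithms use their own tuned values:

```python
# Hedged sketch of the light KUM/GPROF/OLSON-style screens
# (all thresholds illustrative, TBs in kelvin).

def ocean_rain_flag(tb37v, tb37h, pol_min=10.0):
    """Over ocean: rain suspected when the 37-GHz polarization difference
    collapses below pol_min, since rain depolarizes the ocean signal."""
    return (tb37v - tb37h) < pol_min

def land_rain_flag(tb85v, tb37v, tb85_max=250.0, depression_min=5.0):
    """Over land: two 85-GHz scattering checks - an absolute threshold,
    and a depression of 85 GHz relative to 37 GHz."""
    return (tb85v < tb85_max) or ((tb37v - tb85v) > depression_min)
```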
The TB–RR conversion schemes of the physical profile algorithms are similar in concept but distinct in terms of numerical inversion methods. The KUM algorithm, which uses relatively few radar-derived cloud structure profiles to form the initial microphysical database, creates larger subdatabases for each structure category from which to search for a solution by manipulating the values of the underlying parameters affecting the radiative transfer for the cloud structures (Kummerow and Giglio 1994a,b). A solution is achieved by finding the best match of the measured TBs with the modeled TBs within the various cloud structure subdatabases. In so doing, it associates the parameters of interest to the RTE model-generated TBs through regressions for a given structure, then inserts the measured TBs into the regressions, thus providing estimates of the retrieval parameters (e.g., surface rain rate). These are then used as guesses back in the RTE model in forward calculations to test for agreement between the model and measurements at all frequencies and polarizations. Beam heterogeneity is accounted for in the forward modeling by assuming that rainfall is lognormally distributed in a manner consistent with the local spatial variance of the 85-GHz TBs. Through this iterative process, the set of parameters and its associated cloud structure yielding the best TB measurement–model matchup, in an rms sense, defines the solution. The formative ideas behind this algorithm in terms of methodology, radar-based cloud structures, and RTE modeling are found in Kummerow et al. (1989), Kummerow et al. (1991), and Kummerow (1993).
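The core database-matchup step shared by these profile algorithms, selecting the candidate whose modeled TBs best match the measurement in an rms sense, can be sketched as follows. The two-channel example database is purely illustrative; in practice the entries would be seven-channel TB vectors from RTE forward calculations over the cloud-structure profiles:

```python
import math

def rms_distance(measured, modeled):
    """Root-mean-square mismatch between measured and modeled TB vectors."""
    return math.sqrt(sum((m, s) in () or (m - s) ** 2
                         for m, s in zip(measured, modeled)) / len(measured))

def best_match(measured_tbs, database):
    """database: list of (profile_id, modeled_tbs); returns the id of the
    profile whose modeled TBs minimize the rms mismatch."""
    return min(database,
               key=lambda entry: rms_distance(measured_tbs, entry[1]))[0]
```

The iterative algorithms (KUM, OLSON, FSU) use a match like this as a starting point and then adjust the underlying profile parameters; the noniterative ones stop at (or weight across) the database itself.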
The OLSON and FSU algorithms are also iterative. However, these algorithms use explicit three-dimensional nonhydrostatic cloud models to form dense cloud-radiation databases that are then used to select initial guess hydrometeor profiles for numerical inversion schemes. The OLSON algorithm, whose rain map counterpart is described in Wilheit et al. (1994) and that uses the RTE model described in Kummerow (1993) and an experimental cloud model developed at the University of Wisconsin (J. Rideout 1995, personal communication), iteratively adjusts an initial guess profile found by identifying the database profile exhibiting the smallest Euclidean distance between its associated modeled TBs and the measured TBs. An optimizer controls the adjustment of the initial profile until the forward modeled TBs for all seven SSM/I channels are well matched with the measurements. The surface rain rate is derived from the near-surface liquid precipitation water content of the modeled cloud ensembles, averaged to 25-km resolution.
The FSU algorithm, which is described by Smith et al. (1994a,b) and based on microphysics from the University of Wisconsin Nonhydrostatic Modeling System (UW-NMS) developed by Tripoli (1992a,b), also uses an iteration approach but with respect to unpolarized TBs at 19, 37, and 85 GHz. In addition, it uses an elaborate initial guess procedure that leads to rapid optimization and calculates the rain-rate profile from the retrieved hydrometeor profiles based on a rain fallout model. This algorithm uses spatial deconvolution to match resolutions at the different frequencies, set to the beam size of 85 GHz, to reduce heterogeneous beamfilling bias; see Farrar and Smith (1992). The theoretical background for this algorithm is given in Smith et al. (1992) and Mugnai et al. (1993); the RTE model is described in Xiang et al. (1994) and Smith et al. (1994a). Panegrossi et al. (1998) have substantiated the use of cloud model microphysics in this type of algorithm by showing the consistency between the multifrequency TB domain obtained from the RTE-cloud model (referred to as the model TB manifold) with the measured TB manifold.
The GPROF algorithm (Kummerow et al. 1996) was an outgrowth of the KUM algorithm but with two significant changes. The iterative method was replaced by a single-pass Bayesian approach (see also Evans et al. 1995) to reduce computational overhead in anticipation of global operational applications during the TRMM era, and the low-density radar-derived microphysical profile database was replaced by a dense database acquired from cloud model simulations using both the GSFC cumulus ensemble (GCE) model of Tao and Simpson (1993) and the UW–NMS (whose simulation microphysics were provided by FSU). In this algorithm, the cloud-model-generated profiles are assigned a priori probabilities and then related to brightness temperatures at all SSM/I frequencies and polarizations through a forward RTE model. This establishes a means to give weights to the individual database profiles in representing a solution profile, based on the deviation in an rms sense of the measured TBs for a given pixel from the modeled TBs in the database and the a priori probabilities.
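The single-pass Bayesian compositing step can be sketched as a posterior-weighted average over the database. The Gaussian error model with a single per-channel noise parameter `sigma` is an illustrative assumption, not the published GPROF error covariance:

```python
import math

def bayesian_rain_rate(measured_tbs, database, sigma=2.0):
    """database: list of (rain_rate, modeled_tbs, prior). Each profile is
    weighted by its a priori probability times a Gaussian likelihood of the
    measured-vs-modeled TB misfit; returns the posterior-mean rain rate."""
    weights, rates = [], []
    for rain_rate, modeled, prior in database:
        misfit = sum((m - s) ** 2 for m, s in zip(measured_tbs, modeled))
        weights.append(prior * math.exp(-0.5 * misfit / sigma ** 2))
        rates.append(rain_rate)
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, rates)) / total
```

No iteration is needed: one pass over the database yields both the solution profile (as a weighted composite) and its surface rain rate.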
The one-pass IFA–SAP algorithm uses a Monte Carlo procedure to expand a cloud database generated from the UW–NMS model, as described in Marzano et al. (1994). The solution profiles are derived without iteration from a Bayesian method based on weighting different database profiles according to the proximity of measured and modeled TBs, and on a priori probabilities of occurrence of given profile structures. The methodology is described in Mugnai et al. (1994) and Pierdicca et al. (1996); the latter reference also contains a description of the RTE model used for the PIP-2 calculations. A simplified version of the algorithm was used for PIP-2, consisting of the identification of the database profile whose Euclidean distance to the measurement is minimized within a given radiometric error. Two recent UW–NMS simulations described in Panegrossi et al. (1998) were used to form the cloud-radiation database. As with FSU, surface rain rates are derived from the estimated hydrometeor profiles using fallout equations, with the deconvolution scheme of Farrar and Smith (1992) used to reduce heterogeneous beamfilling effects.
List of 27 PIP-2 cases in which columns indicate name(s) of case provider(s), case name, month–year of event, satellite source, type of background (L for land; O for ocean; L/O for both), type of validation data provided (see text), and number of overpasses (total of 118). Case numbering goes to 28; however, case 7 does not exist because its data could not be processed.
PIP-2 master overpass list in which columns indicate case number; last name(s) of provider(s); case name; total number of overpasses for case; overpass number within given case (a “#” to the left indicates the overpass was selected for the reprocessing calculations, an “m” to right indicates missing scans in subswath array but not in target grid, an “M” indicates missing scans in target grid within subswath array, an “N” indicates data were extracted from NESDIS format instead of standard Wentz format); date of event (month–day–year); UTC time of event; satellite source; ascending or descending orbit (A or D); type of background (L for land, O for ocean, L/O for both); total number of low-resolution scans in subswath array; latitude–longitude of center of target grid; pixel or cross-track (CT) and scan or downtrack (DT) target grid dimensions; and presence or absence of validation data (Y for yes, N for no) along with validation data quality indicator (G for good, F for fair, P for poor). Note that case 7 is not included in analysis because of data processing problems.
List of names and authors of 20 PIP-2 algorithms.
Summary of PIP-2 algorithm features. The name(s) of the principal author(s) and the algorithm names are indicated, followed by ocean–land indicators, the screening approach including both the usage flag and the input channels used for the screen (see key at bottom of table for explanation of usage flag), the type of algorithm (separated into five descriptors with key below explaining the terms), the input channels used for the TB–RR conversion, and the principal references.
Algorithm categorizations according to solution method, channel input, and screening approach. For the ocean stratifications, all 20 PIP-2 algorithms are included. For the land stratifications, only 17 algorithms are indicated as three algorithms were ocean-only designs.
Classification of 56 reprocessed overpasses according to meteorological nature of rain events (49 are stratified into ocean classes while 44 are stratified into land classes). Note “ca” indicates case while “op” indicates overpass.
Table 7a. Summary of 20 individual algorithm intercomparisons for ocean based on PI screen with respect to validation data (top table) and composite-all results (bottom table). Algorithms (column 2) ordered by magnitude of all-target area-averaged value (from smallest to largest) for along-diagonal results based on PI screen calculations (column 1). Numeral to right of algorithm name in column 3 (RO Ord) represents numerical order if rain-only area averages are used to determine position. Asterisks indicate those algorithms whose position order changes by more than four places between two orderings. Next four columns (4–7) indicate mean algorithm rain rate (Alg), mean validation rain rate (Val), or mean composite-all rain rate (Com All); bias between the two (Bias); and bias-adjusted rms (BARMS), all in mm h−1. Columns 8–14 give ratio of Bias to Val or Com All (Bias Rat), ratio of Alg to Val or Com All (Means Rat), algorithm ranking by means ratio, ratio of BARMS to Val or Com All (RMS Rat), ranking by rms ratio, correlation coefficient (CC), and ranking by CC. Columns 15–16 are final score (Final Scr) given by sum of three rankings, and final ranking (Final Rnk) according to Final Scr.
Table 7b. Same as Table 7a except for ocean case based on common screen. Besides asterisks indicating algorithms whose position order changes by more than four places between the two position orderings, there are “#” signs to right of first position ordering. These indicate algorithms whose position order changes by more than four places from PI screen to common screen.
Table 7c. Same as Table 7a except for 17 individual algorithm intercomparisons for land case (based on PI screen).
Table 7d. Same as Table 7b except for 17 individual algorithm intercomparisons for land case (based on common screen). Besides asterisks indicating algorithms whose position order changes by more than four places between the two position orderings, there are “#” signs to right of first position ordering. These indicate algorithms whose position order changes by more than four places from PI screen to common screen.
Table 8a. Off-diagonal intercomparison statistics between validation results and composite algorithm classes, in which various categories within solution method, channel input, and screening approach stratifications are considered. Solution method stratification does not consider quasi-physical and statistical categories separately, only their combined category (STAT). Results consider all rain systems for ocean case based on use of PI screen and low-resolution SSM/I results for reprocessed overpasses. Bias = Ref − Comp; % Diff = % Ref − % Comp; Y(Comp) = Y0 + S*X(Ref).
Table 8b. Continuation of Table 8a for intercomparisons between composite algorithm classes and other classes within a given stratification. Bias = Ref − Comp; % Diff = % Ref − % Comp; Y(Comp) = Y0 + S*X(Ref).
Individual algorithms grouped according to large negative bias (category A), small-medium bias (category B), or large positive bias (category C), in which biases are taken separately with respect to both validation data and all-algorithm composite results using PI screen. Ocean and land cases are considered separately. Footnotes indicate changes in grouping assignments that would occur if common screen is used. Histograms are formed and biases taken with respect to all common raining pixels from reprocessed overpasses in which at least one rain rate in each algorithm pair used in intercomparison exceeds 1 mm h−1. Large negative bias defined as mean value of difference histogram (taken as individual algorithm rain rate minus comparator rain rate) falling below −1.5 mm h−1; large positive bias defined as mean value of difference histogram exceeding +1.5 mm h−1; small-medium bias defined as mean value of difference histogram falling between ±1.5 mm h−1.
Comprehensive PIP-2 results are available on CD-ROM; see Smith et al. (1996).
Copies of the original input CD-ROM are obtainable from H. Michael Goodman at the NASA/Marshall Space Flight Center, Huntsville, AL 35806.