Prominent achievements made in addressing global precipitation using satellite passive microwave retrievals are often overshadowed by their performance at finer spatial and temporal scales, where large variability in cloud morphology poses an obstacle for accurate precipitation measurements. This is especially true over land, with precipitation estimates being based on an observed mean relationship between high-frequency (e.g., 89 GHz) brightness temperature depression (i.e., the ice-scattering signature) and surface precipitation rate. This indirect relationship between the observed (brightness temperatures) and state (precipitation) vectors often leads to inaccurate estimates, with more pronounced biases (e.g., −30% over the United States) observed during extreme events. This study seeks to mitigate these errors by employing previously established relationships between cloud structures and large-scale environments such as CAPE, wind shear, humidity distribution, and aerosol concentrations to form a stronger relationship between precipitation and the scattering signal. The GPM passive microwave operational precipitation retrieval (GPROF) for the GMI sensor is modified to offer additional information on atmospheric conditions to its Bayesian-based algorithm. The modified algorithm is allowed to use the large-scale environment to filter out a priori states that do not match the general synoptic condition relevant to the observation and thus reduces the difference between the assumed and observed variability in the ice-to-rain ratio. Using the ground Multi-Radar Multi-Sensor (MRMS) network over the United States, the results demonstrate outstanding potential in improving the accuracy of heavy precipitation over land. It is found that individual synoptic parameters can remove 20%–30% of existing bias and up to 50% when combined, while preserving the overall performance of the algorithm.
Rather than uniform and continuous, transitions of a physical system are often seen through a series of random perturbations characterized by a mean general trend. The capability to promptly recognize such a trend is key to early detection of the system’s change but is often limited by access to accurate measurements of its perturbations. In an effort to adapt to an ever-changing climate, understanding fluctuations of atmospheric phenomena, especially their extremes (e.g., events with parameters measured to be in the top 10% of historically observed values), plays a critical role. Emerging from complex thermodynamical processes, changes in precipitation could be thought of as a reflection of transitions of the changing physical system. Diagnosing the onset of a change in such a complex system is usually done by examining changes in its extremes. Therefore, direct observations of extreme precipitation at global scales are invaluable in understanding the ever-changing climate. Despite a long, albeit sparse, record [first known observations date back to 2000 BCE (Wang and Zhang 1988)], globally complete precipitation measurements did not become available until the modern era of satellite Earth-observing systems that employ infrared and microwave radiometric techniques (e.g., Atlas and Thiele 1981). Achieving measurement standards of rainfall in atypical (i.e., extreme) environments on small spatiotemporal scales across the globe, however, has turned out to be more difficult than anticipated. Although satellite observations can have relatively large random errors at small scales, their global nature makes them suitable for addressing potential changes in global precipitation extremes.
The first satelliteborne passive microwave (PMW) instruments date back to the mid-1960s (e.g., Basharinov et al. 1971). Rainfall detection from space began with the Scanning Multichannel Microwave Radiometer launched on board the Nimbus-7 satellite in the late 1970s, making satellite PMW measurements an indispensable part of global precipitation records until the present day. Although far from ideal, the relatively low cost of microwave imagers made them an affordable and popular choice of instrument for many past and upcoming space missions (e.g., Nimbus, DMSP, NOAA, MetOp, TRMM, GPM, JPSS; see appendix for a list of acronyms). At the same time, passive microwave precipitation retrievals became, either directly or indirectly, one of the most important components of gridded products [e.g., IMERG (Huffman et al. 2015), CMORPH (Joyce et al. 2004), TMPA (Huffman et al. 2007), PERSIANN-CCS (Hong et al. 2004), and GSMaP (Kubota et al. 2007; Ushio et al. 2009)] that are commonly used in applications requiring precipitation at high spatial and temporal resolutions.
Continuous work on finding physical relations between the observed [i.e., brightness temperatures (Tbs)] and state (i.e., rainfall) vectors led to PMW retrieval improvements from fairly simple regression models (e.g., Wilheit et al. 1976) to sophisticated algorithms that employ radiative transfer and cloud-resolving models, optimal estimation methods (e.g., Kummerow et al. 2001; Elsaesser and Kummerow 2008; Kummerow et al. 2011), and principal component analysis (e.g., Petty and Li 2013). Limitations, however, still exist (e.g., Petković and Kummerow 2017, hereinafter PK2017), especially over land. Specifically, high and variable land surface emissivity obstructs the information content provided by PMW instruments, limiting rainfall signals to an indirect, nonunique relationship between cloud ice-scattering signatures and surface rainfall. Based on the mean observed ratio between ice aloft and the surface rainfall, these estimates can often be inaccurate, with more pronounced biases observed during extreme events. In addition to the example given in the study by Petković and Kummerow (2015), a difference in mean precipitation rate bias between ground radar measurements and an operational satellite PMW retrieval (see section 3) is shown in Fig. 1. Heavy precipitation, in this study defined as the top 10% of precipitation rates given by ground reference at satellite field-of-view (FOV) level, for the period between September 2014 and August 2015 (for data and domain description, see section 4) is compared on a 0.25° grid over the eastern United States, ensuring high-quality data and a good understanding of related system features. Clearly, satellite estimates show negative bias over the entire region. In addition, Fig. 2 compares satellite FOVs to the corresponding ground reference (pixel-to-pixel comparison). It reveals a negative (underestimation) bias of 28% for the PMW retrieval for the top 10% of precipitation rates as defined by the ground-based radars.
Characterized by a relatively high correlation coefficient (0.66), the retrieval’s performance is consistent in its negative bias at all precipitation rate values (black crosses mark mean retrieved precipitation rates for each of the ground reference precipitation rate bins). This is the result of an assumed relationship between the cloud property (i.e., ice content) and precipitation rate used to retrieve the precipitation that was derived from a broad range of observations but used only on a narrow portion of the rainfall spectrum (i.e., a heavy precipitation regime).
Fixing this problem requires a better understanding of the ice content in heavy precipitation events. Rather than trying to improve the retrieval itself, a solution is seen in complementing the observed brightness temperature vector with information that would help mitigate ambiguities in the ice-to-rain relationship. In an attempt to do so, this study seeks to utilize more-complex links between observed cloud properties and common atmospheric parameters (e.g., the large-scale environment). Based on findings presented in PK2017, it is hypothesized that such information is correlated with the synoptic state of the atmosphere. Earlier studies suggest that parameters such as temperature and humidity, atmospheric stability (Behrangi et al. 2015; Casella et al. 2013), and surface conditions (You et al. 2015) successfully reduce errors of the solution in precipitation algorithms. Specifically, Casella et al. (2013) demonstrate the validity of using meteorological parameters in guiding the retrieval scheme in recovering microphysical profiles and surface rain rates from PMW satellite measurements. Using a simulation framework, they found that retrieval ambiguity can be reduced by employing cross combinations of meteorological variables that are consistent with the observed environmental conditions. Therefore, additional variables can mitigate the information gap between the assumed and observed cloud properties. To better understand the challenges inherent in such a scheme, a general review of the Bayesian approach, the approach used by the PMW retrieval validated in Figs. 1 and 2, is provided next.
2. Understanding the sources of rainfall bias: Theoretical background
The retrieval relies on Bayes’ theorem, a fairly simple statistical method developed by Bayes in the eighteenth century that uses the observed probability of an event (a priori knowledge) to predict the probability of its reoccurrence if similar conditions exist. Using the definition presented in Rodgers (2000) applied to PMW rainfall detection, rain rate probability is given by the following equation:
$$ P(R \mid \mathbf{Tb}) = \frac{P(R)\, P(\mathbf{Tb} \mid R)}{P(\mathbf{Tb})}, \qquad (1) $$
where P(R|Tb) is the a posteriori conditional probability of rain rate R occurring with observed brightness temperature vector Tb; P(R) and P(Tb) are a priori probabilities of rain rate and brightness temperature, respectively; and P(Tb|R) is the conditional probability of a brightness temperature vector observed with a given rain rate R. Terms on the right-hand side of Eq. (1) are given by a priori knowledge (stored in what is usually referred to as an a priori database), while the left-hand side represents the most likely outcome (a prediction). In the PMW application (Figs. 1 and 2) the retrieved rain rate rr is a weighted mean of the entire spectrum of rain rate values where each value is assigned a weight wi proportional to its probability [i.e., P(R|Tb)]:
$$ rr = \frac{\sum_i r_i\, w_i}{\sum_i w_i}, \qquad w_i = \exp\left\{ -\tfrac{1}{2} \left[ \mathbf{Tb} - F(r_i) \right]^{T} \mathbf{S}^{-1} \left[ \mathbf{Tb} - F(r_i) \right] \right\}. \qquad (2) $$
In Eq. (2), i indexes the elements of the a priori database, S is the Tb error covariance (accounting for both instrument and forward model errors), and Tb is the observed brightness temperature vector, while F(ri) is the brightness temperature vector associated with ri.
While this approach generally provides excellent results, it has two caveats: 1) the solution is always pulled toward a mean of the a priori statistics (defined by the most frequently observed precipitation rates), and 2) events that are underrepresented, or do not exist in the a priori database, will be assigned low, or even zero, probabilities. The cause of the first problem is the low a priori probability of heavy precipitation rates observed in the a priori database, while the cause of the second is the low a priori probability of the observed Tb vector for a given precipitation rate [i.e., P(Tb|R)].
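The weighted mean of Eq. (2) can be illustrated with a minimal sketch. It assumes a Gaussian weight and, for brevity, a diagonal Tb error covariance (per-channel standard deviations); the function name, argument layout, and toy database structure are hypothetical and are not taken from the GPROF code.

```python
import math


def bayesian_rain_rate(tb_obs, database, sigma):
    """Weighted-mean rain rate over a priori entries (illustrative only).

    tb_obs   : observed brightness temperatures (K), one per channel
    database : list of (rain_rate, tbs) a priori entries
    sigma    : per-channel Tb error standard deviations (K); a diagonal
               error covariance is an assumption made here for brevity
    """
    weights = []
    for _, tb_db in database:
        # Squared, error-normalized distance between observed and database Tbs
        d2 = sum(((o - s) / e) ** 2 for o, s, e in zip(tb_obs, tb_db, sigma))
        weights.append(math.exp(-0.5 * d2))
    total = sum(weights)
    if total == 0.0:
        # No database entry is radiometrically close to the observation
        return None, weights
    rr = sum(w * r for w, (r, _) in zip(weights, database)) / total
    return rr, weights
```

Because every entry with a nonzero weight contributes to the mean, a database dominated by light rain rates pulls the result toward those rates, and an observation with no radiometrically similar entry receives no usable weight, which mirrors the two caveats listed above.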
The two abovementioned problems constitute the bulk of rainfall retrieval biases discussed throughout this study. Unfortunately, since they result directly from the definition of Bayes’ method, as in any other statistic-based method (e.g., regression, neural network), they can only be diminished, not eliminated. If the a priori and observed information are rich and allowed to relate well, however, the performance of the Bayes retrieval will likewise improve significantly. This suggests that the sources of the retrieval’s bias could potentially be mitigated if the information content of both the observed and a priori vectors is complemented by elements that can strengthen their links. Seeking such links requires a better understanding of the Tb (i.e., observed) vector and rainfall rate.
If the retrieval employs an observed relationship between two state vectors to retrieve one when the other is available, then its performance is driven by 1) robustness of the observed relationship and 2) the extent to which it is utilized. In the overland PMW rainfall retrieval used in this study, the most robust relationship is the one between the radiometric signature of the ice scattering aloft and the precipitation rate itself. Therefore, using a proxy of ice amount in a cloud to relate to a surface precipitation rate will inherently introduce noise if the algorithm cannot distinguish between entries with similar brightness temperature vectors but different rainfall rates. For heavy precipitation, this noise translates to a bias for reasons stated above. Atmospheric states, described by CAPE, wind shear, humidity distribution, and aerosol concentrations are thus examined to assess if the extra information they provide allows the algorithm to better distinguish similar precipitation profiles.
3. GPROF rainfall retrieval: Description and general properties
As an operational passive microwave rainfall retrieval for the GPM mission (Hou et al. 2014), the GPROF algorithm has been well documented and extensively analyzed over the past two decades (e.g., Kummerow et al. 1996, 2011, 2015; Meyers et al. 2015). Developed at NASA Goddard Space Flight Center in the mid-1990s from the work of Kummerow and Giglio (1994), primarily for the purpose of the TRMM (Simpson et al. 1988), GPROF remains in use to the present day, having undergone a number of revisions. At the time of this study, its most recent version, labeled by NASA’s Precipitation Processing System (PPS) as GPROF.GPM.V4 and developed for GMI, successfully serves a constellation of cross-track (Kidd et al. 2016) and conical scanning PMW instruments, including GMI, SSMI/S (Kunkee et al. 2008), AMSR2 (Shimoda 2005), ATMS (Muth et al. 2005), MHS (Edwards and Pawlak 2000), and others. While more details on prior versions of the GPROF algorithm can be found in the aforementioned literature, a brief overview of the up-to-date algorithm is given below.
GPROF.GPM.V4 (also known as GPROF2014 version 2) is the first fully parametric version of the algorithm that utilizes a Bayesian approach over both land and ocean surfaces. Over land (the focus of this study), it builds its a priori knowledge by employing hydrometeor profiles from the DPR combined algorithm (Grecu et al. 2016). Forward modeling of brightness temperatures is done through radiative transfer modeling (Kummerow et al. 2011), ensuring a good match with observed GMI Tbs. Once available, the simulated Tbs and associated hydrometeor profiles are coupled with the DPR Ku precipitation rate and ancillary data, including TPW, surface type, and 2-m temperature, to constitute GPROF’s a priori database. In the first step (often referred to as preprocessing), the algorithm uses the ancillary information (TPW, surface type, and 2-m temperature) to subset the a priori database and reduce the ambiguity of the Tb–rain rate relationship. This nonunique relationship between the set of Tbs and the rainfall rate is caused by similar radiometric properties of different combinations in rain DSDs, water vapor, cloud liquid water, and ice content. McKague et al. (1998), Berg et al. (2006), and Bennartz and Petty (2001) describe a strong correlation between these factors and the three criteria listed above. In this process, surface types are defined using SSM/I observed emissivity climatology (Aires et al. 2011) updated daily by NOAA’s AutoSnow product (Romanov et al. 2000), while TPW and 2-m temperature come from reanalysis datasets such as ECMWF (Dee et al. 2011) and JMA’s global analysis (GANAL; JMA 2000). Upon subsetting, the remaining database elements are exposed to Bayesian computation, and rain rates are assigned a weight proportional to their respective probability given by Eq. (2). The same is done to all other parameters (e.g., hydrometeor profiles, convective fraction, precipitation phase) before outputting their weighted means as the retrieval’s solution. 
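The preprocessing step described above can be sketched as a simple filter over the a priori database. The dictionary keys and the tolerance windows for TPW and 2-m temperature are hypothetical choices made for illustration; the text does not state how the operational algorithm bins these quantities.

```python
def subset_database(database, surface_type, tpw, t2m,
                    tpw_tol=2.0, t2m_tol=3.0):
    """Keep a priori entries whose ancillary data match the observed scene.

    database     : list of dicts with keys "surface_type", "tpw" (mm),
                   "t2m" (K), and any retrieval fields (hypothetical layout)
    tpw_tol      : assumed half-width of the TPW window (mm)
    t2m_tol      : assumed half-width of the 2-m temperature window (K)
    """
    return [
        entry for entry in database
        if entry["surface_type"] == surface_type
        and abs(entry["tpw"] - tpw) <= tpw_tol
        and abs(entry["t2m"] - t2m) <= t2m_tol
    ]
```

Only the entries returned by such a filter would then take part in the Bayesian weighting, which is how the ancillary data reduce the ambiguity of the Tb to rain rate relationship.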
This methodology ensures the database is consistent with our best spaceborne radar observations to minimize errors in the a priori database, which are recognized by L’Ecuyer and Stephens (2002) and Kummerow et al. (2006) as one of the major error sources. The method is also readily adaptable to other sensors that take part in the GPM constellation: three SSMI/Ss (F16, F17, and F18), AMSR2 (GCOM-W1), GMI (GPM), four MHSs (MetOp-A, MetOp-B, NOAA-18, and NOAA-19), and ATMS (Suomi-NPP). The algorithm used in this study fully matches this description except that it replaces DPR Ku surface rainfall with an MRMS dataset and utilizes observed, rather than modeled, Tbs to build its a priori database over the continental United States. This was done to ensure that any retrieval biases against MRMS are the result of the algorithm and not a function of the a priori database or forward model errors.
4. Data and the a priori database
This study employs 1 year (September 2014–August 2015) of GPM-core satellite (both GMI and DPR), MRMS, GEOS-Chem aerosol, and ECMWF reanalysis data to explore GPM PMW rainfall retrieval accuracies in heavy precipitation. The domain is limited to the midwestern and eastern United States (22°–55°N, 105°–65°W) to form a geographically well-understood test bed and allow for high-quality data from the MRMS system. The GMI on the core satellite provides cloud radiometric properties, the DPR provides cloud structure information, and the ground-based measurements serve as the validation reference for satellite observations and the a priori surface rainfall, while aerosol and reanalysis sets provide necessary ancillary elements.
a. GMI data
With its 13 MW channels (10.65V/H, 18.7V/H, 23.8V, 36.5V/H, 89.0V/H, 166V/H, and 183.3 ± 3V, 7V GHz) the GMI instrument (Draper et al. 2015) serves as a calibration standard for PMW conical-scanning radiometers in the GPM constellation. Brightness temperatures used here by GPROF (either as the observed vector or to form the a priori databases) are given by the GPM level-1 1C-R GMI product (GPM Science Team 2016).
b. DPR data
The DPR instrument, developed by Japan Aerospace Exploration Agency (JAXA) and Japan’s National Institute of Information and Communications Technology (NICT), has Ku- and Ka-precipitation radars operating at 13.6- and 35.5-GHz frequencies, respectively, with FOVs of approximately 5 km. The Ku-band radar (the only one used in this study) has a cross-track width of 245 km, vertical sampling of 250 m, and virtually complete sampling at the surface level (i.e., no gaps between individual FOVs). Its algorithm builds on that of the TRMM PR (Iguchi et al. 2009) and, with a minimum detectable signal set to 18 dBZ, is suitable for detection of precipitation rates above 0.5 mm h−1. However, the recent study of Hamada and Takayabu (2016) demonstrates that with the DPR Ku sensitivity limit set to 12 dBZ, retrievals can successfully detect even lower precipitation rates.
c. MRMS data
The MRMS (Zhang et al. 2016) dataset is used as a reference dataset. For this purpose, it is specifically adapted to satellite needs, providing high-accuracy precipitation rate estimates at a 0.01° spatial and 2-min temporal resolution over the entire CONUS at the time of the satellite overpass. Each estimate is assigned a radar quality index (i.e., a quality flag) and a precipitation type. Only the highest-quality data are used to validate GPROF retrievals. In each satellite overpass, MRMS rainfall rate grids are collocated with individual satellite FOVs mimicking sensor-specific antenna geometry (i.e., antenna gain function).
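As a rough illustration of collocating gridded rain rates with a satellite FOV, the sketch below averages grid points with a circular Gaussian gain. The circular shape, the full width at half maximum parameter, and the flat-Earth distance conversion are all simplifying assumptions; the actual sensor-specific antenna geometry is not given in the text, and the function name is hypothetical.

```python
import math

KM_PER_DEG = 111.2  # approximate length of one degree of latitude (km)


def fov_average(grid_points, fov_lat, fov_lon, fwhm_km):
    """Gain-weighted mean rain rate within one field of view.

    grid_points : iterable of (lat, lon, rain_rate) tuples
    fwhm_km     : assumed full width at half maximum of the gain (km)
    """
    # Convert FWHM to the standard deviation of the Gaussian gain
    sigma = fwhm_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    num = 0.0
    den = 0.0
    for lat, lon, rate in grid_points:
        dy = (lat - fov_lat) * KM_PER_DEG
        dx = (lon - fov_lon) * KM_PER_DEG * math.cos(math.radians(fov_lat))
        gain = math.exp(-0.5 * (dx * dx + dy * dy) / sigma ** 2)
        num += gain * rate
        den += gain
    return num / den if den > 0.0 else float("nan")
```

Grid points near the FOV center dominate the average, so a small intense cell at the edge of a footprint contributes far less than the same cell at its center.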
d. GEOS-Chem data
To provide estimates of lower tropospheric CCN concentration, this study employs the GEOS-Chem chemical transport model with the online Two-Moment Aerosol Sectional (TOMAS) microphysics module (Adams and Seinfeld 2002; Pierce and Adams 2009; D’Andrea et al. 2013; Pierce et al. 2013). GEOS-Chem was run globally for 2014–15 at 2° × 2.5° resolution using GEOS “Forward Processing” (FP) reanalysis meteorology fields (Rienecker et al. 2008). GEOS-Chem–TOMAS simulates the particle size distribution from 3 nm to 10 μm in 15 size bins, tracking sulfate, sea salt, organics, black carbon, and dust aerosol species within these size ranges, comparing well to the observed aerosol size distributions (Kodros and Pierce 2017). Kodros et al. (2016) provide a complete description of emissions used in the simulations. Simulated CCN concentrations, given at 2° × 2.5° grids at 6-h time resolution, are collocated with other ancillary data and joined to both observed fields and the a priori database. With the goal of optimizing ingestion of aerosol information to GPROF retrieval and supported by findings of Dusek et al. (2006) and Stolz et al. (2015), only the number concentration of aerosols with diameters larger than 40 nm (0.04 μm) is used as a proxy for CCN.
e. ECMWF data
The ECMWF interim reanalysis (ERA-Interim) model data (Dee et al. 2011) are used to provide environmental parameters—specifically, 2-m temperature, total column water vapor, CAPE, wind profile, temperature, dewpoint, and specific humidity—at 0.75° horizontal and 6-h temporal resolution, at four pressure levels (850, 700, 500, and 200 mb; 1 mb = 1 hPa) for the year of GMI data. While model-induced uncertainties exist, this dataset is still seen as a robust resource based on its consistency, coverage, and previous validation of the ECMWF analyses. Similar to PK2017, vertical wind shear is defined as the difference in wind magnitude at the 500- and 850-mb levels. The low-level dewpoint depression is defined as the difference between the 2-m temperature and dewpoint. A vertical humidity deviation is defined as the ratio between specific humidity at low- and midtropospheric levels (850 and 500 mb, respectively). To ensure that the height of the planetary boundary layer (PBL) does not affect these results, midlevel humidity is taken as a mean value of 450 and 500 mb, while low-level humidity is required to be within the PBL (e.g., 850 mb). To minimize the effect of precipitation on the atmospheric column, the environmental parameters to be used as cloud morphology predictors in the a priori database are chosen to correspond to the time step preceding their coupled precipitation rates.
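The three derived predictors defined above can be written compactly as follows. The sign convention for the shear (500 mb minus 850 mb), the use of the 2-m dewpoint in the depression, and the orientation of the humidity ratio (low level over midlevel) are assumptions where the text leaves the order open; function and argument names are hypothetical.

```python
import math


def wind_shear(u850, v850, u500, v500):
    """Difference in wind magnitude between 500 and 850 mb (m/s).

    The order of subtraction (upper minus lower) is an assumption.
    """
    return math.hypot(u500, v500) - math.hypot(u850, v850)


def dewpoint_depression(t2m, td2m):
    """Low-level dewpoint depression: 2-m temperature minus dewpoint (K)."""
    return t2m - td2m


def humidity_ratio(q850, q500, q450):
    """Ratio of low-level to midlevel specific humidity.

    Midlevel humidity is the mean of the 450- and 500-mb values,
    as stated in the text; low level is 850 mb.
    """
    q_mid = 0.5 * (q450 + q500)
    return q850 / q_mid
```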
The above datasets are grouped to build the a priori knowledge for GPROF retrieval. Each of 14 surface types is treated separately. Data count distributions of eight land surface classes occurring over the domain of this study are given in Fig. 3 as a function of TPW and 2-m temperature. Number of elements in each surface class ranges from 1.4 × 10^5 (for minimum snow) up to 1.4 × 10^7 (for maximum vegetation), totaling more than 26 million.
5. Complementing the retrieval’s a priori knowledge
PK2017 demonstrated a relation between the large-scale environment and precipitation system regime that was related to PMW retrieval systematic errors. The authors found that a strong link between PMW bias and environment is related to the variability in cloud microphysics and morphology. To support these findings and demonstrate that this relationship is not specific only to the Amazon–African region (the test bed used in PK2017), a link between the precipitation regime and high-frequency Tb over the U.S. region is examined. Using the same methodology (see Elsaesser et al. 2010) and 1 year of GPM data over the United States, all 1° × 1° raining scenes are separated into shallow, deep-unorganized, and deep-organized systems as defined by Elsaesser et al. (2010). Employing a k-means clustering technique and using the same five-dimensional space as in PK2017, self-similar scenes are identified. Once again, the convective-to-total rainfall ratio and DPR Ku echo-top height (with height bins being 0–5, 5–9, and above 9 km) serve as the key properties in defining these structurally distinct regimes. The result for nonisolated scenes (e.g., larger than 25 km) is shown in Fig. 4 together with the relationship between DPR Ku echo-top height and GMI’s 89-GHz Tb.
As expected, based on the sensitivity of the 89-GHz channel frequency to the presence of ice in the atmospheric column and the fact that deeper clouds are likely to have greater ice content, echo-top height is strongly correlated with high-frequency Tb depressions. However, the more important finding is the indication of a clear separation in the slope of the echo top to brightness temperature relation shown in Fig. 4 for the three regimes. This suggests that brightness temperatures are strongly linked to variability in cloud organization (i.e., microphysics and dynamics), similar to what was found in PK2017.
a. Correlation of the synoptic state and radiometric properties of the precipitating scene
To better understand the relationship between the ice content (and thus Tb depression) and the large-scale environment, a closer look is necessary. While the synoptic environment is indubitably linked to the storm’s morphology (e.g., Ek and Mahrt 1994; Xu and Moncrieff 1994; LeMone et al. 1998; Parsons et al. 2000; Bretherton et al. 2004; Sherwood et al. 2004; Holloway and Neelin 2009; James and Markowski 2010; Ford et al. 2015), it is not guaranteed that this link is strong enough to provide useful information to the retrieval. With a goal of estimating the potential of the synoptic state to predict storms' morphology, an analysis, focused on the variability of the correlation coefficient between the ice signal (i.e., 89-GHz Tb) and rainfall rate in the retrieval's a priori, is performed.
Motivated by the PK2017 study and based on the aforementioned literature, predictors considered in this analysis include those found to relate with PMW systematic errors over the Amazon–African region, plus the aerosol concentrations, which are widely recognized as a major factor in cloud formation (e.g., Storer and van den Heever 2013). The environmental parameters are CAPE, low-level humidity, wind shear, vertical distribution of humidity, and CCN concentration. Using one environment at a time to define a synoptic state, five distinct, equally frequent atmospheric conditions are first recognized for each of the environmental variables. Then, a corresponding correlation coefficient between 89-GHz Tb and MRMS surface rainfall rate is found for each state. Defined to represent the strength of a relationship, the correlation coefficient is used here to estimate the utility of each environmental parameter. The results are listed in Table 1.
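The analysis just described, in which the database is split into five equally frequent states of one environmental variable and the Tb to rain rate correlation is computed within each state, can be sketched as below. Rank-based binning (ties broken by input order) and the Pearson coefficient are assumptions about the details; all names are hypothetical.

```python
import math


def quantile_labels(values, n_bins=5):
    """Assign each value to one of n_bins equally populated rank bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(values)
    return labels


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either variance is zero."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0.0 else float("nan")


def correlation_by_state(env, tb89, rain, n_bins=5):
    """Correlation between 89-GHz Tb and rain rate within each state."""
    labels = quantile_labels(env, n_bins)
    result = {}
    for k in range(n_bins):
        tbs = [t for t, lab in zip(tb89, labels) if lab == k]
        rrs = [r for r, lab in zip(rain, labels) if lab == k]
        result[k] = pearson(tbs, rrs)
    return result
```

A large spread of the coefficient across the states would indicate that the environmental variable carries information about the strength of the ice-to-rain relationship.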
Upon inspection, it is clear that the correlation coefficient shows significant change across each of the five environmental states. Greater differences in coefficient value are evident between the environments expected to strongly relate to a specific storm morphology. For example, the correlation coefficient for the last quintile of CAPE values is −0.55, while the value in the first two quintiles of this environment is close to −0.18. This can be explained by the expectation that high CAPE values are more typical for strong, often well-developed, storms with a well-defined ice-to-rain relationship. To better depict this effect, Fig. 5 compares ice-to-rain ratios of five CAPE subsets relative to the mean a priori relation. While general findings hold for each of the five synoptic variables, CAPE is chosen as the easiest one to interpret.
The plot shows the relationship between the 89-GHz Tb, a proxy here for the ice content, and precipitation for the full a priori dataset (red line) and its five subsets defined by the CAPE environment. Characterized by different slopes, each environment line indicates a unique ice-to-rain relationship across both environmental and rainfall rate bins. Choosing the same example of the two lowest CAPE quintiles (yellow and blue lines in Fig. 5), it is obvious that for a given 89-GHz Tb corresponding to low precipitation rate values (e.g., 250–260 K), both of the two CAPE environments indicate a lower precipitation rate than suggested by the mean (red line) relationship. While the opposite is true for the other three CAPE subsets (gray, purple, and green lines), these are, however, atypical (i.e., infrequent) for low rainfall rate conditions, and as such would typically introduce a bias. This relation, although straightforward, requires additional attention when heavy precipitation rates are considered.
b. Links between GPROF biases and precipitation regime
Concentrating on only the highest precipitation rates (e.g., above 5 mm h−1), which are the focus of this study, the five environment groups in Fig. 5 are distributed such that four CAPE environments clearly suggest an increase in precipitation rate for the same brightness temperature depression, whereas the highest-CAPE environment indicates less precipitation than the ensemble mean (red) line. This is consistent with results from the PK2017 study, where the most vigorous CAPE environment is indeed recognized as the one where GPROF rainfall is positively biased. To support this statement, analyses similar to those of PK2017 are repeated over the 20°–40°N, 65°–101°W U.S. region over nonocean surfaces (as in Fig. 1) using a year of MRMS, DPR, and GMI measurements. The relationship among five environments, storm structures, and GPROF biases is presented in Table 2.
As expected, these results confirm the findings of the previous study (PK2017), with shallow regimes on average being underestimated (22%) and deep-organized regimes being overestimated (12%). These correspond to low and high CAPE values, respectively. Besides demonstrating that the bias is sensitive to the environmental conditions, Table 2 also confirms the relationships of the bias to the amount of ice typical for each of the three regimes. Here, the sum total of DPR reflectivity above the freezing level is used as a relative measure of the ice amount in the three regimes. As in PK2017, transition from negative to positive biases relates to the increase of the amount of ice in the atmospheric column.
c. Links between environments: Independent information content
The theoretical aspects of the relationship among CAPE, wind shear, humidity environments, and precipitation regimes are discussed in PK2017. Here, the attention is briefly given to the CCN concentrations, which, according to Table 2, when preceding shallow systems tend to be higher (by 20%) compared to those occurring prior to deep and more organized convection. At the same time, lower CCN concentrations clearly correspond to higher rainfall rates. One possible explanation for this result is that aerosol concentrations over land, while perhaps suppressing the warm rain processes, act to invigorate the ice phase processes as reported by various authors (e.g., Twomey 1977; Andreae et al. 2004; Storer et al. 2010; DeMott et al. 2011; Rosenfeld et al. 2013), with a recent study by Lin et al. (2016) listing relevant research on aerosol interaction with continental precipitation and modeled sensitivities based on field campaign measurements.
To effectively use environmental parameters and understand their mutual links, we focus here on the correlation between them (see Table 3). The highest correlation is found between wind shear and both CAPE and CCN concentration with values of only 0.3. This excludes the possibility that high CCN concentrations are exclusively related to very stable atmospheric stratification (e.g., inversions), which are typically characterized by low CAPE and shear values. Low correlations also imply that combinations of environmental parameters may be useful in defining cloud morphology and microphysics, as found in the literature. The study by Stolz et al. (2015), for example, recognizes a strong relationship between cloud warm depth (i.e., a layer between the cloud base and the freezing level) and lightning density as a function of both CAPE and CCN concentrations, where all these properties are suggested to play a key role in systems characterized by deep convection.
With the goal of improving the accuracy of heavy precipitation estimates, the top 10% of precipitation rates given by the MRMS (at FOV scale) is used to test the proposed approach. To implement and utilize established links between the environment and the relative amount of ice in raining clouds, environmental parameters are joined to both the observed and a priori vectors. One year of ECMWF, GMI, MRMS, and GEOS-Chem data is used to generate the a priori knowledge that includes all observed FOV values and corresponding ancillary fields over the 20°–50°N, 65°–101°W U.S. region. To ensure independence between the observed and a priori vectors, the true answer is withheld from the retrieval for each pixel. In the baseline run, precipitation rates are retrieved using the operational GPROF.GPM.V4 algorithm described in section 3. To allow for analysis of the Bayesian weighting, the assigned weight and precipitation rate of each database element that took part in Eq. (2) calculations are added to the output. In this run, the database is constrained using only TPW, surface type, and 2-m temperature. Once available, the retrieved values are assessed using MRMS precipitation rates.
To test each of the five environments, using one environment at a time, the retrieval is run again but with the information on the environment state included in the a priori knowledge. Prior to employing Eq. (2), an atmospheric property is used to separate a priori knowledge into 10 equally frequent environment state categories. Upon constraining the a priori information by TPW, 2-m temperature, and surface type, database elements characterized by environmental categories other than the category of an observed precipitation scene are considered as nonmatching and ignored. This causes redistribution of weights assigned to the database elements in Eq. (2) and, consequently, results in a new weighted mean value. Although alternatives exist, this simple approach is seen as a first choice because of its easy interpretation.
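The category filtering described above can be sketched as follows. This is a minimal illustration, not the operational GPROF code: it assumes that Eq. (2) is a weighted mean with Gaussian weights and a diagonal error covariance (a single `sigma`), it omits the preceding constraint by TPW, 2-m temperature, and surface type, and all function and variable names are hypothetical.

```python
import numpy as np

def decile_edges(env_db):
    """Nine interior edges splitting database values into 10 equally frequent bins."""
    return np.quantile(np.asarray(env_db, dtype=float), np.linspace(0.1, 0.9, 9))

def categorize(values, edges):
    """Map environment values to category indices 0..9."""
    return np.searchsorted(edges, values, side="right")

def bayesian_rain(tb_obs, tb_db, rain_db, sigma, mask=None):
    """Weighted-mean rain rate; the Gaussian weight form is an assumption."""
    tb_db = np.asarray(tb_db, dtype=float)
    rain_db = np.asarray(rain_db, dtype=float)
    misfit = (tb_db - np.asarray(tb_obs, dtype=float)) / sigma
    weights = np.exp(-0.5 * np.sum(misfit ** 2, axis=1))
    if mask is not None:
        # Database elements in a nonmatching environment category are ignored.
        weights = np.where(mask, weights, 0.0)
    total = weights.sum()
    if total == 0.0:
        return np.nan, weights
    return float(np.sum(weights * rain_db) / total), weights
```

Passing `mask=(categorize(env_db, edges) == categorize(env_obs, edges))` reproduces the redistribution of weights: the surviving elements keep their relative weights, but the normalization changes, which shifts the weighted mean.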
One year of GMI observations over the 20°–55°N, 65°–101°W U.S. region is used to perform five separate runs of the GPROF algorithm using one environment at a time to complement the retrieval’s existing a priori information content. To bring the focus on heavy precipitation, only the pixels corresponding to the top 10% ground reference precipitation rate values (at the satellite pixel level) are assessed using the MRMS dataset. The results are presented in Fig. 6 and Table 4.
Figure 6 offers a side-by-side comparison of the original (Fig. 6a; also given in Fig. 1) against five new GPROF to MRMS precipitation rate relations. Based on corresponding statistics, listed in Table 4, with no exception, the overall bias is decreased while correlation coefficients increase. Also, the dispersion in the scatter of each of the five experiments is reduced compared to the original. This suggests that the expansion of the retrieval’s a priori information allowed for 1) reduction of the randomness (i.e., improved precision) and 2) improved accuracy (the overall bias is lower). Additionally, comparison of the mean retrieved precipitation rates at a number of precipitation rate intervals (black “x” symbols in density plots) reveals consistency of these improvements—when compared to the original run, the mean values of the other five are all closer to the one-to-one line (red lines in Fig. 6).
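For readers who wish to reproduce this type of assessment, the statistics can be sketched as below. The sketch assumes that the bias is expressed as the relative difference between the mean retrieved and mean reference rates and that the top 10% is selected by the reference (MRMS) rate at pixel level; the exact definitions behind Table 4 are not restated here, and the function name is hypothetical.

```python
import numpy as np

def heavy_precip_stats(retrieved, reference, top_fraction=0.10):
    """Bias (%) and correlation for pixels in the top fraction of reference rates."""
    retrieved = np.asarray(retrieved, dtype=float)
    reference = np.asarray(reference, dtype=float)
    threshold = np.quantile(reference, 1.0 - top_fraction)
    heavy = reference >= threshold  # select by the ground reference, not the retrieval
    ret, ref = retrieved[heavy], reference[heavy]
    return {
        "n": int(heavy.sum()),
        "threshold": float(threshold),
        "bias_percent": float(100.0 * (ret.mean() - ref.mean()) / ref.mean()),
        "correlation": float(np.corrcoef(ret, ref)[0, 1]),
    }
```

Running the same function on the baseline and on each modified retrieval gives directly comparable bias and correlation values.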
To complement these results, biases between GPROF rainfall and the MRMS reference are mapped for each environmental criterion using a 0.25° grid. Simple subtraction of the original from the new map reveals regions where the algorithm makes improvements in the top 10% of precipitation rates. An example depicting algorithm performance for CAPE and CCN concentration environments is given in Fig. 7 (right and left panels, respectively).
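The mapping step can be sketched as follows, assuming that the gridded bias is the mean retrieved-minus-reference difference of all pixels falling in a grid cell. The grid edges are left as inputs, and the function names are hypothetical.

```python
import numpy as np

def gridded_bias(lat, lon, retrieved, reference, lat_edges, lon_edges):
    """Mean (retrieved - reference) per grid cell; NaN where a cell has no pixels."""
    diff = np.asarray(retrieved, dtype=float) - np.asarray(reference, dtype=float)
    bins = [lat_edges, lon_edges]
    total, _, _ = np.histogram2d(lat, lon, bins=bins, weights=diff)
    count, _, _ = np.histogram2d(lat, lon, bins=bins)
    return np.divide(total, count, out=np.full(total.shape, np.nan), where=count > 0)

def bias_change(original_map, new_map):
    """Subtract the original bias map from the new one, cell by cell."""
    return new_map - original_map
```

Under this sign convention, positive values over regions that were originally underestimated indicate that the bias has moved toward zero.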
Notably, areas of improved biases dominate the maps. Regions where the retrieval performs worse than originally are fairly small and generally correspond to the areas of lower initial biases (see Fig. 1). Maps for the other three environments have very similar general properties (not shown here). A relatively strong variability in the magnitude of improvement maxima, and their locations, in the two panels of Fig. 7 suggests that the two environments address different portions of the precipitation bias. Examples can be seen over New Hampshire and Vermont, northern Ohio, central Georgia, and Kentucky, where improvement is present in one but absent in the other map. This is in support of the relatively low correlation between the two environments (Table 3), suggesting a high potential for complementarity. To test this potential, another experiment is performed in which the algorithm is modified to use only those database elements that are close enough to the observed atmospheric state defined by two environment properties. Results for the example of CAPE and CCN concentrations are given in Fig. 8.
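A two-parameter selection of this kind can be sketched as below. The criterion for "close enough" is not specified here, so the sketch assumes that closeness is measured in environment-category indices with a hypothetical `tolerance` parameter; the function name is also hypothetical.

```python
import numpy as np

def joint_environment_mask(cat_db_a, cat_db_b, cat_obs_a, cat_obs_b, tolerance=0):
    """Select database elements whose two environment categories both lie
    within `tolerance` categories of the observed scene (assumed criterion)."""
    close_a = np.abs(np.asarray(cat_db_a) - cat_obs_a) <= tolerance
    close_b = np.abs(np.asarray(cat_db_b) - cat_obs_b) <= tolerance
    return close_a & close_b
```

With `tolerance=0` and 10 categories per parameter, roughly 1 in 100 database elements would remain if the two parameters were independent, so the tolerance trades the specificity of the match against the number of a priori states left for the weighted mean.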
Compared to any of the individual environmental variables, the new criteria remove a significantly larger portion of the original bias, bringing it down from 28% to only 13%. This corresponds to a change in mean overall rainfall rate of 0.6 mm h−1, which is more than half of the original discrepancy. The correlation coefficient also improves, increasing from 0.66 to 0.77. Although other combinations of environments (not shown here) do not perform as well as this scenario, they all make improvements comparable to or greater than any of the environments alone.
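These figures can be cross-checked with simple arithmetic. The check below uses bias magnitudes and assumes that the percentage bias is defined relative to the mean reference rate; the implied mean reference rate is an inference from the quoted numbers, not a reported value.

```latex
\[
\frac{|b_{\mathrm{orig}}| - |b_{\mathrm{new}}|}{|b_{\mathrm{orig}}|}
  = \frac{28\% - 13\%}{28\%} \approx 0.54,
\qquad
\bar{R}_{\mathrm{ref}} \approx \frac{0.6\ \mathrm{mm\,h^{-1}}}{0.28 - 0.13}
  = 4\ \mathrm{mm\,h^{-1}} .
\]
```

Under that assumption, the original discrepancy is about 0.28 × 4 ≈ 1.1 mm h−1, of which the 0.6 mm h−1 change is slightly more than half.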
A positive assessment of the retrieval with the updated a priori information content supports the findings presented in previous sections. Here, we assess whether the improvement is due to an improved characterization of cloud ice processes or simply a statistical artifact (e.g., high CAPE values are associated with high precipitation rates). Also, while the heavy precipitation estimates are improved, it is unclear what effect this method has on the overall performance of the retrieval. Each of these questions is discussed separately.
a. Weight distribution
To offer better insight into how the change in the a priori content improves the retrieval itself, a closer look is taken at the process of forming the weighted mean value in Eq. (2). If the retrieval is improved, the weights assigned to pixels closer to the true value should be higher. If, instead, the retrieval result is simply better because the distribution of rainfall rates in the reduced a priori database more closely resembles the true answer, then no impact on the fit between observed and database Tb values should be evident. In the first step of this analysis, collocated ground reference (i.e., MRMS observations) values are used to identify rainfall bias associated with each of the database elements for all retrieved pixels. Then, for each pixel, using an arbitrarily determined bin width of 0.1 mm h−1, all weights falling within the same bias bin are averaged. Zero-weighted database elements are not included. Finally, the overall mean weight values are calculated using all pixels retrieved by the original and modified a priori information. Their difference, given as a function of precipitation rate bias, identifies the origins of bias reductions seen in the previous section. Once again, an example where CAPE is used to complement the a priori content is used to show the effect (Fig. 9). To ensure a valid comparison, before subtracting the original from the new weighted mean, both sets are standardized and normalized. This ensures that the areas above and below the zero line are equal.
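The steps of this analysis can be sketched as follows. The sketch assumes that each retrieved pixel provides the weights and rain rates of its database elements together with the collocated reference rate, and that "standardized and normalized" can be approximated by scaling each mean profile to unit sum so that the difference of two profiles sums to zero; the function names are hypothetical.

```python
import numpy as np

def mean_weight_by_bias(weights, rain_db, reference, bin_edges):
    """Mean weight per bias bin for one pixel; zero-weighted elements are excluded."""
    weights = np.asarray(weights, dtype=float)
    keep = weights > 0
    bias = np.asarray(rain_db, dtype=float)[keep] - reference
    idx = np.digitize(bias, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    inside = (idx >= 0) & (idx < n_bins)
    sums = np.bincount(idx[inside], weights=weights[keep][inside], minlength=n_bins)
    counts = np.bincount(idx[inside], minlength=n_bins)
    return np.divide(sums, counts, out=np.full(n_bins, np.nan), where=counts > 0)

def weight_profile(pixels, bin_edges):
    """Average the per-pixel profiles over all pixels and scale to unit sum."""
    profiles = np.array([mean_weight_by_bias(w, r, ref, bin_edges)
                         for w, r, ref in pixels])
    valid = ~np.isnan(profiles)
    total = np.where(valid, profiles, 0.0).sum(axis=0)
    n = valid.sum(axis=0)
    mean_profile = np.divide(total, n, out=np.zeros_like(total), where=n > 0)
    return mean_profile / mean_profile.sum()
```

The quantity of interest is then `weight_profile(new_pixels, edges) - weight_profile(original_pixels, edges)`, where positive values mark bias bins that gained weight after the a priori information was complemented.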
The redistribution of weights corresponds to the overall bias reduction of 21% (see Table 4 and Fig. 6b). The two areas (red and blue) in Fig. 9 indicate that after the a priori knowledge is complemented by information on the environment’s CAPE, database elements with rainfall rates closer to the observed value receive more weight. At the same time, this gain in the weight (blue shading) is compensated by the equivalent reduction (red shading) distributed across negatively biased elements. Being very close to the ideal (i.e., the gain maxima centered at zero-biased elements), this distribution of weight adjustments clearly indicates that the goal of identifying elements that relate better to the observed rainfall scene properties is achieved.
b. The overall effect
The positive effect of introducing complementary information to the a priori knowledge on heavy precipitation is not guaranteed to hold when the full rainfall spectrum is considered. This is examined by testing the performance of GPROF with the a priori complemented by CAPE information. The result is given in Fig. 10.
The increased correlation coefficient, reduced dispersion, and positive bias reduction over the great majority of the domain demonstrate that the additional information content improves the performance of the retrieval in every aspect. While the overall retrieval has significantly lower bias than the top 10% of rainfall rates (only 13%), making the overall bias reduction less impressive, the true value of the updated a priori information is found in its ability to improve both high and low precipitation rates. This is depicted in Fig. 10 by the reduced bias and increased correlations of the density plots. Similarly, both positive and negative bias regions in the top-right panel experience a reduction in the bias upon using information on the environment to improve the a priori content.
9. Summary and conclusions
The goal of this study was to develop, understand, and test the potential for improving the quality of heavy precipitation estimates from satellite passive microwave rainfall retrievals over land. Focusing on the Bayesian approach and using GPROF, the operational PMW retrieval for the GPM mission, this study builds on previous findings to hypothesize that the relationship between the large-scale environment and satellite precipitation biases can be used to reduce precipitation estimate uncertainty in extreme atmospheric conditions. The idea of using large-scale environmental variables that are associated with the potential for the atmosphere to produce heavy precipitation is tested. This is accomplished by subsetting the a priori information in a Bayesian scheme for distinct states of CAPE, CCN concentration, wind shear, and humidity distribution, and the results support the hypothesis. Robustness of the links between five predictors of precipitation system morphology is examined by evaluating the skill of each predictor to recognize a strong link between surface rainfall rate and the Tb vector. The analysis suggests that system morphology and the retrieval’s biases can indeed be linked through the use of their environmental predictors in atmospheric states favorable for convection.
Considering three structurally different precipitation regimes (shallow, deep unorganized, and deep organized), it is found that extreme states of the environments lead to distinct cloud morphology. Those characterized by the highest CAPE and wind shear values, as well as large low-level humidity depressions or greater-than-average decreases in specific humidity with height, are found to typically precede storms with strong radar reflectivity above the freezing level. The opposite is true for their counterparts, while transitioning states had a less-defined preference confirming the complexity of cloud microphysics drivers.
Using MRMS rainfall data to assess performance, it is found that by complementing a priori information with collocated environment properties, the retrieval reduces the overall pixel-level bias for heavy precipitation by 20%–30%. These improvements are accompanied by a noticeable reduction in the random error as well. The analysis of the Bayesian-averaging process reveals that added information content successfully shifts the probability-based weight toward database elements of precipitation rate values similar to those given by the reference. This consequently reduces the overall bias in heavy precipitation. The use of more than one parameter to define an atmospheric state is also tested, yielding bias reductions of up to 50% for the environment categories defined by CAPE and CCN concentrations. Finally, the effect of using this approach is tested on the entire rainfall spectrum, finding that the overall performance of the GPROF retrieval is preserved, with improvements in the correlation coefficient and biases at both the low and high end of the rainfall rate range.
This work was funded by the NASA Earth and Space Science Fellowship (NNX14AK85H) and PMM Grant NNX16AE23G. The authors would like to thank Dr. Pierre Kirstetter (NOAA/NSSL, Norman, OK) for his help with the ground radar data. We also thank three anonymous reviewers for their helpful advice.
List of Acronyms
AMSR2: Advanced Microwave Scanning Radiometer 2 on board GCOM-W1 satellite
ATMS: Advanced Technology Microwave Sounder on board Suomi-NPP satellite
CMORPH: Climate Prediction Center morphing technique
CONUS: Continental United States
DMSP: Defense Meteorological Satellite Program
DPR: Dual-frequency precipitation radar
GCOM-W: Global Change Observation Mission–Water
GMI: GPM Microwave Imager
GPM: Global Precipitation Measurement
GPROF: Goddard profiling algorithm
GSMaP: Global Satellite Mapping of Precipitation
IMERG: Integrated Multisatellite Retrievals for GPM
JPSS: Joint Polar Satellite System
MetOp: Meteorological Operational satellites
MHS: Microwave Humidity Sounder on board MetOp and NOAA satellite series
MRMS: Multi-Radar Multi-Sensor system
NASA: National Aeronautics and Space Administration
NOAA: National Oceanic and Atmospheric Administration
PERSIANN-CCS: Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks Cloud Classification System
SSMIS: Special Sensor Microwave Imager/Sounder on board DMSP satellites
TMPA: TRMM Multisatellite Precipitation Analysis
TPW: Total precipitable water
TRMM: Tropical Rainfall Measuring Mission
This article is included in the Global Precipitation Measurement (GPM) special collection.