## 1. Introduction

For the majority of global land areas, satellite-based rainfall estimates offer the only possible source of near-real-time precipitation accumulation information for operational hydrologic applications. However, providing such information at required accuracy levels has proven difficult (Hossain et al. 2004; Hossain and Anagnostou 2004). The expected (next decade) deployment of the Global Precipitation Measurement (GPM) mission satellite constellation represents a critical advance in these efforts (Hou 2006). In addition to GPM, a currently underexplored possibility for improving land rainfall retrieval lies in the development of efficient techniques for leveraging complementary spaceborne water cycle observations (McCabe et al. 2008). A promising class of such techniques for rainfall correction is based on interpreting variations in soil water storage realized upon the assimilation of satellite-based surface soil moisture retrievals into a water balance model (Pan and Wood 2007). Despite the limited heritage of soil moisture remote sensing methods and products relative to their rainfall equivalents, surface soil moisture retrievals, unlike instantaneous rainfall rate measurements, reflect memory of antecedent rainfall amounts and can therefore be sampled at relatively low temporal frequencies (e.g., once every 1–3 days) and still provide useful rainfall accumulation information (Crow 2003).

Recently, Crow (2007) demonstrated that spaceborne estimates of surface soil moisture can be processed (via their Kalman filter–based assimilation into a water balance model) to reveal valuable information concerning the sign and magnitude of antecedent rainfall accumulation errors. The approach is based on the assimilation of remotely sensed surface soil moisture retrievals into a simple soil water balance model using a Kalman filter. The model is assumed to be forced primarily by a satellite-based rainfall product. Net additions or subtractions of soil water suggested by the filter upon assimilation of a soil moisture retrieval (commonly referred to as “analysis increments”) contain useful information about recent rainfall errors. That is, the underestimation (or overestimation) of antecedent rainfall accumulations by a satellite-based rainfall product should result in the subsequent addition (or removal) of water by the filter upon assimilation of a post-event surface soil moisture retrieval. Correlation between rainfall errors and subsequent filter analysis increments implies that analysis increments realized during soil moisture data assimilation are adequately compensating water balance predictions for the impact of antecedent precipitation errors. Based on this reasoning, Crow and Zhan (2007) use the magnitude of the correlation between rainfall errors and analysis increments as a metric to evaluate the continental-scale performance of various remotely sensed surface soil moisture products.

A separate possibility examined here is that the presence of such a correlation can also be exploited to dynamically filter errors existing in satellite-based rainfall accumulation estimates. That is, analysis increments could be combined with satellite-based rainfall retrievals in such a way that error in rainfall accumulation estimates is minimized. The successful application of variance-minimizing data assimilation techniques to this problem would ensure that the soil moisture–based correction of a rainfall time series would never increase the root-mean-square (RMS) error of rainfall estimates—even in areas where soil moisture retrievals are of poor quality. While past work (see, e.g., Crow and Bolten 2007) has demonstrated the potential for passively validating satellite-based rainfall products using soil moisture observations, the active correction of rainfall products with remotely sensed soil moisture retrievals has not yet been attempted.

Here, we present an approach for correcting short-term (2- to 10-day) satellite-based rainfall accumulation products over land using analysis increments calculated during the sequential assimilation of a remotely sensed soil moisture product into a simple water balance model. Surface soil moisture products are based on the application of the Jackson (1993) single-channel retrieval algorithm to H-polarized 10.6-GHz (X band) brightness temperature (*T*_{B}) observations acquired from the Advanced Microwave Scanning Radiometer (AMSR-E) aboard the National Aeronautics and Space Administration’s (NASA) *Aqua* satellite (Jackson et al. 2007). The correction scheme is applied to a number of satellite-based precipitation products, including rainfall accumulation products generated by the Tropical Rainfall Measuring Mission (TRMM) Precipitation Analysis (TMPA). As an initial validation exercise, the procedure is evaluated over the contiguous United States (CONUS) based on comparisons with the National Centers for Environmental Prediction/Climate Prediction Center’s (NCEP/CPC) unified rain gauge dataset (Higgins et al. 2000). However, the ultimate value of the approach is likely to be the greatest for continental areas possessing relatively limited ground-based observing capabilities. As required by such areas, the procedure is designed to operate solely on satellite-based data.

## 2. Kalman filtering

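A satellite-based rainfall accumulation product (*P*′) is used to drive a spatially distributed, daily antecedent precipitation index (API) model:

$$\mathrm{API}_{i,j} = \gamma\,\mathrm{API}_{i-1,j} + P'_{i,j}, \tag{1}$$

where *i* and *j* are time and space indices (respectively) and *γ* is the API loss coefficient. The value of *γ* in (1) is varied according to the day of the year (*d*) as

$$\gamma = \alpha + \beta \cos\left(\frac{2\pi d}{365}\right). \tag{2}$$

Following Crow and Zhan (2007), *α* and *β* are held constant (in both space and time) at 0.85 and 0.10. In (1), *P*′ represents the accumulation depth of rainfall on day *i* and, consequently, API is expressed in dimensions of water depth.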
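As a minimal sketch of the API recursion for a single pixel, the Python below integrates API forward in time from daily rainfall. It assumes the recursion API_i = γ·API_(i−1) + P′_i and a cosine dependence of γ on day of year with *α* = 0.85 and *β* = 0.10; the function names, the 365-day wraparound, and the zero initial condition are illustrative choices, not details taken from the text.

```python
import math


def loss_coefficient(day_of_year, alpha=0.85, beta=0.10):
    """Seasonally varying API loss coefficient (cosine form is an assumption)."""
    return alpha + beta * math.cos(2.0 * math.pi * day_of_year / 365.0)


def run_api(rain_mm, start_day=1, api0=0.0):
    """Integrate API_i = gamma_i * API_(i-1) + P'_i for a single pixel.

    rain_mm   : daily rainfall accumulations P' (mm)
    start_day : day of year of the first element of rain_mm
    api0      : API on the day before the series starts (mm)
    """
    api = api0
    series = []
    for offset, rain in enumerate(rain_mm):
        day = (start_day + offset - 1) % 365 + 1  # wrap into 1..365
        api = loss_coefficient(day) * api + rain
        series.append(api)
    return series
```

With these defaults, γ ranges between 0.75 and 0.95 over the year, so a single rain pulse decays faster in midyear than near the start of the year.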

Remotely sensed surface soil moisture estimates *θ* are used to update (1) via a Kalman filter:

$$\mathrm{API}^{+}_{i,j} = \mathrm{API}^{-}_{i,j} + K_{i,j}\left(\theta_{i,j} - \mathrm{API}^{-}_{i,j}\right). \tag{3}$$

Here, the superscript minus (−) and plus (+) signs denote API values before and after Kalman filter updating. Following Reichle and Koster (2005), daily *θ*_{i,j} estimates (in water depth dimensions) for a particular pixel are obtained by linearly rescaling a time series of raw surface soil moisture retrievals *θ*°_{i,j} (in volumetric soil moisture dimensions) such that the retrievals' long-term mean (*μ*) and standard deviation (*σ*) match those derived from a multiyear integration of API calculated for the same pixel:

$$\theta_{i,j} = \mu_{\mathrm{API},j} + \frac{\sigma_{\mathrm{API},j}}{\sigma_{\theta^{\circ},j}}\left(\theta^{\circ}_{i,j} - \mu_{\theta^{\circ},j}\right). \tag{4}$$

Implicit in this transformation is an assumption that API estimates and the satellite soil moisture retrievals possess the same vertical support within the soil column. While the purely linear API model lacks an explicit representation of a soil layer depth, its inherent memory of past rainfall (and therefore its storage capacity) is implied via *γ*. The sensitivity of subsequent results to our particular parameterization of *γ* in (2) is examined in section 5e. Also, note that API mean and standard deviation statistics for (4) are sampled from a time series generated using (1) and no Kalman filter updating. The length of data heritage required to obtain stable estimates of these statistics is examined in section 5f.
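The linear rescaling described above amounts to matching the first two moments of the retrieval series to those of an open-loop API series. A minimal sketch follows, assuming population (rather than sample) standard deviations and a hypothetical function name; the two input series need not have the same length, since retrievals are intermittent.

```python
from statistics import mean, pstdev


def rescale_retrievals(theta_raw, api_open_loop):
    """Linearly rescale raw retrievals to the API mean and standard deviation.

    theta_raw     : raw volumetric soil moisture retrievals for one pixel
    api_open_loop : API series (mm) generated without Kalman filter updating
    """
    mu_raw, sd_raw = mean(theta_raw), pstdev(theta_raw)
    mu_api, sd_api = mean(api_open_loop), pstdev(api_open_loop)
    if sd_raw == 0.0:
        raise ValueError("retrieval series has no variability")
    # Shift and scale so the output shares the API climatology.
    return [mu_api + (x - mu_raw) * sd_api / sd_raw for x in theta_raw]
```

Because the transform is affine with a positive slope, the ranking of wet and dry days in the retrieval series is preserved; only its units and climatology change.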

The Kalman gain *K* in (3) is given by

$$K_{i,j} = \frac{T^{-}_{i,j}}{T^{-}_{i,j} + S_{j}}, \tag{5}$$

where *T* is the error variance for API forecasts and *S* is the error variance for *θ* retrievals. At measurement times, *T* is updated via

$$T^{+}_{i,j} = \left(1 - K_{i,j}\right) T^{-}_{i,j}. \tag{6}$$

Between soil moisture retrievals, and the adjustment of API and *T* via (3) and (6), API is forecasted in time using observed *P*′ and (1). In parallel, forecast error *T* is updated using

$$T^{-}_{i,j} = \gamma^{2}\, T^{+}_{i-1,j} + Q_{j}, \tag{7}$$

where *Q* relates the forecast uncertainty added to an API estimate between time *i* − 1 and *i*.
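The forecast and update cycle described in this section can be sketched as a scalar Kalman filter for one pixel. The sketch assumes the standard scalar forms: gain K = T/(T + S), analysis variance (1 − K)T, and forecast variance γ²T + Q. For brevity it holds γ constant, which is a simplification of the seasonal parameterization. Names such as `assimilate` and the use of `None` for days without a retrieval are illustrative.

```python
def assimilate(rain_mm, theta_mm, gamma, S, Q, api0=0.0, T0=1.0):
    """Run a scalar Kalman filter over one pixel.

    rain_mm  : daily rainfall P' (mm)
    theta_mm : rescaled retrievals (mm), or None on days without a retrieval
    S, Q     : retrieval and model error variances (mm^2)
    Returns (analysis API series, analysis increments).
    """
    api, T = api0, T0
    analyses, increments = [], []
    for rain, obs in zip(rain_mm, theta_mm):
        # Forecast step: propagate the state and its error variance.
        api = gamma * api + rain
        T = gamma ** 2 * T + Q
        delta = 0.0
        if obs is not None:
            # Update step: weight the innovation by the Kalman gain.
            K = T / (T + S)
            delta = K * (obs - api)
            api += delta
            T = (1.0 - K) * T
        analyses.append(api)
        increments.append(delta)
    return analyses, increments
```

On days without a retrieval the increment is zero and the forecast variance simply grows by Q, so the first retrieval after a long gap receives more weight.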

A larger retrieval error variance (*S*) leads to smaller *K* and a reduction in the weighting applied to observations by (3). A critical aspect then is the estimation of the error parameters *S* in (5) (describing the soil moisture retrieval error) and *Q* in (7) (determining the model uncertainty in API forecasts). Here, *S* and *Q* are assumed to be constant, scalar quantities that are calibrated on a pixel-by-pixel basis until a time series of filtering innovations,

$$\nu_{i,j} = \frac{\theta_{i,j} - \mathrm{API}^{-}_{i,j}}{\sqrt{T^{-}_{i,j} + S_{j}}}, \tag{8}$$

is obtained that is temporally uncorrelated and has a variance of one. In particular, the lack of temporal correlation in *ν* ensures that the filter is accurately partitioning total error between the observation and modeling sources (Gelb 1974). Note that such calibration requires no outside information other than time series variables already used in the filtering procedure (e.g., *θ*° and *P*′). Despite the potential for non-Gaussian rainfall errors, such an approach has been successfully applied to estimating modeling errors in a similar soil moisture data assimilation system (Crow and Bolten 2007).
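The two calibration targets named above (unit variance and no temporal correlation in the innovations) can be checked with a short diagnostic. The sketch below assumes innovations are normalized by the square root of the forecast plus retrieval error variance, and it uses the lag-1 autocorrelation as a stand-in for "temporally uncorrelated"; both function names are hypothetical, and the search over *S* and *Q* itself is not shown.

```python
import math
from statistics import mean, pvariance


def normalized_innovation(theta, api_forecast, T_forecast, S):
    """Innovation scaled by its expected standard deviation."""
    return (theta - api_forecast) / math.sqrt(T_forecast + S)


def innovation_diagnostics(nu):
    """Return (variance, lag-1 autocorrelation) of an innovation series."""
    mu = mean(nu)
    var = pvariance(nu, mu)
    lag1_cov = sum((a - mu) * (b - mu) for a, b in zip(nu[:-1], nu[1:])) / len(nu)
    return var, lag1_cov / var
```

A calibration loop would adjust *S* and *Q* for each pixel until the returned variance is close to one and the autocorrelation is close to zero.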

## 3. Rainfall correction

The analysis increment realized when a retrieval is assimilated via (3) is

$$\delta_{i,j} = K_{i,j}\left(\theta_{i,j} - \mathrm{API}^{-}_{i,j}\right). \tag{9}$$

Provided that *θ* retrievals are minimally skilled and the Kalman filter is properly parameterized, *δ* values obtained from (9) should correlate with antecedent errors in the *P*′ values used to force (1). In regions where ground-based rain gauge observations are dense enough to be considered a validation data product *P*, the error in *P*′ can be explicitly calculated as *P*′ − *P*. To examine the relationship between *δ* and *P*′ − *P*, Crow (2007) temporally aggregated both quantities within a series of nonoverlapping windows of length *m*:

$$[\delta]_{k,j} = \sum_{i=km+1+n}^{(k+1)m+n} \delta_{i,j}, \tag{10}$$

$$[P' - P]_{k,j} = \sum_{i=km+1}^{(k+1)m} \left(P'_{i,j} - P_{i,j}\right), \tag{11}$$

and calculated the long-term correlation coefficient between [*δ*] and [*P*′ − *P*]. The new index *k* = 0, 1, 2, …, counts nonoverlapping *m*-day time periods, and a lag of *n* days is introduced in (10) to account for the causal relationship between past rainfall and current soil moisture.
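The windowed aggregation and correlation step can be illustrated as follows. This sketch assumes that the lag simply shifts the start of the increment windows *n* days later than the rainfall-error windows, and it computes an ordinary Pearson coefficient; `aggregate` and `pearson` are illustrative names.

```python
import math


def aggregate(daily, m, lag=0):
    """Sum a daily series within nonoverlapping m-day windows.

    lag shifts the window start by that many days (used for increments,
    which respond to rainfall after a delay).
    """
    shifted = daily[lag:]
    n_windows = len(shifted) // m
    return [sum(shifted[k * m:(k + 1) * m]) for k in range(n_windows)]


def pearson(x, y):
    """Pearson correlation over the common leading portion of x and y."""
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Given daily lists `delta` and `rain_error`, `pearson(aggregate(delta, 7, lag=1), aggregate(rain_error, 7))` would correspond to the 7-day, 1-day-lag configuration mentioned in the text.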

Using *m* = 7, *n* = 1, and an AMSR-E soil moisture product, Crow and Zhan (2007) found a statistically significant correlation between [*δ*] and [*P*′ − *P*] (at 95% confidence) for 84% of the CONUS land surface. This result suggests that [*P*′ − *P*] rainfall accumulation errors can be estimated, and systematically reduced, based on [*δ*] values realized during the sequential assimilation of remotely sensed surface soil moisture into (1). However, because [*δ*] reflects the correction of all error sources in API predictions (e.g., the poor parameterization of soil water loss) and not just those associated with rainfall, the linear relationship between [*δ*] and [*P*′ − *P*] is not generally one to one. In response, we propose to correct *m*-day remotely sensed rainfall accumulations ([*P*′]) using an additive correction of the form

$$[P'_{\mathrm{corrected}}]_{k,j} = [P']_{k,j} + \lambda_{j}\,[\delta]_{k,j}, \tag{12}$$

where *λ* is a time-constant scaling factor. While consistent with the application of a Kalman filter, it is possible that an additive (as opposed to multiplicative) error model for precipitation accumulations has shortcomings at the space–time scales under study (1° latitude–longitude, 2–10 days), so results obtained from (12) will be scrutinized for evidence concerning the appropriateness of this assumption. One immediate consequence of an additive correction is that (12) can yield negative corrected accumulations. Corrections are restricted to *m*-day periods containing more than *m*/2 distinct soil moisture retrievals.
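A minimal sketch of the additive correction follows, assuming the form [*P*′] + *λ*[*δ*] and assuming that windows with too few retrievals are left uncorrected. The `floor_at_zero` option is a hypothetical convenience for handling negative corrected accumulations; the treatment of negative values is not something this sketch takes from the text.

```python
def correct_accumulations(p_sat, delta_sum, n_retrievals, lam, m,
                          floor_at_zero=False):
    """Apply an additive, increment-based correction to m-day accumulations.

    p_sat        : m-day satellite rainfall accumulations [P'] (mm)
    delta_sum    : m-day sums of analysis increments [delta] (mm)
    n_retrievals : number of soil moisture retrievals in each window
    lam          : time-constant scaling factor
    """
    corrected = []
    for p, d, n in zip(p_sat, delta_sum, n_retrievals):
        # Only correct windows with more than m/2 retrievals.
        value = p + lam * d if n > m / 2 else p
        if floor_at_zero:
            value = max(value, 0.0)  # hypothetical handling of negatives
        corrected.append(value)
    return corrected
```

For example, with `m = 3` and `lam = 0.5`, a window holding 2 mm of rainfall and a summed increment of −6 mm yields −1 mm unless `floor_at_zero` is set.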

The eventual accuracy of the corrected rainfall product depends on the choice of *λ* for each pixel in which rainfall is corrected. Unfortunately, the optimal choice for *λ* demonstrates a theoretical dependence on a number of unknown factors. As noted above, the magnitude of *λ* is sensitive to the relative partitioning of the total modeling error in (1) between external rainfall forcing and shortcomings in the internal model structure. For instance, if error in (1) is dominated by the poor treatment of soil water loss, as opposed to error in *P*′, analysis increments will mostly reflect corrections due to highly inaccurate soil loss predictions and overestimate the volume of water required to compensate for (more modest) rainfall errors. This, in turn, implies that *λ* values less than one are required in (12) to obtain optimally accurate corrected accumulations. Conversely, where error in (1) is dominated by rainfall error, optimal *λ* will be higher, as updates to API made via (3) strongly reflect the impact of rainfall uncertainty.

A further complication in applying (1) is the role of surface runoff and quick drainage of the near-surface (1–3 cm) soil moisture layer after the end of rainfall events, but prior to the subsequent acquisition of a surface soil moisture retrieval. Both processes reduce the volume of the rainfall error that manifests itself in the satellite-estimated surface soil moisture anomaly. Consequently, their impact necessitates an increase in the value of optimized *λ* to compensate for the undetected rainfall volume that either runs off or infiltrates beyond the shallow microwave sensor measurement depth.

These factors suggest that the estimation of *λ* will pose a significant obstacle for the operational implementation of (12). Here, we pursue and evaluate two potential strategies for dealing with this issue. The first is a naïve “default” strategy of simply assigning *λ* to a fixed value of either 0.5 or 1 for all grid cells at all times. The second option is a nonparametric “estimated *λ*” strategy of utilizing a second, independently acquired satellite rainfall dataset (*P*″) and tuning time-constant values of *λ* on a pixel-by-pixel basis until the RMS difference between the corrected accumulations and [*P*″] is minimized. If errors in *P*′ and *P*″ are independent, *λ* values that minimize the mean-squared difference from [*P*″] should also minimize the mean-squared error of the corrected accumulations. The estimated *λ* approach can be evaluated based upon comparisons with an “optimized *λ*” strategy in which time-constant *λ* values are explicitly tuned (on a pixel-by-pixel basis) to minimize the RMS difference between the corrected accumulations and benchmark rain gauge accumulations ([*P*]). This final approach is, of course, not feasible in a global setting as it requires long-term access to *P* values acquired from dense, ground-based rain gauge networks.
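Because an additive correction is linear in *λ*, the RMS-minimizing value for a pixel has a closed form: setting the derivative of Σ([*P*′] + *λ*[*δ*] − [*P*″])² to zero gives *λ* = Σ[*δ*]([*P*″] − [*P*′]) / Σ[*δ*]². The sketch below uses that least-squares shortcut in place of an iterative tuning procedure, which is a substitution made here for brevity. The function name is hypothetical, and the reset of negative values to zero follows the handling described in the results.

```python
def estimate_lambda(p_sat, delta_sum, p_ref):
    """Least-squares lambda minimizing RMS([P'] + lambda*[delta] - [P_ref]).

    p_sat     : m-day accumulations from the product being corrected
    delta_sum : m-day sums of analysis increments
    p_ref     : m-day accumulations from the reference product
                (an independent satellite product, or gauges for the
                "optimized" case)
    """
    num = sum(d * (r - p) for p, d, r in zip(p_sat, delta_sum, p_ref))
    den = sum(d * d for d in delta_sum)
    if den == 0.0:
        return 0.0  # no increments, so no basis for a correction
    return max(0.0, num / den)  # negative estimates are reset to zero
```

The same function serves both the estimated and optimized strategies; only the reference series passed as `p_ref` differs.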

In summary, it is important to stress that not all differences between API predictions in (1) and rescaled AMSR-E soil moisture retrievals obtained from (4) can be attributed to the impact of errors in *P*′ on API predictions. As discussed above, a portion of such differences arises from both AMSR-E soil moisture retrieval errors and nonrainfall sources of error in API predictions. Within this context, our entire methodology can be viewed as a two-step filtering procedure designed to isolate these non-rainfall-based error sources before they can be misattributed to rainfall. The first step in this filtering process is the application of a Kalman filter in (9) to estimate API analysis increments *δ*. Here, the total difference between rescaled AMSR-E retrievals (*θ*) and background API forecasts (API^{−}) is reduced by a fractional Kalman gain (*K* < 1) calculated in (5) based on consideration of the relative magnitude of soil moisture retrieval and API errors. As such, it quantifies the fraction of the difference between *θ* and API^{−} attributable to modeling, as opposed to AMSR-E retrieval, error. Subsequently, we apply a secondary correction via a temporally constant *λ* parameter in (12) to extract, from a temporal aggregation of *δ* ([*δ*]), the fraction of the total modeling error directly attributable to rainfall and discard the complementary portion due to nonrainfall sources of error in API forecasts. Both filtering steps are critical before resulting water depths can be correctively applied to a satellite-based rainfall accumulation estimate.

## 4. Data and approach

Remotely sensed surface (1–3 cm) soil moisture retrievals *θ*° are obtained from application of the single-polarization Jackson (1993) algorithm to X-band AMSR-E *T*_{B} data (Jackson et al. 2007). Climatological normalized difference vegetation index (NDVI) composite products derived from the Advanced Very High Resolution Radiometer (AVHRR) and the vegetation water content (VWC)–NDVI regression relationship of Jackson et al. (1999) are used to estimate VWC. Surface soil moisture retrievals are acquired with a spatial resolution of about 40^{2} km^{2} and a measurement frequency of 1–2 days at midlatitudes. Screening is performed to mask areas with snow cover and/or experiencing active rainfall. After screening, retrievals obtained from both ascending and descending overpasses between 1 July 2002 and 31 December 2006 are combined and aggregated to form a (near) daily, 1° latitude–longitude product. Prior to their assimilation into (1), *θ*° retrievals are linearly rescaled (on a pixel-by-pixel basis) using (4) to form a new product (*θ*) with the same temporal mean and standard deviation as API products derived from (1).

During this same 2002–06 time period, a number of different remotely sensed rainfall datasets are used to estimate the daily rainfall accumulation *P*′ in (1). TMPA results are computed retrospectively as the version 6 3B42 product. This approach combines multiple passive microwave estimates, microwave-calibrated infrared (IR) estimates, and monthly gauge data (Huffman et al. 2007). The real-time, combined passive microwave portion of the TMPA is computed experimentally as the 3B40RT product (Huffman et al. 2007). It differs from the version 6 3B42 product in only using microwave data, being run in real time, and having a heterogeneous computational record. In particular, the inventory of microwave data approximately doubled in February 2005. The Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) employs a neural network to calibrate IR estimates with passive microwave data (Sorooshian et al. 2000). Finally, the Hydroestimator (HE) product is based on radar-calibrated IR estimates and uses numerical weather model data to adjust for moisture availability, the height of the convective equilibrium level, and orographic influences (Scofield and Kuligowski 2003). Use of the HE product is delayed until after July 2003 due to temporal gaps in its coverage. Benchmark rainfall magnitudes, *P*, for 2002–05 are obtained from the CPC’s retrospective rain gauge analysis product within the contiguous United States (Higgins et al. 2000). For 2006 only, *P* is based on the real-time CPC product, which is derived from slightly fewer rain gauges (information online at http://www.cpc.noaa.gov/products/precip/realtime/index.shtml). Although individual rain gauges provide sampling that is quite different from that of satellite-based rainfall estimates, it is conventional to use interpolated analyses of sufficiently dense gauge networks as validation for satellite estimates (e.g., Ebert et al. 2007).

All rainfall accumulation products (TRMM 3B42, TRMM 3B40RT, PERSIANN, HE, and NCEP CPC) are resampled to a daily, 1° latitude–longitude grid overlying the CONUS area. For these products, we define daily accumulation as the total depth of rainfall occurring between 1200 and 1200 UTC. However, for the soil moisture product the same day is defined by a period shifted 12 h into the future (0000–2400 UTC). Here, this 12-h shift is assumed to effectively capture the necessary delay between rainfall and the resulting soil moisture, and *n* in (10) is set to zero. Results will focus primarily on a choice of 3 days (i.e., *m* = 3) for the time scale of corrected accumulation products. The spatial domain of interest is the entire CONUS land area with a special focus on a lightly vegetated southern Great Plains (SGP) subdomain between 33°–40°N and 100°–105°W that is generally considered to be well suited for soil moisture remote sensing (Jackson et al. 1999). Future lower-frequency (1.4 GHz) satellite sensors (Kerr et al. 2001) should yield higher-accuracy soil moisture products over moderately to heavily vegetated surfaces. Consequently, AMSR-E SGP results, derived using higher-frequency X-band retrievals over a lightly vegetated area, are likely representative of future 1.4-GHz L-band spaceborne results for a geographic domain extending beyond the SGP to more densely vegetated regions.

## 5. Results

For the single lightly vegetated 1° box centered at 35°N and 100°W, Fig. 1 plots time series of (a) daily API values derived from (1) using the TRMM 3B40RT rainfall product for *P*′, (b) the raw AMSR-E surface soil moisture product (*θ*°), and (c) the analysis increments (*δ*) realized when assimilating rescaled *θ*° into (1). For the same grid box over a shorter time period during summer 2004, Fig. 2 shows the impact of using (12) and the *δ* time series in Fig. 1c to correct 3-day TRMM 3B40RT accumulation products. Periods of time when the TRMM 3B40RT product overestimates (or underestimates) benchmark CPC accumulation products are labeled. Note that, for each instance, the corrected products obtained from (12) are able to revise the TRMM product in the correct direction (i.e., remove accumulated rainfall in overestimated cases and add rainfall volume in underestimated cases). The source of this skill is the time series of *δ* values (Fig. 1c) derived from *θ*° (Fig. 1b) using (4) and (9).

### a. Estimation of *λ*

Figure 2 plots both optimized and estimated *λ* rainfall correction results. As discussed in section 3, optimized results are based on tuning a temporally constant *λ* factor on a pixel-by-pixel basis to minimize the RMS difference between corrected 3-day accumulation products and benchmark [*P*]. Such explicit tuning, however, is not possible for the application of (12) outside of limited areas with extensive ground-based observations. A globally feasible alternative is to calibrate *λ* in (12) to minimize the RMS difference between corrected accumulations and an independent satellite-based product ([*P*″]). Figure 3 compares estimated *λ* values, obtained by minimizing the RMS difference between the corrected 3-day TRMM 3B40RT rainfall product and 3-day HE accumulations ([*P*″]), against optimized *λ* results, which explicitly minimize the RMS difference between the corrected product and [*P*]. Unless otherwise noted, all estimated *λ* values are based on HE data acquired between July 2003 and December 2006.

Values of optimized *λ* plotted on the ordinate in Fig. 3 typically fall between zero and one. The relative lack of optimized *λ* values greater than one suggests that, while [*δ*] and [*P*′ − *P*] tend to be negatively correlated (rainfall underestimates are followed by positive increments), the entire volume of [*δ*] should not necessarily be attributed directly to rainfall error. More importantly, the relatively strong (*R*^{2} = 0.56) correlation in Fig. 3 implies that optimized time-constant *λ* values can be effectively estimated without the aid of extensive ground-based rainfall observations. However, Fig. 3 also demonstrates a tendency for estimated *λ* values to slightly underpredict their optimized counterparts. One consequence of this bias is that negative estimated *λ* values are calculated for a small number of 1° pixels (representing about 5% of the total CONUS area). Since no physical basis exists for the positive correlation between [*δ*] and [*P*′ − *P*] (required by a negative value for *λ*), negative *λ* values are reset to zero. Because of the form of (12), this prevents the estimated *λ* approach from modifying rainfall in these pixels.

### b. SGP correction of TRMM 3B40RT

For all 1° boxes in the SGP subdomain, Fig. 4 plots original and corrected 3-day TRMM 3B40RT accumulations against their benchmark CPC equivalents. Specifically, Fig. 4a plots benchmark CPC 3-day accumulations versus original (uncorrected) TRMM 3B40RT accumulations. As suggested by earlier results in Fig. 2, this relationship is enhanced through application of the optimized *λ* correction procedure (Fig. 4b). Furthermore, an approximately equivalent correction is obtained when utilizing estimated (as opposed to optimized) *λ* values in (12) (Fig. 4c). Relatively small differences between Figs. 4b and 4c suggest that the relationship in Fig. 3 produces sufficiently accurate *λ* estimates to form the basis of a robust, and operationally feasible, correction procedure.

An alternative, but less encouraging, explanation for the lack of difference between Figs. 4b and 4c is that HE daily accumulations (used as the independent rainfall data source [*P*″] to define *λ*) are significantly more accurate (relative to TRMM 3B40RT) and can therefore be substituted for benchmark CPC rainfall estimates with little subsequent impact on *λ* estimates. However, if this effect was truly responsible for the small differences seen between Figs. 4b and 4c, then our estimation approach should yield much poorer results when the roles of HE and TRMM 3B40RT are reversed (i.e., when TRMM 3B40RT data are used as [*P*″] to define *λ* and correct HE accumulations). In fact, the HE correction results for this reverse case demonstrate only very small differences between optimized and estimated *λ* results. Specifically, relative to an *R*^{2} of 0.48 for uncorrected HE 3-day accumulations, duplication of Fig. 4 for this reverse case (not shown) leads to an *R*^{2} of 0.64 for optimized *λ* versus 0.63 for estimated *λ*. The lack of difference between estimated and optimized results for both the original and reverse cases implies that the success of our estimation approach is not contingent on a significant difference in accuracy between the two satellite-based products (HE and 3B40RT) but rather on the mutual independence of their errors.

Results in Table 1 summarize the accuracy of our original and corrected rainfall products (relative to the CPC benchmark) using a variety of accuracy metrics. The root-mean-square error (RMSE) and square of the correlation coefficient (*R*^{2}) results are based on the comparison of corrected 3-day accumulation products with the benchmark CPC product. The false alarm ratio (FAR) is defined as the fraction of estimated events that are actually nonevents in the CPC product. The probability of detection (POD) is the fraction of all actual events that are correctly estimated. Here, an event is defined as a 3-day rainfall accumulation that exceeds the 95th quantile for all 3-day CPC accumulation intervals in each 1° box. Since our selection of the 95th quantile is essentially arbitrary, results for other quantile choices are discussed later (section 5e). Table 1 is also broken down according to both geographical domain (i.e., the entire CONUS area and the smaller SGP subdomain) and the method utilized for obtaining *λ* (i.e., the estimated, optimized, and default approaches introduced in section 3 and discussed above).
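The event-based metrics can be sketched for a single 1° box as follows. The sketch assumes that the same benchmark-derived threshold is applied to both the estimated and benchmark series, and that FAR and POD take their usual contingency-table forms, false alarms / (hits + false alarms) and hits / (hits + misses). The function name is illustrative.

```python
def pod_far(estimated, benchmark, threshold):
    """Probability of detection and false alarm ratio for one pixel.

    estimated : m-day accumulations from the satellite product (mm)
    benchmark : m-day accumulations from the gauge product (mm)
    threshold : event threshold (mm), e.g. a high benchmark quantile
    """
    hits = misses = false_alarms = 0
    for est, obs in zip(estimated, benchmark):
        est_event, obs_event = est > threshold, obs > threshold
        if est_event and obs_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif est_event:
            false_alarms += 1
    nan = float("nan")
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    return pod, far
```

One way to obtain a 95th-quantile threshold in the standard library is `statistics.quantiles(benchmark, n=20)[-1]`, although its interpolation method may differ from the one used for Table 1.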

Note that the correction procedure improves all four performance metrics relative to the uncorrected, original TRMM 3B40RT product within the SGP subdomain (Table 1). As in Fig. 4, relatively small differences exist between the corrected results obtained via the optimized *λ* and estimated *λ* approaches. Both approaches also slightly outperform the *λ* = 1 default case. However, the fixed default choice of *λ* = 0.5 for all pixels comes close to matching the results for the estimated *λ* correction. This lack of a significant difference emphasizes that, despite the need to quantify *λ* in a reasonable way, SGP accuracy improvements demonstrated in Table 1 are ultimately due to the skill embedded in temporally variable Kalman filter analysis increments (Fig. 1c) and not the pixel-by-pixel tuning of a time-constant *λ* factor performed by both the estimated and optimized *λ* strategies. Improvement in all rainfall accumulation metrics is possible even for the simplistic treatment of *λ* as a temporally and spatially fixed variable.

Returning to the additive form for (12) and the occasional negative values of *λ* results in Table 1, but only modestly (e.g., RMSE increases from 9.15 to 9.80 mm and *R*^{2} decreases from 0.53 to 0.49). In addition, because our approach is based on rescaling raw *θ*° retrievals into a potentially biased API climatology prior to correction, it cannot correct for long-term accumulation bias. All positive impacts noted in Table 1 are therefore based on the correction of random and/or slowly varying accumulation error components. Correction of the long-term bias in rainfall products will require the implementation of other techniques (e.g., Smith et al. 2006).

### c. CONUS correction of TRMM 3B40RT

Results presented up to this point have been based on the application of our approach to a lightly vegetated area (the SGP) known to be well suited to soil moisture remote sensing (Jackson et al. 1999). Relative to the SGP subdomain, application of the estimated *λ* correction procedure to the entire CONUS area yields substantially smaller relative improvements in all accuracy metrics except FAR (Table 1). This reduction likely reflects the wider range of land surface conditions outside of the SGP subdomain, some of which are not well suited to X-band soil moisture remote sensing (Njoku et al. 2003). As with the SGP region, only small CONUS-wide differences are noted between corrected results acquired using optimized and estimated *λ* values in (12). In contrast, all four accuracy metrics for the *λ* = 1 default correction case are degraded relative to the original (uncorrected) TRMM 3B40RT product. The failure of this naïve approach underscores the importance of estimating *λ* in an appropriate manner. However, as in the SGP, relatively good results are obtained for the better default choice of *λ* = 0.5 (Table 1).

Figure 5a shows CONUS-wide imagery of the relative RMSE improvement observed between the original and corrected TRMM 3B40RT accumulation products [(RMSE_{corrected} − RMSE_{original})/RMSE_{original}]. Despite the use of estimated, rather than explicitly optimized, *λ* values in (12), relative RMSE improvement is observed over almost the entire domain, and, within a large area of the central United States, relative RMSE reductions greater than 0.30 are found. Increased RMSE in corrected rainfall is limited to a small number (∼30) of 1° cells in heavily forested areas (e.g., New England). The mischaracterization of nonrainfall error sources in our approach would almost certainly lead to areas of increased rainfall accumulation RMSE. The relative absence of such degraded areas in Fig. 5a demonstrates that our particular rainfall correction approach is adequately filtering out the corrupting effect of soil moisture retrieval uncertainty and nonrainfall sources of error in API estimates.

While the approach requires the specification of several parameters using retrospective data [e.g., the Kalman filtering parameters *S* and *Q* in (5) and (7) and the scaling factor *λ* in (12)], all results in Fig. 5 are based on the approximation of these parameters using only remotely sensed data products. At no point during the correction process is access to the benchmark CPC rain gauge data required (or assumed) to tune a particular parameter. In particular, *λ* can be adequately estimated from satellite-based precipitation products (Fig. 4). Even a simple default case of *λ* = 0.5 leads to substantial correction (Table 1). Consequently, RMSE correction results displayed in Fig. 5a can be considered representative of results obtainable in an operational setting lacking access to ground data.

However, improved RMSE results do not guarantee the enhancement of other rainfall accuracy metrics. Figure 5b plots CONUS results for the absolute difference in *R*^{2} (*R*^{2}_{corrected} − *R*^{2}_{original}), where *R*^{2} is calculated between each rainfall accumulation product and benchmark [*P*]. Here, clusters of slightly degraded (i.e., reduced) *R*^{2} are found in areas of high vegetation cover (e.g., the Pacific Northwest, the Ozarks, and along the Appalachian Mountains). Nevertheless, outside of these clusters, areas of more substantial enhancement can be seen, particularly along a broad north–south swath of the western United States. Enhanced POD is also observed throughout relatively lightly vegetated and nonmountainous areas of the western United States (Fig. 5c). Figure 5d shows that while FAR skill is generally enhanced throughout the entire CONUS domain, the pattern of improvement is somewhat erratic in that it does not conform to any known spatial variation in vegetation cover or climatological rainfall characteristics.

### d. Correction of TRMM 3B42, HE, and PERSIANN

To this point, all results have been based on the using the TRMM 3B40RT product for *P*′ in (1). Table 2 also summarizes our correction results based on application of our estimated *λ* correction procedure to three other rainfall products (TRMM 3B42, HE, and PERSIANN). Note that the TRMM 3B40RT product is used as the independent *P*″ data source during the correction of the HE and PERSIANN accumulation products. All results in Table 2 are given in terms of absolute metric differences (corrected − original). Within the SGP subdomain, all four products are enhanced with respect to all four accuracy metrics (i.e., reduced RMSE and FAR and increased *R*^{2} and POD). Corrections made to the satellite-only TRMM 3B40RT, PERSIANN, and HE products capture the potential range of improvements obtainable for satellite-base rainfall estimates acquired using a variety of algorithms and satellite-based observations. For instance, slightly smaller corrections made to the PERSIANN product (relative to TRMM 3B40RT) may be attributable to its incorporation of more frequent, although less precise, thermal infrared remote sensing observations not considered in the microwave-only TRMM 3B40RT product. A larger difference can be noted between the results for the purely satellite-derived products and the gauge-corrected TRMM 3B42 product (Huffman et al. 2007). For all metrics except FAR, the retrospective gauge-based correction of the TRMM 3B42 appears to limit the added utility associated with our surface soil moisture–based correction procedure. This implies that the primary value of remotely sensed soil moisture for rainfall correction will be in areas lacking adequate rain gauge coverage for use in either real-time or retrospective rainfall analyses. 
Finally, as in Table 1, enhancements to the RMSE, *R*^{2}, and POD skill for all four products are reduced when expanding the domain of interest from the lightly vegetated SGP subdomain to the wider range of land coverage types found within the entire CONUS region.
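The four accuracy metrics compared in Tables 1 and 2 can be sketched in a few lines of Python. The sketch is illustrative only. It assumes the standard contingency-table definitions POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms), takes the event threshold from a quantile of the benchmark accumulations (as is done with the CPC data here), and uses hypothetical function names (`quantile`, `skill_metrics`) and made-up accumulation values.

```python
import math

def quantile(values, q):
    """Quantile of a non-empty sequence by linear interpolation (0 <= q <= 1)."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)

def skill_metrics(estimate, benchmark, event_quantile=0.95):
    """RMSE, R^2, POD and FAR of `estimate` against `benchmark` accumulations."""
    n = len(benchmark)
    pairs = list(zip(estimate, benchmark))
    rmse = math.sqrt(sum((e - b) ** 2 for e, b in pairs) / n)

    mean_e = sum(estimate) / n
    mean_b = sum(benchmark) / n
    cov = sum((e - mean_e) * (b - mean_b) for e, b in pairs)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    var_b = sum((b - mean_b) ** 2 for b in benchmark)
    r2 = cov ** 2 / (var_e * var_b) if var_e > 0 and var_b > 0 else float("nan")

    # Events are accumulations above a quantile of the benchmark series.
    threshold = quantile(benchmark, event_quantile)
    hits = sum(1 for e, b in pairs if e > threshold and b > threshold)
    misses = sum(1 for e, b in pairs if e <= threshold and b > threshold)
    false_alarms = sum(1 for e, b in pairs if e > threshold and b <= threshold)
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return {"rmse": rmse, "r2": r2, "pod": pod, "far": far}

# Made-up 3-day accumulations (mm). A lower quantile than the 95th is used
# only because this toy series is very short.
benchmark = [0, 0, 5, 0, 20, 0, 0, 12, 0, 30]
original = [2, 0, 1, 9, 10, 0, 3, 25, 0, 6]
corrected = [1, 0, 3, 5, 14, 0, 1, 18, 0, 20]
before = skill_metrics(original, benchmark, event_quantile=0.7)
after = skill_metrics(corrected, benchmark, event_quantile=0.7)
change = {name: after[name] - before[name] for name in before}  # corrected - original
```

In this convention, a successful correction yields negative changes in RMSE and FAR and positive changes in *R*^{2} and POD, which is the sign pattern reported for the SGP subdomain.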

### e. Parameter sensitivity and correction robustness

Our procedure contains four separate parameters that could conceivably be modified to impact results: the choice of *α* = 0.85 and *β* = 0.10 in (2), the use of a 3-day accumulation window (*m* = 3) in (10) and (11), and the choice of a 95th quantile (of 3-day rainfall accumulations) as the event threshold for POD and FAR calculations. The summary below examines the potential sensitivity associated with these parameters for correction of the TRMM 3B40RT product within the SGP subdomain.

Values of *α* and *β* used here (0.85 and 0.10) are based on default values used previously in Crow and Zhan (2007). Modest sensitivity is observed to variations in these parameters (not shown). The most significant trend is a tendency for obtaining slightly better correction results for lower values of *α* (e.g., between 0.75 and 0.80). In addition, correction results are degraded for choices of *α* greater than 0.90 and *β* values less than 0.05.
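The role of *α* and *β* is easier to see in a small sketch. Equation (2) is not restated in this section, so the code below assumes that the seasonally varying loss coefficient takes the form *γ* = *α* + *β* cos(2π*d*/365), where *d* is the day of year, and that it is applied in an API recursion in which each day's index equals *γ* times the previous index plus that day's rainfall. That functional form, the function names, and the rainfall values are assumptions made for illustration, not a restatement of the model definition.

```python
import math

def loss_coefficient(day_of_year, alpha=0.85, beta=0.10):
    """Assumed seasonal API loss coefficient: alpha + beta * cos(2*pi*d/365)."""
    return alpha + beta * math.cos(2.0 * math.pi * day_of_year / 365.0)

def run_api(rainfall, days_of_year, alpha=0.85, beta=0.10, initial_api=0.0):
    """Propagate an antecedent precipitation index through a daily rainfall series."""
    api = initial_api
    series = []
    for rain, day in zip(rainfall, days_of_year):
        # Yesterday's storage decays by the loss coefficient, then rain is added.
        api = loss_coefficient(day, alpha, beta) * api + rain
        series.append(api)
    return series

# Made-up daily rainfall (mm): a single 10 mm event followed by a dry-down.
rainfall = [10.0, 0.0, 0.0, 0.0]
days = [180, 181, 182, 183]
default_run = run_api(rainfall, days)               # alpha = 0.85, beta = 0.10
slow_drydown = run_api(rainfall, days, alpha=0.95)  # larger alpha retains more water
```

Under this assumed form, the default values keep the coefficient between 0.75 and 0.95 over the year, whereas any *α* above 0.90 (with *β* = 0.10) allows it to reach or exceed 1 for part of the year.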

With regard to variations in the length of the accumulation window, Fig. 6 plots the original and corrected TRMM 3B40RT RMSE and *R*^{2} results over the SGP subdomain as a function of accumulation period length. The relatively small differences between the results based on the optimized and estimated *λ* in Fig. 6 underscore the robustness of our estimation procedure for *λ* over a range of accumulation time scales. In addition, the relative magnitude of corrections for a 3-day accumulation period (*m* = 3) is generally representative of any accumulation length between 2 and 10 days. However, the correction skill declines significantly at accumulation time scales finer than the 1- to 2-day retrieval frequency of the assimilated AMSR-E soil moisture product. At present, this reduces the effectiveness of our approach at daily and subdaily time scales.
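To make the accumulation-window comparison concrete, the following sketch sums daily series into *m*-day totals and evaluates RMSE for several window lengths, in the spirit of Fig. 6. It assumes consecutive, non-overlapping windows (the windowing used for the figure is not specified in this section), and the function names and daily values are hypothetical.

```python
import math

def accumulate(daily, m):
    """Sum a daily series into consecutive, non-overlapping m-day totals."""
    if m < 1:
        raise ValueError("m must be a positive number of days")
    n_windows = len(daily) // m  # an incomplete trailing window is dropped
    return [sum(daily[i * m:(i + 1) * m]) for i in range(n_windows)]

def rmse_by_window(estimate, benchmark, window_lengths):
    """RMSE of m-day accumulations for each m in `window_lengths`."""
    results = {}
    for m in window_lengths:
        est = accumulate(estimate, m)
        ref = accumulate(benchmark, m)
        results[m] = math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))
    return results

# Made-up daily rainfall (mm) in which the estimate is one day early.
estimate = [1.0, 0.0, 2.0, 0.0, 3.0, 0.0]
benchmark = [0.0, 1.0, 0.0, 2.0, 0.0, 3.0]
rmse = rmse_by_window(estimate, benchmark, window_lengths=[1, 2, 3])
```

The same loop could be run on original and corrected accumulations to compare their skill at each window length.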

FAR correction results are also degraded when specifying lower event thresholds (not shown). In fact, SGP FAR results for corrected rainfall products slightly increase (relative to the original, uncorrected TRMM 3B40RT product) when defining an event threshold below the 85th quantile for 3-day CPC accumulations. Since *λ* is restricted to positive values, the additive form of (12) implies that a positive [*δ*] always increases the corrected accumulation. Positive [*δ*] are commonly associated with underestimated rainfall but can also arise during periods of no precipitation and excessively rapid dry-down dynamics in API predictions (section 3). Despite our attempts to separate out the effects of such non-rainfall error sources, it is possible that the additive form of (12) may erroneously add small accumulation depths to *P* to compensate for what is, in reality, a model evaporative or drainage parameterization problem. This, in turn, would lead to the false detection of low- to moderate-intensity rainfall events. Difficulties with FAR correction are also seen in Fig. 2, where (12) is typically able to only partially correct the accumulation periods in which the TRMM 3B40RT product substantially overestimates rainfall accumulations. Consequently, the corrected accumulation time series remains prone to false alarm errors.
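A toy calculation illustrates how this mechanism can inflate FAR at low event thresholds. The sketch assumes that (12) has the additive form corrected = original + *λ*[*δ*] with *λ* > 0 (the equation is not restated in this section), clips negative totals at zero as a convenience of the sketch, and uses invented values in which small positive increments occur during dry periods. The function names are hypothetical.

```python
def additive_correction(original, increments, lam):
    """Assumed additive correction: original + lam * increment, clipped at zero."""
    return [max(0.0, p + lam * d) for p, d in zip(original, increments)]

def false_alarm_ratio(estimate, benchmark, threshold):
    """Fraction of estimated events (> threshold) not matched by a benchmark event."""
    pairs = list(zip(estimate, benchmark))
    hits = sum(1 for e, b in pairs if e > threshold and b > threshold)
    false_alarms = sum(1 for e, b in pairs if e > threshold and b <= threshold)
    detected = hits + false_alarms
    return false_alarms / detected if detected else float("nan")

# Invented accumulations (mm): one real event, otherwise dry.
benchmark = [0.0, 0.0, 0.0, 16.0, 0.0, 0.0]
original = [0.0, 0.0, 0.0, 12.0, 0.0, 0.0]
# Small positive increments in dry periods mimic an overly rapid model dry-down.
increments = [1.5, 2.0, 1.0, 6.0, 1.2, 0.8]
corrected = additive_correction(original, increments, lam=0.5)

far_low_original = false_alarm_ratio(original, benchmark, threshold=0.5)
far_low_corrected = false_alarm_ratio(corrected, benchmark, threshold=0.5)
far_high_corrected = false_alarm_ratio(corrected, benchmark, threshold=5.0)
```

With these values the correction moves the real event closer to the benchmark, but the small additions during dry periods register as false alarms at the 0.5-mm threshold and not at the 5-mm threshold.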

### f. Training period requirements

A potential obstacle to the operational implementation of this approach lies in the need to sample long-term mean and variance statistics in order to both parameterize the scaling relationship described in (4) and obtain the RMS statistics upon which we base our estimated *λ* correction procedure (see section 3). Up to this point, both calculations have been performed assuming that data are available for the entire AMSR-E data period (July 2002–December 2006). In reality, an operational implementation of this approach using a new soil moisture or rainfall product would need to acquire these statistics within a shorter time frame in order to commence real-time product generation.

Within the SGP subdomain, Fig. 7 examines this issue by plotting the *R*^{2} performance metric for the estimated *λ* correction procedure as a function of training period length. Data available within these finite training periods are sampled to obtain both the rescaling statistics used in (4) and the RMS error statistics needed to estimate *λ*. Here, training period length refers to the number of days with good data that are sampled to obtain such statistics. As in Table 1, results are based on the correction of TRMM 3B40RT accumulations within the SGP subdomain. However, to maximize the length of the data time series, the longer PERSIANN time series data were used for *P*″ in place of the HE product. Multiple gray lines in Fig. 7 are generated by starting finite training periods at various times along the entire July 2002–December 2006 period. These starting points are spaced 100 days apart in order to sample across the seasonal cycle.

Results in Fig. 7 demonstrate that improved accumulation estimates can generally be obtained after sampling for as little as 20 days, and results essentially consistent with the use of the entire period as a training period (dashed black line in Fig. 7) are possible after a training period of about 200 days. These results suggest that it may prove possible to introduce a new data source with preliminary coefficients after 1 month of training and, subsequently, to optimize them using about 1 year of data.
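The training-period experiment can be sketched as follows. The code draws finite training periods whose starting points are spaced 100 days apart, keeps the first *N* days with good data in each, and computes the mean and variance that such a period would supply. It is a sketch under stated assumptions: missing days are marked with `None`, the function name `training_statistics` is hypothetical, and the way these statistics enter (4) and the *λ* estimate is not reproduced here.

```python
from itertools import islice
from statistics import fmean, pvariance

def training_statistics(series, n_good_days, start_spacing=100):
    """Mean and variance from finite training periods started every `start_spacing` days.

    `series` is a daily sequence in which missing or poor-quality days are None.
    Returns a list of (start_index, mean, variance) tuples.
    """
    results = []
    for start in range(0, len(series), start_spacing):
        good_values = (value for value in series[start:] if value is not None)
        sample = list(islice(good_values, n_good_days))
        if len(sample) < n_good_days:
            break  # later starts have even less data available
        results.append((start, fmean(sample), pvariance(sample)))
    return results

# Synthetic 400-day series with a gap every seventh day (values are arbitrary).
series = [None if day % 7 == 0 else float(day % 5) for day in range(400)]
short_training = training_statistics(series, n_good_days=20)
long_training = training_statistics(series, n_good_days=200)
```

Each tuple corresponds to one gray line in a Fig. 7 style analysis: the statistics from that training period would be used to parameterize the correction, which would then be evaluated against the benchmark.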

## 6. Conclusions

Remotely sensed rainfall and surface soil moisture retrievals contain complementary information that can be exploited to enhance both types of hydrologic observations. Past work has demonstrated how the root-mean-square accuracy of daily rainfall accumulations can be estimated through the assimilation of surface soil moisture retrievals into an API model (Crow and Bolten 2007). Here, we expand on this by moving past the passive evaluation of rainfall products to demonstrate how remotely sensed soil moisture retrievals can be used to actively enhance the accuracy of short-term (2- to 10-day) rainfall estimates derived from satellites. Because of the availability of high-quality rain gauge datasets for validation purposes, this initial study is limited to the well-instrumented CONUS region. However, the procedure is explicitly designed to require only satellite-based inputs and can therefore be applied over continental areas lacking extensive ground instrumentation.

Results demonstrate that our additive correction model (12) can be adequately parameterized using two quasi-independent remotely sensed rainfall datasets (Fig. 2). Application of this approach leads to large-scale improvements in the accuracy of the TRMM 3B40RT rainfall product for a range of rainfall accuracy metrics (Table 1, Figs. 4 and 5). While qualitatively similar results are found for other satellite-only products (HE and PERSIANN; Table 2), reduced improvements noted for the TRMM 3B42 product (which incorporates a retrospective correction based on ground-based rain gauge observations) suggest that the highest utility for the procedure lies in enhancing rainfall measurements in continental regions lacking adequate ground-based rain radar or rain gauge coverage.

Because of its basis in Kalman filtering, the approach can be widely applied in regions where soil moisture retrievals suffer poor accuracy (due to the presence of dense vegetation) without reducing the RMS accuracy of the rainfall product (Fig. 5a). However, improvements in other rainfall accuracy metrics (e.g., *R*^{2} and POD) are generally limited to areas of light and moderate vegetation cover (Figs. 5b and 5c). This limitation is consistent with the performance of current-generation soil moisture retrievals derived from AMSR-E X-band (10.7 GHz) *T*_{B} observations. However, the future availability of lower-frequency L-band (1.4 GHz) *T*_{B} observations from the European Space Agency’s (ESA’s) Soil Moisture and Ocean Salinity Mission (SMOS; Kerr et al. 2001) should significantly enhance results in areas of moderate and dense vegetation. Relative to X-band AMSR-E retrievals, L-band sensors will also reduce the impact of atmospheric hydrometeors on soil moisture retrievals and should therefore improve our ability to retrieve soil moisture during and immediately after storm events.

Despite such generally encouraging results, a close examination of our approach reveals limitations that may be attributable to our implicit choice of an additive error model for rainfall accumulations. In particular, FAR correction results at low event thresholds appear to pose a particular challenge (section 5e). Error associated with an inappropriate rainfall error model may also underlie the erratic pattern of FAR correction results in Fig. 5d. These problems suggest that it may prove advantageous to reformulate the approach using an ensemble Kalman or particle filtering approach capable of better representing the (potentially) multiplicative structure of rainfall errors. Additionally, improved FAR results may also require explicitly considering saturation effects whereby land surface signals become insensitive to additional amounts of antecedent rainfall once runoff generation occurs. Finally, limitations in the temporal frequency and spatial resolution of satellite-based soil moisture retrievals may limit the time–space scales (i.e., 2–10 day and 1°) at which our procedure can successfully correct rainfall and, consequently, its value for certain hydrologic applications.

Further research is required to fully address these issues and clarify the potential benefits of more complex modeling and data assimilation approaches. In particular, the constrained ensemble Kalman filtering (CEnKF) concept proposed by Pan and Wood (2007) provides an optimal framework for explicitly decomposing the water balance analysis increments into their runoff, rainfall, evapotranspiration, and soil moisture storage error components. Consequently, CEnKF forms a natural framework for the correction of rainfall accumulation products via remotely sensed soil moisture retrievals, and may enhance the efficiency of our approach when a more complex land surface model is utilized. It should also be noted that the basic structure of our correction (in which near-past rainfall accumulations are modified using current soil moisture retrievals) is more suggestive of a smoothing problem than a filtering problem. It may therefore prove beneficial to reformulate the approach using a fixed-lag smoother to update rainfall accumulation estimates (Dunne and Entekhabi 2005). Future work in this direction could further enhance the encouraging results noted here.

## Acknowledgments

This research was partially supported by NASA EOS Grant NNH04AC301. Data support from Robert Kuligowski (NOAA/NESDIS) and the NOAA/Climate Prediction Center is gratefully acknowledged.

## REFERENCES

Crow, W. T., 2003: Correcting land surface model predictions for the impact of temporally sparse rainfall rate measurements using an ensemble Kalman filter and surface brightness temperature observations. *J. Hydrometeor.*, **4**, 960–973.

Crow, W. T., 2007: A novel method for quantifying value in spaceborne soil moisture retrievals. *J. Hydrometeor.*, **8**, 56–67.

Crow, W. T., and J. D. Bolten, 2007: Estimating precipitation errors using spaceborne surface soil moisture retrievals. *Geophys. Res. Lett.*, **34**, L08403, doi:10.1029/2007GL029450.

Crow, W. T., and X. Zhan, 2007: Continental-scale evaluation of spaceborne soil moisture products. *IEEE Geosci. Remote Sens. Lett.*, **4**, 451–455.

Dunne, S., and D. Entekhabi, 2005: An ensemble-based reanalysis approach to land data assimilation. *Water Resour. Res.*, **41**, W02013, doi:10.1029/2004WR003449.

Ebert, E. E., J. E. Janowiak, and C. Kidd, 2007: Comparison of near-real-time precipitation estimates from satellite observations and numerical models. *Bull. Amer. Meteor. Soc.*, **88**, 47–64.

Gelb, A., 1974: *Applied Optimal Estimation*. MIT Press, 374 pp.

Higgins, R. W., W. Shi, and E. Yarosh, 2000: Improved United States precipitation quality control system and analysis. *NCEP/Climate Prediction Center Atlas*, Vol. 7, 40 pp.

Hossain, F., and E. N. Anagnostou, 2004: Assessment of current passive-microwave- and infrared-based satellite rainfall remote sensing for flood prediction. *J. Geophys. Res.*, **109**, D07102, doi:10.1029/2003JD003986.

Hossain, F., E. N. Anagnostou, and T. Dinku, 2004: Sensitivity analysis of satellite rainfall retrieval and sampling error on flood prediction uncertainty. *IEEE Trans. Geosci. Remote Sens.*, **42**, 130–139.

Hou, A., 2006: The Global Precipitation Mission (GPM): An overview. *Proc. 2006 EUMETSAT Meteorological Satellite Conf.*, Helsinki, Finland, EUMETSAT, 12–16.

Huffman, G. J., and Coauthors, 2007: The TRMM multisatellite precipitation analysis: Quasi-global, multiyear, combined-sensor precipitation estimates at fine scale. *J. Hydrometeor.*, **8**, 28–55.

Jackson, T. J., 1993: Measuring surface soil moisture using passive microwave remote sensing. *Hydrol. Processes*, **7**, 139–152.

Jackson, T. J., D. M. Le Vine, A. Y. Hsu, A. Oldak, P. J. Starks, C. T. Swift, J. Isham, and M. Haken, 1999: Soil moisture mapping at regional scales using microwave radiometry: The Southern Great Plains Hydrology Experiment. *IEEE Trans. Geosci. Remote Sens.*, **37**, 2136–2151.

Jackson, T. J., M. H. Cosh, R. Bindlish, and J. Du, 2007: Validation of AMSR-E soil moisture algorithms with ground based networks. *Proc. Int. Geoscience and Remote Sensing Symp. 2007*, Barcelona, Spain, IEEE, 1181–1184.

Kerr, Y. H., P. Waldteufel, J-P. Wigneron, J-M. Martinuzzi, J. Font, and M. Berger, 2001: Soil moisture retrieval from space: The Soil Moisture and Ocean Salinity Mission (SMOS). *IEEE Trans. Geosci. Remote Sens.*, **39**, 1729–1735.

McCabe, M., E. F. Wood, R. Wojcik, M. Pan, J. Sheffield, H. Gao, and H. Su, 2008: Hydrological consistency using multi-sensor remote sensing data for water and energy cycle studies. *Remote Sens. Environ.*, **112**, 430–444.

Njoku, E. G., T. J. Jackson, V. Lakshmi, T. Chan, and S. V. Nghiem, 2003: Soil moisture retrieval from AMSR-E. *IEEE Trans. Geosci. Remote Sens.*, **41**, 215–229.

Pan, M., and E. F. Wood, 2007: Data assimilation for estimating the terrestrial water budget using a constrained ensemble Kalman filter. *J. Hydrometeor.*, **7**, 534–547.

Reichle, R. H., and R. D. Koster, 2005: Global assimilation of satellite surface soil moisture retrievals into the NASA catchment land surface model. *Geophys. Res. Lett.*, **32**, L02404, doi:10.1029/2004GL021700.

Scofield, R. A., and R. J. Kuligowski, 2003: Status and outlook of operational satellite precipitation algorithms for extreme-precipitation events. *Wea. Forecasting*, **18**, 1037–1051.

Smith, M. S., P. A. Arkin, J. J. Bates, and G. J. Huffman, 2006: Estimating bias of satellite-based precipitation estimates. *J. Hydrometeor.*, **7**, 841–856.

Sorooshian, S., K. Hsu, X. Gao, H. V. Gupta, B. Imam, and D. Braithwaite, 2000: Evaluation of PERSIANN system satellite-based estimates of tropical rainfall. *Bull. Amer. Meteor. Soc.*, **81**, 2035–2046.

Table 1. Spatially averaged RMSE, *R*^{2}, FAR, and POD statistics for 3-day accumulation estimates derived from both the original and corrected versions of the TRMM 3B42RT dataset. The FAR and POD statistics are based on exceeding the 95th quantile for 3-day CPC accumulations.

Table 2. For estimated-*λ* corrections derived from various precipitation products, spatially averaged absolute change (corrected − original) in RMSE, *R*^{2}, FAR, and POD for 3-day rainfall accumulation estimates. The FAR and POD statistics are based on exceeding the 95th quantile for 3-day CPC accumulations.