The Lagrangian Model (LMODEL) is a new multisensor satellite rainfall monitoring methodology based on the use of a conceptual cloud-development model that is driven by geostationary satellite imagery and is locally updated using microwave-based rainfall measurements from low earth-orbiting platforms. This paper describes the cloud development model and updating procedures; the companion paper presents model validation results. The model uses single-band thermal infrared geostationary satellite imagery to characterize cloud motion, growth, and dispersal at high spatial resolution (∼4 km). These inputs drive a simple, linear, semi-Lagrangian, conceptual cloud mass balance model, incorporating separate representations of convective and stratiform processes. The model is locally updated against microwave satellite data using a two-stage process that scales precipitable water fluxes into the model and then updates model states using a Kalman filter. Model calibration and updating employ an empirical rainfall collocation methodology designed to compensate for the effects of measurement time difference, geolocation error, cloud parallax, and rainfall shear.
There is a continuing need for high-resolution satellite rainfall products to supplement limited surface-based monitoring networks for regional hydrology and to provide global calibration and validation datasets for climate models. Two types of satellite sensors provide information pertinent to rainfall monitoring. Active and passive microwave (MW) sensors are directly sensitive to precipitation-related hydrometeors and provide instantaneous precipitation estimates at sampling frequencies of up to twice a day, limited by the need to mount sensors on low Earth-orbiting (LEO) platforms. Multiple satellite platforms provide improved temporal sampling, and the proposed Global Precipitation Measurement (GPM) Mission aims to coordinate MW data collection to achieve a maximum return time of three hours (Hou et al. 2008). In contrast to MW data, visible (VIS) and infrared (IR) images from satellite in geostationary Earth orbit (GEO) provide high temporal resolution (up to 15 min) information on cloud patterns but are not directly sensitive to precipitation processes within the clouds.
Most high-resolution satellite rainfall algorithms combine information from GEO imagery and LEO MW sensors. These algorithms may be divided into two categories: microwave-calibrated and morphing algorithms. Microwave-calibrated algorithms dynamically calibrate an empirical GEO rainfall algorithm against local microwave data (Bellerby et al. 2000; Bellerby 2004; Huffman et al. 2007; Kidd et al. 2003; Marzano et al. 2004; Nicholson et al. 2003a,b; Sorooshian et al. 2000; Todd et al. 2001; Turk and Miller 2005; Xu et al. 1999) For example, the study area and period may be divided into separate spatiotemporal calibration domains, typically 100 km × 100 km × 1 month, and the GEO algorithm independently calibrated against coincident microwave data within each domain (Todd et al. 2001; Xu et al. 1999). More sophisticated approaches involve the continuous training of neural networks (Sorooshian et al. 2000). Morphing algorithms use GEO cloud movements to advect microwave-derived rainfall patterns between overpasses (Joyce et al. 2004; Okamoto et al. 2005). Each of these approaches experiences its own limitations. Microwave-calibrated algorithms are limited by the physically indirect relationships between cloud patterns and rainfall. Local calibration procedures are able to accommodate meteorological variations in these relationships, but they cannot fully compensate for the inability of GEO sensors to resolve precipitation processes. Basic morphing algorithms assume that rainfall varies smoothly along advection streamlines. This causes problems in regimes where storms build and dissipate rapidly (Joyce et al. 2004).
The Lagrangian Model (LMODEL) algorithm adopts a new approach that combines features of both microwave-calibrated and morphing techniques within a conceptual modeling framework. A simple conceptual mass balance model is used to trace cloud development and dispersal along Lagrangian streamlines that trace equivalent pixels within the same cloud through successive geostationary images. The seasonally calibrated model is then locally updated against rainfall measurements from available MW satellite overpasses in two stages: The first stage locally scales precipitable water fluxes into the model, and the second stage updates model state variables using a Kalman filter.
Geostationary satellite rainfall algorithms require the identification and quantification of raining areas in cloud imagery. These algorithms most commonly use thermal IR data because this is available throughout the diurnal cycle. With the advent of more advanced sensors, multispectral GEO techniques are being developed (Ba and Gruber 2001; Bellerby et al. 2000; Turk and Miller 2005). Many IR algorithms depend on the relationship between pixel brightness temperatures and cloud-top height. Cold brightness temperatures are generally associated with high cloud tops that may be indicative of strongly precipitating vertically extended convective systems. Some algorithms identify areas of cold cloud and relate these to areas of rainfall (Arkin and Meisner 1987; Xu et al. 1999). Other algorithms seek to derive a statistical relationship between pixel brightness temperature and underlying rainfall (Todd et al. 2001; Marzano et al. 2004; Vicente et al. 1998). Neural network techniques enable cloud textures and other information to be included in the estimation process (Bellerby et al. 2000; Bellerby 2004; Grimes et al. 2003; Hsu et al. 1997, 1999; Sorooshian et al. 2000; Tapiador et al. 2004; Zhang and Scofield 1994).
The Climate Prediction Center (CPC) morphing algorithm (CMORPH; Joyce et al. 2004) uses a correlation-matching procedure employing a 5° × 5° template to determine cloud advection at 2.5° × 2.5° spatial resolution from GEO IR imagery. The resulting advection vectors are used to “morph” MW precipitation patterns between sensor overpasses, with MW rainfall rates linearly interpolated along cloud-advection streamlines. Joyce et al. (2004) note that cloud-top advection always closely corresponds to rainfall advection, with some IR features rapidly streaming off from precipitating systems to give average cloud-advection speeds 2–4 times faster than rainfall advection. They overcome this discrepancy by applying an empirical correction to the cloud displacement vectors before using them to morph the rainfall. CMORPH performs well in comparison to other multisensor approaches. However, especially in rapidly changing convective regimes, linear interpolation between MW overpasses may be insufficient to represent a developing precipitating system accurately. To accommodate this type of regime fully, a satellite rainfall algorithm needs to derive additional storm development information from the GEO imagery.
A significant number of GEO algorithms incorporate some information related to cloud dynamics into the estimation process. Rapid cloud development is frequently associated with convection, and many algorithms make use of changes in cloud temperature or area between successive images to distinguish between growing and dissipating systems (Bellerby et al. 2000; Stout et al. 1979; Wu et al. 1985). The Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks with a Cloud Classification System (PERSIANN-CCS) algorithm (Hong et al. 2004) uses a statistical analysis of cloud patch textures to estimate their position within the cloud life cycle, whereas other techniques have employed whole-cloud tracking to vary cloud-area/rainfall relationships throughout cloud histories (Augustine et al. 1981; Griffith et al. 1981; Woodley et al. 1980). Horsfield (2006) developed a cloud-patch water balance model for convective systems. Following Machado et al. (1998) and Kuo (1974), the model employed patch-area expansion as a proxy for convective updraft strength and divided water vapor in the convective column into two fractions: one yielding immediate convective rainfall and the other entering storage in the stratiform anvil to rain out at a steady rate. The model proved effective at reproducing the area-total rainfall dynamics for individual cells, provided sufficient calibration information was available. However, it could not handle cells that split and merge.
Data assimilation techniques are established procedures in hydrology and numerical weather forecasting. These approaches update the process states in a sequential estimation system using available new observations, enabling enhanced process states as well as improved output variables to be estimated in real time. The Kalman filter is a data assimilation technique widely used in hydrological modeling and numerical weather forecasting (Anderson and Anderson. 1999; Anderson 2001; Liu and Gupta 2007; Slater and Clark 2006; Walker et al. 2002). The Global Satellite Mapping of Precipitation (GSMaP) algorithm advects MW estimates using cloud motion vectors estimated for the past hour from IR cloud imagery and then adjusts the propagated MW rainfall field using a Kalman filter (Okamoto et al. 2005). Here advected MW rainfall is assigned the role of system state variable while GEO IR brightness temperatures provide the observations used to adjust the system state, yielding updated rainfall estimates. GSMaP generates 1-h rainfall estimates at a 0.1° spatial resolution.
The LMODEL technique is based on a conceptual model of cloud development that is forced by estimated precipitable water fluxes and cloud dispersal rates derived from GEO imagery. The semi-Lagrangian mass balance model is run at full GEO pixel resolution and then its outputs are aggregated to a coarser spatiotemporal resolution before being used. The LMODEL rainfall estimation algorithm operates in three stages, with each stage yielding successively improved products.
Stage 1 runs the unadjusted cloud development model using seasonally derived calibrations.
Stage 2 compares stage 1 outputs to rainfall estimates from available MW overpasses to derive a correction ratio. This ratio is interpolated between MW overpasses along cloud-advection streamlines and then it is used to scale local precipitable water inputs to the model.
Stage 3 locally adjusts model state variables at each MW overpass using a Kalman filter.
The model is initially calibrated and then later updated using an adjusted MW dataset that attempts to correct empirically for the combined effects of measurement timing mismatch, geolocation error, cloud parallax, and rainfall shear.
The prototype LMODEL algorithm was developed using GEO IR data extracted for a window covering the conterminous United States (CONUS) for two periods—July–August 2006 and February–March 2007—from the CPC full-resolution IR dataset (Climate Prediction Center 2008a; Janowiak et al. 2001). This is a 0.04° spatial resolution 30-min composite of available geostationary IR (∼11 μm) imagery, with individual satellite contributions corrected for zenith angle dependence to reduce interplatform discontinuities. Over the study area, these image data originate from the Geostationary Operational Environmental Satellite East (GOES-E) and West (GOES-W). Corresponding microwave data were obtained from the CPC merged microwave dataset (Climate Prediction Center 2008b), a composite dataset combining data from the Defense Meteorological Satellite Program Special Sensor Microwave Imager (DMSP SSM/I), the Polar Orbiting Environmental Satellite Advanced Microwave Sounding Unit B (POES AMSU-B), the Aqua Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E), and the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) instruments, interpolated to a common 0.08° spatial resolution and 30-min temporal resolution (Ferraro 1997; Ferraro et al. 2000; Kummerow et al. 2001; Weng et al. 2003). Hourly, 0.04° resolution, composite National Centers for Environmental Prediction (NCEP) stage 2 gauge-corrected radar rainfall analyses from the National Oceanic and Atmospheric Administration (NOAA)/NCEP/Environmental Modeling Center (EMC) provided an independent validation dataset (Lin and Mitchell 2005).
d. High-resolution 2D cloud tracking
Cloud-advection tracking for LMODEL employs the mesh-deformation algorithm of Bellerby (2006). This algorithm starts by draping coarse-resolution rectangular meshes over a GEO image and its immediate predecessor in time sequence (Fig. 1a). A rectangular-window, translational, correlation-matching procedure then deforms the rectangular mesh covering the preceding image into a convex quadrilateral mesh, optimizing the correspondence between the two images at and around equivalent mesh nodes (Figs. 1a,b). The meshes over both images are interpolated to twice their previous spatial resolution (Fig. 1c) and the correlation-matching procedure is repeated, this time taking into account local distortions represented by the nonrectangular mesh (Figs. 1d–f). Incorporating these local distortions enables the tracking algorithm to accommodate rotation and shear effects in addition to translations. The interpolation and matching stages iterate until the mesh resolutions reach the original image GEO resolution. Later iterations of the algorithm interpolate both images to 4 times their original spatial resolution using bicubic splines before starting the correlation-matching procedure. At the end of the final iteration, each pixel location x in the main image is associated with an equivalent location xt−1(x) in the same cloud in the preceding image (Fig. 2). The algorithm is additionally capable of deriving the reverse mapping xt+1(x), relating each pixel in the preceding image to an equivalent location in the current image from the same pair of final meshes without rerunning the tracking procedure. The 2D cloud-advection algorithm is computationally efficient and has been shown to be both robust in the presence of image rotation and shear and accurate to within 2–3 pixels (Bellerby 2006).
It is possible to use the cloud-advection algorithm to quantify cloud expansion and contraction by comparing corresponding cell areas in the two meshes. This was implemented using the total area of the four cells surrounding a given mesh node (Fig. 2). The resulting area change ratio A(x, t) is greater than one for expanding clouds and less than one for contracting clouds. The accuracy of this product is restricted by the precision to which the correlation-matching procedure can locate mesh nodes, which in turn is limited to less than one-quarter pixel by the use of fourfold-interpolated GEO imagery in later iterations of the advection-tracking algorithm.
e. Collocation of clouds and rainfall
Both calibration and updating of the cloud development model require cloud imagery to be compared to coincident MW rainfall data. However, a number of factors make a direct comparison between MW satellite rainfall estimates and GEO cloud imagery very difficult to achieve in practice. The most significant problem is measurement timing: an MW sensor overpass may occur at any time between successive GEO images. Moreover, GEO image collection is not itself instantaneous. Even if data collection were precisely synchronized, difficulties with geolocation error, cloud parallax, and cloud/rain shear would remain (Vicente et al. 2002). The spatial resolution of MW rainfall products is generally less that that provided by GEO imagery, and it is tempting to adopt a low spatial resolution for the cloud development model, alleviating difficulties with GEO/MW collocation. However, low model resolutions create significant problems with cloud tracking. For the 30-min GEO image data used in this study, a slow-moving cloud may change location by only ∼2–3 GEO pixels between successive images. If the model spatial resolution were reduced to less than 2 GEO pixels, then cloud motion would become difficult to represent, with an average cloud moving less than one grid cell in a single time step. This problem would be even more significant for 15-min GEO imagery. Schemes in which model grid cells for the current time step are fractionally associated with several cells in the preceding time step will tend to give rise to aliasing effects. For example, a slowly moving one-cell-wide cloud will spread over a growing area as the fractional partition scheme incorrectly divides the cloud state variables among an increasing number of model cells with successive time steps. The LMODEL design avoids these difficulties by operating a full GEO pixel resolution, employing an empirical correction for collocation errors between MW and GEO datasets and spatially and temporally aggregating model outputs before they are used as operational rainfall products.
The LMODEL empirical approach to MW/GEO collocation moves interpolated MW rainfall estimates to GEO clouds. MW rainfall estimates are linearly interpolated to GEO image resolution, and the following procedure is applied at each GEO pixel location: Pixels within a set radius of the given pixel location (16 pixels in the prototype implementation) are identified in both the GEO IR (Fig. 3a) and interpolated MW rainfall (Fig. 3b) images, and the two sets of pixels are independently ranked from lowest to highest rainfall rate and highest to lowest IR brightness temperature (Fig. 3c). The corrected rainfall value for the current pixel location is then taken from the same position within the ranked list of increasing rainfall values as the central IR pixel occupies within the ranked list of decreasing brightness temperatures (Figs. 3c,d). While the resulting product is gridded at GEO pixel resolution, its spatial variability is determined by the coarser MW sensor resolutions. This limitation in effective spatial resolution must be taken into account when calibrating or updating the model against this dataset. In addition, it should be noted that the collocation procedure slightly reduces the spatial coverage of the MW swaths and does not extend the MW coverage beyond the original overpasses.
The collocation algorithm moves the rainfall to the clouds and in particular moves rainfall maxima to IR minima. Many satellite rainfall techniques assume a monotonic relationship between decreasing IR brightness temperature and increasing rainfall rates (Marzano et al. 2004; Todd et al. 2001; Vicente et al. 1998). These techniques rest primarily on the assumption that the coldest IR brightness temperatures are associated with the overshooting tops of heavily precipitating convective systems. Although IR minima do not always coincide with the exact location of maximum rainfall on the ground (Adler and Mack 1986), a number of studies have noted a strong quantitative relationship between IR minimum brightness temperatures and the peak or total rainfall rates for the cells in which they reside (Adler and Negri 1988; Hong et al. 2004; Horsfield 2006). It is therefore reasonable to associate the dynamics of rainfall maxima with those of local brightness temperature minima, providing that it is acknowledged that pixel resolution model outputs may be spatially displaced with respect to observed surface rainfall.
f. Cloud development model (LMODEL stage 1)
The cloud development model was designed to form the core of an operational satellite rainfall system based on model updating. In this context, the model had to be as simple and as linear as possible while providing a sufficiently realistic representation of cloud process in both convective and stratiform regimes. Following Horsfield (2006), the model adopts the assumption of Kuo (1974) that a fraction (b) of the water condensed in convective updrafts (Mu) enters storage in the cloud (w) while the remaining fraction immediately precipitates out as convective rainfall (Fig. 4). The model further assumes that nonconvective processes (frontal systems, among others) contribute a flux Ms into the cloud precipitable water (w). Over unit time, a fraction a of w is converted into stratiform rainfall with a further fraction e evaporating or otherwise dissipating (a complete list of symbols used by the LMODEL algorithm is given in Table 1). These assumptions yield the following model for the surface rainfall rate R based on a Lagrangian continuity equation for w:
R is the surface rainfall rate,
Rc is immediate (mainly convective) rainfall,
Rs is delayed (mainly stratiform) rainfall,
v is the local 2D cloud-advection velocity,
w is the total cloud precipitable water content,
Mu is the precipitable water flux in convective updrafts,
Ms is the accumulation of w as a result of nonconvective processes,
a is the fraction of w becoming stratiform rain in unit time,
b is the fraction of Mu contributing to w,
e is the fraction of w lost to evaporation in unit time,
D/Dt≡∂/∂t + v · ∇ is the Lagrangian (along-stream) derivative with time, and
∇ · v quantifies cloud convergence/divergence.
To solve this system numerically, Eqs. (1) are approximated using a semi-Lagrangian discrete system operating at GEO IR pixel resolution:
F(x, t) = (1 − b)Mu,
G(x, t) = bMu + Ms,
B(x, t) = (1 − a − e),
A(x, t) is the local change in cloud area,
x is the pixel location,
xt−1(x) is the equivalent pixel location of the same cloud in the preceding image, and
t is the time in units of GEO image time steps.
Rainfall (R), cloud precipitable water (w), and the precipitable water flux terms (F and G) are assumed to adopt zero values in cloud-free areas, identified with IR brightness temperature (Tb) above a threshold value, Tcld, set at 265 K in the prototype implementation.
Cloud divergence is quantified using the area-change product A. For expanding clouds, the mass balance between successive model cells is approximated by dividing precipitable water by A, giving (1 − ∇ · v) ≈ 1/A. Thus, a cloud that expands to twice its previous size will have half the precipitable water per model cell, prior to other forcing factors taking effect. Dispersing clouds may be associated with apparently contracting areas in GEO imagery but are unlikely to be associated with actual areas of converging cloud precipitable water. For this reason, the divisor based on A is forced to remain greater than or equal to 1.0, meaning that for approximating the mass balance, cloud expansion is quantified but cloud contraction (as opposed to dispersal) is assumed not to happen. Because the cloud-tracking algorithm has limited accuracy at subpixel resolutions, the gridded model state (w) field for the preceding time step is not interpolated prior to its use for the current time step. This contrasts with the semi-Lagrangian integration schemes employed by more conventional atmospheric models (Staniforth and Côté 1991).
The cloud development model described by (2) includes three forcing terms: F, G, and B. Here F represents only convective fluxes, G incorporates a mixture of convective and stratiform inputs, and B is a decay ratio that models cloud dissipation. These terms are estimated from two GEO satellite cloud indices: IR brightness temperature and its rate of change along an advection streamline,
Here Tb(x, t) is average brightness temperature over a 3 × 3 pixel window centered on x. Neighborhood averages are used to reduce the effects of the residual 1–2 pixel noise in the advection-tracking algorithm. If the previous pixel is cloud free, then the temperature change is computed from the cloud/no-cloud threshold, not from the actual preceding brightness temperature. These indices are likely to be somewhat more effective at resolving convective development processes (Mu) than the purely stratiform processes (Ms), such as frontal systems, because the latter do not maintain such a strong correlation between visible cloud growth and precipitable water input.
The cloud decay term B is derived as a piecewise linear function of S2, whereas the two flux terms F and G are derived as polynomial expressions of a simple rainfall estimate R̂ that combines S1 and S2 using the histogram-matching technique of Marzano et al. (2004),
The cloud development model is seasonally calibrated in two steps against corrected MW data from archived overpasses. The first step computes R̂ and B. Here R̂ is derived for growing clouds only and is set to zero for decaying clouds (S2 > 0); B is calculated by identifying those relatively rare pairs of MW observations in the calibration dataset that are available for two consecutive GEO time steps in a decaying cloud and then modeling the associated rainfall decay ratios as a function of S2:
Here E[|] is a conditional expected value computed over all suitable pairs of data points in the seasonal calibration dataset and R* is the 0.04° corrected interpolated MW rainfall. Notice that Eq. (5) is robust with respect to uncertainties introduced by interpolating MW rainfall, whereas the transformation of R̂ by Eq. (4) will compensate for any scaling errors introduced by its computation at full GEO pixel resolution.
The second step in the calibration process derives (θ1, θ2, θ3) and (ϕ1, ϕ2, ϕ3) using multilinear regression (Press et al. 1992). In this step, model outputs and corrected MW data are aggregated to a coarse (0.12°) resolution before comparison to overcome difficulties introduced by MW interpolation. If the parameter a can be assumed to be constant, then computing aG as opposed to G simplifies the overall calibration procedure by eliminating the need to determine a specific value for a. This is a slightly problematic assumption in terms of cloud physics, but it should have a relatively modest effect on model outputs because B is specifically constructed to model stratiform rainfall decay.
g. Local scaling of precipitable water fluxes (LMODEL stage 2)
LMODEL implements a two-stage scheme to locally update model input fluxes and then model states against rainfall data from MW satellite overpasses. The first stage of updating computes the ratio C between estimated and MW-observed rainfall and uses this to locally scale the precipitable water fluxes F and G,
where R* is the corrected MW rainfall. Here C is computed over a 0.28° area Ω, centered on the pixel being updated, with each mean rainfall value offset by a small increment γ (0.1 mm h−1). This serves to overcome the spatial resolution mismatch between GEO and MW data and to avoid stability problems resulting from the comparison of high-resolution noisy estimates, especially at low rainfall rates. Here C is further constrained to lie within the range of 0.2–5.0, to cater for problems of zero rainfall and avoid extreme values.
Once C has been determined at MW overpasses, it is interpolated to all GEO time steps along advection streamlines using simple lognormal kriging (Rendu 1979). This technique requires a covariance function c(t1, t2) for the log-transformed field Z = ln C. Covariance functions were calculated for July–August 2006 and February–March 2007 from the test dataset. Both of these curves could be effectively modeled using an exponential function
Notice that c(t1, t2) → 0 as |t1 − t2| → ∞ implies Z → 0 as |t1 − t2| → ∞. This means that at long intervals from an MW overpass, C will have a value of 1. The winter data display a more persistent parameter covariance with time, reflected in an exponential decay coefficient α = 0.08 as opposed to α = 0.18 for the summer dataset. This is consistent with the predominantly convective nature of the summer rainfall regime that would be expected to display shorter correlation distances than the more stratiform winter regime.
h. Model state adjustment using a Kalman filter (LMODEL stage 3)
Stage 2 of the LMODEL algorithm locally scales precipitable water fluxes to improve the match between model outputs and available MW estimates. However, modifying fluxes alone is not sufficient to optimally match model outputs to observations because these outputs are also dependent on precipitable water states. This is particularly true in decaying clouds where the input fluxes are zero. LMODEL stage 3 applies a Kalman filter to update model state variables whenever rainfall information is available from an MW overpass.
A Kalman filter is an optimal filter that improves both the model states and outputs of a linear Gaussian system at time steps for which external observations of model output variables are available. Although the basic nonstationary Kalman filter is guaranteed to yield optimal estimates only in the case of Gaussian noise, the robustness of the procedure makes it useful in situations where this assumption does not fully hold. Advanced Bayesian filters have been devised specifically to cater to nonlinear non-Gaussian systems (Arulampalam et al. 2002; Andrieu et al. 2003; Doucet et al. 2000; Moradkhani et al. 2005). However, these algorithms introduce a considerable additional computational overhead that is problematic in an operational estimation procedure designed to process large volumes of data. As demonstrated by the case studies presented in Hsu et al. (2009), a basic Kalman filter provides effective state updating for the cloud development model, even in the presence of the mixed, skewed error distributions associated with rainfall estimates. Although it is possible to apply nonlinear transformations to model states and outputs to bring the modeled system closer to the ideal assumed by the Kalman filter, preliminary investigations along these lines introduced significant difficulties and did not yield any significant improvement in updating performance.
A standard nonstationary Kalman filter was implemented to update the precipitable water state variable w whenever a corrected MW observation was available. In contrast with LMODEL stage 2, modeled and observed rainfall was compared at the full 0.04° GEO pixel resolution. The uncertainty introduced by interpolating MW data to a higher spatial resolution was incorporated into the measurement error term allowed by the Kalman filter algorithm. A Kalman filter implementation additionally requires estimates to be made of the magnitude of the Gaussian noise introduced at each model step and an error covariance for the initial state. These values are not easy to quantify. The prototype implementation employed an error covariance increment with twice the magnitude of measurement error covariance and an initial state error covariance equal to the measurement error covariance. Although the error model used in this stage of the updating procedure may not be entirely optimal, this did not impede the effective operation of the Kalman filter.
Two independent LMODEL calibrations were derived for the CONUS dataset: July–August 2006 and February–March 2007. These were used to generate unmodified (stage 1) LMODEL outputs for July–August 2006 and February–March 2007. Model outputs were generated at a 30-min temporal resolution and 0.04° spatial resolution and then aggregated to coarser resolutions for validation against ground radar data.
As described earlier, the model calibration determines three input functions: F, G, and B. Figure 5 shows B(S2) curves calculated using the test dataset for July–August 2006 and February–March 2007. Given the considerable difference in precipitation regime between these two periods, these curves are remarkably similar. Figure 6 shows F(S1, S2) and aG(S1, S2) derived for the July–August and February–March datasets. There is a clear difference between the summer (largely convective) regime and the winter period dominated by stratiform rainfall. This is most noticeable in the presence of significantly higher values for F in the summer period. Figure 7 plots F/(F + aG) as a function of F + aG for each regime. Here there is a remarkable similarity between the two curves, especially if MW coverage problems in the winter dataset are assumed to introduce errors into the polynomial series coefficients. This suggests that the seasonal variations in F and G may be primarily related to changes in the local cloud and rainfall statistics determining their total magnitude, whereas their relative magnitudes may be relatively invariant. The similarities in both the relative values of F and G and the absolute values of B between the summer and winter regimes suggest that the model is at least partially succeeding in its goal of separately representing convective and stratiform rainfall processes.
Figure 8 shows the spatial path of a selected advection streamline and plots model inputs, state, and output along this streamline. The cloud-advection streamline is not a wind-based product and frequently oscillates to the right or left of a smooth trajectory as it follows equivalent points through expanding and contracting clouds. Both model inputs and outputs along this trajectory display a considerable degree of noise, attributable to small errors in advection tracking and to unresolved precipitation processes. However, there remains a clear relationship between modeled and observed rainfall that may be enhanced by aggregating the product to a lower spatiotemporal resolution. The proportion of stratiform rainfall generated by the model is relatively high, and it is clear that in some convective systems the storage term Rs in the cloud development model is serving to provide a partial time delay for convective processes in addition to representing the accumulation of stratiform anvils.
Figure 9 shows instantaneous MW and LMODEL stages 1, 2, and 3 rainfall maps for a single 30-min time step occurring at 0415 UTC 26 August 2006 when heavy storms appeared near the border of Nebraska, Iowa, Missouri, and Kansas (Fig. 9a). The LMODEL stage 1 rainfall map shows that the spatial coverage of rainfall was well captured by fixed-parameter LMODEL outputs but that heavy rainfall intensities in the storm were underestimated. The LMODEL stage 2 rainfall map, on the other hand, corrects the underestimation over the high-intensity regions while overall rainfall areas remain larger than those shown in the MW map. The stage 3 rainfall map shows not only improved rainfall intensities but also improved rain areas. This suggests that, as intended, the parameter-updating stage serves to improve the representation of convective cores, whereas the Kalman filter refines wider rainfall areas by adjusting the precipitable water available to create stratiform rainfall.
LMODEL validation is discussed in Part II of this paper (Hsu et al. 2009). However, two sets of validation statistics are included here because they address specific elements of the model. The first set of statistics was compiled to assess the effectiveness of the MW/cloud collocation procedure. Table 2 compares LMODEL products calibrated against corrected MW data to those generated by a model run calibrated against MW satellite rainfall data that had not been further processed. Notice that the skill score is the percentage of correctly identified events (rain or no-rain defined by a 0.1 mm h−1 threshold.) The collocation procedure clearly improves the optimality of the model calibration.
As mentioned earlier, aspects of the model calibration show a remarkable degree of commonality between summer and winter calibration periods. To explore this further, a new set of winter rainfall products were generated using coefficients for F and aG derived for July–August combined with R̂ recalculated from the February–March dataset (Table 3). Although this partially cross-calibrated product does not perform quite as well as the model run calibrated entirely using winter data, the margin between the two products is not very large. This suggests that the transferred calibration functions may be to some degree universal and indicative of underlying physical processes—or at least their relative magnitudes.
A simple cloud development model has been presented, together with a set of empirical methodologies that enable its calibration and dynamic updating against imperfectly collocated MW satellite rainfall data. The semi-Lagrangian conceptual model incorporates separate representations of convective and stratiform processes, and there is some indication that these process representations are at least partially transferable between meteorological regimes. However, it is clear that the division between convective and stratiform processes in the model does not fully correspond to that prevailing in real precipitating systems, with the model apparently employing its “stratiform” precipitable water state variable to introduce a time delay into some convective processes. The model operates at full geostationary pixel resolution, but it generates useful rainfall products at spatial resolutions somewhat coarser than this resolution. The spatial aggregation of rainfall outputs is not only required to reduce noise attributable to unresolved precipitation processes but is also necessary to average out the effects of satellite geolocation error, cloud parallax, and rainfall shear. These uncertainties are accommodated in one direction (calibration data to GEO imagery) by collocating the rainfall to the clouds. However, no inverse procedure exists to correctly position high-resolution model rainfall outputs to correspond to expected surface rainfall.
The cloud development model is modular and may be straightforwardly extended to incorporate additional satellite inputs. Multispectral GEO imagery is a clear candidate to improve model performance because additional channels may be used to discriminate nonprecipitating cirrus and reduce other misidentifications (Ba and Gruber 2001; Bellerby et al. 2000; Capacci and Conway 2005; Turk and Miller 2005). In addition, cloud texture measures have been shown to be effective inputs to GEO satellite rainfall algorithms (Bellerby 2004; Hong et al. 2004).
The LMODEL updating procedures incorporate two major stages, both of which use the limited MW rainfall measurements available from LEO satellites. The first stage involves the temporal interpolation of a scaling parameter using data smoothing and geostatistical techniques based on a covariance model describing the decorrelation of rainfall anomalies with time along advection streamlines. The covariance function varies between the summer and winter seasons, where convective and stratiform cloud precipitation systems, respectively, predominate. The second stage updates precipitable water states using a Kalman filter and appears to be robust with respect to uncertainties and nonnormality in its associated error models, as evidenced by validation statistics provided in Part II of this paper (Hsu et al. 2009). However, the construction and parameterization of the updating procedures should be studied in more detail. For example, the current Kalman Filter implementation assumes process noise to be Gaussian. Future work will further explore the uncertainty of precipitation estimation arising from non-Gaussian process.
Although they take some time to describe, the LMODEL cloud development model and updating procedures are not computationally demanding and may be efficiently implemented on standard computer hardware to process monthly datasets in a matter of hours.
The authors would like thank NOAA CPC, NESDIS, NCEP, and the NASA TRMM program for providing the satellite and radar data in the study. Satellite and radar data processing was performed by Dan Braithwaite, whose efforts are very much appreciated. Anonymous referees are thanked for their helpful comments. Partial funding support for this research was made available from the NASA TRMM (Grant NAG5-7716) and NEWS (Grant NNX06AF93G) programs.
Corresponding author address: T. J. Bellerby, Department of Geography, University of Hull, Hull HU6 7RX, United Kingdom. Email: email@example.com