1. Introduction
Recent advances in the development of sequential land data assimilation techniques have demonstrated that remote sensing observations of surface soil moisture can improve the dynamic representation of root-zone soil moisture in hydrologic models (Houser et al. 1998; Walker et al. 1999; Montaldo et al. 2001; Reichle and Koster 2005). However, much of the available evidence is based on identical twin experiments using synthetically generated, and artificially perturbed, measurements (e.g., Reichle et al. 2002a, b; Crow 2003). These experiments, while useful diagnostic tools for evaluating filter efficiency, typically simplify or avoid a number of key complexities facing operational efforts to assimilate spaceborne soil moisture observations (Margulis et al. 2002; Crow and Wood 2003; Reichle and Koster 2004). One common assumption in synthetic experiments is that the source and magnitude of model errors are perfectly known in a statistical sense. In reality, error in hydrologic model predictions originates from a wide variety of sources and manifests itself within multiple model state variables. Consequently, complete error information required by sequential data assimilation filters is almost never available in operational settings. Statistical analysis of filter innovations, defined as the observed difference between predicted and actual observations, provides a valuable tool for diagnosing the misspecification of model error (Dee 1995; Reichle and Koster 2002). Several online procedures for estimating model error parameters, based on the analysis of filter innovations, have been introduced for geophysical models (e.g., Mitchell and Houtekamer 2000). However, such adaptive filtering techniques have not been widely applied to hydrologic models.
The ensemble Kalman filter (EnKF) is a sequential data assimilation technique that dynamically updates model error covariance information by generating an ensemble of model predictions consisting of individual model realizations independently perturbed by some type of assumed model error. State forecast error covariance information is sampled from this ensemble and used to update model state predictions via the Kalman filter update equation. The model error assumed in order to generate the ensemble may not be an accurate reflection of actual model error. If this is the case, the accuracy of predictions made by the filter will be degraded. Relative to other types of geophysical models, errors in hydrologic models originate from a particularly wide range of sources including poorly specified initial conditions, errors in specified atmospheric forcings, inappropriate parameter choices, neglect of subgrid land surface heterogeneity, and inaccurate model physics. The flexibility of the ensemble Kalman filter with regard to the type and source of model error is frequently cited to support its use in hydrologic data assimilation (see, e.g., Crow and Wood 2003). However, such flexibility cannot be fully exploited if the structure and source of model errors are not constrained in some realistic way. In addition, while the sensitivity of the extended Kalman filter (EKF) to incorrect model error specification has been frequently studied since Ljung (1979), relatively little is currently known about the impact of poorly specified model error on the application of the EnKF to hydrologic problems.
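As an illustration of the update described above, the following is a minimal sketch of a single EnKF analysis step using perturbed observations (Burgers et al. 1998). It is not the implementation used in this study: the function name `enkf_update`, the linear observation operator `H`, and the two-variable (surface and root-zone soil moisture) example values are assumptions made only for illustration.

```python
import numpy as np


def enkf_update(ensemble, obs, obs_err_std, H, rng):
    """One perturbed-observation EnKF analysis step (illustrative sketch).

    ensemble    : (n_state, n_members) array of forecast states
    obs         : (n_obs,) observation vector
    obs_err_std : (n_obs,) observation error standard deviations
    H           : (n_obs, n_state) observation operator, assumed linear here
    rng         : numpy.random.Generator used to perturb the observations
    """
    n_members = ensemble.shape[1]
    # Forecast error covariance sampled from the ensemble spread.
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = anomalies @ anomalies.T / (n_members - 1)
    R = np.diag(obs_err_std ** 2)
    # Kalman gain K = P H^T (H P H^T + R)^-1.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    # Each member assimilates its own randomly perturbed copy of the observation.
    noise = obs_err_std[:, None] * rng.standard_normal((obs.size, n_members))
    return ensemble + K @ (obs[:, None] + noise - H @ ensemble)


# Hypothetical example: the surface state is observed, the root-zone state is not.
rng = np.random.default_rng(0)
surface = 0.25 + 0.03 * rng.standard_normal(200)
root_zone = 0.25 + 0.8 * (surface - 0.25) + 0.01 * rng.standard_normal(200)
forecast = np.vstack([surface, root_zone])
H = np.array([[1.0, 0.0]])
analysis = enkf_update(forecast, np.array([0.30]), np.array([0.02]), H, rng)
print(forecast.mean(axis=1), analysis.mean(axis=1))
```

Because the sampled covariance couples the two states, the unobserved root-zone state is adjusted along with the observed surface state. The size of that adjustment depends on the spread of the ensemble, and therefore on the model error assumed when the ensemble was generated.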
This analysis is based on using the EnKF to carry out a series of synthetic identical twin experiments whereby errors used to generate hydrologic model ensembles (and calculate state forecast error covariance information for the EnKF) are statistically distinct from errors used to originally perturb the model. The impact of inaccurate model error assumptions on the quality of integrated root-zone soil moisture predictions made by the EnKF is evaluated, and the potential for adaptively improving filter results based on the statistical analysis of filter innovations is examined.
2. Approach
The approach is based on a series of synthetic filtering experiments aimed at the assimilation of daily spaceborne surface soil moisture retrievals into a hydrologic model operating over the approximately 1000-km2 Big Cabin Creek watershed in northeastern Oklahoma. Figure 1 contains a schematic representation of the experimental approach. Synthetic identical twin experiments are based on designating a single hydrologic model realization as a “truth run” that produces a set of “truth states” (i.e., soil moisture and temperature values at various depths) and, when processed through an observation operator, “truth observations” (i.e., surface soil moisture or corresponding observed variable). Because of modeling and observing errors, neither quantity can be known with perfect certainty. In a synthetic twin experiment, this uncertainty is reflected through the synthetic application of modeling error to truth states (to produce “modeled states”) and synthetic observing errors to truth observations (to produce “actual observations”). This error will be referred to as “actual error.” “Open loop” simulations are those in which actual errors are implemented but not corrected. The EnKF methodology attempts to correct for the impact of actual model error (i.e., correct modeled states back to truth states) by assimilating observations related to model states. It does so by assuming a given statistical structure for assumed model error and using this assumption to construct a Monte Carlo–based ensemble of model state and observation predictions from which the error covariance information required by the Kalman filter update equation is estimated. The best filtering results will be obtained when the assumed error used to construct the ensemble is an accurate statistical representation of the actual error used to originally perturb the truth simulation. 
However, in real operational settings, little information may be available concerning the statistical properties of actual model error. This lack of information may degrade the ability of the filter to correct model error.
This article utilizes the synthetic twin methodology outlined in Fig. 1 to study cases in which “assumed model error” on the left-hand side of Fig. 1 differs statistically from “actual model error” above it. The analysis implicitly neglects the impact of incorrect assumptions concerning observing error by assuming observation error statistics to be perfectly known. Such reductionism is necessary for an initial examination of these issues. However, an integrated analysis capturing the interplay between misspecified observing and modeling errors will eventually be necessary. Specific aspects of our methodology are discussed in the following subsections.
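The logic of Fig. 1 can be sketched with a deliberately simple scalar model. The example below is a toy under stated assumptions, not TOPLATS or the experimental code of this study: the linear decay model, the parameter values, and the function name `run_twin` are hypothetical, and the actual model error is assumed to enter as an additive perturbation shared by the open-loop run and every ensemble member.

```python
import numpy as np


def run_twin(q_actual, q_assumed, n_steps=2000, n_members=50,
             a=0.95, obs_std=0.02, seed=0):
    """Toy identical twin experiment; returns (open-loop rmse, EnKF rmse).

    q_actual  : std of the error actually applied to the model (hypothetical)
    q_assumed : std of the error the filter assumes when building its ensemble
    """
    rng = np.random.default_rng(seed)
    forcing = 0.01 * rng.random(n_steps)                  # forcing known to all runs
    actual_err = q_actual * rng.standard_normal(n_steps)  # actual model error
    truth = 0.0
    open_loop = 0.0
    members = np.zeros(n_members)
    sq_open = sq_enkf = 0.0
    for k in range(n_steps):
        # Truth run: unperturbed model.
        truth = a * truth + forcing[k]
        # Open loop: actual error applied but never corrected.
        open_loop = a * open_loop + forcing[k] + actual_err[k]
        # Ensemble forecast: actual error plus member-specific assumed error.
        members = (a * members + forcing[k] + actual_err[k]
                   + q_assumed * rng.standard_normal(n_members))
        # Synthetic observation of the truth state.
        obs = truth + obs_std * rng.standard_normal()
        # Scalar Kalman update with perturbed observations.
        P = members.var(ddof=1)
        K = P / (P + obs_std ** 2)
        perturbed = obs + obs_std * rng.standard_normal(n_members)
        members = members + K * (perturbed - members)
        sq_open += (open_loop - truth) ** 2
        sq_enkf += (members.mean() - truth) ** 2
    return np.sqrt(sq_open / n_steps), np.sqrt(sq_enkf / n_steps)


# Assumed error matches actual error, then is underestimated by a factor of 10.
open_loop_rmse, matched_rmse = run_twin(q_actual=0.01, q_assumed=0.01)
_, underestimated_rmse = run_twin(q_actual=0.01, q_assumed=0.001)
print(open_loop_rmse, matched_rmse, underestimated_rmse)
```

In this toy setting, an assumed error that is too small leaves the ensemble underdispersed, so the filter gives little weight to the observations and corrects less of the actual error. The experiments in this article examine the analogous degradation for a realistic hydrologic model and for errors that differ in source as well as magnitude.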
a. The ensemble Kalman filter
b. TOPLATS hydrologic modeling
c. Site location and model evaluation
All hydrologic modeling was based on application of TOPLATS to the Big Cabin Creek watershed in northeastern Oklahoma from 1 October 1996 to 30 September 1999. The watershed has very little hydrologic regulation or diversion and is located in a region that has seen extensive fieldwork aimed at the development of remote sensing retrieval algorithms for surface soil moisture. Upstream of the USGS gauging station near Big Cabin, Oklahoma, the watershed has an area of approximately 1170 km2, which is roughly equivalent to the spatial resolution defined through the 3-dB antenna gain (Drusch et al. 1999) of current and next-generation spaceborne radiometers designed to retrieve surface soil moisture (Entekhabi et al. 2004). TOPLATS predictions within the basin were forced by meteorological observations (i.e., rainfall, air temperature, incoming radiation, wind speed, and relative humidity) from the Vinita, Oklahoma, mesonet site located near the center of the basin. Soil hydraulic parameters were based on the dominant soil texture found within the basin (loam) and the soil texture/hydraulic parameter lookup table presented in Rawls et al. (1982). However, values of θsat and surface saturated hydraulic conductivity (Ksat) were tuned slightly (from 0.462 to 0.420 and 3.67 × 10−6 to 1.60 × 10−6 m s−1, respectively) in order to match local soil moisture observations. Vegetation characteristics were varied on a monthly basis according to the typical seasonality of the dominant land-cover type in the basin (grassland). Values for f and Qo (f = 3 and Qo = 100 m3 s−1) were obtained through manual calibration of TOPLATS against USGS streamflow observations.
Comparisons between calibrated TOPLATS results and observed streamflow and surface soil moisture are shown in Fig. 2. Streamflow observations and predictions plotted in Fig. 2a are 30-day moving averages of daily values. TOPLATS soil moisture predictions plotted in Fig. 2b are for the interval of the STI that corresponds to local topography near the Vinita, Oklahoma, mesonet site. Because Oklahoma mesonet soil moisture observations are not available prior to 2002, soil moisture intercomparisons are based on simulations conducted during a later time period (1 October 2002 to 30 September 2003). A good calibrated fit was achieved for both streamflow and surface soil moisture except for soil moisture predictions between December 2002 and April 2003. During this time period, TOPLATS systematically underpredicts surface soil moisture by up to 0.10 cm3 cm−3. This error is likely due to the misrepresentation of the basin-averaged water table depth (
d. Modeling error approach
The experimental methodology (see section 2 and Fig. 1) is based on designating a single, unperturbed model realization as truth. A statistical representation of actual model error is specified by selecting “actual” values for the statistical parameters describing its components (w
3. Results
Values plotted in Fig. 3 are based on daily root-mean-square (rms) differences between truth model simulations and both the open-loop (i.e., truth simulations perturbed by actual error) and EnKF filtering cases. EnKF results are based on the case in which both the source (TOPLATS
Assuming knowledge of modeling error is limited only to its source in
a. Impact of incorrect model error assumptions
Adaptive tuning using
Soil moisture assimilation results in Fig. 5 illustrate the impact of representing model error using random perturbations in
Figures 6b and 6c are analogous except for actual error in θunsat (wθunsat = 0.01 cm3 cm−3 h−1) and rainfall (wp = 1), respectively. The vertical line indicates the location of the
First, there exists the possibility of a spurious local minimum where, because of an incorrect assumption concerning the source of modeling error, the tuning of the wrong error type to move
A potential solution to this problem is to perform a broad enough optimization during adaptive filtering that all three possible sources of model error are accounted for, that is, to ensure that the adaptive filter is able to find the model error type associated with globally optimal innovation statistics. However, even if a powerful enough optimization scheme could efficiently locate such a global minimum, a second undesirable possibility for adaptive filtering is that the best innovation statistics (over all three possible error-type assumptions) are associated with optimization of the wrong error type. In this case the best obtainable innovation statistics would not reflect the correct error representation. For instance, in Fig. 6b, the best 40-cm soil moisture rmse results are realized by tuning rainfall error parameters, the correct error source. However, better innovation statistics (but reduced root-zone soil moisture accuracies) arise when either
b. Sensitivity to error magnitude
One concern is the potential sensitivity of results to rather arbitrary choices made concerning the magnitude of actual observing and modeling errors. To examine sensitivity to assumptions concerning the magnitude of observation and modeling errors, results were also generated (but not shown) for the cases of doubled error (from 2% to 4% volumetric) in daily surface-zone soil moisture retrievals and for the cases of baseline 2% soil moisture measurement error and both doubled and tripled values of modeling errors (i.e., wθunsat, w
c. Impact of diffuse error sources
Results in Figs. 3–6 are all based on synthetic simulations where actual (and assumed) modeling error is intentionally restricted to a single source. In reality, land surface modeling errors will likely arise from a broad range of sources. Figure 7 plots results for the case where actual model error arises simultaneously from additive noise in
d. Innovation temporal correlation
In addition to the sampled mean of αk, the temporal correlation coefficient of νk (ρν) provides a diagnostic variable on which to base adjustments to assumed levels of model error. If the EnKF is operating in accordance with its underlying assumptions, including an accurate representation of model error, then the νk time series should be temporally uncorrelated (ρν = 0). A correlated time series can be taken as evidence that model errors are being improperly represented in the filter. To determine if some of the adaptive filtering difficulties encountered in Fig. 6 and Table 1 can be addressed using this additional diagnostic, results in Table 1 were regenerated in Table 2 by tuning the EnKF so that the absolute value of ρν was minimized. Intercomparison of results in Tables 1 and 2 reveals that calibrations based on ρν (Table 2) and
e. Value of ancillary runoff observations
One potential solution for adaptive filtering problems is constraining EnKF results with additional observations. In addition to the assimilation of surface soil moisture, Fig. 5 also examines the case of assimilating runoff observations. Runoff magnitudes are assumed known from streamflow observations within a relative accuracy of 20% and the basin is assumed small enough such that the time lag between runoff generation and streamflow observations can be safely neglected. Actual error is assumed to be due to random noise in θunsat predictions and assumed error to variations in
The most natural way to consider runoff observations is to include them in the observation vector and jointly assimilate both runoff and surface soil moisture observations. Figure 8 replots Fig. 6 for the case of assimilating both surface soil moisture and surface runoff. As in Fig. 6, actual and assumed error is assumed to be restricted to a single error source, and plotted lines demonstrate the relationship between mean normalized innovations
4. Discussion and summary
Because of its Monte Carlo basis, the EnKF can address a wide variety of error sources in geophysical models. This is often cited as a key advantage for its application to hydrologic models where error can arise from a range of sources (Crow and Wood 2003). However, large gaps exist in our knowledge concerning the magnitude and ultimate source of error in hydrologic modeling predictions. Consequently, it is reasonable to expect that the operational application of sequential data assimilation filters to assimilate remotely sensed soil moisture will rely heavily on diagnostic tools like filter innovations to obtain model error statistical information. This analysis examines the potential for using filter innovations to correctly tune model error parameters in such a way that EnKF-based predictions of root-zone soil moisture are optimized.
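A minimal sketch of the kind of innovation diagnostics discussed here is given below. The definitions are assumptions, since the corresponding equations are not reproduced in this section: the innovation is taken to be normalized by its expected standard deviation (forecast variance in observation space plus observation error variance), its mean square is expected to be near one for a well-specified filter, and the temporal correlation is evaluated at a lag of one observation. The function names and the synthetic values are hypothetical.

```python
import numpy as np


def normalized_innovations(obs, forecast_mean, forecast_var, obs_var):
    """Innovations scaled by their expected standard deviation (scalar observations)."""
    return (obs - forecast_mean) / np.sqrt(forecast_var + obs_var)


def lag1_correlation(series):
    """Sample lag-1 autocorrelation coefficient of a time series."""
    centered = series - series.mean()
    return float(np.sum(centered[1:] * centered[:-1]) / np.sum(centered ** 2))


# Synthetic check: forecast errors drawn with the variance the filter assumes.
rng = np.random.default_rng(1)
n = 5000
forecast_var, obs_var = 0.03 ** 2, 0.02 ** 2
truth = np.full(n, 0.25)
forecast_mean = truth + np.sqrt(forecast_var) * rng.standard_normal(n)
obs = truth + np.sqrt(obs_var) * rng.standard_normal(n)

nu = normalized_innovations(obs, forecast_mean, forecast_var, obs_var)
print(np.mean(nu ** 2), lag1_correlation(nu))

# If the filter assumes too little forecast error, the mean square exceeds one.
nu_under = normalized_innovations(obs, forecast_mean, 0.25 * forecast_var, obs_var)
print(np.mean(nu_under ** 2))
```

Adaptive tuning in this sense amounts to adjusting the assumed error magnitude until such statistics approach their expected values. As the results of this analysis indicate, reaching those values does not by itself establish that the correct error source has been identified.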
Results in Table 1 and Figs. 5 and 6 highlight two fundamental challenges facing adaptive filtering strategies aimed at tuning hydrologic model error parameters based on surface soil moisture innovation statistics. First, there exists the potential for spurious local minima where optimization of the wrong error parameter can progressively degrade EnKF results to accuracy levels below what is obtainable for the comparable open-loop case (see Fig. 5). Second, globally optimal innovation statistics do not necessarily correspond to the model error parameters that provide the best EnKF results (Table 1). Consequently, tuning of soil moisture innovation statistics to globally optimal levels does not always guarantee optimal EnKF-based root-zone soil moisture predictions (Fig. 6). Results in Fig. 6 are based on the relatively simple case where assumed and actual modeling errors are limited to a single source. Figure 7 presents results for a more realistic case in which actual model error is distributed among a range of sources and assumed error used to generate model ensembles (and the EnKF) is limited to only a single source. The tuning of only a single error source (again via soil moisture filter innovations) to represent a broader range of errors appears to avoid much of the spurious local minimum problem noted in Fig. 6. However, it remains the case that the best (worst) calibrated filter innovations are associated with tuning the wrong (correct) error type and the highest (lowest) error in EnKF root-zone soil moisture predictions (Fig. 7).
Results also clarify the potential of additional diagnostic statistics to detect and correct for impacts associated with incorrect model error assumptions. Calibrating model error such that the temporal correlation of innovations (ρν) was minimized led to results that were qualitatively similar to calibrating against
For simplicity, perturbations used to simulate various sources of modeling error were assumed to be mutually independent and temporally uncorrelated. Both assumptions may qualitatively impact results presented in section 3. The impact of cross-correlated state perturbations depends on the degree of cross correlation in combination with the feedback between the respective model state variables (Ljung 1999). Filter sensitivity to error misspecification may be enhanced when cross correlation is present but assumed to be absent. Likewise, temporal autocorrelation in model perturbations can be modeled using state augmentation procedures (Reichle et al. 2002a). However, failure to account for existing autocorrelations can also degrade filter performance. Finally, it is worth noting that, relative to the approach employed here, more sophisticated methods for estimating model bias and statistical uncertainty have been presented in the systems and control literature. Here the problem is often referred to as filter divergence, referring to the fact that the ensemble sample drifts gradually from the true state and no longer produces a meaningful state forecast. Examples of such techniques include nonstationary stochastic embedding, model error modeling based on prediction error methods, and set membership identification (Reinelt et al. 2002). While potentially attractive for use in the EnKF, the computational difficulties associated with implementing these methods for hydrologic forecasting/monitoring problems are considerable and have yet to be addressed. In addition, the Rauch–Tung–Striebel (RTS) smoother (Rauch et al. 1965) may provide a way to check for filter divergence (rather than estimating modeling errors directly) for hydrologic assimilation systems by comparing the statistics of filter innovations in the forward and backward sequences of the smoother. 
However, in the context of the EnKF it is not trivial to implement the RTS smoother, and it is not yet clear how this can be done efficiently (Evensen and van Leeuwen 2000; van Leeuwen 2001).
REFERENCES
Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
Crow, W. T., 2001: The impact of land surface heterogeneity on the accuracy and utility of spaceborne soil moisture. Ph.D. dissertation, Princeton University, 309 pp.
Crow, W. T., 2003: Correcting land surface model predictions for the impact of temporally sparse rainfall rate measurements using an ensemble Kalman filter and surface brightness temperature observations. J. Hydrometeor., 4, 960–973.
Crow, W. T., and E. F. Wood, 2003: The assimilation of remotely sensed soil brightness temperature imagery into a land surface model using ensemble Kalman filtering: A case study based on ESTAR measurements during SGP97. Adv. Water Resour., 26, 137–149.
Dee, D. P., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. Mon. Wea. Rev., 123, 1128–1145.
Drusch, M., E. F. Wood, and R. Lindau, 1999: The impact of the SSM/I antenna gain function on land surface parameter retrieval. Geophys. Res. Lett., 26, 3481–3484.
Entekhabi, D., and Coauthors, 2004: The Hydrosphere State (HYDROS) mission concept: An earth system pathfinder for global mapping of soil moisture and land freeze/thaw. IEEE Trans. Geosci. Remote Sens., 42, 2184–2195.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10143–10162.
Evensen, G., and P. J. van Leeuwen, 2000: An ensemble Kalman smoother for nonlinear dynamics. Mon. Wea. Rev., 128, 1852–1867.
Famiglietti, J. S., and E. F. Wood, 1994: Multiscale modeling of spatially variable water and energy balance processes. Water Resour. Res., 30, 3061–3078.
Houser, P. R., W. J. Shuttleworth, J. S. Famiglietti, H. V. Gupta, K. H. Syed, and D. C. Goodrich, 1998: Integration of soil moisture remote sensing and hydrologic modeling using data assimilation. Water Resour. Res., 34, 3405–3420.
Ljung, L., 1979: Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. IEEE Trans. Autom. Control, 24, 36–50.
Ljung, L., 1999: System Identification—Theory for the User. 2d ed. Prentice-Hall, 609 pp.
Margulis, S. A., D. McLaughlin, D. Entekhabi, and S. Dunne, 2002: Land data assimilation of soil moisture using measurements from the Southern Great Plains 1997 Field Experiment. Water Resour. Res., 38, 1299, doi:10.1029/2001WR001114.
Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea. Rev., 128, 416–433.
Montaldo, N., J. D. Albertson, M. Mancini, and G. Kiely, 2001: Robust prediction of root zone soil moisture from assimilation of surface soil moisture. Water Resour. Res., 37, 2889–2901.
Pauwels, V. R. N., and E. F. Wood, 1999: A soil–vegetation–atmosphere transfer scheme for the modeling of water and energy balance processes in high latitudes. 1. Model improvements. J. Geophys. Res., 104, 27811–27822.
Pauwels, V. R. N., R. Hoeben, N. E. C. Verhoest, and F. P. De Troch, 2001: The importance of the spatial pattern of remotely sensed soil moisture in the improvement of discharge predictions for small-scale basins through data assimilation. J. Hydrol., 251, 88–102.
Peters-Lidard, C. D., M. S. Zion, and E. F. Wood, 1997: A soil–vegetation–atmosphere transfer scheme for modeling spatially variable water and energy balance processes. J. Geophys. Res., 102, 4303–4324.
Rauch, H. E., F. Tung, and C. T. Striebel, 1965: Maximum likelihood estimates of linear dynamic systems. J. Amer. Inst. Aeronaut. Astronaut., 3, 1445–1450.
Rawls, W. J., D. L. Brakensiek, and K. E. Saxton, 1982: Estimation of soil water properties. Trans. ASAE, 25, 1316–1320.
Reichle, R. H., and R. D. Koster, 2002: Land data assimilation with the ensemble Kalman filter: Assessing model error parameters using innovations. Developments in Water Science: Computational Methods in Water Resources, Vol. 47, Elsevier, 1387–1394.
Reichle, R. H., and R. D. Koster, 2004: Bias reduction in short records of satellite soil moisture. Geophys. Res. Lett., 31, L19501, doi:10.1029/2004GL020938.
Reichle, R. H., and R. D. Koster, 2005: Global assimilation of satellite surface soil moisture retrievals into the NASA Catchment land surface model. Geophys. Res. Lett., 32, L02404, doi:10.1029/2004GL021700.
Reichle, R. H., D. B. McLaughlin, and D. Entekhabi, 2002a: Hydrologic data assimilation with the ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.
Reichle, R. H., J. P. Walker, R. D. Koster, and P. R. Houser, 2002b: Extended versus ensemble Kalman filtering for land data assimilation. J. Hydrometeor., 3, 728–740.
Reinelt, W., A. Garulli, and L. Ljung, 2002: Comparing different approaches to model error modeling in robust identification. Automatica, 38, 787–803.
van Leeuwen, P. J., 2001: An ensemble smoother with error estimates. Mon. Wea. Rev., 129, 709–728.
Walker, J. P., and P. R. Houser, 2001: A methodology for initializing soil moisture in a global climate model: Assimilation of near-surface soil moisture observations. J. Geophys. Res., 106, 11761–11774.
Walker, J. P., G. R. Willgoose, and J. D. Kalma, 1999: One-dimensional soil moisture profile retrieval by assimilation of near-surface measurements: A simplified soil moisture model and field application. J. Hydrol., 2, 356–373.
Normalized 40-cm soil moisture rmse for calibrated EnKF results (assimilating surface soil moisture) for various combinations of assumed and actual error types. Error magnitudes were calibrated such that
Normalized 40-cm soil moisture rmse for calibrated EnKF results (assimilating surface soil moisture) for various combinations of assumed and actual error types. Error magnitudes were calibrated such that ρν is as close to zero as possible. The best (i.e., closest to zero) ρν values obtained during calibration are listed in parentheses.
Normalized 40-cm soil moisture rmse for calibrated EnKF results (assimilating surface soil moisture and runoff) for various combinations of assumed and actual error types. Error magnitudes were calibrated such that