## 1. Introduction

Recent advances in the development of sequential land data assimilation techniques have demonstrated that remote sensing observations of surface soil moisture can improve the dynamic representation of root-zone soil moisture in hydrologic models (Houser et al. 1998; Walker et al. 1999; Montaldo et al. 2001; Reichle and Koster 2005). However, much of the available evidence is based on identical twin experiments using synthetically generated, and artificially perturbed, measurements (e.g., Reichle et al. 2002a, b; Crow 2003). These experiments, while useful diagnostic tools for evaluating filter efficiency, typically simplify or avoid a number of key complexities facing operational efforts to assimilate spaceborne soil moisture observations (Margulis et al. 2002; Crow and Wood 2003; Reichle and Koster 2004). One common assumption in synthetic experiments is that the source and magnitude of model errors are perfectly known in a statistical sense. In reality, error in hydrologic model predictions originates from a wide variety of sources and manifests itself within multiple model state variables. Consequently, complete error information required by sequential data assimilation filters is almost never available in operational settings. Statistical analysis of filter innovations, defined as the observed difference between predicted and actual observations, provides a valuable tool for diagnosing the misspecification of model error (Dee 1995; Reichle and Koster 2002). Several online procedures for estimating model error parameters—based on the analysis of filter innovations—have been introduced for geophysical models (e.g., Mitchell and Houtekamer 1999). However such adaptive filtering techniques have not been widely applied to hydrologic models.

The ensemble Kalman filter (EnKF) is a sequential data assimilation technique that dynamically updates model error covariance information by generating an ensemble of model predictions consisting of individual model realizations independently perturbed by some type of assumed model error. State forecast error covariance information is sampled from this ensemble and used to update model state predictions via the Kalman filter update equation. Model error assumed in order to generate the ensemble may not be an accurate reflection of actual model error. If this is the case, the accuracy of predictions made by the filter will be degraded. Relative to other types of geophysical models, errors in hydrologic models originate from a particularly wide range of sources including poorly specified initial conditions, errors in specified atmospheric forcings, inappropriate parameter choices, neglect of subgrid land surface heterogeneity, and inaccurate model physics. The flexibility of the ensemble Kalman filter with regard to the type and source of model error is frequently cited to support its use in hydrologic data assimilation (see, e.g., Crow and Wood 2003). However, such flexibility cannot be fully exploited if the structure and source of model errors are not constrained in some realistic way. In addition, while the sensitivity of the extended Kalman filter (EKF) to incorrect model error specification has been frequently studied since Ljung (1979), relatively little is currently known about the impact of poorly specified model error on the application of the EnKF to hydrologic problems.

This analysis is based on using the EnKF to carry out a series of synthetic identical twin experiments whereby errors used to generate hydrologic model ensembles (and calculate state forecast error covariance information for the EnKF) are statistically distinct from errors used to originally perturb the model. The impact of inaccurate model error assumptions on the quality of integrated root-zone soil moisture predictions made by the EnKF is evaluated, and the potential for adaptively improving filter results based on the statistical analysis of filter innovations is examined.

## 2. Approach

The approach is based on a series of synthetic filtering experiments aimed at the assimilation of daily spaceborne surface soil moisture retrievals into a hydrologic model operating over the approximately 1000-km^{2} Big Cabin Creek watershed in northeastern Oklahoma. Figure 1 contains a schematic representation of the experimental approach. Synthetic identical twin experiments are based on designating a single hydrologic model realization as a “truth run” that produces a set of “truth states” (i.e., soil moisture and temperature values at various depths) and, when processed through an observation operator, “truth observations” (i.e., surface soil moisture or corresponding observed variable). Because of modeling and observing errors, neither quantity can be known with perfect certainty. In a synthetic twin experiment, this uncertainty is reflected through the synthetic application of modeling error to truth states (to produce “modeled states”) and synthetic observing errors to truth observations (to produce “actual observations”). This error will be referred to as “actual error.” “Open loop” simulations are those in which actual errors are implemented but not corrected. The EnKF methodology attempts to correct for the impact of actual model error (i.e., correct modeled states back to truth states) by assimilating observations related to model states. It does so by assuming a given statistical structure for assumed model error and using this assumption to construct a Monte Carlo–based ensemble of model state and observation predictions from which the error covariance information required by the Kalman filter update equation is estimated. The best filtering results will be obtained when the assumed error used to construct the ensemble is an accurate statistical representation of the actual error used to originally perturb the truth simulation. 
However, in real operational settings, little information may be available concerning the statistical properties of actual model error. This lack of information may degrade the ability of the filter to correct model error.

This article utilizes the synthetic twin methodology outlined in Fig. 1 to study cases in which “assumed model error” on the left-hand side of Fig. 1 differs statistically from “actual model error” above it. The analysis implicitly neglects the impact of incorrect assumptions concerning observing error by assuming observation error statistics to be perfectly known. Such reductionism is necessary for an initial examination of these issues. However, an integrated analysis capturing the interplay between misspecified observing and modeling errors will eventually be necessary. Specific aspects of our methodology are discussed in the following subsections.

### a. The ensemble Kalman filter

Define **Y**(*t*) to be a vector of land surface state variables at time *t*. The equation describing the evolution of these states, as determined by a land surface model **F**, is given by

$$\frac{d\mathbf{Y}}{dt} = \mathbf{F}(\mathbf{Y}, \mathbf{w}), \tag{1}$$

where **w** represents errors in model physics, parameterization, and/or forcing data and is taken to be mean zero with a covariance **C**_{w}. In this analysis, **Y** will contain all the soil water and energy storage (i.e., soil moisture and soil temperature) states of the hydrologic model. The goal of the filtering problem is to constrain these forecasts using a set of observations that are related to the model states contained in **Y**. Let the operator **M** represent the observation process that relates **Y** to the actual measurements taken at time *t*_{k}:

$$\mathbf{Z}_k = \mathbf{M}_k[\mathbf{Y}(t_k)] + \mathbf{v}_k, \tag{2}$$

where **v**_{k} represents Gaussian measurement error with covariance **C**_{υ} and **Z**_{k} is an *m*-dimensional vector containing a set of observations. Here, both streamflow observations and surface soil moisture retrievals (ostensibly from a remote sensing source) will be considered for **Z**.

The EnKF is initialized by introducing synthetic Gaussian error into initial conditions and generating an ensemble of model predictions using Eq. (1). At the time of measurement, predictions made by the *i*th model replicate are referred to as the state forecast **Y**^{i}_{−}. If **F** and **M** are linear and all errors are additive, independent, and Gaussian, the optimal updating of **Y**^{i}_{−} by the measurement **Z**_{k} is given by

$$\mathbf{Y}^i_+ = \mathbf{Y}^i_- + \mathbf{K}_k\left[\mathbf{Z}_k - \mathbf{M}_k(\mathbf{Y}^i_-)\right] \tag{3}$$

and

$$\mathbf{K}_k = \mathbf{C}_{YM}\left(\mathbf{C}_M + \mathbf{C}_\upsilon\right)^{-1}, \tag{4}$$

where **C**_{M} is the covariance matrix of the forecasted observations **M**_{k}(**Y**^{i}_{−}) and **C**_{YM} is the cross-covariance matrix linking the forecasted observations with the state variables contained in **Y**^{i}_{−}. Both covariances are statistically estimated from all individual ensemble realizations and calculated around the ensemble mean. Here **Y**^{i}_{+} signifies the updated or analysis state representation. Filter predictions are obtained by averaging analysis representations across the model ensemble.

Of particular interest here are modeling errors represented by **w** in (1) and the impact of making inaccurate assumptions concerning their source and magnitude. One potential diagnostic tool is the filter innovation (*ν*_{k}), defined as

$$\nu_k = E\left[\mathbf{Z}_k - \mathbf{M}_k(\mathbf{Y}^i_-)\right], \tag{5}$$

where *E*[·] denotes averaging across the ensemble. If **w** is perfectly represented in a statistical sense, then *ν*_{k} should be mean zero, temporally uncorrelated, and have a variance of *m*^{−1}(**C**_{M} + **C**_{υ}). Alternatively, the variable *α*_{k} can be defined as

$$\alpha_k = \frac{\nu_k^{\mathrm{T}}\left(\mathbf{C}_M + \mathbf{C}_\upsilon\right)^{-1}\nu_k}{m}. \tag{6}$$

Because *α*_{k} is computed from *ν*_{k}, the mean of *α*_{k} over time provides an integrated measure of the deviation of innovation mean and variance quantities from theoretical expectations. For instance, overestimation (underestimation) of model error will generally lead to a time series of *α*_{k} with a mean less than (greater than) one. Adaptive tuning strategies exploit the operational availability of innovation statistics by tuning assumed levels of model error such that the statistical properties of *α*_{k} are optimized (Mitchell and Houtekamer 1999).
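As a minimal sketch of the update step and innovation diagnostic described above, the following Python function applies one EnKF analysis to a small ensemble for the special case of a single scalar observation (*m* = 1). The function name `enkf_update`, its arguments, and the toy numbers in the usage lines are hypothetical. Perturbing the observation separately for each replicate follows the common perturbed-observation form of the EnKF (Burgers et al. 1998) and is an assumption here, as is computing *α* as the squared ensemble-mean innovation divided by its expected variance.

```python
import random
from statistics import fmean


def enkf_update(ensemble, observe, z_obs, obs_var, rng):
    """One EnKF analysis step for a single scalar observation (m = 1).

    ensemble: list of state vectors (lists of floats), one per replicate.
    observe:  function mapping a state vector to its predicted observation.
    Returns the analysis ensemble and the normalized innovation alpha.
    """
    n_ens = len(ensemble)
    n_state = len(ensemble[0])
    predicted = [observe(y) for y in ensemble]
    pred_mean = fmean(predicted)
    state_mean = [fmean(y[j] for y in ensemble) for j in range(n_state)]

    # Sample (co)variances calculated around the ensemble mean.
    c_m = sum((p - pred_mean) ** 2 for p in predicted) / (n_ens - 1)
    c_ym = [
        sum((y[j] - state_mean[j]) * (p - pred_mean)
            for y, p in zip(ensemble, predicted)) / (n_ens - 1)
        for j in range(n_state)
    ]

    gain = [c / (c_m + obs_var) for c in c_ym]
    innovation = z_obs - pred_mean              # ensemble-mean innovation
    alpha = innovation ** 2 / (c_m + obs_var)   # normalized innovation, m = 1

    analysis = []
    for y, p in zip(ensemble, predicted):
        # ASSUMPTION: each replicate sees its own perturbed observation.
        perturbed_obs = z_obs + rng.gauss(0.0, obs_var ** 0.5)
        analysis.append([y[j] + gain[j] * (perturbed_obs - p)
                         for j in range(n_state)])
    return analysis, alpha


# Toy usage: two soil moisture states, of which only the first is observed.
rng = random.Random(0)
prior = [[rng.gauss(0.30, 0.05), rng.gauss(0.25, 0.03)] for _ in range(200)]
posterior, alpha = enkf_update(prior, lambda y: y[0], 0.35, 0.02 ** 2, rng)
```

Averaging `alpha` over many assimilation times gives the temporal mean that adaptive tuning would try to bring toward one.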

### b. TOPLATS hydrologic modeling

Surface-zone soil moisture (*θ*_{sz}) predictions in TOPLATS are based on the linear combination of two dynamic model state variables: the fraction of the land surface saturated from below by the water table (*f*_{w}) multiplied by the soil’s saturation capacity (*θ*_{sat}) and the surface soil water content in nonsaturated portions of the basin (*θ*_{unsat}):

$$\theta_{sz} = f_w\,\theta_{sat} + (1 - f_w)\,\theta_{unsat}. \tag{7}$$

Here *f*_{w} is calculated by assuming subbasin-scale spatial variability in water table depth *z* is driven solely by variations in the local soils-topographic index (STI) defined as

$$\mathrm{STI} = \ln\left(\frac{a}{T\tan\beta}\right), \tag{8}$$

where *a* is area drained, *T* is soil transmissivity, and *β* is local slope. Variations in STI are related to variations in local water table depth (*z*) by

$$z = \bar{z} - \frac{1}{f}\left(\mathrm{STI} - \overline{\mathrm{STI}}\right), \tag{9}$$

where *f* is the vertical decay of saturated hydraulic conductivity and overbars denote basin averages. Changes in basin-averaged water table depth *z̄* are predicted through an areally lumped balance of deep soil drainage, upward diffusion from the water table, direct transpiration from the water table, and base flow (Peters-Lidard et al. 1997). Drainage and diffusion fluxes in unsaturated portions of the basin are calculated using a finite-difference numerical approximation to the Richards equation.

Because *z* ≤ 0 in (9) indicates surface saturation, the saturated fraction of the basin can be calculated as

$$f_w = 1 - F\left(\overline{\mathrm{STI}} + f\bar{z}\right), \tag{10}$$

where *F* is the cumulative distribution function for STI, typically determined from high-resolution soil and topographic maps. Runoff is modeled as a combination of saturation excess runoff (the product of rainfall intensity and *f*_{w}) and a separate parameterization of infiltration excess runoff. Base flow *Q* occurs solely from the saturated zone and is modeled as

$$Q = Q_o\exp(-f\bar{z}). \tag{11}$$
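The saturated-fraction logic above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the TOPLATS implementation: the STI distribution is approximated by an empirical sample, the function names are hypothetical, the STI values are made up, and the exponential base flow form is the standard TOPMODEL-type relation assumed here.

```python
import math
from bisect import bisect_left


def saturated_fraction(sti_values, z_bar, f):
    """Fraction of the basin whose local water table depth is z <= 0.

    Local depth is z = z_bar - (STI - mean STI) / f, so the surface is
    saturated wherever STI >= mean STI + f * z_bar.
    """
    sti_sorted = sorted(sti_values)
    sti_mean = sum(sti_sorted) / len(sti_sorted)
    threshold = sti_mean + f * z_bar
    n_below = bisect_left(sti_sorted, threshold)  # count of STI < threshold
    return 1.0 - n_below / len(sti_sorted)


def surface_soil_moisture(f_w, theta_sat, theta_unsat):
    """Area-weighted mix of saturated and unsaturated surface soil moisture."""
    return f_w * theta_sat + (1.0 - f_w) * theta_unsat


def base_flow(q_o, f, z_bar):
    """ASSUMPTION: base flow decays exponentially with mean water table depth."""
    return q_o * math.exp(-f * z_bar)


# Toy usage with illustrative STI values (not taken from the watershed).
sti = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
f_w = saturated_fraction(sti, z_bar=0.5, f=3.0)
theta_sz = surface_soil_moisture(f_w, theta_sat=0.42, theta_unsat=0.25)
```

Lowering the mean water table (larger `z_bar`) raises the STI threshold, so fewer locations are saturated and `theta_sz` moves toward `theta_unsat`.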

### c. Site location and model evaluation

All hydrologic modeling was based on application of TOPLATS to the Big Cabin Creek watershed in northeastern Oklahoma from 1 October 1996 to 30 September 1999. The watershed has very little hydrologic regulation or diversion and is located in a region that has seen extensive fieldwork aimed at the development of remote sensing retrieval algorithms for surface soil moisture. Upstream of the USGS gauging station near Big Cabin, Oklahoma, the watershed has an area of approximately 1170 km^{2}, which is roughly equivalent to the spatial resolution defined through the 3-dB antenna gain (Drusch et al. 1999) of current and next-generation spaceborne radiometers designed to retrieve surface soil moisture (Entekhabi et al. 2004). TOPLATS predictions within the basin were forced by meteorological observations (i.e., rainfall, air temperature, incoming radiation, wind speed, and relative humidity) from the Vinita, Oklahoma, mesonet site located near the center of the basin. Soil hydraulic parameters were based on the dominant soil texture found within the basin (loam) and the soil texture/hydraulic parameter lookup table presented in Rawls et al. (1982). However, values of *θ*_{sat} and surface saturated hydraulic conductivity (*K*_{sat}) were tuned slightly (from 0.462 to 0.420 and 3.67 × 10^{−6} to 1.60 × 10^{−6} m s^{−1}, respectively) in order to match local soil moisture observations. Vegetation characteristics were varied on a monthly basis according to the typical seasonality of the dominant land-cover type in the basin (grassland). Values for *f* and *Q*_{o} (*f* = 3 and *Q*_{o} = 100 m^{3} s^{−1}) were obtained through manual calibration of TOPLATS against USGS streamflow observations.

Comparisons between calibrated TOPLATS results and observed streamflow and surface soil moisture are shown in Fig. 2. Streamflow observations and predictions plotted in Fig. 2a are 30-day moving averages of daily values. TOPLATS soil moisture predictions plotted in Fig. 2b are for the interval of the STI that corresponds to local topography near the Vinita, Oklahoma, mesonet site. Since Oklahoma mesonet soil moisture observations are not available prior to 2002, soil moisture intercomparisons are based on simulations conducted during a later time period (1 October 2002 to 30 September 2003). A good calibrated fit was achieved to both streamflow and surface soil moisture except for soil moisture predictions between December 2002 and April 2003. During this time period, TOPLATS systematically underpredicts surface soil moisture by up to 0.10 cm^{3} cm^{−3}. This error is likely due to the misrepresentation of the basin-averaged water table depth (*z̄*) by TOPLATS and provides a good example of the impact on surface soil moisture dynamics of inaccurately modeling saturation-zone dynamics.

### d. Modeling error approach

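TOPLATS surface soil moisture predictions depend on both *θ*_{unsat} and *f*_{w}. Short-term fluctuations in *θ*_{unsat} are due to rainfall flux at the top of the soil column and the cumulative effects of surface evaporation, drainage, and diffusive fluxes between the surface zone and deeper soil moisture states. In contrast, *f*_{w} is based on deeper saturation-zone dynamics and the lateral redistribution of water. Consequently, random errors in precipitation, evaporation, infiltration, and saturation-zone recharge impact TOPLATS soil moisture predictions via contrasting processes acting at the top and bottom of the soil column. Here, three different error sources are considered. Model error at the bottom of the soil column is represented by adding mean-zero additive Gaussian noise to the catchment-averaged water table depth:

$$\bar{z}' = \bar{z} + \gamma_z. \tag{12}$$

Model error at the top of the soil column is represented by multiplicative noise applied to rainfall *P*,

$$P' = \gamma_p P, \tag{13}$$

and by additive noise applied to *θ*_{unsat}:

$$\theta_{unsat}' = \theta_{unsat} + \gamma_{\theta unsat}. \tag{14}$$

The statistical parameters describing the magnitude of these perturbations are denoted *w*_{z}, *w*_{p}, and *w*_{θunsat}, respectively. All perturbations (*γ*_{z}, *γ*_{p}, *γ*_{θunsat}) are modeled as mutually independent and temporally uncorrelated.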
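The three perturbations described above could be drawn for one model replicate roughly as follows. This is a hedged sketch: the function name and argument order are hypothetical, and because the distribution of the rainfall multiplier is not stated here, a lognormal with mean one and standard deviation *w*_{p} is assumed purely for illustration.

```python
import math
import random


def perturb_states(z_bar, rain, theta_unsat, w_z, w_p, w_theta, rng):
    """Apply one independent realization of each error source to a replicate."""
    gamma_z = rng.gauss(0.0, w_z)          # additive noise, water table depth
    gamma_theta = rng.gauss(0.0, w_theta)  # additive noise, unsaturated soil moisture

    # ASSUMPTION: multiplicative rainfall noise is lognormal with mean one
    # and standard deviation w_p; the source does not give its distribution.
    if w_p == 0.0:
        gamma_p = 1.0
    else:
        sigma2 = math.log(1.0 + w_p ** 2)
        gamma_p = rng.lognormvariate(-0.5 * sigma2, math.sqrt(sigma2))

    return z_bar + gamma_z, gamma_p * rain, theta_unsat + gamma_theta


# Toy usage with the error magnitudes quoted in the text and made-up states.
rng = random.Random(42)
z_new, rain_new, theta_new = perturb_states(
    z_bar=1.5, rain=2.0, theta_unsat=0.30,
    w_z=0.025, w_p=1.0, w_theta=0.01, rng=rng,
)
```

Calling the function with fresh draws for every replicate and every time step keeps the perturbations mutually independent and temporally uncorrelated, as the text requires.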

The experimental methodology (see section 2 and Fig. 1) is based on designating a single, unperturbed model realization as truth. A statistical representation of actual model error is specified by selecting “actual” values for the statistical parameters describing its components (*w*_{z}, *w*_{p}, and *w*_{θunsat}). Using these error parameters and (12)–(14), a TOPLATS open-loop simulation is generated that represents the uncorrected impact of actual model error on the accuracy of hydrologic predictions (see “modeled states” in Fig. 1). Errors present in the open-loop simulations are then filtered via the implementation of the EnKF to assimilate (on a daily basis) values of “actual soil moisture observations” generated from the original truth simulation (Fig. 1). The EnKF filtering methodology is based on an assumed statistical representation of actual modeling errors. Assumed values of *w*_{z}, *w*_{p}, and *w*_{θunsat} are used with (12)–(14) to randomly perturb *θ*_{unsat}, *z̄*, and rainfall and create an ensemble of model predictions that, in turn, is sampled to derive the state forecast error covariance information required by the Kalman filter update Eq. (3). As noted above, the statistical properties of assumed model error driving the ensemble generation may or may not accurately represent actual errors used to perturb the original truth simulations. The focus of the analysis will be on instances in which they do not.

## 3. Results

Values plotted in Fig. 3 are based on daily root-mean-square (rms) differences between truth model simulations and both the open-loop (i.e., truth simulations perturbed by actual error) and EnKF filtering cases. EnKF results are based on the case in which both the source (TOPLATS *z̄* predictions) and statistical magnitude (*w*_{z} = 0.025 m h^{−1}) of actual modeling error are assumed to be perfectly known. Consequently, assimilating daily surface soil moisture observations (with an assumed absolute volumetric accuracy of 2%) using the EnKF can substantially correct open-loop model errors in TOPLATS root-zone (40 cm) soil moisture predictions.

Assuming knowledge of modeling error is limited only to its source in *z̄*, Fig. 4a examines how normalized error in 40-cm soil moisture results (defined as the rmse for EnKF results normalized by the open-loop rmse) varies as a function of the assumed standard deviation of *γ*_{z} (i.e., *w*_{z}) used to perturb TOPLATS *z̄* predictions in (12). Each point on the line in Fig. 4 relates the temporally lumped root-zone rmse and innovation statistics calculated when running the EnKF with a particular (constant) choice for assumed model error parameters during the 3-yr period starting in October 1996. The vertical line in Fig. 4 represents the actual value of *w*_{z} used to originally perturb the model simulations. While overestimating actual model error has little impact on the accuracy of root-zone soil moisture predictions, underestimating the actual error magnitude sharply reduces the accuracy of filtered results and the overall value of remotely sensed surface soil moisture observations for constraining TOPLATS root-zone soil moisture predictions. The critical issue for our analysis is whether such poor performance can be accurately diagnosed using available filter innovations. Figure 4b displays the temporal average of filter innovations […] *w*_{z} = 0.025 m h^{−1} leads to an […] *α* values and a correct assumption concerning the dominant source of modeling error, a properly constructed adaptive filter should be able to converge on a level of assumed modeling error that leads to acceptable EnKF root-zone soil moisture predictions.
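The idea of converging on an appropriate level of assumed model error when its source is correctly identified can be illustrated with a toy twin experiment. The sketch below is not the TOPLATS experiment: it assumes a scalar first-order autoregressive state observed directly, a perturbed-observation EnKF, and a simple grid search that selects the assumed error level whose time-mean *α* is closest to one. All names, parameter values, and the candidate grid are hypothetical.

```python
import random
from statistics import fmean


def mean_alpha(w_assumed, w_actual=0.5, obs_std=0.2, a=0.9,
               n_steps=1000, n_ens=50, seed=1):
    """Time-mean normalized innovation for a scalar toy EnKF."""
    rng = random.Random(seed)
    obs_var = obs_std ** 2
    system = 0.0
    ensemble = [0.0] * n_ens
    alphas = []
    for _ in range(n_steps):
        # The observed system carries the actual error; replicates carry
        # the assumed error.
        system = a * system + rng.gauss(0.0, w_actual)
        ensemble = [a * x + rng.gauss(0.0, w_assumed) for x in ensemble]
        z_obs = system + rng.gauss(0.0, obs_std)

        ens_mean = fmean(ensemble)
        c_m = sum((x - ens_mean) ** 2 for x in ensemble) / (n_ens - 1)
        innovation = z_obs - ens_mean
        alphas.append(innovation ** 2 / (c_m + obs_var))

        # Perturbed-observation update; the observation operator is the identity.
        gain = c_m / (c_m + obs_var)
        ensemble = [x + gain * (z_obs + rng.gauss(0.0, obs_std) - x)
                    for x in ensemble]
    return fmean(alphas)


def tune_model_error(candidates, **kwargs):
    """Pick the assumed error level whose mean alpha is closest to one."""
    return min(candidates, key=lambda w: abs(mean_alpha(w, **kwargs) - 1.0))


best_w = tune_model_error([0.1, 0.25, 0.5, 1.0, 2.0])
```

In this toy setting, underestimating the error pushes the mean of *α* above one and overestimating it pushes the mean below one, so the search settles on the candidate matching the actual error. The sketch deliberately omits the harder case of a misidentified error source.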

### a. Impact of incorrect model error assumptions

Adaptive tuning using […] *θ*_{unsat} (*w*_{θunsat} = 0.01 cm^{3} cm^{−3} h^{−1}) but, for the purposes of generating the EnKF for assimilating surface soil moisture, error is assumed to be caused by fluctuations in mean water table depth. Because the source of model error is misspecified when assimilating soil moisture, normalized rmse values for EnKF root-zone soil moisture predictions rise as assumed error in *z̄* is increased. However, observed […] *z̄* increases. Consequently, adaptive tuning would increase *w*_{z} in an ill-advised attempt to lower […]

Soil moisture assimilation results in Fig. 5 illustrate the impact of representing model error using random perturbations in *z̄* when, in reality, modeling error originates from the inaccurate representation of processes acting along the top of the soil column. Results in Fig. 5 are replotted as the solid line in Fig. 6a by graphing the *y* axis of Fig. 5b against the *y* axis of Fig. 5a for the entire range of assumed model error (the *x* axis in Fig. 5). Results in Fig. 6a are for the case of actual error in *z̄* (*w*_{z} = 0.025 m h^{−1}) and a range of assumed error magnitudes in *z̄*, *θ*_{unsat}, and rainfall. For simplicity, both actual and assumed model error are limited to a single source.

Figures 6b and 6c are analogous except for actual error in *θ*_{unsat} (*w*_{θunsat} = 0.01 cm^{3} cm^{−3} h^{−1}) and rainfall (*w*_{p} = 1), respectively. The vertical line indicates the location of the […]

First, there exists the possibility of a spurious local minimum where, because of an incorrect assumption concerning the source of modeling error, the tuning of the wrong error type to move […] *z̄* and rainfall in Fig. 6a where actual error is in *θ*_{unsat}. In both cases, tuning the wrong error source to produce better innovation statistics leads to a steady reduction in the accuracy of EnKF root-zone soil moisture predictions. The application of an adaptive filter in such cases will actually worsen root-zone soil moisture predictions relative to what is possible without data assimilation (i.e., a normalized error of one on the *y* axis in Fig. 6). This interpretation is supported by Table 1, which lists the rmse for optimal innovation statistics, the closest approach of curves in Fig. 6 to […] *θ*_{unsat} is correctly identified as the source of modeling error, adaptive tuning will lead to a relative reduction of 25% in modeling error. However, incorrectly choosing to tune innovations via adjustments to either *z̄* or rainfall error parameters leads to EnKF predictions that are less accurate than their respective open-loop cases.

A potential solution to this problem is to perform a broad enough optimization during adaptive filtering such that the possibility of all three sources of model error is accounted for, that is, to ensure that the adaptive filter is able to find the model error type associated with globally optimal innovation statistics. However, even if a powerful enough optimization scheme could efficiently locate such a global minimum, a second undesirable possibility for adaptive filtering is that the best innovation statistics (for all three possible error-type assumptions) are associated with optimization of the wrong error type. In this case the best obtainable innovation statistics would not reflect the correct error representation. For instance, in Fig. 6b, the best 40-cm soil moisture rmse results are realized by tuning rainfall error parameters, the correct error source. However, better innovation statistics (but reduced root-zone soil moisture accuracies) arise when either *z̄* or *θ*_{unsat} error parameters are tuned. The parenthetical values in Table 1 represent the closest approach of […]

### b. Sensitivity to error magnitude

One concern is the potential sensitivity of results to rather arbitrary choices made concerning the magnitude of actual observing and modeling errors. To examine sensitivity to assumptions concerning the magnitude of observation and modeling errors, results were also generated (but not shown) for the cases of doubled error (from 2% to 4% volumetric) in daily surface-zone soil moisture retrievals and for the cases of baseline 2% soil moisture measurement error and both doubled and tripled values of modeling errors (i.e., *w*_{θunsat}, *w*_{z}, and *w*_{p}). As expected, increased observation errors tended to move normalized rmse for EnKF predictions toward unity as increased weight is shifted to model open-loop predictions. However, there was no qualitative change in the appearance or interpretation of results. Baseline choices for modeling errors used in this analysis led to rms open-loop errors in root-zone soil moisture predictions on the order of 0.5% to 3% volumetric (Fig. 3). Doubled and tripled modeling error led to slightly less sensitivity to the selection of assumed error source (i.e., column to column variations along a given row of Table 1 are reduced) but did not qualitatively change results.

### c. Impact of diffuse error sources

Results in Figs. 3–6 are all based on synthetic simulations where actual (and assumed) modeling error is intentionally restricted to a single source. In reality, land surface modeling errors will likely arise from a broad range of sources. Figure 7 plots results for the case where actual model error arises simultaneously from additive noise in *z̄* and *θ*_{unsat} and multiplicative rainfall noise (*w*_{z} = 0.025 m h^{−1}, *w*_{θunsat} = 0.01 cm^{3} cm^{−3} h^{−1}, and *w*_{p} = 1.0), while assumed error is restricted, in turn, to only one of these three error sources and tuned using […]

### d. Innovation temporal correlation

In addition to the sampled mean of *α*_{k}, the temporal correlation coefficient of *ν*_{k} (*ρ*_{ν}) provides a diagnostic variable on which to base adjustments to assumed levels of model error. If the EnKF is operating in accordance with its underlying assumptions, including an accurate representation of model error, then the *ν*_{k} time series should be temporally uncorrelated (*ρ*_{ν} = 0). A correlated time series can be taken as evidence that model errors are being improperly represented in the filter. To determine if some of the adaptive filtering difficulties encountered in Fig. 6 and Table 1 can be addressed using this additional diagnostic, results in Table 1 were regenerated in Table 2 by tuning the EnKF so that the absolute value of *ρ*_{ν} was minimized. Intercomparison of results in Tables 1 and 2 reveals that calibrations based on *ρ*_{ν} (Table 2) and […] *ρ*_{ν}-based calibration avoids the adaptive filtering problems noted in Table 1. For instance, as in the case of optimizing […] *z̄* errors (when the actual error source is noise in *θ*_{unsat} predictions) to minimize *ρ*_{ν} leads to root-zone soil moisture errors that exceed the open-loop case (i.e., normalized errors greater than one). In addition, for the case of actual error in rainfall, better *ρ*_{ν} statistics are obtainable via calibration of the wrong error parameters, *w*_{z} and *w*_{θunsat}, versus tuning of the rainfall error parameter. As a result, even a perfect global optimizing algorithm will be unable to converge on the error source associated with the best root-zone soil moisture filtering results. Overall, the close correspondence between results in Tables 1 and 2 implies that *ρ*_{ν} is too closely tied to […]
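For concreteness, one way to compute a temporal correlation diagnostic from a stored innovation time series is sketched below. The choice of the lag-1 sample autocorrelation is an assumption, since the lag used for *ρ*_{ν} is not specified here, and the function name is hypothetical.

```python
from statistics import fmean


def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation of a scalar time series."""
    mean = fmean(series)
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0.0:
        raise ValueError("series has zero variance")
    numerator = sum(d0 * d1 for d0, d1 in zip(deviations, deviations[1:]))
    return numerator / denominator


# Toy usage: a steadily drifting innovation series is strongly correlated.
rho_drift = lag1_autocorrelation([0.01 * t for t in range(100)])
```

A value near zero is consistent with a well-specified filter, whereas a persistent positive value suggests that errors are being carried from one update to the next.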

### e. Value of ancillary runoff observations

One potential solution for adaptive filtering problems is constraining EnKF results with additional observations. In addition to the assimilation of surface soil moisture, Fig. 5 also examines the case of assimilating runoff observations. Runoff magnitudes are assumed known from streamflow observations within a relative accuracy of 20% and the basin is assumed small enough such that the time lag between runoff generation and streamflow observations can be safely neglected. Actual error is assumed to be due to random noise in *θ*_{unsat} predictions and assumed error to variations in *z̄*. While increasing the magnitude of error in *z̄* improves the statistics for soil moisture innovations, it makes runoff innovations worse (i.e., moves the dashed line away from one in Fig. 5b). If streamflow observations are available, this incompatibility will provide a critical diagnostic that allows data assimilation systems to avoid the pitfall associated with adaptively tuning the wrong error type within a land surface model (section 3a). By strongly fluctuating *z̄*, the filter can (wrongly) induce sufficient variability in *θ*_{sz} such that *θ*_{sz} innovations will approach a mean of one. However, such excessive variations in *z̄* will also induce excessive background spread in the model-predicted runoff ensemble. This excessive spread is detectable in filter innovations if runoff observations are available and jointly considered.

The most natural way to consider runoff observations is to include them in the observation vector and jointly assimilate both runoff and surface soil moisture observations. Figure 8 replots Fig. 6 for the case of assimilating both surface soil moisture and surface runoff. As in Fig. 6, actual and assumed error are each assumed to be restricted to a single error source, and plotted lines demonstrate the relationship between mean normalized innovations […]

## 4. Discussion and summary

Because of its Monte Carlo basis, the EnKF can address a wide variety of error sources in geophysical models. This is often cited as a key advantage for its application to hydrologic models where error can arise from a range of sources (Crow and Wood 2003). However, large gaps exist in our knowledge concerning the magnitude and ultimate source of error in hydrologic modeling predictions. Consequently, it is reasonable to expect that the operational application of sequential data assimilation filters to assimilate remotely sensed soil moisture will rely heavily on diagnostic tools like filter innovations to obtain model error statistical information. This analysis examines the potential for using filter innovations to correctly tune model error parameters in such a way that EnKF-based predictions of root-zone soil moisture are optimized.

Results in Table 1 and Figs. 5 and 6 highlight two fundamental challenges facing adaptive filtering strategies aimed at tuning hydrologic model error parameters based on surface soil moisture innovation statistics. First, there exists the potential for spurious local minima where optimization of the wrong error parameter can progressively degrade EnKF results to accuracy levels below what is obtainable for the comparable open-loop case (see Fig. 5). Second, globally optimal innovation statistics do not necessarily correspond to the model error parameters that provide the best EnKF results (Table 1). Consequently, tuning of soil moisture innovation statistics to globally optimal levels does not always guarantee optimal EnKF-based root-zone soil moisture predictions (Fig. 6). Results in Fig. 6 are based on the relatively simple case where assumed and actual modeling errors are limited to a single source. Figure 7 presents results for a more realistic case in which actual model error is distributed among a range of sources and assumed error used to generate model ensembles (and the EnKF) is limited to only a single source. The tuning of only a single error source (again via soil moisture filter innovations) to represent a broader range of errors appears to avoid much of the spurious local minimum problem noted in Fig. 6. However, it remains the case that the best (worst) calibrated filter innovations are associated with tuning the wrong (correct) error type and the highest (lowest) error in EnKF root-zone soil moisture predictions (Fig. 7).

Results also clarify the potential of additional diagnostic statistics to detect and correct for impacts associated with incorrect model error assumptions. Calibrating model error such that the temporal correlation of innovations (*ρ*_{ν}) was minimized led to results that were qualitatively similar to calibrating against […]

For simplicity, perturbations used to simulate various sources of modeling error were assumed to be mutually independent and temporally uncorrelated. Both assumptions may qualitatively impact results presented in section 3. The impact of cross-correlated state perturbations depends on the degree of cross correlation in combination with the feedback between the respective model state variables (Ljung 1999). Filter sensitivity to error misspecification may be enhanced when cross correlation is present but assumed to be absent. Likewise, temporal autocorrelation in model perturbations can be modeled using state augmentation procedures (Reichle et al. 2002a). However, failure to account for existing autocorrelations can also degrade filter performance. Finally, it is worth noting that, relative to the approach employed here, more sophisticated methods for estimating model bias and statistical uncertainty have been presented in the systems and control literature. Here the problem is often referred to as filter divergence, referring to the fact that the ensemble sample drifts gradually from the true state and no longer produces a meaningful state forecast. Examples of such techniques include nonstationary stochastic embedding, model error modeling based on prediction error methods, and set membership identification (Reinelt et al. 2002). While potentially attractive for use in the EnKF, the computational difficulties associated with implementing these methods for hydrologic forecasting/monitoring problems are considerable and have yet to be addressed. In addition, the Rauch–Tung–Striebel (RTS) smoother (Rauch et al. 1965) may provide a way to check for filter divergence (rather than estimating modeling errors directly) for hydrologic assimilation systems by comparing the statistics of filter innovations in the forward and backward sequences of the smoother. 
However, in the context of the EnKF it is not trivial to implement the RTS smoother, and it is not yet clear how this can be done efficiently (Evensen and van Leeuwen 2000; van Leeuwen 2001).
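The innovation-based diagnostics discussed in this section can be illustrated with a minimal sketch. It is not the TOPLATS/EnKF system used in this study: a scalar linear Kalman filter applied to a synthetic first-order autoregressive state stands in for it. The function name `innovation_diagnostics`, its parameters (`q_assumed`, `q_true`, `r`, `a`), and the choice to normalize each innovation by its predicted standard deviation are assumptions made purely for illustration. Under those assumptions, the sketch returns the mean squared normalized innovation and the lag-1 temporal correlation of the normalized innovations for a given assumed model error variance.

```python
import numpy as np


def innovation_diagnostics(q_assumed, q_true=1.0, r=1.0, a=0.9, n=20000, seed=0):
    """Run a scalar Kalman filter on synthetic data and return
    (mean squared normalized innovation, lag-1 innovation correlation).

    q_assumed : model error variance assumed by the filter (hypothetical name)
    q_true    : model error variance actually used to generate the truth
    r         : observation error variance
    a         : coefficient of the linear state model x[k+1] = a * x[k] + w[k]
    """
    rng = np.random.default_rng(seed)
    x_true = 0.0               # synthetic "truth"
    x_est, p_est = 0.0, 1.0    # filter analysis mean and variance
    z = np.empty(n)            # normalized innovations
    for k in range(n):
        # Truth evolves with the actual model error variance.
        x_true = a * x_true + rng.normal(0.0, np.sqrt(q_true))
        y = x_true + rng.normal(0.0, np.sqrt(r))
        # Forecast step uses the assumed model error variance.
        x_fcst = a * x_est
        p_fcst = a * a * p_est + q_assumed
        # Innovation, normalized by its predicted standard deviation.
        nu = y - x_fcst
        s = p_fcst + r
        z[k] = nu / np.sqrt(s)
        # Kalman update of the state estimate and its variance.
        gain = p_fcst / s
        x_est = x_fcst + gain * nu
        p_est = (1.0 - gain) * p_fcst
    z = z[n // 10:]            # discard the spin-up period
    mean_sq = float(np.mean(z ** 2))
    rho = float(np.corrcoef(z[:-1], z[1:])[0, 1])
    return mean_sq, rho


if __name__ == "__main__":
    for q in (1.0, 0.01):
        mean_sq, rho = innovation_diagnostics(q_assumed=q)
        print(f"q_assumed={q}: mean squared normalized innovation={mean_sq:.2f}, "
              f"lag-1 correlation={rho:.2f}")
```

In this linear toy setting, a correctly specified error variance yields a mean squared normalized innovation near 1 and a lag-1 correlation near 0, whereas underestimating the model error pushes the first statistic above 1 and makes the second positive. Whether the same behavior carries over to a nonlinear land surface model is exactly the question the experiments in section 3 address, so the sketch should be read only as an illustration of how the statistics are computed.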

## REFERENCES

Burgers, G., van Leeuwen, P. J., and Evensen, G., 1998: Analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724.

Crow, W. T., 2001: The impact of land surface heterogeneity on the accuracy and utility of spaceborne soil moisture. Ph.D. dissertation, Princeton University, 309 pp.

Crow, W. T., 2003: Correcting land surface model predictions for the impact of temporally sparse rainfall rate measurements using an ensemble Kalman filter and surface brightness temperature observations. *J. Hydrometeor.*, **4**, 960–973.

Crow, W. T., and Wood, E. F., 2003: The assimilation of remotely sensed soil brightness temperature imagery into a land surface model using ensemble Kalman filtering: A case study based on ESTAR measurements during SGP97. *Adv. Water Resour.*, **26**, 137–149.

Dee, D. P., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. *Mon. Wea. Rev.*, **123**, 1128–1145.

Drusch, M., Wood, E. F., and Lindau, R., 1999: The impact of the SSM/I antenna gain function on land surface parameter retrieval. *Geophys. Res. Lett.*, **26**, 3481–3484.

Entekhabi, D., and Coauthors, 2004: The Hydrosphere State (HYDROS) mission concept: An earth system pathfinder for global mapping of soil moisture and land freeze/thaw. *IEEE Trans. Geosci. Remote Sens.*, **42**, 2184–2195.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99**, 10143–10162.

Evensen, G., and van Leeuwen, P. J., 2000: An ensemble Kalman smoother for nonlinear dynamics. *Mon. Wea. Rev.*, **128**, 1852–1867.

Famiglietti, J. S., and Wood, E. F., 1994: Multiscale modeling of spatially variable water and energy balance processes. *Water Resour. Res.*, **30**, 3061–3078.

Houser, P. R., Shuttleworth, W. J., Famiglietti, J. S., Gupta, H. V., Syed, K. H., and Goodrich, D. C., 1998: Integration of soil moisture remote sensing and hydrologic modeling using data assimilation. *Water Resour. Res.*, **34**, 3405–3420.

Ljung, L., 1979: Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. *IEEE Trans. Autom. Control*, **24**, 36–50.

Ljung, L., 1999: *System Identification—Theory for the User*. 2d ed. Prentice-Hall, 609 pp.

Margulis, S. A., McLaughlin, D., Entekhabi, D., and Dunne, S., 2002: Land data assimilation of soil moisture using measurements from the Southern Great Plains 1997 Field Experiment. *Water Resour. Res.*, **38**, 1299, doi:10.1029/2001WR001114.

Mitchell, H. L., and Houtekamer, P. L., 2000: An adaptive ensemble Kalman filter. *Mon. Wea. Rev.*, **128**, 416–433.

Montaldo, N., Albertson, J. D., Mancini, M., and Kiely, G., 2001: Robust prediction of root zone soil moisture from assimilation of surface soil moisture. *Water Resour. Res.*, **37**, 2889–2901.

Pauwels, V. R. N., and Wood, E. F., 1999: A soil–vegetation–atmosphere transfer scheme for the modeling of water and energy balance processes in high latitudes. 1. Model improvements. *J. Geophys. Res.*, **104**, 27811–27822.

Pauwels, V. R. N., Hoeben, R., Verhoest, N. E. C., and De Troch, F. P., 2001: The importance of the spatial pattern of remotely sensed soil moisture in the improvement of discharge predictions for small-scale basins through data assimilation. *J. Hydrol.*, **251**, 88–102.

Peters-Lidard, C. D., Zion, M. S., and Wood, E. F., 1997: A soil–vegetation–atmosphere transfer scheme for modeling spatially variable water and energy balance processes. *J. Geophys. Res.*, **102**, 4303–4324.

Rauch, H. E., Tung, F., and Striebel, C. T., 1965: Maximum likelihood estimates of linear dynamic systems. *J. Amer. Inst. Aeronaut. Astronaut.*, **3**, 1445–1450.

Rawls, W. J., Brakensiek, D. L., and Saxton, K. E., 1982: Estimation of soil water properties. *Trans. ASAE*, **25**, 1316–1320.

Reichle, R. H., and Koster, R. D., 2002: Land data assimilation with the ensemble Kalman filter: Assessing model error parameters using innovations. *Developments in Water Science—Computational Methods in Water Resources*, Vol. 47, Elsevier, 1387–1394.

Reichle, R. H., and Koster, R. D., 2004: Bias reduction in short records of satellite soil moisture. *Geophys. Res. Lett.*, **31**, L19501, doi:10.1029/2004GL020938.

Reichle, R. H., and Koster, R. D., 2005: Global assimilation of satellite surface soil moisture retrievals into the NASA Catchment land surface model. *Geophys. Res. Lett.*, **32**, L02404, doi:10.1029/2004GL021700.

Reichle, R. H., McLaughlin, D. B., and Entekhabi, D., 2002a: Hydrologic data assimilation with the ensemble Kalman filter. *Mon. Wea. Rev.*, **130**, 103–114.

Reichle, R. H., Walker, J. P., Koster, R. D., and Houser, P. R., 2002b: Extended versus ensemble Kalman filtering for land data assimilation. *J. Hydrometeor.*, **3**, 728–740.

Reinelt, W., Garulli, A., and Ljung, L., 2002: Comparing different approaches to model error modeling in robust identification. *Automatica*, **38**, 787–803.

van Leeuwen, P. J., 2001: An ensemble smoother with error estimates. *Mon. Wea. Rev.*, **129**, 709–728.

Walker, J. P., and Houser, P. R., 2001: A methodology for initializing soil moisture in a global climate model: Assimilation of near-surface soil moisture observations. *J. Geophys. Res.*, **106**, 11761–11774.

Walker, J. P., Willgoose, G. R., and Kalma, J. D., 1999: One-dimensional soil moisture profile retrieval by assimilation of near-surface measurements: A simplified soil moisture model and field application. *J. Hydrol.*, **2**, 356–373.

Comparison of TOPLATS (a) streamflow predictions to USGS observations (both smoothed within a 30-day moving average window) and (b) volumetric surface (0–5 cm) soil moisture predictions against Oklahoma Mesonet observations at the Vinita, OK, site. Mesonet soil moisture data are unavailable for the main analysis period of October 1996 to October 1999.

Citation: Journal of Hydrometeorology 7, 3; 10.1175/JHM499.1

Improvements in integrated root-zone (0–40 cm) soil moisture predictions when model error is perfectly represented and surface soil moisture observations are assimilated into TOPLATS using the EnKF.

Impact on (a) the rmse accuracy of EnKF root-zone soil moisture predictions and (b) the mean normalized innovation (* z* is varied. Dashed horizontal line at 1 indicates theoretical expectation for

For the cases of independently assimilating both soil moisture and runoff, impact on (a) the rmse accuracy of EnKF root-zone soil moisture predictions and (b) mean normalized innovations (* z* is varied and actual model error is in *θ*_{unsat}. Dashed horizontal line at 1 indicates theoretical expectation for

Relationship between mean normalized innovations (

Normalized 40-cm soil moisture rmse for calibrated EnKF results (assimilating surface soil moisture) for various combinations of assumed and actual error types. Error magnitudes were calibrated such that

Normalized 40-cm soil moisture rmse for calibrated EnKF results (assimilating surface soil moisture) for various combinations of assumed and actual error types. Error magnitudes were calibrated such that *ρ _{ν}* is as close to zero as possible. The best (i.e., closest to zero) *ρ _{ν}* values obtained during calibration are listed in parentheses.

Normalized 40-cm soil moisture rmse for calibrated EnKF results (assimilating surface soil moisture and runoff) for various combinations of assumed and actual error types. Error magnitudes were calibrated such that