  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224.

  • Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Beezley, J. D., and J. Mandel, 2007: Morphing ensemble Kalman filters. Tellus, 60A, 131–140.

  • Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.

  • Burger, J. O., 1985: Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, 624 pp.

  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.

  • Cohn, S., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc. Japan, 74, 63–75.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.

  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367.

  • Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85–96.

  • Francis, B. A., 1987: A Course in H∞ Control Theory. Springer-Verlag, 156 pp.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Hamill, T. M., J. S. Whitaker, J. L. Anderson, and C. Snyder, 2009: Comments on “Sigma-point Kalman filter data assimilation methods for strongly nonlinear systems.” J. Atmos. Sci., 66, 3498–3500.

  • Hoteit, I., and D. T. Pham, 2004: An adaptively reduced-order extended Kalman filter for data assimilation in the tropical Pacific. J. Mar. Syst., 45, 173–188.

  • Hoteit, I., D. T. Pham, and J. Blum, 2001: A semi-evolutive partially local filter for data assimilation. Mar. Pollut. Bull., 43, 164–174.

  • Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. J. Mar. Syst., 36, 101–127.

  • Hoteit, I., D. T. Pham, G. Triantafyllou, and G. Korres, 2008: A new approximate solution of the optimal nonlinear filter for data assimilation in meteorology and oceanography. Mon. Wea. Rev., 136, 317–334.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 400 pp.

  • Kalman, R., 1960: A new approach to linear filtering and prediction problems. Trans. ASME, Ser. D. J. Basic Eng., 82, 35–45.

  • Lorenz, E. N., 1996: Predictability—A problem partly solved. Predictability, T. Palmer, Ed., ECMWF, 1–18.

  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.

  • Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. Physica D, 238, 549–562.

  • Luo, X., I. M. Moroz, and I. Hoteit, 2010: Scaled unscented transform Gaussian sum filter: Theory and application. Physica D, 239, 684–701.

  • Nerger, L., L. Hiller, and J. Schröter, 2005: A comparison of error subspace Kalman filters. Tellus, 57A, 715–735.

  • Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

  • Schlee, F. H., C. J. Standish, and N. F. Toda, 1967: Divergence in the Kalman filter. AIAA J., 5, 1114–1120.

  • Shaked, U., and N. Berman, 1995: H∞ nonlinear filtering of discrete-time processes. IEEE Trans. Sig. Proc., 43, 2205–2209.

  • Simon, D., 2006: Optimal State Estimation: Kalman, H-Infinity, and Nonlinear Approaches. Wiley-Interscience, 552 pp.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • Van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

  • Van Leeuwen, P. J., 2009: Particle filtering in geophysical systems. Mon. Wea. Rev., 137, 4089–4114.

  • Verlaan, M., and A. W. Heemink, 1997: Tidal flow forecasting using reduced rank square root filters. Stoch. Hydrol. Hydraul., 11, 349–368.

  • Wang, D., and X. Cai, 2008: Robust data assimilation in hydrological modeling – A comparison of Kalman and H-infinity filters. Adv. Water Resour., 31, 455–472.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.

  • Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. Mon. Wea. Rev., 133, 1710–1726.

  • Fig. 1. A time series generated by Eq. (47).

  • Fig. 2. RMSE of the KF in assimilating the regression model. The jump heights h are (top) 10 and (bottom) 30.

  • Fig. 3. RMSE difference between the TLHF of I-BG and the KF in assimilating the regression model. The jump heights h are (top) 10 and (bottom) 30. In the TLHF, the PLC c = 0.1.

  • Fig. 4. As in Fig. 3, but with c = 0.5.

  • Fig. 5. As in Fig. 3, but with c = 0.9.

  • Fig. 6. RMSE difference between the TLHF of I-ANA and I-MTX and the KF in assimilating the regression model. The jump heights h are (top) 10 and (bottom) 30. In the TLHF, the PLC c = 0.1.

  • Fig. 7. As in Fig. 6, but with c = 0.4.

  • Fig. 8. As in Fig. 6, but with c = 0.6.

  • Fig. 9. Time mean RMSE of the EnTLHF of I-BG as a function of the PLC in assimilating the LE98 model. The values of the parameter F are 6 (dash–dotted) and 8 (dotted). The EnTLHF reduces to the ETKF when the PLC c = 0.

  • Fig. 10. As in Fig. 9, but for I-ANA.

  • Fig. 11. As in Fig. 9, but for I-MTX. Filter divergence occurs for c > 0.6.


Robust Ensemble Filtering and Its Relation to Covariance Inflation in the Ensemble Kalman Filter

  • 1 King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

Abstract

A robust ensemble filtering scheme based on the H∞ filtering theory is proposed. The optimal H∞ filter is derived by minimizing the supremum (or maximum) of a predefined cost function, a criterion different from the minimum variance used in the Kalman filter. By design, the H∞ filter is more robust than the Kalman filter, in the sense that the estimation error in the H∞ filter in general has a finite growth rate with respect to the uncertainties in assimilation, except for a special case that corresponds to the Kalman filter.

The original form of the H∞ filter contains global constraints in time, which may be inconvenient for sequential data assimilation problems. Therefore a variant is introduced that solves some time-local constraints instead, and hence it is called the time-local H∞ filter (TLHF). By analogy to the ensemble Kalman filter (EnKF), the concept of ensemble time-local H∞ filter (EnTLHF) is also proposed. The general form of the EnTLHF is outlined, and some of its special cases are discussed. In particular, it is shown that an EnKF with certain covariance inflation is essentially an EnTLHF. In this sense, the EnTLHF provides a general framework for conducting covariance inflation in the EnKF-based methods. Some numerical examples are used to assess the relative robustness of the TLHF–EnTLHF in comparison with the corresponding KF–EnKF method.

Corresponding author address: Luo Xiaodong, 4700 King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia. E-mail: xiaodong.luo@kaust.edu.sa

1. Introduction

The Kalman filter (KF) (Kalman 1960) is a sequential data assimilation algorithm. For linear stochastic systems, it can be shown that the KF is an optimal linear estimator that minimizes the variance of the estimation error (Simon 2006, chapter 5). Because of its relative simplicity in implementation, the KF is suitable for many data assimilation problems. However, for high-dimensional systems such as weather forecasting models, direct application of the KF is prohibitively expensive as it involves manipulating covariance matrices of the system states. For this reason, different modifications of the KF were proposed to reduce the computational cost. These include various ensemble Kalman filters (EnKFs) (Anderson 2001; Bishop et al. 2001; Burgers et al. 1998; Evensen 1994; Evensen and van Leeuwen 1996; Houtekamer and Mitchell 1998; Whitaker and Hamill 2002), the error subspace-based filters (Cohn and Todling 1996; Hoteit et al. 2001, 2002; Luo and Moroz 2009; Pham et al. 1998; Verlaan and Heemink 1997), and filters based on other strategies (Beezley and Mandel 2007; Zupanski 2005), to name but a few. A detailed description of the above filters is beyond the scope of this work. Readers are referred to Evensen (2003), Nerger et al. (2005), and Tippett et al. (2003) for reviews of some of the aforementioned filters. Roughly speaking, these modifications exploit the information of a subset in the state space of a dynamical system, while the information of the complement set is considered less influential and thus ignored. Consequently, the computations of these modified filters are normally conducted on the chosen subsets, instead of the whole state space, so that their computational costs are reduced. For simplicity, we may sometimes abuse the terminology by referring to all the aforementioned filters as the EnKF-based methods (EnKF methods for short).

The KF and the EnKF are among the family of Bayesian filters that adopt Bayes’ rule to update background statistics to their analysis counterparts. In these filters, one needs to make certain assumptions on the statistical properties [e.g., probability distribution functions (pdfs) or moments] of both the dynamical and observation systems. In reality, however, these assumptions may not be accurate, so that a Bayesian filter may fail to achieve good performance with mis-specified statistical information (Schlee et al. 1967). For example, if implemented straightforwardly, an EnKF with a relatively small ensemble size may produce inaccurate estimations of covariance matrices (Whitaker and Hamill 2002). This could degrade filter performance or even cause filter divergence. As a remedy, in practice it is customary to conduct covariance inflation and localization to relieve this problem (Anderson and Anderson 1999; Hamill et al. 2001, 2009; Van Leeuwen 2009).

In contrast, robust filters emphasize the robustness of their estimates, so that they may have better tolerances to possible uncertainties in assimilation. The estimation strategies of robust filters are different from Bayes’ rule. One can take the H∞ filter (Francis 1987; Simon 2006), one of the robust filters, as an example. The H∞ filter (HF) does not require one to exactly know the statistical properties of a system being assimilated. Instead, it accepts the possibility that one may only have incomplete information of the system. Consequently, rather than looking for the best possible estimates based on Bayes’ rule, the optimal H∞ filter employs a robust strategy, namely, the minimax rule (Burger 1985, chapter 4), to update its background statistics. This robustness may be of interest in practical situations. For example, for data assimilation in earth systems, the system models are often not the exact descriptions of the underlying physical processes, and it is challenging to characterize the properties of the corresponding model errors (Wang and Cai 2008 and the references therein). Given an imperfect model, the estimation error of the HF in general grows with the uncertainties in assimilation at a finitely bounded rate (except for the special case when the HF reduces to the KF itself), while the estimation error of the KF does not possess such a guarantee.

In this work we propose a variant of the HF, called the time-local HF (TLHF), to avoid solving global constraints as in the HF. By analogy to the EnKF, we further introduce the ensemble TLHF (EnTLHF) for data assimilation in high-dimensional systems. We show that the EnTLHF can be constructed based on the EnKF, and thus the computational complexity of the EnTLHF is in general comparable to that of the EnKF. We also show that some specific forms of the EnTLHF have connections with some EnKFs equipped with certain covariance inflation techniques. More generally, we show that an EnKF with a certain covariance inflation technique is in fact an EnTLHF.

The organization of this work is as follows. Section 2 presents the data assimilation problem and discusses its solutions in terms of the KF and the HF, respectively. Section 3 introduces the TLHF as a variant of the HF, and its ensemble form, the EnTLHF. Section 4 discusses some specific forms of the EnTLHF and shows their connections with some of the EnKF methods with covariance inflation. In section 5, we use some numerical examples to show the relative robustness of the TLHF (EnTLHF) in comparison to the corresponding KF (EnKF) method.

2. Problem statement

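We consider the state-estimation problem in the following scenario:
$$\mathbf{x}_i = \mathbf{M}_{i,i-1}\,\mathbf{x}_{i-1} + \mathbf{u}_i, \tag{1a}$$
$$\mathbf{y}_i = \mathbf{H}_i\,\mathbf{x}_i + \mathbf{v}_i, \tag{1b}$$
$$\mathbb{E}(\mathbf{u}_i) = \mathbf{0}, \quad \mathbb{E}(\mathbf{u}_i \mathbf{u}_j^{T}) = \delta_{ij}\,\mathbf{Q}_i, \tag{1c}$$
$$\mathbb{E}(\mathbf{v}_i) = \mathbf{0}, \quad \mathbb{E}(\mathbf{v}_i \mathbf{v}_j^{T}) = \delta_{ij}\,\mathbf{R}_i, \tag{1d}$$
$$\mathbb{E}(\mathbf{u}_i \mathbf{v}_j^{T}) = \mathbf{0}. \tag{1e}$$

Equations (1a) and (1b) represent the $m_x$-dimensional dynamical system and the $m_y$-dimensional observation system, respectively, where $\mathbf{x}_i$ denotes the $m_x$-dimensional state vector, $\mathbf{y}_i$ the corresponding $m_y$-dimensional observation, $\mathbf{M}_{i,i-1}$ the transition operator, and $\mathbf{H}_i$ the observation operator, at time $i$. For convenience, in this section we assume the systems in Eqs. (1a) and (1b) are linear so that $\mathbf{M}_{i,i-1}$ and $\mathbf{H}_i$ are two matrices. The estimation problem in the presence of nonlinearity will be addressed in terms of the EnTLHF, as will be discussed later.

Equations (1c)–(1e) imply that the $m_x$-dimensional dynamical noise $\mathbf{u}_i$ and the $m_y$-dimensional observation noise $\mathbf{v}_i$ are uncorrelated white noise,¹ with zero mean and covariances $\mathbf{Q}_i$ and $\mathbf{R}_i$, respectively; $\delta_{ij}$ denotes the Kronecker delta function such that $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ otherwise.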

In what follows, we discuss two filtering approaches as the solutions of the above state-estimation problem: 1) the KF, which is based on the criterion of minimizing the variance of the estimation error [equivalent to applying Bayes’ rule to update background statistics as shown in Jazwinski (1970, chapter 7)]; and 2) the optimal HF, which is based on the criterion of minimizing the supremum (or maximum) of the ratio of the “energy” of the estimation error to the “energy” of the uncertainties in data assimilation (to be made clear shortly). In what follows we outline the main results of the KF and the HF. For more details, readers will be referred to appropriate references.

a. Kalman filter as a solution

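The KF algorithm involves prediction and filtering steps, the derivations of which can be found in, for example, Simon (2006, chapter 3). When the KF is applied to assimilate the system in Eq. (1), these steps are as follows.

  • Prediction step: Propagate the analysis $\hat{\mathbf{x}}_{i-1}^a$ at the $(i-1)$th cycle and the associated analysis error covariance $\mathbf{P}_{i-1}^a$ forward to obtain the background $\hat{\mathbf{x}}_i^b$ at the next cycle [Eq. (2a)] and the associated background error covariance $\mathbf{P}_i^b$ [Eq. (2b)], respectively:
    $$\hat{\mathbf{x}}_i^b = \mathbf{M}_{i,i-1}\,\hat{\mathbf{x}}_{i-1}^a, \tag{2a}$$
    $$\mathbf{P}_i^b = \mathbf{M}_{i,i-1}\,\mathbf{P}_{i-1}^a\,\mathbf{M}_{i,i-1}^{T} + \mathbf{Q}_i. \tag{2b}$$

  • Filtering step: With a new incoming observation $\mathbf{y}_i$, update the background $\hat{\mathbf{x}}_i^b$ and the associated error covariance $\mathbf{P}_i^b$ to their analysis counterparts, $\hat{\mathbf{x}}_i^a$ [Eq. (3a)] and $\mathbf{P}_i^a$ [Eq. (3b)], respectively:
    $$\hat{\mathbf{x}}_i^a = \hat{\mathbf{x}}_i^b + \mathbf{K}_i\,(\mathbf{y}_i - \mathbf{H}_i\,\hat{\mathbf{x}}_i^b), \tag{3a}$$
    $$\mathbf{P}_i^a = \mathbf{P}_i^b - \mathbf{K}_i\,\mathbf{H}_i\,\mathbf{P}_i^b, \tag{3b}$$
    $$\mathbf{K}_i = \mathbf{P}_i^b\,\mathbf{H}_i^{T}\,(\mathbf{H}_i\,\mathbf{P}_i^b\,\mathbf{H}_i^{T} + \mathbf{R}_i)^{-1}, \tag{3c}$$
where $\mathbf{K}_i$ in Eq. (3c) is the Kalman gain. Alternatively, Eqs. (3b) and (3c) can be reformulated as (Simon 2006, chapter 6)
$$(\mathbf{P}_i^a)^{-1} = (\mathbf{P}_i^b)^{-1} + \mathbf{H}_i^{T}\,\mathbf{R}_i^{-1}\,\mathbf{H}_i, \tag{4a}$$
$$\mathbf{K}_i = \mathbf{P}_i^a\,\mathbf{H}_i^{T}\,\mathbf{R}_i^{-1}. \tag{4b}$$
Equation (4) will be used to simplify the presentation of the HF later on.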
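As a minimal sketch of the two steps above (not the authors' implementation), the following NumPy code runs one prediction step and one filtering step on a small made-up linear system, and checks numerically that the gain form of Eq. (3) and the information form of Eq. (4) give the same analysis. The names `M`, `H`, `Q`, `R`, `x_a`, `P_a` and all numerical values are illustrative assumptions.

```python
import numpy as np

def kf_predict(x_a, P_a, M, Q):
    """Prediction step: propagate the analysis mean and covariance forward."""
    x_b = M @ x_a
    P_b = M @ P_a @ M.T + Q
    return x_b, P_b

def kf_update_gain_form(x_b, P_b, y, H, R):
    """Filtering step written with the Kalman gain computed from P_b."""
    K = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)
    x_a = x_b + K @ (y - H @ x_b)
    P_a = P_b - K @ H @ P_b
    return x_a, P_a, K

def kf_update_information_form(x_b, P_b, y, H, R):
    """Same filtering step, with the covariance updated through its inverse."""
    R_inv = np.linalg.inv(R)
    P_a = np.linalg.inv(np.linalg.inv(P_b) + H.T @ R_inv @ H)
    K = P_a @ H.T @ R_inv
    x_a = x_b + K @ (y - H @ x_b)
    return x_a, P_a, K

# Assumed toy system: two state variables, only the first one observed.
M = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])

x_b, P_b = kf_predict(np.array([0.0, 1.0]), np.eye(2), M, Q)
y = np.array([0.3])

x_a1, P_a1, K1 = kf_update_gain_form(x_b, P_b, y, H, R)
x_a2, P_a2, K2 = kf_update_information_form(x_b, P_b, y, H, R)
print("forms agree:", np.allclose(x_a1, x_a2) and np.allclose(P_a1, P_a2))
```

The information form is the one that carries over most directly to the HF, because the robustness term enters as a correction to the inverse of the analysis uncertainty matrix.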
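To better understand the difference between the KF and the HF, we extend our discussion to a slightly more general situation. Instead of looking for an estimate of the system state $\mathbf{x}_i$, we are interested in estimating some $m_z$-dimensional linear transform $\mathbf{z}_i$ of $\mathbf{x}_i$, in terms of
$$\mathbf{z}_i = \mathbf{L}_i\,\mathbf{x}_i, \tag{5}$$
with $\mathbf{L}_i$ being a user-defined $m_z \times m_x$ matrix. In Eq. (5), if $\mathbf{L}_i$ is the $m_x \times m_x$ identity matrix, then $\mathbf{z}_i$ reduces to $\mathbf{x}_i$ itself. But $\mathbf{L}_i$ can be something else in general. For instance, one may let $\mathbf{L}_i = (1, 0, \dots, 0)$ such that $\mathbf{z}_i$ is equivalent to the first element of $\mathbf{x}_i$.
In the KF, one can solve the estimation problem by minimizing the following cost function,
$$J^{z} = \sum_{i=0}^{N} J_i^{z}, \qquad J_i^{z} = \mathbb{E}\,\|\mathbf{z}_i - \hat{\mathbf{z}}_i^a\|_2^2, \tag{6}$$
with respect to the variables $\{\hat{\mathbf{z}}_i^a\}$. Here, $\hat{\mathbf{z}}_i^a$ represents the analysis of the truth $\mathbf{z}_i$, while $J_i^{z}$, the cost function local to the $i$th cycle with respect to $\mathbf{z}_i$, is the variance of the estimation error at the $i$th assimilation cycle, with $\mathbb{E}$ being the expectation operator, and $\|\cdot\|_2$ denoting the $L_2$ norm. Given an $m_z$-dimensional vector $\mathbf{z}$, $\|\mathbf{z}\|_2^2 = \mathbf{z}^{T}\mathbf{z}$.
The estimates of the KF are obtained by sequentially minimizing the local cost function $J_i^{z}$ and then propagating the resulting analysis forward to yield the background at the next cycle. By treating $\mathbf{z}_i$ as a system state, direct sequential minimization of the local cost functions is equivalent to running the KF recursively to assimilate the following modified systems:
$$\mathbf{z}_i = \mathbf{L}_i\,\mathbf{M}_{i,i-1}\,\mathbf{L}_{i-1}^{-1}\,\mathbf{z}_{i-1} + \mathbf{L}_i\,\mathbf{u}_i, \tag{7a}$$
$$\mathbf{y}_i = \mathbf{H}_i\,\mathbf{L}_i^{-1}\,\mathbf{z}_i + \mathbf{v}_i, \tag{7b}$$
where $\mathbf{z}_i$ and $\mathbf{L}_i\mathbf{u}_i$ are the system state and dynamical noise at instant $i$, respectively, associated with the modified transition operator $\mathbf{L}_i\mathbf{M}_{i,i-1}\mathbf{L}_{i-1}^{-1}$, where $\mathbf{L}_{i-1}^{-1}$ may have to be treated as the pseudo-inverse of $\mathbf{L}_{i-1}$ in some circumstances; $\mathbf{y}_i$ and $\mathbf{v}_i$ still correspond to the observation vector and noise, respectively, but the observation operator becomes $\mathbf{H}_i\mathbf{L}_i^{-1}$.
Alternatively, one may solve the estimation problem in an indirect way, but without changing the systems in assimilation. To this end, one specifies a new cost function,
$$J^{x} = \sum_{i=0}^{N} J_i^{x}, \qquad J_i^{x} = \mathbb{E}\,\|\mathbf{x}_i - \hat{\mathbf{x}}_i^a\|_2^2, \tag{8}$$
with respect to the variables $\{\hat{\mathbf{x}}_i^a\}$ to determine the optimal estimations $\hat{\mathbf{x}}_i^a$ by sequentially minimizing the local cost functions $J_i^{x}$ (equivalent to running the KF recursively to assimilate the system in Eq. (1)) and then applies Eq. (5) to obtain the estimations of interest.
The user-defined matrix $\mathbf{L}_i$ does not appear in Eq. (8), which implies that in indirect estimation, the estimates $\hat{\mathbf{x}}_i^a$ of the KF do not depend on the choice of $\mathbf{L}_i$, so that by putting
$$\hat{\mathbf{z}}_i^a = \mathbf{L}_i\,\hat{\mathbf{x}}_i^a, \tag{9}$$
direct and indirect methods yield the same estimates $\hat{\mathbf{z}}_i^a$. However, for the HF to be presented below, we will see that the choice of $\mathbf{L}_i$ does affect the estimates $\hat{\mathbf{x}}_i^a$ in indirect estimation, which is clearly different from the KF.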
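The equivalence of direct and indirect estimation in the KF can be checked numerically. The sketch below is an illustration under stated assumptions rather than a result from the text: it assumes a square, invertible transform `L` (so no pseudo-inverse is needed), takes the transformed system to have transition operator `L M L^{-1}`, observation operator `H L^{-1}`, and dynamical noise covariance `L Q L^T`, and uses made-up numerical values.

```python
import numpy as np

def kf_cycle(x_a, P_a, y, M, H, Q, R):
    """One KF cycle: prediction followed by the filtering step."""
    x_b = M @ x_a
    P_b = M @ P_a @ M.T + Q
    K = P_b @ H.T @ np.linalg.inv(H @ P_b @ H.T + R)
    return x_b + K @ (y - H @ x_b), P_b - K @ H @ P_b

# Assumed toy system and an assumed invertible transform L.
M = np.array([[1.0, 0.1], [0.0, 0.9]])
H = np.array([[1.0, 0.0]])
Q = 0.05 * np.eye(2)
R = np.array([[0.2]])
L = np.array([[2.0, 1.0], [0.0, 1.0]])
L_inv = np.linalg.inv(L)

x_a0 = np.array([1.0, -0.5])
P_a0 = np.eye(2)
y = np.array([0.8])

# Indirect: assimilate the original system, then transform the analysis.
x_a1, _ = kf_cycle(x_a0, P_a0, y, M, H, Q, R)
z_indirect = L @ x_a1

# Direct: assimilate the transformed system whose state is z = L x.
z_direct, _ = kf_cycle(L @ x_a0, L @ P_a0 @ L.T, y,
                       L @ M @ L_inv, H @ L_inv, L @ Q @ L.T, R)

print("direct equals indirect:", np.allclose(z_direct, z_indirect))
```

The agreement follows because the transformed background covariance is `L P_b L^T`, so the transformed gain is `L K` and the transformed analysis is `L` times the original one.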

b. H∞ filter as a solution

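The HF (Simon 2006, chapter 11 and references therein) aims to provide robust, but not necessarily best, estimates. The main idea is to first recognize that in the Eq. (1) system, there are three possible sources that contribute to the uncertainties in data assimilation, namely, the uncertainties in the initial conditions, the model error, and the observation error. Accordingly, during an assimilation time window $[0, N]$, these uncertainty sources are characterized by three uncertainty “energy” terms, defined as $\|\mathbf{x}_0 - \hat{\mathbf{x}}_0\|^2_{\boldsymbol{\Delta}_0^{-1}}$, $\sum_{i=0}^{N}\|\mathbf{u}_i\|^2_{\mathbf{Q}_i^{-1}}$, and $\sum_{i=0}^{N}\|\mathbf{v}_i\|^2_{\mathbf{R}_i^{-1}}$, respectively. For consistency, here we assume that an assimilation cycle starts from propagating an analysis forward through a dynamical system, so that the dynamical noise $\mathbf{u}_0$ is included in our discussion. Given a symmetric, positive semidefinite matrix $\mathbf{W}$, $\|\cdot\|_{\mathbf{W}}$ represents the weighted $L_2$ norm so that $\|\mathbf{z}\|^2_{\mathbf{W}} = \mathbf{z}^{T}\mathbf{W}\mathbf{z}$ for a column vector $\mathbf{z}$.

Since the minimum variance criterion in the KF is consistent with Bayes’ rule, it is customary to interpret the matrices $\boldsymbol{\Delta}_0$, $\mathbf{Q}_i$, and $\mathbf{R}_i$ as the covariances (or their estimates) of the corresponding random vectors in probabilistic language. In the context of the HF, however, the minimax rule is adopted instead, which does not have probabilistic interpretations for $\boldsymbol{\Delta}_0$, $\mathbf{Q}_i$, and $\mathbf{R}_i$ in general (Simon 2006). Therefore, for conceptual distinction, in the context of the HF we call $\boldsymbol{\Delta}_0$, $\mathbf{Q}_i$, and $\mathbf{R}_i$ the uncertainty weight matrices (uncertainty matrices for short) with respect to the initial conditions, the model error, and the observation error, respectively, and their inverses, $\boldsymbol{\Delta}_0^{-1}$, $\mathbf{Q}_i^{-1}$, and $\mathbf{R}_i^{-1}$, the corresponding information matrices. The uncertainty matrices $\boldsymbol{\Delta}_0$, $\mathbf{Q}_i$, and $\mathbf{R}_i$ can be user-defined, although in practice it is convenient to let them be equal to their covariance counterparts (or the estimates) in the KF. Under this choice, the difference between the estimates of the KF and the HF becomes clear, as is shown later.

To provide robust estimates, the HF requires that the total “energy” of the estimation errors, in terms of $\sum_{i=0}^{N}\|\mathbf{z}_i - \hat{\mathbf{z}}_i^a\|^2_{\mathbf{S}_i}$, be no larger than the sum of the above three uncertainty “energy” terms times a constant factor $1/\gamma$, that is,
$$\sum_{i=0}^{N}\|\mathbf{z}_i - \hat{\mathbf{z}}_i^a\|^2_{\mathbf{S}_i} \le \frac{1}{\gamma}\left(\|\mathbf{x}_0 - \hat{\mathbf{x}}_0\|^2_{\boldsymbol{\Delta}_0^{-1}} + \sum_{i=0}^{N}\|\mathbf{u}_i\|^2_{\mathbf{Q}_i^{-1}} + \sum_{i=0}^{N}\|\mathbf{v}_i\|^2_{\mathbf{R}_i^{-1}}\right). \tag{10}$$
In Eq. (10), the weight matrix $\mathbf{S}_i$ is also user-chosen, which can be used to guide the filter’s behavior in estimation (Simon 2006, chapter 11).
To solve Eq. (10), we first define the following cost function $J^{HF}$:
$$J^{HF} = \frac{\sum_{i=0}^{N}\|\mathbf{z}_i - \hat{\mathbf{z}}_i^a\|^2_{\mathbf{S}_i}}{\|\mathbf{x}_0 - \hat{\mathbf{x}}_0\|^2_{\boldsymbol{\Delta}_0^{-1}} + \sum_{i=0}^{N}\|\mathbf{u}_i\|^2_{\mathbf{Q}_i^{-1}} + \sum_{i=0}^{N}\|\mathbf{v}_i\|^2_{\mathbf{R}_i^{-1}}}, \tag{11}$$
which is a function of the variables $\mathbf{x}_0$, $\{\mathbf{u}_i\}$, $\{\mathbf{v}_i\}$, and $\{\hat{\mathbf{z}}_i^a\}$, where $\{\cdot\}$ stands for a set of random vectors.² Then the inequality [Eq. (10)] is equivalent to $J^{HF} \le 1/\gamma$, except when the uncertainty “energy” term
$$\|\mathbf{x}_0 - \hat{\mathbf{x}}_0\|^2_{\boldsymbol{\Delta}_0^{-1}} + \sum_{i=0}^{N}\|\mathbf{u}_i\|^2_{\mathbf{Q}_i^{-1}} + \sum_{i=0}^{N}\|\mathbf{v}_i\|^2_{\mathbf{R}_i^{-1}}$$
is zero. In this exceptional case, there still exists an HF solution to make the inequality [Eq. (10)] hold. For example, if $\boldsymbol{\Delta}_0$, $\mathbf{Q}_i$, and $\mathbf{R}_i$ are all $+\infty$, so that their inverses are zero, we can choose $\mathbf{S}_i = \mathbf{0}$ so that the left-hand side (lhs) of Eq. (10) is also zero (for consistency, we define the products $\infty \times 0 = 0 \times \infty = 0$ to avoid possible problems). Similarly, if $\mathbf{x}_0 - \hat{\mathbf{x}}_0$, $\mathbf{u}_i$, and $\mathbf{v}_i$ are all zero, we have the ideal solution $\hat{\mathbf{z}}_i^a = \mathbf{z}_i$ to make the lhs of Eq. (10) become zero.
Let $\gamma^*$ be the value such that
$$\frac{1}{\gamma^*} = \inf_{\{\hat{\mathbf{z}}_i^a\}}\ \sup_{\mathbf{x}_0,\{\mathbf{u}_i\},\{\mathbf{v}_i\}} J^{HF}, \tag{12}$$
where $\sup$ calculates the supremum of the cost function $J^{HF}$ with respect to the variables $\mathbf{x}_0$, $\{\mathbf{u}_i\}$, $\{\mathbf{v}_i\}$ (which is a function of $\{\hat{\mathbf{z}}_i^a\}$), and $\inf$ evaluates the infimum with respect to $\{\hat{\mathbf{z}}_i^a\}$ along the previously obtained supremum plane of $\mathbf{x}_0$, $\{\mathbf{u}_i\}$, $\{\mathbf{v}_i\}$. We say that the optimal HF is achieved if $\gamma = \gamma^*$. In this sense, the evaluation of $\gamma^*$ is an application of the minimax rule, a strategy that aims to provide robust estimates and is different from its Bayesian counterpart (Burger 1985, chapters 4 and 5).

In practice, it is difficult to evaluate the exact value of $\gamma^*$, since, by the definition in Eq. (12), $\gamma^*$ depends not only on the initial conditions and the dynamical and observation systems but also on the length $N$ of the assimilation time window. A more convenient strategy is to choose a value $\gamma$ satisfying $1/\gamma^* < 1/\gamma \le +\infty$, so that it guarantees that there exists a (suboptimal) HF solution satisfying the inequality Eq. (10) (Simon 2006, chapter 11). In contrast, if $1/\gamma < 1/\gamma^*$, then there is no guarantee (although it is not impossible) that there exists such an HF solution. For instance, in the extreme event with the variables $\mathbf{x}_0$, $\{\mathbf{u}_i\}$, $\{\mathbf{v}_i\}$ located on their supremum plane, then by the definition in Eq. (12), there is no HF solution to make $J^{HF} \le 1/\gamma$. For reference, hereafter we call $\gamma$ the performance level of the HF.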

The inequality Eq. (10) can be solved through dynamic constrained optimization, with Eqs. (1a) and (1b) being the constraints at different assimilation cycles. For details, readers are referred to Simon (2006, chapter 11). For convenience of comparison, we also split the algorithm into prediction and filtering steps.

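  • Prediction step: As in the KF, we also propagate the analysis forward to produce the background at the next cycle:
    $$\hat{\mathbf{x}}_i^b = \mathbf{M}_{i,i-1}\,\hat{\mathbf{x}}_{i-1}^a, \tag{13a}$$
    $$\boldsymbol{\Delta}_i^b = \mathbf{M}_{i,i-1}\,\boldsymbol{\Delta}_{i-1}^a\,\mathbf{M}_{i,i-1}^{T} + \mathbf{Q}_i. \tag{13b}$$
  • Filtering step: With a new observation $\mathbf{y}_i$, we update the background to the analysis:
    $$\hat{\mathbf{x}}_i^a = \hat{\mathbf{x}}_i^b + \mathbf{G}_i\,(\mathbf{y}_i - \mathbf{H}_i\,\hat{\mathbf{x}}_i^b), \tag{14a}$$
    $$(\boldsymbol{\Delta}_i^a)^{-1} = (\boldsymbol{\Delta}_i^b)^{-1} + \mathbf{H}_i^{T}\,\mathbf{R}_i^{-1}\,\mathbf{H}_i - \gamma\,\mathbf{L}_i^{T}\,\mathbf{S}_i\,\mathbf{L}_i, \tag{14b}$$
    $$\mathbf{G}_i = \boldsymbol{\Delta}_i^a\,\mathbf{H}_i^{T}\,\mathbf{R}_i^{-1}, \tag{14c}$$
    subject to the constraints
    $$(\boldsymbol{\Delta}_i^a)^{-1} = (\boldsymbol{\Delta}_i^b)^{-1} + \mathbf{H}_i^{T}\,\mathbf{R}_i^{-1}\,\mathbf{H}_i - \gamma\,\mathbf{L}_i^{T}\,\mathbf{S}_i\,\mathbf{L}_i \ge 0. \tag{15}$$

Here $\boldsymbol{\Delta}_i^b$ and $\boldsymbol{\Delta}_i^a$ denote the background and analysis uncertainty matrices, analogous to the covariance matrices $\mathbf{P}_i^b$ and $\mathbf{P}_i^a$ in the KF; and $\mathbf{G}_i$ represents the gain matrix, analogous to the Kalman gain $\mathbf{K}_i$. The $\ge 0$ means that $(\boldsymbol{\Delta}_i^a)^{-1}$ must be positive semidefinite. After obtaining $\hat{\mathbf{x}}_i^a$, one can apply Eq. (9) to estimate $\mathbf{z}_i$.

Comparing Eq. (13) with Eq. (2), one can see that the prediction steps of the KF and the HF are the same. Furthermore, the update formula Eq. (14a) of the HF is a linear estimator as in the KF, but in general with a different gain matrix $\mathbf{G}_i$. However, if $\gamma = 0$ in Eq. (14b), then Eqs. (14a), (14b), and (14c) reduce to Eqs. (3a), (4a), and (4b), respectively. In this case, the HF is equivalent to the KF. Therefore, the KF can be considered as a special case of the HF with a performance level $\gamma = 0$.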
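The filtering step of the HF can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' code: it assumes the information-form update in which the term `gamma * L.T @ S @ L` is subtracted from the KF information matrix, it requires the result to be strictly positive definite (slightly stronger than the positive semidefiniteness asked for in Eq. (15)) so that the inverse exists, and all names and numbers are made up.

```python
import numpy as np

def hf_update(x_b, D_b, y, H, R, L, S, gamma):
    """One HF filtering step for a given performance level gamma (assumed form)."""
    R_inv = np.linalg.inv(R)
    # Information matrix of the analysis: KF part minus the robustness term.
    info_a = np.linalg.inv(D_b) + H.T @ R_inv @ H - gamma * (L.T @ S @ L)
    # Constraint check; strict positivity is required here so info_a is invertible.
    if np.linalg.eigvalsh((info_a + info_a.T) / 2).min() <= 0.0:
        raise ValueError("gamma violates the constraint on the information matrix")
    D_a = np.linalg.inv(info_a)
    G = D_a @ H.T @ R_inv
    x_a = x_b + G @ (y - H @ x_b)
    return x_a, D_a, G

# Assumed toy setting: two state variables, the first one observed.
D_b = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
L = np.eye(2)          # estimate the full state
S = np.eye(2)          # unit weight on the estimation error
x_b = np.array([0.0, 0.0])
y = np.array([1.0])

x_kf, D_kf, G_kf = hf_update(x_b, D_b, y, H, R, L, S, gamma=0.0)  # KF limit
x_hf, D_hf, G_hf = hf_update(x_b, D_b, y, H, R, L, S, gamma=0.3)

print("gain on observed variable, KF vs HF:", G_kf[0, 0], G_hf[0, 0])
print("D_hf - D_kf is PSD:", np.linalg.eigvalsh(D_hf - D_kf).min() >= -1e-12)
```

With `gamma = 0` the routine reproduces the KF analysis. With a positive admissible `gamma` the analysis uncertainty matrix is larger and the gain on the observed variable increases, and a value of `gamma` that is too large is rejected by the constraint check.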

A further examination of Eq. (14) reveals that the user-defined matrix $\mathbf{L}_i$ is involved in the evaluation of $\boldsymbol{\Delta}_i^a$, hence those of $\mathbf{G}_i$ and $\hat{\mathbf{x}}_i^a$. This implies that, unlike the KF, the choice of $\mathbf{L}_i$ affects the estimates $\hat{\mathbf{x}}_i^a$ of the HF. In addition, let
$$(\mathbf{P}_i^a)^{-1} = (\boldsymbol{\Delta}_i^b)^{-1} + \mathbf{H}_i^{T}\,\mathbf{R}_i^{-1}\,\mathbf{H}_i. \tag{16}$$
Then, following Eq. (4a), it is clear that $\mathbf{P}_i^a$ is an uncertainty matrix obtained by updating $\boldsymbol{\Delta}_i^b$ through the KF, while the information matrix
$$(\boldsymbol{\Delta}_i^a)^{-1} = (\mathbf{P}_i^a)^{-1} - \gamma\,\mathbf{L}_i^{T}\,\mathbf{S}_i\,\mathbf{L}_i \le (\mathbf{P}_i^a)^{-1}. \tag{17}$$
Therefore, the HF appears more conservative, in the sense that it tends to make its analysis uncertainty matrix $\boldsymbol{\Delta}_i^a$ larger than its counterpart $\mathbf{P}_i^a$ in the KF, given the same background uncertainty matrix $\boldsymbol{\Delta}_i^b$. Consequently, when using Eq. (14a) to update the background to the analysis, the HF allocates more weight to the observation $\mathbf{y}_i$ than the KF does, which may be preferred when the background is not very reliable because of the uncounted sources of uncertainties [similar arguments can also be found in, e.g., Jazwinski (1970) and Van Leeuwen (2009)].³ In fact, as will be shown later, this conservativeness exhibits connections to covariance inflation techniques adopted in some EnKF methods.

The presence of the term in Eq. (14a) reflects the fundamental difference in the estimation strategies employed in the KF and the HF. As pointed out previously, the KF assumes that one has sufficient information of the statistical properties of both the dynamical and observation systems. Under this assumption, the KF looks for the best possible estimates under a certain optimality criterion (e.g., minimum variance or maximum likelihood). Therefore, the KF can achieve good performance if the statistical description of the assimilated system is sufficiently close to the reality. However, if there exists substantial deviation, then the KF may have poor performance, or even diverge [see Schlee et al. (1967) for such examples]. In contrast, the HF is more conservative than the KF since it only aims to provide robust, rather than best, estimates. The HF may perform worse than the KF if the statistical properties (e.g., pdf’s or moments) of the assimilated system are well characterized and close to the truth. However, if there exist more uncertainties, the HF can perform better than the KF [cf. Simon (2006, p. 358) for some examples].

A further issue that may be of interest in practice is the choice of the term , such that one may improve the performance of the filter under other measures, for example, root-mean-square errors (RMSEs). This is possible, since the HF only accounts for the robustness property, and one has certain freedom in choosing the forms of the quantities and γ, as long as satisfies the constraints in the HF.⁴ In other words, the robustness requirement in general does not yield unique choices of γ and . One may impose other objective functions to the HF to improve its performance under the relevant measures. One such example is the mixed (nonlinear) Kalman and H∞ filter (Simon 2006, chapter 12), where one tries to minimize a least squares cost function as in the KF, while it imposes certain robustness on the estimates in the sense of Eq. (10). Other examples can be found in Anderson’s (2007, 2009) works on adaptive covariance inflation in the EnKF, where the criterion that the pdf of the covariance inflation factor be maximized is used. This point will become clear after we establish the connection between robustness and covariance inflation in subsequent sections.

3. Time-local ensemble H∞ filter

The HF has to satisfy the inequality constraints in Eq. (15), which makes it challenging and inefficient for sequential data assimilation in certain circumstances. To see this, suppose that for $i = 0, \dots, N$, the HF has an admissible solution $\{\hat{\mathbf{x}}_i^a\}$, with a prespecified performance level $\gamma$ satisfying all the inequality constraints in Eq. (15). When extending the time horizon from $N$ to $N + 1$, $\gamma$ may not satisfy the constraint at $N + 1$. As a result, one has to choose a smaller value for $\gamma$ and restart the assimilation in the new time window $[0, N + 1]$, resulting in a different filter solution for $i = 0, \dots, N$.

Alternatively, one may keep the solution between $i = 0, \dots, N$ unchanged. From $N + 1$, one uses a smaller value $\gamma'$ for estimation as long as it satisfies Eq. (15). Once $\gamma'$ violates the constraint for a larger $N$, one adopts an even smaller performance level $\gamma''$ but still keeps the previously obtained estimates, and so on. In what follows, we extend this idea further. We propose a variant of the HF, called the time-local H∞ filter (TLHF), in which we impose a local cost function $J_i^{HF}$ and adopt a local performance level $\gamma_i$ to solve a local constraint at each assimilation cycle.
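To make the idea of a cycle-dependent performance level concrete, the sketch below computes, for one assimilation cycle, the largest value of `gamma` for which the information matrix of the HF update stays positive semidefinite, and then picks a local level as a fraction `c` of that bound. Several things here are assumptions made for illustration only: the constraint is taken in the form `inv(D_b) + H^T R^{-1} H - gamma * L^T S L >= 0`, the matrix `L^T S L` is assumed nonzero, the rule `gamma_i = c * bound` with `0 <= c < 1` is a hypothetical selection rule, and the function names and numerical values are made up.

```python
import numpy as np

def gamma_upper_bound(D_b, H, R, L, S):
    """Largest gamma keeping A - gamma * B positive semidefinite (assumed constraint).

    A = inv(D_b) + H^T R^{-1} H is positive definite and B = L^T S L is
    symmetric positive semidefinite and assumed nonzero, so the bound is
    1 / lambda_max(C^{-1} B C^{-T}) with A = C C^T.
    """
    A = np.linalg.inv(D_b) + H.T @ np.linalg.inv(R) @ H
    B = L.T @ S @ L
    C = np.linalg.cholesky(A)          # lower-triangular factor, A = C C^T
    X = np.linalg.solve(C, B)          # C^{-1} B
    Y = np.linalg.solve(C, X.T)        # C^{-1} B C^{-T}, using B = B^T
    lam_max = np.linalg.eigvalsh((Y + Y.T) / 2).max()
    return 1.0 / lam_max

def local_gamma(c, D_b, H, R, L, S):
    """Hypothetical rule: a fraction c (0 <= c < 1) of the admissible bound."""
    if not 0.0 <= c < 1.0:
        raise ValueError("c must lie in [0, 1)")
    return c * gamma_upper_bound(D_b, H, R, L, S)

H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
L = np.eye(2)
S = np.eye(2)

# Two cycles with different background uncertainty matrices (made-up values).
bound_1 = gamma_upper_bound(np.eye(2), H, R, L, S)
bound_2 = gamma_upper_bound(4.0 * np.eye(2), H, R, L, S)
print("admissible bounds at the two cycles:", bound_1, bound_2)
print("local levels for c = 0.5:", local_gamma(0.5, np.eye(2), H, R, L, S),
      local_gamma(0.5, 4.0 * np.eye(2), H, R, L, S))
```

In this toy setting a fixed level of 0.5 would be admissible at the first cycle but not at the second, which is the situation that motivates choosing the performance level cycle by cycle. Setting `c = 0` gives `gamma_i = 0`, that is, the KF update.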

a. Time-local H∞ filter for linear systems

We first introduce the TLHF for linear systems. The extension to nonlinear systems, analogous to the EnKF methods, will be presented in the next section.

In the TLHF, we define a local cost function
e18
with respect to the variables xi, ui, vi, and . Here is the background and is fixed relative to the ith cycle . For each i, can be treated as a special case of the cost function in Eq. (11), as if N = i = 0.
Analogous to Eq. (10), we require
e19
where γi is a suitable local performance level, which satisfies
e20
with being the minimax point of . In addition, γi also has to satisfy a local inequality constraint at time instant i (to be shown later). Then for all such γi (0 ≤ i ≤ N), we have
e21
This shows that the growth rate of the total “energy” of the estimation error is finite unless γi = 0 for some i. Thus, the corresponding estimates may also provide a robust solution in the same sense as in the HF.

Equation (21) bears a similar form to Eq. (10), but also exhibits a clear difference. That is, in Eq. (21), the total “energy” of the uncertainties includes the contribution from the uncertainty in specifying the background at each assimilation cycle. In contrast, in Eq. (10) the counterpart term only represents the contribution from the uncertainty in specifying the initial conditions. The extra terms in Eq. (21) provide a possibility to take into account the effect(s) of nonlinearity and/or other mechanisms that contribute to the estimation errors in the background, so that one does not have to significantly change the structure of the HF when extending it from linear systems to nonlinear ones. For example, in the presence of nonlinearity, there may exist extra uncertainties incurred by the chosen data assimilation algorithm itself (called algorithm uncertainty hereafter), including the linearization error when one uses the extended Kalman filter (EKF) to assimilate a nonlinear system, and, more generally, the approximation error when one adopts an approximation scheme in assimilation, such as the sampling error in the EnKF, or the rank deficiency in a reduced rank filter. These potential uncertainties influence the estimations of the system states, but conceptually they might not belong to the uncertainties in specifying the dynamical or observation systems. Instead, one may treat them as the uncertainties in specifying the background, an extension of the uncertainties in specifying the initial conditions. With this treatment, one may apply the TLHF to a nonlinear system in the same way as it is applied to a linear system, while including the uncertainties due to the effect(s) of nonlinearity and/or any other error sources into the category of uncertainty in specifying the background.5

Following the same deductions as in Simon (2006, chapter 11), one can derive the steps of the TLHF as follows.

Prediction step:
e22a
e22b
Filtering step:
e23a
e23b
e23c
subject to the constraint
e24

Thus, compared with the HF, the TLHF only replaces the (global) performance level γ with the local one γi (i = 1, … , N), without changing anything else.

b. Ensemble time-local H∞ filter

The ensemble time-local H∞ filter (EnTLHF) is a straightforward analog to the EnKF methods. Here the principal idea is that, at the prediction step, one uses the background ensemble, which is the propagation of the analysis ensemble from the previous cycle, to estimate the background and the associated uncertainty matrix. Then, one updates the background uncertainty matrix to the analysis one based on an EnKF method, calculates the corresponding gain matrix of the EnTLHF, and computes the analysis mean and the associated uncertainty matrix [cf. Eq. (28) below].

Concretely, consider the n-member background ensemble at time instant i, which is the prediction of the analysis ensemble at the previous cycle, with the transition operator possibly being nonlinear.

At the prediction step of the EnTLHF,
e25a
e25b
where and denote the estimations of the background and the associated uncertainty matrix, respectively, and may be obtained through different ways in various EnKF methods.
At the filtering step, one uses in place of the background sample covariance in an EnKF method and applies the corresponding formula in the EnKF method to update to an uncertainty matrix . Then, by Eqs. (16) and (17), one has
e26
where denotes the mx-dimensional identity matrix. Thus, in general can be evaluated based on the analysis update through an EnKF method.
Substituting Eq. (26) into Eq. (23c), we have the estimated gain matrix
e27
where is the Kalman gain evaluated by an EnKF method, due to Eq. (4b). Note that for simplicity, in the above deduction we have assumed that the observation operator is linear. In the case that is nonlinear, one could also derive an approximate formula of Eq. (27) (e.g., by linearizing or using a Monte Carlo approach).

In particular, if one chooses γi = 0 for i = 0, … , N in Eqs. (26) and (27), then it is clear that and are reduced to and , their counterparts in the EnKF method, respectively. In this case, the EnTLHF is equivalent to the EnKF method without covariance inflation. If γi > 0, the EnTLHF has connections to some EnKF methods with certain covariance inflation techniques, as will be discussed in section 4.
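The mapping from the EnKF quantities to their EnTLHF counterparts can be sketched numerically. The snippet below is only an illustration under stated assumptions, not the authors' implementation: it assumes a linear observation operator and assumes that the analysis uncertainty matrix obeys the information-form relation (uncertainty)^-1 = (EnKF analysis covariance)^-1 − γi S, which is consistent with the scalar formula Δi = Σi/(1 − γiΣi) quoted in section 5a. The function and argument names are hypothetical.

```python
import numpy as np

def entlhf_from_enkf(sigma_a, kalman_gain, gamma, s_matrix):
    """Map an EnKF analysis covariance and Kalman gain to assumed EnTLHF counterparts.

    Assumed relations (not taken verbatim from the source):
        delta_a = (I - gamma * sigma_a @ S)^{-1} @ sigma_a
        gain    = (I - gamma * sigma_a @ S)^{-1} @ kalman_gain
    """
    m = sigma_a.shape[0]
    factor = np.eye(m) - gamma * sigma_a @ s_matrix
    delta_a = np.linalg.solve(factor, sigma_a)
    delta_a = 0.5 * (delta_a + delta_a.T)  # remove round-off asymmetry
    # A negative eigenvalue signals that gamma violates the local constraint.
    if np.linalg.eigvalsh(delta_a).min() < -1e-10:
        raise ValueError("gamma is too large: uncertainty matrix is not positive semidefinite")
    gain = np.linalg.solve(factor, kalman_gain)
    return delta_a, gain
```

With gamma = 0 the function returns its inputs unchanged, mirroring the reduction to the EnKF method without inflation described above.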

In summary, at the filtering step of the EnTLHF,
e28a
e28b
e28c
e28d
subject to the constraint
e29
where Eq. (28a) means that and are obtained through an EnKF method, given the information , , and .

After obtaining and , one can draw an analysis ensemble of the system states, which preserves the estimate and the uncertainty matrix (Anderson 2001; Bishop et al. 2001; Hoteit et al. 2002; Whitaker and Hamill 2002). This analysis ensemble is then integrated forward to start a new assimilation cycle.

4. Some specific forms and their connections to covariance inflation

Here we show that some specific forms of the EnTLHF exhibit connections to some existing EnKF methods with covariance inflation. We again assume that the observation operator is linear. This choice is made only for convenience in presenting the term below. The same results can also be obtained if is nonlinear.

From Eq. (29), we have
e30
where the local performance level γi and the information matrix can be chosen by the designer. We consider some specific choices of the term and derive the corresponding relations between and (or ). By treating as the inflated covariance of , the EnTLHF can be interpreted as an EnKF method equipped with a specific uncertainty inflation technique.

a. Case for 0 ≤ c ≤ 1

This implies that
e31
and
e32
In particular, if is an identity matrix, , so that the uncertainty matrix is proportional to the background uncertainty matrix . For convenience, we label this specific form by I-BG and call c the performance level coefficient (PLC).
If one ignores in Eq. (25a) (or if there is no dynamical noise term, so that ) and decomposes and as and , respectively, where Γi, , and are suitable matrices corresponding to the definitions in Hoteit et al. (2002) and Pham et al. (1998), then it can be shown that
e33
which is equivalent to the singular evolutive extended Kalman (SEEK) filter (Hoteit et al. 2002; Pham et al. 1998) with a forgetting factor (1 − c).
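The forgetting-factor interpretation can be checked with a small numerical sketch. It assumes that the factor (1 − c) multiplies the inverse background uncertainty matrix in the information-form update, as is usual for forgetting factors; for c < 1 this is the same as running a standard KF covariance update on a background matrix inflated by 1/(1 − c). The function names and the test matrices are illustrative assumptions.

```python
import numpy as np

def ibg_analysis_uncertainty(delta_b, h, r, c):
    """Information-form update with an assumed forgetting factor (1 - c)."""
    info = (1.0 - c) * np.linalg.inv(delta_b) + h.T @ np.linalg.inv(r) @ h
    return np.linalg.inv(info)

def kf_analysis_covariance(p_b, h, r):
    """Standard Kalman filter analysis covariance, (I - K H) P_b."""
    k = p_b @ h.T @ np.linalg.inv(h @ p_b @ h.T + r)
    return (np.eye(p_b.shape[0]) - k @ h) @ p_b

# Illustrative 2-state, 1-observation example (values are arbitrary).
delta_b = np.array([[2.0, 0.5], [0.5, 1.0]])
h = np.array([[1.0, 0.0]])
r = np.array([[0.5]])
c = 0.3

via_forgetting = ibg_analysis_uncertainty(delta_b, h, r, c)
via_inflation = kf_analysis_covariance(delta_b / (1.0 - c), h, r)
print(np.allclose(via_forgetting, via_inflation))
```

The two routes agree by the matrix inversion lemma, which is why I-BG can be read as background covariance inflation.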

b. Case for 0 < c ≤ 1

This implies that
e34
and
e35
In particular, if is an identity matrix, , so that the uncertainty matrix is proportional to the analysis uncertainty matrix that is updated from through an EnKF method. We label this specific form by I-ANA.
Equation (35) means that
e36
which is equivalent to the covariance inflation technique used in Anderson and Anderson (1999), Whitaker and Hamill (2002), and other similar works if one lets 1/(1 − c) = (1 + δ)², with δ being the inflation factor. Note that, in the EnKF method, one may first update the background mean to the analysis counterpart and then generate an analysis ensemble, the error covariance of which is (implicitly) inflated. In doing so, covariance inflation does not affect the computation of the Kalman gain at the same cycle. Instead, it affects the Kalman gain in the next cycle, since the background covariance at the next cycle will be inflated by conducting covariance inflation. In contrast, in the EnTLHF, the gain matrix is directly affected by the analysis uncertainty matrix at the same assimilation cycle.
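As a small worked illustration of the relation between the PLC c and the inflation factor δ, the sketch below converts c to δ and inflates an ensemble about its mean. It assumes c is strictly below 1 so that 1/(1 − c) is finite, and stores ensemble members as columns; the function names are hypothetical.

```python
import numpy as np

def inflation_factor_from_plc(c):
    """Return delta such that (1 + delta)**2 == 1 / (1 - c), for 0 <= c < 1."""
    if not 0.0 <= c < 1.0:
        raise ValueError("c must satisfy 0 <= c < 1")
    return 1.0 / np.sqrt(1.0 - c) - 1.0

def inflate_ensemble(ensemble, delta):
    """Scale deviations from the ensemble mean by (1 + delta); members are columns."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + (1.0 + delta) * (ensemble - mean)
```

Scaling the deviations by (1 + δ) leaves the ensemble mean unchanged and multiplies the sample covariance by (1 + δ)² = 1/(1 − c).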

c. Case

This implies that
e37
and
e38
In particular, if is an identity matrix, , so that the uncertainty matrix is also equal to the identity matrix . We label this specific form by I-MTX.
Suppose that the analysis covariance obtained by an EnKF method is decomposed as
e39
through a singular value decomposition (SVD), where is a diagonal matrix consisting of the eigenvalues σi,j of , which are arranged in a nonascending order, that is, σi,j ≥ σi,k ≥ 0 for j < k, and is the matrix consisting of the corresponding (orthonormal) eigenvectors ei,j (j = 1, … , mx). Then, in order to make in Eq. (38) positive definite, it is sufficient to let
e40
which means
e41
that is, γi is less than the inverse of the maximum eigenvalue of . We can also introduce a PLC to the filter in this case by writing γi = c/σi,1, with 0 ≤ c < 1.
Given a suitable value of γi, can be evaluated as follows. Using Eq. (38), one has
e42
where is a diagonal matrix consisting of the eigenvalues ηi,j (j = 1, … , mx), with
e43
Thus, one has
e44

Through Eqs. (43) and (44), one can see that the analysis uncertainty matrix obtained by the EnTLHF has a similar structure to that of [cf. Eq. (39)]. The eigenvalues of are the inflations of the corresponding eigenvalues σi,j of if σi,j > 0, or remain unchanged if σi,j = 0.
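The eigenvalue inflation described above can be sketched as follows. This is a minimal illustration that assumes each eigenvalue σ is mapped to σ/(1 − γiσ), the matrix analog of the scalar formula Δi = Σi/(1 − γiΣi) in section 5a, with γi = c/σi,1. It uses a symmetric eigendecomposition (`numpy.linalg.eigh`) in place of an SVD, which is equivalent for a symmetric positive semidefinite matrix. The function name is hypothetical.

```python
import numpy as np

def imtx_inflate(sigma_a, c):
    """Inflate the eigenvalues of an analysis covariance with PLC c (0 <= c < 1).

    Returns the inflated matrix and the local performance level gamma.
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must satisfy 0 <= c < 1")
    eigvals, eigvecs = np.linalg.eigh(sigma_a)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    sigma_max = eigvals.max()
    if sigma_max == 0.0:
        return sigma_a.copy(), 0.0
    gamma = c / sigma_max
    eta = eigvals / (1.0 - gamma * eigvals)  # zero eigenvalues stay zero
    delta_a = (eigvecs * eta) @ eigvecs.T    # same eigenvectors, inflated eigenvalues
    return delta_a, gamma
```

Because γi is tied to the largest eigenvalue, the leading directions are inflated the most, while directions with zero variance are left untouched.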

Remarks

A similar eigenvalue inflation technique was used in Ott et al. (2004), where the authors increased all the eigenvalues of by a constant ϵi (relative to the ith cycle), so that after inflation, the eigenvalues of are equal to σi,j + ϵi, but with the same associated eigenvectors as in . One may establish the connection of this inflation technique to the EnTLHF by setting in Eq. (30). Nominally, in light of Eq. (30), one may establish the connection of any inflation technique to the EnTLHF by solving the equation
eq2
As one more example, one may consider the scenario for 0 < c ≤ 1. This leads to
e45
and
e46
To our knowledge, covariance inflation based on Eq. (46) is not used in the literature, possibly because it is more natural to conduct covariance inflation through the background or analysis ensemble of the system state, rather than through the ensemble of observation. However, Eq. (46) might provide an alternative point of view to explain the underperformance of the stochastic EnKF (Burgers et al. 1998; Evensen 1994; Evensen and van Leeuwen 1996; Houtekamer and Mitchell 1998) in comparison to the deterministic ones (Anderson 2001; Bishop et al. 2001; Whitaker and Hamill 2002) in certain situations (e.g., Whitaker and Hamill 2002). In the stochastic EnKF, one generates an ensemble of surrogate observations based on the observation distribution. Because of the effect of small sample size, the sample covariance of the surrogate observations will underrepresent the original covariance of observation. This is equivalent to letting 1 − c > 1 in Eq. (46) (i.e., c < 0), which implies the choice of γi < 0 in Eq. (30). As discussed previously, the negativeness of γi means that, instead of being conservative, the filter designer is confident in the estimation accuracy of the background. Hence when updating the background to the analysis, more weight is allocated to the background, rather than to the observation, which may deteriorate the performance of the filter if there exist more uncertainties in the background than in the observation.

5. Numerical examples

We conduct a series of numerical experiments to assess the relative robustness of the TLHF–EnTLHF in comparison to the corresponding KF/EnKF method without inflation. In all experiments, we estimate the full state vectors so that the transform matrix .

a. Experiments with a linear model

In the first set of experiments, we consider a one-dimensional regression model, governed by the equation
e47
where the random variable ui follows the Gaussian distribution with mean zero and variance 1 [denoted by ui ~ N(ui: 0, 1)]; f(xi, k, h, d) is a discrete boxcar function, starting at the time index k, with a jump height h and a width d, that is,
e48
In the experiment we let four jumps occur with k = 200, 400, 600, and 800, respectively, with the objective to verify that the TLHF can perform (almost) equally well for the jumps that occurred at different times. For each jump, we fix d = 20 but let the jump height h be 10 or 30. For illustration, a time series generated by Eq. (47) with h = 10 is shown in Fig. 1.
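For concreteness, the jump term can be sketched as below. The end-point convention (the jump is active for k ≤ i < k + d) is an assumption, as is expressing the function in terms of the time index i alone; the helper names are hypothetical.

```python
import numpy as np

def boxcar(i, k, h, d):
    """Discrete boxcar: h for k <= i < k + d and 0 otherwise (assumed convention)."""
    i = np.asarray(i)
    return np.where((i >= k) & (i < k + d), float(h), 0.0)

def total_jump(i, starts=(200, 400, 600, 800), h=10.0, d=20):
    """Sum of the four boxcar jumps used in the experiment."""
    return sum(boxcar(i, k, h, d) for k in starts)

time_index = np.arange(1, 1001)
jumps = total_jump(time_index)  # nonzero only inside the four jump windows
```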
Fig. 1. A time series generated by Eq. (47).

Citation: Monthly Weather Review 139, 12; 10.1175/MWR-D-10-05068.1

We suppose that the associated observation system is characterized by
e49
where υi ~ N(υi: 0, 1). In the experiment, we let the time index i = 1, 2, … , 1000, and record the observation at every time step. To reduce statistical fluctuations in estimation, we repeat the experiment 100 times, each time with different initial conditions (drawn at random), hence different truths and observations.
For assimilation, we suppose that one uses the following imperfect model,
e50
as the dynamical system, which ignores the nonlinear term and fails to capture the abrupt change of regime. Because of the linearity in Eqs. (49) and (50), we can apply both the KF and the TLHF to assimilate the system. Given the covariance Σi (of a scalar variable) evaluated by the KF at the ith cycle, in the TLHF one only needs to replace it by Δi = Σi/(1 − γiΣi) = Σi/(1 − c) [with γi = c/Σi, c ∈ [0, 1) in the scalar system] and change the gain matrix accordingly, while the other steps are the same as those in the KF. Clearly, in the scalar system, the specific form I-ANA in section 4b is equivalent to the specific form I-MTX in section 4c.
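The scalar recipe above can be sketched as a short filter loop. The dynamical model used here is an assumption made purely for illustration (a random walk x[i+1] = x[i] + u[i] observed directly, with unit noise variances), as are the function name, the initial values, and the choice to carry the inflated uncertainty into the next cycle; only the replacement of Σi by Σi/(1 − c) and the matching change of gain follow the text.

```python
import numpy as np

def scalar_filter(obs, c=0.0, q=1.0, r=1.0, x0=0.0, p0=1.0):
    """Scalar KF (c = 0) or TLHF-style filter (0 < c < 1) for an assumed random walk.

    Assumed model: x[i+1] = x[i] + u[i], y[i] = x[i] + v[i], Var(u) = q, Var(v) = r.
    """
    x, p = x0, p0
    estimates = np.empty(len(obs))
    for i, y in enumerate(obs):
        x_b, p_b = x, p + q              # prediction step
        sigma_a = p_b * r / (p_b + r)    # KF analysis variance
        delta_a = sigma_a / (1.0 - c)    # inflated uncertainty, gamma_i = c / sigma_a
        gain = delta_a / r               # equals the Kalman gain when c = 0
        x = x_b + gain * (y - x_b)       # analysis update
        p = delta_a                      # carried to the next cycle
        estimates[i] = x
    return estimates

# Noise-free step in the observations, to compare how fast each filter reacts.
obs = np.concatenate([np.zeros(50), np.full(20, 10.0)])
kf_est = scalar_filter(obs, c=0.0)
tlhf_est = scalar_filter(obs, c=0.4)
```

In this toy setting the larger gain of the inflated filter lets it follow the step more closely at the first observation after the jump, which is the qualitative behavior the experiments examine.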
We use the average RMSE to measure the performance of the filter. For an mx-dimensional system, the RMSE ei of an estimate with respect to the state vector at time instant i is defined as
e51
The average RMSE (RMSE for short) at time instant i over 100 simulations is defined as , where denotes the RMSE at time instant i in the jth simulation. We also define the time mean RMSE as the average of the (average) RMSE over the time horizon (assimilation time window) (1, … , N), that is, (N = 1000 in this experiment).
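The three error measures can be written compactly as follows. The sketch assumes the usual definition of the RMSE, namely the Euclidean norm of the estimation error divided by the square root of the state dimension, and assumes that the per-simulation RMSEs are stored in a table with one row per simulation and one column per time instant. The function names are hypothetical.

```python
import numpy as np

def rmse(estimate, truth):
    """RMSE at one time instant: ||estimate - truth||_2 / sqrt(m_x) (assumed definition)."""
    diff = np.atleast_1d(np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float))
    return np.linalg.norm(diff) / np.sqrt(diff.size)

def average_rmse(rmse_table):
    """Average over simulations; rmse_table[j, i] is the RMSE of run j at time i."""
    return np.mean(rmse_table, axis=0)

def time_mean_rmse(rmse_table):
    """Average of the (average) RMSE over the assimilation time window."""
    return float(np.mean(average_rmse(rmse_table)))
```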

Figure 2 plots the RMSE of the KF over the time horizon [1, 1000] in the cases h = 10 (top), h = 30 (bottom). In both cases, the KF achieves a relatively low RMSE during the period without any abrupt jump. However, when the abrupt jumps occur, the RMSE of the KF rises sharply in response.

Fig. 2. RMSE of the KF in assimilating the regression model. The jump heights h are (top) 10 and (bottom) 30.

Figures 3–5 plot the RMSE differences between the TLHF of I-BG with different PLC values and the KF. Throughout this work, we use the RMSEs of the KF as the baselines, and the RMSE differences are defined as the RMSEs of the TLHF minus the corresponding ones of the KF. In all these figures, the top plots correspond to the case h = 10 and the lower ones to the case h = 30. At c = 0.1 (Fig. 3), when there is no abrupt jump, the RMSEs of the TLHF and the KF are nearly indistinguishable, so that their RMSE differences are almost zero. However, when the abrupt jumps appear, the RMSEs of the TLHF do not rise as sharply as those of the KF, so that their RMSE differences become negative, suggesting that the TLHF has relatively more robust performance than the KF during the abrupt jumps. At c = 0.5 (Fig. 4), the RMSE differences during the periods with the abrupt jumps become larger, while those during the periods without any abrupt jump remain close to zero. Further increasing c to 0.9 (Fig. 5), the performance of the TLHF becomes remarkably better than the KF during the periods with abrupt jumps. The RMSEs of the TLHF appear insensitive to the presence of the abrupt jumps, which is not the case for the KF. However, there is also a price for the TLHF to achieve this. During the periods without the abrupt jumps, the TLHF performs worse than the KF, so that their RMSE differences are slightly above zero. Moreover, the divergence of the TLHF is spotted for time indices i > 870. The occurrence of the divergence is possibly due to the fact that the PLC is too large, so that 1/γi becomes less than the minimum threshold 1/γ* defined in Eq. (12). As discussed in section 2b, in such situations there is no guarantee to attain a TLHF solution that satisfies the inequality in Eq. (19). Instead, divergence of the filter solution may occur as observed in the experiment.

Fig. 3. RMSE difference between the TLHF of I-BG and the KF in assimilating the regression model. The jump heights h are (top) 10 and (bottom) 30. In the TLHF, the PLC c = 0.1.

Fig. 4. As in Fig. 3, but with c = 0.5.

Fig. 5. As in Fig. 3, but with c = 0.9.

Figures 6–8 show the RMSE differences between the TLHFs of I-ANA and I-MTX with three different PLC values (equivalent to each other in scalar systems) and the KF. Similar results are observed. At c = 0.1 (Fig. 6), the RMSEs of the TLHF and the KF are almost indistinguishable when there is no abrupt jump, so that their RMSE differences are very close to zero. The TLHF again performs better than the KF when the abrupt jumps occur. At a larger PLC value, c = 0.4 (Fig. 7), the TLHF performs remarkably better than the KF during the period of abrupt jumps, but at the cost of slightly worse performance than the KF during the period without any abrupt jump. When further increasing c to 0.6 (Fig. 8), the performance of the TLHF deteriorates in comparison with the choice c = 0.4. More investigations (not reported here) show that a larger value (c > 0.6) leads to even worse performance.

Fig. 6. RMSE difference between the TLHF of I-ANA and I-MTX and the KF in assimilating the regression model. The jump heights h are (top) 10 and (bottom) 30. In the TLHF, the PLC c = 0.1.

Fig. 7. As in Fig. 6, but with c = 0.4.

Fig. 8. As in Fig. 6, but with c = 0.6.

To summarize, our experimental results show that, for a relatively small PLC, the KF and the TLHF have close performance. This is expected, since the TLHF with c = 0 reduces to the KF, as we have noted in section 3. As c increases, the TLHF exhibits a better performance than the KF when there are relatively large uncertainties. However, when there exist only relatively small uncertainties in assimilation, too large a c (hence too much uncertainty inflation) may make the TLHF overconservative and deteriorate the filter performance (or even cause divergence). This is because, with relatively small uncertainties, the backgrounds also provide useful information and, thus, should not be underweighted. To mitigate this problem, one possible strategy is to use a relatively small value of c to make the TLHF less conservative when there exist only relatively small uncertainties, and a larger one when more uncertainties are present. This is essentially a strategy that conducts adaptive covariance inflation, as has already been investigated in some works (e.g., Anderson 2007, 2009; Hoteit et al. 2002; Hoteit and Pham 2004). From our earlier discussion in section 2, the adaptive inflation problem can be solved under the framework of the HF with an additional optimality criterion (e.g., minimum variance or maximum likelihood), which will be investigated in the future.

b. Experiments with a nonlinear model

In the second set of experiments, we consider the 40-dimensional Lorenz and Emanuel (1998) model (LE98), the governing equations of which are given by
e52
The quadratic terms simulate advection, the linear term represents internal dissipation, and F acts as the external forcing term (Lorenz 1996). For consistency, we define x−1 = x39, x0 = x40, and x41 = x1. We suppose that the true value of the parameter F is 8 for t ≥ 0, but in assimilation one may choose other values for F, which thus yields a potential parameter mismatch. In our experiments we consider two scenarios, with F = 6 and F = 8, respectively.
We use the fourth-order Runge–Kutta method to integrate (and discretize) the system from time 0 to 250, with a constant integration step of 0.05 (overall, 5001 integration steps). We then use the following system,
e53
to record the observation of the state vector xi ≡ (xi,1, xi,2, … , xi,40) at time instant i, where vi follows the Gaussian distribution N(vi: 0, I), with I being the 40-dimensional identity matrix. The observations are made every four integration steps.
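The experimental setup can be sketched as follows. The tendency function uses the standard form of the 40-variable Lorenz model with cyclic indices, dx_j/dt = (x_{j+1} − x_{j−2}) x_{j−1} − x_j + F; the observation step assumes that the full state is observed directly with additive unit-variance Gaussian noise, and that the first observation is taken after four integration steps. Function names, the seed, and the return layout are hypothetical.

```python
import numpy as np

def lorenz96_tendency(x, forcing=8.0):
    """dx_j/dt = (x_{j+1} - x_{j-2}) * x_{j-1} - x_j + F with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt=0.05, forcing=8.0):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(x0, n_steps=5000, dt=0.05, forcing=8.0, obs_every=4, seed=0):
    """Integrate the truth and draw noisy observations every `obs_every` steps."""
    rng = np.random.default_rng(seed)
    traj = np.empty((n_steps + 1, x0.size))
    traj[0] = x0
    for i in range(n_steps):
        traj[i + 1] = rk4_step(traj[i], dt, forcing)
    obs_idx = np.arange(obs_every, n_steps + 1, obs_every)
    # Assumed observation system: identity operator plus N(0, I) noise.
    obs = traj[obs_idx] + rng.standard_normal((obs_idx.size, x0.size))
    return traj, obs_idx, obs
```

Running the assimilation with a different `forcing` value than the one used to generate the truth reproduces the parameter-mismatch scenario (F = 6 versus F = 8).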

We use the ensemble transform Kalman filter (ETKF) (Bishop et al. 2001) to construct the EnTLHF. The ETKFs with I-BG and I-ANA are constructed by inflating the background ensembles and the analysis ensembles, respectively, in a similar way to that in Anderson and Anderson (1999) and Whitaker and Hamill (2002). To construct the ETKF with I-MTX, one needs to evaluate the analysis covariances, conduct SVDs, and then inflate the associated eigenvalues. In high-dimensional systems, conducting SVDs on the analysis covariances makes the ETKF with I-MTX computationally less efficient than its I-BG and I-ANA counterparts. However, it is possible to implement the I-MTX form in the SEEK filter (Hoteit et al. 2002; Pham et al. 1998) without significant increase of computational cost, since in this case all such SVDs can be conducted on the matrices updated by Eq. (33), the dimension of which is determined by the ensemble size in assimilation.

In our experiments we let the ensemble size n = 10 and vary the PLC values. To reduce statistical fluctuations, for each PLC value c we repeat the experiment 20 times, each time with a randomly drawn initial background ensemble (with 10 members). In practice, it is customary to introduce covariance localization to the ETKF in order to improve the filter performance (Hamill et al. 2009; Van Leeuwen 2009). Since in our experiments our objective is to assess the relative robustness of the EnTLHF, we choose not to conduct covariance localization to avoid complicating the analysis of our experiment results. In what follows, we examine the time mean RMSE of the EnTLHF as a function of the PLC value c, with c ∈ {0, 0.1, 0.2, … , 0.9}. The ETKF is treated as a special case of the EnTLHF with c = 0.

Figure 9 plots the time mean RMSEs of the ETKF with I-BG. The result in the case of F = 6 is marked with the dash–dotted line and that in the case of F = 8 with the dotted one. When F = 6, the time mean RMSE appears to be a monotonically decreasing function with respect to c. When F = 8, the time mean RMSE tends to decrease until it reaches c = 0.8. After that, the time mean RMSE slowly rises. In both cases, all time mean RMSEs with c > 0 are lower than that of the ETKF (c = 0).

Fig. 9. Time mean RMSE of the EnTLHF of I-BG as a function of the PLC in assimilating the LE98 model. The values of the parameter F are 6 (dash–dotted) and 8 (dotted). The EnTLHF reduces to the ETKF when the PLC c = 0.

Similar results of the ETKF with I-ANA are observed in Fig. 10. For both cases, with F = 6 and F = 8, their time mean RMSEs are monotonically decreasing functions with respect to c, and all time mean RMSEs with c > 0 are lower than that of the ETKF (c = 0).

Fig. 10. As in Fig. 9, but for I-ANA.

Figure 11 shows the time mean RMSEs of the ETKF with I-MTX. When F = 6, the time mean RMSE decreases monotonically until it reaches c = 0.4. After that, the time-mean RMSE rises rapidly. Moreover, if c > 0.6, filter divergence is spotted, possibly for the same reason as explained in the previous section. The result of F = 8 is similar: the time mean RMSE decreases until c = 0.5 and then increases as c continues growing. Filter divergence also occurs when c > 0.6. Compared to the ETKF (c = 0), the time mean RMSEs with c > 0 are lower until c reaches the turnaround point.

Fig. 11. As in Fig. 9, but for I-MTX. Filter divergence occurs for c > 0.6.

Through the above experiments, we have shown that, with suitable PLC values, the ETKFs of all three specific forms, namely, I-BG, I-ANA, and I-MTX, exhibit relative robustness in comparison with the ETKF without any covariance inflation, which is consistent with the observations in the literature that an EnKF method with suitable covariance inflation may perform better than that without any covariance inflation [see, e.g., Hamill et al. (2009) and Van Leeuwen (2009) and the references therein]. Different inflation schemes may result in different filter performance. For instance, the ETKF with I-ANA appears to have better performance than the other two schemes. The validity of this conclusion may depend on the system in assimilation, though, and may need to be verified case by case.

6. Discussion and conclusions

In this work we considered the applications of the KF and the HF to a state-estimation problem. We discussed the similarity and difference between the KF and the HF, and showed that the KF can be considered as a special case of the HF with the performance level equal to zero. For convenience of applying the H∞ filtering theory to sequential data assimilation, we introduced a variant, called the time-local HF, in which we proposed solving the constraints in the HF locally (in time). Analogous to the EnKF methods, we proposed the ensemble version of the TLHF, called the ensemble time-local HF (EnTLHF), and showed that the EnTLHF can be constructed based on the EnKF. In addition, we established the connections of some specific forms of the EnTLHF to some EnKF methods equipped with certain covariance inflation techniques.

Compared to existing works on covariance inflation in the EnKF, the H∞ filtering theory provides a theoretical framework that unifies various inflation techniques in the literature and establishes the connection between covariance inflation and robustness. The H∞ filtering theory also provides an explicit definition of robustness and the associated mathematical description. Conceptually, this leads to the possibility of recasting the problem of optimal covariance inflation as an optimization problem with multiple objectives, although further investigations will be needed for practical considerations. In addition, since the definition of robustness is filter-independent, the robustness property may be integrated into other types of nonlinear filters, for example, the particle filter or the Gaussian sum filter (Hoteit et al. 2008; Luo et al. 2010; Van Leeuwen 2003), by imposing constraints similar to that in Eq. (19). In our opinion, it might be less obvious to see how the above extensions can be made from the point of view of covariance inflation.

Through numerical experiments, we verified the relative robustness of three specific forms of the TLHF–EnTLHF in comparison with the KF–ETKF without covariance inflation. There are also some issues that have not been fully addressed in this work, for instance, the optimal choice of the performance level coefficient in conducting uncertainty inflation. Further investigations in these aspects will be considered in the future.

Acknowledgments

We thank two anonymous reviewers for their most constructive suggestions and comments that have significantly improved our work. This publication is based on work supported by funds from the KAUST GCR Academic Excellence Alliance program.

REFERENCES

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224.

  • Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Beezley, J. D., and J. Mandel, 2007: Morphing ensemble Kalman filters. Tellus, 60A, 131–140.

  • Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.

  • Berger, J. O., 1985: Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, 624 pp.

  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.

  • Cohn, S., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc. Japan, 74, 63–75.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.

  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367.

  • Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85–96.

  • Francis, B. A., 1987: A Course in H∞ Control Theory. Springer-Verlag, 156 pp.

  • Hamill, T. M., , J. S. Whitaker, , and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 27762790.

    • Search Google Scholar
    • Export Citation
  • Hamill, T. M., , J. S. Whitaker, , J. L. Anderson, , and C. Snyder, 2009: Comments on “Sigma-point Kalman filter data assimilation methods for strongly nonlinear systems.” J. Atmos. Sci., 66, 34983500.

    • Search Google Scholar
    • Export Citation
  • Hoteit, I., , and D. T. Pham, 2004: An adaptively reduced-order extended Kalman filter for data assimilation in the tropical Pacific. J. Mar. Syst., 45, 173188.

    • Search Google Scholar
    • Export Citation
  • Hoteit, I., , D. T. Pham, , and J. Blum, 2001: A semi-evolutive partially local filter for data assimilation. Mar. Pollut. Bull., 43, 164174.

    • Search Google Scholar
    • Export Citation
  • Hoteit, I., , D. T. Pham, , and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. J. Mar. Syst., 36, 101127.

    • Search Google Scholar
    • Export Citation
  • Hoteit, I., , D. T. Pham, , G. Triantafyllou, , and G. Korres, 2008: A new approximate solution of the optimal nonlinear filter for data assimilation in meteorology and oceanography. Mon. Wea. Rev., 136, 317334.

    • Search Google Scholar
    • Export Citation
  • Houtekamer, P. L., , and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796811.

    • Search Google Scholar
    • Export Citation
  • Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 400 pp.

  • Kalman, R., 1960: A new approach to linear filtering and prediction problems. Trans. ASME, Ser. D. J. Basic Eng., 82, 35–45.

  • Lorenz, E. N., 1996: Predictability—A problem partly solved. Predictability, T. Palmer, Ed., ECMWF, 1–18.

  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.

  • Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. Physica D, 238, 549–562.

  • Luo, X., I. M. Moroz, and I. Hoteit, 2010: Scaled unscented transform Gaussian sum filter: Theory and application. Physica D, 239, 684–701.

  • Nerger, L., L. Hiller, and J. Schröter, 2005: A comparison of error subspace Kalman filters. Tellus, 57A, 715–735.

  • Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

  • Schlee, F. H., C. J. Standish, and N. F. Toda, 1967: Divergence in the Kalman filter. AIAA J., 5, 1114–1120.

  • Shaked, U., and N. Berman, 1995: H∞ nonlinear filtering of discrete-time processes. IEEE Trans. Sig. Proc., 43, 2205–2209.

  • Simon, D., 2006: Optimal State Estimation: Kalman, H-Infinity, and Nonlinear Approaches. Wiley-Interscience, 552 pp.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • Van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

  • Van Leeuwen, P. J., 2009: Particle filtering in geophysical systems. Mon. Wea. Rev., 137, 4089–4114.

  • Verlaan, M., and A. W. Heemink, 1997: Tidal flow forecasting using reduced rank square root filters. Stoch. Hydrol. Hydraul., 11, 349–368.

  • Wang, D., and X. Cai, 2008: Robust data assimilation in hydrological modeling—A comparison of Kalman and H-infinity filters. Adv. Water Resour., 31, 455–472.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.

  • Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. Mon. Wea. Rev., 133, 1710–1726.

1

The derivation is similar when u_i and v_i are correlated colored noise. Readers are referred to, for example, Simon (2006, chapter 7) for the details.

2

is the prior knowledge of x0, which is assumed to be available, and therefore not influenced by the filter designer.

3

If, in contrast, the observation is very unreliable, then one may choose a negative value for γ such that the background has relatively more weight in the update. In this work we confine ourselves to the scenario γ ≥ 0.

4

Here we assume is determined by the practical need given in Eq. (5).

5

As with the extended KF, there also exists an extended HF that treats nonlinearity more thoroughly [see, e.g., Shaked and Berman (1995)]. Its implementation, however, involves the derivative(s) of nonlinear functions and more sophisticated inequality constraints.
