## 1. Introduction

The Kalman filter (KF) (Kalman 1960) is a sequential data assimilation algorithm. For linear stochastic systems, it can be shown that the KF is an optimal linear estimator that minimizes the variance of the estimation error (Simon 2006, chapter 5). Because of its relative simplicity in implementation, the KF is suitable for many data assimilation problems. However, for high-dimensional systems such as weather forecasting models, direct application of the KF is prohibitively expensive as it involves manipulating covariance matrices of the system states. For this reason, different modifications of the KF were proposed to reduce the computational cost. These include various ensemble Kalman filters (EnKFs) (Anderson 2001; Bishop et al. 2001; Burgers et al. 1998; Evensen 1994; Evensen and van Leeuwen 1996; Houtekamer and Mitchell 1998; Whitaker and Hamill 2002), the error subspace-based filters (Cohn and Todling 1996; Hoteit et al. 2001, 2002; Luo and Moroz 2009; Pham et al. 1998; Verlaan and Heemink 1997), and filters based on other strategies (Beezley and Mandel 2007; Zupanski 2005), to name but a few. A detailed description of the above filters is beyond the scope of this work. Readers are referred to Evensen (2003), Nerger et al. (2005), and Tippett et al. (2003) for reviews of some of the aforementioned filters. Roughly speaking, these modifications exploit the information of a subset in the state space of a dynamical system, while the information of the complement set is considered less influential and thus ignored. Consequently, the computations of these modified filters are normally conducted on the chosen subsets, instead of the whole state space, so that their computational costs are reduced. For simplicity, we may sometimes abuse the terminology by referring to all the aforementioned filters as the EnKF-based methods (EnKF methods for short).

The KF and the EnKF are among the family of Bayesian filters that adopt Bayes’ rule to update background statistics to their analysis counterparts. In these filters, one needs to make certain assumptions on the statistical properties [e.g., probability distribution functions (pdfs) or moments] of both the dynamical and observation systems. In reality, however, these assumptions may not be accurate, so that a Bayesian filter may fail to achieve good performance with mis-specified statistical information (Schlee et al. 1967). For example, if implemented straightforwardly, an EnKF with a relatively small ensemble size may produce inaccurate estimations of covariance matrices (Whitaker and Hamill 2002). This could degrade filter performance or even cause filter divergence. As a remedy, in practice it is customary to conduct covariance inflation and localization to relieve this problem (Anderson and Anderson 1999; Hamill et al. 2001, 2009; Van Leeuwen 2009).

In contrast, robust filters emphasize the robustness of their estimates, so that they may have better tolerances to possible uncertainties in assimilation. The estimation strategies of robust filters are different from Bayes’ rule. One can take the *H*_{∞} filter (Francis 1987; Simon 2006), one of the robust filters, as an example. The *H*_{∞} filter (HF) does not require one to exactly know the statistical properties of a system being assimilated. Instead, it accepts the possibility that one may only have incomplete information of the system. Consequently, rather than looking for the best possible estimates based on Bayes’ rule, the optimal *H*_{∞} filter employs a robust strategy, namely, the minimax rule (Burger 1985, chapter 4), to update its background statistics. This robustness may be of interest in practical situations. For example, for data assimilation in earth systems, the system models are often not the exact descriptions of the underlying physical processes, and it is challenging to characterize the properties of the corresponding model errors (Wang and Cai 2008 and the references therein). Given an imperfect model, the estimation error of the HF in general grows with the uncertainties in assimilation at a finitely bounded rate (except for the special case when the HF reduces to the KF itself), while the estimation error of the KF does not possess such a guarantee.

In this work we propose a variant of the HF, called the time-local HF (TLHF), to avoid solving global constraints as in the HF. By analogy to the EnKF, we further introduce the ensemble TLHF (EnTLHF) for data assimilation in high-dimensional systems. We show that the EnTLHF can be constructed based on the EnKF, and thus the computational complexity of the EnTLHF is in general comparable to that of the EnKF. We also show that some specific forms of the EnTLHF have connections with some EnKFs equipped with certain covariance inflation techniques. More generally, we show that an EnKF with a certain covariance inflation technique is in fact an EnTLHF.

The organization of this work is as follows. Section 2 presents the data assimilation problem and discusses its solutions in terms of the KF and the HF, respectively. Section 3 introduces the TLHF as a variant of the HF, and its ensemble form, the EnTLHF. Section 4 discusses some specific forms of the EnTLHF and shows their connections with some of the EnKF methods with covariance inflation. In section 5, we use some numerical examples to show the relative robustness of the TLHF (EnTLHF) in comparison to the corresponding KF (EnKF) method.

## 2. Problem statement

Equations (1a) and (1b) represent the *m*_{x}-dimensional dynamical system and the *m*_{y}-dimensional observation system, respectively, where **x**_{i} denotes the *m*_{x}-dimensional state vector and **y**_{i} the corresponding *m*_{y}-dimensional observation at time instant *i*. For convenience, in this section we assume the systems in Eqs. (1a) and (1b) are linear.

Equations (1c)–(1e) imply that the *m*_{x}-dimensional dynamical noise **u**_{i} and the *m*_{y}-dimensional observation noise **v**_{i} are uncorrelated white noise,^{1} with zero mean and the covariances given there; *δ*_{ij} denotes the Kronecker delta function such that *δ*_{ij} = 1 if *i* = *j*, and *δ*_{ij} = 0 otherwise.

In what follows, we discuss two filtering approaches as solutions of the above state-estimation problem: 1) the KF, which is based on the criterion of minimizing the variance of the estimation error [equivalent to applying Bayes’ rule to update background statistics as shown in Jazwinski (1970, chapter 7)]; and 2) the optimal HF, which is based on the criterion of minimizing the supremum (or maximum) of the ratio of the “energy” of the estimation error to the “energy” of the uncertainties in data assimilation (to be made clear shortly). We outline the main results of the KF and the HF; for more details, readers are referred to the appropriate references.

### a. Kalman filter as a solution

The KF algorithm involves prediction and filtering steps, the deductions of which can be found in, for example, Simon (2006, chapter 3). When the KF is applied to assimilate the system in Eq. (1), these steps are as follows.

- Prediction step: Propagate the analysis at the (*i* − 1)th cycle and the associated analysis error covariance forward to obtain the background at the next cycle [Eq. (2a)] and the associated background error covariance [Eq. (2b)], respectively.
- Filtering step: With a new incoming observation **y**_{i}, update the background and the associated error covariance to their analysis counterparts, [Eq. (3a)] and [Eq. (3b)], respectively.
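The two steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the code used in this work: the names `M` (linear transition matrix), `H` (linear observation matrix), `Q` and `R` (dynamical and observation noise covariances) are introduced here for illustration only, and the update is written in the standard gain form of the KF.

```python
import numpy as np

def kf_predict(x_a, P_a, M, Q):
    """Prediction step: propagate the analysis mean and covariance forward.

    M and Q are assumed names for the linear transition matrix and the
    dynamical noise covariance.
    """
    x_b = M @ x_a
    P_b = M @ P_a @ M.T + Q
    return x_b, P_b

def kf_update(x_b, P_b, y, H, R):
    """Filtering step: update the background with a new observation y.

    H and R are assumed names for the linear observation matrix and the
    observation noise covariance.
    """
    S = H @ P_b @ H.T + R                      # innovation covariance
    K = np.linalg.solve(S, H @ P_b).T          # Kalman gain P_b H^T S^{-1}
    x_a = x_b + K @ (y - H @ x_b)              # analysis mean
    P_a = (np.eye(len(x_b)) - K @ H) @ P_b     # analysis covariance
    return x_a, P_a, K
```

Solving a linear system with the innovation covariance avoids forming its inverse explicitly, which is a common design choice rather than a requirement of the algorithm.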

Suppose that, instead of **x**_{i}, we are interested in estimating some *m*_{z}-dimensional linear transform **z**_{i} of **x**_{i}, in terms of Eq. (5), with the transform specified by an *m*_{z} × *m*_{x} matrix. In Eq. (5), if this matrix is the *m*_{x} × *m*_{x} identity matrix, then **z**_{i} reduces to **x**_{i} itself. But if it is the 1 × *m*_{x} row vector that picks out the first component, then **z**_{i} is equivalent to the first element of **x**_{i}.

_{i}**z**

*, while*

_{i}*i*th cycle with respect to

**z**

*, is the variance of the estimation error*

_{i}*i*th assimilation cycle, with

*L*

_{2}norm. Given an

*m*-dimensional vector

_{z}*direct*sequential minimization of the local cost functions

**z**

*and*

_{i}*i*, respectively, associated with the modified transition operator

**y**

*and*

_{i}**v**

*still correspond to the observation vector and noise, respectively, but the observation operator becomes*

_{i}*indirect*way, but without changing the systems in assimilation. To this end, one specifies a new cost function,with respect to the variables

*indirect*estimation, the estimates

*indirect*estimation, which is clearly different from the KF.

### b. H_{∞} filter as a solution

The HF (Simon 2006, chapter 11 and references therein) aims to provide robust, but not necessarily best, estimates. The main idea is to first recognize that in the Eq. (1) system, there are three possible sources that contribute to the uncertainties in data assimilation, namely, the uncertainties in the initial conditions, the model error, and the observation error. Accordingly, during an assimilation time window [0, *N*], these uncertainty sources are characterized by three uncertainty “energy” terms, defined as … **u**_{0} is included in our discussion. Given a symmetric, positive semidefinite matrix … *L*_{2} norm so that … **z**.

Since the minimum variance criterion in the KF is consistent with Bayes’ rule, it is customary to interpret the matrices **Δ**_{0}, …, … *uncertainty weight matrices* (uncertainty matrices for short) with respect to the initial conditions, the model error, and the observation error, respectively, and their inverses, *information matrices*. The uncertainty matrices **Δ**_{0}, … *γ*, that is, … In Eq. (10), the weight matrix … **x**_{0}, {**u**_{i}}, {**v**_{i}}, and ….^{2} Then the inequality [Eq. (10)] is equivalent to … **Δ**_{0}, **u**_{i}, and **v**_{i} are all zero, we have the ideal solution … *γ*^{*} be the value such that … where … **x**_{0}, {**u**_{i}}, {**v**_{i}} (which is a function of … *supremum plane* of **x**_{0}, {**u**_{i}}, {**v**_{i}}. We say that the optimal HF is achieved if *γ* = *γ*^{*}. In this sense, the evaluation of *γ*^{*} is an application of the minimax rule, a strategy that aims to provide robust estimates and is different from its Bayesian counterpart (Burger 1985, chapters 4 and 5).

In practice, it is difficult to evaluate the exact value of *γ*^{*}, since, by the definition in Eq. (12), *γ*^{*} depends not only on the initial conditions and the dynamical and observation systems but also on the length *N* of the assimilation time window. A more convenient strategy is to choose a value *γ* satisfying 1/*γ*^{*} < 1/*γ* ≤ +∞, which guarantees that there exists a (suboptimal) HF solution. If 1/*γ* < 1/*γ*^{*}, then there is no guarantee (although it is not impossible) that such an HF solution exists. For instance, in the extreme event that the variables **x**_{0}, {**u**_{i}}, {**v**_{i}} are located on their *supremum plane*, then by the definition in Eq. (12), there is no HF solution. We call *γ* the *performance level* of the HF.

The inequality Eq. (10) can be solved through dynamic constrained optimization, with Eqs. (1a) and (1b) being the constraints at different assimilation cycles. For details, readers are referred to Simon (2006, chapter 11). For convenience of comparison, we also split the algorithm into prediction and filtering steps.

- Prediction step: As in the KF, we also propagate the analysis forward to produce the background at the next cycle [Eq. (13)].
- Filtering step: With a new observation **y**_{i}, we update the background to the analysis [Eq. (14)], subject to the constraints in Eq. (15).

Here **Δ**_{i} denotes the uncertainty matrix, analogous to the covariance matrix … **z**_{i}.

Comparing Eq. (13) with Eq. (2), one can see that the prediction steps of the KF and the HF are the same. Furthermore, the update formula Eq. (14a) of the HF is a linear estimator as in the KF, but in general with a different gain matrix. If *γ* = 0 in Eq. (14b), then Eqs. (14a), (14b), and (14c) reduce to Eqs. (3a), (4a), and (4b), respectively. In this case, the HF is equivalent to the KF. Therefore, the KF can be considered as a special case of the HF with a performance level *γ* = 0.
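The reduction of the HF to the KF at *γ* = 0 can be illustrated with a short NumPy sketch of the filtering step. It assumes the information form commonly given for the HF (e.g., in Simon 2006, chapter 11), in which the analysis information matrix equals the KF one minus `gamma` times a weight matrix. The names `D_b`, `H`, `R`, and `S` are assumptions introduced here; `S` defaults to the identity, and the positive-definiteness check stands in for the inequality constraints.

```python
import numpy as np

def hf_update(x_b, D_b, y, H, R, gamma, S=None):
    """One H-infinity filtering step in information form (illustrative sketch).

    D_b: background uncertainty matrix (assumed name).
    S:   weight matrix on the estimated variable (assumed; identity by default).
    """
    n = len(x_b)
    S = np.eye(n) if S is None else S
    R_inv = np.linalg.inv(R)
    # Analysis information matrix: the KF information matrix minus gamma * S.
    info_a = np.linalg.inv(D_b) + H.T @ R_inv @ H - gamma * S
    # The information matrix must remain positive definite.
    if np.min(np.linalg.eigvalsh(info_a)) <= 0.0:
        raise ValueError("gamma too large: information matrix not positive definite")
    D_a = np.linalg.inv(info_a)                # analysis uncertainty matrix
    G = D_a @ H.T @ R_inv                      # gain matrix
    x_a = x_b + G @ (y - H @ x_b)              # linear update, as in the KF
    return x_a, D_a, G
```

With `gamma=0` the function returns the usual KF analysis and gain; a positive `gamma` enlarges the analysis uncertainty matrix and hence the gain, so the observation receives more weight.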

… **y**_{i} than the KF does, which may be preferred when the background is not very reliable because of unaccounted-for sources of uncertainties [similar arguments can also be found in, e.g., Jazwinski (1970) and Van Leeuwen (2009)].^{3} In fact, as will be shown later, this conservativeness exhibits connections to covariance inflation techniques adopted in some EnKF methods.

The presence of the term …

A further issue that may be of interest in practice is the choice of the term … *γ*, as long as ….^{4} In other words, the robustness requirement in general does not yield unique choices of *γ* and … *H*_{∞} filter (Simon 2006, chapter 12), where one tries to minimize a least squares cost function as in the KF, while it imposes certain robustness on the estimates in the sense of Eq. (10). Other examples can be found in Anderson’s (2007, 2009) works on adaptive covariance inflation in the EnKF, where the criterion that the pdf of the covariance inflation factor be maximized is used. This point will become clear after we establish the connection between robustness and covariance inflation in subsequent sections.

## 3. Time-local ensemble *H*_{∞} filter

The HF has to satisfy the inequality constraints in Eq. (15), which makes it challenging and inefficient for sequential data assimilation in certain circumstances. To see this, suppose that for *i* = 0, … , *N*, the HF has an admissible solution *γ* satisfying all the inequality constraints in Eq. (15). When extending the time horizon from *N* to *N* + 1, *γ* may not satisfy the constraint at *N* + 1. As a result, one has to choose a smaller value for *γ* and restart the assimilation in the new time window [0, *N* + 1], resulting in a different filter solution for *i* = 0, … , *N*.

Alternatively, one may keep the solution between *i* = 0, … , *N* unchanged. From *N* + 1, one uses a smaller value *γ*′ for estimation as long as it satisfies Eq. (15). Once *γ*′ violates the constraint for a larger *N*, one adopts an even smaller performance level *γ*″ but still keeps the previously obtained estimates, and so on. In what follows, we extend this idea further. We propose a variant of the HF, called the time-local *H*_{∞} filter (TLHF), in which we impose a local cost function and adopt a local performance level *γ*_{i} to solve a local constraint at each assimilation cycle.

### a. Time-local H_{∞} filter for linear systems

We first introduce the TLHF for linear systems. The extension to nonlinear systems, analogous to the EnKF methods, will be presented in the next section.

… **x**_{i}, **u**_{i}, **v**_{i}, and … *i*th cycle … *i*, … *N* = *i* = 0. … *γ*_{i} is a suitable local performance level, which satisfies … with … *γ*_{i} also has to satisfy a local inequality constraint at time instant *i* (to be shown later). Then for all such *γ*_{i} (0 ≤ *i* ≤ *N*), we have … This shows that the growth rate of the total “energy” of the estimation error is finite unless *γ*_{i} = 0 for some *i*. Thus, the corresponding estimates …

Equation (21) bears a similar form to Eq. (10), but also exhibits a clear difference. That is, in Eq. (21), the total “energy” of the uncertainties includes the contribution from the uncertainty in specifying the background at each assimilation cycle. In contrast, in Eq. (10) the counterpart term only represents the contribution from the uncertainty in specifying the initial conditions. The extra terms in Eq. (21) provide a possibility to take into account the effect(s) of nonlinearity and/or other mechanisms that contribute to the estimation errors in the background, so that one does not have to significantly change the structure of the HF when extending it from linear systems to nonlinear ones. For example, in the presence of nonlinearity, there may exist extra uncertainties incurred by the chosen data assimilation algorithm itself (called algorithm uncertainty hereafter), including the linearization error when one uses the extended Kalman filter (EKF) to assimilate a nonlinear system, and, more generally, the approximation error when one adopts an approximation scheme in assimilation, such as the sampling error in the EnKF, or the rank deficiency in a reduced rank filter. These potential uncertainties influence the estimations of the system states, but conceptually they might not belong to the uncertainties in specifying the dynamical or observation systems. Instead, one may treat them as the uncertainties in specifying the background, an extension of the uncertainties in specifying the initial conditions. With this treatment, one may apply the TLHF to a nonlinear system in the same way as it is applied to a linear system, while including the uncertainties due to the effect(s) of nonlinearity and/or any other error sources into the category of uncertainty in specifying the background.^{5}

Following the same deductions in Simon (2006, chapter 11) one can derive the steps of the TLHF as follows.

Thus, compared with the HF, the TLHF only replaces the (global) performance level *γ* with the local one *γ*_{i} (*i* = 1, … , *N*), without changing anything else.

### b. Ensemble time-local H_{∞} filter

The ensemble time-local *H*_{∞} filter (EnTLHF) is a straightforward analog to the EnKF methods. Here the principal idea is that, at the prediction step, one uses the background ensemble, which is the propagation of the analysis ensemble from the previous cycle, to estimate the background and the associated uncertainty matrix. Then, one updates the background uncertainty matrix to the analysis one based on an EnKF method, calculates the corresponding gain matrix of the EnTLHF, and computes the analysis mean and the associated uncertainty matrix [cf. Eq. (28) below].

Concretely, let … *n*}, be the *n*-member background ensemble at time instant *i*, which is the prediction of the analysis ensemble … *m*_{x}-dimensional identity matrix. Thus, in general …

In particular, if one chooses *γ*_{i} = 0 for *i* = 0, … , *N* in Eqs. (26) and (27), then it is clear that … *γ*_{i} > 0, the EnTLHF has connections to some EnKF methods with certain covariance inflation techniques, as will be discussed in section 4.

After obtaining …

## 4. Some specific forms and their connections to covariance inflation

Here we show that some specific forms of the EnTLHF exhibit connections to some existing EnKF methods with covariance inflation. We again assume that the observation operator … *γ*_{i} and the information matrix …

### a. Case for 0 ≤ c ≤ 1

… *c* the performance level coefficient (PLC). … **Γ**_{i}, … *c*).

### b. Case for 0 < c ≤ 1

… *c*) = (1 + *δ*)^{2}, with *δ* being the inflation factor. Note that, in the EnKF method, one may first update the background mean to the analysis counterpart and then generate an analysis ensemble, the error covariance of which is (implicitly) inflated. In doing so, covariance inflation does not affect the computation of the Kalman gain at the same cycle. Instead, it affects the Kalman gain in the next cycle, since the background covariance at the next cycle will be inflated by conducting covariance inflation. In contrast, in the EnTLHF, the gain matrix is directly affected by the analysis uncertainty matrix at the same assimilation cycle.

### c. Case

… *σ*_{i,j} of … *σ*_{i,j} ≥ *σ*_{i,k} ≥ 0 for *j* < *k*, and **e**_{i,j} (*j* = 1, … , *m*_{x}). Then, in order to make … *γ*_{i} is less than the inverse of the maximum eigenvalue of … *γ*_{i} = *c*/*σ*_{i,1}, with 0 ≤ *c* < 1. … *γ*_{i}, … *η*_{i,j} (*j* = 1, … , *m*_{x}), with … Thus, one has …

Through Eqs. (43) and (44), one can see that the analysis uncertainty matrix … *σ*_{i,j} of … *σ*_{i,j} > 0, or remain unchanged if *σ*_{i,j} = 0.
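The eigenvalue inflation described above can be sketched as follows. The sketch assumes that each eigenvalue *σ* of the KF analysis covariance is mapped to *σ*/(1 − *γ*_{i}*σ*) with *γ*_{i} = *c*/*σ*_{i,1}, which is the matrix analog of the scalar relation Δ_{i} = Σ_{i}/(1 − *γ*_{i}Σ_{i}) quoted in section 5a. The function and argument names are hypothetical.

```python
import numpy as np

def inflate_eigenvalues(P_a, c):
    """Inflate the eigenvalues of a symmetric PSD matrix P_a (illustrative sketch).

    Assumed mapping: eta_j = sigma_j / (1 - gamma * sigma_j), with
    gamma = c / sigma_max and 0 <= c < 1; eigenvectors are left unchanged.
    Returns the inflated matrix and the local performance level gamma.
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must satisfy 0 <= c < 1")
    eigvals, eigvecs = np.linalg.eigh(P_a)      # eigenvalues in ascending order
    eigvals = np.clip(eigvals, 0.0, None)       # remove tiny negative round-off
    sigma_max = eigvals[-1]
    if sigma_max == 0.0:
        return P_a.copy(), 0.0                  # nothing to inflate
    gamma = c / sigma_max
    eta = eigvals / (1.0 - gamma * eigvals)     # zero eigenvalues stay zero
    D_a = (eigvecs * eta) @ eigvecs.T           # V diag(eta) V^T
    return D_a, gamma
```

Under this mapping, positive eigenvalues increase and zero eigenvalues remain unchanged, so the largest eigenvalue is inflated the most.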

#### Remarks

… (relative to the … *i*th cycle), so that after inflation, the eigenvalues of … *σ*_{i,j} + ϵ_{i}, but with the same associated eigenvectors as in … *c* ≤ 1. This leads to … and … To our knowledge, covariance inflation based on Eq. (46) is not used in the literature, possibly because it is more natural to conduct covariance inflation through the background or analysis ensemble of the system state, rather than through the ensemble of observation. However, Eq. (46) might provide an alternative point of view to explain the underperformance of the stochastic EnKF (Burgers et al. 1998; Evensen 1994; Evensen and van Leeuwen 1996; Houtekamer and Mitchell 1998) in comparison to the deterministic ones (Anderson 2001; Bishop et al. 2001; Whitaker and Hamill 2002) in certain situations (e.g., Whitaker and Hamill 2002). In the stochastic EnKF, one generates an ensemble of surrogate observations based on the observation distribution. Because of the effect of small sample size, the sample covariance of the surrogate observations will underrepresent the original covariance of observation. This is equivalent to letting 1 − *c* > 1 in Eq. (46) (i.e., *c* < 0), which implies the choice of *γ*_{i} < 0 in Eq. (30). As discussed previously, the negativeness of *γ*_{i} means that, instead of being conservative, the filter designer is confident in the estimation accuracy of the background. Hence when updating the background to the analysis, more weight is allocated to the background, rather than to the observation, which may deteriorate the performance of the filter if there exist more uncertainties in the background than in the observation.

## 5. Numerical examples

We conduct a series of numerical experiments to assess the relative robustness of the TLHF (EnTLHF) in comparison to the corresponding KF (EnKF) method without inflation. In all experiments, we estimate the full state vectors so that the transform matrix is the identity matrix.

### a. Experiments with a linear model

… *u*_{i} follows the Gaussian distribution with mean zero and variance 1 [denoted by *u*_{i} ~ *N*(*u*_{i}: 0, 1)]; *f*(*x*_{i}, *k*, *h*, *d*) is a discrete boxcar function, starting at the time index *k*, with a jump height *h* and a width *d*, that is, … In the experiment we let four jumps occur with *k* = 200, 400, 600, and 800, respectively, with the objective to verify that the TLHF can perform (almost) equally well for the jumps that occurred at different times. For each jump, we fix *d* = 20 but let the jump height *h* be 10 or 30. For illustration, a time series generated by Eq. (47) with *h* = 10 is shown in Fig. 1.
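A discrete boxcar of this kind can be sketched as below. The endpoint convention (the jump is active for time indices *k* ≤ *i* < *k* + *d*) and the helper names are assumptions made here for illustration; the convention used in the experiment may differ.

```python
def boxcar(i, k, h, d):
    """Discrete boxcar: height h for k <= i < k + d, zero otherwise.

    The half-open interval is an assumed convention.
    """
    return float(h) if k <= i < k + d else 0.0

def total_jump(i, h, d=20, starts=(200, 400, 600, 800)):
    """Sum of the four jumps described in the text, evaluated at time index i."""
    return sum(boxcar(i, k, h, d) for k in starts)
```

For example, `total_jump(205, h=10)` evaluates the first jump while it is active, and `total_jump(300, h=10)` falls between jumps.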

… *υ*_{i} ~ *N*(*υ*_{i}: 0, 1). In the experiment, we let the time index *i* = 1, 2, … , 1000, and record the observation every time step. To reduce statistical fluctuations in estimation, we conduct the experiment 100 times, each time with different initial conditions (drawn at random), hence different truths and observations.

Given a covariance Σ_{i} (of a scalar variable) evaluated by the KF at the *i*th cycle, in the TLHF one only needs to replace it by Δ_{i} = Σ_{i}/(1 − *γ*_{i}Σ_{i}) = Σ_{i}/(1 − *c*) [with *γ*_{i} = *c*/Σ_{i}, *c* ∈ [0, 1) in the scalar system] and change the gain matrix accordingly, while the other steps are the same as those in the KF. Clearly, in the scalar system, the specific form I-ANA in section 4b is equivalent to the specific form I-MTX in section 4c.

… *m*_{x}-dimensional system, the RMSE *e*_{i} of an estimate … *i* is defined as … The average RMSE (RMSE for short) … *i* over 100 simulations is defined as … *i* in the *j*th simulation. We also define the time mean RMSE … *N*), that is, … (*N* = 1000 in this experiment).
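The three error measures can be sketched in NumPy as follows. The sketch assumes the common convention that the RMSE at one time index is the *L*_{2} norm of the estimation error divided by the square root of the state dimension; this normalization, the array layout, and the function names are assumptions made here for illustration.

```python
import numpy as np

def rmse(estimate, truth):
    """RMSE of one estimate: ||estimate - truth||_2 / sqrt(m_x) (assumed form)."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return np.linalg.norm(diff) / np.sqrt(diff.size)

def average_rmse(estimates, truths):
    """Average RMSE over simulations at each time index.

    estimates, truths: arrays of shape (n_sim, n_time, m_x) (assumed layout).
    Returns an array of shape (n_time,).
    """
    diff = np.asarray(estimates, dtype=float) - np.asarray(truths, dtype=float)
    e = np.linalg.norm(diff, axis=-1) / np.sqrt(diff.shape[-1])
    return e.mean(axis=0)

def time_mean_rmse(avg_rmse):
    """Time mean of the average RMSE over the assimilation window."""
    return float(np.mean(avg_rmse))
```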

Figure 2 plots the RMSE of the KF over the time horizon [1, 1000] in the cases *h* = 10 (top), *h* = 30 (bottom). In both cases, the KF achieves a relatively low RMSE during the period without any abrupt jump. However, when the abrupt jumps occur, the RMSE of the KF rises sharply in response.

Figures 3–5 plot the RMSE differences between the TLHF of I-BG with different PLC values and the KF. Throughout this work, we use the RMSEs of the KF as the baselines, and the RMSE differences are defined as the RMSEs of the TLHF minus the corresponding ones of the KF. In all these figures, the top plots correspond to the case *h* = 10 and the lower ones to the case *h* = 30. At *c* = 0.1 (Fig. 3), when there is no abrupt jump, the RMSEs of the TLHF and the KF are nearly indistinguishable, so that their RMSE differences are almost zero. However, when the abrupt jumps appear, the RMSEs of the TLHF do not rise as sharply as those of the KF, so that their RMSE differences become negative, suggesting that the TLHF has relatively more robust performance than the KF during the abrupt jumps. At *c* = 0.5 (Fig. 4), the RMSE differences during the periods with the abrupt jumps become larger, while those during the periods without any abrupt jump remain close to zero. Further increasing *c* to 0.9 (Fig. 5), the performance of the TLHF becomes remarkably better than the KF during the periods with abrupt jumps. The RMSEs of the TLHF appear insensitive to the presence of the abrupt jumps, which is not the case for the KF. However, there is also a price for the TLHF to achieve this. During the periods without the abrupt jumps, the TLHF performs worse than the KF, so that their RMSE differences are slightly above zero. Moreover, divergence of the TLHF is observed for time indices *i* > 870. The occurrence of the divergence is possibly due to the fact that the PLC is too large, so that 1/*γ*_{i} becomes less than the minimum threshold 1/*γ*^{*} defined in Eq. (12). As discussed in section 2b, in such situations there is no guarantee to attain a TLHF solution that satisfies the inequality in Eq. (19). Instead, divergence of the filter solution may occur as observed in the experiment.

Figures 6–8 show the RMSE differences between the TLHFs of I-ANA and I-MTX with three different PLC values (equivalent to each other in scalar systems) and the KF. Similar results are observed. At *c* = 0.1 (Fig. 6), the RMSEs of the TLHF and the KF are almost indistinguishable when there is no abrupt jump, so that their RMSE differences are very close to zero. The TLHF again performs better than the KF when the abrupt jumps occur. At a larger PLC value, *c* = 0.4 (Fig. 7), the TLHF performs remarkably better than the KF during the period of abrupt jumps, but at the cost of slightly worse performance than the KF during the period without any abrupt jump. When further increasing *c* to 0.6 (Fig. 8), the performance of the TLHF deteriorates in comparison with the choice *c* = 0.4. More investigations (not reported here) show that a larger value (*c* > 0.6) leads to even worse performance.

To summarize, our experiment results show that, for a relatively small PLC, the KF and the TLHF have close performance. This is expected, since the TLHF with *c* = 0 reduces to the KF as we have noted in section 3. As *c* increases, the TLHF exhibits a better performance than the KF when there are relatively large uncertainties. However, when there only exist relatively small uncertainties in assimilation, too large a *c* (hence too much uncertainty inflation) may also make the TLHF appear overconservative and deteriorate the filter performance (or even cause it to diverge). This is because, with relatively small uncertainties, the backgrounds also provide useful information and, thus, should not be underweighted. To mitigate this problem, one possible strategy is to use a relatively small value of *c* to make the TLHF less conservative when there only exist relatively small uncertainties, and a larger one when there are more uncertainties. This is essentially a strategy that conducts adaptive covariance inflation, as has already been investigated in some works (e.g., Anderson 2007, 2009; Hoteit et al. 2002; Hoteit and Pham 2004). From our earlier discussion in section 3b, the adaptive inflation problem can be solved under the framework of the HF with an additional optimality criterion (e.g., minimum variance or maximum likelihood), which will be investigated in the future.

### b. Experiments with a nonlinear model

… *F* acts as the external forcing term (Lorenz 1996). For consistency, we define *x*_{−1} = *x*_{39}, *x*_{0} = *x*_{40}, and *x*_{41} = *x*_{1}. We suppose that the true value of the parameter *F* is 8 for *t* ≥ 0, but in assimilation one may choose other values for *F*, which thus yields a potential parameter mismatch. In our experiments we consider two scenarios, with *F* = 6 and *F* = 8, respectively.
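A 40-dimensional model with forcing *F* and the cyclic boundary conditions above can be sketched in NumPy as follows. The sketch assumes the widely used form of the Lorenz (1996) tendency, d*x*_{j}/d*t* = (*x*_{j+1} − *x*_{j−2})*x*_{j−1} − *x*_{j} + *F*, and a fourth-order Runge-Kutta step. The integration scheme, the step size `dt`, and the function names are assumptions made here, not details taken from the experiments.

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Assumed Lorenz-96 tendency with cyclic boundaries.

    np.roll(x, -1)[j] = x[j+1], np.roll(x, 2)[j] = x[j-2], np.roll(x, 1)[j] = x[j-1].
    """
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step (scheme and dt are assumptions)."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

A parameter mismatch of the kind described above corresponds to generating the truth with `F=8.0` while propagating the ensemble members with `F=6.0`.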

… **x**_{i} ≡ (*x*_{i,1}, *x*_{i,2}, … , *x*_{i,40}) at time instant *i*, where **v**_{i} follows the Gaussian distribution *N*(**v**_{i}: …)

We use the ensemble transform Kalman filter (ETKF) (Bishop et al. 2001) to construct the EnTLHF. The ETKFs with I-BG and I-ANA are constructed by inflating the background ensembles and the analysis ensembles, respectively, in a similar way to that in Anderson and Anderson (1999) and Whitaker and Hamill (2002). To construct the ETKF with I-MTX, one needs to evaluate the analysis covariances, conduct SVDs, and then inflate the associated eigenvalues. In high-dimensional systems, conducting SVDs on the analysis covariances makes the ETKF with I-MTX computationally less efficient than its I-BG and I-ANA counterparts. However, it is possible to implement the I-MTX form in the SEEK filter (Hoteit et al. 2002; Pham et al. 1998) without significant increase of computational cost, since in this case all such SVDs can be conducted on the matrices updated by Eq. (33), the dimension of which is determined by the ensemble size in assimilation.

In our experiments we let the ensemble size *n* = 10 and vary the PLC values. To reduce statistical fluctuations, for each PLC value *c* we repeat the experiments 20 times, each time with a randomly drawn initial background ensemble (with 10 members). In practice, it is customary to introduce covariance localization to the ETKF in order to improve the filter performance (Hamill et al. 2009; Van Leeuwen 2009). Since in our experiments our objective is to assess the relative robustness of the EnTLHF, we choose not to conduct covariance localization to avoid complicating the analysis of our experiment results. In what follows, we examine the time mean RMSE of the EnTLHF as a function of the PLC value *c*, with *c* ∈ {0, 0.1, 0.2, … , 0.9}. The ETKF is treated as a special case of the EnTLHF with *c* = 0.

Figure 9 plots the time mean RMSEs of the ETKF with I-BG. The result in the case of *F* = 6 is marked with the dash–dotted line and that in the case of *F* = 8 with the dotted one. When *F* = 6, the time mean RMSE appears to be a monotonically decreasing function with respect to *c*. When *F* = 8, the time mean RMSE tends to decrease until it reaches *c* = 0.8. After that, the time mean RMSE slowly rises. In both cases, all time mean RMSEs with *c* > 0 are lower than that of the ETKF (*c* = 0).

Similar results of the ETKF with I-ANA are observed in Fig. 10. For both cases, with *F* = 6 and *F* = 8, their time mean RMSEs are monotonically decreasing functions with respect to *c*, and all time mean RMSEs with *c* > 0 are lower than that of the ETKF (*c* = 0).

Figure 11 shows the time mean RMSEs of the ETKF with I-MTX. When *F* = 6, the time mean RMSE decreases monotonically until it reaches *c* = 0.4. After that, the time-mean RMSE rises rapidly. Moreover, if *c* > 0.6, filter divergence is spotted, possibly for the same reason as explained in the previous section. The result of *F* = 8 is similar: the time mean RMSE decreases until *c* = 0.5 and then increases as *c* continues growing. Filter divergence also occurs when *c* > 0.6. Compared to the ETKF (*c* = 0), the time mean RMSEs with *c* > 0 are lower until *c* reaches the turnaround point.

Through the above experiments, we have shown that, with suitable PLC values, the ETKFs of all three specific forms, namely, I-BG, I-ANA, and I-MTX, exhibit relative robustness in comparison with the ETKF without any covariance inflation. This is consistent with the observations in the literature that an EnKF method with suitable covariance inflation may perform better than one without any covariance inflation [see, e.g., Hamill et al. (2009) and Van Leeuwen (2009) and the references therein]. Different inflation schemes may result in different filter performance. For instance, the ETKF with I-ANA appears to perform better than the other two schemes. The validity of this conclusion may depend on the system under assimilation, though, and may need to be verified case by case.
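For readers less familiar with covariance inflation, the generic multiplicative form can be sketched as below. This is an illustration only, not the specific I-BG, I-ANA, or I-MTX schemes: the function name `inflate_ensemble` and the parameter `factor` are assumptions introduced here, and the mapping from the PLC value *c* to an inflation factor is not reproduced.

```python
import numpy as np

def inflate_ensemble(ensemble, factor):
    """Scale the ensemble spread so the sample covariance is multiplied by `factor`.

    ensemble: array of shape (n_state, n_members).
    factor:   multiplicative inflation factor (values above 1 inflate).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    mean = ensemble.mean(axis=1, keepdims=True)
    perturbations = ensemble - mean
    # Scaling the perturbations by sqrt(factor) multiplies the sample
    # covariance by factor and leaves the ensemble mean unchanged.
    return mean + np.sqrt(factor) * perturbations

rng = np.random.default_rng(1)
ens = rng.standard_normal((4, 10))  # 4 state variables, 10 members (arbitrary sizes)
inflated = inflate_ensemble(ens, 1.2)
```

Applying such a scaling to the background ensemble or to the analysis ensemble corresponds, loosely, to inflating the background or the analysis covariance, respectively.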

## 6. Discussion and conclusions

In this work we considered the application of the KF and the HF to a state-estimation problem. We discussed the similarities and differences between the KF and the HF, and showed that the KF can be considered as a special case of the HF with the performance level equal to zero. For convenience of applying the *H*_{∞} filtering theory to sequential data assimilation, we introduced a variant, called the time-local HF (TLHF), in which we suggested solving the constraints in the HF locally (in time). Analogous to the EnKF methods, we proposed the ensemble version of the TLHF, called the ensemble time-local HF (EnTLHF), and showed that the EnTLHF can be constructed based on the EnKF. In addition, we established the connections of some specific forms of the EnTLHF to some EnKF methods equipped with certain covariance inflation techniques.

Compared to existing works on covariance inflation in the EnKF, the *H*_{∞} filtering theory provides a theoretical framework that unifies various inflation techniques in the literature and establishes the connection between covariance inflation and robustness. The *H*_{∞} filtering theory also provides an explicit definition of robustness and the associated mathematical description. Conceptually, this leads to the possibility of recasting the problem of optimal covariance inflation as an optimization problem with multiple objectives, although further investigations will be needed for practical considerations. In addition, since the definition of robustness is filter-independent, the robustness property may be integrated into other types of nonlinear filters, for example, the particle filter or the Gaussian sum filter (Hoteit et al. 2008; Luo et al. 2010; Van Leeuwen 2003), by imposing constraints similar to that in Eq. (19). In our opinion, it might be less obvious to see how the above extensions can be made from the point of view of covariance inflation.

Through numerical experiments, we verified the relative robustness of three specific forms of the TLHF–EnTLHF in comparison with the KF–ETKF without covariance inflation. There are also some issues that have not been fully addressed in this work, for instance, the optimal choice of the performance level coefficient in conducting uncertainty inflation. Further investigations in these aspects will be considered in the future.

We thank two anonymous reviewers for their most constructive suggestions and comments that have significantly improved our work. This publication is based on work supported by funds from the KAUST GCR Academic Excellence Alliance program.

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. *Tellus*, **59A**, 210–224.

Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83.

Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. *Mon. Wea. Rev.*, **127**, 2741–2758.

Beezley, J. D., and J. Mandel, 2007: Morphing ensemble Kalman filters. *Tellus*, **60A**, 131–140.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Burger, J. O., 1985: *Statistical Decision Theory and Bayesian Analysis*. Springer-Verlag, 624 pp.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724.

Cohn, S., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. *J. Meteor. Soc. Japan*, **74**, 63–75.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99** (C5), 10 143–10 162.

Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. *Ocean Dyn.*, **53**, 343–367.

Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. *Mon. Wea. Rev.*, **124**, 85–96.

Francis, B. A., 1987: *A Course in H*_{∞} *Control Theory*. Springer-Verlag, 156 pp.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790.

Hamill, T. M., J. S. Whitaker, J. L. Anderson, and C. Snyder, 2009: Comments on “Sigma-point Kalman filter data assimilation methods for strongly nonlinear systems.” *J. Atmos. Sci.*, **66**, 3498–3500.

Hoteit, I., and D. T. Pham, 2004: An adaptively reduced-order extended Kalman filter for data assimilation in the tropical Pacific. *J. Mar. Syst.*, **45**, 173–188.

Hoteit, I., D. T. Pham, and J. Blum, 2001: A semi-evolutive partially local filter for data assimilation. *Mar. Pollut. Bull.*, **43**, 164–174.

Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. *J. Mar. Syst.*, **36**, 101–127.

Hoteit, I., D. T. Pham, G. Triantafyllou, and G. Korres, 2008: A new approximate solution of the optimal nonlinear filter for data assimilation in meteorology and oceanography. *Mon. Wea. Rev.*, **136**, 317–334.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811.

Jazwinski, A. H., 1970: *Stochastic Processes and Filtering Theory*. Academic Press, 400 pp.

Kalman, R., 1960: A new approach to linear filtering and prediction problems. *Trans. ASME, Ser. D. J. Basic Eng.*, **82**, 35–45.

Lorenz, E. N., 1996: Predictability – A problem partly solved. *Predictability*, T. Palmer, Ed., ECMWF, 1–18.

Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. *J. Atmos. Sci.*, **55**, 399–414.

Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. *Physica D*, **238**, 549–562.

Luo, X., I. M. Moroz, and I. Hoteit, 2010: Scaled unscented transform Gaussian sum filter: Theory and application. *Physica D*, **239**, 684–701.

Nerger, L., L. Hiller, and J. Schröter, 2005: A comparison of error subspace Kalman filters. *Tellus*, **57A**, 715–735.

Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. *Tellus*, **56A**, 415–428.

Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. *J. Mar. Syst.*, **16**, 323–340.

Schlee, F. H., C. J. Standish, and N. F. Toda, 1967: Divergence in the Kalman filter. *AIAA J.*, **5**, 1114–1120.

Shaked, U., and N. Berman, 1995: *H*_{∞} nonlinear filtering of discrete-time processes. *IEEE Trans. Sig. Proc.*, **43**, 2205–2209.

Simon, D., 2006: *Optimal State Estimation: Kalman, H-Infinity, and Nonlinear Approaches*. Wiley-Interscience, 552 pp.

Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. *Mon. Wea. Rev.*, **131**, 1485–1490.

Van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. *Mon. Wea. Rev.*, **131**, 2071–2084.

Van Leeuwen, P. J., 2009: Particle filtering in geophysical systems. *Mon. Wea. Rev.*, **137**, 4089–4114.

Verlaan, M., and A. W. Heemink, 1997: Tidal flow forecasting using reduced rank square root filters. *Stoch. Hydrol. Hydraul.*, **11**, 349–368.

Wang, D., and X. Cai, 2008: Robust data assimilation in hydrological modeling – A comparison of Kalman and H-infinity filters. *Adv. Water Resour.*, **31**, 455–472.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924.

Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. *Mon. Wea. Rev.*, **133**, 1710–1726.

^{1} The deduction will be similar in case that **u**_{i} and **v**_{i} are correlated colored noise. Readers are referred to, for example, Simon (2006, chapter 7) for the details.

^{2} **x**_{0}, which is assumed to be available, and therefore not influenced by the filter designer.

^{3} If, in contrast, the observation is very unreliable, then one may choose a negative value for *γ* such that the background has relatively more weight in the update. In this work we confine ourselves to the scenario *γ* ≥ 0.

^{5} Like the extended KF, there also exists the extended HF containing more thorough treatment of nonlinearity [see, e.g., Shaked and Berman (1995)], whose implementation, however, involves the derivative(s) of nonlinear functions and more sophisticated inequality constraints.