
A Moment Matching Ensemble Filter for Nonlinear Non-Gaussian Data Assimilation

Department of Statistics, University of California, Berkeley, Berkeley, California

Abstract

The ensemble Kalman filter is now an important component of ensemble forecasting. While its use of a linear relationship between the observation and state variables makes it applicable to large systems, relying on linearity introduces nonnegligible bias because the true distributions are in general non-Gaussian. This paper analyzes the bias of the ensemble Kalman filter from a statistical perspective and proposes a debiasing method called the nonlinear ensemble adjustment filter. This new filter transforms the forecast ensemble in a statistically principled manner so that the updated ensemble has the desired mean and variance. It is also easily localizable and, hence, potentially useful for large systems. Its performance is demonstrated and compared with other Kalman filter and particle filter variants through various experiments on the Lorenz-63 and Lorenz-96 systems. The results show that the new filter is stable and accurate in challenging situations such as nonlinear, high-dimensional systems with sparse observations.

Corresponding author address: Jing Lei, Department of Statistics, UC Berkeley, 367 Evans Hall, Berkeley, CA 94720. E-mail: leij09@gmail.com


1. Introduction

The ensemble Kalman filter (EnKF; Evensen 1994, 2003, 2007) has become a popular tool for data assimilation because of its computational efficiency and flexibility (Houtekamer and Mitchell 1998; Anderson 2001; Whitaker and Hamill 2002; Ott et al. 2004; Evensen 2003). Although many variants have been developed, the EnKF approximates the probability distributions of the forecast state vector, the observation, and the updated state vector by Gaussian distributions. Such a Gaussian approximation allows a linear update that makes the EnKF applicable for many large-scale problems. However, in reality such a Gaussian approximation and linear update will introduce systematic bias because the true distributions can be significantly non-Gaussian and the relationship between the observation and state may be nonlinear.

A filtering approach that is adaptive to nonlinearity and non-Gaussianity is the particle filter (Gordon et al. 1993; van Leeuwen 2003). However, it is known that the ordinary particle filter requires a prohibitively large ensemble to avoid collapsing even when the dimensionality of the state space is moderately high (Snyder et al. 2008). On the other hand, the particle filter suffers from sample impoverishment when the system is deterministic. Both the efficiency problem and sample impoverishment problem can be alleviated by reducing the dimensionality. Therefore, a major challenge for particle filters in data assimilation is localization. Traditional covariance tapering techniques for the EnKF are not applicable for the particle filter, where the forecast ensemble is updated directly from the likelihood function and the covariance matrix is not used. The sliding window localization method used in Brusdal et al. (2003) and Ott et al. (2004) seems feasible but the resampling/reweighting step of the ordinary particle filter breaks the connection between overlapping local windows. For an introduction to particle filters in data assimilation, see van Leeuwen (2009).

In this article we propose a new ensemble filtering method that enables localization by combining the advantages of both the EnKF and the particle filter. The basic idea is to view the EnKF as a linear regression of the state vector on the observation vector (Anderson 2003), where the updated state is obtained by using the true observation as predictor in the fitted linear model. Under nonlinear and non-Gaussian models, such a linear update is biased in both location (the posterior expectation) and shape (the posterior higher-order moments). The bias in location is a direct consequence of using linear regression in a nonlinear model. Our method uses importance sampling to estimate the conditional expectation of the state given the observation and each ensemble member is shifted accordingly. Unlike the particle filter, which tries to estimate the whole posterior distribution, the new method uses importance sampling only to estimate the posterior mean, lending itself more easily to localization because it does not involve any reweighting or resampling steps.

Xiong et al. (2006) and Nakano et al. (2007) also use importance sampling to update the mean and covariance, transforming the forecast ensemble so that the updated ensemble has asymptotically the desired mean and covariance. In particular, Xiong et al. (2006) propose a particle filter with Gaussian resampling (PFGR) using a deterministic transform on the forecast ensemble, which is very similar to the ensemble square root filter. This method depends on the particle filter to estimate the updated mean and covariance, which is hard if the dimensionality of the state space is moderately high. On the other hand, Nakano et al. (2007) develop a merging particle filter (MPF) that generates an auxiliary posterior ensemble using importance sampling and resampling; each updated particle is obtained by a linear combination of a group of these auxiliary particles. This method gives good simulation results, but the importance sampling and resampling steps collapse frequently when the state space is high dimensional and the system is deterministic. We compare the performance of our method with both PFGR and MPF in the Lorenz-63 system. In our simulations with the Lorenz-96 system, both the PFGR and MPF degenerate. Note that in Nakano et al. (2007), the MPF does not degenerate when a stochastic Lorenz-96 system is considered.

The rest of this article is organized as follows. In section 2 we review the EnKF and the particle filter. In section 3 we examine the sources of bias in the EnKF and introduce the nonlinear ensemble adjustment filter (NLEAF). In section 4 the NLEAF algorithm is tested and compared with other methods in both the Lorenz-63 and Lorenz-96 systems. Some final remarks are given in section 5.

2. Ensemble filters

Filtering algorithms for data assimilation usually work sequentially. There are two major steps in each recursion. In the forecasting step, a forecast (prior) ensemble is obtained by applying the forecast model to each update (posterior) ensemble member produced at the previous time. In the update step, the forecast ensemble is modified to incorporate the information provided by the new observation. We focus on the update step, assuming that the forecast can be done in a standard way.

Formally, suppose the uncertainty of the forecast ensemble can be represented by a random variable x with probability density function p_f(·), where the subindex "f" stands for "forecast." Assume that the observation y is given by

$$y = h(x) + \epsilon,$$

where the observation mechanism h(·) may be nonlinear and ϵ is the observation noise independent of x, with probability density function g(·), which could be non-Gaussian. Then the likelihood function of x given y is g[y − h(x)]. The optimal way to update the state distribution is Bayes's rule, in which the updated probability density function of the state variable is

$$p_a(x) = \frac{g[y - h(x)]\, p_f(x)}{\int g[y - h(x')]\, p_f(x')\, dx'},$$

where the subindex "a" stands for "analysis."

However, a closed form solution is available only for a few special cases such as when h is linear and pf(·), g(·) are Gaussian (the Kalman filter), or when x is discrete (hidden Markov models). In ensemble filtering, pf and pa are approximated by a discrete set (ensemble) of sample points (particles), and the ensemble is propagated by the forecast model and updated according to the observation at each time. We recall two typical ensemble filtering methods, the particle filter and the EnKF.

a. The particle filter

The particle filter (Gordon et al. 1993) uses importance sampling followed by a resampling step. Given the forecast ensemble $\{x_j^f\}_{j=1}^n$, treated as a random sample from p_f, and the observation y^o, a simple particle filter algorithm works as follows:

  1. Evaluate the likelihood of each forecast ensemble member: $w_j = g[y^o - h(x_j^f)]$ for j = 1, … , n.
  2. For j = 1, … , n, sample the updated ensemble member $x_j^a$ independently from $\{x_1^f, \ldots, x_n^f\}$ with weights proportional to (w_1, … , w_n).
Reweighting the particles according to their likelihoods is called importance sampling, which dates back at least to Hammersley and Handscomb (1965). The particle filter is statistically consistent in the sense that when the ensemble size goes to infinity, the updated ensemble will be exactly a random sample from pa (Künsch 2005). However, in high-dimensional situations, the ordinary particle filter requires a very large ensemble to search the whole state space and the ensemble tends to collapse in a few steps. Snyder et al. (2008) give a quantitative estimate of the rate of collapse in a simple Gaussian model.
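As a concrete rendering of these two steps, here is a minimal numpy sketch (the function and argument names are ours, not from the paper):

```python
import numpy as np

def particle_filter_update(X_f, y_obs, h, log_g, rng):
    """Bootstrap particle filter update: weight by likelihood, then resample.

    X_f: (n, d) forecast ensemble; h: observation operator;
    log_g: log-density of the observation noise; rng: numpy Generator."""
    n = X_f.shape[0]
    # step 1: likelihood weights w_j = g[y^o - h(x_j^f)], computed in log space
    log_w = np.array([log_g(y_obs - h(x)) for x in X_f])
    w = np.exp(log_w - log_w.max())      # subtract the max for numerical stability
    w /= w.sum()
    # step 2: resample n members with probabilities proportional to the weights
    idx = rng.choice(n, size=n, replace=True, p=w)
    return X_f[idx]
```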

Another potential problem of the particle filter is the issue of degeneracy or sample impoverishment. When resampling is used, there will inevitably be a loss of diversity of distinct particles during the update. If the system is deterministic, the updated ensemble will soon have few distinct particles. A remedy is to perturb the updated particles with small random noises, which is known as the regularized particle filter (Musso et al. 2001). Such a perturbation introduces another source of noise, which may impact performance negatively even in low-dimensional problems.

However, because of its natural advantage in dealing with nonlinear non-Gaussian problems, the particle filter is still a promising tool for data assimilation. Many resampling methods have been proposed to improve the efficiency of particle filters. Examples include, but are not limited to, Xiong et al. (2006) and Nakano et al. (2007). Of course, the major challenge is finding a good proposal density in the resampling step. The extent to which this is possible in very high-dimensional geophysical systems remains to be analyzed but see Chorin and Tu (2009) and van Leeuwen (2010) for examples in this direction.

b. The ensemble Kalman filter

A widely used approach in data assimilation is the EnKF (Evensen 1994; Burgers et al. 1998). Unlike the particle filter, which tries to capture the whole posterior distribution, the EnKF adjusts the posterior mean and covariance to agree with the ones obtained by a Kalman filter update. Assume the observation is linear in x and ϵ:

$$y = Hx + \epsilon,$$

with Gaussian noise $\epsilon \sim N(0, R)$. Given the forecast ensemble $\{x_j^f\}_{j=1}^n$ and true observation y^o, the stochastic EnKF works as follows:
  1. Compute the forecast sample covariance matrix: $\hat{P}^f = (n-1)^{-1} \sum_{j=1}^n (x_j^f - \mu_f)(x_j^f - \mu_f)^{\mathrm{T}}$, where $\mu_f = n^{-1} \sum_{j=1}^n x_j^f$ is the forecast ensemble mean, and the superscript T means matrix transpose.
  2. Generate perturbed observations: $y_j = Hx_j^f + \epsilon_j$, with $\epsilon_j \sim N(0, R)$ drawn independently.
  3. Estimate the linear regression coefficient of x on y: $\hat{K} = \widehat{\mathrm{Cov}}(x^f, y)\, \widehat{\mathrm{Var}}(y)^{-1}$. This is usually done by multiplying the sample covariance between the forecast ensemble and the perturbed observations by the inverse of the sample variance of the perturbed observations.
  4. Update ensemble: $x_j^a = x_j^f + \hat{K}(y^o - y_j)$.
It is known that using perturbed observations will introduce some sampling bias, especially for small ensembles. Some systematic sampling schemes are often used to make sure that the perturbations have zero mean, covariance R, and zero correlation with the forecast ensemble (Pham 2001). Another technique known as covariance inflation is often used to mitigate the forecast ensemble bias carried over from the previous time. There is a vast literature on different variants and applications of the EnKF. See Evensen (2003) for a nice review and Mitchell and Houtekamer (2009) for more recent developments.
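A minimal numpy sketch of the four-step update above, with the gain computed from the ensemble–observation sample covariances as described in step 3 (function and variable names are ours):

```python
import numpy as np

def enkf_update(X_f, y_obs, H, R, rng):
    """Stochastic EnKF update with perturbed observations.

    X_f: (n, d) forecast ensemble; y_obs: (p,) observation;
    H: (p, d) linear observation operator; R: (p, p) noise covariance."""
    n = X_f.shape[0]
    A = X_f - X_f.mean(axis=0)                  # step 1: centered forecast ensemble
    eps = rng.multivariate_normal(np.zeros(len(R)), R, size=n)
    Y = X_f @ H.T + eps                         # step 2: perturbed observations y_j
    Yc = Y - Y.mean(axis=0)
    C_xy = A.T @ Yc / (n - 1)                   # sample covariance of state and obs
    C_yy = Yc.T @ Yc / (n - 1)                  # sample variance of perturbed obs
    K = C_xy @ np.linalg.inv(C_yy)              # step 3: regression coefficient
    return X_f + (y_obs - Y) @ K.T              # step 4: x_j^a = x_j^f + K(y^o - y_j)
```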

Some ensemble filters update the first two moments of the forecast ensemble in a deterministic manner, under the name of ensemble square root filters (Anderson 2001; Bishop et al. 2001; Whitaker and Hamill 2002). Although deterministic filters have no sampling error introduced by perturbing the observations, the bias caused by nonlinearity and non-Gaussianity remains even when the ensemble is large. There have been many studies comparing the stochastic and deterministic filters. It has been noted, for example in Lawson and Hansen (2004) and Lei et al. (2010), that under some simple models the stochastic filter is more robust against nonlinearity and non-Gaussianity.

A key feature of the EnKF is that it updates each particle directly, without reweighting or resampling, which makes it easily localizable. In the next section we will introduce another filter with the same property but having reduced bias in a nonlinear, non-Gaussian context.

3. A moment-matching nonlinear non-Gaussian ensemble filter

a. Why does the ensemble Kalman filter work?

An explanation of the stochastic EnKF is based on the following simple fact: if x and y are jointly Gaussian, then the posterior distribution of x given y depends on y only through its mean. That is, for any y, let μ_a(y) be the posterior mean of x given y; then the posterior distribution of x given y is N(μ_a(y), Σ_a), where Σ_a does not depend on y. By the construction of the perturbed observation y_j, the pair (x_j^f, y_j) is jointly Gaussian. Therefore, recalling that μ_f is the forecast ensemble mean, $x_j^f - \mu_a(y_j)$ is a random draw from N(0, Σ_a) and $x_j^a = \mu_a(y^o) + [x_j^f - \mu_a(y_j)]$ is a random draw from N(μ_a(y^o), Σ_a), the true posterior distribution. On the other hand, joint Gaussianity also implies that μ_a(y) is a linear function of y, with a coefficient K estimated in step 3 of section 2b. Thus the update can be written as $x_j^a = x_j^f + K(y^o - y_j)$. Note that by definition we have $y_j = Hx_j^f + \epsilon_j$. As a result, the update formula can be written as $x_j^a = x_j^f + K(y^o - Hx_j^f - \epsilon_j)$, which is exactly the same as the "observation perturbation" used in Eq. (13) of Burgers et al. (1998). In this paper we use the perturbed observation y_j to present a unified argument for both Gaussian linear and non-Gaussian nonlinear models.

In a non-Gaussian nonlinear model, the EnKF is biased for two reasons. First, μ_a(y), the conditional expectation of x given y, is no longer linear in y, so the linear regression estimate of it is biased. Second, the recentered conditional random variables $x_j^f - \mu_a(y_j)$ will not have the correct variance and higher moments. The first source of bias results in a larger distance between the true state and the updated ensemble mean, whereas the second source of bias affects the shape of the updated ensemble. Such a shape bias might lead to further location bias when propagated by the dynamics.

b. The NLEAF algorithm with first-order correction

We now introduce the NLEAF as a debiasing alternative to the EnKF. It requires no parametric assumptions on the prior distribution of x, the observation function h(·), or the observation noise density g(·). The basic idea is to estimate μ_a(y) using importance sampling, instead of linear fitting as in the EnKF. Let p_f(·) and p_a(·) be the forecast (prior) and updated (posterior given y) densities, respectively. Then we have, for any y,

$$\mu_a(y) = \int x\, p_a(x)\, dx, \qquad p_a(x) = \frac{g[y - h(x)]\, p_f(x)}{\int g[y - h(x')]\, p_f(x')\, dx'}. \quad (1)$$

The importance sampling estimator of the conditional expectation is given by

$$\hat{\mu}_a(y) = \frac{\sum_{j=1}^n x_j^f\, g[y - h(x_j^f)]}{\sum_{j=1}^n g[y - h(x_j^f)]}. \quad (2)$$
Given the forecast ensemble $\{x_j^f\}_{j=1}^n$ and observation y^o, the NLEAF update works as follows:
  1. Generate perturbed observations $y_j = h(x_j^f) + \epsilon_j$, with ϵ_j independently sampled from the probability density function g(·).
  2. Estimate the conditional expectation $\hat{\mu}_a(y)$ using (2) for y equal to the true observation y^o and all y_j, j = 1, … , n.
  3. Update ensemble: $x_j^a = x_j^f + \hat{\mu}_a(y^o) - \hat{\mu}_a(y_j)$.

This algorithm improves the EnKF by using importance sampling to estimate the conditional expectations. It is known that under mild conditions the importance sampling estimator converges to μ_a(y) as the ensemble size tends to infinity (Künsch 2005). Therefore $x_j^a$ is centered approximately at μ_a(y^o), since $x_j^f$ is centered approximately at μ_a(y_j) conditioning on y_j. As a result, the first source of bias in the EnKF is reduced. On the other hand, it also keeps the simplicity of the EnKF, avoiding the reweighting and resampling steps used by the particle filter. This is particularly suitable for sliding-window localization techniques (Brusdal et al. 2003; Ott et al. 2004), because when applied to localized state vectors, the absence of resampling and reweighting maintains the spatial smoothness across neighboring local state vectors. One would therefore expect the NLEAF algorithm to be applicable to many high-dimensional problems with better accuracy than the EnKF. Before demonstrating its performance in section 4, we describe two important extensions of the NLEAF.
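A minimal numpy sketch of this three-step update (the helper names, and the `noise_sampler` argument that draws from g, are ours):

```python
import numpy as np

def nleaf1_update(X_f, y_obs, h, log_g, noise_sampler, rng):
    """NLEAF first-order update following Eq. (2) and steps 1-3 above."""
    n = X_f.shape[0]
    Hx = np.array([h(x) for x in X_f])        # h(x_j^f)
    Y = Hx + noise_sampler(rng, n)            # step 1: perturbed observations y_j

    def mu_hat(y):
        # step 2: importance-sampling estimate of E(x | y), Eq. (2)
        log_w = np.array([log_g(y - hx) for hx in Hx])
        w = np.exp(log_w - log_w.max())       # stabilize before exponentiating
        w /= w.sum()
        return w @ X_f

    shift = mu_hat(y_obs)
    # step 3: recenter each member from mu_hat(y_j) to mu_hat(y_obs)
    return np.array([X_f[j] + shift - mu_hat(Y[j]) for j in range(n)])
```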

c. Higher-order corrections

The NLEAF update described above adjusts only the posterior mean. In fact the same idea can be applied for higher-order corrections. For example, an NLEAF with second-order correction uses importance sampling to estimate the posterior covariance of x given y, denoted Σ_a(y). Theoretically we have

$$\Sigma_a(y) = \int [x - \mu_a(y)][x - \mu_a(y)]^{\mathrm{T}}\, p_a(x)\, dx,$$

where p_a(x) is defined as in (1). Then the importance sampling estimate for Σ_a(y) is

$$\hat{\Sigma}_a(y) = \frac{\sum_{j=1}^n g[y - h(x_j^f)]\, [x_j^f - \hat{\mu}_a(y)][x_j^f - \hat{\mu}_a(y)]^{\mathrm{T}}}{\sum_{j=1}^n g[y - h(x_j^f)]}, \quad (3)$$

where $\hat{\mu}_a(y)$ is obtained using (2). The update with second-order correction is

$$x_j^a = \hat{\mu}_a(y^o) + \hat{\Sigma}_a(y^o)^{1/2}\, \hat{\Sigma}_a(y_j)^{-1/2}\, [x_j^f - \hat{\mu}_a(y_j)].$$

To understand this algorithm, note that when the ensemble size is large, $\hat{\mu}_a(y) \approx \mu_a(y)$ and $\hat{\Sigma}_a(y) \approx \Sigma_a(y)$ for all y; then $\hat{\Sigma}_a(y_j)^{-1/2}[x_j^f - \hat{\mu}_a(y_j)]$ has mean zero and identity covariance. Therefore the updated ensemble member has approximately mean μ_a(y^o) and covariance Σ_a(y^o). As shown in our numerical experiments, this higher-order correction does improve the accuracy. It even outperforms the particle filter in some settings with deterministic dynamics. As a trade-off, it requires a larger ensemble size and is computationally more intensive than the first-order NLEAF because of the covariance matrix estimation and inversion. We also note this is actually a stochastic version of the Gaussian resampling particle filter proposed by Xiong et al. (2006).
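A sketch of this second-order transform, assuming `mu_a` and `Sigma_a` are the importance-sampling moment estimates of Eqs. (2) and (3) (names are ours):

```python
import numpy as np

def sqrtm_sym(S):
    """Square root of a symmetric positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, 1e-12, None)   # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def nleaf2_update(X_f, Y, y_obs, mu_a, Sigma_a):
    """NLEAF second-order update: match the estimated posterior mean/covariance."""
    root_o = sqrtm_sym(Sigma_a(y_obs))
    out = []
    for x_j, y_j in zip(X_f, Y):
        root_j_inv = np.linalg.inv(sqrtm_sym(Sigma_a(y_j)))
        out.append(mu_a(y_obs) + root_o @ root_j_inv @ (x_j - mu_a(y_j)))
    return np.array(out)
```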

d. NLEAF with unknown likelihood function

Another useful extension of the NLEAF is to use nonlinear regression methods to estimate μ_a(y). When y is generated by a black-box function, the relationship between the observation y and the state vector x might be so complicated that the density function of y given x is not available in analytic form. One practical example is satellite radiance. Satellite radiance is related to temperature, humidity, and other trace variables depending on the wavelength, but no simple analytic relationship is known between the state and the observation. In this case, all we have is the forecast ensemble paired with the perturbed observations: $\{(x_j^f, y_j)\}_{j=1}^n$. One may use regression methods to estimate the conditional moments of x given y. For example, the EnKF uses a linear regression to estimate μ_a(y). One can also use more general methods, such as polynomial regression, to handle nonlinearity. More specifically, one may estimate μ_a(y) by minimizing the square loss $\sum_{j=1}^n \|x_j^f - m(y_j)\|^2$ over all quadratic functions m(·); a sketch follows this paragraph. This could be a promising method for nonlinear data assimilation problems where we lack knowledge of the observation-generating mechanism.
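A minimal least-squares sketch, fitting each state coordinate on all monomials of y up to degree two (a hypothetical helper, not the paper's code):

```python
import numpy as np

def fit_quadratic_mu_a(X_f, Y):
    """Fit mu_a(y) by least squares over quadratic functions of y.

    X_f: (n, d) forecast ensemble; Y: (n, p) perturbed observations.
    Returns a function evaluating the fitted quadratic at a new y."""
    n, p = Y.shape
    iu = np.triu_indices(p)                      # indices of monomials y_a * y_b

    def features(y):
        return np.concatenate([[1.0], y, np.outer(y, y)[iu]])

    Phi = np.array([features(y) for y in Y])     # (n, 1 + p + p(p+1)/2) design
    coef, *_ = np.linalg.lstsq(Phi, X_f, rcond=None)
    return lambda y: features(y) @ coef
```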

4. Numerical experiments

a. The Lorenz-63 system

The Lorenz-63 system is a three-dimensional model determined by an ordinary differential equation system:

$$\dot{z}_\tau(1) = \sigma[z_\tau(2) - z_\tau(1)], \quad (4)$$
$$\dot{z}_\tau(2) = z_\tau(1)[\rho - z_\tau(3)] - z_\tau(2), \quad (5)$$
$$\dot{z}_\tau(3) = z_\tau(1)\, z_\tau(2) - \beta z_\tau(3), \quad (6)$$

where z_τ is the three-dimensional state vector describing a simplified flow of heated fluid, with τ being the continuous time index. The parameters are set as β = 8/3, ρ = 28, and σ = 10.
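Equations (4)–(6) with these parameter values translate directly into code; a minimal sketch:

```python
import numpy as np

def lorenz63(z, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of Eqs. (4)-(6) for the state z = (z1, z2, z3)."""
    z1, z2, z3 = z
    return np.array([sigma * (z2 - z1),
                     z1 * (rho - z3) - z2,
                     z1 * z2 - beta * z3])
```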

In our simulation the system is discretized using the fourth-order Runge–Kutta method. Let $x_t = z_{t\Delta}$ for all integers t, with Δ being the step size. A larger value of Δ indicates a more nonlinear relationship between x_t and x_{t+1}. For a given starting state x_0, a hidden true orbit $\{x_t\}_{t \ge 1}$ and observations $y_t = h(x_t) + \epsilon_t$ are generated using an independent observation noise sequence ϵ_t with mean 0 and covariance R. Starting from an initial ensemble $\{x_{0,j}^a\}_{j=1}^n$, filtering methods are applied recursively at each time t ≥ 1. The updated ensemble average is used as the single point estimate of x_t. The main evaluation criterion used is the root-mean-squared error (RMSE): $e_t = d^{-1/2}\|\bar{x}_t^a - x_t\|_2$, where ‖·‖_2 is the Euclidean norm, d = 3 is the dimensionality of the state space, and $\bar{x}_t^a$ is the updated ensemble mean at time t. For a measurement of the ensemble spread (sharpness) we look at $[\mathrm{tr}(\hat{P}_t^a)/d]^{1/2}$ in comparison with the RMSE, where $\hat{P}_t^a$ is the sample covariance matrix of the updated ensemble at time t. In addition, we use the percentage of x_t being covered by the range between the 0.025 and 0.975 quantiles (the sample 95% confidence interval) of the updated ensemble as a measurement of how well the updated ensemble covers the true state.
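To make the setup concrete, here is a minimal numpy sketch of the Runge–Kutta discretization and the three evaluation criteria just defined (function names are ours):

```python
import numpy as np

def rk4_step(f, z, dt):
    """One fourth-order Runge-Kutta step of the ODE dz/dtau = f(z)."""
    k1 = f(z)
    k2 = f(z + 0.5 * dt * k1)
    k3 = f(z + 0.5 * dt * k2)
    k4 = f(z + dt * k3)
    return z + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rmse(x_mean, x_true):
    """RMSE: d^(-1/2) times the Euclidean distance to the true state."""
    return np.linalg.norm(x_mean - x_true) / np.sqrt(x_true.size)

def spread(X_a):
    """Ensemble spread sqrt(tr(P_a)/d), on the same scale as the RMSE."""
    P_a = np.cov(X_a, rowvar=False)
    return np.sqrt(np.trace(P_a) / X_a.shape[1])

def coverage(X_a, x_true):
    """Fraction of true-state coordinates inside the ensemble 95% interval."""
    lo, hi = np.quantile(X_a, [0.025, 0.975], axis=0)
    return np.mean((x_true >= lo) & (x_true <= hi))
```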

1) Simulation setup

We look at two different observation error distributions. First we consider Gaussian observation noise, which has been used widely in previous studies. In this case, we let the coordinates of ϵt be independent Gaussian with mean 0 and variance θ2. On the other hand, in order to study non-Gaussian observation noises we let ϵt have independent coordinates with a common double-exponential distribution (also known as the Laplace distribution) with mean 0 and variance 2θ2. The density of such a double-exponential distribution is p(z) = (2θ)−1 exp(−|z|/θ). The EnKF variants use the noise variance 2θ2 in the update and pretend that the noise is still Gaussian. The NLEAF update uses the true double-exponential density as the likelihood.
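For concreteness, the double-exponential noise and the corresponding log-likelihood used by the NLEAF can be generated as follows (a sketch; `rng` is a numpy Generator):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 1.0
# coordinate-wise double-exponential (Laplace) noise: mean 0, variance 2*theta^2
eps = rng.laplace(loc=0.0, scale=theta, size=3)

def log_g(r, theta=theta):
    """Log-density of independent Laplace(0, theta) coordinates at residual r."""
    return -np.sum(np.abs(r)) / theta - r.size * np.log(2.0 * theta)
```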

The initial ensemble is generated by running the standard EnKF with 400 ensemble members and standard Gaussian observation noise for 10 000 assimilation cycles. At each following time step, for each filtering method in comparison, a forecast ensemble of size n = 400 is obtained by applying the fourth-order Runge–Kutta method to the corresponding previous update ensemble. A noisy observation of the true state is also generated. Then each ensemble filtering method is applied individually to the corresponding forecast ensemble and the observation, producing the new update ensemble. The number of assimilation cycles is chosen to be 2000 throughout our presentation to make the results quickly reproducible. For both the Lorenz-63 system and the Lorenz-96 system (see section 4b), the same results are obtained in simulations with 50 000 assimilation cycles. We consider two different step sizes Δ, 0.02 and 0.05, to represent different levels of nonlinearity. Moreover, three values of the noise scale parameter θ are considered: 0.5, 1, and 2.

The methods in comparison are: the stochastic EnKF (EnKF), NLEAF with first-order correction (NLEAF1), NLEAF with second-order correction (NLEAF2), the particle filter (PF), PFGR (Xiong et al. 2006), and MPF (Nakano et al. 2007). PFGR and MPF use the covariance inflation

$$x_j^a \leftarrow \bar{x}^a + (1 + \delta)(x_j^a - \bar{x}^a), \quad (7)$$
with δ = 0.01 to avoid degeneracy. The PF uses a slightly different inflation method that replaces the jth updated ensemble member $x_j^a$ by $x_j^a + \delta\, \hat{\Sigma}_a^{1/2} \xi_j$, where $\hat{\Sigma}_a$ is the estimated posterior covariance matrix and ξ_j is an independent Gaussian noise with unit variance.
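Equation (7) amounts to rescaling each member's deviation from the ensemble mean; a one-function sketch:

```python
import numpy as np

def inflate(X_a, delta):
    """Covariance inflation, Eq. (7): x_j <- x_bar + (1 + delta)(x_j - x_bar)."""
    x_bar = X_a.mean(axis=0)
    return x_bar + (1.0 + delta) * (X_a - x_bar)
```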

Remark 1: The EnKF used in all of the simulations is the standard single ensemble EnKF, which typically needs inflation in nonlinear, high-dimensional applications. Given the large ensemble size used here, no covariance inflation is used with the EnKF in the Lorenz-63 experiments.

2) Results

The simulation results are summarized in Tables 1 and 2. The first number in each cell is the average RMSE over 2000 steps. The number in parentheses is the average ensemble spread, measured by the square root of the trace of the ensemble covariance matrix and scaled by $d^{-1/2}$ to match the RMSE. The number below the RMSE and ensemble spread is the percentage of the true state covered by the 95% confidence interval.

Table 1. Average RMSE over 2000 time steps for the Lorenz-63 system with Gaussian N(0, θ²) observation errors and step size Δ. Also presented are the average ensemble spread (in parentheses) and the percentage of the true state being ranked between the 2.5% and 97.5% quantiles of the updated ensemble (below the average RMSE and ensemble spread).

Table 2. Average RMSE over 2000 time steps for the Lorenz-63 system with step size Δ. The observation errors are sampled independently from a double-exponential distribution with mean zero and scale parameter θ. Also presented are the average ensemble spread (in parentheses) and the percentage of the true state being ranked between the 2.5% and 97.5% quantiles of the updated ensemble (below the average RMSE and ensemble spread).

A good filtering method should produce 1) a small average RMSE and 2) an ensemble spread that well represents the ensemble mean error (Sacher and Bartello 2008). If the spread is too small, the updated ensemble is not likely to cover the true state; and if the spread is too large, the updated ensemble would be too noninformative. In other words, the true state should look like a random sample from the updated ensemble. Therefore, the ensemble confidence interval coverage of the true state would be a good indicator of the filter performance. Based on our experiment [see also Figs. 1–6 in Sacher and Bartello (2008) and discussions in Mitchell and Houtekamer (2009)], it is usually good to have the updated ensemble spread close to or slightly larger than the RMSE.

As we see from the tables, the NLEAF gives the best overall performance except for the double-exponential noise with Δ = 0.05 and θ = 2, which is the hardest case, with a significantly nonlinear system and large observation noise. Even in this case, it produces an average RMSE similar to that of the particle filter, with a tighter ensemble spread and a reasonable confidence interval coverage.

The EnKF gives large RMSEs because of its bias under nonlinearity and non-Gaussianity. As for the particle filter, in section 2a we pointed out that it suffers from sample impoverishment in deterministic systems. In our implementation, small random perturbations are added to the updated ensemble members to overcome this difficulty, increasing both the uncertainty of the update ensemble and the confidence interval coverage. It is interesting to compare the performance of the PF with NLEAF1. When Δ = 0.05, NLEAF1 gives slightly larger RMSE than the PF. Conversely, NLEAF1 performs better than the PF when Δ = 0.02. This is because when the step size is small, the system is more linear and ignoring higher-order moments does not lose much information.

We observe that for double-exponential observations, NLEAF2 tends to produce a tight ensemble, which might fail to cover the true state. In practice one can mitigate this issue by inflating the updated ensemble covariance as in (7) with some small positive constant δ.

b. The Lorenz-96 system

The Lorenz-96 system (Lorenz 1996) is another common test bed for data assimilation algorithms (Bengtsson et al. 2003; Ott et al. 2004; Anderson 2007). The state vector has 40 coordinates evolving according to the following ordinary differential equation system:
$$\dot{z}_\tau(i) = [z_\tau(i+1) - z_\tau(i-2)]\, z_\tau(i-1) - z_\tau(i) + 8, \qquad i = 1, \ldots, 40, \quad (8)$$

where z_τ(0) = z_τ(40), z_τ(−1) = z_τ(39), and z_τ(41) = z_τ(1). This system mimics the evolution of some meteorological quantity at 40 equally spaced grid points along a latitude circle. The attractor has 13 positive local Lyapunov exponents. Similarly, the system is discretized with step size Δ: $x_t = z_{t\Delta}$ for all integers t.
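Equation (8) with its cyclic boundary conditions can be written compactly with `np.roll`; a minimal sketch, assuming the standard forcing value of 8 shown above:

```python
import numpy as np

def lorenz96(z, forcing=8.0):
    """Right-hand side of Eq. (8); np.roll implements the cyclic indexing."""
    return (np.roll(z, -1) - np.roll(z, 2)) * np.roll(z, 1) - z + forcing
```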

In such a high-dimensional problem, the particle filter often collapses in a single step, with all weight concentrating on one particle. The EnKF also has difficulties when the sample size n is as small as a few tens if no dimension-reduction techniques are used. Most dimension-reduction techniques for the Lorenz-96 system are based on the assumption that two coordinates of x_t have little dependence if they are far apart in physical space. For example, the correlation between x(1) and x(2) might be significant, but x(1) and x(20) would be nearly independent. It should be noted that in the Lorenz-96 model described in (8), the correlation between x(j) and x(j + 1) is statistically detectable but rather weak [see Lorenz (2005), which also introduces models with stronger spatial correlation]. Based on such a spatial correlation assumption, the covariance tapering idea (Gaspari and Cohn 1999; Houtekamer and Mitchell 2001) is applicable. Houtekamer and Mitchell (1998) used a localization scheme in which only data points within a cutoff horizontal distance are used for the update of each horizontal point. The NLEAF implementation uses the sliding-window localization (Brusdal et al. 2003; Ott et al. 2004), where local observations are used to update local state vectors and the whole state vector is reconstructed by aggregating the local updates; a sketch is given after this paragraph. Specifically, the state vector x = [x(1), … , x(40)] is broken down into overlapping local vectors x(N_j), j = 1, … , 40, with N_j = (j − l, … , j, … , j + l) for some positive integer l, all indices taken mod 40. Then each local vector x(N_j) is updated by the ensemble filter using local observations y(N_j) (assuming y = x + ϵ). Therefore each coordinate x(j) is updated simultaneously in the 2l + 1 local vectors x(N_{j−l}), x(N_{j−l+1}), … , x(N_{j+l}). The final update for x(j) is then obtained by averaging its updates in the local vectors x(N_{j−k}), x(N_{j−k+1}), … , x(N_{j+k}) for some positive integer k ≤ l.
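The following sketch renders this sliding-window scheme for a generic local update rule (`local_update` could be, e.g., the NLEAF1 update restricted to a window); the bookkeeping follows the description above, and the names are ours:

```python
import numpy as np

def sliding_window_update(X_f, y_obs, local_update, l=3, k=1):
    """Sliding-window localization on a cyclic state vector.

    local_update(X_loc, y_loc) is any ensemble update applied to a window
    of 2l+1 coordinates; each coordinate's final update averages the 2k+1
    windows centered within distance k of it."""
    n, d = X_f.shape
    X_a = np.zeros_like(X_f)
    counts = np.zeros(d)
    for m in range(d):
        idx = np.arange(m - l, m + l + 1) % d     # window N_m, cyclic indices
        X_loc = local_update(X_f[:, idx], y_obs[idx])
        # keep only the window positions within distance k of its center
        for pos in range(l - k, l + k + 1):
            X_a[:, idx[pos]] += X_loc[:, pos]
            counts[idx[pos]] += 1
    return X_a / counts                            # each coordinate seen 2k+1 times
```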

Our study of the Lorenz-96 system consists of two different settings corresponding to two different levels of difficulty.

1) The hard case

(i) Simulation setup

This setting has been considered by Bengtsson et al. (2003) as an early effort toward a high-dimensional nonlinear non-Gaussian filter. The setup is similar to the previous simulations in the Lorenz-63 system. The initial ensemble is generated by applying the EnKF of size 400 for 2000 steps using a complete observation model with standard Gaussian noise. In the next 2000 assimilation cycles, different filtering methods are applied with an ensemble of size 400. The time step between two assimilation cycles is Δ = 0.4, which produces a highly nonlinear non-Gaussian forecast ensemble. To make the problem more challenging, it is assumed that not all coordinates of the state variable are observed. In particular, a linear but incomplete observation mechanism is used: $y_t = Hx_t + \epsilon_t$, where $H = (e_1, e_3, \ldots, e_{39})^{\mathrm{T}}$ with $e_i = (0, \ldots, 1, \ldots, 0)^{\mathrm{T}}$ being the unit vector with all zero entries except the ith position, and $\epsilon_t \sim N(0, 0.5\, I_{20})$. That is, only state coordinates with odd indices are observed, with independent Gaussian noise of variance 0.5. Such a combination of incomplete observation, high nonlinearity, and high dimensionality poses a great challenge to data assimilation algorithms.
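In matrix form, this observation operator simply stacks the odd-indexed unit vectors; for example:

```python
import numpy as np

# rows e_1, e_3, ..., e_39 (1-based): observe the odd-indexed coordinates
H = np.eye(40)[0::2]          # shape (20, 40) in zero-based indexing
R = 0.5 * np.eye(20)          # independent Gaussian noise with variance 0.5
```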

Here we compare four methods: 1) NLEAF1 (NLEAF with first-order correction); 2) NLEAF1q (NLEAF using quadratic regression); 3) EnKF (the stochastic EnKF); and 4) XEnsF, a nonlinear filter using Gaussian mixture approximations to the prior and posterior distributions (Bengtsson et al. 2003). Both NLEAF1 and NLEAF1q use localization and averaging parameters (l, k) = (3, 1). The inflation follows (7) with δ = 0.045 for NLEAF1 and NLEAF1q, and δ = 0.005 for the EnKF.

(ii) Results

The results are summarized in Table 3. We only look at the mean, median, and standard deviation of the 2000 RMSEs since they are the only performance metrics reported for the XEnsF in Bengtsson et al. (2003). Both NLEAF1 and NLEAF1q give small and stable RMSEs on average. Note that in this setting the observation is still linear with Gaussian noise; the NLEAF algorithms outperform the other methods because of their ability to handle nonlinearity. One might note that the differences in the average RMSE seem small compared to the standard deviation. The reason is that the 2000 RMSEs have strong temporal correlation and their distributions are heavy tailed.

Table 3. Average RMSE over 2000 time steps for the Lorenz-96 system with incomplete observations and Gaussian N(0, 0.5) observation errors. The step size is Δ = 0.4.

It should be noted that the result for XEnsF is quoted directly from Bengtsson et al. (2003), thus their results depend on different simulation data than ours and are perhaps less comparable.

2) The easy case

(i) Simulation setup

The easy case uses observation interval Δ = 0.05, which makes the system much more linear and can be thought of as nominally equivalent to 6 h in real-world time (Lorenz 1996). This setting allows a more comprehensive comparison with existing methods since it has been studied extensively in the literature (e.g., Whitaker and Hamill 2002; Ott et al. 2004; Sakov and Oke 2008a,b). In particular, we can study the effect of the ensemble size n on filter performance, which is an important practical concern.

The initial ensemble is generated using the same method as in the hard case. Then the system runs to generate 2000 assimilation cycles with a complete observation yt = xt + ϵt. Since we are interested in non-Gaussian situations, the observation error ϵt has independent double-exponential coordinates with scale parameter θ = 1.

We compare the NLEAF with first-order correction (NLEAF1), the EnKF, and the local ensemble transform Kalman filter (LETKF; Ott et al. 2004). The LETKF is among the best-performing ensemble filters in this setting with Gaussian observation noise. It uses the sliding-window localization and performs well especially when the ensemble size is very small. However, it is also reported that its performance can hardly be improved by a larger ensemble size. Other particle filter–based methods such as the PFGR and MPF either have difficulty in localization or diverge with limited ensemble size for deterministic systems.

The LETKF uses localization parameter k = l = 3 for n = 10, 20, k = l = 4 for n = 50, and k = l = 10 for n = 100, 400. It uses inflation with δ = 0.02 for n = 10, 20, 50 and δ = 0.01 for n = 100, 400. The EnKF uses Gaspari–Cohn covariance tapering with c = 5, 10 for n = 10, 20, and c = 20 for larger ensemble sizes. The inflation rate δ = 0.04, 0.04, 0.02, 0.02, 0.01 for n = 10, 20, 50, 100, 400, respectively. For NLEAF1, the localization parameter is k = l = 3 for n = 10, 20, and k = l = 4 for n = 50, 100, 400; the inflation rate δ = 0.06, 0.04, 0.02, 0.01, 0.01 for n = 10, 20, 50, 100, 400, respectively. NLEAF1q only works for n = 400, with k = l = 4 and inflation rate 0.09. All methods use inflation as described in (7).

(ii) Results

Results on the average RMSE are summarized in Fig. 1 for a quick comparison. Table 4 summarizes some further comparisons on ensemble spread and confidence interval coverage. The ensemble spread is measured by the square root of the trace of the ensemble covariance matrix, scaled by $d^{-1/2}$ to match the RMSE. The number below is the percentage of the true state covered by the 95% confidence interval. A not applicable (N/A) entry indicates that the filter diverges (the whole ensemble is far away from the true orbit).

Fig. 1. Ensemble size against average RMSE for the Lorenz-96 system with step size 0.05 and double-exponential observation noise.

Table 4. Average RMSE over 2000 time steps for the Lorenz-96 system with complete observation and step size Δ = 0.05. The observation errors are sampled from a double-exponential distribution with zero mean and scale parameter θ = 1. Also presented are the average ensemble spread (in parentheses) and the percentage of the true state being ranked between the 2.5% and 97.5% quantiles of the updated ensemble (below the average RMSE and ensemble spread).

From Fig. 1 and Table 4 we see that NLEAF1 performs better than the LETKF as long as the ensemble size exceeds 20, providing both accurate point estimates and confidence intervals. In this setting the system is fairly linear, and NLEAF1 outperforms the LETKF mainly because of the non-Gaussian observation noise. One might notice that the LETKF performs slightly worse for n = 400 than for n = 100. This could be a result of its vulnerability to the presence of outliers (Lawson and Hansen 2004; Lei et al. 2010): increasing the sample size also increases the chance of having outliers, which can significantly affect the performance of deterministic filters.

5. Further discussion

This article demonstrates how simple statistical ideas can help design better filtering algorithms. The NLEAF algorithm inherits some key features of the EnKF by keeping track of each particle, which makes it suitable for high-dimensional problems with spatial structures such as the Lorenz-96 system. It also reflects the bias-variance trade-off principle: the EnKF is computationally stable but biased for nonlinear non-Gaussian problems, whereas the particle filter is consistent but unstable for moderate-scale problems. The NLEAF is somewhere in the middle: it uses the particle filter only up to the first and second moments, while avoiding reweighting/resampling to maintain stability.

Although the NLEAF works nicely for the Lorenz models, it does not completely overcome the collapse problem of the particle filter. Currently the implementation of NLEAF depends on the sliding-window localization. One might expect difficulties if the state vector is only sparsely observed. In our experiments the largest local window is of size 9, which is still a small number for real-world systems such as a global climate model. For higher dimensionality, importance sampling is likely to collapse even for the first moment. One potential solution to this problem could be to assimilate each observation coordinate sequentially, with the impact on each state coordinate adjusted according to its distance to the observation (Anderson 2003, 2007).

The NLEAF algorithm originates from a regression perspective of the EnKF. Such a perspective can be used in other ways to design new filtering methods. In some high-dimensional problems, the sparsity structure (or the support) of the state covariance matrix may not depend only on the physical distance. As a result, the sliding-window localization method described in section 4b is no longer valid. One may then be able to employ sparse regression methods to identify the set of "effective" neighbors of each state coordinate.

Another direction of future work is the development of a deterministic NLEAF. The NLEAF update introduced here uses perturbed observations that are subject to sampling errors. It is possible to reduce such sampling errors in a deterministic NLEAF update where the posterior moments are estimated by importance sampling or nonlinear regression methods, but the ensemble is updated deterministically to have the corresponding moments.

Acknowledgments

The authors thank Jeff Anderson, Thomas Bengtsson, Doug Nychka, and Chris Snyder for helpful comments. Lei and Bickel are supported by Grants NSF DMS-0605236 and NSF DMS-0906808.

REFERENCES

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642.

  • Anderson, J. L., 2007: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. Physica D, 230, 99–111.

  • Bengtsson, T., C. Snyder, and D. Nychka, 2003: Toward a nonlinear ensemble filter for high-dimensional systems. J. Geophys. Res., 108 (D24), 8775, doi:10.1029/2002JD002900.

  • Bishop, C. H., B. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.

  • Brusdal, K., J. M. Brankart, G. Halberstadt, G. Evensen, P. Brasseur, P. J. van Leeuwen, E. Dombrowsky, and J. Verron, 2003: A demonstration of ensemble-based assimilation methods with a layered OGCM from the perspective of operational ocean forecasting systems. J. Mar. Syst., 40–41, 253–289.

  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.

  • Chorin, A., and X. Tu, 2009: Implicit sampling for particle filters. Proc. Natl. Acad. Sci. USA, 106, 17 249–17 254.

  • Evensen, G., 1994: Sequential data assimilation with a non-linear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.

  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367.

  • Evensen, G., 2007: Data Assimilation: The Ensemble Kalman Filter. Springer, 279 pp.

  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.

  • Gordon, N., D. Salmond, and A. Smith, 1993: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc.-F, 140, 107–113.

  • Hammersley, J. M., and D. C. Handscomb, 1965: Monte Carlo Methods. Methuen & Co., 178 pp.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Künsch, H. R., 2005: Recursive Monte Carlo filters: Algorithms and theoretical analysis. Ann. Stat., 33, 1983–2021.

  • Lawson, W. G., and J. A. Hansen, 2004: Implications of stochastic and deterministic filters as ensemble-based data assimilation methods in varying regimes of error growth. Mon. Wea. Rev., 132, 1966–1981.

  • Lei, J., P. Bickel, and C. Snyder, 2010: Comparison of ensemble Kalman filters under non-Gaussianity. Mon. Wea. Rev., 138, 1293–1306.

  • Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Reading, Berkshire, United Kingdom, European Centre for Medium-Range Weather Forecasts, 1–18.

  • Lorenz, E. N., 2005: Designing chaotic models. J. Atmos. Sci., 62, 1574–1587.

  • Mitchell, H. L., and P. L. Houtekamer, 2009: Ensemble Kalman filter configurations and their performance with the logistic map. Mon. Wea. Rev., 137, 4325–4343.

  • Musso, C., N. Oudjane, and F. Le Gland, 2001: Improving regularized particle filters. Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds., Springer-Verlag, 247–271.

  • Nakano, S., G. Ueno, and T. Higuchi, 2007: Merging particle filter for sequential data assimilation. Nonlinear Processes Geophys., 14, 395–408.

  • Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207.

  • Sacher, W., and P. Bartello, 2008: Sampling errors in ensemble Kalman filtering. Part I: Theory. Mon. Wea. Rev., 136, 3035–3049.

  • Sakov, P., and P. R. Oke, 2008a: A deterministic formulation of the ensemble Kalman filter: An alternative to ensemble square root filters. Tellus, 60A, 361–371.

  • Sakov, P., and P. R. Oke, 2008b: Implications of the form of the ensemble transformation in the ensemble square root filters. Mon. Wea. Rev., 136, 1042–1053.

  • Snyder, C., T. Bengtsson, P. Bickel, and J. Anderson, 2008: Obstacles to high-dimensional particle filtering. Mon. Wea. Rev., 136, 4629–4640.

  • van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

  • van Leeuwen, P. J., 2009: Particle filtering in geophysical systems. Mon. Wea. Rev., 137, 4089–4114.

  • van Leeuwen, P. J., 2010: Nonlinear data assimilation in geosciences: An extremely efficient particle filter. Quart. J. Roy. Meteor. Soc., 136, 1991–1999.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.

  • Xiong, X., I. M. Navon, and B. Uzunoglu, 2006: A note on the particle filter with posterior Gaussian resampling. Tellus, 58A, 456–460.