
Marginalized Particle Filtering Framework for Tuning of Ensemble Filters

Václav Šmídl and Radek Hofman

Institute of Information Theory and Automation, Prague, Czech Republic

Abstract

Marginalized particle filtering (MPF), also known as Rao-Blackwellized particle filtering, has been recently developed as a hybrid method combining analytical filters with particle filters. This paper investigates the prospects of this approach in environmental modeling where the key concerns are nonlinearity, high-dimensionality, and computational cost. In the formulation herein, exact marginalization in the MPF is replaced by approximate marginalization, yielding a framework for creation of new hybrid filters. In particular, the authors propose to use the MPF framework for online tuning of nuisance parameters of ensemble filters. Conditional independence–based simplification of the MPF algorithm is proposed for computational reasons and its close relation to previously published methods is discussed. The strength of the framework is demonstrated on the joint estimation of the inflation factor, the measurement error variance, and the length scale parameter of covariance localization. It is shown that accurate estimation can be achieved with a moderate number of particles. Moreover, this result was achieved with naively chosen proposal densities, leaving space for further improvements.

Corresponding author address: Václav Šmídl, Institute of Information Theory and Automation, Pod Vodárenskou věží 4, Praha 8, 196 00 Czech Republic. E-mail: smidl@utia.cas.cz


1. Introduction

We are concerned with the task of data assimilation using a Bayesian approach. The Bayesian interpretation of data assimilation is well established in the community (e.g., Anderson and Anderson 1999). However, application of standard Bayesian methods in this domain is not straightforward because of its specific properties, most notably the high dimensionality of the state variable. Successful application of Bayesian methods in this domain requires their adaptation. This was the case for two prominent Bayesian methods: the Kalman filter (KF) and the particle filter (PF). Practical constraints of the domain motivated the approximation of the formally correct Kalman filter into the form of the ensemble Kalman filter (EnKF; Evensen 1994), which has since evolved beyond the Kalman filter. Direct application of the original particle filter in the considered domain is problematic (Snyder et al. 2008); however, the ideas of particle filtering are incorporated in many new algorithms. Each of these filters has features attractive for the domain as well as limitations. Specifically, the KF (and EnKF) is limited by approximation of the posterior by a Gaussian density, which is required for computational tractability in high dimensions. The particle filter, on the other hand, is capable of approximating an arbitrary density via an empirical density, at the price of significantly higher computational cost, which is prohibitive in the considered dimensions. A combination of these approaches is an obvious goal. Indeed, various hybrid filters have been proposed to overcome the above mentioned difficulties (e.g., Pham 2001; Nakano et al. 2007; van Leeuwen 2009; Bocquet et al. 2010). A common assumption of these methods is that the state variable is treated as a homogeneous variable (i.e., that all entries in the state vector are treated identically).

In this paper we investigate the applicability of the marginalized particle filter (MPF) (Schön et al. 2005)—also known as the Rao-Blackwellized PF (Doucet et al. 2000)—in the field of environmental modeling. This filter arises as a combination of the conditional Kalman filter with the particle filter. The key difference from the abovementioned hybrid filters is that it partitions the state variable into two parts and approximates each part by a different type of distribution. Partitioning of the state space has been discussed by van Leeuwen (2009) in the context of hierarchical Bayesian models, where both partitions were estimated by the particle filter. The partitioning may seem to be an arbitrary step in the considered domain; however, we will show that this approach may be attractive for addressing some specific scenarios, such as online tuning of already existing filters. Particle filtering ideas have already been used for estimation of parameters by Vossepoel and van Leeuwen (2007); however, the resulting method was based on weighting of the ensemble members. The MPF approach is based on running several ensemble filters in parallel, and it is thus closer to the hierarchical ensemble filter (Anderson 2007b).

The paper is organized as follows. In section 2, we review the basic algorithms of Bayesian filtering. In section 3, we apply the MPF approach to tuning of ensemble filters. Performance of the proposed methods is studied in section 4 via simulation of the Lorenz-96 model. Discussion and conclusions are given in section 5.

2. Bayesian filtering and its approximation

The task of data assimilation can be interpreted as a problem of inference of a discrete-time stochastic process:

x_t ~ p(x_t | x_{t−1}),    y_t ~ p(y_t | x_t).    (1)

Here, x_t is a vector known as the state variable, y_t are the observations, and p(·|·) denotes a conditional probability density. By Bayesian filtering we mean the recursive evaluation of the filtering distribution, p(x_t|y_{1:t}), using Bayes' rule (Peterka 1981):

p(x_t | y_{1:t}) = p(y_t | x_t) p(x_t | y_{1:t−1}) / p(y_t | y_{1:t−1}),    (2)

p(x_t | y_{1:t−1}) = ∫ p(x_t | x_{t−1}) p(x_{t−1} | y_{1:t−1}) dx_{t−1},    (3)

where p(x_1|y_0) is the prior distribution, and y_{1:t} = [y_1, … , y_t] denotes the set of all observations. The integration in (3), and elsewhere in this paper, is over the whole support of the involved probability density functions.

Equations (2) and (3) are analytically tractable only for a limited set of models. The most notable example of an analytically tractable model is the linear Gaussian form of (1), for which (2) and (3) are equivalent to the Kalman filter. For other models, (2) and (3) need to be evaluated approximately. Selected approximations are now briefly described.

a. Ensemble Kalman filtering

The ensemble Kalman filter arises as an approximation of Bayesian filtering for the Gaussian stochastic system

p(x_t | x_{t−1}) = N(M(x_{t−1}), Q),    p(y_t | x_t) = N(H x_t, R),    (4)

where N(μ, Σ) denotes a Gaussian distribution with mean value μ and covariance matrix Σ. Here H is a linear observation operator that transforms vectors from the space of the model into the space of observations; the generally nonlinear operator M represents the dynamics of the system. Matrices R and Q are the error covariance matrices of the observations and the model, respectively. Optimal inference of model (4) with a linear operator M yields the Kalman filter, which implies potentially high computational cost for a high-dimensional state vector. The ensemble Kalman filter is an approximation of Bayesian filtering for model (4) with the predictive and posterior densities (2) and (3) projected to Gaussian densities:

p(x_t | y_{1:t−1}) ≈ N(x_t^b, P_t^b),    p(x_t | y_{1:t}) ≈ N(x_t^a, P_t^a).

Here, the vectors x_t^b and x_t^a are known as the background (predictive) model estimate and the analysis (posterior) estimate, in standard notation (Ide et al. 1997).
The predictive and posterior statistics are evaluated via a set of M state vectors x_t^{(i)}, known as the ensemble members, as follows:

x̄_t = (1/M) Σ_{i=1}^{M} x_t^{(i)},    (5)

P_t ≈ [1/(M−1)] Σ_{i=1}^{M} (x_t^{(i)} − x̄_t)(x_t^{(i)} − x̄_t)^T = [1/(M−1)] X_t X_t^T.    (6)

Here, the columns of X_t = [x_t^{(1)} − x̄_t, … , x_t^{(M)} − x̄_t] are the deviations of the ensemble members from the mean x̄_t. The standard EnKF (Burgers et al. 1998) requires us to draw random samples to perturb the observations when computing the analysis ensemble. This need for random sampling is eliminated by deterministic variants such as the ensemble adjustment Kalman filter (Anderson 2001) and the ensemble square root filter (EnSRF; Whitaker and Hamill 2002; Tippett et al. 2003).
In the EnSRF, the posterior mean and the deviations are updated separately, as follows:

x̄_t^a = x̄_t^b + K[y_t − H(x̄_t^b)],    (7)

X_t^a = X_t^b − K̃ H X_t^b,    (8)

where H is the observation operator (which can potentially be nonlinear), K is the Kalman gain matrix for updating the ensemble mean (7),

K = P_t^b H^T (H P_t^b H^T + R)^{−1},    (9)

and K̃ is the modified Kalman gain for updating the departures from the mean (8), which for a scalar observation reads

K̃ = K [1 + √(R / (H P_t^b H^T + R))]^{−1}.    (10)

For a diagonal observation error covariance matrix R, (10) may be processed one element of y_t at a time without any significant increase of computational effort.
In all the equations above, the terms involving P_t^b are replaced by the empirical covariance in the sense of (6). The same replacement is used for the predictive density of the observations,

p(y_t | y_{1:t−1}) = ∫ p(y_t | x_t) p(x_t | y_{1:t−1}) dx_t    (11)

= N(H(x̄_t^b), H P_t^b H^T + R),    (12)

which corresponds to the standard predictive density of the Kalman filter (Peterka 1981). This quantity is often called the marginal likelihood (the marginalization is with respect to x_t).

Small sample issues

Because of approximation (6), the maximum rank of the matrix P_t is M. A full-rank covariance matrix can thus be obtained only when the number of ensemble members is equal to or greater than the dimensionality of the state variable. However, the number of ensemble members is typically much smaller, yielding an inaccurate approximation of the true background error covariance matrix. Two basic techniques for improvement are covariance inflation and distance-dependent covariance localization.

Ensemble inflation is a method that artificially increases the background error variance (Anderson and Anderson 1999). Each forecast ensemble member x_t^{b,(i)} is replaced by

x_t^{b,(i)} ← x̄_t^b + √Δ (x_t^{b,(i)} − x̄_t^b),    (13)

where Δ is an inflation factor slightly greater than 1. Equation (13) implies that the mean value of the ensemble remains the same but its variance is increased.

Covariance localization is a method suppressing spurious covariances representing false relations between distant states (Houtekamer and Mitchell 1998). The method artificially suppresses covariances between distant entries of the state vector and thus improves the conditioning of the matrices P_t^b H^T and H P_t^b H^T. Localization of the forecast error covariance matrix is performed as a Schur product (element-by-element product) of a localization matrix with the model forecast error covariance matrix (6). Localization matrices are constructed by using various correlation functions that are equal to 1 at the observation location and gradually decrease to zero at a finite distance from the observation. This distance is parameterized by the so-called length scale parameter l (Gaspari and Cohn 1999). A sketch of both techniques follows.
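The snippet below applies multiplicative inflation (13) (assuming the √Δ convention, so that the variance is inflated by Δ) and builds the localized covariance as the Schur product of the Gaspari–Cohn fifth-order function with the empirical covariance (6). The argument `dist` is assumed to be a matrix of pairwise grid distances; names are illustrative.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Compactly supported 5th-order piecewise rational correlation function,
    eq. (4.10) of Gaspari and Cohn (1999); c is the half-width (support 2c)."""
    z = np.abs(dist) / c
    f = np.zeros_like(z, dtype=float)
    m = z <= 1
    f[m] = 1 - 5/3*z[m]**2 + 5/8*z[m]**3 + 1/2*z[m]**4 - 1/4*z[m]**5
    m = (z > 1) & (z <= 2)
    f[m] = (4 - 5*z[m] + 5/3*z[m]**2 + 5/8*z[m]**3
            - 1/2*z[m]**4 + 1/12*z[m]**5 - 2/(3*z[m]))
    return f

def inflate_and_localize(Xb, delta, dist, l):
    """Apply multiplicative inflation (13) and return the inflated ensemble
    together with the localized background covariance: the Schur product of
    the Gaspari-Cohn matrix (length scale l) with the empirical covariance (6)."""
    xbar = Xb.mean(axis=1, keepdims=True)
    Xdev = np.sqrt(delta) * (Xb - xbar)          # inflated deviations (13)
    Pb = Xdev @ Xdev.T / (Xb.shape[1] - 1)       # empirical covariance (6)
    return xbar + Xdev, gaspari_cohn(dist, l) * Pb
```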

b. Particle filtering

Particle filtering (Gordon et al. 1993; Doucet et al. 2001) refers to a range of techniques for generating an empirical approximation of p(x_{1:t}|y_{1:t}), where x_{1:t} = [x_1, … , x_t] is the state trajectory:

p(x_{1:t} | y_{1:t}) ≈ (1/n) Σ_{i=1}^{n} δ(x_{1:t} − x_{1:t}^{(i)}).    (14)

Here, x_{1:t}^{(i)} are independent, identically distributed (i.i.d.) samples from the posterior, and δ(·) denotes the Dirac δ function. Therefore, this approach is feasible only if we can sample from the exact posterior p(x_{1:t}|y_{1:t}). If this is not the case, we can draw samples from a chosen proposal distribution (importance function), q(x_{1:t}|y_{1:t}), as follows:

p(x_{1:t} | y_{1:t}) ≈ (1/n) Σ_{i=1}^{n} [p(x_{1:t}^{(i)} | y_{1:t}) / q(x_{1:t}^{(i)} | y_{1:t})] δ(x_{1:t} − x_{1:t}^{(i)}),    x_{1:t}^{(i)} ~ q(x_{1:t} | y_{1:t}).    (15)

Approximation (15) can be written in the form of a weighted empirical distribution, as follows:

p(x_{1:t} | y_{1:t}) ≈ Σ_{i=1}^{n} w_t^{(i)} δ(x_{1:t} − x_{1:t}^{(i)}),    (16)

w_t^{(i)} ∝ p(x_{1:t}^{(i)} | y_{1:t}) / q(x_{1:t}^{(i)} | y_{1:t}).    (17)

Under this importance sampling procedure, the true posterior distribution needs to be evaluated only pointwise, since the weights in (16) can be normalized trivially via the constant Σ_{i=1}^{n} w_t^{(i)}. Here ∝ is used to denote equality up to the normalization constant.

The challenge for online algorithms is to achieve recursive generation of the samples and evaluation of the importance weights. Using (1) and standard Bayesian calculus, (17) may be written in the following recursive form:

w_t^{(i)} ∝ w_{t−1}^{(i)} [p(y_t | x_t^{(i)}) p(x_t^{(i)} | x_{t−1}^{(i)})] / q(x_t^{(i)} | x_{t−1}^{(i)}, y_t).    (18)

Now, the samples x_t^{(i)} are drawn from the proposal density in the denominator of (18), which can be chosen as p(x_t|x_{t−1}). A successful application of the particle filter requires us to resample the particles to preserve informative weights w_t^{(i)}. While resampling can be done at each step, it is computationally advantageous to perform it only when the effective number of particles, n_eff ≈ [Σ_{i=1}^{n} (w_t^{(i)})²]^{−1}, falls below a chosen threshold. See Doucet et al. (2001) for more details and proofs of convergence of the filter to the true posterior density. A sketch of these two operations follows.
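The two bookkeeping operations of the particle filter, the effective sample size and multinomial resampling, amount to a few lines. This sketch assumes the weights are stored in a NumPy array that sums to one.

```python
import numpy as np

def effective_sample_size(w):
    """Effective number of particles, n_eff = 1 / sum_i (w_i)^2."""
    return 1.0 / np.sum(w**2)

def multinomial_resample(rng, particles, w):
    """Draw n particle indices with probabilities w; reset weights to 1/n."""
    n = len(w)
    idx = rng.choice(n, size=n, p=w)
    return particles[idx], np.full(n, 1.0 / n)
```

In practice the resampling step is triggered only when effective_sample_size(w) drops below the chosen threshold, as described above.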

c. Marginalized particle filtering

The main advantage of importance sampling is its generality. However, it may be computationally prohibitive to draw samples from the possibly high-dimensional state space of x_t (Snyder et al. 2008). These problems can be overcome in cases where the structure of the model (1) allows analytical marginalization over a subset x_{c,t} of the full state vector (Doucet et al. 2001; Schön et al. 2005). Therefore, we consider the factorization

p(x_{1:t} | y_{1:t}) = p(x_{c,1:t} | x_{p,1:t}, y_{1:t}) p(x_{p,1:t} | y_{1:t}),    (19)

where p(x_{c,1:t}|x_{p,1:t}, y_{1:t}) is analytically tractable, while p(x_{p,1:t}|y_{1:t}) is not. We replace the latter by a weighted empirical distribution, in analogy to (15), yielding

p(x_{1:t} | y_{1:t}) ≈ Σ_{i=1}^{n} w_t^{(i)} p(x_{c,1:t} | x_{p,1:t}^{(i)}, y_{1:t}) δ(x_{p,1:t} − x_{p,1:t}^{(i)}),    (20)

w_t^{(i)} ∝ p(x_{p,1:t}^{(i)} | y_{1:t}) / q(x_{p,1:t}^{(i)} | y_{1:t}).    (21)

Note that now we only have to sample from the space of x_{p,t}. Recursive evaluation is achieved by application of Bayes' rule,

p(x_{1:t} | y_{1:t}) ∝ p(y_t | x_t) p(x_t | x_{t−1}) p(x_{1:t−1} | y_{1:t−1}),    (22)

and substitution of (20) in place of p(x_{1:t}|y_{1:t}) and p(x_{1:t−1}|y_{1:t−1}). Comparing elements in the summations on both sides of (22), we obtain

w_t^{(i)} ∝ w_{t−1}^{(i)} [p(y_t | x_{p,1:t}^{(i)}, y_{1:t−1}) p(x_{p,t}^{(i)} | x_{p,t−1}^{(i)})] / q(x_{p,t}^{(i)} | x_{p,t−1}^{(i)}, y_{1:t}),    (23)

p(y_t | x_{p,1:t}^{(i)}, y_{1:t−1}) = ∬ p(y_t | x_{c,t}, x_{p,t}^{(i)}) p(x_{c,t} | x_{c,t−1}, x_{p,t}^{(i)}) p(x_{c,t−1} | x_{p,1:t−1}^{(i)}, y_{1:t−1}) dx_{c,t} dx_{c,t−1}.    (24)

Analytical tractability of the integrations in (24) allows us to use the Rao-Blackwell theorem to show that the variance of the weights (23) is lower than the variance of the weights (18) for the same problem (Doucet et al. 2000).

d. Marginalized particle filtering framework

The requirement of analytical tractability of integrations in (24) is always fulfilled when (1) contains a linear Gaussian part (Schön et al. 2005) giving rise to the MPF with the Kalman filter. For the considered high-dimensional application, we propose to relax the requirement of exact marginalization in (24) and replace it by an approximation. This is explained below.

Algorithm 1: General MPF Framework

1. Generate initial particles x_{p,1}^{(i)}, i = 1, … , n, and set the initial statistics of all associated conditional filters.

  2. For each new data record yt, do the following:

1. Sample new values of the particles, x_{p,t}^{(i)} ~ q(x_{p,t}|x_{p,t−1}^{(i)}, y_{1:t}), and update the statistics of all associated conditional filters via (25).

    2. Compute weights (23) of all particles and their associated conditional filters.

3. If the effective number of particles n_eff is lower than the chosen threshold, resample the particles.

We note that, given numerical values of x_{p,t}^{(i)} and x_{p,1:t−1}^{(i)}, (24) is equivalent to the normalizing constant of the Bayesian filter (2). Hence, any Bayesian filter that is capable of evaluating its normalizing constant can be used to approximate (24). What results is an algorithm equivalent to marginalized particle filtering in which the analytical Kalman filters are replaced by approximate conditional filters. Specifically, the following filters interact via algorithm 1:
  • Conditional filter: on the variable x_{c,t}, treating x_{p,t}^{(i)} as an observation; that is, it evaluates

    p(x_{c,t} | x_{p,1:t}^{(i)}, y_{1:t}) ∝ p(y_t | x_{c,t}, x_{p,t}^{(i)}) ∫ p(x_{c,t} | x_{c,t−1}, x_{p,t}^{(i)}) p(x_{c,t−1} | x_{p,1:t−1}^{(i)}, y_{1:t−1}) dx_{c,t−1}.    (25)

  • Particle filter: on the variable x_{p,t}, which handles sampling from the proposal function q(x_{p,t}|x_{p,t−1}, y_{1:t}) and resampling. Each particle is attached to one conditional filter.

In this general form, the algorithm is rather unspecific, because arbitrary conditional filters can be combined with arbitrary particle filtering approaches. Therefore, we consider algorithm 1 to represent a framework for designing specific filtering variants; a sketch of one step is given below. The word "framework" is used to distinguish this approach from the analytical MPF.
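One step of algorithm 1 can be sketched as follows, with the conditional filter hidden behind two user-supplied callbacks: `propose`, which samples from q(x_{p,t}|x_{p,t−1}, y_{1:t}), and `update`, which advances one conditional filter via (25) and returns its marginal log-likelihood approximating (24). The names and calling conventions are illustrative assumptions, not part of the original algorithm statement.

```python
import numpy as np

def mpf_step(rng, particles, weights, filters, y, propose, update, n_thr):
    """One step of algorithm 1 (general MPF framework).

    particles, filters : lists of per-particle parameter values and the
    statistics of their attached conditional (e.g., ensemble) filters.
    """
    n = len(weights)
    log_w = np.log(weights)
    for i in range(n):
        particles[i] = propose(rng, particles[i])            # step 2a: sample
        filters[i], ll = update(filters[i], particles[i], y) # conditional filter (25)
        log_w[i] += ll                                       # step 2b: weights (23)
    w = np.exp(log_w - log_w.max())                          # normalize safely
    w /= w.sum()
    if 1.0 / np.sum(w**2) < n_thr:                           # step 2c: resample
        idx = rng.choice(n, size=n, p=w)
        particles = [particles[j] for j in idx]
        filters = [filters[j] for j in idx]
        w = np.full(n, 1.0 / n)
    return particles, w, filters
```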

3. MPF framework for tuning of ensemble filters

The key property of the MPF is the partitioning of the state variable into two parts. In the original exact formulation, the choice of partitioning is fully determined by tractability of the Bayes' rule (2). Finding a partitioning in the context of environmental modeling, where the state variables typically obey the same equations, is harder. However, the relaxed formalization of section 2d allows us to interpret xp,t not as a partition of the full state but rather as an augmentation of the original state (now denoted by xc,t) by nuisance parameters. What results is a framework for online tuning of existing filters.

The general algorithm of tuning is described in algorithm 1. Specific variants arise for the following choices:

  • Choice 1: Choose a preferred variant of the conditional filter (e.g., a variant of the ensemble filter) estimating xc,t.

  • Choice 2: Choose tuning parameters of interest xp,t and use them to augment the original state xc,t via a chosen model of their evolution p(xp,t|·).

  • Choice 3: Choose a proposal density q(xp,t|·), such as the evolution model q(xp,t|·) ≡ p(xp,t|·).

Different choices in each of the points above will lead to different properties of the resulting filter. The number of possible combinations of these choices is enormous, and finding guidelines for the best option in a given application context is a task for further research. In some applications, a physically motivated evolution model of xp,t may be found, while heuristic or expert-chosen models may be more appropriate in others.

a. Estimation of inflation factor, observation error variance, and length scale parameter

The techniques compensating for small-ensemble issues (see section 2a) have tuning parameters that typically must be set experimentally. Substantial effort has been put into online estimation of the inflation factor alone (Anderson 2007a), in tandem with the observation error (Li et al. 2009), or scaling of the observation and model errors (Stroud and Bengtsson 2007). In this section, we approach the same problem using the MPF framework with the following specific choices.

  • Choice 1: We have chosen the EnSRF with multiplicative inflation (13) as our conditional filter.

  • Choice 2: The unknown tuning parameters are (i) the time-variant inflation factor Δt, (ii) the time-variant variance rt of the observation error, common to all observations (i.e., R_t = r_t I, where I is the identity matrix), and (iii) the time-varying length scale parameter lt of the covariance localization function. Evolution of these parameters is modeled by truncated Gaussian random walks:

    p(Δ_t | Δ_{t−1}) = tN(Δ_{t−1}, σ_Δ²),  p(r_t | r_{t−1}) = tN(r_{t−1}, σ_r²),  p(l_t | l_{t−1}) = tN(l_{t−1}, σ_l²),    (26)

    where tN denotes a Gaussian density truncated to nonnegative support, and the scalar parameters σΔ, σr, and σl denote the spreads of the respective random walks. Nonnegativity of all considered parameters motivates the truncation of the support of the random walks. The augmentation of the state vector is then xp,t = [Δt, rt, lt]^T.
  • Choice 3: The proposal density is chosen as the evolution model, p(xp,t|xp,t−1) = p(Δt|Δt−1)p(rt|rt−1)p(lt|lt−1), under which (23) reduces to

    w_t^{(i)} ∝ w_{t−1}^{(i)} p(y_t | x_{p,1:t}^{(i)}, y_{1:t−1}),    (27)

    where p(y_t | x_{p,1:t}^{(i)}, y_{1:t−1}) is defined by (11)–(12), now explicitly conditioned on the unknown parameters:

    p(y_t | x_{p,1:t}^{(i)}, y_{1:t−1}) = N(H(x̄_t^{b,(i)}), H P_t^{b,(i)}(Δ_t^{(i)}, l_t^{(i)}) H^T + r_t^{(i)} I),    (28)

    with the background statistics taken from the ith conditional filter after inflation and localization.
Under these choices, algorithm 1 defines an adaptation scheme related to other approaches used in the literature. Specifically, (28) is the same equation that was used for maximum likelihood estimation of covariance parameters (Dee 1995). Maximization of this function is achieved, for example, via simplex methods (Mitchell and Houtekamer 2000). In our approach, (28) serves as a likelihood function for Bayesian estimation of the tuning parameters xp,t. The variance of the random walk then models our belief in the time variability of the tuning parameters. In the special case of stationary parameters (i.e., σΔ = σr = σl = 0), algorithm 1 reduces to a parallel run of n ensemble filters, each of which accumulates the product of (28) at each step. After several hundred steps, the majority of the weights will converge to zero and one of them will converge to one. Such behavior is known as sample impoverishment in the particle filtering literature. The convergence of probability mass to a single point may be useful for finding the best-tuned values in an offline phase. However, this degeneracy is undesirable for online application, and nonzero variances of the random walks (26) have to be used.

For nonstationary parameters, each of the n filters follows a random walk of the tuning parameters. The resampling operation removes filters that have diverged into unlikely regions and replaces them with copies of the filters whose parameters are more likely. The area of higher likelihood is then explored by more filters in detail. This of course requires us to run n ensemble filters in parallel, which is computationally expensive. However, the key advantage of this approach is that it is able to optimize nonconvex and multimodal likelihood functions. A sketch of the proposal step follows.
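For the concrete choices above, the proposal step is the truncated Gaussian random walk (26). A simple rejection loop samples it exactly, since redrawing a Gaussian until it lands in the admissible region is equivalent to sampling from the truncated density; the parameter ordering [Δ, r, l] is our convention.

```python
import numpy as np

def propose_params(rng, xp, sigma):
    """Truncated Gaussian random walk (26), used as the proposal q = p for
    xp = [Delta, r, l]; sigma holds the spreads [sigma_Delta, sigma_r, sigma_l].
    Rejection of negative draws realizes the truncation to nonnegative support."""
    while True:
        cand = xp + sigma * rng.standard_normal(3)
        if np.all(cand >= 0.0):
            return cand
```

Because q ≡ p here, the proposal ratio in (23) cancels, and the weight update (27) simply multiplies each previous weight by the marginal likelihood (28) returned by that particle's ensemble filter.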

b. Computational simplifications within the MPF framework

Parallel evaluation of n ensemble filters is computationally inefficient, especially when the tuning parameters are close to each other, for example in cases when the posterior is very sharp. In that case, we may represent the posterior by a point estimate x̂_{p,t},

x̂_{p,t} = Σ_{i=1}^{n} w_t^{(i)} x_{p,t}^{(i)},    (29)

and explore the space of (28) only locally. This is a common approach in adaptive ensemble filtering, and we will discuss its implications from the MPF point of view.

Adaptive evaluation of a single ensemble filter corresponds to approximation of the joint posterior (19) by a conditionally independent posterior,

p(x_{c,1:t}, x_{p,1:t} | y_{1:t}) ≈ p(x_{c,1:t} | y_{1:t}) p(x_{p,1:t} | y_{1:t}).    (30)

The problem is then split into an interaction of a single ensemble filter and a filter on the parameters, as used, for example, by Anderson (2007a). The key difficulty of this approach is that the normalizing constant of the ensemble filter (28) is informative only about the point estimate x̂_{p,t} and not about its posterior density. Optimization of (28) around this value at time t is independent of the value at time t − 1. Restoration of temporal correlation between values of the parameters is achieved, for example, by means of temporal smoothing (Li et al. 2009).

Algorithm 2: Computational simplification of the MPF framework (CI-MPF)

  1. Generate initial particles x_{p,1}^{(i)}, i = 1, … , n, and set the initial statistics of the single ensemble filter.

  2. For each new data record yt, do the following:

    1. Compute the weights (27), using (28) for each particle x_{p,t}^{(i)}.

    2. Evaluate the expected value (29) and use it to inflate and localize the background covariance P_t^b.

    3. Perform the analysis step of the ensemble filter via (7)–(10).

    4. If the effective number of particles n_eff is lower than the chosen threshold, resample the particles.

To reproduce the same approach within the MPF framework, we impose the additional approximation

p(x_{c,1:t} | x_{p,1:t}^{(i)}, y_{1:t}) ≈ p(x_{c,1:t} | x̂_{p,1:t}, y_{1:t})    (31)

in (20) and (24). What results is the combination of a single ensemble filter with a particle filter, algorithm 2. Note that the MPF framework (algorithm 1) requires n background updates and n analysis steps of the ensemble filter, while the simplified algorithm requires one background update, n + 1 evaluations of (28), and one analysis step.

Approximation (31) was first presented by Mustière et al. (2006); we will use the label conditionally independent MPF (CI-MPF) in this paper. Approximation of the full posterior on xp,t by the expected value may seem to be a coarse approximation, since it loses all information about the variance and higher-order moments. A way to use higher-order moments within this approach was presented in Šmídl and Quinn (2008); however, it implies changes within the conditional filter (i.e., the ensemble filter in this case). An alternative is to use heuristic modifications that improve performance in simulations. In this paper, we have used two heuristic modifications: (i) temporal evolution of the particles is not performed between resampling steps, and (ii) the new particles are sampled from a density centered on the point estimate x̂_{p,t−1} rather than on the individual particle values. A sketch of one CI-MPF step follows.
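The sketch below implements one step of algorithm 2 with a single shared ensemble updated using the weighted-mean parameters (29). The callbacks `marg_loglik` (evaluating (28) for given parameters on the common background) and `analysis` (performing (7)–(10)) are illustrative assumptions; the proposal/evolution step is omitted in line with heuristic (i) above.

```python
import numpy as np

def ci_mpf_step(rng, particles, w, ens, y, marg_loglik, analysis, n_thr):
    """One step of algorithm 2 (CI-MPF). particles : (n, 3) array of
    parameter values [Delta, r, l]; ens : the single shared ensemble."""
    log_w = np.log(w)
    for i, xp in enumerate(particles):
        log_w[i] += marg_loglik(ens, xp, y)            # weights (27) using (28)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    xp_hat = np.average(particles, axis=0, weights=w)  # point estimate (29)
    ens = analysis(ens, xp_hat, y)                     # single EnSRF analysis (7)-(10)
    if 1.0 / np.sum(w**2) < n_thr:                     # resample parameters only
        idx = rng.choice(len(w), size=len(w), p=w)
        particles = particles[idx]
        w = np.full(len(w), 1.0 / len(w))
    return particles, w, ens
```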

4. Simulation studies

a. Lorenz-96 model

To demonstrate the versatility of the method, we test the MPF approach on assimilation of the Lorenz-96 model (Lorenz and Emanuel 1998), which has been widely used in simulation studies. The model is given by

dx_j/dt = (x_{j+1} − x_{j−2}) x_{j−1} − x_j + F,    (32)

where F is the model forcing and the x_j are variables forming a cyclic chain. We define x_{−1} = x_{J−1}, x_0 = x_J, and x_{J+1} = x_1 to make (32) meaningful for all values of j = 1, … , J. We use J = 40 variables and F = 8 for the strength of the forcing. The model (32) is integrated forward with the fourth-order Runge–Kutta scheme. The integration is computationally stable for a step of 0.05 nondimensional units, which is also the step of the analysis. All experiments are performed as twin experiments.

b. Stationary parameters

To create a baseline for comparison of the adaptive tuning strategies, we performed a parallel run of EnSRFs for fixed values of Δ, l selected on a rectangular grid, as in Whitaker and Hamill (2002). From the Bayesian point of view, this setup corresponds to estimation of stationary parameters:

p(Δ, l | y_{1:t}) ∝ p(Δ, l) Π_{τ=1}^{t} p(y_τ | y_{1:τ−1}, Δ, l),    (33)

where p(Δ, l) is a prior probability density on the discrete values of Δ, l at the grid points, which is uniform, and p(y_τ|y_{1:τ−1}, Δ, l) is given by (28). For numerical stability, (33) is often computed in logarithmic scale, where the product is replaced by the sum of marginal log-likelihoods.
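In the log domain, the grid evaluation of (33) reduces to summing per-step log-likelihoods and renormalizing. A minimal sketch, assuming the per-step values (28) have already been computed for every grid point:

```python
import numpy as np

def grid_log_posterior(log_lik_steps):
    """Posterior (33) over a (Delta, l) grid with a uniform prior.

    log_lik_steps : (T, G) array of per-step marginal log-likelihoods (28),
    one column per grid point. Returns normalized posterior probabilities."""
    log_post = log_lik_steps.sum(axis=0)   # log of the product in (33)
    log_post -= log_post.max()             # stabilize before exponentiation
    post = np.exp(log_post)
    return post / post.sum()
```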
The observation data are generated from the perfect-model scenario: the "true" state was generated by integrating the Lorenz-96 model (32) for 100 000 steps, and the observations are generated from the true state by adding realizations of zero-mean Gaussian noise with variance r = 1. The analysis was performed by the EnSRF with covariance localization constructed using the compactly supported fifth-order piecewise rational function given by (4.10) in Gaspari and Cohn (1999) with length scale parameter l. The results of a simulation experiment with 132 EnSRFs with 15 ensemble members, r = 1, Δ = [1.00, 1.01, … , 1.10], and l = [0, 1, … , 11] are displayed in Fig. 1 in two modalities. First, the traditional RMSE is computed for each pair of parameters,

RMSE_t = [(1/J) Σ_{j=1}^{J} (x̂_{j,t} − x_{j,t}^{true})²]^{1/2},    (34)

where the state estimate

x̂_t = Σ_{i=1}^{n} w_t^{(i)} x̄_t^{a,(i)}    (35)

is used for the MPF algorithm. Second, the sum of marginal log-likelihoods (33) within the same time intervals is displayed for illustration.
Fig. 1. Estimation of stationary parameters on rectangular grid. (top) Marginal log-likelihood (33) of the tuned parameters Δ, l. Labels of the contour lines denote the difference from the maximum, which is marked by a circle. (bottom) Time average of RMSE (34).

Note that the contours of the marginal log-likelihood (33) correspond closely to the contours of the RMSE. This suggests that the marginal likelihood (28) is a good measure to optimize for the best RMSE in cases where the true state values are not known. The three best choices within each modality are given in Table 1. Note that two choices—Δ = 1.05, l = 8 and Δ = 1.04, l = 6—are in the top three for both criteria. The relative differences in RMSE are rather small; in terms of normalized posterior probability (33), however, the best parameters in Table 1 are e^{82} times more likely than the second best.

Table 1. Best stationary choices of nuisance parameters for EnSRF according to two criteria.

c. Adaptive estimation in perfect model scenario

The same observation data used for estimation of the stationary parameters were used to estimate the time-varying parameters in two different scenarios: (i) fixed rt = 1.0, with Δt, lt estimated; and (ii) rt, lt, Δt all estimated. The particle filter uses multinomial resampling (Gordon et al. 1993), triggered when the effective number of particles falls below a chosen threshold. The variances of the random walks (26) for the MPF algorithm were fixed to the values given in (36); the CI-MPF algorithm was run with 10 times greater variances of the random walk than (36). The prior density of the tuning parameters is chosen as uniform: p(r0) = U(0.1, 4), p(Δ0) = U(1.0, 1.10), p(l0) = U(0.11, 11.11).

Results of simulations for different numbers of particles, both algorithms (MPF and CI-MPF), and both scenarios are displayed in Tables 2 and 3, respectively. In accordance with Whitaker and Hamill (2002) and Li et al. (2009), we ignore the first 1000 steps and report results only for the subsequent steps. The behavior of the MPF algorithm in the first 100 steps of scenario (ii) is illustrated in Fig. 2 via the mean value of the posterior density and the region between the minimum and maximum values of the particles. The spatial distribution of the particles for Δt, lt in scenario (i) at time steps t = 1, 50 000, and 100 000 is displayed in Fig. 3 against the background of contours of the stationary marginal log-likelihood from Fig. 1. We note that the alignment of the particles in the middle of the stationary contour at t = 50 000 is a coincidence; at the majority of other time steps, the cloud is slightly off the stationary optimum.

Table 2. Adaptive tuning of xp,t = [Δt, lt]^T and the resulting analysis RMSE, averaged over assimilation steps between t = 1000 and t = 100 000 (std denotes the standard deviation of the estimates from the mean over time).
Table 3. Adaptive tuning of xp,t = [Δt, rt, lt]^T and the resulting analysis RMSE, averaged over assimilation steps between t = 1000 and t = 100 000 (std denotes the standard deviation of the estimates from the mean over time).
Fig. 2. Illustration of the posterior densities of the parameters via the posterior mean (black line) and the range between minimum and maximum value (gray area).

Fig. 3. Position of the particles, denoted by a plus sign, on the background of stationary contours of the marginal log-likelihood.

As expected, the RMSE of both tuning algorithms decreases steadily with an increasing number of particles in all considered scenarios. The exact MPF algorithm clearly outperforms the conditionally independent approximation CI-MPF. Note that for n = 10 and higher, the MPF filter achieves better performance than the best-tuned filter. The CI-MPF, on the other hand, comes close to the best-tuned performance but never exceeds it.

In the more complex scenario of tuning all three parameters, we observe a much wider gap between the performance of MPF and CI-MPF (Table 3). The MPF algorithm suffers only a negligible increase in RMSE over the scenario with known r, while CI-MPF is significantly worse. With n = 5 in particular, the overall performance of the CI-MPF algorithm is severely degraded. This is caused by an instability in estimation that occurred around time t = 90 000 (see Fig. 4). Much smaller anomalies of a similar kind occur for all tested numbers of particles in the CI-MPF algorithm, while no such anomaly occurred in any run of the MPF algorithm. These anomalies also do not occur in any run of the CI-MPF algorithm in the simpler scenario with known r. Therefore, we conjecture that with increasing uncertainty in xp,t the enforced locality of approximation (30) quickly loses accuracy.

Fig. 4. Instability in parameter estimation via the CI-MPF algorithm.

We note that good performance of the adaptive tuning was achieved with as few as 10 particles. This result is especially promising, since it suggests that even more challenging assimilation scenarios can be handled at comparable computational complexity. This is particularly true for the exact MPF algorithm, since the addition of one extra tuning parameter had a negligible impact on its performance.

d. Model with random perturbations

For comparison with Li et al. (2009), we tested the MPF algorithms on data simulated with model (32) with additive random perturbations:

x_t = M(x_{t−1}) + √(α_t) e_t,    (37)

where M denotes integration of (32) over one analysis step and e_t is Gaussian noise with zero mean and unit variance. The observed data were generated using model (37) with α = 4 for 100 000 steps. The same setup of the EnSRF as in the previous experiments was used, including the same initial conditions. Results of estimation over the 100 000 steps are displayed in Table 4. Since the parameter α is stationary, the time evolution of the parameter estimates using MPF reaches the stationary values in Table 4 after an initial convergence period. Convergence of the CI-MPF filter is much slower, however, causing the large standard deviation in Table 4. A sketch of the data generation is given after Table 4.
Table 4. Adaptive tuning of xp,t = [Δt, rt, lt]^T for the system with random model errors (37) with variance α = 4. The resulting analysis RMSE is averaged over assimilation steps between t = 1000 and t = 100 000; time averages of the parameter estimates are displayed in tandem with the standard deviation of the estimates from the mean over time. Both algorithms were run with n = 10.
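A minimal sketch of the data generation for this experiment, reading (37) as one integration step of (32) followed by an additive perturbation and reusing the `rk4_step` helper from section 4a; the seed and the initial state (which would normally be spun up) are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(40)       # stand-in initial state; a spinup run is assumed
truth, obs = [], []
for t in range(100_000):
    x = rk4_step(x) + np.sqrt(4.0) * rng.standard_normal(40)  # model error, alpha = 4
    truth.append(x.copy())
    obs.append(x + rng.standard_normal(40))                   # observation noise, r = 1
```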

Note that the additive noise was compensated by higher values of Δt and lower values of lt than in the perfect-model scenario (Table 3). This is in agreement with the findings of Li et al. (2009) and is also expected, because α increases the background covariance (reflected by higher inflation) and decreases the correlation between elements of the state vector (reflected by a lower length scale).

To test the tracking properties of the MPF algorithms, we designed a scenario with αt varying in time according to a triangular profile. Posterior densities of the parameters obtained using the MPF algorithm are displayed in Fig. 5. For this experiment, we increased the variances of the random walks (26) above the values in (36). In this case, the CI-MPF algorithm was unable to follow the change in the parameters, confirming the observation of its slower convergence from the experiment with stationary α. This experiment also confirms the trend of increasing Δt and decreasing lt with increasing αt. Note that when αt returns to its stationary value, so do the estimates of the tuning parameters.
Fig. 5. Estimation of system (37) with time-varying αt of triangular profile displayed at the bottom. Posterior densities of the parameters are displayed via their mean value (black line) and the region between minimum and maximum value of the particles (gray area).

5. Discussion and conclusions

The purpose of this paper is to present marginalized particle filtering (also known as Rao-Blackwellized filtering) as an attractive tool for research on data assimilation methods in environmental modeling, and especially for tuning of ensemble filters. The method is based on partitioning the state (or unknown parameters) into two parts: (i) unknowns estimated by a conditional filter and (ii) unknowns estimated by a particle filter. The original MPF assumes that the conditional filter is analytically tractable. In this paper, we propose to replace the analytical filters by ensemble filters. The resulting algorithm allows us to address the problem of tuning ensemble filters. We have shown that the number of particles needed to achieve acceptable performance is rather low; for example, 10 particles are sufficient for online tuning of the inflation factor and the length scale parameter in the EnSRF for the 40-dimensional Lorenz-96 model. Furthermore, we have shown that the approach extends easily to estimation of the unknown variance of the observation error and potentially to any other tuning parameters. Once again, 10 particles were sufficient to achieve performance comparable to that of the best-tuned filter.

The computational cost of the MPF framework is high, since it requires running n EnSRFs in parallel. An attempt to reduce the cost was presented by enforcing conditional independence, resulting in an algorithm that adapts a point estimate of the parameters for a single EnSRF. Similarities of this CI-MPF algorithm to previously published methods were discussed. The CI-MPF algorithm performed well only in simple scenarios with tuning of a small number of stationary parameters, where it was significantly faster than the exact MPF algorithm. In more demanding scenarios, however, its performance dropped significantly. We expect that the advantages of parallel evaluation of ensemble filters over adaptation of a single ensemble filter will become even more apparent in more demanding scenarios. Computational complexity may prevent operational use of the framework; however, it may be an important tool for gaining insight into ensemble filters, in the same spirit as Anderson (2007b).

Acknowledgments

Václav Šmídl’s research was supported by Grant GACR 102/08/P250. Radek Hofman’s research was supported by Grant VG20102013018 and MŠMT 1M0572.

REFERENCES

  • Anderson, J., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.
  • Anderson, J., 2007a: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224.
  • Anderson, J., 2007b: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. Physica D, 230, 99–111.
  • Anderson, J., and S. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.
  • Bocquet, M., C. Pires, and L. Wu, 2010: Beyond Gaussian statistical modeling in geophysical data assimilation. Mon. Wea. Rev., 138, 2999–3023.
  • Burgers, G., P. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
  • Dee, D., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. Mon. Wea. Rev., 123, 1128–1145.
  • Doucet, A., N. de Freitas, K. Murphy, and S. Russell, 2000: Rao-Blackwellised particle filtering for dynamic Bayesian networks. Proc. 16th Conf. on Uncertainty in Artificial Intelligence, Stanford, CA, Association for Uncertainty in Artificial Intelligence, 176–183.
  • Doucet, A., N. de Freitas, and N. Gordon, Eds., 2001: Sequential Monte Carlo Methods in Practice. Springer, 581 pp.
  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162.
  • Gaspari, G., and S. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757.
  • Gordon, N., D. Salmond, and A. Smith, 1993: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc., 140F, 107–113.
  • Houtekamer, P., and H. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
  • Ide, K., P. Courtier, M. Ghil, and A. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181–189.
  • Li, H., E. Kalnay, and T. Miyoshi, 2009: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. Quart. J. Roy. Meteor. Soc., 135, 523–533.
  • Lorenz, E., and K. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.
  • Mitchell, H., and P. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea. Rev., 128, 416–433.
  • Mustière, F., M. Bolić, and M. Bouchard, 2006: A modified Rao-Blackwellised particle filter. Proc. IEEE Conf. on Acoustics, Speech, and Signal Processing, Toulouse, France, IEEE, doi:10.1109/ICASSP.2006.1660580.
  • Nakano, S., G. Ueno, and T. Higuchi, 2007: Merging particle filter for sequential data assimilation. Nonlinear Processes Geophys., 14, 395–408.
  • Peterka, V., 1981: Bayesian approach to system identification. Trends and Progress in System Identification, P. Eykhoff, Ed., Pergamon Press, 239–304.
  • Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207.
  • Schön, T., F. Gustafsson, and P.-J. Nordlund, 2005: Marginalized particle filters for mixed linear/nonlinear state-space models. IEEE Trans. Signal Process., 53, 2279–2289.
  • Šmídl, V., and A. Quinn, 2008: Variational Bayesian filtering. IEEE Trans. Signal Process., 56, 5020–5030.
  • Snyder, C., T. Bengtsson, P. Bickel, and J. Anderson, 2008: Obstacles to high-dimensional particle filtering. Mon. Wea. Rev., 136, 4629–4640.
  • Stroud, J., and T. Bengtsson, 2007: Sequential state and variance estimation within the ensemble Kalman filter. Mon. Wea. Rev., 135, 3194–3208.
  • Tippett, M., J. Anderson, C. Bishop, T. Hamill, and J. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.
  • van Leeuwen, P., 2009: Particle filtering in geophysical systems. Mon. Wea. Rev., 137, 4089–4114.
  • Vossepoel, F., and P. van Leeuwen, 2007: Parameter estimation using a particle method: Inferring mixing coefficients from sea level observations. Mon. Wea. Rev., 135, 1006–1020.
  • Whitaker, J., and T. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.
