  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Bender, C. M., and S. A. Orszag, 1978: Advanced Mathematical Methods for Scientists and Engineers. McGraw-Hill, 593 pp.

  • Bengtsson, T., C. Snyder, and D. Nychka, 2003: Toward a nonlinear ensemble filter for high-dimensional systems. J. Geophys. Res., 108 (D24), 8775–8785.

  • Bengtsson, T., P. Bickel, and B. Li, 2008: Curse of dimensionality revisited: Collapse of the particle filter in very large scale systems. Probability and Statistics: Essays in Honor of David A. Freedman, D. Nolan and T. Speed, Eds., Vol. 2, Institute of Mathematical Statistics, 316–334, doi:10.1214/193940307000000518. [Available online at http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.imsc/1207580091.]

  • Bickel, P., and E. Levina, 2008: Regularized estimation of large covariance matrices. Ann. Stat., 36, 199–227.

  • Bickel, P., B. Li, and T. Bengtsson, 2008: Sharp failure rates for the bootstrap particle filter in high dimensions. Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, B. Clarke and S. Ghosal, Eds., Vol. 3, Institute of Mathematical Statistics, 318–329, doi:10.1214/074921708000000228.

  • Chin, T. M., M. J. Turmon, J. B. Jewell, and M. Ghil, 2007: An ensemble-based smoother with retrospectively updated weights for highly nonlinear systems. Mon. Wea. Rev., 135, 186–202.

  • David, H. A., and H. N. Nagaraja, 2003: Order Statistics. 3rd ed. John Wiley and Sons, 458 pp.

  • Doucet, A., N. de Freitas, and N. Gordon, 2001: An introduction to sequential Monte Carlo methods. Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds., Springer-Verlag, 2–14.

  • Durrett, R., 2005: Probability: Theory and Examples. 3rd ed. Duxbury Press, 512 pp.

  • Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. J. Multivar. Anal., 98, 227–255.

  • Gordon, N. J., D. J. Salmond, and A. F. M. Smith, 1993: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc.-F, 140, 107–113.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Harlim, J., and B. R. Hunt, 2007: A non-Gaussian ensemble filter for assimilating infrequent noisy observations. Tellus, 59A, 225–237.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential, and variational. J. Meteor. Soc. Japan, 75 (Special Issue), 181–189.

  • Keppenne, C. L., M. M. Rienecker, N. P. Kurkowski, and D. A. Adamec, 2005: Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction. Nonlinear Processes Geophys., 12, 491–503.

  • Kim, S., G. L. Eyink, J. M. Restrepo, F. J. Alexander, and G. Johnson, 2003: Ensemble filtering for nonlinear dynamics. Mon. Wea. Rev., 131, 2586–2594.

  • Liu, J. S., 2001: Monte Carlo Strategies in Scientific Computing. Springer-Verlag, 364 pp.

  • Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–148.

  • Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Vol. 1, Reading, Berkshire, United Kingdom, ECMWF, 1–18.

  • Moradkhani, H., K.-L. Hsu, H. Gupta, and S. Sorooshian, 2005: Uncertainty assessment of hydrologic model states and parameters: Sequential data assimilation using the particle filter. Water Resour. Res., 41, W05012, doi:10.1029/2004WR003604.

  • Nakano, S., G. Ueno, and T. Higuchi, 2007: Merging particle filter for sequential data assimilation. Nonlinear Processes Geophys., 14, 395–408.

  • Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207.

  • Pitt, M. K., and N. Shephard, 1999: Filtering via simulation: Auxiliary particle filters. J. Amer. Stat. Assoc., 94, 590–599.

  • Reichle, R. H., D. B. McLaughlin, and D. Entekhabi, 2002: Hydrologic data assimilation with the ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.

  • Silverman, B. W., 1986: Density Estimation for Statistics and Data Analysis. Chapman and Hall, 175 pp.

  • Smith, K. W., 2007: Cluster ensemble Kalman filter. Tellus, 59A, 749–757.

  • Snyder, C., and F. Zhang, 2003: Assimilation of simulated Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 131, 1663–1677.

  • van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

  • Whitaker, J. S., G. P. Compo, X. Wei, and T. M. Hamill, 2004: Reanalysis without radiosondes using ensemble data assimilation. Mon. Wea. Rev., 132, 1190–1200.

  • Xiong, X., I. M. Navon, and B. Uzunoglu, 2006: A note on the particle filter with posterior Gaussian resampling. Tellus, 58A, 456–460.

  • Zhou, Y., D. McLaughlin, and D. Entekhabi, 2006: Assessing the performance of the ensemble Kalman filter for land surface data assimilation. Mon. Wea. Rev., 134, 2128–2142.
Fig. 1. Histograms of max wi for Nx = 10, 30, and 100 from the particle-filter simulations described in the text: Ne = 10³, x ∼ N(0, 𝗜), Ny = Nx, 𝗛 = 𝗜, and ϵ ∼ N(0, 𝗜).

Fig. 2. The ensemble size Ne, as a function of Nx (or Ny), required if the posterior mean estimated by the particle filter is to have average squared error less than that of the prior or the observations, in the simple example considered in the text. Asterisks show the simulation results, averaged over 400 realizations. The best-fit line is given by log₁₀ Ne = 0.05 Nx + 0.78.

Fig. 3. The ensemble size log₁₀ Ne, as a function of Nx (or Ny), such that max wi averaged over 400 realizations is less than 0.6 (plus signs), 0.7 (circles), and 0.8 (asterisks) in the simple example considered in the text.

Fig. 4. Evaluation of (19) against simulations in the case λ²_j = 1, j = 1, . . . , Ny. For each of 60 (Ny, Ne) pairs, as detailed in the text, E[1/w(Ne)] was estimated from an average of 1000 realizations of the particle-filter update. The best-fit line to the data, E[1/w(Ne)] − 1 = −0.006 + 0.964 log(Ne)/Ny, is indicated by the solid line, while the prediction in (24) is shown by a dashed line.

Fig. 5. Evaluation of (19) against simulations in the case λ²_j = c j^θ, j = 1, . . . , Ny. The parameters θ and c are varied as described in the text, while Ny = 4 × 10³ and Ne = 10⁵ are fixed. The expectation E[1/w(Ne)] was estimated by averaging over 400 realizations of the particle-filter update. The best-fit line to the data, E[1/w(Ne)] − 1 = 0.006 + 1.0082 log(Ne)/τ, is indicated by the solid line, while the prediction in (19) is shown by a dashed line.


Obstacles to High-Dimensional Particle Filtering

  • 1 National Center for Atmospheric Research,* Boulder, Colorado
  • 2 Bell Laboratories, Murray Hill, New Jersey
  • 3 Department of Statistics, University of California, Berkeley, Berkeley, California
  • 4 National Center for Atmospheric Research,* Boulder, Colorado

Abstract

Particle filters are ensemble-based assimilation schemes that, unlike the ensemble Kalman filter, employ a fully nonlinear and non-Gaussian analysis step to compute the probability distribution function (pdf) of a system’s state conditioned on a set of observations. Evidence is provided that the ensemble size required for a successful particle filter scales exponentially with the problem size. For the simple example in which each component of the state vector is independent, Gaussian, and of unit variance and the observations are of each state component separately with independent, Gaussian errors, simulations indicate that the required ensemble size scales exponentially with the state dimension. In this example, the particle filter requires at least 10¹¹ members when applied to a 200-dimensional state. Asymptotic results, following the work of Bengtsson, Bickel, and collaborators, are provided for two cases: one in which each prior state component is independent and identically distributed, and one in which both the prior pdf and the observation errors are Gaussian. The asymptotic theory reveals that, in both cases, the required ensemble size scales exponentially with the variance of the observation log likelihood rather than with the state dimension per se.

* The National Center for Atmospheric Research is sponsored by the National Science Foundation.

Corresponding author address: C. Snyder, NCAR, P.O. Box 3000, Boulder, CO 80307-3000. Email: chriss@ucar.edu

This article is included in the Mathematical Advances in Data Assimilation (MADA) special collection.


1. Introduction

Ensemble methods for data assimilation are presently undergoing rapid development. The ensemble Kalman filter (EnKF), in various forms, has been successfully applied to a wide range of geophysical systems including atmospheric flows from global to convective scales (Whitaker et al. 2004; Snyder and Zhang 2003), oceanography from global to basin scales (Keppenne et al. 2005), and the land surface (Reichle et al. 2002). Particle filters are another class of ensemble-based assimilation methods of interest in geophysical applications. [See Gordon et al. (1993) or Doucet et al. (2001) for an introduction.]

In their simplest form, particle filters calculate posterior weights for each ensemble member based on the likelihood of the observations given that member. Like the EnKF, particle filters are simple to implement and largely independent of the forecast model, but they have the added attraction that they are, in principle, fully general implementations of Bayes’s rule and applicable to highly non-Gaussian probability distributions. Unlike the EnKF, however, particle filters have so far mostly been applied to low-dimensional systems. This paper examines obstacles to applying particle filters in high-dimensional systems.

Both particle filters and the EnKF are Monte Carlo techniques—they work with samples (i.e., ensembles) rather than directly with the underlying probability density function (pdf). Naively, one would expect such techniques to require ensemble sizes that are large relative to the dimension of the state vector. Experience has shown, however, that this requirement does not hold for the EnKF if localization of the sample covariance matrix is employed (Houtekamer and Mitchell 1998, 2001; Hamill et al. 2001). The feasibility of the EnKF with ensemble sizes much smaller than the state dimension also has theoretical justification. Furrer and Bengtsson (2007) and Bickel and Levina (2008) examine the sample covariance structure for reasonably natural classes of covariance matrices and demonstrate the effectiveness of localizing the sample covariance matrix.

There is much less experience with particle filters in high dimensions. Several studies have presented results from particle filters and smoothers for very low dimensional systems, including that of Lorenz (1963) and the double-well potential (Pham 2001; Kim et al. 2003; Moradkhani et al. 2005; Xiong et al. 2006; Chin et al. 2007). Both van Leeuwen (2003) and Zhou et al. (2006), however, apply the particle filter to higher-dimensional systems. Van Leeuwen (2003) considers a model for the Agulhas Current with dimension of roughly 2 × 10⁵, and Zhou et al. (2006) use a land surface model of dimension 684. We will return to the relation of our results to their studies in the concluding section.

One might expect that particle filters, which in essence attempt to approximate the full pdf of the state, will be substantially more difficult to apply in high dimensions than the EnKF, which only involves approximation of the mean and covariance. The estimation of continuous pdfs is known to suffer from the “curse of dimensionality,” requiring computations that increase exponentially with dimension (Silverman 1986).

We argue here that high-dimensional particle filters face fundamental difficulties. Specifically, we explore the result from Bengtsson et al. (2008) and Bickel et al. (2008) that, unless the ensemble size is exponentially large in a quantity τ², the particle-filter update suffers from a “collapse” in which with high probability a single member is assigned a posterior weight close to one while all other members have vanishingly small weights. The quantity τ² is the variance of the observation log likelihood, which depends not only on the state dimension but also on the prior distribution and the number and character of observations. As will be discussed later, τ² may be considered an effective dimension as it is proportional to the dimension of the state vector in some simple examples.

The tendency for collapse of weights has been remarked on previously in the geophysical literature (Anderson and Anderson 1999; Bengtsson et al. 2003; van Leeuwen 2003) and is also well known in the particle-filtering literature, where it is often referred to as “degeneracy,” “impoverishment,” or “sample attrition.” Unlike previous studies, however, we emphasize the collapse of weights as a fundamental obstacle to particle filtering in high-dimensional systems, in that very large ensembles are required to avoid collapse even for system dimensions of a few tens or hundreds.¹

Because of the tendency for collapse, particle filters invariably employ some form of resampling or selection step after the updated weights are calculated (e.g., Liu 2001), in order to remove members with very small weights and replenish the ensemble. We do not analyze resampling algorithms in this paper but rather contend that, whatever their efficacy for systems of small dimension and reasonably large ensemble sizes, they are unlikely to overcome the need for exponentially large ensembles as τ2 grows. Resampling proceeds from the approximate posterior distribution computed by the particle filter; it does not improve the quality of that approximate posterior.
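Resampling is tangential to the argument here, but for readers unfamiliar with it, the following is a minimal Python sketch of one common scheme, systematic resampling (the function and test values are illustrative, not from this paper). It copies high-weight members and discards low-weight ones; consistent with the point above, once the weights have collapsed it can only replicate the single dominant member.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Return indices of ensemble members to keep after resampling.

    Draws a single uniform offset and strides through the cumulative
    weights with step 1/Ne, so a member with weight near k/Ne is
    copied roughly k times.
    """
    ne = len(weights)
    positions = (rng.uniform() + np.arange(ne)) / ne
    return np.searchsorted(np.cumsum(weights), positions)

rng = np.random.default_rng(0)
w = np.array([0.97, 0.01, 0.01, 0.01])  # nearly collapsed weights
idx = systematic_resample(w, rng)
print(idx)  # mostly (or entirely) copies of member 0
```

The resampled ensemble is `ens[idx]` with uniform weights; the approximate posterior it represents is unchanged.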

The particle filter can also be cast in the framework of importance sampling (see Doucet et al. 2001 for an introduction), which allows one to choose the proposal distribution from which the particles are drawn. All the analysis in this paper assumes that the proposal is the prior distribution, a simple and widely used approach. Although the possibility is yet to be demonstrated, clever choices of the proposal distribution may be able to overcome the need for exponentially large ensemble sizes in high-dimensional systems.

The outline of the paper is as follows. In section 2, we review the basics of particle filters. Section 3 illustrates the difficulty of particle filtering when τ² is not small through simulations for the simplest possible example: a Gaussian prior and observations of each component of the state with Gaussian errors, both of which have identity covariance. In section 4, we derive (following Bengtsson et al. 2008) an asymptotic condition on the ensemble sizes that yield collapse when both the prior and observation errors are independent and identically distributed in each component of the state vector. Section 5 extends those results to the more general case of Gaussian priors and Gaussian observation errors. Section 6 briefly discusses the effect of a specific heavy-tailed distribution for the observation error.

2. Background on particle filters

Our notation will generally follow that of Ide et al. (1997) except for the dimensions of the state and observation vectors and our use of subscripts to indicate ensemble members.

Let x of dimension Nx be the state of the system represented in some discrete basis, such as the values of all prognostic variables on a regular grid. Since it cannot be determined exactly given imperfect observations, we consider x to be a random variable with pdf p(x).

The subsequent discussion will focus on the update of p(x) given new observations at some time t = t0. That is, suppose that we have both a prediction p[x(t0)] and a vector of observations y that depends on x(t0) and has dimension Ny. {To be more precise, p[x(t0)] is conditioned on all observations prior to t = t0. Since all pdfs here pertain to t = t0 and will be conditioned on all previous observations, in what follows we suppress explicit reference to t0 and the previous observations.} We wish to estimate p(x|y), the pdf of x given the observations y, which we will term the posterior pdf.

For simplicity, let the observations have a linear relation to the state and be subject to additive random errors ϵ:
y = 𝗛x + ϵ.  (1)
More general observation models are of course possible but (1) suffices for all the points we wish to make in this paper.
The particle filter begins with an ensemble of states {x^f_i, i = 1, . . . , Ne} that is assumed to be drawn from p(x), where the superscript f (for “forecast”) indicates a prior quantity. The ensemble members are also known as particles. The update step makes the approximation of replacing the prior density p(x) by a sum of delta functions, (1/Ne) Σ_{i=1}^{Ne} δ(x − x^f_i). Applying Bayes’s rule yields
p(x|y) ≈ Σ_{i=1}^{Ne} wi δ(x − x^f_i),  (2)
where the posterior weights are given by
wi = p(y|x^f_i) / Σ_{j=1}^{Ne} p(y|x^f_j).  (3)
In the posterior, each member x^f_i is weighted according to how likely the observations would be if x^f_i were the true state.
If one of the likelihoods p(y|x^f_i) is much larger than the rest, max wi will be close to 1 and the particle filter approximates the posterior pdf as a single point mass. The particle-filter estimates of posterior expectations, such as the posterior mean
x̂^a = Σ_{i=1}^{Ne} wi x^f_i,  (4)
may then be poor approximations. We will loosely refer to this situation, in which a single member is given almost all the posterior weight, as collapse of the particle filter. The goal of our study is to describe the situations in which collapse occurs, both through the rigorous asymptotic results of Bengtsson et al. (2008) for large Ny and Ne and through simulations informed by the asymptotics.
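To make the update step concrete, here is a minimal Python sketch of the weight computation in (3) for linear observations with Gaussian errors; the function name and the log-domain stabilization are implementation choices for illustration, not part of the paper.

```python
import numpy as np

def particle_weights(ens, y, H, obs_var=1.0):
    """Posterior weights (3) for the linear observation model (1) with
    Gaussian errors, computed in the log domain so that very negative
    log likelihoods do not underflow to zero before normalization.

    ens: (Ne, Nx) prior ensemble; y: (Ny,) observations; H: (Ny, Nx).
    """
    innov = y - ens @ H.T                      # (Ne, Ny) innovations
    loglik = -0.5 * np.sum(innov ** 2, axis=1) / obs_var
    loglik -= loglik.max()                     # stabilize before exponentiating
    w = np.exp(loglik)
    return w / w.sum()

rng = np.random.default_rng(1)
ne, nx = 1000, 10
ens = rng.standard_normal((ne, nx))            # prior ensemble, x ~ N(0, I)
x_true = rng.standard_normal(nx)
y = x_true + rng.standard_normal(nx)           # H = I, unit-variance errors
w = particle_weights(ens, y, np.eye(nx))
print(w.max())
```

The normalization in the last line of the function guarantees the weights sum to one, as (3) requires.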

3. Failure of the particle filter in a simple example

We next consider a simple example, in which the prior distribution p(x) is Gaussian with each component of x independent and of unit variance and the observations y are of each component of x individually with independent Gaussian errors of unit variance. More concisely, consider Ny = Nx, 𝗛 = 𝗜, x ∼ N(0, 𝗜), and ϵ ∼ N(0, 𝗜), where the symbol ∼ means “is distributed as” and N(μ, 𝗣) is the Gaussian distribution with mean μ and covariance matrix 𝗣.

Figure 1 shows histograms for max wi from simulations of the particle-filter update using Nx = 10, 30, and 100, and Ne = 10³. In the simulations, x, ϵ, and an ensemble {x^f_i, i = 1, . . . , Ne} are drawn from N(0, 𝗜). Weights wi are then computed from (3). The histograms are based on 10³ realizations for each value of Nx.

The maximum wi is increasingly likely to be close to 1 as Nx and Ny increase. Large weights appear occasionally in the case Nx = 10, for which max wi > 0.5 in just over 6% of the simulations. Once Nx = 100, the average value of max wi over the 10³ simulations is greater than 0.8 and max wi > 0.5 with probability 0.9. Collapse of the weights occurs frequently for Nx = 100 despite the ensemble size Ne = 10³.

Two comparisons illustrate the detrimental effects of collapse. The correct posterior mean in this Gaussian example is given by x^a = (x^f + y)/2, where the superscript a (for “analysis”) indicates a posterior quantity and the prior mean x^f = 0 in this example. The expected squared error of x^a is E(|x^a − x|²) = [E(|x^f − x|²) + E(|y − x|²)]/4 = Nx/2, while that of the observations, E(|y − x|²), is equal to Nx. The posterior mean estimated by the particle filter,
x̂^a = Σ_{i=1}^{Ne} wi x^f_i,
has squared error of 5.5, 25, and 127 for Nx = 10, 30, and 100, respectively, when averaged over the simulations. Thus, x̂^a has error close to that of x^a only for Nx = 10. For Nx = 100, collapse of the weights is pronounced and x̂^a is a very poor estimator of the posterior mean; it has larger errors than either the prior or the observations.

As might be expected, the effects of collapse are also apparent in the particle-filter estimate of the posterior variance, which is given by Σ_i wi |x^f_i − x̂^a|². The correct posterior variance is given by E(|x − x^a|²) = Nx/2, yet the particle-filter estimates (again averaged over 10³ simulations) are 4.7, 10.5, and 19.5 for Nx = 10, 30, and 100, respectively. Except for Nx = 10, the particle-filter update significantly underestimates the posterior variance, especially when compared with the squared error of x̂^a.

The natural question is how large the ensemble must be in order to avoid the complete failure of the update. This example is tractable enough that the answer may be found by direct simulation: for various Nx, we simulate with Ne = 10 × 2^k and increase k until the average squared error of x̂^a is less than that of the prior or the observations. We emphasize that this merely requires that the particle-filter estimate of the state is no worse than simply relying on the observations or the prior alone (i.e., that the particle filter “does no harm”). The Ne required to reach this minimal threshold is shown as a function of Nx (or Ny) in Fig. 2.

The required Ne appears to increase exponentially in Nx. The limitations this increase places on implementations of the particle filter are profound. For Nx = Ny = 90, somewhat more than 3 × 10⁵ ensemble members are needed. Ensemble sizes for larger systems can be estimated from the best-fit line shown in Fig. 2. Increasing Nx and Ny to 100 increases the necessary ensemble size to just under 10⁶, while Nx = Ny = 200 would require 10¹¹ members.
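These extrapolations are just the fitted line of Fig. 2 evaluated at larger Nx; a few lines of Python reproduce the arithmetic (the fit is empirical, so the numbers are order-of-magnitude guides, not a theoretical law).

```python
# Ensemble size implied by the best-fit line of Fig. 2,
# log10(Ne) = 0.05 * Nx + 0.78 (an empirical fit from the simulations).
def required_ne(nx):
    return 10 ** (0.05 * nx + 0.78)

for nx in (90, 100, 200):
    print(nx, f"{required_ne(nx):.1e}")
```

For Nx = 100 the line gives just under 10⁶ members, and for Nx = 200 roughly 10¹¹, matching the figures quoted in the text.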

The exponential growth of the required Ne is also apparent in other aspects of the problem. Figure 3 shows the minimum Ne such that the maximum wi (averaged over 400 realizations) is less than a specified value. For each of the thresholds 0.6, 0.7, and 0.8, the required Ne increases approximately exponentially with Nx.

4. Behavior of weights for large Ny

The previous example highlights potential difficulties with the particle-filter update but does not permit more general conclusions. Results of Bengtsson et al. (2008), outlined in this section and the next, provide further guidance on the behavior of the particle-filter weights. Our discussion will be largely heuristic; we refer the reader to Bengtsson et al. for more rigorous and detailed proofs.

a. Approximation of the observation likelihood

Suppose that each component ϵj of ϵ is independent and identically distributed (i.i.d.) with density f(·). Then for each member x^f_i, the observation likelihood can be written as
p(y|x^f_i) = Π_{j=1}^{Ny} f[yj − (𝗛x^f_i)j],  (5)
where yj and (𝗛x^f_i)j are the jth components of y and 𝗛x^f_i, respectively. An elementary consequence of (5) is that, given y, the likelihood depends only on Ny, f(·), and the prior as reflected in the observed variables 𝗛x. There is no direct dependence on the state dimension Nx.
Defining ψ(·) = log f(·),
p(y|x^f_i) = exp(−Σ_{j=1}^{Ny} Vij),  (6)
where Vij = −ψ[yj − (𝗛x^f_i)j] is the negative log likelihood of the jth component of the observation vector given the ith ensemble member. It is convenient to center and scale the argument of the exponent in (6) by defining
Si = (Σ_{j=1}^{Ny} Vij − μ)/τ,  (7a)
where
μ = E(Σ_{j=1}^{Ny} Vij)  and  τ² = var(Σ_{j=1}^{Ny} Vij),  (7b)
with both moments taken for given y.
Then (6) becomes
p(y|x^f_i) = exp(−μ − τSi) ∝ exp(−τSi),  (8)
where Si has zero mean and unit variance. The simplest situation (as in the example of section 3) is when the random variables Vij, j = 1, . . . , Ny, are independent given y, so that
τ² = Σ_{j=1}^{Ny} var(Vij).
Because Si is a sum of Ny random variables, its distribution will often be close to Gaussian if Ny is large. When Vij, j = 1, . . . , Ny, are independent given y, the distribution of Si on any fixed, finite interval approaches the standard Gaussian distribution for large Ny if the Lindeberg condition holds with probability tending to 1 (see Durrett 2005, section 2.4a). More generally, the approximate normality of Si holds for any observation error density f(·) such that ∫ f^{1−ϵ}(t) dt is finite for some ϵ > 0, and when the Vij are not i.i.d. but have sufficiently similar distributions and are not too dependent (see Bengtsson et al. 2008). We note in passing that the requirement that the Vij be not too dependent as Ny increases means both that Nx must become large as well and that the components of the state vector must not be strongly dependent. The pdf of the x^f_i must also have a moment-generating function, but is otherwise unconstrained. We will return to the role of Nx in the collapse later.
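For the identity-covariance example of section 3, these quantities are easy to compute explicitly; the Python sketch below standardizes the summed negative log likelihoods, substituting ensemble sample moments for the exact μ and τ (an illustrative shortcut, not the paper's definition, under which Si has mean 0 and variance 1 by construction).

```python
import numpy as np

# Identity-covariance Gaussian example: V_ij = -log f(y_j - (Hx^f_i)_j)
# equals (y_j - x^f_ij)^2 / 2 up to an additive constant, and any such
# constant cancels in the centering of (7a).
rng = np.random.default_rng(0)
ny, ne = 400, 5000
y = np.sqrt(2.0) * rng.standard_normal(ny)    # y = x + eps has variance 2
ens = rng.standard_normal((ne, ny))           # prior draws, x ~ N(0, I)
V = 0.5 * (y[None, :] - ens) ** 2             # (Ne, Ny) neg. log likelihoods
sums = V.sum(axis=1)
mu, tau = sums.mean(), sums.std()             # sample moments stand in for (7b)
S = (sums - mu) / tau
print(round(float(S.mean()), 6), round(float(S.std()), 6))
```

A histogram of S in this setting looks close to standard normal, consistent with the central-limit argument above.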

Equation (8), together with the approximation Si ∼ N(0, 1), is the basis for the asymptotic conditions for collapse derived in section 4b. They allow statements about the asymptotic behavior of the likelihood, and thus of the wi, for large sample sizes Ne and large numbers of observations Ny, using asymptotic results for large samples from the standard normal distribution.

Showing that the approximation Si ∼ N(0, 1) is adequate for our purposes is nontrivial, since the behavior in the tails of the distribution is crucial to the derivations but convergence to a Gaussian is also weakest there. The interested reader will find details and proofs in Bengtsson et al. (2008). In fact, the approximation is adequate when the Si are distributed as noncentral χ² variables with Ny degrees of freedom, which is exactly the case when the observations themselves are Gaussian. As the study of Bengtsson et al. (2008) shows, the adequacy of the Gaussian approximation for the Si also holds if ψ = log f has a moment-generating function, for instance if f is Cauchy. In what follows, however, we will assume that Si ∼ N(0, 1) holds in a fashion that makes the succeeding manipulations valid.

b. Heuristic derivation of conditions for collapse

Using (8), the maximum weight w(Ne) can be expressed as
w(Ne) = exp[−τS(1)] / Σ_{i=1}^{Ne} exp[−τS(i)] = {Σ_{i=1}^{Ne} exp[−τ(S(i) − S(1))]}^{−1},  (9)
where S(i) is the ith-order statistic of the sample {Si, i = 1, . . . , Ne}.² Defining
T = Σ_{i=2}^{Ne} exp[−τ(S(i) − S(1))],  (10)
we then have
w(Ne) = (1 + T)^{−1}.  (11)
Collapse of the particle-filter weights occurs when T approaches zero.
To obtain asymptotic conditions for collapse, we next derive an expression for E(T) for large Ne and Ny by approximating E[T|S(1)] and then taking an expectation over the distribution of S(1). For an expectation conditioned on S(1), the sum in (10) may be replaced by a sum over an unordered ensemble with the condition Si > S(1). In that case the expectation of each term in the sum will be identical and
E(T|S(1)) = (Ne − 1) E{exp[−τ(S̃ − S(1))] | S(1)},  (12)
where S̃ is drawn from the same distribution as the Si but with values restricted to be greater than S(1).
We now proceed with the calculation under the assumption that Si ∼ N(0, 1). Then S̃ has the density
φ(z)/Φ̄[S(1)]  for z > S(1),
where φ(·) is the density of the standard normal distribution and Φ̄(x) = ∫_x^∞ φ(z) dz. Writing the expectation explicitly with the density of S̃ yields
E(T|S(1)) = (Ne − 1) exp[τS(1)] {Φ̄[S(1)]}^{−1} ∫_{S(1)}^{∞} exp(−τz) φ(z) dz.  (13)
Next, we replace φ(z) by (2π)^{−1/2} exp(−z²/2) in the integrand in (13), complete the square in the exponent, and use the definition of Φ̄(x) to obtain the following:
E(T|S(1)) = (Ne − 1) exp[τS(1) + τ²/2] Φ̄[τ + S(1)]/Φ̄[S(1)].  (14)
The behavior of Gaussian order statistics, such as the minimum of a sample, is well known (David and Nagaraja 2003). An important result is that,³ as Ne → ∞,
S(1)/√(2 log Ne) → −1  in probability.  (15)
Thus, since S(1) is becoming large and negative, Φ̄[S(1)] approaches 1 and may be ignored in (14) when calculating the asymptotic behavior of E(T|S(1)).
Now suppose that τ/√(2 log Ne) → ∞ as Ne → ∞. In this limit, τ + S(1) ≈ τ[1 − √(2 log Ne)/τ] → ∞ and so, by the standard approximation to the behavior of Φ̄ for large positive values of its argument,
Φ̄(x) ≈ φ(x)/x  as x → ∞,  (16)
which may be easily derived with integration by parts (e.g., see section 6.3 of Bender and Orszag 1978).
Substituting (16) in (14), we conclude after some algebra⁴ that
E(T|S(1)) ≈ (Ne − 1) φ[S(1)]/[τ + S(1)].  (17)
But reversing the reasoning that led to (16) gives φ[S(1)]/|S(1)| ≈ Φ[S(1)], where Φ(x) = 1 − Φ̄(x) is the cumulative distribution function (cdf) for the standard Gaussian. Thus,
E(T|S(1)) ≈ (Ne − 1) Φ[S(1)] √(2 log Ne)/τ,  (18)
as Ne → ∞.
Taking the expectation of (18) over S(1) then gives
E[1/w(Ne)] = 1 + E(T) ≈ 1 + √(2 log Ne)/τ.  (19)
To see this, recall that evaluating the cdf of a random variable at the value of the random variable, as in Φ(Si), yields a random variable with a uniform distribution on [0, 1]. This property underlies the use of rank histograms as diagnostics of ensemble forecasts (Hamill 2001 and references therein) and is known in statistics as the “probability integral transform.” Thus, Φ[S(1)] is distributed as the minimum of a sample of size Ne from a uniform distribution and E{Φ[S(1)]} ≈ 1/Ne. In the next section, we will confirm (19) with direct simulations.
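The probability-integral-transform step is easy to check numerically; the short Python sketch below verifies that the Gaussian cdf evaluated at the sample minimum of Ne standard normals has mean close to 1/Ne (the exact mean of the minimum of Ne uniforms is 1/(Ne + 1)).

```python
import math
import numpy as np

# Phi(S_(1)) should behave as the minimum of Ne uniform variates,
# whose exact mean is 1/(Ne + 1), approximately 1/Ne.
rng = np.random.default_rng(0)
ne, trials = 200, 20000
mins = rng.standard_normal((trials, ne)).min(axis=1)   # S_(1) per trial
cdf = np.vectorize(math.erf)(mins / math.sqrt(2.0)) * 0.5 + 0.5  # Phi
print(cdf.mean(), 1.0 / (ne + 1))
```

The two printed numbers agree to within Monte Carlo error, supporting the E{Φ[S(1)]} ≈ 1/Ne step used above.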

Equation (19) implies that the particle filter will suffer collapse asymptotically if Ne ≪ exp(τ²/2). More generally, Ne must increase exponentially with τ² in order to keep E[1/w(Ne)] fixed as τ increases. This exponential dependence of Ne on τ² is consistent with the simulation results of section 3, where τ² ∝ Ny.
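As a back-of-envelope illustration of this requirement (ours, not from the paper), one can invert (19): holding E[1/w(Ne)] − 1 at a target value r requires Ne ≈ exp(r²τ²/2). The constant 5/2 relating τ² to Ny below is the value appropriate to the identity example and is illustrative:

```python
import math

def required_ensemble_size(n_y, target=0.5):
    """Ne such that sqrt(2 log Ne)/tau equals `target`, taking tau^2 = 2.5 * Ny."""
    tau_sq = 2.5 * n_y
    return math.exp(0.5 * (target ** 2) * tau_sq)

# required Ne grows exponentially with the number of observations Ny
for n_y in (10, 30, 100):
    print(n_y, f"{required_ensemble_size(n_y):.3g}")
```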

In contrast to the most obvious intuition, the asymptotic behavior of w(Ne) given in (19) does not depend directly on the state dimension Nx. Instead, the situation is more subtle: τ², a measure of the variability of the observation priors, controls the maximum weight. The dimensionality of the state enters only implicitly, via the approximation that Si is asymptotically Gaussian, which requires that Nx be asymptotically large. One can then think of τ² as an equivalent state dimension, in the sense that τ² is the dimension of the identity-prior, identity-observation example (in section 3) that would have the same collapse properties.

5. The Gaussian–Gaussian case

The analysis in the previous section focused on situations in which the log likelihoods for the observations (considered as random functions of the prior) were mutually independent and identically distributed. In general, however, the observation likelihoods need not be i.i.d., since the state variables are correlated in the prior distribution and observations may depend on multiple state variables. In this section, we consider the case of a Gaussian prior, Gaussian observation errors, and linear 𝗛, where analytic progress is possible even for general prior covariances and general 𝗛.

Let the prior x ∼ N(0, 𝗣) and the observation error ϵ ∼ N(0, 𝗥). We may assume that both x and ϵ have mean zero since, if the observations depend linearly on the state, E(y) = 𝗛E(x) and p(y|x) is unchanged if y is replaced by y − E(y) and x by x − E(x).

For Gaussian observation errors ϵ, the transformation y′ = 𝗥^−1/2 y also leaves p(y|x) unchanged but results in cov(ϵ′) = cov(𝗥^−1/2 ϵ) = 𝗜. Further simplification comes from diagonalizing cov(𝗥^−1/2 𝗛x) via an additional orthogonal transformation in the observation space. Let y″ = 𝗤^T y′, where 𝗤 is the matrix of eigenvectors of cov(𝗥^−1/2 𝗛x) with corresponding eigenvalues λj², j = 1, . . . , Ny; then cov(𝗤^T 𝗥^−1/2 𝗛x) = diag(λ1², . . . , λNy²), while ϵ″ = 𝗤^T ϵ′ still has identity covariance and p(y|x) is again unchanged because 𝗤 is orthogonal. [Anderson (2001) presents a similar transformation that diagonalizes the problem in terms of the state variables, rather than the observation variables.] We therefore assume, without loss of generality, that
$$\mathsf{R} = \mathsf{I}, \qquad \operatorname{cov}(\mathsf{H}x) = \operatorname{diag}\bigl(\lambda_1^2, \ldots, \lambda_{N_y}^2\bigr), \qquad (20)$$
and drop primes in the sequel.
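The chain of transformations above can be verified in a few lines of linear algebra. The numpy sketch below (our illustration; 𝗣, 𝗛, and 𝗥 are randomly generated and the dimensions are arbitrary) applies y″ = 𝗤ᵀ𝗥^−1/2 y and checks that the transformed observation-space prior covariance is diagonal while the observation-error covariance stays the identity:

```python
import numpy as np

rng = np.random.default_rng(0)
nx, ny = 8, 5

# random symmetric positive-definite prior covariance P, observation operator H,
# and SPD observation-error covariance R
A = rng.standard_normal((nx, nx)); P = A @ A.T + nx * np.eye(nx)
H = rng.standard_normal((ny, nx))
B = rng.standard_normal((ny, ny)); R = B @ B.T + ny * np.eye(ny)

# R^(-1/2) via the eigendecomposition of R
rvals, rvecs = np.linalg.eigh(R)
R_inv_sqrt = rvecs @ np.diag(rvals ** -0.5) @ rvecs.T

# diagonalize cov(R^(-1/2) H x) = R^(-1/2) H P H^T R^(-1/2)
C = R_inv_sqrt @ H @ P @ H.T @ R_inv_sqrt
lam2, Q = np.linalg.eigh(C)              # eigenvalues lambda_j^2, orthogonal Q

obs_prior_cov = Q.T @ C @ Q              # should be diag(lambda_1^2, ..., lambda_ny^2)
err_cov = Q.T @ R_inv_sqrt @ R @ R_inv_sqrt.T @ Q   # should be the identity
print(np.round(obs_prior_cov, 8))
print(np.round(err_cov, 8))
```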

a. Analysis of the observation likelihood

With the assumptions in (20), the observation errors are independent, so p(y|x_i^f) can be written in terms of a sum over the log likelihoods Vij as in (6). In addition, the pdf for each component of the observations is Gaussian with unit variance and, given x_i^f, mean (𝗛x_i^f)_j. Thus,

$$\sum_{j=1}^{N_y} V_{ij} = \frac{1}{2}\sum_{j=1}^{N_y}\Bigl[\,y_j - \bigl(\mathsf{H}x_i^f\bigr)_j\Bigr]^2 + c.$$
The additive constant c results from the normalization of the Gaussian density and may be omitted without loss of generality, since it cancels in the calculation of the weights wi.
We wish to approximate the observation likelihood as in (8). This requires Σ_{j=1}^{Ny} Vij to be approximately Gaussian with mean μ and variance τ². Leaving aside for the moment the conditions under which the sum is approximately Gaussian, the mean and variance given y of Σ_{j=1}^{Ny} Vij can be calculated directly using (20) together with the properties of the standard normal distribution and the fact that the Vij are independent as j varies [as in (7b)]. This yields
$$\mu = \frac{1}{2}\sum_{j=1}^{N_y}\bigl(y_j^2 + \lambda_j^2\bigr) \qquad (21a)$$
and
$$\tau^2 = \sum_{j=1}^{N_y}\Bigl(\lambda_j^2 y_j^2 + \tfrac{1}{2}\lambda_j^4\Bigr). \qquad (21b)$$
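The moments (21a) and (21b) can be checked by Monte Carlo (our sketch; the λ and y values are illustrative): for a fixed realization y, draw (𝗛x_i^f)_j ∼ N(0, λj²), form Vij = ½[yj − (𝗛x_i^f)_j]², and compare the sample mean and variance of the sum with the formulas.

```python
import numpy as np

rng = np.random.default_rng(3)
lam = np.array([2.0, 1.5, 1.0, 0.5])     # lambda_j (illustrative)
y = np.array([1.0, -0.5, 2.0, 0.3])      # a fixed observation realization
n_samples = 200000

# V_ij = (y_j - lam_j z)^2 / 2 with z ~ N(0,1), summed over j
z = rng.standard_normal((n_samples, lam.size))
V_sum = 0.5 * ((y - lam * z) ** 2).sum(axis=1)

mu_theory = 0.5 * (y ** 2 + lam ** 2).sum()                 # (21a)
tau2_theory = (lam ** 2 * y ** 2 + 0.5 * lam ** 4).sum()    # (21b)
print(V_sum.mean(), mu_theory)
print(V_sum.var(), tau2_theory)
```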
Equations (21) still depend on the specific realization y of the observations. Proceeding rigorously would require taking the expectation of (19) over y. Here, we simply assume that the expectation may be approximated by replacing τ in (19) by its expectation over y. Using the fact that E(yj²) = λj² + 1, we have
$$E(\mu) = \frac{N_y}{2} + \sum_{j=1}^{N_y} \lambda_j^2 \qquad (22a)$$
and
$$E(\tau^2) = \sum_{j=1}^{N_y}\Bigl(\tfrac{3}{2}\lambda_j^4 + \lambda_j^2\Bigr). \qquad (22b)$$
As discussed in section 3 of Bickel et al. (2008), if λ1 ≥ λ2 ≥ . . . , the distribution of Si = (Σ_j Vij − μ)/τ converges to a standard Gaussian as Ny → ∞ if and only if
$$\frac{\lambda_1^2}{\Bigl(\sum_{j=1}^{N_y} \lambda_j^4\Bigr)^{1/2}} \to 0 \quad \text{as } N_y \to \infty. \qquad (23)$$
That is, Si converges to a Gaussian when no single eigenvalue or set of eigenvalues dominates the sum of squares: (23) implies that max_j(λj²)/Σ_j λj² → 0 as Ny → ∞. The condition (23) also means that τ² → ∞, which in turn leads to collapse if log Ne/τ² → 0.

On the other hand, in the case that (23) is not satisfied, the unscaled log likelihood converges to a quantity that does not have a Gaussian distribution. Collapse does not occur since the updated ensemble empirical distribution converges to the true posterior as Ne → ∞, whatever Ny may be.

b. Simulations

First, we check the asymptotic expression for E[1/w(Ne)] − 1 as a function of Ne and Ny, given in (19), for the Gaussian–Gaussian case. For simplicity, let λj = 1, j = 1, . . . , Ny (as in the example of section 2). Then (22b) implies that E(τ²) = 5Ny/2 and (19) becomes
$$E\bigl[1/w_{(N_e)}\bigr] - 1 \approx \sqrt{\frac{4 \log N_e}{5 N_y}}. \qquad (24)$$

This approximation is valid when Ne is large enough that the sample minimum follows (15) and Ny is large enough that log(Ne)/Ny is small. To capture the appropriate asymptotic regime, we have performed simulations with Ne = Ny^α, α = 0.75, 0.875, 1.0, 1.25, Ny varying over a dozen values between 600 and 3000, and E[1/w(Ne)] approximated by averaging over 1000 realizations of the experiment. As can be seen from Fig. 4, E[1/w(Ne)] − 1 has an approximately linear relation to √(log(Ne)/Ny), though considerable scatter is present. The best-fit line to the simulation results has a slope of 0.96 with a 95% confidence interval of ±0.087, which captures the predicted slope of √(4/5) ≈ 0.89.
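A scaled-down version of this experiment fits in a few lines (our sketch; the sizes here are far smaller than those behind Fig. 4, so only rough agreement with (24) should be expected):

```python
import numpy as np

def mean_inv_max_weight(n_y, n_e, reps=100, seed=11):
    """Average 1/w_(Ne) for the identity-prior, identity-observation example."""
    rng = np.random.default_rng(seed)
    out = 0.0
    for _ in range(reps):
        y = rng.standard_normal(n_y) + rng.standard_normal(n_y)  # truth + obs error
        x = rng.standard_normal((n_e, n_y))                      # prior ensemble
        logw = -0.5 * ((y - x) ** 2).sum(axis=1)
        w = np.exp(logw - logw.max())      # unnormalized weights; max is exactly 1
        out += w.sum()                     # = 1/w_max after normalization
    return out / reps

n_y, n_e = 256, 1024
est = mean_inv_max_weight(n_y, n_e) - 1.0
pred = np.sqrt(4 * np.log(n_e) / (5 * n_y))      # the prediction (24)
print(est, pred)
```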

Equation (19) also implies that asymptotic collapse of the particle filter depends only on τ rather than the specific sequence {λj, j = 1, . . . , Ny}. To illustrate that τ does control collapse, we consider various λ sequences by setting λj² = cj^−θ. In this case, the simulations fix Ny = 4 × 10³ and Ne = 10⁵ while θ takes the values 0.3, 0.5, and 0.7 and c is varied such that substituting (22b) in (19) gives 0.01 < E[1/w(Ne)] − 1 < 0.075. These values are again chosen to capture the appropriate asymptotic regime where the normalized log likelihood Si is approximately Gaussian. The expectation E[1/w(Ne)] is approximated by averaging over 400 realizations of the experiment.

Figure 5 shows results as a function of √(2 log Ne)/τ. As predicted by (19), E[1/w(Ne)] depends mainly on τ rather than on the specific λ sequence. The simulations thus confirm the validity of (19) and, in particular, the control of the maximum weight by τ. Nevertheless, some limited scatter around the theoretical prediction remains, which arises from weak dependence of E[1/w(Ne)] on the λ sequence for finite τ. We defer to a subsequent study a more detailed examination of the behavior of the maximum weight for finite τ and Ne and the limits of validity of (19).

6. Multivariate Cauchy observation-error distribution

Van Leeuwen (2003) proposes the use of a multivariate Cauchy distribution for the observation error to avoid collapse and gives some numerical results supporting his claim. In Bengtsson et al. (2008), analytical arguments as well as simulations indicate that collapse still occurs but more slowly with such an observation-error distribution. Specifically, they show that, in the limit log(Ne)/Ny → 0, E(T) approaches zero at a rate given by log(Ne)/Ny log|log(Ne)/Ny|. This condition emerges from the analysis of the log likelihood of the multivariate Cauchy in the same way as (24) emerges from analysis of the Gaussian–Gaussian case. The condition for E(T) → 0 and collapse is then identical to that in the Gaussian–Gaussian case, namely, log(Ne)/Ny → 0, but the rate is distinctly slower than those implied by (19) or (24).

Intuitively, what happens is that if ϵ has a multivariate Cauchy distribution, then ϵ can be written as

$$\epsilon = \frac{(z_1, \ldots, z_{N_y})^{\mathrm{T}}}{\bigl|z_{N_y+1}\bigr|},$$

where z1, . . . , zNy+1 are i.i.d. N(0, 1). For given zNy+1, the errors are Gaussian; realizations with zNy+1 close to 0 thus have very large error variance. This makes collapse harder because the true posterior resembles the prior, implying that the observations have relatively little information.
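Sampling from this representation takes a few lines (our sketch; names are ours): divide a standard Gaussian vector by an independent half-normal draw. Each component then has a standard Cauchy marginal, whose absolute value has median 1, and draws with small denominator produce enormous errors.

```python
import random

def multivariate_cauchy(n_y, rng):
    """One multivariate Cauchy draw: Gaussian vector over an independent |N(0,1)|."""
    denom = abs(rng.gauss(0.0, 1.0))
    return [rng.gauss(0.0, 1.0) / denom for _ in range(n_y)]

rng = random.Random(42)
samples = [multivariate_cauchy(4, rng)[0] for _ in range(20000)]
abs_sorted = sorted(abs(s) for s in samples)
median_abs = abs_sorted[len(abs_sorted) // 2]
print(median_abs)          # near 1 for a standard Cauchy marginal
print(abs_sorted[-1])      # the heavy tail produces occasional huge errors
```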

7. Conclusions

Particle filters have a well-known tendency for the particle weights to collapse, with one member receiving a posterior weight close to unity. We have illustrated this tendency through simulations of the particle-filter update for the simplest example, in which the priors for each of Nx state variables are i.i.d. and Gaussian, and the observations are of each state variable with independent, Gaussian errors. In this case, avoiding collapse and its detrimental effects can require very large ensemble sizes even for moderate Nx. The simulations indicate that the ensemble size Ne must increase exponentially with Nx in order for the posterior mean from the particle filter to have an expected error smaller than either the prior or the observations. For Nx = 100, the posterior mean will typically be worse than either the prior or the observations unless Ne > 10⁶.

Asymptotic analysis, following Bengtsson et al. (2008) and Bickel et al. (2008), provides precise conditions for collapse either in the case of i.i.d. observation likelihoods or when both the prior and the observation errors are Gaussian (but with general covariances) and the observation operator is linear. The asymptotic result holds when Ne is large, τ², the variance of the observation log likelihood defined in (7b), becomes large, and the normalized log likelihood has an approximately Gaussian distribution. Then, in the limit that τ^−1 √(2 log Ne) → 0, the maximum weight w(Ne) satisfies E[1/w(Ne)] ≈ 1 + τ^−1 √(2 log Ne). The maximum weight therefore approaches 1 (and collapse occurs) as τ increases unless the ensemble size Ne grows exponentially with τ².

In the case that both the prior and observation errors are Gaussian, τ² can be written as a sum over the eigenvalues of the observation-space-prior covariance matrix. The theory then predicts that collapse does not depend on the eigenstructure of the prior covariances, except as that influences τ. Simulations in section 5 confirm this result.

It is thus not the state dimension per se that matters for collapse, but rather τ, which depends on both the variability of the prior and the characteristics of the observations. Still, one may think of τ² as an effective dimension, as it gives the dimension of the identity-prior, identity-observation Gaussian system (as in section 3) that would have the same collapse properties. This analogy is only useful, however, when the normalized observation log likelihood Si defined in (7a) has an approximately Gaussian distribution, which requires that Nx not be too small.

Our results point to a fundamental obstacle to the application of particle filters in high-dimensional systems. The standard particle filter, which uses the prior as a proposal distribution together with some form of resampling, will clearly require exponentially increasing ensemble sizes as the state dimension increases and thus will be impractical for many geophysical applications. Nevertheless, some limitations of this study will need to be addressed before the potential of particle filtering in high dimensions is completely clear.

First, the simulations and asymptotic theory presented here have not dealt with the most general situation, namely, when the prior and observations are non-Gaussian and have nontrivial dependencies among their components. There is no obvious reason to expect that the general case should have less stringent requirements on Ne and we speculate that the Gaussian–Gaussian results of section 5 will still be informative even for non-Gaussian systems. Some support for this claim comes from the results of Nakano et al. (2007), who apply the particle filter with a variety of ensemble sizes to the fully nonlinear, 40-variable model of Lorenz (1996). Consistent with Fig. 2, they find that an ensemble size between 500 and 1000 is necessary for the posterior from the particle filter to have smaller rms errors than the observations themselves.

Second, the asymptotic theory pertains to the behavior of the maximum weight, but says nothing about how the tendency for collapse might degrade the quality of the particle-filter update. Indeed, the update may be poor long before the maximum weight approaches unity, as illustrated by Figs. 2 and 3. What is needed is practical guidance on ensemble size for a given problem with finite Nx, Ny, and τ. Though rigorous asymptotic analysis will be difficult, we anticipate that simulations may provide useful empirical rules to guide the choice of ensemble size.

Third, we have not addressed the possible effects of sequentially cycling the particle filter given observations at multiple instants in time. Overall, cycling must increase the tendency for collapse of the particle filter. The quantitative effect, however, will depend on the resampling strategy, which again makes analytic progress unlikely.

Fourth, we have not considered proposal distributions other than the prior nor have we considered resampling algorithms, which are frequently employed to counteract the particle filter’s tendency for collapse of the ensemble. We emphasize that resampling strategies that do not alter the update step are unlikely to overcome the need for very large Ne, since they do not improve the estimate of the posterior distribution, but merely avoid carrying members with very small weights further in the algorithm. It is conceivable that the required Ne might be reduced by splitting a large set of observations valid at a single time into several batches, and then assimilating the batches serially with resampling after each update step. Alternatively, one might identify states in the past that will evolve under the system dynamics to become consistent with present observations, thereby reducing the need for large ensembles of present states when updating given present observations. Gordon et al. (1993) term this process “editing,” and a similar idea is employed by Pitt and Shephard (1999). Such a scheme, however, would likely demand very large ensembles of past states.

As noted in the introduction, both van Leeuwen (2003) and Zhou et al. (2006) have applied particle filters to systems of dimension significantly larger than 100. In Zhou et al., however, each update is based on only a single observation (and only 28 observations total are assimilated); assuming that the prior uncertainty is comparable to the observation variance, τ² < 28 in their case and their ensemble sizes of O(1000) would be adequate based on Fig. 3. Based on the characteristics of the sea surface height observations assimilated by van Leeuwen, we estimate that the particle-filter update uses O(100) observations at each (daily) analysis. Allowing for the possibility that nearby observations are significantly correlated owing to the relatively large scales emphasized by sea surface height, then van Leeuwen’s use of 500–1000 ensemble members would seem to be at the edge of where our results would indicate collapse to occur. Consistent with this, van Leeuwen notes a strong tendency for collapse.

Fundamentally, the particle filter suffers collapse in high-dimensional problems because the prior and posterior distributions are nearly mutually singular, so that any sample from the prior distribution has exceptionally small probability under the posterior distribution. For example, in the Gaussian i.i.d. case, the prior and posterior distributions have almost all their mass confined to the neighborhood of hyperspheres with different radii and different centers. The mutual singularity of different pdfs becomes generic in high dimensions and is one manifestation of the curse of dimensionality.

Another way of looking at the cause of collapse is that the weights of different members for any chosen state variable are influenced by all observations, even if those observations are nearly independent of the particular state variable. The particle filter thus inherently overestimates the information available in the observations and underestimates the uncertainty of the posterior distribution. Similar problems occur for the EnKF and, for spatially distributed systems with finite correlation lengths (e.g., most geophysical systems), can be reduced by explicitly restricting any observation’s influence to some spatially local neighborhood. This motivates the development of nonlinear, non-Gaussian ensemble assimilation schemes that perform spatially local updates, as in Bengtsson et al. (2003) or Harlim and Hunt (2007).

Acknowledgments

It was T. Hamill who first introduced the lead author to the potential problems with the particle-filter update in high dimensions. This work was supported in part by NSF Grant 0205655.

REFERENCES

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Bender, M., and S. Orszag, 1978: Advanced Mathematical Methods for Scientists and Engineers. McGraw-Hill, 593 pp.

  • Bengtsson, T., C. Snyder, and D. Nychka, 2003: Toward a nonlinear ensemble filter for high-dimensional systems. J. Geophys. Res., 108, D24, 8775–8785.

  • Bengtsson, T., P. Bickel, and B. Li, 2008: Curse of dimensionality revisited: Collapse of the particle filter in very large scale systems. Probability and Statistics: Essays in Honor of David A. Freedman, D. Nolan and T. Speed, Eds., Vol. 2, Institute of Mathematical Statistics, 316–334, doi:10.1214/193940307000000518. [Available online at http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.imsc/1207580091.]

  • Bickel, P., and E. Levina, 2008: Regularized estimation of large covariance matrices. Ann. Stat., 36, 199–227.

  • Bickel, P., B. Li, and T. Bengtsson, 2008: Sharp failure rates for the bootstrap particle filter in high dimensions. Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, Vol. 3, B. Clarke and S. Ghosal, Eds., Institute of Mathematical Statistics, 318–329, doi:10.1214/074921708000000228.

  • Chin, T. M., M. J. Turmon, J. B. Jewell, and M. Ghil, 2007: An ensemble-based smoother with retrospectively updated weights for highly nonlinear systems. Mon. Wea. Rev., 135, 186–202.

  • David, H. A., and H. N. Nagaraja, 2003: Order Statistics. 3rd ed. John Wiley and Sons, 458 pp.

  • Doucet, A., N. de Freitas, and N. Gordon, 2001: An introduction to sequential Monte Carlo methods. Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds., Springer-Verlag, 2–14.

  • Durrett, R., 2005: Probability: Theory and Examples. 3rd ed. Duxbury Press, 512 pp.

  • Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. J. Multivar. Anal., 98, 227–255.

  • Gordon, N. J., D. J. Salmond, and A. F. M. Smith, 1993: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc.-F, 140, 107–113.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Harlim, J., and B. R. Hunt, 2007: A non-Gaussian ensemble filter for assimilating infrequent noisy observations. Tellus, 59A, 225–237.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential, and variational. J. Meteor. Soc. Japan, 75 (Special Issue), 181–189.

  • Keppenne, C. L., M. M. Rienecker, N. P. Kurkowski, and D. A. Adamec, 2005: Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction. Nonlinear Processes Geophys., 12, 491–503.

  • Kim, S., G. L. Eyink, J. M. Restrepo, F. J. Alexander, and G. Johnson, 2003: Ensemble filtering for nonlinear dynamics. Mon. Wea. Rev., 131, 2586–2594.

  • Liu, J. S., 2001: Monte Carlo Strategies in Scientific Computing. Springer-Verlag, 364 pp.

  • Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–148.

  • Lorenz, E. N., 1996: Predictability: A problem partly solved. Proc. Seminar on Predictability, Vol. 1, Reading, Berkshire, United Kingdom, ECMWF, 1–18.

  • Moradkhani, H., K-L. Hsu, H. Gupta, and S. Sorooshian, 2005: Uncertainty assessment of hydrologic model states and parameters: Sequential data assimilation using the particle filter. Water Resour. Res., 41, W05012, doi:10.1029/2004WR003604.

  • Nakano, S., G. Ueno, and T. Higuchi, 2007: Merging particle filter for sequential data assimilation. Nonlinear Processes Geophys., 14, 395–408.

  • Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207.

  • Pitt, M. K., and N. Shephard, 1999: Filtering via simulation: Auxiliary particle filters. J. Amer. Stat. Assoc., 94, 590–599.

  • Reichle, R. H., D. B. McLaughlin, and D. Entekhabi, 2002: Hydrologic data assimilation with the ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.

  • Silverman, B. W., 1986: Density Estimation for Statistics and Data Analysis. Chapman and Hall, 175 pp.

  • Smith, K. W., 2007: Cluster ensemble Kalman filter. Tellus, 59A, 749–757.

  • Snyder, C., and F. Zhang, 2003: Assimilation of simulated Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 131, 1663–1677.

  • van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

  • Whitaker, J. S., G. P. Compo, X. Wei, and T. M. Hamill, 2004: Reanalysis without radiosondes using ensemble data assimilation. Mon. Wea. Rev., 132, 1190–1200.

  • Xiong, X., I. M. Navon, and B. Uzunoglu, 2006: A note on the particle filter with posterior Gaussian resampling. Tellus, 58A, 456–460.

  • Zhou, Y., D. McLaughlin, and D. Entekhabi, 2006: Assessing the performance of the ensemble Kalman filter for land surface data assimilation. Mon. Wea. Rev., 134, 2128–2142.

Fig. 1.

Histograms of max wi for Nx = 10, 30, and 100 from the particle-filter simulations described in the text: Ne = 10³, x ∼ N(0, 𝗜), Ny = Nx, 𝗛 = 𝗜, and ϵ ∼ N(0, 𝗜).

Citation: Monthly Weather Review 136, 12; 10.1175/2008MWR2529.1

Fig. 2.

The ensemble size Ne as a function of Nx (or Ny) required if the posterior mean estimated by the particle filter is to have average squared error less than the prior or observations, in the simple example considered in the text. Asterisks show the simulation results, averaged over 400 realizations. The best-fit line is given by log10Ne = 0.05Nx + 0.78.


Fig. 3.

The ensemble size, shown as log10 Ne, required as a function of Nx (or Ny) such that max wi averaged over 400 realizations is less than 0.6 (plus signs), 0.7 (circles), and 0.8 (asterisks) in the simple example considered in the text.


Fig. 4.

Evaluation of (19) against simulations in the case λj² = 1, j = 1, . . . , Ny. For each of 60 (Ny, Ne) pairs as detailed in the text, E[1/w(Ne)] was estimated from an average of 1000 realizations of the particle-filter update. The best-fit line to the data, given by E[1/w(Ne)] − 1 = −0.006 + 0.964√(log(Ne)/Ny), is indicated by the solid line, while the prediction in (24) is shown by a dashed line.


Fig. 5.

Evaluation of (19) against simulations in the case λj² = cj^−θ, j = 1, . . . , Ny. The parameters θ and c are varied as described in the text, while Ny = 4 × 10³ and Ne = 10⁵ are fixed. The expectation E[1/w(Ne)] was estimated by averaging over 400 realizations of the particle-filter update. The best-fit line to the data, given by E[1/w(Ne)] − 1 = 0.006 + 1.008√(2 log Ne)/τ, is indicated by the solid line, while the prediction in (19) is shown by a dashed line.


1

This obstacle is equally relevant to a related class of “mixture” filters in which the prior ensemble serves as the centers for a kernel density estimate of the prior (Anderson and Anderson 1999; Bengtsson et al. 2003; Smith 2007). These filters also involve the calculation of the weight of each center given observations, and thus are subject to similar difficulties.

2

In other words, S(1) is the minimum of the sample, S(2) is the next smallest element, and so on until the maximum, S(Ne).

3

If a random variable X depends on a parameter a, we write X(a) = op(1) as (say) a → ∞ if Pr(|X| ≥ δ) → 0 for all δ > 0.

4

To derive (17), we have included a factor of |S(1)|^−1 √(2 log Ne) = 1 + op(1/log Ne) on the rhs in order to simplify the manipulations that follow.
