• Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

• Blumberg, A. F., and G. L. Mellor, 1987: A description of a three-dimensional coastal ocean circulation model. Three-Dimensional Coastal Ocean Circulation Models, N. S. Heaps, Ed., Coastal Estuarine Science Series, Vol. 4, Amer. Geophys. Union, 1–16.

• Cane, M. A., A. Kaplan, R. N. Miller, B. Tang, E. C. Hackert, and A. J. Busalacchi, 1996: Mapping tropical Pacific sea level: Data assimilation via a reduced state Kalman filter. J. Geophys. Res., 101, 22599–22617.

• Chen, R., and J. Liu, 2000: Mixture Kalman filters. J. Roy. Stat. Soc. B, 62, 493–508.

• Cohn, S. E., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc. Japan, 74, 63–75.

• Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 471 pp.

• Doucet, A., N. de Freitas, and N. Gordon, 2001: Sequential Monte Carlo Methods in Practice. Springer-Verlag, 581 pp.

• Evensen, G., 1992: Using the extended Kalman filter with a multilayer quasi-geostrophic ocean model. J. Geophys. Res., 97, 17905–17924.

• Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10143–10162.

• Farrell, B. F., and P. J. Ioannou, 2001a: State estimation using a reduced-order Kalman filter. J. Atmos. Sci., 58, 3666–3680.

• Farrell, B. F., and P. J. Ioannou, 2001b: Accurate low-dimensional approximation of the linear dynamics of fluid flow. J. Atmos. Sci., 58, 2771–2789.

• Flather, R. A., 1976: A tidal model of the northwest European continental shelf. Mem. Soc. Roy. Sci. Liege, 10, 141–164.

• Fukumori, I., and P. Malanotte-Rizzoli, 1995: An approximate Kalman filter for ocean data assimilation: An example with an idealized Gulf Stream model. J. Geophys. Res., 100, 6777–6794.

• Gauthier, P., P. Courtier, and P. Moll, 1993: Assimilation of simulated wind lidar data with a Kalman filter. Mon. Wea. Rev., 121, 1803–1820.

• Hamill, T. M., C. Snyder, and R. E. Morss, 2002: Analysis-error statistics of a quasigeostrophic model using three-dimensional variational assimilation. Mon. Wea. Rev., 130, 2777–2790.

• Heemink, A. W., M. Verlaan, and A. J. Segers, 2001: Variance reduced ensemble Kalman filtering. Mon. Wea. Rev., 129, 1718–1728.

• Hoteit, I., D.-T. Pham, and J. Blum, 2002: A simplified reduced-order Kalman filtering and application to altimetric data assimilation in the tropical Pacific. J. Mar. Syst., 36, 101–127.

• Hoteit, I., D.-T. Pham, and J. Blum, 2003: A semi-evolutive filter with partially local correction basis for data assimilation in oceanography. Oceanol. Acta, 26, 511–524.

• Hoteit, I., G. Korres, and G. Triantafyllou, 2005: Comparison of extended and ensemble based Kalman filters with low and high resolution primitive equation ocean models. Nonlinear Processes Geophys., 12, 755–765.

• Hoteit, I., G. Triantafyllou, and G. Korres, 2007: Using low-rank ensemble Kalman filters for data assimilation with high dimensional imperfect models. J. Numer. Anal. Ind. Appl. Math., 2, 67–78.

• Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

• Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 376 pp.

• Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. Trans. ASME, J. Basic Eng., 82D, 35–45.

• Kaplan, J. L., and J. A. Yorke, 1979: Preturbulence: A regime observed in a fluid flow model of Lorenz. Commun. Math. Phys., 67, 93–108.

• Kivman, G., 2003: Sequential parameter estimation for stochastic systems. Nonlinear Processes Geophys., 10, 253–259.

• Korres, G., and A. Lascaratos, 2003: An eddy resolving model of the Aegean and Levantine basins for the Mediterranean Forecasting System Pilot Project (MFSPP): Implementation and climatological runs. Ann. Geophys., 21, 205–220.

• Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Part I: Theory and schemes. Mon. Wea. Rev., 127, 1385–1407.

• Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141.

• Osborne, A. R., and A. Pastorello, 1993: Simultaneous occurrence of low-dimensional chaos and colored random noise in nonlinear physical systems. Phys. Lett. A, 181, 159–171.

• Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207.

• Pham, D. T., J. Verron, and M. C. Roubaud, 1997: Singular evolutive Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

• Preisendorfer, R. W., 1988: Principal Component Analysis in Meteorology and Oceanography. Developments in Atmospheric Science Series, Vol. 17, Elsevier, 425 pp.

• Silverman, B. W., 1986: Density Estimation for Statistics and Data Analysis. Chapman and Hall, 175 pp.

• Smagorinsky, J., 1963: General circulation experiments with the primitive equations. I: The basic experiment. Mon. Wea. Rev., 91, 99–164.

• Todling, R., 1999: Estimation theory and foundations of atmospheric data assimilation. Data Assimilation Office, Goddard Space Flight Center, DAO Office Note 1999-01, 187 pp.

• Van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

• West, B. J., and H. J. Mackey, 1991: Geophysical attractors may be only colored noise. J. Appl. Phys., 69, 6747–6749.

Figures:

• (top) Rms error of the LRKPK filter with the Lorenz model. (bottom) Reference (black curve) and filter (plus signs) solutions for the z component (note that this variable was not assimilated).

• Topography of the ¼° Mediterranean model.

• Time evolution of the basin average kinetic energy (10−3 m2 s−2) from 1979 to 1987.

• Time evolution of the RRMS as it results from the LRKPK filter using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

• Spatial distribution of the SSH rms misfit as it results from (a) the free run and (b) the LRKPK filter’s run using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

• Spatial distribution of the rms misfit for a zonal temperature transect along 33.75°N as it results from (a) the free run and (b) the LRKPK filter’s run using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

• Time evolution of the RRMS as it results from the LRKPK filter using 30, 50, and 100 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

• Time evolution of the RRMS as it results from the LRKPK filter using 30 particles with resampling every 5 filtering cycles and three different bandwidth parameters: 0.2, 0.4, and 0.8.

A New Approximate Solution of the Optimal Nonlinear Filter for Data Assimilation in Meteorology and Oceanography

  • 1 Scripps Institution of Oceanography, University of California at San Diego, La Jolla, California
  • 2 Laboratoire de Modélisation et Calcul, Centre National de la Recherche Scientifique, Grenoble, France
  • 3 Hellenic Center for Marine Research, Anavissos, Greece

Abstract

This paper introduces a new approximate solution of the optimal nonlinear filter suitable for nonlinear oceanic and atmospheric data assimilation problems. The method is based on a local linearization in a low-rank kernel representation of the state’s probability density function. In the resulting low-rank kernel particle Kalman (LRKPK) filter, the standard (weight type) particle filter correction is complemented by a Kalman-type correction for each particle using the covariance matrix of the kernel mixture. The LRKPK filter’s solution is then obtained as the weighted average of several low-rank square root Kalman filters operating in parallel. The Kalman-type correction reduces the risk of ensemble degeneracy, which enables the filter to efficiently operate with fewer particles than the particle filter. Combined with the low-rank approximation, it allows the implementation of the LRKPK filter with high-dimensional oceanic and atmospheric systems. The new filter is described and its relevance demonstrated through applications with the simple Lorenz model and a realistic configuration of the Princeton Ocean Model (POM) in the Mediterranean Sea.

Corresponding author address: I. Hoteit, Scripps Institution of Oceanography, 9500 Gilman Dr., Mail Code 0230, La Jolla, CA 92093. Email: ihoteit@ucsd.edu


1. Introduction

The Kalman filter provides the optimal (minimum variance) solution of the linear-Gaussian sequential data assimilation problem (Kalman 1960). Since most dynamical and/or observational systems encountered in practice are nonlinear, the system equations are often linearized about the most recent estimate, leading to the popular, but no longer optimal, extended Kalman (EK) filter. Several studies have demonstrated, however, that the linearization of the system may produce instabilities, even divergence, when applied to strongly nonlinear systems (Gauthier et al. 1993; Evensen 1992). For the latter case, an optimal solution can be obtained from the optimal nonlinear filter, which involves the estimation of the conditional probability density function (PDF), not necessarily Gaussian, of the system state given all available measurements up to the estimation time (Doucet et al. 2001). Knowledge of the state’s PDF allows the determination of estimates of the system state, such as the minimum-variance estimate or the maximum a posteriori estimate, following the Bayesian estimation theory (Todling 1999). Similar to the Kalman filter, the nonlinear filter operates as a succession of a correction (or analysis) step at measurement times to correct the predictive density using Bayes’ rule, and a prediction step to propagate the analysis density to the time of the next available observation.

The particle filter is a discrete approximation of the optimal nonlinear filter and is based on point mass representation (mixture of Dirac distributions), called particles, of the state’s PDF (Doucet et al. 2001). In this filter, the particles evolve in time with the numerical model and their assigned weights are updated each time new measurements are available. The filter solution is then the weighted average of the particle ensemble. In practice, this filter suffers from a major problem known as the degeneracy phenomenon; after only a few iterations, weights become concentrated on very few particles and hence only a tiny fraction of the ensemble contributes to the average, very often causing the divergence of the filter. The use of more particles alleviates this problem over short time periods only, so the most efficient way to get around it is resampling (Doucet et al. 2001). This technique consists of drawing new particles according to the distribution of the ensemble and then reassigning them the same weights. However, resampling often introduces Monte Carlo fluctuations, which degrade the filter’s performance. Additionally, even with resampling, a large number of particles are required for an effective behavior of the filter. This makes brute-force implementation of the particle filter problematic with computationally expensive atmospheric and oceanic models. Interesting discussions on the use of the optimal nonlinear filter for high-dimensional oceanic and atmospheric data assimilation problems can be found in Anderson and Anderson (1999), Kivman (2003), and Van Leeuwen (2003).
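To make the degeneracy and resampling mechanics concrete, here is a minimal bootstrap particle filter sketch; the Gaussian noise with diagonal covariances, the placeholder model and observation operators, and the 0.5N effective-sample-size threshold are illustrative assumptions, not choices made in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, y, model, obs_op, q_std, r_std,
                         resample_threshold=0.5):
    """One predict/correct cycle of a bootstrap particle filter (sketch).

    particles: (N, n) array; weights: (N,) array summing to one.
    q_std, r_std: scalar model/observation noise std (diagonal covariances assumed).
    """
    N = len(weights)
    # Prediction: propagate each particle through the stochastic model.
    particles = np.array([model(x) + q_std * rng.standard_normal(x.shape)
                          for x in particles])
    # Correction: reweight by the Gaussian observation likelihood.
    innov = np.array([y - obs_op(x) for x in particles])
    log_lik = -0.5 * np.sum(innov ** 2, axis=1) / r_std ** 2
    weights = weights * np.exp(log_lik - log_lik.max())
    weights = weights / weights.sum()
    # Degeneracy check: resample when the effective sample size collapses.
    ess = 1.0 / np.sum(weights ** 2)
    if ess < resample_threshold * N:
        idx = rng.choice(N, size=N, p=weights)
        particles, weights = particles[idx], np.full(N, 1.0 / N)
    return particles, weights
```

After resampling all particles carry equal weights again; the Monte Carlo fluctuations mentioned above come precisely from the random draw in `rng.choice`.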

The popular ensemble Kalman (EnK) filter, introduced by Evensen (1994), is also a particle-based filtering technique. However, while it has the same prediction step as the particle filter, it does not have the same correction step. The EnK filter retains the “linearity aspect” of the Kalman filter in the analysis step, in that it applies the Kalman correction using a forecast error covariance computed as the sample covariance of the particle ensemble. Its correction step therefore uses only the first two moments of the particle ensemble, and is thus suboptimal for non-Gaussian systems. In practical situations, however, the EnK filter was found to be more robust than the particle filter when small-size ensembles were considered because the Kalman update of its particles significantly reduces the risk of ensemble degeneracy by pulling the particles toward the true state of the system (Kivman 2003; Van Leeuwen 2003).

In this paper, we introduce a new approximate solution of the optimal nonlinear filter suitable for applications in oceanography and meteorology. The filter makes use of a mixture of Gaussian distributions in a kernel representation to approximate the state PDFs in the optimal nonlinear filter. A Gaussian mixture has already been used by Anderson and Anderson (1999) and Chen and Liu (2000) in the context of the nonlinear filter. It is expected to provide a more reliable representation of the state PDFs than the simple mixture of Dirac functions used in the particle filter. Here, we further assume that the covariance matrix of the Gaussian mixtures has low rank to avoid manipulating huge matrices associated with the large dimension of the oceanic and atmospheric systems. This is a very common approach in the atmospheric and oceanic Kalman filtering community (e.g., Fukumori and Malanotte-Rizzoli 1995; Cane et al. 1996; Cohn and Todling 1996; Pham et al. 1997; Heemink et al. 2001; Lermusiaux and Robinson 1999; Pham 2001; Farrell and Ioannou 2001a; Hoteit et al. 2002, 2003, 2005), and implicitly assumes that state estimation errors can be accurately modeled in a severely reduced dimensional subspace (Lermusiaux and Robinson 1999; Hamill et al. 2002). It enforces a smooth analysis since the filter correction is only applied using the leading modes of the analysis covariance matrix (Cane et al. 1996; Lermusiaux and Robinson 1999). Smoothness in the analysis is produced by assuming relatively large spatial scales for the uncertainty in the starting conditions for each forecast step, which leads to a concentration of variance in only a few modes of the covariance matrix (Cane et al. 1996; Cohn and Todling 1996; Lermusiaux and Robinson 1999). 
In addition, the dissipative and driven nature of geophysical fluid systems also concentrates the energy at large scales, meaning a red spectrum of variability (Daley 1991), or, for others, suggests the existence of a low-dimensional attractor (Pham et al. 1997; Lermusiaux and Robinson 1999). In practice, a red spectrum is often indistinguishable from a low-dimensional attractor, as both can be efficiently described by a limited number of functions or modes (West and Mackey 1991; Osborne and Pastorello 1993). For simplicity, we refer to this as a system with a limited number of “effective” degrees of freedom (EDOF). Additionally, as the covariance matrices of the Gaussian mixtures are kept small during the filter operations, a local linearization about the center of the mixture components (the particles) is applied. This leads to a Kalman-type correction for each particle complementing the usual particle-type correction. The resulting filter, called the low-rank kernel particle Kalman (LRKPK) filter, basically runs an ensemble of low-rank Kalman filters and then provides the optimal (minimum variance) analysis state as the weighted mean of all the subfilters’ analyses. As in the EnK filter, the Kalman-type correction attenuates the degeneracy problem, which allows the filter to efficiently operate with relatively small-size ensembles (roughly of the same order as the EnK filter). A similar approach was implemented by Houtekamer and Mitchell (1998), who used a pair of ensemble Kalman filters to deal with the problem of inbreeding, generally associated with the use of small-size ensembles. The LRKPK filter is first tested with the simple but chaotic and highly nonlinear Lorenz model (Lorenz 1963). This model is a simplified form of the complicated system describing the dynamics of fluid motion and heat flow in terms of three ordinary differential equations. 
Assimilation results from a realistic application with a general circulation Princeton Ocean Model (POM) of the Mediterranean Sea are then reported and discussed.
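For reference, the Lorenz (1963) system can be integrated with a classical fourth-order Runge–Kutta scheme. The parameter values below are the standard chaotic ones, while the time step and initial state are illustrative assumptions, not necessarily those used in the experiments.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three Lorenz (1963) ODEs."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Integrate a short trajectory on the attractor (illustrative settings).
traj = [np.array([1.0, 1.0, 1.0])]
for _ in range(1000):
    traj.append(rk4_step(lorenz63, traj[-1], 0.01))
traj = np.array(traj)
```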

The paper is organized as follows. The characteristics of the optimal nonlinear filter are briefly recalled in section 2. The LRKPK filter is then introduced in section 3 and its algorithm is summarized in section 4. The design and preliminary assimilation results of numerical experiments are then presented in sections 5 and 6. A general discussion concludes the paper in section 7.

2. The optimal nonlinear filter

Consider the nonlinear stochastic discrete-time dynamical system:
    xk = 𝗠k(xk−1) + ηk,   (1)
    yk = 𝗛k(xk) + εk,   (2)
where xk is the state vector (to be estimated) of dimension n; yk is the observation vector of dimension p; 𝗠k and 𝗛k are two continuously differentiable maps from ℝn to ℝn and from ℝn to ℝp, respectively representing the transition and the observational operators; and ηk and εk denote the dynamical and the observational noise. We assume that the random vectors ηk and εk are independent and independent of xk−1, and are Gaussian with mean zero and covariance matrices 𝗤k and 𝗥k, respectively. Such a system is somewhat less general than the one considered in particle filtering but is still quite general and covers most practical situations.

Starting from a random initial condition x0 with a known probability density function, the optimal nonlinear filter provides the conditional density function of the system state xk at each time tk given all available measurements up to tk. To simplify the notation, we shall write y1:k as a shorthand for y1, . . . , yk. Let pk|k−1(·|y1:k−1) be the conditional (predictive) density function of xk given y1:k−1, and let pk(·|y1:k) be the conditional (analysis) density function of xk given y1:k. The nonlinear filtering algorithm consists of two steps, which we summarize below. The reader is referred to Doucet et al. (2001) for an extensive description of the filter.

  • The prediction step. Suppose that the required analysis density pk−1(·|y1:k−1) at time tk−1 is available. The prediction step involves using the model Eq. (1) to obtain the predictive density pk|k−1(·|y1:k−1) at the time of the next available observation tk via the Chapman–Kolmogorov equation:
    pk|k−1(x|y1:k−1) = ∫ℝn p(x|xk−1 = u) pk−1(u|y1:k−1) du,   (3)
    where p(x|xk−1 = u) is the conditional density of the state vector xk to be at x at time tk given that it was at u at time tk−1. Under the assumptions made on the model noise, p(x|xk−1 = u) = ϕ[x − 𝗠k(u); 𝗤k], where
    ϕ(x; Σ) = (2π)−n/2 (det Σ)−1/2 exp(−xTΣ−1x/2)   (4)
    denotes the Gaussian density of zero mean and covariance matrix Σ. Thus,
    pk|k−1(x|y1:k−1) = ∫ℝn ϕ[x − 𝗠k(u); 𝗤k] pk−1(u|y1:k−1) du.   (5)
  • The correction step. After a new observation yk has been made, we recover the analysis density pk(·|y1:k) at time tk using Bayes’ rule,
    pk(x|y1:k) = b−1k pk|k−1(x|y1:k−1) ϕ[yk − 𝗛k(x); 𝗥k],   (6)
    The analysis density is therefore obtained by multiplying the prior predictive density by the observation likelihood and normalizing by bk = ∫ℝn pk|k−1(u|y1:k−1) ϕ[yk − 𝗛k(u); 𝗥k] du to ensure a probability density.

While the expressions of the state PDFs can be easily obtained, determining the value of the predictive density at each point in state space is practically impossible for large dimensional systems, as in meteorology and oceanography. This actually requires the evaluation of 𝗠k(x) for a large set of values of x, given that one single evaluation can be quite costly in realistic applications. The particle filters approximate the state PDFs by convex mixtures of Dirac functions. In the next section, we will resort to the kernel method to approximate the state PDFs by mixtures of Gaussian distributions.

3. The low-rank kernel particle Kalman filter

Given N independent samples (observations) x1, . . . , xN from a (multivariate) density p, an estimator of p can be obtained by the kernel method as a mixture of N Gaussian densities (Silverman 1986),
    p̂(x) = (1/N) ΣNi=1 ϕ(x − xi; 𝗣),   (7)
where 𝗣 is a positive definite matrix. In practice, 𝗣 is very often taken to be h2 times the sample covariance matrix of the observations, and h is a bandwidth parameter to be chosen. Inspired by this estimator, we propose to approximate the state PDFs in the optimal nonlinear filter by mixtures of N Gaussian densities of the form
    pt(x) = ΣNi=1 wit ϕ(x − xit; 𝗣t),   (8)
where the subscript t replaces k at the analysis time and k + 1|k at the prediction time, xit are vectors in ℝn called particles, wit are probabilities (also called weights), and 𝗣t is a positive definite matrix. Similar estimators have been already considered by Anderson and Anderson (1999) and Chen and Liu (2000). Here, we further assume that the matrices 𝗣t are small, in some sense, and of low rank N − 1 (the rank of a sample covariance matrix of an ensemble of N members). The first condition will be used to locally linearize the model around the particles ensemble. The second condition is needed for applications with high-dimensional systems, as in atmospheric and oceanic data assimilation problems where n is of the order of 108, because the manipulation of the associated full size covariance matrices is not possible in practice. This actually allows the decomposition
    𝗣t = 𝗟t𝗨t𝗟Tt,   (9)
where 𝗟t and 𝗨t are n × (N − 1) and (N − 1) × (N − 1) matrices, respectively, which avoids the manipulation of the huge matrices 𝗣t by only including the matrices 𝗟t and 𝗨t in the filter’s algorithm. As discussed by Pham (2001), the approximated rank (the number of particles minus one) should be larger than the number of EDOF for an acceptable filter behavior to be expected. A good behavior is achieved when the filter brings the estimation error down to an acceptable level of observational and representational error (Cane et al. 1996; Lermusiaux and Robinson 1999). An estimate of the EDOF can be made from the number of positive Lyapunov exponents (Kaplan and Yorke 1979), or can be bounded above by the empirical orthogonal function (EOF) spectrum, which is the number of EOFs needed to account for most of the system variance (Farrell and Ioannou 2001b). For simplicity and convenience, we only compute the latter in our numerical applications, which is sufficient for setting an appropriate rank for the mixture covariance matrices.
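The Gaussian-kernel mixture estimator above can be sketched directly. For clarity this toy version carries the full matrix 𝗣 = h² times the sample covariance, whereas the filter itself only ever stores the low-rank factors 𝗟t and 𝗨t.

```python
import numpy as np

def kernel_density(x, samples, h=0.4):
    """Evaluate the kernel estimate of a density at point x:
    an equal-weight mixture of Gaussians centered on the samples,
    each with covariance P = h**2 * (sample covariance), full rank here."""
    N, d = samples.shape
    P = (h ** 2) * np.cov(samples, rowvar=False).reshape(d, d)
    Pinv = np.linalg.inv(P)
    norm = 1.0 / ((2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(P)))
    diffs = x - samples                              # (N, d) deviations
    quad = np.einsum('ij,jk,ik->i', diffs, Pinv, diffs)
    return norm * np.mean(np.exp(-0.5 * quad))
```

As h shrinks toward zero the estimate degenerates toward the mixture of Dirac functions used by the particle filter, which is one way to read the role of the bandwidth parameter.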
Starting from an initial density of the form of Eqs. (8)–(9), we show below that the predictive and the analysis densities of the optimal nonlinear filter can always be approximated by mixtures of Gaussian densities of the same form. Moreover, 𝗟t will be determined as a function of the xit following
    𝗟t = 𝗫t𝗧,   (10)
where 𝗫t = [x1t · · · xNt], and 𝗧 is an N × (N − 1) full-rank matrix with zero column sums. A convenient choice of such a matrix is
    𝗧 = [𝗜N−1; 𝟬T] − (1/N) 𝟭N𝟭TN−1,   (11)
    that is, the (N − 1) × (N − 1) identity matrix stacked over a row of zeros, with 1/N subtracted from every entry (so that each column sums to zero).
Postmultiplication with 𝗧 implicitly subtracts the {x1t · · · xNt}-ensemble mean. An efficient algorithm to recursively update the parameters, wit, xit, and 𝗨t, of both mixtures is also provided. The algorithm consists of a correction step and a prediction step and is complemented by a resampling step, whenever needed.
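A concrete construction of the matrix 𝗧 (this particular stacked-identity form is one possible choice consistent with the zero-column-sum, full-rank requirement), together with a check that postmultiplying by it subtracts the ensemble mean:

```python
import numpy as np

def make_T(N):
    """An N x (N-1) full-rank matrix with zero column sums (one possible choice):
    the (N-1)-identity stacked over a zero row, minus 1/N in every entry."""
    T = np.vstack([np.eye(N - 1), np.zeros((1, N - 1))])
    return T - np.ones((N, N - 1)) / N
```

Column j of 𝗫𝗧 is then the jth particle minus the ensemble mean, so 𝗟t = 𝗫t𝗧 spans the ensemble anomalies.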

a. The initialization step

We initialize the filter with a predictive density of the form Eq. (8),
    p1|0(x) = ΣNi=1 wi1|0 ϕ(x − xi1|0; 𝗣1|0).   (12)
Based on the kernel density estimation method, we may simply sample the xi1|0 from the unconditional distribution of the state vector x1 at this time, and take wi1|0 = 1/N, and 𝗣1|0 = h2cov(xi1|0; wi1|0), where h is a small tuning parameter and cov(xi; wi) denotes the sample covariance matrix of the particles xi associated with the weights wi, namely
    cov(xi; wi) = ΣNi=1 wi (xi − x̄)(xi − x̄)T,  where x̄ = ΣNj=1 wj xj.   (13)
The above initialization would result in a predictive density p1|0 centered about the average of the xi1|0 with a covariance matrix (1 + h2)cov(xi1|0; wi1|0), which is larger than the theoretical covariance matrix of x1 by about a factor (1 + h2)(N − 1)/N. This can be beneficial as it means that we err on the safe side by overestimating the initial error covariance matrix (Anderson and Anderson 1999).
In applications to realistic atmospheric and oceanic data assimilation problems, the dimension n of the state vector is much larger than the number of particles N that can be used. The matrix cov(xi; wi) will therefore be singular, of rank at most N − 1. Letting 𝗫 = [x1 · · · xN] and 𝗪 be the diagonal matrix with diagonal elements w1, . . . , wN, it is possible to verify that cov(xi; wi) can be decomposed as
    cov(xi; wi) = 𝗫(𝗪 − wwT)𝗫T,  where w = (w1, . . . , wN)T.   (14)
The matrix 𝗣1|0 can therefore be factorized as in Eqs. (9)–(10) into
    𝗣1|0 = 𝗟1|0𝗨1|0𝗟T1|0,   (15)
with
    𝗟1|0 = 𝗫1|0𝗧,  𝗨1|0 = h2(𝗧T𝗧)−1𝗧T(𝗪 − wwT)𝗧(𝗧T𝗧)−1.   (16)

In practice, very little information is available on the distribution of the initial state vector px1. However, this is often not a serious problem since several studies have found that px1 does not need to be set with high accuracy because it does not have a significant impact on the long-term behavior of the filter (Doucet et al. 2001). In atmospheric and oceanic applications, it is important that the initial estimate of px1 take into account the main physical quantities that govern the evolution of the state of these systems. Omitting such quantities may badly affect the filter’s behavior, as they tend to persist over time. We therefore estimate the statistics of px1 from a sample of model outputs. In the absence of prior information, we follow the common practice and assume px1 to be Gaussian (Doucet et al. 2001). Under the assumption of a low-rank covariance matrix, a second-order exact drawing can be performed as described by Pham (2001) to sample the xi1|0 such that their mean and sample covariance matrix exactly match the mean and 1/(1 + h2) times the covariance matrix of x1. The covariance matrix of the resulting initial density estimate p1|0 then matches that of px1.
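The weighted sample covariance and its decomposition used above can be checked numerically; both routines below compute the same matrix (a small dense-matrix sketch, with columns of `X` as particles):

```python
import numpy as np

def weighted_cov(X, w):
    """Direct weighted covariance: sum_i w_i (x_i - xbar)(x_i - xbar)^T,
    with columns of X as particles and xbar = X @ w."""
    xbar = X @ w
    D = X - xbar[:, None]
    return D @ np.diag(w) @ D.T

def factored_cov(X, w):
    """The same covariance via the decomposition X (W - w w^T) X^T."""
    return X @ (np.diag(w) - np.outer(w, w)) @ X.T
```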

b. The correction step

Starting from a predictive density of the form of Eq. (8), the analysis density in Eq. (6) is also a mixture of densities, given, up to a normalizing factor, by
    pk(x|y1:k) ∝ ΣNi=1 wik|k−1 ϕ(x − xik|k−1; 𝗣k|k−1) ϕ[yk − 𝗛k(x); 𝗥k].   (17)
Since 𝗣k|k−1 is small, ϕ(x − xik|k−1; 𝗣k|k−1) becomes negligible as soon as x is not sufficiently close to xik|k−1. This allows the observation operator 𝗛k to be linearized around xik|k−1, and hence the ith component of Eq. (17) to be approximated by
    wik|k−1 ϕ(x − xik|k−1; 𝗣k|k−1) ϕ[yk − yik|k−1 − Hik(x − xik|k−1); 𝗥k],   (18)
where yik|k−1 = 𝗛k(xik|k−1) and Hik denotes the gradient of 𝗛k at the point xik|k−1. With the significant computational burden in mind, we further assume that 𝗛k is nearly linear so that its gradient depends weakly on the point where it is evaluated. This allows us to take Hik ≈ Hk (the gradient of 𝗛k at the predicted state ΣNi=1wik|k−1xik|k−1, e.g.). Following calculations similar to those used in the derivation of the standard Kalman filter (e.g., Todling 1999), expression (18) can be rewritten as
    wik|k−1 ϕ(yk − yik|k−1; Σk) ϕ(x − xik; 𝗣k),   (19)
where
    xik = xik|k−1 + 𝗞k(yk − yik|k−1),   (20)
    𝗞k = 𝗣k|k−1𝗛Tk Σ−1k,   (21)
    Σk = 𝗛k𝗣k|k−1𝗛Tk + 𝗥k,   (22)
    𝗣k = 𝗣k|k−1 − 𝗞k𝗛k𝗣k|k−1.   (23)
The analysis density pk(x|y1:k) can be, therefore, approximated by the Gaussian mixture
    pk(x|y1:k) ≈ ΣNi=1 wik ϕ(x − xik; 𝗣k),   (24)
where the new wik weights are updated with
    wik ∝ wik|k−1 ϕ(yk − yik|k−1; Σk),  normalized so that ΣNi=1 wik = 1.   (25)
This shows that pk(·|y1:k) is also a mixture of Gaussian densities, as stated before. Moreover, the covariance matrix 𝗣k of the components of this mixture, being bounded above by the assumed small 𝗣k|k−1 in Eq. (23), remains small.

The filter’s correction step can be interpreted as composed of two types of corrections: a Kalman-type correction defined by Eqs. (20)–(23) and a particle-type correction defined by Eqs. (24)–(25). The Kalman-type correction reduces the risk of degeneracy by pulling the particles toward the true state of the system [Eq. (20)]. This can also be seen from Eq. (25), which has the same form as the standard particle reweighting equation but uses the covariance matrix of the predictive measure Σk as the “observation covariance” matrix rather than the 𝗥k used in the standard particle filter. Since Σk is always greater than 𝗥k, particles close to the observation will receive somewhat less weight than in the standard particle filter, while those far from the observation will receive more. This means that the supports of the local predictive density and of the likelihood will be more coherent than in the particle filter. Resampling will therefore be needed less often, reducing Monte Carlo fluctuations.
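The two-part correction can be sketched with dense matrices and a linear observation operator (hypothetical H, R, and P; the low-rank bookkeeping of the actual filter is omitted): every particle receives the same Kalman gain built from the mixture covariance, and the weights are then updated with the innovation covariance Σk in place of 𝗥k.

```python
import numpy as np

def kpk_correction(particles, weights, P, H, R, y):
    """Sketch of the correction step: a Kalman-type update of each particle
    followed by a particle-type reweighting (dense, linear-H version)."""
    Sigma = H @ P @ H.T + R                       # innovation covariance
    Sigma_inv = np.linalg.inv(Sigma)
    K = P @ H.T @ Sigma_inv                       # shared Kalman gain
    innov = y[None, :] - particles @ H.T          # y - H x^i for each particle
    analysis = particles + innov @ K.T            # Kalman-type correction
    Pa = P - K @ H @ P                            # updated mixture covariance
    # Particle-type correction: reweight with Sigma as "observation covariance".
    log_w = -0.5 * np.einsum('ij,jk,ik->i', innov, Sigma_inv, innov)
    w = weights * np.exp(log_w - log_w.max())
    return analysis, w / w.sum(), Pa
```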

The covariance matrix of the predictive mixture 𝗣k|k−1 was decomposed into 𝗟k|k−1𝗨k|k−1𝗟Tk|k−1 as in Eqs. (9)–(10). Using Eq. (23), it can be seen that 𝗣k can also be factorized into 𝗟k|k−1𝗨′k𝗟Tk|k−1, where
    𝗨′k = 𝗨k|k−1 − 𝗨k|k−1(𝗛k𝗟k|k−1)TΣ−1k(𝗛k𝗟k|k−1)𝗨k|k−1.   (26)
Letting 𝗟k = [x1k · · · xNk]𝗧 = 𝗫k𝗧 and using Eqs. (10) and (23), we have 𝗟k = 𝗟k|k−1𝗕k, where
    𝗕k = 𝗜N−1 − 𝗨k|k−1(𝗛k𝗟k|k−1)TΣ−1k(𝗛k𝗟k|k−1).   (27)
This shows that the covariance matrix of the analysis mixture 𝗣k remains of the form 𝗟k𝗨k𝗟Tk as in Eqs. (9)–(10), with
    𝗨k = 𝗕−1k[𝗨k|k−1 − 𝗨k|k−1(𝗛k𝗟k|k−1)TΣ−1k(𝗛k𝗟k|k−1)𝗨k|k−1]𝗕−Tk.   (28)
Note that when the observational operator 𝗛k is linear,
    𝗛k𝗟k|k−1 = [𝗛k(x1k|k−1) · · · 𝗛k(xNk|k−1)]𝗧.
This formula can still be used when 𝗛k is nonlinear to avoid linearization, as proposed by Pham (2001).
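This trick is easy to verify numerically. The sketch below (our own illustration; the helper `make_T` and the random test data are assumptions) checks that applying the observation operator particle by particle and then multiplying by 𝗧 reproduces 𝗛𝗟 exactly in the linear case:

```python
import numpy as np

def make_T(N):
    """One common choice of the N x (N-1) matrix T with zero column
    sums: T = [I_{N-1}; 0] - (1/N) * ones."""
    return np.vstack([np.eye(N - 1), np.zeros((1, N - 1))]) - np.ones((N, N - 1)) / N

rng = np.random.default_rng(0)
N, n, p = 5, 4, 2
X = rng.normal(size=(n, N))   # particles stored as columns
H = rng.normal(size=(p, n))   # a linear observation operator
T = make_T(N)

L = X @ T                     # L_{k|k-1} = X T
Y = H @ X                     # observation operator applied to each particle
HL = Y @ T                    # [H(x^1) ... H(x^N)] T
```

For a nonlinear 𝗛k, the same product `Y @ T` (with the columns of `Y` built from 𝗛k(xik|k−1)) is used as a derivative-free stand-in for the linearized 𝗛k𝗟k|k−1.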

c. The prediction step

The correction step provided an approximation of the state analysis density pk(·|y1:k) in the form of a mixture of Gaussian densities Eq. (24), with the mixture covariance matrix 𝗣k being small and factorized as 𝗟k𝗨k𝗟Tk. Using Eq. (5), the predictive density at the next step is
pk+1|k(x|y1:k) = ∫ ϕ[x − 𝗠k+1(u); 𝗤k+1] pk(u|y1:k) du = ΣNi=1 wik ∫ ϕ[x − 𝗠k+1(u); 𝗤k+1] ϕ[u − xik; 𝗣k] du.  (29)
Again, since ϕ[u − xik; 𝗣k] becomes negligible as soon as u is not sufficiently close to xik, the model can be linearized around the xik to obtain the approximation
∫ ϕ[x − 𝗠k+1(u); 𝗤k+1] ϕ[u − xik; 𝗣k] du ≈ ∫ ϕ[x − 𝗠k+1(xik) − 𝗠ik+1(u − xik); 𝗤k+1] ϕ[u − xik; 𝗣k] du,
where 𝗠ik+1 denotes the gradient of 𝗠k+1 at xik. As before, we will assume that 𝗠k+1 is nearly linear, so that its gradient depends weakly on the point at which it is evaluated, and therefore take 𝗠ik+1 ≈ 𝗠k+1. With these approximations, the integral in the above equation appears as the density of the sum of a Gaussian random vector of mean 𝗠k+1(xik) and covariance matrix 𝗤k+1 and an independent Gaussian vector of mean zero and covariance matrix 𝗠k+1𝗣k𝗠Tk+1. Therefore,
pk+1|k(x|y1:k) ≈ ΣNi=1 wik ϕ[x − 𝗠k+1(xik); 𝗠k+1𝗣k𝗠Tk+1 + 𝗤k+1].  (30)
This shows that the predictive density remains a mixture of Gaussian densities of mean xik+1|k = 𝗠k+1(xik), covariance matrix
𝗣k+1|k = 𝗠k+1𝗣k𝗠Tk+1 + 𝗤k+1,  (31)
and weight wik+1|k = wik. The covariance matrix of the analysis density 𝗣k was decomposed into 𝗟k𝗨k𝗟Tk with 𝗟k = 𝗫k𝗧, the above equation becomes
𝗣k+1|k = (𝗠k+1𝗟k)𝗨k(𝗠k+1𝗟k)T + 𝗤k+1.  (32)
To avoid the linearization of the model, we follow Pham (2001) and compute 𝗟k+1|k from
𝗟k+1|k = [𝗠k+1(x1k) · · · 𝗠k+1(xNk)]𝗧 = 𝗫k+1|k𝗧.  (33)
In the case of a perfect model, that is, with no dynamical noise (𝗤k = 0), 𝗣k+1|k is again of the form of Eqs. (9)–(10), and the correction step can therefore be repeated as before. When the model is imperfect, the rank of the mixture covariance matrix 𝗣k+1|k will increase continually, even if 𝗤k+1 is of low rank. Under the assumption of a low-rank 𝗤k+1, however, several techniques can be used to avoid this problem: for example, projecting the dynamical noise onto the subspace spanned by 𝗟k+1|k, reapproximating the matrix 𝗣k+1|k by a rank-(N − 1) matrix, or using an ensemble representation of the model error, so that 𝗣k+1|k remains of the form 𝗟k+1|k𝗨k+1|k𝗟Tk+1|k, as suggested by Hoteit et al. (2007).
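For the perfect-model case (𝗤k = 0), the prediction step is then mostly bookkeeping around the model integration. A minimal sketch (ours; the names are illustrative, `model` stands in for the integration of 𝗠k+1, and particles are stored as columns of `X`):

```python
import numpy as np

def predict(X, weights, U, model, T):
    """Prediction step with no dynamical noise: propagate each particle
    with the model; weights and the reduced factor U are unchanged, and
    the low-rank factor is rebuilt as L_{k+1|k} = X_{k+1|k} T."""
    X_next = np.column_stack([model(x) for x in X.T])
    return X_next, weights, U, X_next @ T

# Toy demo with a linear "model": the factor L is simply propagated too.
T = np.vstack([np.eye(3), np.zeros((1, 3))]) - np.ones((4, 3)) / 4
X = np.arange(8.0).reshape(2, 4)
w = np.full(4, 0.25)
X1, w1, U1, L1 = predict(X, w, np.eye(3), lambda x: 2.0 * x, T)
```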

d. The resampling step

Resampling is crucial in the particle filter to avoid the degeneracy of the particles. The same problem might also occur in the LRKPK filter, albeit to a lesser degree, since the matrix Σk in Eq. (22) is greater than 𝗥k, which means that the weights are distributed more uniformly than in the particle filter. Another problem that may occur in our filter is that the matrix 𝗣k+1|k in Eq. (31) is generally greater than 𝗣k, because of the presence of dynamical noise and/or the amplifying effect of the multiplication by 𝗠k+1; hence, the mixture covariance matrices might become too large to justify the approximations needed to derive Eqs. (18) and (30). We rely on resampling to reduce the size of 𝗣k+1|k. Note that a standard “full” resampling would require particle selection, which eliminates particles with low weights and duplicates particles with high weights. To avoid the Monte Carlo fluctuations associated with such selection, a “partial” resampling is used here when the weights are already sufficiently uniform. Partial resampling is less destructive (of information) than full resampling: it amounts to adding some noise to the particles while the weights remain unchanged. This is similar to the resampling procedure of the standard particle filter, which adds noise to the particles to mimic the dynamical noise. In any case, since resampling inevitably entails some loss of accuracy, it should be used only when necessary.

Resampling is therefore used in the LRKPK filter to obtain an approximation of the predictive density
pk+1|k(x|y1:k) = ΣNi=1 wik+1|k ϕ[x − xik+1|k; 𝗣k+1|k]  (34)
by a mixture of Gaussian densities
p̃k+1|k(x|y1:k) = ΣNi=1 w̃ik+1|k ϕ[x − x̃ik+1|k; 𝗣̃k+1|k],  (35)
under the requirement that the new mixture components have a small covariance matrix 𝗣̃k+1|k [always decomposed as in Eqs. (9)–(10)]. The weights w̃ik+1|k are set to 1/N if full resampling is performed and remain unchanged (= wik+1|k) in the case of partial resampling.
We first describe the full resampling procedure. A simple way to determine p̃k+1|k(x|y1:k) is to follow the theory of the kernel density estimator (Silverman 1986): draw the x̃ik+1|k according to the density in Eq. (34) and then take 𝗣̃k+1|k = h2Πk+1|k, where Πk+1|k is the covariance matrix of the predictive density,
Πk+1|k = 𝗣k+1|k + ΣNi=1 wik+1|k (xik+1|k − x̄k+1|k)(xik+1|k − x̄k+1|k)T, with x̄k+1|k = ΣNi=1 wik+1|k xik+1|k,  (36)
and h is a small bandwidth parameter that can be chosen with some freedom. The kernel density estimator is, however, biased: if the x̃ik+1|k are sampled according to Eq. (34), then Eq. (35) has the expectation
E[p̃k+1|k(x|y1:k)] = ΣNi=1 wik+1|k ϕ[x − xik+1|k; 𝗣k+1|k + h2Πk+1|k].  (37)
The above calculation shows that the bias can be completely eliminated if the x̃ik+1|k are drawn from a density of the same form as Eq. (34), but with 𝗣k+1|k replaced by 𝗣k+1|k − h2Πk+1|k. This obviously requires the matrix 𝗣k+1|k − h2Πk+1|k to be nonnegative definite. Moreover, bias is not the only criterion; the variance needs to be considered as well (a small bias might be allowed if it improves the variance and achieves a smaller estimation error). We therefore propose to draw the x̃ik+1|k from the density
ΣNi=1 wik+1|k ϕ[x − xik+1|k; 𝗣k+1|k − ε2Πk+1|k],  (38)
where ε ∈ (0, h) is another parameter. The idea is then to choose the h and ε that minimize the mean integrated squared error (MISE) of the estimator in Eq. (35). It turns out that for a given ε, the MISE is (asymptotically) minimized with respect to h at a unique point hopt(ε), and further, the attained minimum decreases as ε increases. This means that ε should be chosen as large as possible. But ε is constrained by the condition 𝗣k+1|k − ε2Πk+1|k > 0 and by the requirement that the corresponding 𝗣̃k+1|k be small. Given Eqs. (14) and (32), the first constraint is equivalent to 𝗨k − ε2𝗩k > 0, where
𝗩k = 𝗨k + (𝗧T𝗪−1k+1|k𝗧)−1, with 𝗪k+1|k = diag(w1k+1|k, . . . , wNk+1|k).  (39)
This is true if 𝗖−1k𝗨k(𝗖Tk)−1 > ε2𝗜d (𝗜d being the identity matrix), where 𝗖k𝗖Tk is the Cholesky decomposition of 𝗩k. In general, this constraint always entails the second one, so the latter may be ignored. We therefore choose ε* as the square root of the smallest eigenvalue of the matrix 𝗖−1k𝗨k(𝗖Tk)−1. The corresponding hopt(ε*) can then be computed.
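The computation of ε* can be sketched directly (our own illustration with assumed helper names): Cholesky-factor 𝗩, form 𝗖−1𝗨𝗖−T, and take the square root of its smallest eigenvalue.

```python
import numpy as np

def epsilon_star(U, V):
    """Largest eps with U - eps^2 V > 0: eps*^2 is the smallest
    eigenvalue of C^{-1} U C^{-T}, where V = C C^T (Cholesky)."""
    C = np.linalg.cholesky(V)
    Ci = np.linalg.inv(C)
    return np.sqrt(np.linalg.eigvalsh(Ci @ U @ Ci.T).min())

# Sanity check on random SPD matrices: U - eps^2 V is still positive
# semidefinite at eps = eps*, but loses definiteness just beyond it.
rng = np.random.default_rng(1)
A, B = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
U = A @ A.T + 4.0 * np.eye(4)
V = B @ B.T + np.eye(4)
eps = epsilon_star(U, V)
```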

In practice, we found that ε* can be quite small. As a result, the Kalman-type correction has little effect and the filter behaves more like a particle filter, which is undesirable. As we care more about the behavior of the filter than about the accuracy of the density estimator, it is of interest to focus on the reduction of Monte Carlo fluctuations at the expense of bias. Such bias does not cause great harm; it actually means that the estimated predictive density, and therefore the particle ensemble, is more spread out than the true one, and this helps reduce the risk of degeneracy. Intuitively, as the predictive density is made more diffusive, the filter relies more on recent observations than on the model and past observations. This has a similar effect to the widely used forgetting (or inflation) factor in Kalman filtering (Jazwinski 1970) and helps attenuate the propagation of the different sources of error in the filter (such as Monte Carlo fluctuations, low-rank and model error approximations, and system linearization in our case). It is therefore quite reasonable to sacrifice some of the filter’s average performance to reduce the risk of divergence, which means that it could be more beneficial to use a value of h larger than the “optimal” one. In the absence of a precise rule, we will consider h as a tunable parameter and select it empirically, by trial and error.

For partial resampling, we draw each new particle x̃ik+1|k according to the Gaussian density of mean xik+1|k and covariance matrix 𝗣k+1|k − ε2Πk+1|k. The above calculations concerning the bias of the density estimator p̃k+1|k(·|y1:k) remain valid, but not those concerning the variance, since the x̃ik+1|k are no longer drawn from the same distribution. We therefore take ε = ε* and, for the same reasons as above, choose h a priori as a tuning parameter.

Note that after a full or a partial resampling, 𝗣k+1|k is reset to 𝗣̃k+1|k = h2Πk+1|k. The matrix 𝗨k+1|k needs to be replaced by 𝗨̃k+1|k so that 𝗣̃k+1|k is again factorized into 𝗟̃k+1|k𝗨̃k+1|k𝗟̃Tk+1|k, where 𝗟̃k+1|k is a function of the new particles, that is, 𝗟̃k+1|k = 𝗫̃k+1|k𝗧.

Finally, to test whether resampling is needed, we check whether the weights wik+1|k are far from uniformly distributed (full resampling) or whether the matrix 𝗣k+1|k is no longer small (partial resampling). A simple test for the first condition considers the entropy of the weights, −ΣNi=1wik+1|k log wik+1|k, which reaches its maximum log N when the distribution of the weights wik+1|k is uniform. Thus, we can apply a full resampling if the quantity
𝗘k+1 = 1 + (ΣNi=1 wik+1|k log wik+1|k)/log N  (40)
exceeds some threshold. Concerning the second condition, the largest eigenvalues of the matrix 𝗖−1k𝗨k(𝗖Tk)−1 should be monitored. In practice, the filter’s covariance matrices 𝗣k and 𝗣k+1|k do not change much after a prediction or a correction step, so after a resampling step these matrices can be expected to remain small for at least several filtering cycles. This is of course problem dependent, but it is often the case in practice, as the sampling interval is generally small (so that the system does not change much over this interval) and the information provided by a single observation is not significant enough to produce an important correction. To avoid computing the eigenvalues of 𝗖−1k𝗨k(𝗖Tk)−1, a simple rule is to wait, say, m filtering cycles before resampling again. Resampling is then partial if the entropy test reveals that the distribution of the weights is still not too far from uniform.
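Assuming the criterion of Eq. (40) takes the normalized-entropy form 𝗘 = 1 + (Σi wi log wi)/log N — zero for uniform weights, approaching one as the weights collapse onto a single particle, which is our reading of the entropy test above — the resampling decision can be sketched as:

```python
import numpy as np

def degeneracy_index(weights):
    """1 - entropy/log(N): 0 for uniform weights, -> 1 as the weight
    distribution collapses onto a single particle."""
    w = np.asarray(weights, dtype=float)
    N = w.size
    wp = w[w > 0]  # 0 * log(0) contributes nothing to the entropy
    return 1.0 + (wp * np.log(wp)).sum() / np.log(N)

def resampling_mode(weights, eta):
    """Full resampling when the weights are too far from uniform,
    partial resampling otherwise."""
    return "full" if degeneracy_index(weights) > eta else "partial"
```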

4. Summary of the LRKPK filter algorithm

The filter’s algorithm is summarized below. Of course, the initialization step is only applied once. The forecast and correction steps must be repeated and resampling can be skipped or done partially. After every prediction or correction step, the minimum-variance estimate of the system state is obtained as the weighted mean of the predicted or the analyzed particles.

  1. Initialization. Starting from an initial prediction state of mean m0 and covariance matrix 𝗣0, draw an ensemble x11|0, . . . , xN1|0 according to the Gaussian distribution ϕ[m0; 𝗣0/(1 + h2)]. Then set wi1|0 = 1/N for all i and take 𝗣1|0 = 𝗟1|0𝗨1|0𝗟T1|0, where 𝗟1|0 = 𝗫1|0𝗧 and 𝗨1|0 = h2(𝗧T𝗪−11|0𝗧)−1 = (h2/N)(𝗧T𝗧)−1.
  2. Correction step. First compute yik|k−1 = 𝗛k(xik|k−1) for each i and determine Σk from Eq. (22). Then apply a
    • Kalman-type correction: Compute 𝗚k from Eq. (21) using Eq. (26), and use it to correct the forecast particles xik|k−1 with the new observation yk via Eq. (20) to obtain the analysis particles xik. Then take 𝗟k = 𝗫k𝗧 and update 𝗨k by Eq. (28) so that 𝗣k = 𝗟k𝗨k𝗟Tk.
    • Particle-type correction: Update the particles weights with Eq. (25).
  3. Prediction step. Integrate each particle xik with the model to the time of the next available observation to determine the xik+1|k and keep the weights unchanged wik+1|k = wik. Then take 𝗟k+1|k = 𝗫k+1|k𝗧 and 𝗨k+1|k = 𝗨k.
  4. Resampling step. Compute the matrix 𝗩k from Eq. (39) and its Cholesky decomposition 𝗖k𝗖Tk, and set ε* as the square root of the smallest eigenvalue of 𝗖−1k𝗨k(𝗖Tk)−1. Then, for each i, draw a random Gaussian vector ξik+1 of mean zero and covariance matrix 𝗨k − ε*2𝗩k. If 𝗘k+1, as computed from Eq. (40), is smaller than a threshold η, perform a
    • Partial resampling: Simply add the 𝗟k+1|kξik+1 to the xik+1|k to obtain the new particles x̃ik+1|k, that is,
      x̃ik+1|k = xik+1|k + 𝗟k+1|kξik+1,  (41)
      and set w̃ik+1|k = wik+1|k = wik.
    Else, if 𝗘k+1 > η, perform a
    • Full resampling: Select N particles among the xik+1|k according to their weights wik+1|k, then add 𝗟k+1|kξik+1 to the ith selection to obtain the new particle x̃ik+1|k, that is,
      𝗫̃k+1|k = 𝗫k+1|k𝗦k+1 + 𝗟k+1|k[ξ1k+1 · · · ξNk+1],  (42)
      where 𝗦k+1 is the selection matrix, with exactly one nonzero element equal to one in each column, and set w̃ik+1|k = 1/N.

Finally, after partial or full resampling, determine the new 𝗟̃k+1|k from the x̃ik+1|k and set
𝗨̃k+1|k = h2𝗗k+1𝗩k𝗗Tk+1,  (43)
where 𝗗k+1 = [𝗦̃k+1 + (ξ1k+1 · · · ξNk+1)𝗧]−1, and 𝗦̃k+1 = 𝗜d if partial resampling was performed, or 𝗦̃k+1 = (𝗧𝗧T)−1𝗧T𝗦k+1𝗧 if full resampling was performed.
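The two resampling variants of step 4 can be sketched as follows (our own illustration; names and shapes are assumptions, particles are stored as columns of `X`, and the perturbation is built as 𝗟ξ with ξ drawn in the reduced space):

```python
import numpy as np

def resample(X, L, weights, noise_cov, full, rng):
    """Partial resampling perturbs each particle with low-rank noise
    L @ xi; full resampling first selects particle indices according
    to the weights, then perturbs the selections and resets the
    weights to 1/N."""
    N = X.shape[1]
    r = noise_cov.shape[0]
    xi = rng.multivariate_normal(np.zeros(r), noise_cov, size=N).T  # r x N
    if full:
        idx = rng.choice(N, size=N, p=weights)  # selection by weight
        return X[:, idx] + L @ xi, np.full(N, 1.0 / N)
    return X + L @ xi, weights.copy()

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 4))
L = rng.normal(size=(3, 2))
w = np.array([0.7, 0.1, 0.1, 0.1])
Xp, wp = resample(X, L, w, 0.01 * np.eye(2), full=False, rng=rng)
Xf, wf = resample(X, L, w, 0.01 * np.eye(2), full=True, rng=rng)
```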

5. Application to the Lorenz model

The LRKPK filter was first implemented with the Lorenz model (Lorenz 1963) to assess the behavior of the filter with a simple but strongly nonlinear and chaotic model. The model is described by the three-dimensional system of ordinary differential equations
dx/dt = s(y − x), dy/dt = rx − y − xz, dz/dt = xy − bz,
with the commonly used parameters s = 10, r = 28, and b = 8/3, which yield a chaotic solution. The design of the experiment is identical to that of Pham (2001), so the results of the LRKPK filter can be compared with those reported by Pham (2001) for several ensemble/particle-based filters. The Lorenz equations are first integrated between t = 0 and t = 25 with the standard Runge–Kutta scheme, using a step size of 0.005 and starting from the initial state (x0, y0, z0) = (−0.587276, −0.563678, 16.8708). This is the reference run. A set of 400 states was formed by retaining the reference solution at intervals of 0.05 starting from t = 5, to avoid the transitory phase. Observations of the variable x only are simulated at intervals of 0.05 by adding random normal noise with mean zero and variance 2 to the reference solution. The state’s initial PDF is assumed to be Gaussian with mean x1|0 and covariance matrix 𝗣1|0, set as the mean and sample covariance matrix of the 400 retained reference states. A rank-2 approximation of 𝗣1|0 was then computed by applying an EOF analysis to the reference states, as described by Pham et al. (1997); this analysis provides the best low-rank approximation of a sample covariance matrix (Preisendorfer 1988). The rank-2 approximation is in line with the effective dimension of the Lorenz model (2.06), which means that the filter runs with only 3 particles. The initial 3 particles xi1|0 were randomly sampled using the second-order exact drawing scheme (Pham 2001). The performance of the filter is measured by the root-mean-square (rms) error (1/√3 times the norm of the difference) between the filter analysis and the reference state.
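The reference run and synthetic observations described above can be reproduced in a few lines of Python (our sketch; the RK4 implementation and the variable names are ours, and the random seed is arbitrary):

```python
import numpy as np

def lorenz_rhs(u, s=10.0, r=28.0, b=8.0 / 3.0):
    x, y, z = u
    return np.array([s * (y - x), r * x - y - x * z, x * y - b * z])

def rk4_step(u, dt):
    """One step of the standard fourth-order Runge-Kutta scheme."""
    k1 = lorenz_rhs(u)
    k2 = lorenz_rhs(u + 0.5 * dt * k1)
    k3 = lorenz_rhs(u + 0.5 * dt * k2)
    k4 = lorenz_rhs(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Reference run from t = 0 to t = 25 (dt = 0.005); keep states every
# 0.05 from t = 5 onward and observe x only, with N(0, 2) noise.
dt, keep_every, spinup_steps, total_steps = 0.005, 10, 1000, 5000
u = np.array([-0.587276, -0.563678, 16.8708])
rng = np.random.default_rng(0)
reference, observations = [], []
for step in range(1, total_steps + 1):
    u = rk4_step(u, dt)
    if step > spinup_steps and step % keep_every == 0:
        reference.append(u.copy())
        observations.append(u[0] + rng.normal(0.0, np.sqrt(2.0)))
```

This yields the 400 retained reference states and the corresponding noisy observations of x.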

Figure 1 plots the rms error of the LRKPK filter in the top panel and the corresponding solutions for the z component (not assimilated) in the lower panel. The resampling step was applied every two filtering cycles, and the resampling threshold and the parameter h were set to 0.25 and 0.9, respectively. It can be seen that the LRKPK filter does a very good job of tracking the reference solution while capturing all the model phase transitions. The rms error is also rather consistent over the entire assimilation window and is very comparable to those obtained by Pham (2001) with the ensemble Kalman filter using a larger number of particles. The filter was also able to provide reliable estimates for the z component of the model, showing an efficient propagation of information to a nonobserved variable. We must mention that the performance of the filter was quite sensitive to the choice of the tuning parameters of the resampling step: a bad choice of these parameters might result in much worse performance than reported here; likewise, better performance could be achieved with different choices.

The results of this section suggest that the LRKPK filter exploits the limited number of EDOF of the Lorenz system to efficiently operate with a small number of particles. The next section will test the performance of the LRKPK filter with a realistic ocean general circulation model (OGCM).

6. Application to an OGCM

We present preliminary results from a first application of the LRKPK filter for the assimilation of synthetic sea surface height (SSH) data into an OGCM of the Mediterranean Sea, following a twin-experiment approach.

a. The ocean model

We use the Princeton Ocean Model (POM), a primitive-equation, finite-difference model formulated under the hydrostatic and Boussinesq approximations. The model solves the 3D Navier–Stokes equations on an Arakawa C grid with a numerical scheme that conserves mass and energy. The spatial differencing schemes are central and explicit in the horizontal, and central and implicit in the vertical. Time stepping is achieved using a leapfrog scheme associated with an Asselin filter. The numerical computation is split into an external barotropic mode with a short time step [dictated by the Courant–Friedrichs–Lewy (CFL) condition], solving for the time evolution of the free surface elevation and the depth-averaged velocities, and an internal baroclinic mode that solves for the vertical velocity shear. Horizontal mixing in the model is parameterized according to Smagorinsky (1963), while vertical mixing is calculated using the Mellor–Yamada level-2.5 turbulence closure scheme. The model state vector is composed of all prognostic variables of the model at every sea grid point. The reader is referred to Blumberg and Mellor (1987) for a detailed description of POM.

The model domain covers the entire Mediterranean basin extending from 7°W to 36°E and 30° to 46°N and has one open boundary located at 7°W. Open boundary conditions were set as follows:

  • Zero gradient condition for the free surface elevation.
  • Flather (1976) boundary conditions for the normal barotropic velocity.
  • Sommerfeld radiation for the internal baroclinic velocities.
  • Temperature and salinity are advected upstream. When there is inflow through the boundary, these fields are prescribed from the Mediterranean Ocean Database (MODB) MODB-MED4 seasonal climatology.

The horizontal resolution is ¼° × ¼°, with 25 sigma levels in the vertical, logarithmically distributed near the surface and the bottom. The number of grid points is therefore 175 × 65 × 25. The model bathymetry was obtained from the U.S. Navy Digital Bathymetric Data Bases DBDB5 and DBDB1 and is shown in Fig. 2. The surface forcing (monthly climatological wind stress, upward heat flux, net shortwave radiation, and evaporation rate) was derived from the 1979–1993 ECMWF global 1° × 1° 6-h reanalysis, except for the precipitation fields, which were derived from the Jaeger monthly climatology. Bulk formulas were used to compute the surface momentum, heat, and freshwater fluxes at each time step of the model integration, taking into account the SST predicted by the model itself. Before applying the interannual atmospheric forcing, the model dynamics were first adjusted to a perpetually repeating seasonal cycle by integrating the model climatologically for 20 yr. This climatological run was initialized with the MODB-MED4 spring temperature and salinity profiles, and the initial velocities were set to zero.

b. Experimental setup

1) Filter initialization

The filter is initialized with a Gaussian PDF of mean x1|0 and covariance matrix 𝗣1|0, respectively taken as the mean and sample covariance matrix of a large historical set 𝗛S of state vectors simulated from a long model run. A low-rank approximation 𝗣1|0 ≈ 𝗟1|0𝗨1|0𝗟T1|0 is determined by applying an EOF analysis to 𝗛S. The initial particles xi1|0 were then randomly sampled using the second-order exact drawing scheme (Pham 2001).

The historical set 𝗛S was constructed as follows. The model was first integrated for a 2-yr period (1980–1981), starting from the end of the 20-yr spinup run, to achieve a quasi adjustment of the model climatological dynamics to the ECMWF interannual forcing. Next, another integration of 4 yr (1982–1985) was carried out to generate 𝗛S by retaining one model output (state vector) every two days. Since the state vector is composed of variables of different natures, a multivariate EOF analysis was applied to the sampled set of 730 state vectors. In this analysis, each state variable was normalized by its standard deviation spatially averaged over all sea grid points. About 50 EOFs were needed to account for 90% of the system variance. Given that the individual variance explained by the remaining EOFs was insignificant, this number, 50, provides an estimate of the upper bound on the EDOF of the system (Farrell and Ioannou 2001b). This suggests that a covariance matrix of rank 50, or even less, would likely provide a sufficiently accurate approximation of the covariance matrix of the Gaussian mixture used to approximate the PDF of the system state.
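The EOF-truncation choice can be sketched as follows (our illustration; `eofs_for_variance` is a hypothetical helper, and the toy data set stands in for the normalized historical set 𝗛S):

```python
import numpy as np

def eofs_for_variance(states, frac=0.9):
    """Number of leading EOFs needed to explain `frac` of the total
    variance; `states` holds one (normalized) state vector per column."""
    anom = states - states.mean(axis=1, keepdims=True)
    s = np.linalg.svd(anom, compute_uv=False)
    explained = np.cumsum(s**2) / (s**2).sum()
    return int(np.searchsorted(explained, frac) + 1)

# Toy historical set with two dominant directions: two EOFs suffice
# to explain 90% of the variance.
rng = np.random.default_rng(3)
HS = np.diag([10.0, 8.0, 0.1, 0.1]) @ rng.normal(size=(4, 200))
```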

2) Twin-experiments design

A reference model run was first carried out over a 1-yr period starting from 1 January 1986. A sequence of 73 reference states was formed by retaining one model output every 5 days. These states, considered the “true states,” are used to extract the pseudoobservations and to evaluate the filter’s behavior by comparing them with the state vectors estimated by the filter; this allows the assessment of the filter’s performance for nonobserved variables. The assimilation experiments were then carried out over the same period, using pseudoobservations of SSH extracted from the reference states at every fourth grid point. Independent Gaussian errors of zero mean and 3-cm standard deviation were added to the observations. All experiments were performed in a perfect-model context (𝗤k = 0), and the observational error covariance matrix 𝗥k is diagonal with (3 cm)2 diagonal elements. Another model run, initialized from the filter’s initial state estimate x1|0 and integrated over 1986 without any assimilation, was also performed to assess the relevance of the assimilation.

Figure 3 plots the time evolution of the basin-average kinetic energy during the experiment (initialization/EOF analysis/twin experiment) periods. The kinetic energy exhibits a strong seasonal cycle on which interannual anomalies are superimposed, the most important being that of 1981, forced by the corresponding wind stress anomalies.

The performance of the filters was evaluated by comparing the filter’s analysis error for each model state variable with the free-run estimation error, via the relative rms (RRMS) error
RRMS = ‖xa − xt‖/‖xf − xt‖,  (44)
where xa, xf, and xt denote the analyzed, free-run, and reference (“true”) states, respectively.
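Assuming the RRMS is the rms analysis error normalized by the free-run rms error (values below one indicating that assimilation beats the free run), it can be computed as in this small sketch (the helper name is ours):

```python
import numpy as np

def rrms(analysis, free_run, truth):
    """Relative rms error: rms(analysis - truth) / rms(free_run - truth)."""
    num = np.sqrt(np.mean((analysis - truth) ** 2))
    den = np.sqrt(np.mean((free_run - truth) ** 2))
    return num / den
```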

c. Assimilation results

In this section, the performance of the LRKPK filter is examined in a series of data assimilation experiments. We first present the overall behavior of the filter and then discuss the effect of varying the filter’s parameters, such as the rank of the mixture covariance matrices and the bandwidth parameter, on the filter’s performance (basically to determine the setup for the main experiment).

1) Main experiment

The LRKPK filter was implemented using 50 particles, with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4. The evolution of the RRMS for this run as a function of time is plotted in Fig. 4. The temporal development of the RRMS is characterized by a large reduction of the estimation error with respect to the free run at the first analysis step. Subsequent analyses are less significant, and the filter stabilizes the state estimation error at about 70% below that of the free run. The assimilation also significantly improves the estimation of all model state variables with respect to the free run over the entire assimilation window, and, as can be expected, the best results were achieved for the observed variable, SSH.

To evaluate the filter’s performance in capturing the variability of the model, Fig. 5 plots the spatial distribution of the SSH rms estimation error with respect to the reference run for (a) the free run and (b) the filter’s run. Centers of large errors (6 cm) are located in the central Balearic basin and the Tyrrhenian Sea within the western Mediterranean, and in the central Ionian within the eastern part of the basin. Most of these errors are related to the variability of the Atlantic waters (AW) current flowing along the North African coast within the western Mediterranean and of the Atlantic–Ionian (AI) stream, the continuation of this current within the Ionian basin. The filter greatly improves the SSH solution with respect to the free run. In particular, the general variability as well as the mean position of the AW current and the AI stream were efficiently captured by the assimilation system, resulting in a drastic reduction of the SSH misfits, to below 2 cm over the entire Mediterranean basin. It is also important to examine the filter’s behavior in the intermediate and deep ocean layers, to make sure that the surface observations were properly assimilated by the model and to assess the efficiency of the filter in propagating this information to the deep layers. Figure 6 shows the spatial distribution of the mean rms estimation error for (a) the free run and (b) the filter’s run, for a zonal temperature transect along 33.75°N. Temperature misfits for the free run are concentrated between the surface and 300 m, which is approximately the depth of the Levantine waters produced annually within the eastern Mediterranean. In the particular section shown here, most of the error (reaching 1.2°C) is located within the central Ionian basin and is related to the SSH misfits shown before (Korres and Lascaratos 2003).
At all depths, the filter significantly improves the estimation of the temperature with respect to the model free run, particularly in the eastern basin, where large filter/data misfits were completely removed. This suggests a high capability for efficiently propagating surface-only altimetric information to nonobserved variables.

Overall, the filter was able to significantly improve the estimation of all model state variables with respect to the model free run, while efficiently propagating surface altimetric information to the deep ocean.

2) Sensitivity with respect to the rank of the mixture covariance matrices

Sensitivity assimilation experiments were carried out to examine the effect of varying the rank r of the Gaussian mixture covariance matrices on the filter’s behavior. The rank r determines the number of particles N (=r + 1) to be used in the filter, and hence sets the computational cost of the assimilation system.

Figure 7 shows the time evolution of the RRMS for the model state variables as obtained with the LRKPK filter using three different numbers of particles: 30, 50, and 100. In all these experiments, the bandwidth parameter was set to h = 0.4 and resampling was performed every 5 filtering steps. These plots suggest that an ensemble of only 30 particles provides reliable estimates of the assimilated variable (SSH) at reasonable computational cost; for this variable, the RRMS obtained using 50 and 100 particles are not significantly different. Concerning the estimation of nonobserved variables, the filter shows some weaknesses when small ensembles are used, although the overall performance is still reasonable. This suggests that the multivariate propagation of the assimilated information requires well-resolved covariances between SSH and the other variables, which does not seem feasible with only 30 particles. The use of more particles significantly attenuates this problem and stabilizes the filter’s behavior for all model state variables over the entire assimilation window. It further allows for more degrees of freedom (larger covariance matrix ranks), which enables a better fit to the data. In this particular system, the benefit of doubling the number of particles from 50 to 100 was not significant. Considering the computational burden associated with increasing the number of particles, the LRKPK filter can be implemented with 50 particles. This is a small ensemble for a nonlinear filter, even for this model configuration, which has a rather limited number of EDOF. This is consistent with the results of section 5, suggesting that the LRKPK filter exploits the limited number of EDOF of the system to operate efficiently with a small number of particles.

3) Sensitivity with respect to resampling parameters

Following the discussion in section 3d, the resampling frequency m (in filtering cycles) and the value of the bandwidth parameter h, which adjusts the “size” of the mixture covariance matrices after resampling, are chosen empirically in the LRKPK filter. It is therefore necessary to conduct assimilation experiments to evaluate the sensitivity of the filter’s performance to these parameters.

Assimilation results (not shown) from three sensitivity experiments with different resampling frequencies (m = 3, 5, and 10), all with h = 0.4 and 30 particles to reduce the computational burden, revealed very little difference in the filter’s performance whether resampling was applied every 3 or 5 filtering cycles (slightly better performance was obtained with m = 5); both generally provided better results than the run with resampling every 10 filtering cycles. This suggests that resampling is generally useful to limit the overdispersion of the particles, but can sometimes be omitted because the Kalman-type correction reduces the risk of degeneracy.

We then present the results of three filter runs with different values of the bandwidth parameter: h = 0.2, 0.4, and 0.8. Again, to save computing time, the filter was implemented with 30 particles in all these runs. The time evolution of the RRMS resulting from the three assimilation runs is shown in Fig. 8. These plots suggest that the best performance was obtained using h = 0.4. They support the discussion in section 3d about the appropriate choice of h: not too large, to justify the approximations in Eqs. (18) and (30), and not too small, to retain the benefit of the Kalman-type correction. This “best” value of h is only indicative, however, as the appropriate choice depends on the properties of the assimilation system (even on the size of the ensemble) and may differ for another system or setup.

7. Summary and discussion

Most analysis schemes of current sequential data assimilation techniques are based on Gaussian distributions of the model state. However, for nonlinear models, the distribution of the model state is not Gaussian even when the system statistics are Gaussian. This means that these assimilation schemes are only suboptimal (more precisely, they are only optimal among linear analysis estimators). The solution of the nonlinear data assimilation problem is well known and is provided by the optimal nonlinear filter, which theoretically offers a simple method to estimate the PDF of the system state. Several attempts to develop a discrete algorithm for an efficient implementation of this filter have been presented, mainly based on a point-mass representation of the state PDFs (the particle filter). Besides being computationally prohibitive for large-dimensional systems, these filters suffer greatly from the degeneracy of their particles, which very often causes the divergence of the filter.

A new approximate solution of the optimal nonlinear filter suitable for applications in meteorology and oceanography has been presented, together with a pilot implementation in the simple Lorenz model and a test case assimilating pseudoaltimetric data into a realistic ocean general circulation model, which provides preliminary evidence of its feasibility. The new filter, called the low-rank kernel particle Kalman (LRKPK) filter, is based on a Gaussian mixture representation of the state PDFs, complemented by a local linearization of the system around the mean of each Gaussian component. With application to high-dimensional oceanic and atmospheric systems in mind, the covariance matrix of the Gaussian mixture was further assumed to be of low rank, and the local linearization was replaced by a “one vector” linearization. This resulted in a new filter in which the standard particle-type correction of the particle weights is complemented by a Kalman-type correction, similar to that of the popular ensemble Kalman filter but using the covariance matrix of the mixture instead of the sample covariance matrix of the ensemble. As in the ensemble Kalman filter, the Kalman correction attenuates the degeneracy of the particles by pulling them toward the true state of the system, which enables the filter to operate efficiently with reasonably sized ensembles. Combined with the low-rank approximation, which avoids the manipulation of full-size covariance matrices, this enables the use of the filter for oceanic and atmospheric data assimilation problems.

The chosen test situation was a Princeton Ocean Model (POM) configuration of the Mediterranean Sea in which several mesoscale eddies interact. Under these testing conditions, the LRKPK filter was found to be fairly effective at tracking the flow state and its evolution using surface-only pseudoaltimetric data. Further work will consider more complex situations, both from the model point of view, with different model setups in more strongly nonlinear configurations, and through the assimilation of real data, from the Ocean Topography Experiment (TOPEX)/Poseidon satellite, for example. A close theoretical and practical comparison between the LRKPK filter and the popular ensemble Kalman filter is also of interest, to assess the relevance of the nonlinear analysis step, and will be pursued in the near future. This preliminary application was a necessary step before realistic applications are undertaken, and its results are encouraging in that regard.

Acknowledgments

We thank Dr. Bruce Cornuelle for valuable comments and discussions.

Fig. 1.

(top) Rms error of the LRKPK filter with the Lorenz model. (bottom) Reference (black curve) and filter (plus signs) solutions for the z component (note that this variable was not assimilated).

Citation: Monthly Weather Review 136, 1; 10.1175/2007MWR1927.1

Fig. 2.

Topography of the ¼° Mediterranean model.

Fig. 3.

Time evolution of the basin average kinetic energy (10⁻³ m² s⁻²) from 1979 to 1987.

Fig. 4.

Time evolution of the RRMS as it results from the LRKPK filter using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

Fig. 5.

Spatial distribution of the SSH rms misfit as it results from (a) the free run and (b) the LRKPK filter’s run using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

Fig. 6.

Spatial distribution of the rms misfit for a zonal temperature transect along 33.75°N as it results from (a) the free run and (b) the LRKPK filter’s run using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

Fig. 7.

Time evolution of the RRMS as it results from the LRKPK filter using 30, 50, and 100 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4.

Fig. 8.

Time evolution of the RRMS as it results from the LRKPK filter using 30 particles with resampling every 5 filtering cycles and three different bandwidth parameters: 0.2, 0.4, and 0.8.


1

This can be shown by noting that the matrix 𝗧(𝗧^T 𝗪^−1 𝗧)^−1 𝗧^T 𝗪^−1 is the orthogonal projection operator, with respect to the 𝗪^−1 metric, onto the linear subspace spanned by vectors whose components sum to zero.
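The projection property asserted here is easy to verify numerically. The sketch below uses a random 𝗧 and a random positive-definite 𝗪 (both assumed purely for illustration) and checks that 𝗧(𝗧^T 𝗪^−1 𝗧)^−1 𝗧^T 𝗪^−1 is idempotent, acts as the identity on span(𝗧), and is self-adjoint in the 𝗪^−1 inner product.

```python
import numpy as np

rng = np.random.default_rng(2)

# Random full-rank T and symmetric positive-definite W, for illustration.
n, r = 6, 3
T = rng.normal(size=(n, r))
A = rng.normal(size=(n, n))
W = A @ A.T + n * np.eye(n)
Winv = np.linalg.inv(W)

# The candidate projection operator from the footnote
P = T @ np.linalg.inv(T.T @ Winv @ T) @ T.T @ Winv

assert np.allclose(P @ P, P)                # idempotent: a projection
assert np.allclose(P @ T, T)                # identity on span(T)
assert np.allclose(Winv @ P, (Winv @ P).T)  # self-adjoint in <x, W^-1 y>
```

Self-adjointness in the 𝗪^−1 inner product is equivalent to 𝗪^−1𝗣 being symmetric, which is the last check above.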

2

It is important to note that the filter error covariance matrix is not 𝗣_k but cov(x_k^i; w_k^i) + 𝗣_k. The first term represents the dispersion of the particles and the second the covariance matrix associated with the particles. When 𝗣_k is approximated by a rank-(N − 1) matrix, cov(x_k^i; w_k^i) and 𝗣_k are factorized as 𝗟_k(𝗧^T 𝗪_k^−1 𝗧)^−1 𝗟_k^T and 𝗟_k 𝗨_k 𝗟_k^T, respectively, and therefore the filter error covariance matrix is factorized as 𝗟_k 𝗩_k 𝗟_k^T.
