1. Introduction
The Kalman filter provides the optimal (minimum variance) solution of the linear-Gaussian sequential data assimilation problem (Kalman 1960). Since most dynamical and/or observational systems encountered in practice are nonlinear, the system equations are often linearized about the most recent estimate, leading to the popular, but no longer optimal, extended Kalman (EK) filter. Several studies have demonstrated, however, that the linearization of the system may produce instabilities, even divergence, when applied to strongly nonlinear systems (Gauthier et al. 1993; Evensen 1992). For the latter case, an optimal solution can be obtained from the optimal nonlinear filter, which involves the estimation of the conditional probability density function (PDF), not necessarily Gaussian, of the system state given all available measurements up to the estimation time (Doucet et al. 2001). Knowledge of the state’s PDF allows the determination of estimates of the system state, such as the minimum-variance estimate or the maximum a posteriori estimate, following the Bayesian estimation theory (Todling 1999). Similar to the Kalman filter, the nonlinear filter operates as a succession of a correction (or analysis) step at measurement times to correct the predictive density using Bayes’ rule, and a prediction step to propagate the analysis density to the time of the next available observation.
The particle filter is a discrete approximation of the optimal nonlinear filter and is based on a point-mass representation (a mixture of Dirac distributions), called particles, of the state’s PDF (Doucet et al. 2001). In this filter, the particles evolve in time with the numerical model and their assigned weights are updated each time new measurements are available. The filter solution is then the weighted average of the particle ensemble. In practice, this filter suffers from a major problem known as the degeneracy phenomenon: after only a few iterations, the weights become concentrated on very few particles, so that only a tiny fraction of the ensemble contributes to the average, very often causing the divergence of the filter. The use of more particles alleviates this problem over short time periods only, so the most efficient way to get around it is resampling (Doucet et al. 2001). This technique consists of drawing new particles according to the distribution of the ensemble and then assigning them equal weights. However, resampling often introduces Monte Carlo fluctuations, which degrade the filter’s performance. Additionally, even with resampling, a large number of particles is required for effective behavior of the filter. This makes brute-force implementation of the particle filter problematic with computationally expensive atmospheric and oceanic models. Interesting discussions on the use of the optimal nonlinear filter for high-dimensional oceanic and atmospheric data assimilation problems can be found in Anderson and Anderson (1999), Kivman (2003), and Van Leeuwen (2003).
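For readers unfamiliar with the mechanics, the weight update, the degeneracy diagnostic, and multinomial resampling can be sketched in a few lines. This is a generic textbook bootstrap scheme, not the algorithm developed in this paper; all function names are ours.

```python
import numpy as np

def particle_update(particles, weights, y, obs_op, obs_cov):
    """One analysis step of a bootstrap particle filter: multiply each
    weight by the Gaussian observation likelihood and renormalize."""
    innov = y - np.array([obs_op(p) for p in particles])      # (N, m)
    Rinv = np.linalg.inv(obs_cov)
    loglik = -0.5 * np.einsum("ij,jk,ik->i", innov, Rinv, innov)
    w = weights * np.exp(loglik - loglik.max())               # avoid underflow
    return w / w.sum()

def effective_sample_size(weights):
    """Degeneracy diagnostic: N_eff = 1 / sum(w_i^2); a value much
    smaller than N means only a few particles carry the mass."""
    return 1.0 / np.sum(weights ** 2)

def resample(particles, weights, rng):
    """Multinomial resampling: duplicate heavy particles, drop light
    ones, and reset all weights to 1/N."""
    idx = rng.choice(len(weights), size=len(weights), p=weights)
    return particles[idx], np.full(len(weights), 1.0 / len(weights))
```

After a sharp observation, `effective_sample_size` drops well below N, which is exactly the degeneracy discussed above.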
The popular ensemble Kalman (EnK) filter, which has been introduced by Evensen (1994), is also a particle-based filtering technique. However, while it has the same prediction step as the particle filter, it does not have the same correction step. The EnK filter retains the “linearity aspect” of the Kalman filter in the analysis step, in that it applies the Kalman correction using a forecast error covariance computed as the sample covariance of the particle ensemble. Its correction step therefore uses only the first two moments of the particle ensemble, and is thus suboptimal for non-Gaussian systems. In practical situations however, the EnK filter was found to be more robust than the particle filter when small-size ensembles were considered because the Kalman update of its particles significantly reduces the risk of ensemble degeneracy by pulling the particles toward the true state of the system (Kivman 2003; Van Leeuwen 2003).
In this paper, we introduce a new approximate solution of the optimal nonlinear filter suitable for applications in oceanography and meteorology. The filter makes use of a mixture of Gaussian distributions in a kernel representation to approximate the state PDFs in the optimal nonlinear filter. A Gaussian mixture has already been used by Anderson and Anderson (1999) and Chen and Liu (2000) in the context of the nonlinear filter. It is expected to provide a more reliable representation of the state PDFs than the simple mixture of Dirac functions used in the particle filter. Here, we further assume that the covariance matrix of the Gaussian mixtures has low rank to avoid manipulating huge matrices associated with the large dimension of the oceanic and atmospheric systems. This is a very common approach in the atmospheric and oceanic Kalman filtering community (e.g., Fukumori and Malanotte-Rizzoli 1995; Cane et al. 1996; Cohn and Todling 1996; Pham et al. 1997; Heemink et al. 2001; Lermusiaux and Robinson 1999; Pham 2001; Farrell and Ioannou 2001a; Hoteit et al. 2002, 2003, 2005), and implicitly assumes that state estimation errors can be accurately modeled in a severely reduced dimensional subspace (Lermusiaux and Robinson 1999; Hamill et al. 2002). It enforces a smooth analysis since the filter correction is only applied using the leading modes of the analysis covariance matrix (Cane et al. 1996; Lermusiaux and Robinson 1999). Smoothness in the analysis is produced by assuming relatively large spatial scales for the uncertainty in the starting conditions for each forecast step, which leads to a concentration of variance in only a few modes of the covariance matrix (Cane et al. 1996; Cohn and Todling 1996; Lermusiaux and Robinson 1999).
In addition, the dissipative and driven nature of geophysical fluid systems also concentrates the energy at large scales, implying a red spectrum of variability (Daley 1991) or, for others, the existence of a low-dimensional attractor (Pham et al. 1997; Lermusiaux and Robinson 1999). In practice, a red spectrum is often indistinguishable from a low-dimensional attractor, as both can be efficiently described by a limited number of functions or modes (West and Mackey 1991; Osborne and Pastorello 1993). For simplicity, we refer to this as a system with a limited number of “effective” degrees of freedom (EDOF). Additionally, as the covariance matrices of the Gaussian mixtures are kept small during the filter operations, a local linearization about the center of the mixture components (the particles) is applied. This leads to a Kalman-type correction for each particle complementing the usual particle-type correction. The resulting filter, called the low-rank kernel particle Kalman (LRKPK) filter, basically runs an ensemble of low-rank Kalman filters and then provides the optimal (minimum variance) analysis state as the weighted mean of all the subfilters’ analyses. As in the EnK filter, the Kalman-type correction attenuates the degeneracy problem, which allows the filter to efficiently operate with relatively small-size ensembles (roughly of the same order as the EnK filter). A similar approach was implemented by Houtekamer and Mitchell (1998), who used a pair of ensemble Kalman filters to deal with the problem of inbreeding, generally associated with the use of small-size ensembles. The LRKPK filter is first tested with the simple but chaotic and highly nonlinear Lorenz model (Lorenz 1963). This model is a simplified form of the complicated system describing the dynamics of fluid motion and heat flow in terms of three coupled ordinary differential equations.
Assimilation results from a realistic application with a general circulation Princeton Ocean Model (POM) of the Mediterranean Sea are then reported and discussed.
The paper is organized as follows. The characteristics of the optimal nonlinear filter are briefly recalled in section 2. The LRKPK filter is then introduced in section 3 and its algorithm is summarized in section 4. The design and preliminary assimilation results of numerical experiments are then presented in sections 5 and 6. A general discussion concludes the paper in section 7.
2. The optimal nonlinear filter
Starting from a random initial condition x0 with a known probability density function, the optimal nonlinear filter provides the conditional density function of the system state xk at each time tk given all available measurements up to tk. To simplify the notation, we shall write y1:k as a shorthand for y1, . . . , yk. Let pk|k−1(·|y1:k−1) be the conditional (predictive) density function of xk given y1:k−1, and let pk(·|y1:k) be the conditional (analysis) density function of xk given y1:k. The nonlinear filtering algorithm consists of two steps, which we summarize below. The reader is referred to Doucet et al. (2001) for an extensive description of the filter.
- The prediction step. Suppose that the required analysis density pk−1(·|y1:k−1) at time tk−1 is available. The prediction step involves using the model Eq. (1) to obtain the predictive density pk|k−1(·|y1:k−1) at the time of the next available observation tk via the Chapman–Kolmogorov equation

  pk|k−1(x|y1:k−1) = ∫ℝn p(x|xk−1 = u) pk−1(u|y1:k−1) du,

  where p(x|xk−1 = u) is the conditional density of the state vector xk to be at x at time tk given that it was at u at time tk−1. Under the assumptions made on the model noise, p(xk|xk−1 = u) = ϕ[x − 𝗠k(u); 𝗤k], where ϕ[·; Σ] denotes the Gaussian density of zero mean and covariance matrix Σ. Thus,

  pk|k−1(x|y1:k−1) = ∫ℝn ϕ[x − 𝗠k(u); 𝗤k] pk−1(u|y1:k−1) du.

- The correction step. After a new observation yk has been made, we recover the analysis density pk(·|y1:k) at time tk using the Bayes rule,

  pk(x|y1:k) = pk|k−1(x|y1:k−1) ϕ[yk − 𝗛k(x); 𝗥k] / bk.

  The analysis density is therefore obtained by multiplying the prior predictive density by the observation likelihood and normalizing by bk = ∫ℝn pk|k−1(u|y1:k−1)ϕ[yk − 𝗛k(u); 𝗥k] du to ensure a probability density.
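For a scalar state, the two steps can be carried out exactly on a discretized state space, which makes the recursion concrete. The sketch below is our own illustration with Gaussian model and observation noise; as discussed next, this brute-force evaluation is not feasible in high dimension.

```python
import numpy as np

def gauss(x, s2):
    """Scalar Gaussian density of zero mean and variance s2."""
    return np.exp(-0.5 * x ** 2 / s2) / np.sqrt(2.0 * np.pi * s2)

def predict(p, grid, model, q2):
    """Chapman-Kolmogorov step on a 1-D grid: integrate the analysis
    density against the Gaussian transition density phi[x - M(u); Q]."""
    dx = grid[1] - grid[0]
    trans = gauss(grid[:, None] - model(grid)[None, :], q2)  # (nx, nx)
    p_new = trans @ p * dx
    return p_new / (p_new.sum() * dx)   # renormalize (grid truncation)

def correct(p, grid, y, obs, r2):
    """Bayes rule: multiply the predictive density by the observation
    likelihood phi[y - H(x); R] and renormalize by b_k."""
    post = p * gauss(y - obs(grid), r2)
    dx = grid[1] - grid[0]
    return post / (post.sum() * dx)
```

One predict/correct cycle on a 601-point grid already requires a 601 x 601 transition matrix; the cost grows exponentially with the state dimension.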
While the expressions of the state PDFs can be easily obtained, determining the value of the predictive density at each point in state space is practically impossible for large dimensional systems, as in meteorology and oceanography. This actually requires the evaluation of 𝗠k(x) for a large set of values of x, while even a single evaluation can be quite costly in realistic applications. Particle filters approximate the state PDFs by convex mixtures of Dirac functions. In the next section, we will resort to the kernel method to approximate the state PDFs by mixtures of Gaussian distributions.
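A kernel estimate of this kind replaces each Dirac mass with a Gaussian of small bandwidth. A minimal sketch, assuming (as the filter below will) a common covariance h2𝗣 for all kernels; the function name is ours:

```python
import numpy as np

def kernel_density(x, particles, weights, h2, P):
    """Weighted Gaussian-kernel estimate of the state PDF: a mixture
    of Gaussians with common covariance h2 * P centered on the
    particles. particles: (N, n); weights sum to one; x: (n,) point."""
    n = particles.shape[1]
    C = h2 * P
    Cinv = np.linalg.inv(C)
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(C))
    d = x[None, :] - particles
    expo = -0.5 * np.einsum("ij,jk,ik->i", d, Cinv, d)
    return float(weights @ np.exp(expo)) / norm
```

Setting h2 to zero (formally) recovers the Dirac mixture of the particle filter; a positive bandwidth smooths the estimate between particles.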
3. The low-rank kernel particle Kalman filter
a. The initialization step
In practice, very little information is available on the distribution of the initial state vector px1. However, this is often not a serious problem since several studies have found that px1 does not need to be set with high accuracy, as it does not have a significant impact on the long-term behavior of the filter (Doucet et al. 2001). In atmospheric and oceanic applications, it is important that the initial estimate of px1 takes into account the main physical quantities that govern the evolution of the state of these systems. Omitting such quantities may badly affect the filter’s behavior, as they tend to persist over time. We therefore estimate the statistics of px1 from a sample of model outputs. In the absence of prior information, we follow the common practice and assume px1 to be Gaussian (Doucet et al. 2001). Under the assumption of a low-rank covariance matrix, a second-order exact drawing can be performed as described by Pham (2001) to sample the xi1|0 such that their mean and sample covariance matrix exactly match the mean and 1/(1 + h2) times the covariance matrix of x1|0. The covariance matrix of the resulting initial density estimate p1|0 then matches that of px1.
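One way to realize such a second-order exact drawing is to build the ensemble anomalies from a random matrix with orthonormal columns, each orthogonal to the vector of ones. The sketch below is in the spirit of Pham (2001) but is not his published algorithm; names are ours.

```python
import numpy as np

def second_order_exact_sample(m, L, rng):
    """Draw N = r + 1 particles whose sample mean is exactly m and
    whose sample covariance is exactly L @ L.T (a rank-r factor).
    Omega is a random (N, r) matrix with orthonormal columns that all
    sum to zero, obtained from a QR factorization whose first basis
    vector is forced proportional to the ones vector."""
    n, r = L.shape
    N = r + 1
    G = rng.normal(size=(N, N))
    G[:, 0] = 1.0 / np.sqrt(N)        # first basis vector: ones/sqrt(N)
    Q, _ = np.linalg.qr(G)
    Omega = Q[:, 1:]                  # (N, r), columns sum to zero
    return m[:, None] + np.sqrt(N - 1.0) * (L @ Omega.T)
```

Because the columns of Omega sum to zero, the sample mean is exact; because they are orthonormal, the sample covariance is exactly L Lᵀ, with no Monte Carlo sampling error at second order.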
b. The correction step
The filter’s correction step can be interpreted as composed of two types of corrections: a Kalman-type correction defined by Eqs. (20)–(23) and a particle-type correction defined by Eqs. (24)–(25). The Kalman-type correction reduces the risk of degeneracy by pulling the particles toward the true state of the system Eq. (20). This can also be seen from Eq. (25), which has the same form as the standard particle reweighting equation, but uses the covariance matrix of the predictive measure Σk as the “observation covariance” matrix rather than 𝗥k, which is used in the standard particle filter. Since Σk is always greater than 𝗥k, the particles close to the observation will receive somewhat less weight than in the standard particle filter, while those far from the observation will receive more weight. This means that the support of the local predictive density and likelihood will be more coherent than in the particle filter. Resampling will therefore be needed less often, thus reducing Monte Carlo fluctuations.
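Schematically, the two corrections can be written as follows for a linear observation operator and a single covariance matrix 𝗣 shared by all particles (the actual filter carries low-rank factored covariances); the function name is ours.

```python
import numpy as np

def kpk_correction(X, w, y, H, R, P):
    """Sketch of the two-part correction: (i) a Kalman-type update
    pulling every particle toward the data, (ii) a particle-type
    reweight using the innovation covariance S = H P H^T + R (Sigma_k
    in the text) in place of R alone. X: (n, N) particles."""
    S = H @ P @ H.T + R
    Sinv = np.linalg.inv(S)
    K = P @ H.T @ Sinv                      # Kalman gain
    innov = y[:, None] - H @ X              # (m, N) innovations
    Xa = X + K @ innov                      # Kalman-type correction
    loglik = -0.5 * np.einsum("ji,jk,ki->i", innov, Sinv, innov)
    w = w * np.exp(loglik - loglik.max())   # particle-type correction
    return Xa, w / w.sum()
```

Because S is larger than R, the Gaussian in the reweighting is flatter than the raw observation likelihood, which is exactly why the weights stay closer to uniform than in the standard particle filter.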
c. The prediction step
d. The resampling step
Resampling is crucial in the particle filter to avoid the degeneracy of the particles. The same problem might also occur in the LRKPK filter, albeit to a lesser degree since the matrix Σk in Eq. (22) is greater than 𝗥k, which means that the weights are distributed more uniformly than in the case of the particle filter. Another problem that may occur in our filter is that the matrix 𝗣k+1|k in Eq. (31) is generally greater than 𝗣k, because of the presence of dynamical noise and/or the amplification effect of the multiplication by 𝗠k+1; hence the mixture covariance matrices might become too large to justify the approximations needed to derive Eqs. (18) and (30). We rely on resampling to reduce the size of 𝗣k+1|k. Note that a standard “full” resampling would require particle selection, which eliminates particles with low weights and duplicates particles with high weights. To avoid the Monte Carlo fluctuations associated with such selection, a “partial” resampling is used here when the weights are already sufficiently uniform. Partial resampling is less destructive (of information) than full resampling. It amounts to adding some noise to the particles, while the weights remain unchanged. This is similar to the resampling procedure of the standard particle filter, which adds noise to the particles to mimic the dynamical noise. In any case, since resampling inevitably entails some loss of accuracy, it should be used only when necessary.
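The choice between the two resampling modes can be sketched as follows; the effective sample size is used here as a stand-in for the paper’s uniformity criterion 𝗘k+1, and the names are ours.

```python
import numpy as np

def resample_step(X, w, noise_cov, threshold, rng):
    """Sketch of the partial/full resampling choice: when the weights
    are still near-uniform (effective sample size above threshold * N),
    only jitter the particles and keep the weights ("partial");
    otherwise select particles by weight first ("full").
    X: (n, N) particles; noise_cov: covariance of the added jitter."""
    n, N = X.shape
    noise = rng.multivariate_normal(np.zeros(n), noise_cov, N).T
    if 1.0 / np.sum(w ** 2) >= threshold * N:
        return X + noise, w                        # partial resampling
    idx = rng.choice(N, size=N, p=w)               # selection by weight
    return X[:, idx] + noise, np.full(N, 1.0 / N)  # full resampling
```

The partial branch leaves the weights untouched and only spreads the particles, which is the less destructive option described above.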
In practice, we found that h̃* can be quite small. As a result, the Kalman-type correction had little effect, and the filter behaved more like a particle filter, which is undesirable. As we care more about the behavior of the filter than the accuracy of the density estimator, it is of interest to focus on the reduction of Monte Carlo fluctuations at the expense of bias. Such bias does not cause great harm; it actually means that the predictive density pk+1|k, and therefore the particle ensemble, is more spread out than the true one, and this helps reduce the risk of degeneracy. Intuitively, as pk+1|k is made more diffusive, the filter relies more on recent observations than on the model and past observations. This has a similar effect to the widely used forgetting (or inflation) factor in Kalman filtering (Jazwinski 1970), and helps attenuate the propagation of the different sources of errors in the filter (such as Monte Carlo fluctuations, low-rank and model error approximations, and system linearization in our case). It is therefore quite reasonable to sacrifice some of the filter’s performance (on average) to reduce the risk of divergence. This means that it could be more beneficial to use a value of h larger than the “optimal” one. In the absence of a precise rule, we will consider h as a tunable parameter and try to select it empirically, by trial and error.
For partial resampling, we draw the new particles x̃ik+1|k according to the Gaussian density of mean xik+1|k and covariance matrix 𝗣k+1|k − h̃2Πk+1|k. The above calculations concerning the bias of the density estimator p̃k+1|k(·|y1:k) remain valid, but not those concerning the variance, since the x̃ik+1|k are no longer drawn from the same distribution. We therefore take h̃ = h̃*, and for the same reasons as above, we choose h a priori as a tuning parameter.
Note that after a full or a partial resampling, 𝗣k+1|k is reset to
4. Summary of the LRKPK filter algorithm
The filter’s algorithm is summarized below. Of course, the initialization step is only applied once. The forecast and correction steps must be repeated and resampling can be skipped or done partially. After every prediction or correction step, the minimum-variance estimate of the system state is obtained as the weighted mean of the predicted or the analyzed particles.
Initialization. Starting from an initial prediction state of mean m0 and covariance matrix 𝗣0, draw an ensemble x11|0, . . . , xN1|0 according to the Gaussian distribution ϕ[m0; 𝗣0/(1 + h2)]. Then set wi1|0 = 1/N for all i and take 𝗣1|0 = 𝗟1|0𝗨1|0𝗟T1|0, where 𝗟1|0 = 𝗫1|0𝗧 and 𝗨1|0 = h2(𝗧T𝗪−11|0𝗧)−1 = (h2/N)(𝗧T𝗧)−1.
Correction step. First compute yik|k−1 = 𝗛k(xik|k−1) for each i and determine Σk from Eq. (22). Then apply a
- Kalman-type correction: Compute 𝗚k from Eq. (21) using Eq. (26), and use it to correct the forecast particles xik|k−1 with the new observation yk via Eq. (20) to obtain the analysis particles xik. Then take 𝗟k = 𝗫k𝗧 and update 𝗨k by Eq. (28) so that 𝗣k = 𝗟k𝗨k𝗟Tk.
- Particle-type correction: Update the particle weights with Eq. (25).
Prediction step. Integrate each particle xik with the model to the time of the next available observation to determine the xik+1|k and keep the weights unchanged wik+1|k = wik. Then take 𝗟k+1|k = 𝗫k+1|k𝗧 and 𝗨k+1|k = 𝗨k.
Resampling step. Compute the matrix 𝗩k from Eq. (39) and its Cholesky decomposition 𝗖k𝗖Tk, and set h̃* as the square root of the smallest eigenvalue of 𝗖−1k𝗨k(𝗖−1k)T. Then, for each i, draw a random Gaussian vector ξik+1 of mean zero and covariance matrix 𝗨k − h̃*2𝗩k. If 𝗘k+1, as computed from Eq. (40), is smaller than a threshold η, perform a
- Partial resampling: Simply add the 𝗟k+1|kξik+1 to the xik+1|k to obtain the new particles x̃ik+1|k, that is, x̃ik+1|k = xik+1|k + 𝗟k+1|kξik+1, and set w̃ik+1|k = wik+1|k = wik.
Else, if 𝗘k+1 > η, perform a
- Full resampling: Select N particles among the xik+1|k according to their weights wik+1|k, then add 𝗟k+1|kξik+1 to the ith selection to obtain the new particle x̃ik+1|k, that is, 𝗫̃k+1|k = 𝗫k+1|k𝗦k+1 + 𝗟k+1|k[ξ1k+1, . . . , ξNk+1], where 𝗦k+1 is the selection matrix with exactly one nonzero element equal to one in each column, and set w̃ik+1|k = 1/N.
5. Application to the Lorenz model
Figure 1 plots the rms error of the LRKPK filter in the top panel and the corresponding solutions for the z component (not assimilated) in the lower panel. The resampling step was applied every two filtering cycles, and the resampling threshold and the parameter h were set to 0.25 and 0.9, respectively. It can be seen that the LRKPK filter does a very good job at tracking the reference solution while capturing all the model phase transitions. The rms error is also rather consistent over the entire assimilation window. It is very comparable to those obtained by Pham (2001) with the EnK filter using a larger number of particles. The filter was also able to provide reliable estimates for the z component of the model, showing an efficient propagation of information to a nonobserved variable. We must mention that the performance of the filter was quite sensitive to the choice of the tuning parameters of the resampling step. A bad choice of these parameters might result in much worse performance than that reported here; likewise, better performance could be achieved with different choices.
The results of this section suggest that the LRKPK filter exploits the limited number of EDOF of the Lorenz system to efficiently operate with a small number of particles. The next section will test the performance of the LRKPK filter with a realistic ocean general circulation model (OGCM).
6. Application to an OGCM
We present preliminary results from a first application of the LRKPK filter for the assimilation of synthetic sea surface height (SSH) data into an OGCM of the Mediterranean Sea, following a twin-experiment approach.
a. The ocean model
We use POM, which is a primitive-equations finite-difference model formulated under the hydrostatic and Boussinesq approximations. The model solves the 3D Navier–Stokes equations on an Arakawa-C grid with a numerical scheme that conserves mass and energy. The spatial differencing schemes are central and explicit in the horizontal and central and implicit in the vertical. Time stepping is achieved using a leapfrog scheme associated with an Asselin filter. The numerical computation is split into an external barotropic mode with a short time step [dictated by the Courant–Friedrichs–Lewy (CFL) condition] solving for the time evolution of the free surface elevation and the depth averaged velocities, and an internal baroclinic mode that solves for the vertical velocity shear. Horizontal mixing in the model is parameterized according to Smagorinsky (1963) while vertical mixing is calculated using the Mellor and Yamada 2.5 turbulence closure scheme. The model state vector is composed of all prognostic variables of the model at every sea grid point. The reader is referred to Blumberg and Mellor (1987) for a detailed description of POM.
The model domain covers the entire Mediterranean basin extending from 7°W to 36°E and 30° to 46°N and has one open boundary located at 7°W. Open boundary conditions were set as follows:
- Zero-gradient condition for the free surface elevation.
- Flather’s (1976) boundary condition for the normal barotropic velocity.
- Sommerfeld radiation condition for the internal baroclinic velocities.
- Upstream advection for temperature and salinity; when there is inflow through the boundary, these fields are prescribed from the Mediterranean Ocean Database (MODB) MODB-MED4 seasonal climatology.
The horizontal resolution is ¼° × ¼° with 25 sigma levels in the vertical, logarithmically distributed near the surface and the bottom. The number of grid points was therefore 175 × 65 × 25. The model bathymetry was obtained from the U.S. Navy Digital Bathymetric Data Bases (DBDB) DBDB5 and DBDB1 and is shown in Fig. 2. The surface forcing fields (monthly climatological wind stress, upward heat flux, net shortwave radiation, and evaporation rate) were derived from the 1979–1993 ECMWF global 1° × 1° 6-h reanalysis, except for the precipitation fields, which were derived from the Jaeger monthly climatology. Bulk formulas were used to compute the surface momentum, heat, and freshwater fluxes at each time step of the model integration, taking into account the SST predicted by the model itself. The model dynamics were first adjusted to achieve a perpetually repeating seasonal cycle before applying the interannual atmospheric forcing, by integrating the model climatologically for 20 yr. This climatological run was initialized with the MODB-MED4 spring temperature and salinity profiles, and the initial velocities were set to zero.
b. Experimental setup
1) Filter initialization
The filter is initialized by a Gaussian PDF of mean x1|0 and covariance matrix 𝗣1|0, respectively taken as the mean and sample covariance matrix of a large historical set 𝗛S of state vectors simulated from a long model run. A low-rank approximation of 𝗣1|0 is then retained, based on the EOF analysis described below.
The historical set 𝗛S was constructed as follows. The model was first integrated for a 2-yr period (1980–1981) starting from the end of the 20-yr spinup run to achieve a quasi adjustment of the model climatological dynamics to the ECMWF interannual forcing. Next, another integration of 4 yr (1982–1985) was carried out to generate 𝗛S by retaining one model output (state vector) every two days. Since the state vector is composed of variables of different nature, a multivariate EOF analysis was applied on the sampled set of 730 state vectors. In this analysis, the state variables were normalized by the standard deviation of each state variable spatially averaged over all sea grid points. About 50 EOFs were needed to account for 90% of the system variance. Given that the individual variance explained by the remaining EOFs was insignificant, this number, 50, provides an estimate of the upper bound for the EDOF of the system (Farrell and Ioannou 2001b). This suggests that a covariance matrix of rank 50, or even less, would likely provide a sufficiently accurate approximation of the covariance matrix of the Gaussian mixture used to approximate the PDF of the system state.
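The EOF truncation described above amounts to a singular value decomposition of the normalized anomaly matrix. A sketch, with a simplified per-row normalization in place of the paper’s per-variable-type normalization; names are ours:

```python
import numpy as np

def eof_truncation(HS, frac=0.90):
    """EOF analysis of a historical set via SVD of the anomaly matrix,
    keeping just enough modes to explain `frac` of the variance.
    HS: (n_state, n_samples). Normalization here is a simple per-row
    standard deviation; the paper normalizes by one spatially averaged
    std per state-variable type."""
    anom = HS - HS.mean(axis=1, keepdims=True)
    anom = anom / anom.std(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(anom, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    r = int(np.searchsorted(np.cumsum(explained), frac)) + 1
    return r, U[:, :r], explained
```

For a system whose variability is concentrated in a few modes, r comes out small, which is the EDOF argument underlying the low-rank assumption.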
2) Twin-experiments design
A reference model run was first carried out over a 1-yr period starting from 1 January 1986. A sequence of 73 reference states was formed by retaining one model output every 5 days. These states, considered as the “true states,” are used to extract the pseudoobservations and to evaluate the filter’s behavior by comparing them with the state vectors estimated by the filter. This allows the assessment of the filter’s performance with nonobserved variables. The assimilation experiments were then carried out over the same period, using pseudoobservations of SSH extracted from the reference states every 4 grid points. Independent Gaussian errors of zero mean and 3-cm standard deviation were added to the observations. All experiments were performed in a perfect-model context (𝗤k = 0), and the observational error covariance matrix 𝗥k is diagonal with (3 cm)2 diagonal elements. Another model run, initialized from the filter’s initial state estimate x1|0 and integrated over 1986 without any assimilation, was also performed to assess the relevance of the assimilation.
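The generation of the pseudoobservations can be sketched as follows; the function name and the choice of meters as the SSH unit are our assumptions.

```python
import numpy as np

def make_pseudo_obs(ssh_true, stride=4, sigma=0.03, rng=None):
    """Twin-experiment observations as described above: subsample the
    "true" SSH field every `stride` grid points and add independent
    Gaussian noise with standard deviation `sigma` (3 cm, in meters)."""
    rng = np.random.default_rng() if rng is None else rng
    obs = ssh_true[::stride, ::stride].copy()
    return obs + rng.normal(0.0, sigma, obs.shape)
```

On the 175 × 65 model grid, a stride of 4 leaves a 44 × 17 array of SSH observations per analysis time.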
Figure 3 plots the time evolution of the basin average kinetic energy during the experiments (initialization/EOF-analysis/twin-experiment) periods. The kinetic energy exhibits a strong seasonal cycle to which interannual anomalies are superimposed; the most important being that of year 1981, forced by the corresponding wind stress anomalies.
c. Assimilation results
In this section, the performance of the LRKPK filter is examined in a series of data assimilation experiments. We first present the overall behavior of the filter and then discuss the effect of varying the filter’s parameters, such as the rank of the mixture covariance matrices and the bandwidth parameter, on the filter’s performance (basically to determine the setup for the main experiment).
1) Main experiment
The LRKPK filter was implemented using 50 particles with resampling every 5 filtering cycles and a bandwidth parameter h = 0.4. The evolution of the RRMS for this run as a function of time is plotted in Fig. 4. The temporal development of the RRMS is characterized by a large reduction of the estimation error with respect to the free run at the first analysis step. Subsequent analyses are less pronounced, and the filter is able to stabilize the state estimation error at about 70% less than the free run. The assimilation is also shown to significantly improve the estimation of all model state variables with respect to the model free run throughout the assimilation window, and, as can be expected, the best assimilation results were achieved for the observed variable, SSH.
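RRMS is not defined in this excerpt; the sketch below assumes the common convention of normalizing the filter’s rms error by that of the free run, under which the reported 70% reduction corresponds to an RRMS of about 0.3.

```python
import numpy as np

def rrms(estimate, truth, free_run):
    """Relative rms error (assumed convention, not taken from the
    paper): rms of (estimate - truth) divided by rms of
    (free_run - truth). Values below 1 mean the filter beats the
    free run."""
    err = np.sqrt(np.mean((estimate - truth) ** 2))
    ref = np.sqrt(np.mean((free_run - truth) ** 2))
    return err / ref
```

By construction, the free run itself scores exactly 1, which makes the metric easy to read off a time series plot.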
To evaluate the filter’s performance in capturing the variability of the model, Fig. 5 plots the spatial distribution of the SSH rms estimation error for (a) the free run and (b) the filter’s run with respect to the reference run. Centers of large errors (6 cm) are located in the central Balearic basin and the Tyrrhenian Sea within the western Mediterranean and the central Ionian within the eastern part of the basin. Most of these errors are related to the variability of the Atlantic waters (AW) current flowing along the north African coasts within the Western Mediterranean and the Atlantic-Ionian (AI) stream, which is the continuation of this current within the Ionian basin. The filter greatly improves the solution for the SSH with respect to the free run. In particular, the general variability as well as the mean position of the AW current and the AI stream were efficiently captured by the assimilation system. This resulted in a drastic reduction of the SSH misfits below 2 cm all over the Mediterranean basin. It is also important to examine the filter’s behavior in the intermediate and deep ocean layers to make sure that the surface observations were properly assimilated by the model and to assess the efficiency of the filter in propagating this information to the deep layers. Figure 6 shows the spatial distribution of the mean rms estimation error for (a) the free run, and (b) the filter’s run, for a zonal temperature transect along 33.75°N. Temperature misfits for the free run are concentrated between the surface and 300 m, which is approximately the depth of the Levantine waters produced annually within the eastern Mediterranean. In the particular section shown here, most of the error (reaching 1.2°C) is located within the central Ionian basin and is related to the SSH misfits shown before (Korres and Lascaratos 2003). 
At all depths, the filter significantly improves the estimation of the temperature with respect to the model free run, particularly in the eastern basin, where large filter/data misfits were completely removed. This suggests that the filter is highly capable of propagating surface-only altimetric information to nonobserved variables.
Overall, the filter was able to significantly improve the estimation of all model state variables with respect to the model free run, while efficiently propagating surface altimetric information to the deep ocean.
2) Sensitivity with respect to the rank of the mixture covariance matrices
Sensitivity assimilation experiments were carried out to examine the effect of varying the rank r of the Gaussian mixture covariance matrices on the filter’s behavior. The rank r determines the number of particles N (=r + 1) to be used in the filter, and hence sets the computational cost of the assimilation system.
Figure 7 shows the time evolution of the RRMS for the model state variables as they result from the LRKPK filter using mixture covariance matrices with three different numbers of particles: 30, 50, and 100. In all these experiments, the bandwidth parameter was set to h = 0.4 and resampling was performed every 5 filtering steps. These plots suggest that an ensemble with a limited number of 30 particles provides reliable estimates for the assimilated variable (SSH) at reasonable computational cost. For the same variable, the RRMS obtained using 50 and 100 particles are not significantly different. Concerning the estimation of nonobserved variables, the filter shows some weaknesses when small-size ensembles are used, although the overall performance is still reasonable. This suggests that the multivariate propagation of assimilated information requires well-resolved covariance matrices between SSH and the other variables, which does not seem feasible with only 30 particles. The use of more particles significantly attenuates this problem and stabilizes the filter’s behavior for all model state variables throughout the assimilation window. It further allows for more degrees of freedom (larger covariance matrix ranks), which enables a better fit to the data. In this particular system, the benefits of doubling the number of particles from 50 to 100 were not significant. Considering the computational burden associated with the increase in the number of particles, the LRKPK filter can be implemented with 50 particles. This is a small ensemble for a nonlinear filter, even for this model configuration, which has a rather limited number of EDOF. This is consistent with the results of section 5, suggesting that the LRKPK filter exploits the limited number of EDOF of the system to efficiently operate with a small number of particles.
3) Sensitivity with respect to resampling parameters
Following the discussion in section 3d, two parameters are chosen empirically in the LRKPK filter: the resampling frequency m (in terms of filtering cycles) and the bandwidth parameter h, which adjusts the “size” of the mixture covariance matrices after resampling. It is therefore necessary to conduct assimilation experiments to evaluate the sensitivity of the filter’s performance with respect to these parameters.
Three sensitivity experiments were run with different resampling frequencies, m = 3, 5, and 10, all with h = 0.4 and 30 particles to reduce the computational burden. Their results (not shown here) revealed very little difference in the filter’s performance between resampling every 3 and every 5 filtering cycles (slightly better performance was obtained with m = 5), and both generally provided better results than the run with resampling every 10 filtering cycles. This suggests that resampling is generally useful to limit the overdispersion of the particles, but can sometimes be omitted because the Kalman-type correction reduces the risk of degeneracy.
We then present the results of three filter runs with different values of the bandwidth parameter: h = 0.2, 0.4, and 0.8. Again, to save computing time, the filter was implemented with 30 particles in all these runs. The time evolution of the RRMS resulting from the three assimilation runs is shown in Fig. 8. These plots suggest that the best performance was obtained with h = 0.4. They support our discussion in section 3d about the appropriate choice of h: not too large, to justify the approximations in Eqs. (18) and (30), and not too small, to retain the benefit of the Kalman-type correction. This “best” value of h is, however, only indicative, as the appropriate choice depends on the properties of the assimilation system (even on the size of the ensemble) and may differ for another system or setup.
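The role of the two resampling parameters can be illustrated with a minimal kernel (Gaussian-jitter) resampling sketch in the spirit of Pham (2001): every m cycles, particles are redrawn according to their weights and perturbed with covariance scaled by h². The use of the h-scaled weighted ensemble covariance below is an assumption; the LRKPK filter's actual mixture covariance construction is given in section 3d of the paper.

```python
import numpy as np

def kernel_resample(particles, weights, h, rng):
    """Draw a new, equally weighted ensemble: multinomial resampling by weight,
    then a Gaussian perturbation with covariance (h**2) times the weighted
    ensemble covariance, so h adjusts the 'size' of the mixture covariance."""
    n, dim = particles.shape
    idx = rng.choice(n, size=n, p=weights)             # resample by weight
    cov = np.atleast_2d(np.cov(particles, rowvar=False, aweights=weights))
    jitter = rng.multivariate_normal(np.zeros(dim), (h ** 2) * cov, size=n)
    new_particles = particles[idx] + jitter
    new_weights = np.full(n, 1.0 / n)                  # reset to uniform weights
    return new_particles, new_weights

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))                           # 50 particles, 3 state variables
w = np.full(50, 1.0 / 50)
xr, wr = kernel_resample(x, w, h=0.4, rng=rng)
print(xr.shape)
```

Applied every m filtering cycles, this step limits weight degeneracy at the cost of the Monte Carlo jitter controlled by h, which is the trade-off explored in the sensitivity runs above.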
7. Summary and discussion
Most analysis schemes of current sequential data assimilation techniques are based on Gaussian distributions of the model state. However, for nonlinear models, the distribution of the model state is not Gaussian even when the system statistics are Gaussian. This means that the above assimilation schemes are only suboptimal (more precisely, they are only optimal among linear analysis estimators). The solution of the nonlinear data assimilation problem is well known and is provided by the optimal nonlinear filter, which theoretically offers a simple method to estimate the PDF of the system state. Several attempts to develop a discrete algorithm for an efficient implementation of this filter have been presented, mainly based on point-mass representations of the state PDFs (particle filters). Besides being computationally prohibitive for high-dimensional systems, these filters suffer greatly from the degeneracy of their particles, which very often causes the divergence of the filter.
A new approximate solution of the optimal nonlinear filter suitable for applications in meteorology and oceanography has been presented. A pilot implementation in the simple Lorenz model and in a test case assimilating pseudoaltimetric data into a realistic ocean general circulation model was shown, which preliminarily demonstrates its feasibility. The new filter, called the low-rank kernel particle Kalman (LRKPK) filter, is based on a Gaussian mixture representation of the state PDFs, complemented by a local linearization of the system around the mean of each Gaussian component. With applications to high-dimensional oceanic and atmospheric systems in mind, the covariance matrix of the Gaussian mixture was further assumed to be of low rank, and the local linearization was replaced by a “one vector” linearization. This resulted in a new filter in which the standard particle-type correction of the particles’ weights is complemented by a Kalman-type correction, similar to that of the popular ensemble Kalman filter but using the covariance matrix of the mixture instead of the sample covariance matrix of the ensemble. As in the ensemble Kalman filter, the Kalman correction attenuates the degeneracy of the particles by pulling them toward the true state of the system, which enables the filter to operate efficiently with ensembles of reasonable size. Combined with the low-rank approximation, which avoids the manipulation of full-size covariance matrices, this makes the filter usable for oceanic and atmospheric data assimilation problems.
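The two-stage analysis step described above can be sketched as follows, in a deliberately simplified full-rank form: the low-rank factorization and “one vector” linearization of the paper are omitted, the observation operator is taken to be linear, and all names are illustrative rather than the paper's notation.

```python
import numpy as np

def kpk_analysis(particles, weights, P, H, R, y):
    """One analysis step of a (full-rank, linear-observation) kernel particle
    Kalman scheme: each particle receives a Kalman-type correction built from
    the shared mixture covariance P, and the weights receive the standard
    particle-type (likelihood) update."""
    # Kalman-type correction: pulls every particle toward the observation y.
    S = H @ P @ H.T + R                         # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)              # gain from the mixture covariance
    innov = y - particles @ H.T                 # per-particle innovations
    corrected = particles + innov @ K.T

    # Particle-type correction: reweight by each particle's Gaussian
    # innovation likelihood (covariance S), then normalize.
    logw = -0.5 * np.einsum('ij,jk,ik->i', innov, np.linalg.inv(S), innov)
    w = weights * np.exp(logw - logw.max())
    w /= w.sum()

    P_a = (np.eye(P.shape[0]) - K @ H) @ P      # analysis mixture covariance
    return corrected, w, P_a

# Toy example: 2 particles in a 2-variable state, observing the first variable.
x = np.array([[0.0, 0.0], [2.0, 0.0]])
w0 = np.array([0.5, 0.5])
P, H, R = np.eye(2), np.array([[1.0, 0.0]]), np.array([[1.0]])
y = np.array([2.0])
xa, wa, Pa = kpk_analysis(x, w0, P, H, R, y)
```

The key point the sketch illustrates is the interplay summarized above: the particle far from the observation is both pulled toward it (Kalman-type correction) and down-weighted (particle-type correction), which is what attenuates degeneracy relative to a pure particle filter.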
The chosen test situation was that of a Princeton Ocean Model (POM) configuration of the Mediterranean Sea within which several mesoscale eddies interact. Under these testing conditions, the LRKPK filter was found to be fairly effective in monitoring the flow state and its evolution using surface-only pseudoaltimetric data. Further work will consider more complex situations, both from the model point of view (different model setups with more strongly nonlinear configurations) and through the assimilation of real data, for example from the Ocean Topography Experiment (TOPEX)/Poseidon satellite. A close theoretical and practical comparison between the LRKPK filter and the popular ensemble Kalman filter is also of interest to assess the relevance of the nonlinear analysis step and will be pursued in the near future. This preliminary application was a necessary step before realistic applications are undertaken, and it provided encouraging results toward that purpose.
Acknowledgments
We thank Dr. Bruce Cornuelle for valuable comments and discussions.
REFERENCES
Anderson, J., and S. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127 , 2741–2758.
Blumberg, A. F., and G. L. Mellor, 1987: A description of a three-dimensional coastal ocean circulation model. Three-Dimensional Coastal Ocean Circulation Models, N. S. Heaps, Ed., Coastal Estuarine Science Series, Vol. 4, Amer. Geophys. Union, 1–16.
Cane, M. A., A. Kaplan, R. N. Miller, B. Tang, E. C. Hackert, and A. J. Busalacchi, 1996: Mapping tropical Pacific sea level: Data assimilation via a reduced state Kalman filter. J. Geophys. Res., 101 , 22599–22617.
Chen, R., and J. Liu, 2000: Mixture Kalman filters. J. Roy. Stat. Soc. A, 62 , 493–508.
Cohn, S. E., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc. Japan, 74 , 63–75.
Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 471 pp.
Doucet, A., N. de Freitas, and N. Gordon, 2001: Sequential Monte Carlo Methods in Practice. Springer-Verlag, 581 pp.
Evensen, G., 1992: Using the extended Kalman filter with a multilayer quasi-geostrophic ocean model. J. Geophys. Res., 97 , 17905–17924.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 , 10143–10162.
Farrell, B. F., and P. J. Ioannou, 2001a: State estimation using a reduced-order Kalman filter. J. Atmos. Sci., 58 , 3666–3680.
Farrell, B. F., and P. J. Ioannou, 2001b: Accurate low-dimensional approximation of the linear dynamics of fluid flow. J. Atmos. Sci., 58 , 2771–2789.
Flather, R. A., 1976: A tidal model of the northwest European continental shelf. Mem. Soc. Roy. Sci. Liege, 10 , 141–164.
Fukumori, I., and P. Malanotte-Rizzoli, 1995: An approximate Kalman filter for ocean data assimilation: An example with an idealized Gulf Stream model. J. Geophys. Res., 100 , 6777–6794.
Gauthier, P., P. Courtier, and P. Moll, 1993: Assimilation of simulated wind Lidar data with a Kalman filter. Mon. Wea. Rev., 121 , 1803–1820.
Hamill, T. M., C. Snyder, and R. E. Morss, 2002: Analysis-error statistics of a quasigeostrophic model using three-dimensional variational assimilation. Mon. Wea. Rev., 130 , 2777–2790.
Heemink, A. W., M. Verlaan, and A. J. Segers, 2001: Variance reduced ensemble Kalman filtering. Mon. Wea. Rev., 129 , 1718–1728.
Hoteit, I., D-T. Pham, and J. Blum, 2002: A simplified reduced-order Kalman filtering and application to altimetric data assimilation in tropical Pacific. J. Mar. Syst., 36 , 101–127.
Hoteit, I., D-T. Pham, and J. Blum, 2003: A semi-evolutive filter with partially local correction basis for data assimilation in oceanography. Oceanol. Acta, 26 , 511–524.
Hoteit, I., G. Korres, and G. Triantafyllou, 2005: Comparison of extended and ensemble based Kalman filters with low and high resolution primitive equation ocean models. Nonlinear Processes Geophys., 12 , 755–765.
Hoteit, I., G. Triantafyllou, and G. Korres, 2007: Using low-rank ensemble Kalman filters for data assimilation with high dimensional imperfect models. J. Numer. Anal. Ind. Appl. Math., 2 , 67–78.
Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126 , 796–811.
Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 376 pp.
Kalman, R. E., 1960: A new approach to linear filtering and prediction problems. Trans. ASME, J. Basic Eng., 82D , 35–45.
Kaplan, J. L., and J. A. Yorke, 1979: Preturbulence: A regime observed in a fluid flow model of Lorenz. Commun. Math. Phys., 67 , 93–108.
Kivman, G., 2003: Sequential parameter estimation for stochastic systems. Nonlinear Processes Geophys., 10 , 253–259.
Korres, G., and A. Lascaratos, 2003: An eddy resolving model of the Aegean and Levantine basins for the Mediterranean Forecasting System Pilot Project (MFSPP): Implementation and climatological runs. Ann. Geophys., 21 , 205–220.
Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Part I: Theory and schemes. Mon. Wea. Rev., 127 , 1385–1407.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20 , 130–141.
Osborne, A. R., and A. Pastorello, 1993: Simultaneous occurrence of low-dimensional chaos and colored random noise in nonlinear physical systems. Phys. Lett. A, 181 , 159–171.
Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129 , 1194–1207.
Pham, D. T., J. Verron, and M. C. Roubaud, 1997: Singular evolutive Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16 , 323–340.
Preisendorfer, R. W., 1988: Principal Component Analysis in Meteorology and Oceanography. Developments in Atmospheric Science Series, Vol. 17, Elsevier, 425 pp.
Silverman, B. W., 1986: Density Estimation for Statistics and Data Analysis. Chapman and Hall, 175 pp.
Smagorinsky, J., 1963: General circulation experiments with the primitive equations. I: The basic experiment. Mon. Wea. Rev., 91 , 99–164.
Todling, R., 1999: Estimation theory and foundations of atmospheric data assimilation. Data Assimilation Office, Goddard Space Flight Center, DAO Office Note 1999–01, 187 pp.
Van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131 , 2071–2084.
West, B. J., and H. J. Mackey, 1991: Geophysical attractors may be only colored noise. J. Appl. Phys., 69 , 6747–6749.
This can be shown by noticing that the matrix 𝗧(𝗧ᵀ𝗪⁻¹𝗧)⁻¹𝗧ᵀ𝗪⁻¹ is the orthogonal projection operator, with respect to the metric 𝗪⁻¹, onto the linear subspace spanned by vectors whose components sum to zero.
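The projection property invoked in this note can be verified numerically. In the check below, 𝗧 is replaced by a generic full-column-rank matrix (in the paper its columns span the zero-sum subspace) and 𝗪 by a random symmetric positive definite metric; both are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 3
T = rng.normal(size=(n, r))          # stand-in for the paper's matrix T
A = rng.normal(size=(n, n))
W = A @ A.T + n * np.eye(n)          # random symmetric positive definite metric
Winv = np.linalg.inv(W)

# Pi = T (T^T W^-1 T)^-1 T^T W^-1
Pi = T @ np.linalg.inv(T.T @ Winv @ T) @ T.T @ Winv

assert np.allclose(Pi @ Pi, Pi)                 # idempotent: a projection
assert np.allclose(Winv @ Pi, (Winv @ Pi).T)    # self-adjoint w.r.t. W^-1: orthogonal
assert np.allclose(Pi @ T, T)                   # identity on the range of T
```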
It is important to note that the filter error covariance matrix is not 𝗣ₖ but cov(xᵢᵏ; wᵢᵏ) + 𝗣ₖ. The first term represents the dispersion of the particles and the second the covariance matrix associated with the particles. When 𝗣ₖ is approximated by a rank-(N − 1) matrix, cov(xᵢᵏ; wᵢᵏ) and 𝗣ₖ are factorized as 𝗟ₖ(𝗧ᵀ𝗪ₖ⁻¹𝗧)⁻¹𝗟ₖᵀ and 𝗟ₖ𝗨ₖ𝗟ₖᵀ, respectively, and therefore the filter error covariance matrix is factorized as 𝗟ₖ𝗩ₖ𝗟ₖᵀ.