• Altaf, U., T. Butler, T. Mayo, C. Dawson, A. Heemink, and I. Hoteit, 2014: A comparison of ensemble Kalman filters for storm surge assimilation. Mon. Wea. Rev., 142, 2899–2914, doi:10.1175/MWR-D-13-00266.1.

• Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.

• Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642, doi:10.1175/1520-0493(2003)131<0634:ALLSFF>2.0.CO;2.

• Anderson, J. L., 2010: A non-Gaussian ensemble filter update for data assimilation. Mon. Wea. Rev., 138, 4186–4198, doi:10.1175/2010MWR3253.1.

• Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436, doi:10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.

• Bowler, N. E., J. Flowerdew, and S. R. Pring, 2013: Tests of different flavors of EnKF on a simple model. Quart. J. Roy. Meteor. Soc., 139, 1505–1519, doi:10.1002/qj.2055.

• Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.

• Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162, doi:10.1029/94JC00572.

• Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367, doi:10.1007/s10236-003-0036-9.

• Evensen, G., 2009: The ensemble Kalman filter for combined state and parameter estimation. IEEE Control Syst., 29, 83–104, doi:10.1109/MCS.2009.932223.

• Frei, M., and H. R. Kunsch, 2013: Mixture ensemble Kalman filters. Comput. Stat. Data Anal., 58, 127–138, doi:10.1016/j.csda.2011.04.013.

• Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757, doi:10.1002/qj.49712555417.

• Golub, G., and C. F. Van Loan, 1996: Matrix Computations. 3rd ed. Johns Hopkins University Press, 728 pp.

• Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

• Hestenes, M. R., and W. Karush, 1951: A method of gradients for the calculation of the characteristic roots and vectors of a real symmetric matrix. J. Res. Natl. Bur. Stand., 47, 45–61.

• Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. J. Mar. Syst., 36, 101–127, doi:10.1016/S0924-7963(02)00129-X.

• Hoteit, I., G. Korres, and G. Triantafyllou, 2005: Comparison of extended and ensemble based Kalman filters with low and high-resolution primitive equations ocean models. Nonlinear Processes Geophys., 12, 755–765, doi:10.5194/npg-12-755-2005.

• Hoteit, I., X. Luo, and D. T. Pham, 2012: Particle Kalman filtering: A nonlinear Bayesian framework for ensemble Kalman filters. Mon. Wea. Rev., 140, 528–542, doi:10.1175/2011MWR3640.1.

• Kalman, R., 1960: A new approach to linear filtering and prediction problems. J. Fluids Eng., 82, 35–45, doi:10.1115/1.3662552.

• Karush, W., 1951: An iterative method for finding characteristic vectors of a symmetric matrix. Pac. J. Math., 1, 233–248, doi:10.2140/pjm.1951.1.233.

• Korres, G., I. Hoteit, and G. Triantafyllou, 2007: Data assimilation into a Princeton Ocean Model of the Mediterranean Sea using advanced Kalman filters. J. Mar. Syst., 65, 84–104, doi:10.1016/j.jmarsys.2006.09.005.

• Lawson, W., and J. Hansen, 2004: Implications of stochastic and deterministic filters as ensemble-based data assimilation methods in varying regimes of error growth. Mon. Wea. Rev., 132, 1966–1981, doi:10.1175/1520-0493(2004)132<1966:IOSADF>2.0.CO;2.

• Lei, J., P. Bickel, and C. Snyder, 2010: Comparison of ensemble Kalman filters under non-Gaussianity. Mon. Wea. Rev., 138, 1293–1306, doi:10.1175/2009MWR3133.1.

• Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.

• Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. Physica D, 238, 549–562, doi:10.1016/j.physd.2008.12.003.

• Maybeck, P., 1979: Square root filtering. Stochastic Models, Estimation, and Control, Vol. 1, Academic Press, 368–410.

• Nerger, L., W. Hiller, and J. Schroter, 2005: A comparison of error subspace Kalman filters. Tellus, 57A, 715–735, doi:10.1111/j.1600-0870.2005.00141.x.

• Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207, doi:10.1175/1520-0493(2001)129<1194:SMFSDA>2.0.CO;2.

• Sakov, P., and P. Oke, 2008: A deterministic formulation of the ensemble Kalman filter: An alternative to ensemble square root filters. Tellus, 60A, 361–371, doi:10.1111/j.1600-0870.2007.00299.x.

• Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490, doi:10.1175/1520-0493(2003)131<1485:ESRF>2.0.CO;2.

• Wan, E., and R. van der Merwe, 2000: The unscented Kalman filter for nonlinear estimation. Proc. IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symp., Lake Louise, Alberta, Canada, IEEE, 153–158, doi:10.1109/ASSPCC.2000.882463.

• Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.
Fig. 1. Two examples of observation perturbation distributions [fitted with a kernel density estimator (KDE)] as produced by the proposed filter (EnKFesops) for all observed variables at a given analysis step in data assimilation experiments with the Lorenz-96 model, compared with the ideal observation error distribution. (left) Filter run with no inflation, no localization, 30 members, and assimilation of all model variables; (right) filter run with inflation (), localization (length scale ), 30 members, and assimilation of every second model variable.

Fig. 2. Time-averaged RMSE as a function of the localization length scale (x axis) and inflation factor (y axis). The (left) EnKFreg, (middle) EnKFser, and (right) EnKFesops are implemented with 10 members and assimilation of observations from (top) all model variables and (bottom) half of the variables at every model time step (or 6 h in real time). A logarithmic color scale is used to emphasize the low RMSE values. The minimum averaged RMSEs are indicated by asterisks, and their associated values are given in the maps. White boxes indicate divergence of the filter.

Fig. 3. As in Fig. 2, but for 30 ensemble members.

Fig. 4. As in Fig. 2, but for 30 ensemble members and assimilation every four model time steps.

Fig. 5. As in Fig. 2, but for EnSRF, DEnKF, and EnKFesops.

Fig. 6. Minimum average RMSE for all tested filters (EnKFreg, EnKFser, EnKFesops, EnSRF, and DEnKF) as a function of the ensemble size. (left) All variables and (right) every other variable are observed at every model time step (or 6 h in real time).

Fig. 7. Time evolution of the average ensemble spread as it results from EnKFreg, EnKFser, EnKFesops, EnSRF, and DEnKF implemented with 30 ensemble members, as a function of (from bottom to top) localization length scale (lc) and inflation factor (ic) and (left to right) spatial and temporal observation network densities. Full time and sparse time correspond to the assimilation of observations at every model time step and every four model time steps, respectively.

Fig. 8. Time evolution of the average ensemble spread over the first-year assimilation period as it results from EnKFreg, EnKFser, EnKFesops, EnSRF, and DEnKF implemented with 30 ensemble members for their best performances in terms of RMSE with respect to the choices of inflation and localization. Full time and sparse time correspond to assimilation of observations at every model time step and every four model time steps, respectively.

Fig. 9. Time evolution of the average ensemble spread (dark, thick colors) and the corresponding RMSEs (light, thin colors) over the first-year assimilation period, from the same single run, for the best performance (with respect to the choice of inflation and localization) of each filter implemented with 30 ensemble members under the different observation scenarios. Full time and sparse time correspond to assimilation of observations at every model time step and every four model time steps, respectively.


Mitigating Observation Perturbation Sampling Errors in the Stochastic EnKF

• 1 King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
• 2 Centre National de la Recherche Scientifique, Grenoble, France
• 3 King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, and Nansen Environmental and Remote Sensing Center, Bergen, Norway
• 4 International Research Institute of Stavanger, Bergen, Norway

Abstract

The stochastic ensemble Kalman filter (EnKF) updates its ensemble members with observations perturbed with noise sampled from the distribution of the observational errors. This was shown to introduce noise into the system, and it may become pronounced when the ensemble size is smaller than the rank of the observational error covariance, which is often the case in real oceanic and atmospheric data assimilation applications. This work introduces an efficient serial scheme to mitigate the impact of observation perturbation sampling in the analysis step of the EnKF, which should provide more accurate ensemble estimates of the analysis error covariance matrices. The new scheme is simple to implement within the serial EnKF algorithm, requiring only the approximation of the EnKF sample forecast error covariance matrix by a matrix of one rank less. The new EnKF scheme is implemented and tested with the Lorenz-96 model. Numerical experiments are conducted to compare its performance with those of the standard EnKF and two standard deterministic EnKFs. This study shows that the new scheme enhances the behavior of the EnKF and may lead to better performance than the deterministic EnKFs, even when implemented with relatively small ensembles.

Corresponding author address: I. Hoteit, 4700 King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia. E-mail: ibrahim.hoteit@kaust.edu.sa


1. Introduction

The ensemble Kalman filter (EnKF) (Evensen 2003) is widely used for data assimilation into geophysical models. The filter was introduced by Evensen (1994) as a variant of the Kalman filter (KF) (Kalman 1960) designed for large-scale nonlinear models. Like any variant of the KF, it operates sequentially in two steps: a forecast step and an analysis step. In the forecast step, the model is integrated forward in time starting from the current estimate of the system state and the associated error covariance matrix to compute the forecast state and its error covariance. In the analysis step, incoming observations are used to update the forecast in order to obtain a more accurate estimate of the system state. This new (analysis) state vector and its error covariance are then used for the next forecast step. When the model is linear, the forecast step is straightforward and requires only standard matrix operations. In the nonlinear case, however, this step becomes more challenging, as it requires, in principle, propagating the entire analysis error distribution through the model over the whole state space. The extended Kalman filter (EKF) avoids this difficulty by linearizing the model around the state estimate and then applying the KF to the resulting linearized system. In contrast, the EnKF uses a set of vectors in the state space (or ensemble members, in the terminology of the EnKF) to represent the state estimate error distribution. This same ensemble framework is also used by other ensemble-based Kalman filters, including the singular evolutive interpolated Kalman (SEIK) filter (Pham 2001), the ensemble transform Kalman filter (ETKF) (Bishop et al. 2001), the ensemble adjustment Kalman filter (EAKF) (Anderson 2001), the ensemble square root filter (EnSRF) (Whitaker and Hamill 2002), the deterministic EnKF (DEnKF) (Sakov and Oke 2008), and the unscented Kalman filter (UKF) (Wan and van der Merwe 2000; Luo and Moroz 2009).
This framework is expected to be more accurate than simple linearization, and we will not discuss it further as it was extensively investigated in several previous studies (e.g., Evensen 1994; Hoteit et al. 2002, 2005; Korres et al. 2007). Here, we mainly focus on the EnKF analysis step as presented by Burgers et al. (1998).

In the common case of linear observations, the analysis step is optimally implemented, in the sense of minimizing the analysis error variance, by the KF. The ensemble-based filters follow the KF to update the ensemble mean and covariance and then deterministically update (or resample, in the case of SEIK) a new ensemble whose sample mean and covariance exactly match those of the KF based on the ensemble-estimated background covariance. The EnKF, on the other hand, tries to match the same Kalman analysis mean and covariance by individually updating the ensemble members with stochastically perturbed observations using the KF gain matrix (still based on the ensemble-estimated background covariance). This, however, requires the observations to be perturbed by random noise that has exactly the same first two statistical moments as those of the observational error distribution (Burgers et al. 1998). Consequently, the EnKF came to be referred to as the stochastic EnKF (Tippett et al. 2003). An important advantage of the EnKF analysis step is that it readily provides an analysis ensemble for forecasting, randomly sampled from the assumed Gaussian analysis distribution, avoiding the deterministic updating step that may distort the features of the forecast ensemble distribution, as in the other ensemble-based KFs without perturbations (Lei et al. 2010).

The first two moments of the EnKF analysis may only asymptotically match those of the KF (Evensen 2003). In this sense, the EnKF analysis step always introduces noise during the filter's analysis. This may become pronounced in typical oceanic and atmospheric data assimilation applications, where the rank of the observational error covariance matrix is much larger than the size of the ensemble (or the number of ensemble members), since in this case the observational error covariance will be strongly undersampled by the perturbations (Nerger et al. 2005; Altaf et al. 2014). Spurious correlations between the observation perturbations and the forecast perturbations could also lead to errors in the EnKF ensemble estimates of the analysis error covariance matrices (Pham 2001; Bowler et al. 2013). In these cases, it might be more appropriate to use an ensemble filter that does not require perturbing the observations. In the opposite case, that is, when the ensemble size is larger than the number of observations, the EnKF was shown to perform better than other ensemble KFs without perturbations in many situations (Anderson 2010; Hoteit et al. 2012). Lawson and Hansen (2004) argued that the stochastic EnKF has the advantage that the observation perturbations tend to "re-Gaussianize" the ensemble distribution, which may explain its improved stability compared to ensemble filters without perturbations that slavishly follow the shape of the background distribution. Lei et al. (2010) also demonstrated that the EnKF can be relatively more stable in certain circumstances, especially in the presence of wild outliers in the data.

In this work, we introduce a new EnKF algorithm to mitigate the sampling error of the observational error distribution in the stochastic EnKF. Whitaker and Hamill (2002) and Sakov and Oke (2008), respectively, proposed two deterministic ensemble KFs, the EnSRF and the DEnKF, which exactly yield the first two moments of the KF analysis (based on the ensemble-estimated background covariance) while separately updating the ensemble perturbations using a modified Kalman gain. The DEnKF approximates the Kalman gain, and this may affect the distribution of the newly sampled members. The EnSRF modification of the Kalman gain is just a scaling; it should not be particularly harmful to the distribution of the newly sampled members, except insofar as it inherits the KF's assumption that the distributions are Gaussian, so that they can be simply scaled rather than calculating the full Bayesian product (Lawson and Hansen 2004). Two methods were also previously proposed for mitigating the effects of perturbed observations. Anderson (2003) suggested pairing the analysis ensemble members with the forecast ensemble members in the observation space. By doing so, the same set of analysis states can be achieved while minimizing the analysis increments for each ensemble member (Bowler et al. 2013). Pham (2001) and Bowler et al. (2013) proposed schemes for removing spurious correlations between the observation and forecast perturbations. These modified (stochastic) EnKFs improved performance but still did not provide better results than those obtained with some ensemble KFs without perturbations (Bowler et al. 2013).

This paper is organized as follows: Section 2 describes the EnKF analysis step and discusses in detail the observational error sampling noise in the EnKF. The new method to mitigate this noise is introduced in section 3, and the resulting algorithm is summarized in section 3c. Results of numerical experiments with the Lorenz-96 model are presented in section 4. A summary of the main results followed by a general discussion concludes the paper in section 5.

2. The analysis step in the EnKF

a. Analysis step and approximations

The analysis step of the KF starts from a forecast $\mathbf{x}^f$ of the state vector and the associated error covariance matrix $\mathbf{P}^f$. The forecast is then updated with a vector $\mathbf{y}$ of incoming observations:
  $\mathbf{y} = \mathbf{H}\mathbf{x}^t + \boldsymbol{\epsilon}$,  (1)
where $\mathbf{x}^t$ is the true state vector and $\boldsymbol{\epsilon}$ is the observational error. This yields the analysis state vector $\mathbf{x}^a$:
  $\mathbf{x}^a = \mathbf{x}^f + \mathbf{K}(\mathbf{y} - \mathbf{H}\mathbf{x}^f)$,  (2)
where $\mathbf{H}$ is the observation matrix (i.e., $\mathbf{H}\mathbf{x}^f$ are the forecasted observations), and $\mathbf{K}$ is a matrix called the gain. We consider here only the case where the observations are linear functions of the state variables. The case of a nonlinear observation operator $h(\cdot)$ can be handled by augmenting the state vector with $h(\mathbf{x})$ (Anderson 2003). The new state vector becomes $[\mathbf{x}^T, h(\mathbf{x})^T]^T$, and the observation operator is again linear, composed of the last components of the augmented vector.
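The augmentation described above can be illustrated with a toy example (the nonlinear observable h below is hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical nonlinear observables of a 3-variable state
def h(x):
    return np.array([x[0] * x[1], np.sin(x[2])])

x = np.array([1.0, 2.0, 0.5])
x_aug = np.concatenate([x, h(x)])                  # augmented state [x; h(x)]
H_aug = np.hstack([np.zeros((2, 3)), np.eye(2)])   # linear operator selecting h(x)

assert np.allclose(H_aug @ x_aug, h(x))            # the observation is now linear
```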
The analysis error can be decomposed as
  $\mathbf{x}^a - \mathbf{x}^t = (\mathbf{I} - \mathbf{K}\mathbf{H})(\mathbf{x}^f - \mathbf{x}^t) + \mathbf{K}\boldsymbol{\epsilon}$.  (3)
Thus, assuming $\boldsymbol{\epsilon}$ is independent of the forecast error $\mathbf{x}^f - \mathbf{x}^t$, the analysis error has a covariance matrix
  $\mathbf{P}^a = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{P}^f(\mathbf{I} - \mathbf{K}\mathbf{H})^T + \mathbf{K}\mathbf{R}\mathbf{K}^T$,  (4)
where $\mathbf{R}$ denotes the covariance matrix of the observation error $\boldsymbol{\epsilon}$. This formula is valid for any gain matrix. Hereafter, we only consider the Kalman gain matrix
  $\mathbf{K} = \mathbf{P}^f\mathbf{H}^T(\mathbf{H}\mathbf{P}^f\mathbf{H}^T + \mathbf{R})^{-1}$,  (5)
which minimizes the trace of $\mathbf{P}^a$ among all gain matrices. For this gain, there are alternative, better-known formulas to compute $\mathbf{P}^a$:
  $\mathbf{P}^a = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{P}^f$  (6)
  $\phantom{\mathbf{P}^a} = \mathbf{P}^f - \mathbf{P}^f\mathbf{H}^T(\mathbf{H}\mathbf{P}^f\mathbf{H}^T + \mathbf{R})^{-1}\mathbf{H}\mathbf{P}^f$.  (7)
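The equivalence of these covariance formulas for the Kalman gain of Eq. (5) can be checked numerically. The following is a minimal NumPy sketch (dimensions and the random covariances are illustrative, not taken from the paper), writing the Joseph form of Eq. (4) and the two shorter forms of Eqs. (6) and (7) in their standard shape:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 5  # state and observation dimensions (illustrative)

# Random SPD forecast covariance Pf and observation-error covariance R
A = rng.standard_normal((n, n)); Pf = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p)); R = B @ B.T + p * np.eye(p)
H = rng.standard_normal((p, n))  # linear observation operator

K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)               # Eq. (5)
I = np.eye(n)
Pa_joseph = (I - K @ H) @ Pf @ (I - K @ H).T + K @ R @ K.T   # Eq. (4)
Pa_short  = (I - K @ H) @ Pf                                 # Eq. (6)
Pa_expand = Pf - Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R) @ H @ Pf  # Eq. (7)

assert np.allclose(Pa_joseph, Pa_short)
assert np.allclose(Pa_short, Pa_expand)
```

Note that the Joseph form of Eq. (4) holds for any gain matrix, whereas the shorter forms are valid only for the optimal (Kalman) gain.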
In the EnKF, $\mathbf{x}^f$ and $\mathbf{P}^f$ are estimated as the sample mean and covariance matrix of $N$ ensemble members, $\mathbf{x}_1^f, \ldots, \mathbf{x}_N^f$:
  $\mathbf{x}^f = \frac{1}{N}\sum_{i=1}^{N}\mathbf{x}_i^f, \qquad \mathbf{P}^f = \frac{1}{r}\sum_{i=1}^{N}(\mathbf{x}_i^f - \mathbf{x}^f)(\mathbf{x}_i^f - \mathbf{x}^f)^T, \qquad r = N - 1$.  (8)
The analysis ensemble members are then obtained by applying the KF analysis Eq. (2) to each ensemble member $\mathbf{x}_i^f$, with noise $\mathbf{u}_i$ added to the observation $\mathbf{y}$; that is,
  $\mathbf{x}_i^a = \mathbf{x}_i^f + \mathbf{K}(\mathbf{y} + \mathbf{u}_i - \mathbf{H}\mathbf{x}_i^f)$,  (9)
where $\mathbf{K}$ is the Kalman gain in Eq. (5) based on $\mathbf{P}^f$, and $\mathbf{u}_i$ is a sample from the Gaussian distribution of mean zero and covariance matrix $\mathbf{R}$. The analysis state and its error covariance matrix are then taken as the sample mean and covariance of these ensemble members. Explicitly,
  $\mathbf{x}^a = \mathbf{x}^f + \mathbf{K}(\mathbf{y} + \bar{\mathbf{u}} - \mathbf{H}\mathbf{x}^f), \qquad \mathbf{P}^a = \frac{1}{r}\sum_{i=1}^{N}(\mathbf{x}_i^a - \mathbf{x}^a)(\mathbf{x}_i^a - \mathbf{x}^a)^T$,  (10)
where $\bar{\mathbf{u}}$ is the average of the $\mathbf{u}_i$. Subtracting the first part of Eq. (10) from Eq. (9), one obtains
  $\mathbf{x}_i^a - \mathbf{x}^a = (\mathbf{I} - \mathbf{K}\mathbf{H})(\mathbf{x}_i^f - \mathbf{x}^f) + \mathbf{K}(\mathbf{u}_i - \bar{\mathbf{u}})$,  (11)
and using Eq. (8) to recognize $\mathbf{P}^f$ in the first term of the expansion of the second part of Eq. (10),
  $\mathbf{P}^a = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{P}^f(\mathbf{I} - \mathbf{K}\mathbf{H})^T + \mathbf{K}\hat{\mathbf{R}}\mathbf{K}^T + (\mathbf{I} - \mathbf{K}\mathbf{H})\hat{\mathbf{C}}\mathbf{K}^T + \mathbf{K}\hat{\mathbf{C}}^T(\mathbf{I} - \mathbf{K}\mathbf{H})^T$,  (12)
with $\hat{\mathbf{R}} = \frac{1}{r}\sum_i(\mathbf{u}_i - \bar{\mathbf{u}})(\mathbf{u}_i - \bar{\mathbf{u}})^T$ and $\hat{\mathbf{C}} = \frac{1}{r}\sum_i(\mathbf{x}_i^f - \mathbf{x}^f)(\mathbf{u}_i - \bar{\mathbf{u}})^T$.
To match $\mathbf{x}^a$ and $\mathbf{P}^a$ with those that would result from the KF via Eqs. (2) and (4) based on $\mathbf{x}^f$ and $\mathbf{P}^f$, the perturbations $\mathbf{u}_i$ should be exactly sampled from a second-order draw; that is,
  $\bar{\mathbf{u}} = 0, \qquad \hat{\mathbf{R}} = \mathbf{R}$,  (13)
under the constraint that there is no cross correlation between the observation and the forecast ensemble, that is,
  $\hat{\mathbf{C}} = \frac{1}{r}\sum_{i=1}^{N}(\mathbf{x}_i^f - \mathbf{x}^f)\mathbf{u}_i^T = 0$.  (14)
This is possible only when the rank of the forecast perturbations matrix
  $\mathbf{S} = [\mathbf{x}_1^f - \mathbf{x}^f, \ldots, \mathbf{x}_N^f - \mathbf{x}^f]$  (15)
plus the rank of $\mathbf{R}$ does not exceed $r$ (Pham 2001). One way to see this is by noticing that the constraint Eq. (14) means that the vectors $\mathbf{u}_i$ must satisfy $m$ independent constraints, with $m$ being the rank of $\mathbf{S}$. Together with the constraint $\bar{\mathbf{u}} = 0$, there can be at most $r - m$ independent vectors in the set $\{\mathbf{u}_1, \ldots, \mathbf{u}_N\}$, and hence the matrix $\hat{\mathbf{R}}$ can be at most of rank $r - m$.
The matrix $\mathbf{S}$ is generally of rank $r$, so the above requirement is impossible to satisfy. Of course, when the rank of the matrix $\mathbf{R}$ does not exceed $r$, one can at least force the $\mathbf{u}_i$ to satisfy only the constraints in Eq. (13) [without Eq. (14)] using a second-order exact drawing (Pham 2001). In the more general case, one may also use a low-rank observational error covariance in the calculation of the Kalman gain to reduce the observational sampling errors (Evensen 2009). However, even in this case, the remaining sampling "error" from neglecting the cross-correlation terms Eq. (14) in the calculation of the EnKF analysis covariance,
  $(\mathbf{I} - \mathbf{K}\mathbf{H})\hat{\mathbf{C}}\mathbf{K}^T + \mathbf{K}\hat{\mathbf{C}}^T(\mathbf{I} - \mathbf{K}\mathbf{H})^T$,  (16)
is generally not globally small, as it may be composed of a large number of small elements that may add up. The impact of ignoring this error has also been investigated in some detail by Whitaker and Hamill (2002). It was reported that this may reduce the filter accuracy while increasing the probability of underestimating the analysis error covariance.
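The sampling error discussed above is easy to exhibit numerically. The sketch below (illustrative sizes, seed, and variable names are ours) applies the perturbed-observation update of Eq. (9) with an ensemble smaller than the number of observations and compares the resulting sample analysis covariance with the KF value of Eq. (6); the forecast/perturbation cross covariance entering Eq. (16) does not vanish for a finite ensemble:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, N = 10, 6, 8            # illustrative sizes: N - 1 < p, so R is undersampled

Xf = rng.standard_normal((n, N))                      # synthetic forecast ensemble
S = Xf - Xf.mean(axis=1, keepdims=True)               # perturbations, Eq. (15)
Pf = S @ S.T / (N - 1)                                # sample covariance, Eq. (8)

H = rng.standard_normal((p, n))                       # linear observation operator
R = np.eye(p)
y = rng.standard_normal((p, 1))

K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)        # Kalman gain, Eq. (5)
U = rng.multivariate_normal(np.zeros(p), R, size=N).T # Gaussian observation perturbations
Xa = Xf + K @ (y + U - H @ Xf)                        # perturbed-observation update, Eq. (9)

Pa_ens = np.cov(Xa)                                   # sample analysis covariance, Eq. (10)
Pa_kf = (np.eye(n) - K @ H) @ Pf                      # KF analysis covariance, Eq. (6)

# The mismatch comes from the sampled perturbation covariance differing from R
# and from the nonzero forecast/perturbation cross covariance of Eq. (16)
Chat = S @ (U - U.mean(axis=1, keepdims=True)).T / (N - 1)
mismatch = np.abs(Pa_ens - Pa_kf).max()
```

With a larger ensemble the mismatch shrinks only at the slow Monte Carlo rate, which is the motivation for constraining the perturbations instead.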

b. Discussion

The Kalman gain matrix depends on the exact knowledge of the forecast error covariance matrix $\mathbf{P}^f$. Errors in the computation of this matrix will certainly degrade the filter performance. Moreover, such errors are likely to propagate to subsequent steps of the filter. In an EnKF, errors in $\mathbf{P}^f$ may come from (i) model nonlinearity and the use of a limited number of ensemble members; (ii) errors in the sample covariance matrix of the ensemble members at the previous step; and (iii) the fact that, even if the analysis ensemble at the previous step has the "correct" sample covariance matrix, errors can still arise when this ensemble is not representative of the true (non-Gaussian) distribution of the analysis error.

The nonlinearity is the root cause of the problem; if the model is nonlinear, the true forecast covariance can be exactly computed only with an infinite number of ensemble members. In practice, this number is generally quite limited relative to the dimension of the state space, depending on the available computational resources. When the model is linear, however, it suffices that the sample covariance of the analysis ensemble at the previous step matches the true analysis covariance matrix. This is the reason behind the use of second-order exact sampling schemes in some ensemble-based KFs to draw the new analysis ensemble members with the mean and covariance matrix matching those computed by Eqs. (2)–(7). As long as the model is weakly nonlinear, exact knowledge of the distribution of the ensemble members is not very important if we are concerned only with second-order statistics. However, when the model is strongly nonlinear, the distribution of the ensemble members might affect the sample mean and covariance of the ensemble members at the next forecast step. In this case, it is preferable to have this ensemble as representative as possible, even crudely, of the true distribution of the analysis error. A deterministic resampling step might destroy any possible non-Gaussian features of this distribution, such as skewness, long tails, or bimodality. By contrast, the EnKF may better retain some of these features, as it directly samples its analysis ensemble from the ensemble-based estimate of the posterior distribution (Frei and Kunsch 2013). Below we introduce a new (serial) EnKF scheme that retains the assimilation of "perturbed observations" for each member of the ensemble, but in such a way as to make the associated sampling errors in the calculation of $\mathbf{P}^a$ in Eq. (12) as small as possible.

3. An EnKF with exact second-order observation perturbations sampling

a. Analysis

Suppose that the rank of the matrix $\mathbf{S}$, defined in Eq. (15), is $r - 1$ instead of $r$. Then, following the above discussion in section 2a, in the case where the observation is a scalar $y$ with error variance $\sigma^2$, it is possible to draw the observation perturbations such that the EnKF analysis vector and its sample error covariance matrix are exactly the same as those that would have been computed with the KF using Eqs. (2) and (4). More explicitly, since the matrix $\mathbf{S}$ has $N$ columns and is of rank $r - 1$, its kernel contains two independent vectors, with one of them already known to be the vector of ones $\mathbf{1}$ (the perturbations sum to zero). Thus, there exists a vector $\mathbf{b}$ of unit norm and orthogonal to $\mathbf{1}$, such that $\mathbf{S}\mathbf{b} = 0$. Simply choosing the perturbations in Eq. (9) to be equal to $u_i = \sigma\sqrt{r}\,b_i$, where $b_i$ is the $i$th component of $\mathbf{b}$, would thus satisfy the requirements in Eqs. (13) and (14). Hence, with this choice of perturbations, $\mathbf{x}^a$ and $\mathbf{P}^a$ coincide with those of the KF. Clearly, the choice $u_i = -\sigma\sqrt{r}\,b_i$ would also satisfy the constraints. We therefore take $u_i = s\,\sigma\sqrt{r}\,b_i$, with $s$ being a random plus or minus sign.

The interesting point is that after the analysis step of one scalar observation, the new ensemble members are also such that the analysis perturbations matrix
  $\mathbf{S}^a = [\mathbf{x}_1^a - \mathbf{x}^a, \ldots, \mathbf{x}_N^a - \mathbf{x}^a]$
is still of rank $r - 1$. Indeed, by Eq. (11),
  $\mathbf{S}^a = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{S} + s\,\sigma\sqrt{r}\,\mathbf{K}\mathbf{b}^T$.
Not only will we show that this matrix has rank $r - 1$ (at most), but also we can iteratively compute a new vector of unit norm, orthogonal to $\mathbf{1}$, that belongs to its kernel. Let us first try $\mathbf{b}$. Since $\mathbf{S}\mathbf{b} = 0$ and $\mathbf{b}^T\mathbf{b} = 1$,
  $\mathbf{S}^a\mathbf{b} = s\,\sigma\sqrt{r}\,\mathbf{K}$.  (17)
Let us next try $\mathbf{S}^T\mathbf{H}^T$:
  $\mathbf{S}^a\mathbf{S}^T\mathbf{H}^T = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{S}\mathbf{S}^T\mathbf{H}^T = r\,(\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{P}^f\mathbf{H}^T = r\,\sigma^2\,\mathbf{K}$,
where we used $\mathbf{b}^T\mathbf{S}^T = 0$, $\mathbf{S}\mathbf{S}^T = r\,\mathbf{P}^f$, and, from Eq. (5) with a scalar observation, $(\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{P}^f\mathbf{H}^T = \sigma^2\,\mathbf{K}$. Therefore, multiplying the equality Eq. (17) by $s\,\sigma\sqrt{r}$ and subtracting the result from the last equality, we find that the vector $\mathbf{S}^T\mathbf{H}^T - s\,\sigma\sqrt{r}\,\mathbf{b}$ belongs to the kernel of $\mathbf{S}^a$. Moreover, this vector is already orthogonal to $\mathbf{1}$ and has a norm $\sqrt{\|\mathbf{S}^T\mathbf{H}^T\|^2 + r\,\sigma^2}$. Therefore, the normalized vector
  $\mathbf{b}^a = \dfrac{\mathbf{S}^T\mathbf{H}^T - s\,\sigma\sqrt{r}\,\mathbf{b}}{\sqrt{\|\mathbf{S}^T\mathbf{H}^T\|^2 + r\,\sigma^2}}$  (18)
is the desired vector.

In the general case where the observation vector $\mathbf{y}$ is not scalar, we may assimilate the observations serially, that is, one at a time. It is known that serial assimilation is equivalent to "batch" assimilation, provided that the observation errors are uncorrelated (Maybeck 1979). If this is not the case, one can always apply a linear transformation to the data (e.g., $\mathbf{R}^{-1/2}\mathbf{y}$) such that the components of the transformed observation vector are uncorrelated. As shown above, after assimilating a scalar observation, the matrix $\mathbf{S}^a$ is still of rank $r - 1$, and moreover a vector of unit norm, belonging to its kernel and orthogonal to $\mathbf{1}$, is available. Therefore, one may continue and assimilate a new scalar observation to compute a new analysis vector and its covariance matrix, which again match those of the KF. This point is important since the equivalence between serial assimilation and batch assimilation is based on the fact that the analysis vector and its covariance matrix satisfy Eqs. (2)–(7). In the standard EnKF, this equivalence holds only approximately because the computed analysis error covariance matrix does not match that of the KF.
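The serial-equals-batch property invoked above can be checked for the KF update itself. A minimal sketch with illustrative random matrices and uncorrelated (diagonal) observation errors:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 6, 3                     # illustrative state and observation dimensions

A = rng.standard_normal((n, n))
Pf = A @ A.T + n * np.eye(n)    # random SPD forecast covariance
H = rng.standard_normal((p, n))
R = np.diag(rng.uniform(0.5, 2.0, size=p))   # uncorrelated observation errors
xf = rng.standard_normal(n)
y = rng.standard_normal(p)

# Batch update, Eqs. (2) and (6)
K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
xa_batch = xf + K @ (y - H @ xf)
Pa_batch = (np.eye(n) - K @ H) @ Pf

# Serial update: assimilate the p scalar observations one at a time
x, P = xf.copy(), Pf.copy()
for j in range(p):
    h = H[j:j + 1, :]                              # 1 x n observation row
    k = P @ h.T / (h @ P @ h.T + R[j, j]).item()   # scalar Kalman gain
    x = x + (k * (y[j] - (h @ x).item())).ravel()  # scalar form of Eq. (2)
    P = (np.eye(n) - k @ h) @ P                    # scalar form of Eq. (6)

assert np.allclose(x, xa_batch)
assert np.allclose(P, Pa_batch)
```

With correlated errors, the same loop applies after whitening the data as described above.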

The observation perturbations in the proposed scheme cannot be Gaussian because of the constraints involved. Two examples of the perturbation distributions produced by the proposed scheme in the data assimilation experiments with the Lorenz-96 model presented in section 4a are plotted in Fig. 1 for all observed variables. These show reasonable deviations from the ideal (prescribed) Gaussian distribution of the observation errors. There are, moreover, good reasons to believe that it is more important, from an ensemble Kalman filter performance point of view, to comply with the undersampling and independence constraints on the observation perturbations than with Gaussianity. One can indeed argue that the Gaussianity of the observational error is not crucial for an EnKF: without it, the Kalman filter update remains optimal among linear filters (Gauss–Markov theorem). Moreover, EnKF calculations involve only first- and second-order statistics and not the whole distribution. Our goal here is to make the ensemble members more representative of the forecast and analysis error distributions. In the EnKF, perturbations are introduced to "inflate" the analysis covariance matrix to the correct covariance. The effect of these perturbations on the shape of the ensemble distribution should generally be small, but they introduce Monte Carlo sampling errors in the mean and covariance statistics. By contrast, other ensemble-based filters update the ensemble members (or redraw them, in the SEIK) to match the Kalman analysis mean and covariance, which may distort the distribution of the forecast ensemble members. Our scheme follows the EnKF approach by retaining the original ensemble members' distribution and introducing perturbations to inflate the analysis covariance. The difference is that we constrain the perturbations so that this matrix matches the correct covariance exactly, as in the other ensemble-based Kalman filters.

Fig. 1.

Two examples of observations’ perturbation distributions [fitted by a kernel density estimator (KDE)] as produced by the proposed filter (EnKFesops) for all observed variables at a given analysis step in data assimilation experiments with the Lorenz-96 model, compared with the ideal observation error distribution. (left) Filter run with no inflation, no localization, 30 members, and assimilation of all model variables; (right) filter run with inflation, localization, 30 members, and assimilation of every second model variable.

Citation: Monthly Weather Review 143, 7; 10.1175/MWR-D-14-00088.1

Note that the scheme described above is meant only for the case where the ensemble size N minus 1 does not exceed the dimension of the state vector. This is always the case in large-scale data assimilation applications in meteorology and oceanography but may not be in some engineering applications. However, as discussed in section 2a, it is always possible to draw the observations’ perturbations second-order exactly and satisfy the “no correlation constraint” in Eq. (14) if the ensemble size minus 1 is at least equal to the rank of the forecast perturbations matrix defined in Eq. (15) plus the rank of the observational error covariance. Here the observation is scalar, and the rank of the forecast perturbations matrix is at most the dimension of the state vector, so the above condition is automatically fulfilled. The drawing of the observations’ perturbations in this case can be based on an algorithm described in Pham (2001).

b. Initialization and forecast

We initialize the filter from an analysis step. If an initial forecast ensemble is readily available, one may directly start by removing one rank from the ensemble following the singular value decomposition (SVD) procedure described below and then go to the analysis step. One may also generate the initial forecast ensemble based on some prior knowledge (without using the available observation at the initialization time). Appendix A describes a procedure to draw vectors according to a (supposedly) Gaussian prior distribution of mean and covariance matrix , under the constraints that their sample mean and covariance matrix coincide exactly with and . In this case, the initial should be of rank , and hence the matrix defined in Eq. (15) also has the same rank.

The “first” forecast ensemble members satisfy the rank requirement of the first analysis step, and the resulting analysis ensemble members satisfy the same rank requirement. However, when the analysis members are integrated forward with the model to compute the next forecast ensemble members (in the same way as in the EnKF), this rank property may be lost. An exception of course is when the model is linear, since applying the linear dynamical operator (a matrix in this case) to the analysis perturbations preserves the rank property. Therefore, if the nonlinearity is not too strong, one may expect the forecast perturbation matrix to still be nearly of the required rank, so that a small adjustment suffices to restore it exactly.

For a matrix $\mathbf{S}$ of rank $r$, the best rank $r-1$ approximation is obtained by first computing its SVD:
$$\mathbf{S} = \sum_{i=1}^{r} \lambda_i \mathbf{u}_i \mathbf{v}_i^{\mathrm{T}},$$
where $\lambda_1 \ge \cdots \ge \lambda_r > 0$ are the nonzero singular values ranked in decreasing order, and $\mathbf{u}_i$ and $\mathbf{v}_i$ are the associated normalized left and right singular vectors. The best rank $r-1$ approximation of $\mathbf{S}$ is then obtained by dropping the last term in the above sum, that is, by subtracting the (rank 1) matrix $\lambda_r \mathbf{u}_r \mathbf{v}_r^{\mathrm{T}}$ from $\mathbf{S}$. In the SVD, the vectors $\mathbf{u}_i$ and $\mathbf{v}_i$ are orthonormal and are, respectively, the eigenvectors of $\mathbf{S}\mathbf{S}^{\mathrm{T}}$ and $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ associated with the nonzero eigenvalues $\lambda_i^2$.
To simplify the notations, we drop the subscript r from $\lambda_r$, $\mathbf{u}_r$, and $\mathbf{v}_r$ and write $\mathbf{u}$ and $\mathbf{v}$ for the singular vectors associated with the smallest nonzero singular value $\lambda$. Since the matrix $\mathbf{S}$ usually has a very large number of rows, it is more efficient to compute $\mathbf{v}$ and $\lambda$ via the eigendecomposition of the (small) matrix $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ and then deduce $\mathbf{u} = \mathbf{S}\mathbf{v}/\lambda$. Therefore, the best rank $r-1$ approximation of $\mathbf{S}$ is obtained as $\mathbf{S} - \lambda\mathbf{u}\mathbf{v}^{\mathrm{T}}$. Replacing $\mathbf{S}$ by this matrix corresponds to the following adjustment of the forecast ensemble members:
$$\mathbf{x}_i^{f} \leftarrow \mathbf{x}_i^{f} - \sqrt{N-1}\,\lambda\, v_i\, \mathbf{u}, \qquad i = 1, \ldots, N,$$
where $v_i$ denotes the $i$th component of $\mathbf{v}$ and the arrow means “is substituted by.” The vector $\mathbf{v}$, being an eigenvector of $\mathbf{S}^{\mathrm{T}}\mathbf{S}$, is orthogonal to the eigenvector associated with the zero eigenvalue of this matrix. This means that the components of $\mathbf{v}$ sum to zero, and hence the above adjustment does not change the sample mean of the forecast ensemble members. Moreover, since $\mathbf{S}\mathbf{v} = \lambda\mathbf{u}$, after applying the above adjustment, one has
$$\mathbf{S}\mathbf{v} = \mathbf{0}.$$
Therefore, $\mathbf{v}$ is precisely the vector needed to proceed with the next analysis step, as discussed in section 3a.
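The rank-reduction procedure above can be sketched numerically as follows; the ensemble dimensions and the $1/\sqrt{N-1}$ normalization of the perturbation matrix are assumptions of this illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 50, 10                       # hypothetical state dimension, ensemble size
E = rng.standard_normal((n, N))     # forecast ensemble members as columns

xbar = E.mean(axis=1, keepdims=True)
S = (E - xbar) / np.sqrt(N - 1)     # forecast perturbation matrix, rank N-1

# Eigendecomposition of the small N x N matrix S^T S (ascending eigenvalues);
# the first eigenvalue is ~0 (ones vector in the kernel), and the eigenvector
# of the smallest NONZERO eigenvalue gives the smallest singular triple.
w, V = np.linalg.eigh(S.T @ S)
lam = np.sqrt(w[1])                 # smallest nonzero singular value
v = V[:, 1]
u = S @ v / lam                     # corresponding left singular vector

# Adjust the ensemble members, i.e., subtract the rank-1 term lam * u v^T
# from S; the components of v sum to zero, so the sample mean is preserved.
E_adj = E - np.sqrt(N - 1) * lam * np.outer(u, v)

S_adj = (E_adj - E_adj.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)
```

After the adjustment, `S_adj` has rank N − 2 and `S_adj @ v` vanishes, which is exactly what the next analysis step requires.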

c. Algorithm

This section presents in detail the implementation of the proposed algorithm, which we refer to as the EnKF with exact second-order observation perturbations sampling (EnKFesops). The algorithm is very similar to that of the serial EnKF, with just two additional operations: (i) an SVD of the forecast perturbations matrix after every forecast step to reduce its rank by one and compute the first vector $\mathbf{v}$ and (ii) a recursive update of this vector after each serial assimilation of a single observation in the analysis step, as in Eq. (18). We suppose that the observational error covariance matrix is diagonal and denote by $\mathbf{H}_j$ the $j$th row of the observation operator and by $y_j$ the $j$th component of the observation vector. The algorithm of the EnKFesops can be summarized in the following steps:

  • Initialization: One may initialize the filter from a readily available forecast ensemble or generate an ensemble from a Gaussian prior given a mean and a rank covariance matrix as described in appendix A.
  • Analysis (serial) step: Once an observation becomes available, set and for , .
    • For (loop over each observation):
      eq8
      End the loop on j.
    The signs are independent, random plus or minus signs. The update formula for $\mathbf{v}$ above comes from Eq. (18).
  • Forecast step: The analysis ensemble members are integrated forward with the model to compute the forecast ensemble members in the same way as in the EnKF, the forecast state estimate being their sample mean. The forecast members are then rank adjusted by
    eq9
    where is defined as in Eq. (15) and as described in section 3b, is the eigenvector of associated with the smallest nonzero eigenvalue, and is its ith component.
    • In the unlikely situation where the forecast perturbations matrix already has a reduced rank, the matrix $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ would have at least two orthogonal eigenvectors associated with the zero eigenvalue; one of them is the vector of ones, and the other can be taken as $\mathbf{v}$. In this case there is, in fact, no need for a rank adjustment, but one still needs $\mathbf{v}$ for the next analysis step.

The implementation of the EnKFesops is therefore straightforward and can easily be included in any readily available serial EnKF code. In terms of complexity, the EnKFesops algorithm has almost the same computational cost as the serial EnKF, with only the additional cost of the two operations for iteratively updating the vector $\mathbf{v}$ and computing the SVD to remove one rank from the forecast perturbations matrix, both negligible compared to the cost of integrating a large-scale dynamical (atmospheric or ocean) model. In appendix B we also propose and discuss the implementation of an efficient algorithm to directly compute the eigenvector associated with the smallest eigenvalue, further reducing the cost of this operation.
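For reference, the serial EnKF backbone into which these two operations would be inserted can be sketched as below. This is a generic perturbed-observation serial update with hypothetical dimensions; the EnKFesops-specific perturbation constraints, inflation, and localization are omitted:

```python
import numpy as np

def serial_enkf_analysis(E, H, y, r_diag, rng):
    """Serial stochastic EnKF analysis: assimilate scalar observations one
    at a time, updating every member with its own perturbed observation.
    E is the (n, N) forecast ensemble, H the (p, n) observation operator,
    y the p observations, and r_diag the diagonal of R."""
    n, N = E.shape
    for j in range(len(y)):
        hE = H[j] @ E                       # observed ensemble, shape (N,)
        hbar = hE.mean()
        var = ((hE - hbar) ** 2).sum() / (N - 1)
        cov = (E - E.mean(axis=1, keepdims=True)) @ (hE - hbar) / (N - 1)
        K = cov / (var + r_diag[j])         # Kalman gain for this scalar datum
        eps = rng.standard_normal(N) * np.sqrt(r_diag[j])
        eps -= eps.mean()                   # centered observation perturbations
        E = E + np.outer(K, y[j] + eps - hE)
    return E

rng = np.random.default_rng(2)
E = rng.standard_normal((8, 20)) * 2.0      # hypothetical forecast ensemble
H = np.eye(3, 8)                            # observe the first three variables
y = np.zeros(3)
Ea = serial_enkf_analysis(E, H, y, np.ones(3), rng)
```

The EnKFesops replaces the i.i.d. draw of `eps` by the constrained second-order exact sampling of section 3a and adds the SVD rank adjustment after each forecast step.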

4. Numerical experiments

a. Experimental setting

We use the Lorenz-96 (L96) model (Lorenz and Emanuel 1998) to test and evaluate the behavior of the proposed EnKFesops scheme. The L96 model mimics the time evolution of an atmospheric quantity and is governed by the following highly nonlinear set of differential equations:
$$\frac{dx_i}{dt} = \left(x_{i+1} - x_{i-2}\right) x_{i-1} - x_i + F, \qquad i = 1, \ldots, n,$$
where the nonlinear quadratic terms simulate advection and the linear term represents dissipation. The model was implemented here in its most common form: we considered $n = 40$ variables, the forcing term $F = 8$, and periodic boundary conditions; that is, we defined $x_{-1} = x_{n-1}$, $x_0 = x_n$, and $x_{n+1} = x_1$. For $F = 8$, disturbances propagate from low to high indices (west to east), and the model is chaotic (Lorenz and Emanuel 1998). The model was numerically integrated using the fourth-order Runge–Kutta scheme with a constant time step of 0.05 nondimensional units (which corresponds to 6 h in real-world time). A “truth” model run was first performed to obtain a set of reference states. The filters’ performances were then evaluated by how well they recover the reference states using a “perfect” forecast model with perturbed initial conditions and assimilating a set of (perturbed) observations extracted from the reference states.
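A minimal implementation of this model and integration scheme, assuming the conventional values $F = 8$ and a time step of 0.05 for this configuration, could read:

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz-96 right-hand side with periodic boundary conditions:
    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step (dt = 0.05 corresponds to ~6 h)."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

# Spin up a 40-variable "truth" trajectory from a slightly perturbed
# equilibrium, as in the experimental setting described above.
x = np.full(40, 8.0)
x[0] += 0.01
for _ in range(1000):
    x = rk4_step(x)
```

After the spinup, the state lies on the model attractor and can serve as the starting point for a reference run.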

To generate the filters’ initial (forecast) ensemble, the model was first integrated forward without assimilation for several years in real-world time. The starting forecast ensemble members were sampled by adding independent Gaussian random perturbations with unit variance to each variable of the mean state of this long model run. With this choice, the initial ensemble members are off the model attractor and have no clear relationship between spread and error. The spinup period was, however, long enough to remove any detrimental impact. All filters were implemented with covariance inflation and covariance localization as described by Whitaker and Hamill (2002). Localization was implemented using the fifth-order function of Gaspari and Cohn (1999), which behaves like a Gaussian function but reaches zero at a finite radius. For a given length scale, the correlation between two grid points becomes zero when the distance between them is greater than twice the length scale (Hamill et al. 2001).
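The Gaspari and Cohn (1999) fifth-order function has the standard piecewise-rational form sketched below; the convention that the weight vanishes at twice the length scale c matches the statement above:

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Fifth-order piecewise-rational function of Gaspari and Cohn (1999).
    `c` is the length scale; the function equals 1 at zero distance and
    reaches exactly zero at distance 2c."""
    z = np.abs(dist) / c
    f = np.zeros_like(z, dtype=float)
    m1 = z <= 1.0
    m2 = (z > 1.0) & (z <= 2.0)
    z1, z2 = z[m1], z[m2]
    # -z^5/4 + z^4/2 + 5 z^3/8 - 5 z^2/3 + 1            for 0 <= z <= 1
    f[m1] = (((-0.25 * z1 + 0.5) * z1 + 0.625) * z1 - 5.0 / 3.0) * z1**2 + 1.0
    # z^5/12 - z^4/2 + 5 z^3/8 + 5 z^2/3 - 5 z + 4 - 2/(3 z)  for 1 < z <= 2
    f[m2] = ((((z2 / 12.0 - 0.5) * z2 + 0.625) * z2 + 5.0 / 3.0) * z2
             - 5.0) * z2 + 4.0 - 2.0 / (3.0 * z2)
    return f
```

Multiplying the ensemble covariance elementwise by these weights, evaluated at the intergrid distances, yields the localized covariance used in the update.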

The experimental setup presented below is very similar to that of Whitaker and Hamill (2002), but we ran experiments assimilating the observations at every time step and, less frequently, at every fourth time step, which is equivalent to 1 day in real time, to mimic more common and challenging situations. Two different observational strategies were also considered: observations sampled from all model variables and every other model variable, using constant sampling intervals. Observations were perturbed with independent random noise of mean zero and unit variance. Accordingly, the observational error covariance was set to the identity matrix.

Experiments were performed over a period of 5 yr (or 7300 model steps), excluding an early spinup period of about 20 days. The results were averaged over 10 runs to reduce statistical fluctuations. Several longer assimilation runs were also performed to test the long-time behavior of the proposed EnKFesops. The resulting root-mean-square errors (RMSEs) from these long runs were very close to those obtained with the 5-yr runs. We therefore report only the results from the 5-yr assimilation runs to save computational time. All the systems were tested on the same random number sequences to narrow the confidence interval on the differences between them.

The RMSE between the reference states and the filters’ analyses averaged over the simulation period is used to evaluate the filters’ performance. Given a set of $n$-dimensional reference state vectors $\mathbf{x}_k^t$, $k = 1, \ldots, K$, with $K$ being the maximum time index, the time-averaged RMSE is defined as
$$\mathrm{RMSE} = \frac{1}{K} \sum_{k=1}^{K} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_{k,i}^{a} - x_{k,i}^{t}\right)^{2}},$$
where $\mathbf{x}_k^{a}$ is the analysis at time $k$. For a given inflation factor and covariance localization length scale, each filter run was repeated 10 times, each with randomly drawn initial ensemble and observational errors, and the average RMSE over these runs was reported as
$$\overline{\mathrm{RMSE}} = \frac{1}{10} \sum_{m=1}^{10} \mathrm{RMSE}_m.$$
To further assess the behavior of the filters, we also monitored the time evolution of the average ensemble spread (AES) of each filter, which we computed at every filtering step as
$$\mathrm{AES} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \sigma_i^{2}},$$
where $\sigma_i^{2}$ is the ensemble variance of the $i$th state variable.
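These two diagnostics can be computed as follows; the original formulas were typeset as equations, so the conventional definitions are assumed here:

```python
import numpy as np

def time_averaged_rmse(Xa, Xt):
    """Time-averaged RMSE between analyses Xa and reference states Xt,
    both of shape (K, n): the mean over time of the spatial RMSE."""
    return np.mean(np.sqrt(np.mean((Xa - Xt) ** 2, axis=1)))

def average_ensemble_spread(E):
    """Average ensemble spread of an (n, N) ensemble: the square root of
    the mean over variables of the per-variable ensemble variances."""
    return np.sqrt(np.mean(np.var(E, axis=1, ddof=1)))
```

A well-calibrated filter should produce an AES that tracks the actual RMSE, which is the comparison carried out in Figs. 7–9.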

b. Experiments results

In this section, we present and analyze assimilation results obtained using the proposed EnKFesops algorithm with the Lorenz-96 model. We also compare the performance of the EnKFesops with the EnKF implemented with serial assimilation of the data (EnKFser) and with the regular (batch) updating scheme (EnKFreg), as well as with the EnSRF and the DEnKF. Sensitivity experiments were also conducted using different ensemble sizes and different spatial and temporal observation frequencies to study the behavior of the filters under different experimental settings. To visualize the results, we use two-dimensional plots of the RMSE as a function of the inflation factor, varying between 1 and 1.2, and the localization length scale, varying between 1 (strong localization) and 40 (weak localization).

In the first experiment, we implement the EnKFreg, EnKFser, and EnKFesops with 10 members, and we assimilate observations at every model step (i.e., every 6 h). As shown in Fig. 2, when all 40 variables are observed, the serial and regular EnKFs exhibit similar performances in terms of RMSE, with a slight advantage to the EnKFreg. Both generally benefit from localization and inflation, with the best results obtained with inflation values between 1.05 and 1.15 and localization length scales between 10 and 25; both filters achieve a minimum average RMSE of 0.27. The proposed EnKFesops also benefits from inflation and localization but seems to be more sensitive to the choice of their parameters, requiring in particular more inflation than the standard and serial EnKFs. It nevertheless achieves the lowest minimum average RMSE, 0.26, although this difference is small and not statistically significant.

Fig. 2.

Time-averaged RMSE as a function of the localization length scale (x axis) and inflation factor (y axis). The (left) EnKFreg, (middle) EnKFser, and (right) EnKFesops are implemented with 10 members and assimilation of observations from (top) all model variables and (bottom) half of the variables at every model time step (or 6 h in real time). A logarithmic color scale is used to emphasize the low RMSE values. The minimum-averaged RMSEs are indicated by asterisks, and their associated values are given in the maps. White boxes indicate divergence of the filter.


When every second variable is assimilated (for a total of 20 variables), the performance of the filters slightly degrades and divergence eventually occurs (indicated by the white boxes in the figure), mainly with very small (and large) localization length scales and large inflation factors (Fig. 2). Overall, the results are consistent with those obtained when assimilating all model variables. The EnKFesops is more sensitive to the choice of inflation and localization but again achieves the minimum average RMSE of 0.42, compared to 0.46 and 0.44 for the EnKFreg and EnKFser, respectively. Using only 10 ensemble members affects the robustness of the EnKF algorithm, especially when fewer data are assimilated. Moreover, removing one rank from its sample covariance matrix means that the EnKFesops is practically operating with nine members. This may counterbalance any improvement from the new algorithm when the number of assimilated observations is not much larger than the ensemble size, as is the case here, especially when only half of the model variables are assimilated. This behavior of the EnKFesops is further analyzed below.

Based on the same setup but using 30 members instead of only 10, the EnKFesops clearly outperforms the EnKFreg and EnKFser, as shown in Fig. 3. The minimum average RMSE achieved by the EnKFesops is 0.18 for the experiment assimilating all model variables, compared to 0.2 for both EnKFreg and EnKFser, and 0.29 for the experiment assimilating half of the model variables, compared to 0.33 for both. The EnKFesops requires less inflation and localization to achieve these minima. It further exhibits more robust behavior with respect to the choice of localization and inflation, generally achieving better performances than the EnKFreg and EnKFser with weaker inflation and localization.

Fig. 3.

As in Fig. 2, but for 30 ensemble members.


The clearly better behavior of the EnKFesops with 30 members compared with 10 members can be explained by the rank-minus-1 approximation of the sample covariance of the forecast ensemble that is required after every forecast step (as discussed in section 3b). The rank-minus-1 approximation should indeed have much less impact on a matrix of rank 29 (the rank of the sample covariance of an ensemble of 30 members) than on a matrix of rank 9 (the covariance matrix rank of an ensemble of 10 members). This was verified by monitoring the ratio between the smallest singular value and the sum of all the singular values for both the EnKFreg and EnKFser forecast ensemble covariance matrices, which provides a measure of the significance of approximating this matrix by a rank-minus-1 matrix. This ratio is almost the same for the EnKFreg and EnKFser but is about one order of magnitude smaller in the runs with 30 ensemble members than in the runs with 10 members.

To evaluate the performance of the filters in a more challenging setting, we decreased the temporal availability of the observations to 1 day, instead of 6 h, while continuing to use 30 ensemble members (Fig. 4). This setup effectively makes the assimilation problem more nonlinear, and thus less accurate estimates are expected. On top of providing the lowest RMSEs, the EnKFesops is again shown to be more robust to weaker inflation and localization, as can be seen from the two-dimensional (2D) RMSE plots, compared to the regular and serial EnKFs, which experience divergence more often, particularly when implemented with large localization length scales. In terms of accuracy of the state estimates, when all 40 variables are assimilated, the average improvements (over all assimilation runs with different localization length scales and inflation factors) of the EnKFesops over the EnKFreg and EnKFser are about 26% and 46%, respectively. These reduce to 12% and 23%, respectively, when only half of the model variables are assimilated.

Fig. 4.

As in Fig. 2, but for 30 ensemble members and assimilation every four model time steps.


We further compared the performance of the proposed EnKFesops against the EnSRF and DEnKF using 30 ensemble members with assimilation of all model variables every 6 h. The resulting average RMSE values for the EnSRF and the DEnKF are generally smaller than those obtained with the EnKFreg and EnKFser but larger than those obtained with the EnKFesops. As shown in Fig. 5, the proposed EnKFesops algorithm performs slightly better than the EnSRF and the DEnKF for all tested settings of inflation and localization, leading to about 4% and 8% average RMSE improvements, respectively, when all variables are assimilated. When only half of the variables are assimilated, the performances of the EnKFesops and the EnSRF/DEnKF become more comparable, but again the EnKFesops produces the lowest RMSEs (about 2% and 3% improvements on average compared to the EnSRF and DEnKF, respectively). The EnKFesops is further found to be more robust with respect to the choice of inflation and localization, especially compared to the DEnKF.

Fig. 5.

As in Fig. 2, but for EnSRF, DEnKF, and EnKFesops.


As a final assessment of the filters’ performances in terms of RMSE, we show in Fig. 6 bar plots of the minimum RMSE resulting from the separately optimized configuration of each filter, as a function of the ensemble size, for both the full and half observation scenarios. With assimilation of either all or half of the model variables, the EnKFesops provides the lowest RMSE when the ensemble size is larger than 20. For smaller ensembles, the EnSRF and the DEnKF outperform the proposed scheme, especially when only half of the variables are assimilated, in line with the above discussion on the rank-minus-1 approximation of the forecast ensemble covariance in the EnKFesops. Similar to the EnKFesops, the performances of the regular and serial EnKFs improve with increasing ensemble size, but at a slower pace.

Fig. 6.

Minimum average RMSE for all tested filters (EnKFreg, EnKFser, EnKFesops, EnSRF, and DEnKF) as a function of the ensemble size. (left) All variables and (right) every other variable are observed at every model time step (or 6 h in real time).


We finally compared the time evolution of the ensemble spreads of all filters when implemented with 30 members and assimilation of all/half of the observations at every model time step/every four model time steps. For that, we arbitrarily selected three pairs of localization length scales and inflation values and plotted the spread of the corresponding analysis ensembles (from one single assimilation run) in Fig. 7. For the 12 displayed (and actually all tested) cases, the EnKFesops spread is larger than the spreads of the regular and serial EnKFs, close to but slightly smaller than that of the EnSRF, and always smaller than that of the DEnKF. The random sampling of the ensemble members in the analysis step of the regular EnKF leads to a smaller estimated analysis error variance than in the other ensemble filters, and the new scheme brings it closer to that of the EnSRF. The DEnKF spread is the largest in all tested cases, and the EnKFser spread is generally lower than that of the regular EnKF.

Fig. 7.

Time evolution of the average ensemble spread as it results from EnKFreg, EnKFser, EnKFesops, EnSRF, and DEnKF implemented with 30 ensemble members as function of (from bottom to top) localization length scales (lc) and inflation factor (ic) and (left to right) spatial and temporal observation network densities. Full time and sparse time correspond to the assimilation of observations at every model time step and every four model time steps, respectively.


We further analyzed the time evolution of the spread of the different filters in this same setup, but for their best achieved RMSE performances with respect to the choices of inflation and localization. The results are shown in Fig. 8 for the first year only, a shorter but “typical” time range, for better clarity of the plots. The filters seem to achieve their best performances with more or less the same spread, except for the DEnKF. The differences become more pronounced with fewer assimilated observations, and, in general, the EnKFesops produces the lowest spread, consistent with its lowest RMSE. To analyze the filters’ spreads in relation to their corresponding RMSEs, we plotted in Fig. 9 the time evolution of the spreads and RMSEs (as they result from the same single run) for the best performance of each filter with the different observation scenarios. The spreads and RMSEs are comparable for all filters, except for the DEnKF, whose spread seems to overestimate the corresponding RMSE.

Fig. 8.

Time evolution of the average ensemble spread over the first-year assimilation period as it results from EnKFreg, EnKFser, EnKFesops, EnSRF, and DEnKF implemented with 30 ensemble members for their best performances in term of RMSE with respect to the choices of inflation and localization. Full time and sparse time correspond to assimilation of observations every model time step and every four model time steps, respectively.


Fig. 9.

Time evolution of the average ensemble spread (dark, thick colors) and the corresponding RMSEs (light, thin colors) over the first-year assimilation period as it results from the same single run for the best performance (with respect to the choice of inflation and localization) of each filter implemented with 30 ensemble members with the different observations scenarios. Full time and sparse time correspond to assimilation of observations every model time step and every four model time steps, respectively.


We have also compared the ensemble rank histograms of the filters, but the results were not very conclusive in the sense that the rank histograms resulting from the EnKFesops were not significantly different from those of the standard EnKF. We therefore decided not to show them here for brevity.

5. Summary and discussion

The stochastic ensemble Kalman filter (EnKF) applies the Kalman filter (KF) analysis step to update each ensemble member with a different perturbed observation, directly generating a random analysis ensemble from the presumed analysis Gaussian distribution. This is different from other ensemble-based KFs, which deterministically update the analysis ensemble by matching the first two moments of the KF analysis. It has been argued that the random sampling of the ensemble allows the EnKF to better preserve the features of the forecast ensemble distribution. The EnKF analysis, however, only asymptotically matches the first two moments of the KF. In realistic applications, computational requirements often strongly limit the size of the ensemble. In this case, the observational error covariance will be undersampled, especially when the size of the observation vector is (much) larger than the ensemble size, which is a common situation in atmospheric and oceanic applications. This introduces errors to the filter estimates (and their covariance matrices), which may degrade the filter performance.

In this paper, we introduced a new serial EnKF algorithm in which we retained the idea of separately updating each ensemble member with perturbed observations, but we did this in such a way as to exactly reproduce the first two moments of the KF equations, given the ensemble-based forecast error covariance. For an ensemble of size N, we found that the first two moments of the EnKF analysis ensemble can exactly match those of the KF only when the rank of the ensemble sample covariance matrix (generally N − 1) plus the rank of the observational error covariance does not exceed N − 1. If the data are assimilated serially (one single datum at a time, so that the observational error covariance has rank 1), one may satisfy this condition by removing one rank from the ensemble. This is exactly the idea behind our new algorithm. We remove the rank that contributes the least to the variance of the ensemble using a singular value decomposition (SVD), which provides the best approximation of the ensemble by a matrix of one lower rank and helps to avoid observational sampling errors in the EnKF analysis step. If we are interested in implementing the filter with an ensemble of size N and are concerned about the impact of reducing its rank by one, we could always implement the filter with N + 1 members. We also show that after the assimilation of a single observation, the ensemble retains its reduced rank, and we provide an efficient iterative algorithm to compute the perturbed observations, such that we can immediately proceed with the assimilation of the next observation. In the resulting new EnKF algorithm, the data are assimilated serially, and only minimal modifications and additional computations to the standard serial EnKF algorithm are required.

We investigated the behavior of the new EnKF algorithm with the strongly nonlinear Lorenz-96 model. We also evaluated its performance against those of the standard EnKF (when implemented in batch or serial mode), the ensemble square root filter (EnSRF), and the deterministic EnKF (DEnKF). These last two filters were introduced as ensemble Kalman filters that do not require perturbing the observations before assimilation. Our numerical results suggest that the new scheme improves the behavior of the EnKF and significantly enhances its robustness, alleviating its dependency on the choices of localization and inflation. We also found that with large enough ensembles (i.e., ensembles that are not much affected by the rank-minus-1 approximation of the forecast ensemble covariance matrix), the new EnKF algorithm is at least as good as the EnSRF and DEnKF, and in most cases it provided the best performance in our particular, but very common, experimental setting.

Our results suggest that the EnKF may deserve further reevaluation in large-scale atmospheric and oceanic data assimilation problems. This will be one of the topics of our future work. We will also explore the generalization and impact of the proposed scheme in the context of ensemble Kalman smoothing.

Acknowledgments

We thank three anonymous reviewers for their valuable comments and suggestions. The comments of one of the reviewers on the distinction between the different ensemble Kalman filters were very useful in revising our manuscript, which we gratefully acknowledge. Research reported in this publication was supported by the King Abdullah University of Science and Technology (KAUST).

APPENDIX A

Filter Initialization from a Gaussian Prior

We initialize the algorithm from an analysis step and therefore need an ensemble whose covariance matrix has the required reduced rank. To generate the initial ensemble members from a given prior (Gaussian) distribution of mean $\bar{\mathbf{x}}$ and covariance matrix $\mathbf{B}$ of reduced rank, under the constraints that their sample mean and covariance matrix coincide with $\bar{\mathbf{x}}$ and $\mathbf{B}$, one may proceed as follows: Factorize $\mathbf{B} = \mathbf{S}\mathbf{S}^{\mathrm{T}}$, where $\mathbf{S}$ is a full-column-rank matrix, and draw a random matrix $\boldsymbol{\Omega}$ with orthonormal columns orthogonal to the vector of ones. This matrix is thus “uniform” in the sense that postmultiplying it by any orthogonal matrix does not change its distribution. Then take
ea1
with denoting the ith row of of length . The drawing of can be efficiently done using Householder matrices as described by Pham (2001) and Hoteit et al. (2002); is a vector of unit norm orthogonal to all columns of and to and hence can be directly used to generate the observations perturbations as described in section 3a.

APPENDIX B

Computation of the Smallest Eigenvalue of and the Associated Eigenvector

An eigendecomposition of $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ can be wasteful, especially when r is large, as one computes all the eigenvalues and eigenvectors but only needs the smallest ones. There exist efficient iterative algorithms that compute only the smallest nonzero eigenvalue of $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ and the associated eigenvector $\mathbf{v}$, such as the one we propose below. For that, one needs an initial vector, which can be taken as the vector $\mathbf{v}$ from the previous analysis step. With this choice, one does not need to run the algorithm for more than a few iterations (and hence it can be very fast) for the following two reasons:

  1. The vector $\mathbf{v}$ in the previous analysis step satisfies $\mathbf{S}\mathbf{v} = \mathbf{0}$. After integrating the ensemble members forward with the model, this linear relation is lost, but if the model is not too strongly nonlinear, one may still expect $\mathbf{S}\mathbf{v} \approx \mathbf{0}$. This means that $\mathbf{v}$ is already close to the desired eigenvector of $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ associated with the smallest nonzero eigenvalue, as this vector is the one that minimizes $\|\mathbf{S}\mathbf{v}\|$ among all unit-norm vectors orthogonal to the vector of ones [the eigenvector of zero eigenvalue of $\mathbf{S}^{\mathrm{T}}\mathbf{S}$].
  2. The matrix $\mathbf{S}$ is replaced by $\mathbf{S} - \lambda\mathbf{u}\mathbf{v}^{\mathrm{T}}$ to reduce its rank by one. This can be achieved using, instead of $\mathbf{v}$, any unit-norm vector orthogonal to the vector of ones. That vector and the adjusted matrix would then satisfy all the requirements for the next analysis step; the only difference is that the adjustment is no longer the smallest possible. Of course, we want to make this adjustment as small as possible, but one might be satisfied with an adjustment that is slightly larger. Hence, we may stop iterating the algorithm once it has almost, rather than fully, converged.

a. Algorithm for the computation of the eigenvector of the smallest eigenvalue

Let $\mathbf{A}$ be a real symmetric matrix. The goal is to compute the eigenvector associated with its minimum eigenvalue. A simple and straightforward way is to compute all the eigenvectors and keep the one of interest, if the computational burden is not a concern; efficient algorithms for finding the eigenvectors of a symmetric, real matrix exist in the literature. In large-scale atmospheric and oceanic applications, however, computing $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ might be computationally demanding since the number of rows of $\mathbf{S}$ is equal to the dimension n of the system (typically extremely large), requiring on the order of $nr^2$ flops (floating point operations), where r is the number of columns (or the size of the ensemble), which can reach 100 or more. For this type of problem, Lanczos-type algorithms, which do not require computing $\mathbf{S}^{\mathrm{T}}\mathbf{S}$ explicitly but only its product with a vector, are attractive, especially in our case since we only need the eigenvector associated with the smallest eigenvalue.

The Lanczos algorithm (Golub and Van Loan 1996) generates a sequence of orthonormal vectors (starting with from a given vector ) such that the matrix with general element is tridiagonal. The smallest eigenvalue of can be approximated by that of , and the corresponding (approximate) eigenvector can be computed from the eigenvector of and . If k equals the size of , the approximation is exact, but quite often it is already very good for much smaller k. Each Lanczos vector essentially requires one multiplication of a vector by . In our case, does not need to be computed: the multiplication is performed as two successive multiplications by and , requiring only flops. Therefore, the cost of computing k Lanczos vectors is about flops, which is less than that of computing as soon as ; the cost of the other required operations is negligible.
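
The Lanczos recurrence described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name is ours, full reorthogonalization is added for robustness, and a tall factor `L` stands in for the operator whose product with a vector is available:

```python
import numpy as np

def lanczos_smallest(matvec, v0, k):
    """Approximate the smallest eigenpair of a symmetric operator with k
    Lanczos steps (Golub and Van Loan 1996), using only products with
    the operator. Full reorthogonalization is used for robustness."""
    n = v0.size
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    v = v0 / np.linalg.norm(v0)
    for j in range(k):
        V[:, j] = v
        w = matvec(v)
        alpha[j] = v @ w
        # Orthogonalize against all previous Lanczos vectors (plain
        # Lanczos subtracts only the last two terms of the recurrence).
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            v = w / beta[j]
    # Smallest Ritz pair from the k-by-k tridiagonal matrix T_k.
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    tvals, tvecs = np.linalg.eigh(T)
    y = V @ tvecs[:, 0]
    return tvals[0], y / np.linalg.norm(y)

# The full matrix is never formed: each product costs two
# multiplications by the tall factor, as L (L^T v).
rng = np.random.default_rng(1)
L = rng.standard_normal((200, 30))
lam, u = lanczos_smallest(lambda v: L @ (L.T @ v), rng.standard_normal(200), 20)
```

In practice one would rely on an established implementation such as `scipy.sparse.linalg.eigsh`, which accepts the same matrix-free operator.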

The Lanczos method may, however, suffer from numerical instabilities. As k increases, rounding errors accumulate and can cause a loss of orthogonality of the Lanczos vectors, which in turn causes a loss of accuracy. To avoid this problem, we can apply the s-step Lanczos procedure of Karush (1951), which consists of stopping the Lanczos algorithm after s steps, computing the approximate eigenvector, and restarting the algorithm from this vector, and so on until convergence. Hereafter, we focus on the simplest case (s = 1) of this procedure, for which we have worked out the implementation in detail, as described below. The resulting algorithm is equivalent to the one proposed by Hestenes and Karush (1951), but the derivations and implementations differ.

The eigenvector associated with the smallest eigenvalue is also the minimizer of the criterion , and the procedure of Hestenes and Karush (1951) is based on steepest descent with an optimal step to minimize this criterion. The gradient of this criterion at a point of unit norm is . Thus, to minimize it, steepest descent with an optimal step updates as
eq10
where c is determined such that is minimum. Since the last ratio is unchanged when is multiplied by a scale factor, this is the same as taking as the minimizer of over in the linear space spanned by and . Since such a minimizer is defined only up to a scale factor, we normalize it to unit norm.
Thus, the problem reduces to finding the minimum of
eq11
over all real numbers x and for a given vector . It is of interest to consider the vector with , as this vector is orthogonal to . By defining , the problem reduces to the search for the minimum of
eq12
Taking and noticing that , the derivative of the right-hand side with respect to ξ can be written as
eq13
The numerator of the last right-hand side has two roots of opposite sign and is positive for . Hence, it is negative between the roots. This means that the criterion admits a maximum at the negative root and a minimum at the positive root. The latter is given by
eq14
For numerical accuracy, we use the first equality when and the second equality otherwise. The idea is to avoid the subtraction of two quantities of the same sign and about the same magnitude when is small.
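
This cancellation-avoiding choice between two algebraically equivalent formulas can be illustrated on a generic quadratic with roots of opposite sign (the coefficients `p` and `q` below are illustrative, not those of the equation above):

```python
import math

def positive_root(p, q):
    """Positive root of x**2 + p*x + q = 0, assuming q < 0 so that the
    two roots have opposite signs.

    The two equivalent expressions
        x = (-p + sqrt(p*p - 4*q)) / 2
        x = -2*q / (p + sqrt(p*p - 4*q))
    are chosen according to the sign of p, so that two nearly equal
    quantities of the same sign are never subtracted."""
    d = math.sqrt(p * p - 4.0 * q)
    if p <= 0.0:
        return (-p + d) / 2.0        # no cancellation: -p >= 0, d >= 0
    return -2.0 * q / (p + d)        # no cancellation: p > 0, d > 0

# With p large and q tiny, the naive formula loses all its digits
# (the square root rounds to p exactly), while the rearranged one
# remains fully accurate.
naive = (-1e8 + math.sqrt(1e16 + 4e-8)) / 2
stable = positive_root(1e8, -1e-8)
```

The same consideration motivates the switch between the two equalities above when the relevant quantity is small.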
Thus, from a current vector , the algorithm computes a new vector . To compute , we write in the form
eq15
since . The vector inside the parentheses on the right-hand side of the above equation is, by construction, orthogonal to both and . Furthermore, the terms outside these parentheses can be written as . Since , . Hence, the last expression reduces to . Finally,
eq16
Therefore,
eq17

b. Algorithm

The algorithm can be summarized as follows.

  1. Initialization: Start from a vector , then compute , .
  2. Iterations: If is small enough, stop. Otherwise, compute , and
    eq18
    according to whether or not is positive. Then, make the substitution
    eq19
    eq20
    eq21
    before proceeding with a new iteration.

The stopping criterion is actually the tangent of the angle between the vectors and . When this angle is small, is nearly parallel to ; that is, it is nearly an eigenvector.
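
The summarized algorithm can be sketched as follows. This is our own illustrative reconstruction, not the paper's code: since the closed-form coefficients of eqs. (14) and (18)–(21) are not reproduced here, the minimization over the plane spanned by the iterate and the gradient direction is carried out through an equivalent 2-by-2 symmetric eigenproblem, and all names (`smallest_eigvec`, `matvec`, `w0`) are ours:

```python
import numpy as np

def smallest_eigvec(matvec, w0, tol=1e-10, maxit=5000):
    """Optimal-step steepest descent on the Rayleigh quotient, in the
    spirit of Hestenes and Karush (1951), for the eigenvector of the
    smallest eigenvalue of a symmetric operator."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(maxit):
        g = matvec(w)
        lam = w @ g                     # current value of the criterion
        r = g - lam * w                 # gradient direction, orthogonal to w
        rho = np.linalg.norm(r)
        if rho <= tol * abs(lam):       # tangent-of-angle stopping test
            break
        u = r / rho                     # unit vector orthogonal to w
        # Minimize the Rayleigh quotient over span{w, u} via a 2x2
        # symmetric eigenproblem (equivalent to the optimal step).
        Au = matvec(u)
        b = u @ g                       # equals w @ Au by symmetry
        T = np.array([[lam, b], [b, u @ Au]])
        y = np.linalg.eigh(T)[1][:, 0]  # coefficients of the 2D minimizer
        w = y[0] * w + y[1] * u
        w /= np.linalg.norm(w)
    return lam, w

# Usage on a matrix with known spectrum 1, 2, ..., 40:
rng = np.random.default_rng(3)
Q = np.linalg.qr(rng.standard_normal((40, 40)))[0]
A = Q @ np.diag(np.arange(1.0, 41.0)) @ Q.T
lam, w = smallest_eigvec(lambda v: A @ v, rng.standard_normal(40))
```

As in the text, each iteration decreases the criterion, and the stopping test measures how nearly parallel the iterate is to its image under the operator.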

By construction, the criterion decreases after every iteration of the algorithm, and hence it converges to some limit, say λ. Hestenes and Karush (1951) showed that if the initial vector is expanded in terms of the eigenvectors of as , where , , and are certain (not necessarily all) normalized eigenvectors of ranked in increasing order of eigenvalues, then λ is the eigenvalue of and the sequence converges to . In our case, since the initial vector is orthogonal to , the eigenvector of zero eigenvalue of , its expansion in terms of the eigenvectors of will not contain this vector. In general, the expansion contains the eigenvector of the smallest nonzero eigenvalue of the above matrix, and hence the algorithm converges to this vector. The case where the expansion does not contain such a vector is very unlikely, since should be small (see the discussion in section 3c).

REFERENCES

  • Altaf, U., T. Butler, T. Mayo, C. Dawson, A. Heemink, and I. Hoteit, 2014: A comparison of ensemble Kalman filters for storm surge assimilation. Mon. Wea. Rev., 142, 2899–2914, doi:10.1175/MWR-D-13-00266.1.

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.

  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642, doi:10.1175/1520-0493(2003)131<0634:ALLSFF>2.0.CO;2.

  • Anderson, J. L., 2010: A non-Gaussian ensemble filter update for data assimilation. Mon. Wea. Rev., 138, 4186–4198, doi:10.1175/2010MWR3253.1.

  • Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436, doi:10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.

  • Bowler, N. E., J. Flowerdew, and S. R. Pring, 2013: Tests of different flavors of EnKF on a simple model. Quart. J. Roy. Meteor. Soc., 139, 1505–1519, doi:10.1002/qj.2055.

  • Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162, doi:10.1029/94JC00572.

  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367, doi:10.1007/s10236-003-0036-9.

  • Evensen, G., 2009: The ensemble Kalman filter for combined state and parameter estimation. IEEE Control Syst., 29, 83–104, doi:10.1109/MCS.2009.932223.

  • Frei, M., and H. R. Kunsch, 2013: Mixture ensemble Kalman filters. Comput. Stat. Data Anal., 58, 127–138, doi:10.1016/j.csda.2011.04.013.

  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757, doi:10.1002/qj.49712555417.

  • Golub, G., and C. F. Van Loan, 1996: Matrix Computations. 3rd ed. Johns Hopkins University Press, 728 pp.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

  • Hestenes, M. R., and W. Karush, 1951: A method of gradients for the calculation of the characteristic roots and vectors of a real symmetric matrix. J. Res. Natl. Bur. Stand., 47, 45–61.

  • Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. J. Mar. Syst., 36, 101–127, doi:10.1016/S0924-7963(02)00129-X.

  • Hoteit, I., G. Korres, and G. Triantafyllou, 2005: Comparison of extended and ensemble based Kalman filters with low and high-resolution primitive equations ocean models. Nonlinear Processes Geophys., 12, 755–765, doi:10.5194/npg-12-755-2005.

  • Hoteit, I., X. Luo, and D. T. Pham, 2012: Particle Kalman filtering: A nonlinear Bayesian framework for ensemble Kalman filters. Mon. Wea. Rev., 140, 528–542, doi:10.1175/2011MWR3640.1.

  • Kalman, R., 1960: A new approach to linear filtering and prediction problems. J. Fluids Eng., 82, 35–45, doi:10.1115/1.3662552.

  • Karush, W., 1951: An iterative method for finding characteristic vectors of a symmetric matrix. Pac. J. Math., 1, 233–248, doi:10.2140/pjm.1951.1.233.

  • Korres, G., I. Hoteit, and G. Triantafyllou, 2007: Data assimilation into a Princeton Ocean Model of the Mediterranean Sea using advanced Kalman filters. J. Mar. Syst., 65, 84–104, doi:10.1016/j.jmarsys.2006.09.005.

  • Lawson, W., and J. Hansen, 2004: Implications of stochastic and deterministic filters as ensemble-based data assimilation methods in varying regimes of error growth. Mon. Wea. Rev., 132, 1966–1981, doi:10.1175/1520-0493(2004)132<1966:IOSADF>2.0.CO;2.

  • Lei, J., P. Bickel, and C. Snyder, 2010: Comparison of ensemble Kalman filters under non-Gaussianity. Mon. Wea. Rev., 138, 1293–1306, doi:10.1175/2009MWR3133.1.

  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.

  • Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. Physica D, 238, 549–562, doi:10.1016/j.physd.2008.12.003.

  • Maybeck, P., 1979: Square root filtering. Stochastic Models, Estimation, and Control, Vol. 1, Academic Press, 368–410.

  • Nerger, L., W. Hiller, and J. Schroter, 2005: A comparison of error subspace Kalman filters. Tellus, 57A, 715–735, doi:10.1111/j.1600-0870.2005.00141.x.

  • Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207, doi:10.1175/1520-0493(2001)129<1194:SMFSDA>2.0.CO;2.

  • Sakov, P., and P. Oke, 2008: A deterministic formulation of the ensemble Kalman filter: An alternative to ensemble square root filters. Tellus, 60A, 361–371, doi:10.1111/j.1600-0870.2007.00299.x.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490, doi:10.1175/1520-0493(2003)131<1485:ESRF>2.0.CO;2.

  • Wan, E., and R. van der Merwe, 2000: The unscented Kalman filter for nonlinear estimation. Proc. IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symp., Lake Louise, Alberta, Canada, IEEE, 153–158, doi:10.1109/ASSPCC.2000.882463.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.
1. In fact, it is only optimal in the Gaussian case or if we restrict the optimality to linear updates.

2. This is the maximum rank of a sample covariance matrix of an ensemble of size N.

3. The kernel of a matrix is the linear subspace of all vectors such that .

4. The use of localization would remove this equivalence.
