• Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224.

  • Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Bengtsson, L., and Coauthors, 2007: The need for a dynamical climate reanalysis. Bull. Amer. Meteor. Soc., 88, 495–501.

  • Bergemann, K., G. Gottwald, and S. Reich, 2009: Ensemble propagation and continuous matrix factorization algorithms. Quart. J. Roy. Meteor. Soc., 135, 1560–1572.

  • Bocquet, M., C. A. Pires, and L. Wu, 2010: Beyond Gaussian statistical modeling in geophysical data assimilation. Mon. Wea. Rev., 138, 2997–3023.

  • Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 125, 2887–2908.

  • Charron, M., G. Pellerin, L. Spacek, P. L. Houtekamer, N. Gagnon, H. L. Mitchell, and L. Michelin, 2010: Toward random sampling of model error in the Canadian ensemble prediction system. Mon. Wea. Rev., 138, 1877–1901.

  • Compo, G. P., and Coauthors, 2011: The twentieth century reanalysis project. Quart. J. Roy. Meteor. Soc., 137, 1–28, doi:10.1002/qj.776.
  • Durran, D. R., 1999: Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. Springer, 482 pp.

  • Eckermann, S. D., and Coauthors, 2009: High-altitude data assimilation system experiments for the northern summer mesosphere season of 2007. J. Atmos. Sol.-Terr. Phys., 71, 531–551.

  • Ehrendorfer, M., 2007: A review of issues in ensemble-based Kalman filtering. Meteor. Z., 16, 795–818.

  • Evensen, G., 2006: Data Assimilation: The Ensemble Kalman Filter. Springer, 280 pp.

  • Fisher, M., M. Leutbecher, and G. A. Kelly, 2005: On the equivalence between Kalman smoothing and weak-constraint four-dimensional variational data assimilation. Quart. J. Roy. Meteor. Soc., 131, 3235–3246.

  • Gardiner, C. W., 2004: Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences. 3rd ed. Springer, 415 pp.

  • Golub, G. H., and C. F. Van Loan, 1996: Matrix Computations. 3rd ed. The Johns Hopkins University Press, 728 pp.

  • Gottwald, G. A., and I. Melbourne, 2005: Testing for chaos in deterministic systems with noise. Physica D, 212, 100–110.

  • Hamill, T. M., and J. S. Whitaker, 2005: Accounting for the error due to unresolved scales in ensemble data assimilation: A comparison of different approaches. Mon. Wea. Rev., 133, 3132–3147.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Harlim, J., and A. J. Majda, 2010: Catastrophic filter divergence in filtering nonlinear dissipative systems. Commun. Math. Sci., 8, 27–43.
  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–136.

  • Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. Quart. J. Roy. Meteor. Soc., 131, 3269–3289.

  • Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.

  • Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. Mon. Wea. Rev., 137, 2126–2143.

  • Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181–189.

  • Kalnay, E., 2002: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, 364 pp.

  • Kepert, J. D., 2004: On ensemble representation of the observation-error covariances in the ensemble Kalman filter. Ocean Dyn., 54, 561–569.

  • Kepert, J. D., 2009: Covariance localisation and balance in an ensemble Kalman filter. Quart. J. Roy. Meteor. Soc., 135, 1157–1176.

  • Leimkuhler, B., and S. Reich, 2005: Simulating Hamiltonian Dynamics. Cambridge University Press, 379 pp.

  • Li, H., E. Kalnay, and T. Miyoshi, 2009: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. Quart. J. Roy. Meteor. Soc., 135, 523–533.
  • Liu, J., E. J. Fertig, H. Li, E. Kalnay, B. R. Hunt, E. J. Kostelich, I. Szunyogh, and R. Todling, 2008: Comparison between local ensemble transform Kalman filter and PSAS in the NASA finite volume GCM—Perfect model experiments. Nonlinear Processes Geophys., 15, 645–659.

  • Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4DVAR. Quart. J. Roy. Meteor. Soc., 129, 3183–3203.

  • Lorenz, E. N., 1996: Predictability—A problem partly solved. Predictability, T. Palmer, Ed., European Centre for Medium-Range Weather Forecasts, 1–18.

  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.

  • Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea. Rev., 128, 416–433.

  • Neef, L., S. M. Polavarapu, and T. G. Shepherd, 2006: Four-dimensional data assimilation and balanced dynamics. J. Atmos. Sci., 63, 1840–1850.

  • Orrell, D., and L. Smith, 2003: Visualising bifurcations in high dimensional systems: The spectral bifurcation diagram. Int. J. Bifurcation Chaos, 13, 3015–3028.

  • Ott, E., B. Hunt, I. Szunyogh, A. Zimin, E. Kostelich, M. Corrazza, E. Kalnay, and J. Yorke, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Pires, C. A., O. Talagrand, and M. Bocquet, 2010: Diagnosis and impacts of non-Gaussianity of innovations in data assimilation. Physica D, 239, 1701–1717.

  • Polavarapu, S., T. G. Shepherd, Y. Rochon, and S. Ren, 2005: Some challenges of middle atmosphere data assimilation. Quart. J. Roy. Meteor. Soc., 131, 3513–3527.

  • Sankey, D., S. Ren, S. Polavarapu, Y. Rochon, Y. Nezlin, and S. Beagley, 2007: Impact of data assimilation filtering methods on the mesosphere. J. Geophys. Res., 112, D24104, doi:10.1029/2007JD008885.
  • Sasaki, Y., 1970: Some basic formalisms in numerical variational analysis. Mon. Wea. Rev., 98, 875–883.

  • Shutts, G. J., 2005: A stochastic kinetic energy backscatter algorithm for use in ensemble prediction systems. Quart. J. Roy. Meteor. Soc., 131, 3079–3102.

  • Simon, D. J., 2006: Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. John Wiley & Sons, Inc., 552 pp.

  • Szunyogh, I., E. Kostelich, G. Gyarmati, D. J. Patil, B. Hunt, E. Kalnay, E. Ott, and J. Yorke, 2005: Assessing a local ensemble Kalman filter: Perfect model experiments with the National Centers for Environmental Prediction global model. Tellus, 57A, 528–545.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • Wang, X., C. H. Bishop, and S. J. Julier, 2004: Which is better, an ensemble of positive–negative pairs or a centered spherical simplex ensemble? Mon. Wea. Rev., 132, 1590–1605.

  • Whitaker, J. S., G. P. Compo, X. Wei, and T. M. Hamill, 2004: Reanalysis without radiosondes using ensemble data assimilation. Mon. Wea. Rev., 132, 1190–1200.

  • Whitaker, J. S., G. P. Compo, and J.-N. Thépaut, 2009: A comparison of variational and ensemble-based data assimilation systems for reanalysis of sparse observations. Mon. Wea. Rev., 137, 1991–1999.

  • Wolfram Research, Inc., 2008: Mathematica, version 7.0. Wolfram Research, Inc., Champaign, IL. [Available online at http://www.wolfram.com/mathematica/.]

  • Zupanski, D., 1997: A general weak constraint applicable to operational 4DVar data assimilation systems. Mon. Wea. Rev., 125, 2274–2292.

Controlling Overestimation of Error Covariance in Ensemble Kalman Filters with Sparse Observations: A Variance-Limiting Kalman Filter

  • 1 School of Mathematics and Statistics, University of Sydney, Sydney, Australia
  • 2 Universität Potsdam, Institut für Mathematik, Potsdam, Germany

Abstract

The problem of an ensemble Kalman filter when only partial observations are available is considered. In particular, the situation is investigated where the observational space consists of variables that are directly observable with known observational error, and of variables of which only their climatic variance and mean are given. To limit the variance of the latter poorly resolved variables a variance-limiting Kalman filter (VLKF) is derived in a variational setting. The VLKF for a simple linear toy model is analyzed and its range of optimal performance is determined. The VLKF is explored in an ensemble transform setting for the Lorenz-96 system, and it is shown that incorporating the information of the variance of some unobservable variables can improve the skill and also increase the stability of the data assimilation procedure.

Corresponding author address: Georg A. Gottwald, School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia. E-mail: georg.gottwald@sydney.edu.au


1. Introduction

In data assimilation one seeks the best estimate of the state of a dynamical system given a forecast model with possible model error and noisy observations at discrete observation intervals (Kalnay 2002). This process is complicated on the one hand by the often chaotic nature of the underlying nonlinear dynamics, which increases the forecast variance, and on the other hand by the fact that one often has only partial information about the state. In this paper we address the latter issue. We consider situations whereby noisy observations are available for some variables but not for other, unresolved variables. However, for the latter we assume that some prior knowledge about their statistical climatic behavior, such as their variance and their mean, is available.

A particularly attractive framework for data assimilation is the ensemble Kalman filter (e.g., see Evensen 2006). These straightforwardly implemented filters distinguish themselves from other Kalman filters in that the spatially and temporally varying background error covariance is estimated from an ensemble of nonlinear forecasts. Despite the ease of implementation and the flow-dependent estimation of the error covariance, ensemble Kalman filters are subject to several errors and specific difficulties [see Ehrendorfer (2007) for a recent review]. Besides the problem of estimating model error, which is inherent to all filters, and inconsistencies between the filter assumptions and reality, such as non-Gaussianity, which render all Kalman filters suboptimal, ensemble-based Kalman filters have the specific problem of sampling errors due to an insufficient ensemble size. These sampling errors usually lead to an underestimation of the error covariances, which may ultimately result in filter divergence, when the filter trusts its own forecast and ignores the information given by the observations.

Several techniques have been developed to counteract the associated small spread of the ensemble. To deal with sampling errors in ensemble filters we mention two of the main remedies: covariance inflation and localization. To avoid filter divergence due to an underestimation of error covariances, the concept of covariance inflation was introduced, whereby the prior forecast error covariance is increased by an inflation factor (Anderson and Anderson 1999). This is usually done in a global fashion and involves careful and expensive tuning of the inflation factor; recently, however, methods have been devised to adaptively estimate the inflation factor from the innovation statistics (Anderson 2007, 2009; Li et al. 2009). Too small ensemble sizes also lead to spurious correlations associated with remote observations. To address this issue, the concept of localization has been introduced (Houtekamer and Mitchell 1998, 2001; Hamill et al. 2001; Ott et al. 2004; Szunyogh et al. 2005), whereby only spatially close observations are used for the innovations.
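
The two remedies can be sketched in a few lines of code. The snippet below is only an illustration of the general ideas, not the specific schemes of the papers cited above: the multiplicative inflation factor and the Gaussian taper with its length scale are illustrative choices.

```python
import numpy as np

def inflate(Z, delta=1.05):
    """Multiplicative covariance inflation: scale ensemble deviations about the mean by delta."""
    mean = Z.mean(axis=1, keepdims=True)
    return mean + delta * (Z - mean)

def localize(P, length_scale=2.0):
    """Covariance localization: damp long-range covariances by a Schur (elementwise)
    product with a Gaussian taper, assuming the state lives on a periodic 1D lattice."""
    n = P.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    dist = np.minimum(np.abs(i - j), n - np.abs(i - j))   # periodic lattice distance
    taper = np.exp(-0.5 * (dist / length_scale) ** 2)
    return taper * P

# usage with a random 40 x 20 ensemble (40 state variables, 20 members)
Z = np.random.randn(40, 20)
Z = inflate(Z)
dev = Z - Z.mean(axis=1, keepdims=True)
P_localized = localize(dev @ dev.T / (Z.shape[1] - 1))
```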

To take into account the uncertainty in the model representation we mention here isotropic model error parameterization (Mitchell and Houtekamer 2000; Houtekamer et al. 2005), stochastic parameterizations (Buizza et al. 1999), and kinetic energy backscatter (Shutts 2005). A recent comparison between those methods is given in Houtekamer et al. (2009), Charron et al. (2010), and Hamill and Whitaker (2005). The problem of non-Gaussianity is for example discussed in Pires et al. (2010) and Bocquet et al. (2010).

Whereas the underestimation of error covariances has received much attention, relatively little has been done about a possible overestimation of error covariances. Overestimation of covariances is a finite-ensemble-size effect that typically occurs in sparse observation networks (e.g., see Liu et al. 2008; Whitaker et al. 2009). Uncontrolled growth of error covariances, which is not tempered by available observations, may progressively spoil the overall analysis. This effect is even exacerbated when inflation is used; in regions where no observations influence the analysis, inflation can lead to unrealistically large ensemble variances, progressively degrading the overall analysis (e.g., see Whitaker et al. 2004). This is particularly problematic when inappropriate uniform inflation is used. Moreover, it is well known that covariance localization can be a significant source of imbalance in the analyzed fields (e.g., see Houtekamer and Mitchell 2005; Kepert 2009; Houtekamer et al. 2009). Localization artificially generates unwanted gravity wave activity, which in poorly resolved spatial regions may lead to an unrealistic overestimation of error covariances. Being able to control this should help filter performance considerably.

When assimilating current weather data in numerical schemes for the troposphere, the main problem is underestimation of error covariances rather than overestimation. This is due to the availability of radiosonde data, which assures wide observational coverage. However, in the preradiosonde era there were severe data voids, particularly in the Southern Hemisphere and in the vertical, since most observations were taken at the surface in the Northern Hemisphere. There is an increased interest in so-called climate reanalysis (e.g., see Bengtsson et al. 2007; Whitaker et al. 2004), which faces the challenge of dealing with large unobserved regions. Historical atmospheric observations are reanalyzed with a fixed forecast scheme to provide a global, homogeneous dataset covering the troposphere and stratosphere for very long periods. A remarkable effort is the international Twentieth Century Reanalysis Project (20CR; Compo et al. 2011), which produced a global estimate of the atmosphere for the entire twentieth century (1871 to the present) using only synoptic surface pressure reports and monthly sea surface temperature and sea ice distributions. Such a dataset could help to analyze climate variations in the twentieth century or the multidecadal variations in the behavior of the El Niño–Southern Oscillation. An obstacle for reanalysis is the overestimation of error covariances if one chooses to employ ensemble filters with multiplicative covariance inflation (Whitaker et al. 2004).

Overestimation of error covariances also occurs in modern numerical weather forecast schemes, for which the upper lid of the vertical domain is constantly pushed toward higher and higher levels to incorporate the mesosphere, with the aim of better resolving processes in the polar stratosphere (e.g., see Polavarapu et al. 2005; Sankey et al. 2007; Eckermann et al. 2009). The energy spectrum in the mesosphere is, contrary to the troposphere, dominated by gravity waves. The high variability associated with these waves causes very large error covariances in the mesosphere, which can be 2 orders of magnitude larger than at lower levels (Polavarapu et al. 2005), rendering the filter very sensitive to small uncertainties in the forecast covariances. Being able to control the variances of mesospheric gravity waves is therefore a big challenge.

The question we address in this work is how the statistical information available for some variables, which are otherwise not observable, can be effectively incorporated into data assimilation to control the potentially large error covariances associated with the data voids. We will develop a framework to modify the familiar Kalman filter (e.g., see Evensen 2006; Simon 2006) for partial observations with only limited information on the mean and variance, with the effect that the error covariance of the unresolved variables cannot exceed their climatic variance and that their mean is driven toward the climatological value.

The paper is organized as follows. In section 2 we will introduce the dynamical setting and briefly describe the ensemble transform Kalman filter (ETKF), a special form of an ensemble square root filter. In section 3 we will derive the variance-limiting Kalman filter (VLKF) in a variational setting. In section 4 we illustrate the VLKF with a simple linear toy model for which the filter can be analyzed analytically. We will extract the parameter regimes where we expect VLKF to yield optimal performance. In section 5 we apply the VLKF to the 40-dimensional Lorenz-96 system (Lorenz 1996) and present numerical results illustrating the advantage of such a variance-limiting filter. We conclude the paper with a discussion in section 6.

2. Setting

Assume an N-dimensional dynamical system whose dynamics is given by

dz/dt = f(z),   (1)

with the state variable z. We assume that the state space is decomposable according to z = (x, y), with x of dimension n, y of dimension m, and n + m = N. Here x shall denote those variables for which direct observations are available, and y shall denote those variables for which only some integrated or statistical information is available. We will coin the former observables and the latter pseudo-observables. We do not incorporate model error here and assume that (1) describes the truth. We apply the notation of Ide et al. (1997) unless stated explicitly otherwise.

Let us introduce an observation operator H, which maps from the whole space into the observation space spanned by the designated variables x. We assume that observations of the designated variables x are given at equally spaced discrete observation times ti with the observation interval Δtobs. Since it is assumed that there is no model error, the observations at discrete times ti = iΔtobs are given by

yo(ti) = Hz(ti) + ro,

with independent and identically distributed observational Gaussian noise ro. The observational noise is assumed to be independent of the system state, and to have zero mean and constant covariance Robs.

We further introduce an operator h, which maps from the whole space into the space of the pseudo-observables spanned by y. We assume that the pseudo-observables have a known variance Aclim and a constant mean aclim. This is the only information available for the pseudo-observables, and it may be estimated, for example, from climatic measurements. The error covariance of those pseudo-observations is denoted by Rw.

The model forecast state zf at each observation interval is obtained by integrating the state variable with the full nonlinear dynamics in (1) for the time interval Δtobs. The background (or forecast) involves an error with covariance Pf.

Data assimilation aims to find the best estimate of the current state given the forecast zf with covariance Pf and observations yo of the designated variables with error covariance Robs. Pseudo-observations can be included following the standard Bayesian approach once their mean aclim and error covariance Rw are known. However, the error covariance Rw of a pseudo-observation is in general not equal to Aclim. In section 3, we will show how to derive the error covariance Rw in order to ensure that the analysis does not exceed the prescribed variance Aclim. We do so in the framework of Kalman filters, and we now briefly summarize the basic ideas used to construct such a filter for the case of an ensemble square root filter (Tippett et al. 2003), that is, the ensemble transform Kalman filter (Wang et al. 2004).

Ensemble Kalman filter

In an ensemble Kalman filter (EnKF; Evensen 2006) an ensemble with k members zi,

Z = [z1, z2, … , zk],

is propagated by the full nonlinear dynamics (1), which is written as

dZ/dt = f(Z),   f(Z) = [f(z1), f(z2), … , f(zk)].   (2)

The ensemble is split into its mean,

z̄ = (1/k) Σi zi = Zw,

where w = (1/k)(1, 1, … , 1)^T, and its ensemble deviation matrix

Z′ = Z(I − W),

with the constant projection matrix

W = w(1, 1, … , 1).

The ensemble deviation matrix can be used to approximate the ensemble forecast covariance matrix via

Pf = Z′ Z′^T/(k − 1).
Given the forecast ensemble Zf and the associated forecast error covariance matrix (or prior) Pf, the actual Kalman analysis (Kalnay 2002; Evensen 2006; Simon 2006) updates a forecast into a so-called analysis (or posterior). Variables at times t = ti − ϵ are evaluated before taking the observations (and/or pseudo-observations) into account in the analysis step, and variables at times t = ti + ϵ are evaluated after the analysis step when the observations (and/or pseudo-observations) have been taken into account. In the first step of the analysis the forecast mean,

z̄f = Zf w,

is updated to the analysis mean

z̄a = z̄f − Ko(Hz̄f − yo) − Kw(hz̄f − aclim),   (3)

where the Kalman gain matrices are defined as

Ko = Pa H^T Robs^{-1}   and   Kw = Pa h^T Rw^{-1}.   (4)

The analysis covariance Pa is given by the addition rule for variances typical in linear Kalman filtering (Kalnay 2002):

Pa^{-1} = Pf^{-1} + H^T Robs^{-1} H + h^T Rw^{-1} h.   (5)
To calculate an analysis ensemble deviation matrix Z′a, which is consistent with the analysis error covariance Pa and therefore needs to satisfy

Pa = Z′a Z′a^T/(k − 1),

we use the method of ensemble square root filters (Simon 2006). In particular we use the method proposed in Tippett et al. (2003) and Wang et al. (2004), the so-called ETKF, which seeks a transformation S such that

Z′a = Z′f S.   (6)

Alternatively one could have chosen the ensemble adjustment filter (Anderson 2001), in which the ensemble deviation matrix is premultiplied with an appropriately determined matrix. However, since we are mainly interested in the case k ≪ N we shall use the ETKF. Note that the matrix S is not uniquely determined for k < N. The transformation matrix can be obtained either by using continuous Kalman filters (Bergemann et al. 2009) or directly (Wang et al. 2004) by
eq9
Here is the singular value decomposition of
eq10
The matrix is obtained by erasing the last zero column from , and is the upper-left (k − 1) × (k − 1) block of the diagonal matrix . The deletion of the 0 eigenvalue and the associated columns in assure that and therefore that the analysis mean is given by . Note that is symmetric and , which assures that implying that the mean is preserved under the transformation. This is not necessarily true for general ensemble transform methods of the form (6).

A new forecast is then obtained by propagating the analysis ensemble with the full nonlinear dynamics in (2) to the next observation time. The numerical results presented later in sections 4 and 5 are obtained with this method.
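
For concreteness, here is a minimal sketch of one analysis step of an ensemble square root filter of ETKF type in its ensemble-space formulation. It uses a symmetric square root of the analysis covariance in ensemble space rather than the SVD construction of Wang et al. (2004) referred to above, and it handles the ordinary observations only (the pseudo-observation term is derived in the next section); variable names are illustrative.

```python
import numpy as np

def etkf_analysis(Z, y_obs, H, R):
    """One ensemble square root analysis step (ensemble-space formulation).
    Z: N x k forecast ensemble, y_obs: p observations, H: p x N, R: p x p."""
    N, k = Z.shape
    z_mean = Z.mean(axis=1)
    Zp = Z - z_mean[:, None]                        # ensemble deviations
    Yp = H @ Zp                                     # deviations mapped to observation space
    Rinv = np.linalg.inv(R)
    A = (k - 1) * np.eye(k) + Yp.T @ Rinv @ Yp      # inverse analysis covariance in ensemble space
    lam, U = np.linalg.eigh(A)                      # A is symmetric positive definite
    Ainv = U @ np.diag(1.0 / lam) @ U.T
    w = Ainv @ Yp.T @ Rinv @ (y_obs - H @ z_mean)   # weights for the mean update
    T = U @ np.diag(np.sqrt((k - 1) / lam)) @ U.T   # symmetric transform: Zp_a Zp_a^T = (k-1) P_a
    return (z_mean + Zp @ w)[:, None] + Zp @ T

# usage with a toy 40-variable state observed at every 4th site
rng = np.random.default_rng(0)
Z = rng.standard_normal((40, 20))
H = np.eye(40)[::4]
R = 0.25 * np.eye(H.shape[0])
Z_a = etkf_analysis(Z, H @ rng.standard_normal(40), H, R)
```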

In the next section we will determine how the error covariance Rw used in the Kalman filter is linked to the prescribed variance of the pseudo-observables.

3. Derivation of the variance-limiting Kalman filter

One may naively believe that the error covariance of the pseudo-observables is determined by their target variance simply by setting Rw = Aclim. In the following we will see that this is not true, and that the expression for Rw, which ensures that the variance of the pseudo-observables in the analysis is limited from above by Aclim, involves all error covariances.

We formulate the Kalman filter as a minimization problem of a cost function (e.g., Kalnay 2002). The cost function for one analysis step as described in section 2, with a given background zf and associated error covariance Pf, is typically written as

J(z) = (1/2)(z − zf)^T Pf^{-1}(z − zf) + (1/2)(yo − Hz)^T Robs^{-1}(yo − Hz) + (1/2)(aclim − hz)^T Rw^{-1}(aclim − hz),   (7)

where z is the state variable at one observation time ti = iΔtobs. Note that the part involving the pseudo-observables corresponds to the notion of weak constraints in variational data assimilation (Sasaki 1970; Zupanski 1997; Neef et al. 2006).
The analysis step of the data assimilation procedure consists of finding the critical point of this cost function. The thereby obtained analysis and the associated covariance are then subsequently propagated to the next observation time ti+1 to yield zf and Pf at the next time step, at which a new analysis step can be performed. The equation for the critical point with ∇zJ(z) = 0 is readily evaluated to be

Pf^{-1}(z − zf) − H^T Robs^{-1}(yo − Hz) − h^T Rw^{-1}(aclim − hz) = 0,   (8)

and yields (3) for the analysis mean z̄a, and (5) for the analysis covariance Pa with Kalman gain matrices given by (4).
To control the variance of the unresolved pseudo-observables we set

h Pa h^T = Aclim.   (9)

Introducing P̃ via

P̃^{-1} = Pf^{-1} + H^T Robs^{-1} H,   (10)

and upon applying the Sherman–Morrison–Woodbury formula (e.g., see Golub and Van Loan 1996) to Pa^{-1} in (5), (9) yields the desired equation for Rw:

Rw^{-1} = Aclim^{-1} − (h P̃ h^T)^{-1},   (11)

which is yet again a reciprocal addition formula for variances. Note that the naive expectation Rw = Aclim is true only in the limit of very large h P̃ h^T, but is not generally true. For sufficiently small background error covariance Pf, the error covariance Rw as defined in (11) is not positive semidefinite. In this case the information given by the pseudo-observables has to be discarded. In the language of variational data assimilation the criterion of positive definiteness of Rw determines whether the weak constraint is switched on or off. To determine those eigendirections for which the statistical information available can be incorporated, we diagonalize Rw^{-1} = VDV^T and define a modified inverse covariance by retaining the diagonal entries with Dii ≥ 0 and setting those with Dii < 0 to zero. The modified Rw^{-1} then uses information of the pseudo-observables only in those directions that potentially allow for improvement of the analysis. Noting that h P̃ h^T is the analysis covariance of the pseudo-observables in an ETKF (with Rw^{-1} = 0), we see that (11) states that the variance constraint switches on for those eigendirections whose corresponding singular values of h P̃ h^T are larger than those of Aclim. Hence, the proposed VLKF as defined here incorporates the climatic information of the unresolved variables in order to restrict the posterior error covariance of those pseudo-observables to lie below their climatic variance and to drive their mean toward the climatological mean.
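
A sketch of how (10) and (11) translate into code, using the notation introduced above; the eigenvalue clipping implements the switching of the weak constraint described in the text. The routine returns the modified inverse error covariance, since directions in which the constraint is off correspond to an infinite Rw; the matrices in the usage example are arbitrary stand-ins.

```python
import numpy as np

def variance_limiting_Rw_inv(P_f, H, R_obs, h, A_clim):
    """Modified inverse pseudo-observation error covariance from (10)-(11):
    Rw^{-1} = A_clim^{-1} - (h P_tilde h^T)^{-1}, with negative eigendirections
    (constraint inactive) set to zero."""
    P_tilde = np.linalg.inv(np.linalg.inv(P_f) + H.T @ np.linalg.inv(R_obs) @ H)
    B = h @ P_tilde @ h.T                       # unconstrained analysis covariance of the pseudo-obs
    Rw_inv = np.linalg.inv(A_clim) - np.linalg.inv(B)
    d, V = np.linalg.eigh(Rw_inv)               # symmetric eigendecomposition
    d = np.where(d > 0.0, d, 0.0)               # keep only directions exceeding the climatic variance
    return V @ np.diag(d) @ V.T

# usage with random symmetric positive definite matrices standing in for the real ones
rng = np.random.default_rng(0)
M = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
P_f = M @ M.T
H = np.eye(4)[:2]        # first two variables observed
h = np.eye(4)[2:]        # last two variables are pseudo-observables
Rw_inv = variance_limiting_Rw_inv(P_f, H, 0.1 * np.eye(2), h, np.eye(2))
```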

4. Analytical linear toy model

In this section we study the VLKF for the following coupled linear skew product system for two oscillators :
eq11
where and Λ are all skew symmetric; σx,y and Γx,y are symmetric; and and are independent two-dimensional Brownian processes.2 We assume here for simplicity that
eq12
with the identity matrix , and
eq13
with the skew-symmetric matrix:
eq14
Note that our particular choice for the matrices implies .
The system models two noisy coupled oscillators: x and y. We assume that we have access to observations of the variable x at discrete observation times ti = iΔtobs, but have only statistical information about the variable y. We assume knowledge of the climatic mean μclim and the climatic covariance of the unobserved variable y. The noise is of Ornstein–Uhlenbeck type (Gardiner 2004), and may represent either model error or a parameterization of highly chaotic nonlinear dynamics. Without loss of generality, the coupling is chosen such that the y dynamics drives the x dynamics but not vice versa. The form of the coupling is not essential for our argument, and it may be oscillatory or damping with Λ = λI. We write this system in the more compact form for z = (x, y):
e12
with
eq15
The solution of (12) can be obtained using Ito’s formula and, introducing the propagator , which commutes with σ for our choice of the matrices, is given by
eq16
with mean
eq17
and covariance
e13
where
eq18
The climatic mean and covariance matrix are then obtained in the limit t → ∞ as
eq19
and
eq20
In order for the stochastic process (12) to have a stationary density and for Σ(t) to be a positive definite covariance matrix for all t, the coupling has to be sufficiently small with λ2 < 4γxγy. Note that the skew product nature of the system (12) is not special in the sense that a nonskew product structure where x couples back to y would simply lead to a renormalization of . However, it is pertinent to mention that although in the actual dynamics of the model (12) there is no back coupling from x to y, the Kalman filter generically introduces back coupling of all variables through the inversion of the covariance matrices [cf. (5)].
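
The matrices entering (12) are not reproduced here; as a rough illustration of how forecast ensembles for such a model can be generated, the sketch below integrates a simplified scalar analogue with damped x and y, one-way coupling λ, and additive noise, using the Euler–Maruyama scheme. It is an assumption-laden stand-in for (12), not the exact model analyzed in this section.

```python
import numpy as np

def simulate_ou(z0, dt, n_steps, gamma_x=1.0, gamma_y=1.0,
                sigma_x=1.0, sigma_y=1.0, lam=0.2, rng=None):
    """Euler-Maruyama integration of a simplified two-component analogue of (12):
    dx = (-gamma_x * x + lam * y) dt + sigma_x dW_x,
    dy =  -gamma_y * y dt          + sigma_y dW_y   (one-way coupling)."""
    rng = rng or np.random.default_rng()
    x, y = z0
    traj = np.empty((n_steps + 1, 2))
    traj[0] = z0
    for n in range(n_steps):
        dWx, dWy = rng.standard_normal(2) * np.sqrt(dt)
        x = x + (-gamma_x * x + lam * y) * dt + sigma_x * dWx
        y = y + (-gamma_y * y) * dt + sigma_y * dWy
        traj[n + 1] = (x, y)
    return traj

# usage: a long run to sample the climatic variance of the unobserved component y
traj = simulate_ou((0.0, 0.0), dt=1e-3, n_steps=200000)
print(traj[:, 1].var())   # should be close to sigma_y**2 / (2 * gamma_y) = 0.5
```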

We will now investigate the variance-limiting Kalman filter for this toy model. In particular, we will first analyze under what conditions Rw is positive definite and the variance constraint is switched on, and second we will analyze when the VLKF yields a skill improvement when compared to the standard ETKF.

We start with the positive definiteness of . When calculating the covariance of the forecast in an ensemble filter we need to interpret the solution of the linear toy model (12) as
eq21
where zj(ti+1) is the forecast of ensemble member j at time ti+1 = ti + Δtobs = (i + 1)Δtobs before the analysis propagated from its initial condition with at the previous analysis. The equality here is in distribution only (i.e., members of the ensemble are not equal in a pathwise sense as their driving Brownian will be different, but they will have the same mean and variance). The covariance of the forecast can then be obtained by averaging with respect to the ensemble and with respect to realizations of the Brownian motion, and is readily computed as
e14
where denotes the transpose of . The forecast covariance of an ensemble with spread is typically larger than the forecast covariance Σ of one trajectory with a nonrandom initial condition z0. The difference is most pronounced for small observation intervals when the covariance of the ensemble will be close to the initial analysis covariance , whereas a single trajectory will not have acquired much variance Σ. In the long-time limit, both, and Σ, will approach the climatic covariance Σclim [cf. (13)].

In the following we restrict ourselves to the limit of small observation intervals Δtobs ≪ 1. In this limit, we can approximate and explicitly solve the forecast covariance matrix using (14). This assumption requires that the analysis is stationary in the sense that the filter has lost its memory of its initial background covariance provided by the user to start up the analysis. We have verified the validity of this assumption for small observation intervals and for a range of initial background variances. This assumption renders (14) a matrix equation for . To derive analytical expressions we further Taylor-expand the propagator and the covariance Σtobs) for small observation intervals Δtobs. This is consistent with our stationarity assumption . The very lengthy analytical expression for can be obtained with the aid of Mathematica (Wolfram Research, Inc. 2008), but is omitted from this paper.

In filtering one often uses variance inflation (Anderson and Anderson 1999) to compensate for the loss of ensemble variance due to finite-size effects, sampling errors, and the effects of nonlinearities. We do so here by introducing an inflation factor δ > 1 multiplying the forecast variance . Having determined the forecast covariance matrix we are now able to write down an expression for the error covariance of the pseudo-observables . As before we limit the variance and the mean of our pseudo-observable y to be and aclim = μclim. Then, upon using the definitions (10) and (11), we find that the error covariance for the pseudo-observables is positive definite provided the observation interval Δtobs is sufficiently large.3 Particularly, in the limit of , we find that if
e15
the variance constraint will be switched on. Note that for δ > 1 the critical Δtobs above which is positive definite can be negative, implying that the variance constraint will be switched on for all (positive) values of Δtobs. If no inflation is applied (i.e., δ = 1), this simplifies to
e16
Because 4γxγy − λ2 > 0, the critical observation interval Δtobs is smaller for nontrivial inflation with δ > 1 than if no variance inflation is incorporated. This is intuitive, because the variance inflation will increase the number of instances in which the error covariance of the pseudo-observables exceeds the climatic variance. We have numerically verified that inflation is beneficial for the variance constraint to be switched on. It is pertinent to mention that for sufficiently large coupling strength λ or sufficiently small values of γx, (16) may not be consistent with the assumption of small observation intervals Δtobs ≪ 1.

We have checked analytically that the derivative of is positive at the critical observation interval Δtobs, indicating that the frequency of occurrence when the variance constraint is switched on increases monotonically with the observation interval Δtobs, in the limit of small Δtobs. This has been verified numerically with the application of VLKF for (12) and is illustrated in Fig. 1.

Fig. 1.

Proportion of incidences when the variance constraint is switched on and is positive definite as a function of the observation interval Δtobs for the stochastic linear toy model in (12). We used γx = 1, γy = 1, σx = 1, σy = 1, and λ = 0.2. We used k = 20 ensemble members, 100 realizations and , and no inflation with δ = 1. The analytically calculated critical observation interval according to (16) is Δtobs = 10−2.

At this stage it is important to mention effects due to finite-size ensembles. For large observation intervals Δtobs → ∞ and large observational noise , we have and our analytical formulas would indicate that the variance constraint should not be switched on [cf. (10) and (11)]. However, in numerical simulations of the Kalman filter we observe that for large observation intervals the variance constraint is switched on for almost all analysis times. This is a finite-ensemble-size effect and is due to the ensemble estimate of the forecast variance adopting values larger than the climatic value σclim, implying that the variance constraint is switched on. The closer the ensemble variance approaches the climatic variance, the more likely fluctuations will push the forecast covariance above the climatic value. However, we observe that the corresponding eigenvalues decrease for Δtobs → ∞ and for ensemble sizes k → ∞.

The analytical results obtained above are for the ideal case with k → ∞. As mentioned in the introduction, in sparse observation networks finite ensemble sizes cause the overestimation of error covariances (Liu et al. 2008; Whitaker et al. 2009), implying that is positive definite and the variance-limiting constraint will be switched on. This finite-size effect is illustrated in Fig. 2, where the maximal singular value of , averaged over 50 realizations, is shown for ETKF as a function of ensemble size k for different observational noise variances. Here we used no inflation (i.e., δ = 1) in order to focus on the effect of finite ensemble sizes. It is clearly seen that the projected covariance decreases for large enough ensemble sizes. The variance will asymptote from above to in the limit k → ∞. For sufficiently small observational noise, the filter corrects too large forecast error covariances by incorporating the observations into the analysis leading to a decrease in the analysis error covariance.
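
The finite-ensemble overestimation of the largest covariance eigenvalue can already be seen for samples drawn from a fixed Gaussian, without running a filter; the following small check (with arbitrary dimensions and ensemble sizes) illustrates the qualitative behavior of Fig. 2, not the actual filter experiment.

```python
import numpy as np

rng = np.random.default_rng(1)
true_var = 1.0                     # climatic variance of each component
dim = 2                            # dimension of the projected space

for k in (5, 10, 20, 50, 200):     # ensemble sizes
    max_sv = []
    for _ in range(500):           # realizations
        sample = rng.normal(scale=np.sqrt(true_var), size=(dim, k))
        dev = sample - sample.mean(axis=1, keepdims=True)
        cov = dev @ dev.T / (k - 1)
        max_sv.append(np.linalg.svd(cov, compute_uv=False)[0])
    print(k, np.mean(max_sv))      # decreases toward true_var as k grows
```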

Fig. 2.

Average maximal singular value of as a function of ensemble size k for the stochastic linear toy model in (12) using standard ETKF without inflation, with (dashed curve) and (solid curve). Parameters are σx = σy = γx = γy = 1, λ = 0.2, Δtobs = 1, for which the climatic variance is . We used 50 realizations for the averaging.

However, the fact that the variance constraint is switched on does not necessarily imply that the variance-limiting filter will perform better than the standard ETKF. In particular, for very large observation intervals Δtobs when the ensemble will have acquired the climatic mean and covariances, VLKF and ETKF will have equal skill. We now turn to the question under what conditions VLKF is expected to yield improved skill compared to standard ETKF. To this end we introduce as skill indicator the (squared) RMS error:
e17
between the truth ztruth and the ensemble mean analysis (the square root is left out here for convenience of exposition). Here denotes the temporal average over analysis cycles, and denotes averaging over different realizations of the Brownian paths W. We introduced the norm to investigate the overall skill using , the skill of the observed variables using and the skill of the pseudo-observables using . Using the Kalman filter (3) for the analysis mean with , we obtain for the ETKF:
eq22
Solving the linear toy model (12) for each member of the ensemble and then performing an ensemble average, we obtain
e18
Substituting a particular realization of the truth ztruth(t), and performing the average over the realizations, we finally arrive at
e19
with the mutually independent normally distributed random variables:
e20
We have numerically verified the validity of our assumptions of the statistics of and . Note that for to have mean zero and variance filter divergence has to be excluded. Similarly we obtain for the VLKF
e21
with the normally distributed random variable:
e22
where we used that aclim = 0. Note that using our stationarity assumption to calculate we have . Again we have numerically verified the statistics for . The expression for the RMS error of the VLKF (21) can be considerably simplified. Since for large ensemble sizes k → ∞ the random variable becomes a deterministic variable with mean zero, we may neglect all terms containing . We summarize to
e23
For convenience we have omitted superscripts for and in (19) and (23) to denote whether they have been evaluated for ETKF and VLKF. But note that, although the expressions in (19) and (23) are formally the same, one generally has , because the analysis covariance matrices are calculated differently for both methods leading to different gain matrices and different statistics of ξt in (19) and (23).
We can now estimate the skill improvement defined as
eq23
with values of indicating skill improvement of VLKF over ETKF. We shall choose from now on, and concentrate on the skill improvement for the pseudo-observables. Recalling that for large observation intervals Δtobs, we expect skill improvement for small Δtobs. We perform again a Taylor expansion in small Δtobs of the skill improvement . The resulting analytical expressions are very lengthy and cumbersome, and are therefore omitted for convenience.

We found that there is indeed skill improvement in the limit of either γy → ∞ or γx → 0. This suggests that the skill is controlled by the ratio of the time scales of the observed and the unobserved variables. If the time scale of the pseudo-observables is much shorter than that of the observed variables, VLKF will exhibit superior performance over ETKF. This can be intuitively understood since 1/(2γy) is the time scale on which equilibrium (i.e., the climatic state) is reached for the pseudo-observables y. If the pseudo-observables have relaxed toward equilibrium within the observation interval Δtobs, and their variance has acquired the climatic covariance, we expect the variance limiting to be beneficial.

Furthermore, we found analytically that the skill improvement increases with increasing observational noise Robs (at least in the small observation interval approximation). In particular we found that at Robs = 0. The increase of skill with increasing observational noise can be understood phenomenologically in the following way. For Robs = 0 the filter trusts the observations, which as a time series carry the climatic covariance. This implies that there is a realization of the Wiener process such that the analysis can be reproduced by a model with the true values of γx,y and σx,y. Similarly, this is the case in the other extreme Robs → ∞, where the filter trusts the model. For 0 ≪ Robs ≪ ∞ the analysis reproducing system would have a larger covariance σx than the true value. This slowed-down relaxation towards equilibrium of the observed variables can be interpreted as an effective decrease of the damping coefficient γx. This effectively increases the time-scale separation between the observed and the unobserved variables, which was conjectured above to be beneficial for skill improvement.

As expected, the skill improves with increasing inflation factor δ > 1. The improvement is exactly linear for Δtobs → ∞. This is due to the variance inflation leading to an increase of instances with , for which the variance constraint will be switched on.

In Fig. 3 we present a comparison of the analytical results (19) and (23) with results from a numerical implementation of ETKF and VLKF for varying damping coefficient γy. Since γy controls the time scale of the y process, we cannot use the same Δtobs for a wide range of γy in order not to violate the small observation interval approximations used in our analytical expressions. We choose Δtobs as a function of γy such that the first-order approximation of the forecast variance remains accurate for this Δtobs. For Fig. 3 we have Δtobs ∈ (0.005, 0.01) to preserve the validity of the Taylor expansion. Besides the increase of the skill with γy, Fig. 3 shows that the skill improvement increases significantly for larger values of the inflation factor δ > 1.

Fig. 3.

Dependency of the skill improvement of VLKF over ETKF on the damping coefficient γy of the pseudo-observable. We show a comparison of direct numerical simulations (open circles) with analytical results using (21) (continuous curve) and the approximation of large ensemble size in (23) (dashed curve). Parameters are γx = 1, λ = 2, σx = σy = 1, and Robs = 0.25. We used an ensemble size of k = 20 and averaged over 1000 realizations. (a) No inflation with δ = 1. (b) Inflation with δ = 1.022.

We will see in the next section that the results we obtained for the simple linear toy model (12) hold as well for a more complicated higher-dimensional model, where the dynamic Brownian driving noise is replaced by nonlinear chaotic dynamics.

5. Numerical results for the Lorenz-96 system

We illustrate our method with the Lorenz-96 system (Lorenz 1996) and show its usefulness for sparse observations in improving the analysis skill and stabilizing the filter. In Lorenz (1996), Lorenz proposed the following model for the atmosphere:

dzi/dt = (zi+1 − zi−2)zi−1 − zi + F,   i = 1, … , D,   (24)

with z = (z1, … , zD) and periodic zi+D = zi. This system is a toy model for midlatitude atmospheric dynamics, incorporating linear damping, forcing, and nonlinear transport. The dynamical properties of the Lorenz-96 system have been investigated, for example, by Lorenz and Emanuel (1998), Orrell and Smith (2003), and Gottwald and Melbourne (2005), and the system has also been used as a testbed for data assimilation (e.g., Ott et al. 2004; Fisher et al. 2005; Harlim and Majda 2010). We use D = 40 modes and set the forcing to F = 8. These parameters correspond to a strongly chaotic regime (Lorenz 1996). For these parameters one unit of time corresponds to 5 days in the earth's atmosphere, as calculated by calibrating the e-folding time of the asymptotic growth rate of the most unstable mode with a time scale of 2.1 days (Lorenz 1996). Assuming the length of a midlatitude belt to be about 30 000 km, discretizing the circumference of the earth along the midlatitudes into D = 40 grid points corresponds to a spacing between adjacent grid points zi of approximately 750 km, roughly equalling the Rossby radius of deformation at midlatitudes. We estimated from simulations the advection velocity to be approximately 10.4 m s−1, which compares well with typical wind velocities in the midlatitudes.
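
Equation (24) translates directly into code. The sketch below also includes an implicit midpoint step, matching the integrator described in the next paragraph, solved here by a simple fixed-point iteration whose iteration count is an illustrative choice; the short run only demonstrates the strong sensitivity to initial conditions.

```python
import numpy as np

def lorenz96(z, F=8.0):
    """Right-hand side of (24): dz_i/dt = (z_{i+1} - z_{i-2}) z_{i-1} - z_i + F, periodic in i."""
    return (np.roll(z, -1) - np.roll(z, 2)) * np.roll(z, 1) - z + F

def implicit_midpoint_step(z, dt, n_iter=10):
    """One step of the implicit midpoint rule, z_new = z + dt*f((z + z_new)/2),
    solved by fixed-point iteration starting from an explicit Euler predictor."""
    z_new = z + dt * lorenz96(z)
    for _ in range(n_iter):
        z_new = z + dt * lorenz96(0.5 * (z + z_new))
    return z_new

# usage: D = 40 variables, dt = 1/240 (half an hour); two nearby states separate quickly
D, dt = 40, 1.0 / 240
z1 = 8.0 * np.ones(D); z1[0] += 0.01
z2 = z1.copy(); z2[1] += 1e-8
for _ in range(int(5 / dt)):               # 5 time units, i.e., 25 days
    z1, z2 = implicit_midpoint_step(z1, dt), implicit_midpoint_step(z2, dt)
print(np.linalg.norm(z1 - z2))             # grows by several orders of magnitude from 1e-8
```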

In the following we will investigate the effect of using VLKF on improving the analysis skill when compared to a standard ensemble transform Kalman filter, and on stabilizing the filter and avoiding blow-up as discussed in Ott et al. (2004), Kepert (2004), and Harlim and Majda (2010). We perform twin experiments using a k = 41-member ETKF and VLKF with the same truth time series, the same set of observations, and the same initial ensemble. We have chosen an ensemble with k > D in order to eliminate the effect that a finite-size ensemble can only fit as many observations as the number of its ensemble members (Lorenc 2003). Here we want to focus on the effect of limiting the variance.

The system is integrated using the implicit midpoint rule (e.g., see Leimkuhler and Reich 2005) to a time T = 30 with a time step dt = 1/240. The total time of integration corresponds to an equivalent of 150 days, and the integration time step dt corresponds to half an hour. We measured the approximate climatic mean μclim and climatic variance via a long time integration over a time interval of T = 2000, which corresponds roughly to 27.5 yr. Because of the symmetry of the system (24), the mean and the standard deviation are the same for all variables zi and are measured to be σclim = 3.63 and μclim = 2.34.
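
The climatological statistics quoted above can be estimated from a long free run as sketched below; a fourth-order Runge–Kutta step is used here for brevity as a stand-in for the implicit midpoint rule, and the burn-in length is an illustrative choice.

```python
import numpy as np

def lorenz96(z, F=8.0):
    return (np.roll(z, -1) - np.roll(z, 2)) * np.roll(z, 1) - z + F

def rk4_step(z, dt):
    k1 = lorenz96(z)
    k2 = lorenz96(z + 0.5 * dt * k1)
    k3 = lorenz96(z + 0.5 * dt * k2)
    k4 = lorenz96(z + dt * k3)
    return z + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

D, dt = 40, 1.0 / 240
z = 8.0 * np.ones(D); z[0] += 0.01
burn, total = int(10 / dt), int(2000 / dt)        # discard a transient, then sample up to T = 2000
s1, s2, n = 0.0, 0.0, 0
for step in range(total):
    z = rk4_step(z, dt)
    if step >= burn:
        s1 += z.sum(); s2 += (z ** 2).sum(); n += z.size
mu_clim = s1 / n
sigma_clim = np.sqrt(s2 / n - mu_clim ** 2)
print(mu_clim, sigma_clim)                        # roughly 2.3 and 3.6 for F = 8
```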

The initial ensemble at t = 0 is drawn with a prescribed initial variance; the filter was then subsequently spun up for sufficiently many analysis cycles to ensure statistical stationarity. We assume Gaussian observational noise of the order of 25% of the climatological standard deviation σclim, and set the observational error covariance matrix accordingly to Robs = (0.25σclim)2 I. We find that for larger observational noise levels the variance-limiting correction (11) is used more frequently. This is in accordance with our findings in the previous section for the toy model.

We study first the performance of the filter and its dependence on the time between observations Δtobs and the proportion of the system observed 1/Nobs. Here Nobs = 2 means only every second variable is observed, Nobs = 4 only every fourth, and so on.
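
To make the sparse-observation setup concrete, the following sketch builds the observation operator that picks every Nobs-th variable and generates synthetic observations with noise standard deviation 0.25σclim, as in the twin experiments described above; function and variable names are illustrative.

```python
import numpy as np

def observation_operator(D, n_obs_spacing):
    """Linear operator H selecting every n_obs_spacing-th of the D state variables."""
    return np.eye(D)[::n_obs_spacing]

def generate_observations(z_truth, H, sigma_obs, rng):
    """Noisy observations y_o = H z_truth + r_o with r_o ~ N(0, sigma_obs^2 I)."""
    return H @ z_truth + sigma_obs * rng.standard_normal(H.shape[0])

# usage: D = 40 variables, every 4th observed (Nobs = 4), noise at 25% of sigma_clim = 3.63
rng = np.random.default_rng(42)
D, Nobs, sigma_clim = 40, 4, 3.63
H = observation_operator(D, Nobs)
R_obs = (0.25 * sigma_clim) ** 2 * np.eye(H.shape[0])
y_obs = generate_observations(rng.standard_normal(D), H, 0.25 * sigma_clim, rng)
```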

We have used a constant variance inflation factor δ = 1.05 for both filters. We note that the optimal inflation factor, at which the RMS error is minimal, differs between VLKF and ETKF. For Δtobs = 5/120 (5 h) and Nobs = 4 we find that δ = 1.06 produces minimal RMS errors for VLKF and δ = 1.04 produces minimal RMS errors for ETKF. For δ < 1.04, filter divergence occurs in ETKF, so we chose δ = 1.05 as a compromise between controlling filter divergence and minimizing the RMS errors of the analysis.

Figure 4 shows a sample analysis using ETKF with Nobs = 5 and Δtobs = 0.15, for an arbitrary unobserved component (top panel) and an arbitrary observed component (bottom panel) of the Lorenz-96 model. While the figure shows that the analysis (continuous gray line) tracks the truth (dashed line) reasonably well for the observed component, the analysis is quite poor for the unobserved component. Substantial improvements are seen for the VLKF when we incorporate information about the variance of the unobserved pseudo-observables, as can be seen in Fig. 5. We set the mean and the variance of the pseudo-observables to be the climatic mean and variance, aclim = μclim e, and filter the same truth with the same observations as used to produce Fig. 4. For these parameters (and in this realization) the quality of the analysis in both the observed and unobserved components is improved.

Fig. 4.

Sample ETKF analysis (continuous gray line) for the (top) unobserved z1 and (bottom) observed z5 component. The dashed line is the truth and the crosses are observations. Parameters used were Nobs = 5, Δtobs = 0.15 (18 h), and .

Fig. 5.

Sample VLKF analysis (continuous gray line) for the (top) unobserved z1 and (bottom) observed z5 component. The dashed line is the truth and the crosses are observations. Parameters are as in Fig. 4.

As for the linear toy model (12), finite ensemble sizes exacerbate the overestimation of error covariances. In Fig. 6 the maximal singular value of , averaged over 150 realizations, is shown for ETKF as a function of ensemble size k. Again we use no inflation (i.e., δ = 1) in order to focus on the effect of finite ensemble sizes. The projected covariance clearly decreases for large enough ensemble sizes. However, here the limit of the maximal singular value of for k → ∞ underestimates the climatic variance .

Fig. 6.

Average maximal singular value of as a function of ensemble size k for the Lorenz-96 model in (24), using standard ETKF without inflation and all other parameters are as in Fig. 4. We used 150 realizations for the averaging.

To quantify the improvement of the VLKF filter we measure the site-averaged RMS error:
e25
between the truth ztruth and the ensemble mean, with L = ⌊T/Δtobs⌋, where the average is taken over 500 different realizations, and D denotes the length of the vectors over which the error is evaluated. In Table 1 we display the RMS errors for the ETKF and VLKF, respectively, as a function of Nobs and Δtobs. The increased RMS error for larger observation intervals Δtobs can be linked to the increased variance of the chaotic nonlinear dynamics generated during longer integration times between analyses. Figure 7 shows the average proportional improvement of the VLKF over ETKF, obtained from the values of Table 1. The skill improvement is greatest when the system is observed frequently. For large observation intervals Δtobs, ETKF and VLKF yield very similar RMS errors. We checked that for large observation intervals Δtobs both filters still produce tracking analyses. Note that the observation intervals Δtobs considered here are all much smaller than the e-folding time of 2.1 days. The most significant improvement occurs when one-quarter of the system is observed, that is, for Nobs = 4, and for small observation intervals Δtobs. The dependency of the skill of VLKF on the observation interval is consistent with our analytical findings in section 4.
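
A sketch of the skill diagnostics used in this section: the site- and time-averaged RMS error of an analysis against the truth, and the proportional improvement of one filter over another. The exact averaging in (25) and the convention for the improvement measure may differ in detail, so this is an illustrative implementation only.

```python
import numpy as np

def rms_error(analysis, truth):
    """Site- and time-averaged RMS error; both arrays have shape (L, D):
    L analysis times, D state variables."""
    return np.sqrt(np.mean((analysis - truth) ** 2))

def skill_improvement(err_etkf, err_vlkf):
    """Proportional skill improvement of VLKF over ETKF (here: > 1 means VLKF is better)."""
    return err_etkf / err_vlkf

# usage with dummy data standing in for the analysis means of the two filters
rng = np.random.default_rng(0)
truth = rng.standard_normal((200, 40))
e_etkf = rms_error(truth + 0.5 * rng.standard_normal(truth.shape), truth)
e_vlkf = rms_error(truth + 0.4 * rng.standard_normal(truth.shape), truth)
print(skill_improvement(e_etkf, e_vlkf))
```
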
Table 1. RMS errors for (top) ETKF and (bottom) VLKF for different values of Nobs and observation interval Δtobs, averaged over 500 simulations, with observational noise of standard deviation 0.25σclim.
Fig. 7. Proportional skill improvement of VLKF over ETKF as a function of the observation interval Δtobs for different values of Nobs, with observational noise of standard deviation 0.25σclim. A total of 500 simulations were used to perform the ensemble average in the RMS errors (25) for ETKF and VLKF. Δtobs is measured in hours.

We have checked that the increase in skill depicted in Fig. 7 is not sensitive to incomplete knowledge of the statistical properties of the pseudo-observables, by perturbing the target variance and aclim and monitoring the change in the RMS error. We performed simulations in which the target variance and aclim were drawn independently from uniform distributions covering ±10% of their climatic values [i.e., (0.9aclim, 1.1aclim) for the mean, and analogously for the variance]. We found that for parameters Nobs = 2, 4, 6; η = 0.05, 0.25, 0.5 (with η measuring the fraction of the climatic variance used); and Δtobs = 0.025, 0.05, 0.25 (corresponding to 3, 6, and 30 h), there was on average no more than a 7% difference in the analysis mean and in the singular values of the covariance matrices between the control run, in which aclim = μclime and the unperturbed climatic variance were used, and the runs in which the target variance and aclim were simultaneously perturbed.
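
A hedged sketch of this sensitivity test is given below; run_vlkf_diagnostic is a hypothetical placeholder for a complete assimilation experiment returning, e.g., the analysis RMS error, and the ±10% uniform perturbations follow the description above.

```python
import numpy as np

def perturbed_climatology(var_clim, a_clim, rng):
    """Draw the target variance and pseudo-observable mean from +/-10% uniform ranges.
    Assumes a positive climatic mean so that 0.9*a_clim < 1.1*a_clim elementwise."""
    var_pert = rng.uniform(0.9 * var_clim, 1.1 * var_clim)
    a_pert = rng.uniform(0.9 * a_clim, 1.1 * a_clim)
    return var_pert, a_pert

def sensitivity_test(var_clim, a_clim, run_vlkf_diagnostic, n_runs=20, seed=1):
    """Relative change of a chosen diagnostic with respect to the unperturbed control run."""
    rng = np.random.default_rng(seed)
    control = run_vlkf_diagnostic(var_clim, a_clim)
    perturbed = [run_vlkf_diagnostic(*perturbed_climatology(var_clim, a_clim, rng))
                 for _ in range(n_runs)]
    return (np.mean(perturbed) - control) / control
```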

An interesting question is how the relative skill improvement is distributed over the observed and unobserved variables. This is illustrated in Figs. 8 and 9. Figure 8 shows the proportional skill improvement of VLKF over ETKF separately for the observed variables and for the pseudo-observables; as expected, the improvement is larger for the pseudo-observables than for the observables. Figure 9 shows the actual RMS errors of ETKF and VLKF for the observed variables and the pseudo-observables. The VLKF outperforms the ETKF for the unobserved pseudo-observables at all observation intervals Δtobs; for the observed variables, the VLKF exhibits improved skill either at small observation intervals for all values of Nobs, or at all observation intervals when Nobs = 4, 5. We have checked that in the remaining cases the analysis still tracks the truth reasonably well, so the discrepancy with the ETKF is not caused by the analysis losing track of the truth. As expected, for large observation intervals Δtobs the RMS error asymptotes (not shown) to the standard deviation of the observational noise, 0.25σclim ≈ 0.910, for the observables, and to the climatic standard deviation σclim = 3.63 for the pseudo-observables, albeit slightly reduced for small values of Nobs because of the impact of the surrounding observed variables (see Fig. 10).

Fig. 8. Proportional skill improvement of VLKF over ETKF as a function of the observation interval Δtobs for different values of Nobs. The RMS error is calculated using (a) only the observed variables or (b) only the pseudo-observables. Δtobs is measured in hours. Parameters are as in Fig. 7.

Fig. 9. RMS error of VLKF (solid lines) and ETKF (dashed lines), calculated using (a) only the observed variables or (b) only the pseudo-observables. Δtobs is measured in hours. Parameters are as in Fig. 7.

Fig. 10. RMS error for each variable zi as a function of the lattice site i. Only one site, i = 21, is observed, and the time between observations is Δtobs = 10 h. The results are averaged over 100 different realizations.

Note that there is an order of magnitude difference between the RMS errors for the observables and the pseudo-observables for large Nobs (cf. Fig. 9). This suggests that the information from the observed variables does not travel far from the observational sites. The nonlinear coupling in the Lorenz-96 system (24), however, allows information from the observed components to influence the error statistics of the unobserved components; therefore the RMS errors of pseudo-observables adjacent to observables are smaller than those of pseudo-observables far away from observables. Moreover, the specific structure of the nonlinearity introduces a translational symmetry breaking (one may think of the nonlinearity as a finite-difference approximation of an advection term zzx), which causes pseudo-observables to the right of an observable to have a smaller RMS error than those to the left of an observable. This is illustrated in Fig. 10, where the RMS error is shown for each site when only one site is observed. The advective time scale of the Lorenz-96 system is much smaller than Δtobs, which explains why the skill is not equally distributed over the sites and why, especially for large values of Nobs, we observe a large difference between the site-averaged skills of the observed and unobserved variables.
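
For reference, a minimal sketch of the Lorenz-96 tendency is given below; the dimension D = 40 and forcing F = 8 are standard values assumed here (they are not restated in this section). The quadratic term (z_{i+1} − z_{i−2})z_{i−1} is the advection-like nonlinearity responsible for the left–right asymmetry discussed above.

```python
import numpy as np

def lorenz96_tendency(z, F=8.0):
    """dz_i/dt = (z_{i+1} - z_{i-2}) z_{i-1} - z_i + F, with cyclic indices."""
    return (np.roll(z, -1) - np.roll(z, 2)) * np.roll(z, 1) - z + F

# One explicit Euler step as a usage example; a higher-order integrator (e.g., RK4)
# would be used for the actual experiments.
rng = np.random.default_rng(0)
z = 8.0 + 0.01 * rng.standard_normal(40)   # D = 40 sites (assumed standard value)
z = z + 0.005 * lorenz96_tendency(z)
```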

In Fig. 11 we show how the RMS error behaves as a function of the observational noise level. We see that for Nobs = 4, VLKF always has a smaller RMS error than ETKF.

Fig. 11. RMS error for VLKF (solid lines) and ETKF (dashed lines) as a function of the observational noise level, measured here by η, the fraction of the climatic variance used for the observational noise. The dashed–dotted line indicates the RMS error if only the observations were used. Results are shown for several observation intervals: (a) Δtobs = 1 h, (b) Δtobs = 2 h, and (c) Δtobs = 5 h. Here Nobs = 4, and 1000 simulations were carried out to perform the ensemble averages in the RMS errors (25) for ETKF and VLKF.

These results again confirm our analysis of the toy model in section 4: the VLKF yields the best performance for small observation intervals Δtobs and large noise levels. For large observation intervals the ETKF and VLKF perform equally well, since the chaotic model dynamics will then have led the ensemble to acquire the climatic variance during the propagation between analyses.

In Ott et al. (2004) it was observed that if not all variables zi are observed the Kalman filter diverges, exhibiting blow-up. Similar behavior was observed by Harlim and Majda (2010). Ott et al. (2004) suggested that the sparsity of observations leads to an inhomogeneous background error, which causes an underestimation of the error covariance. Here we study this catastrophic blow-up divergence (as opposed to filter divergence, in which the analysis diverges from the truth) and its dependence on the time between observations Δtobs and the proportion of the system observed, 1/Nobs. We note that blow-up divergence appears only for sufficiently small observational noise and moderate values of Δtobs. Once Δtobs is large enough (in fact, larger than the e-folding time corresponding to the most unstable Lyapunov exponent, in our case 2.1 days) no catastrophic divergence occurs, independent of Nobs. This is probably because for large observation intervals the ensemble acquires enough variance through the nonlinear propagation. We prescribe Gaussian observational noise of the order of 5% of the climatological standard deviation σclim and set the observational error covariance matrix accordingly. The initial ensemble at t = 0 is again drawn in the same way as in the previous experiments.

To study the performance of the VLKF when blow-up occurs in ETKF simulations, we count the number Nb of blow-ups that occur before a total of 100 simulations have terminated without blow-up. The proportion of blow-ups for each filter is then given by Nb/(Nb + 100). We tabulate this proportion in Table 2 for the ETKF and VLKF, respectively, and the proportional improvement in Table 3. The crosses (×) in the tables mark cases for which no successful simulations could be obtained because of blow-up.
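
A sketch of this bookkeeping is given below; run_filter_blows_up is a hypothetical placeholder that runs one assimilation experiment and reports whether catastrophic blow-up occurred.

```python
def blowup_proportion(run_filter_blows_up, n_success_target=100):
    """Count blow-ups N_b until 100 runs finish cleanly; return N_b / (N_b + 100)."""
    n_blowup = 0
    n_success = 0
    while n_success < n_success_target:
        if run_filter_blows_up():   # True if the run diverged catastrophically
            n_blowup += 1
        else:
            n_success += 1
    return n_blowup / (n_blowup + n_success_target)
```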

Table 2. Proportion of catastrophically diverging simulations with (top) ETKF and (bottom) VLKF for different values of Nobs and observation interval Δtobs. Observational noise of 5% of the climatological standard deviation was used.
Table 3. Proportional improvement of VLKF over ETKF, calculated as the ratio of the corresponding values in Table 2.

Both filters suffer from severe filter instability for Nobs = 6 (i.e., for very sparse observational networks) at small observation intervals Δtobs. No blow-up occurs for either filter when every variable is observed. Note the reduction in the occurrence of blow-ups for large observation intervals Δtobs, as discussed above. We have checked that for all Nobs there is no blow-up for the ETKF (and VLKF) for sufficiently large Δtobs (not shown); the larger Nobs, the smaller the upper bound on the values of Δtobs for which blow-ups occur. Collapse is most prominent for the ETKF (and, to a much lesser extent, for the VLKF) for larger values of Nobs and at intermediate observation intervals that depend on Nobs. Tables 2 and 3 clearly show that incorporating information about the pseudo-observables strongly increases the stability of the filter and suppresses blow-up. However, we note that despite the gain in stability the VLKF has a skill below the purely observational skill in those cases where blow-up occurs for the ETKF, because the solutions become nontracking. Further research is under way to improve on this within the VLKF framework.

The fact that incorporating information about the variance of the unobserved variables improves the stability of the filter is in accordance with the interpretation of filter divergence in sparse observational networks provided by Ott et al. (2004).

6. Discussion

We have developed a framework to include information about the variance of unobserved variables in a sparse observational network. The filter is designed to control the overestimation of error covariances that is typical of sparse observational networks, and it limits the posterior analysis covariance of the unresolved variables to stay below their climatic variance. We have done so in a variational setting and found a relationship between the error covariance of the variance constraint and the assumed target variance of the unobserved pseudo-observables.

We illustrated the beneficial effects of the variance-limiting filter in improving the analysis skill when compared with the standard ensemble square root Kalman filter. We expect the variance-limiting constraint to improve data assimilation with ensemble Kalman filters whenever finite-size effects of too-small ensembles lead to overestimated error covariances, in particular in sparse observational networks. Specifically, we found that the skill improves for small observation intervals Δtobs and sufficiently large observational noise, with substantial skill improvement for both observed and unobserved variables. These effects can be understood with a simple linear toy model that allows for an analytical treatment. We further established numerically that the VLKF reduces the probability of catastrophic filter divergence and improves the stability of the filter when compared with the standard ensemble square root Kalman filter.

We remark that the idea of the variance-limiting Kalman filter is not restricted to ensemble Kalman filters, but can also be used to modify the extended Kalman filter. However, for the examples we used here the nonlinearities were too strong and the extended Kalman filter did not yield satisfactory results, even in the variance-limiting formulation.

The ability of the variance-limiting filter to control unrealistically large error covariances of the poorly resolved variables due to finite ensemble sizes may find useful applications. We mention here that the variance constraint is able to adaptively damp unrealistic excitation of ensemble spread in underresolved spatial regions caused by inappropriate uniform inflation. This may be an alternative to the recently developed spatially adaptive schemes (Anderson 2007; Li et al. 2009). In addition, it is known that localization of covariance matrices in ensemble Kalman filters leads to imbalance in the analyzed fields (see, e.g., Houtekamer and Mitchell 2005; Kepert 2009 for recent studies). Filter localization typically excites unwanted gravity waves that, when uncontrolled, can substantially degrade filter performance. One may construct balance constraints as pseudo-observations and thereby potentially reduce this undesired aspect of covariance localization. As more specific applications, we mention climate reanalysis and data assimilation for the mesosphere. It would be interesting to see how the proposed variance-limiting filter could be used in climate reanalysis schemes to deal with the vertical sparsity of observational data and the less dense observational network in the Southern Hemisphere in the preradiosonde era (see Whitaker et al. 2004). One would need to establish, though, whether the historical observation intervals Δtobs are sufficiently small to allow for a skill improvement. Similarly, the filter may help to control the dynamically dominant gravity wave activity in the mesosphere as the upper lid of models is pushed farther and farther up (see, e.g., Polavarapu et al. 2005). A word of caution is required here, however. In some atmospheric data assimilation problems it is not at all uncommon for the ensemble prior variance of certain variables to be significantly larger than the climatological variance when the atmosphere is locally far from equilibrium; one relevant example is the vicinity of strong fronts over the Southern Ocean. In such cases it may not be appropriate to limit the variance to the climatological value.

In this work we have studied systems in which, for sufficiently large observation intervals Δtobs, the variables acquire their true climatological mean and variance when the model is run; in particular, we have not included model error. It would be interesting to see whether the variance-limiting filter can help to control model error in cases where the free-running model produces unrealistically large forecast covariances. Numerical schemes usually underestimate error covariances, but this is often caused by severe divergence damping (Durran 1999), which is artificially introduced to control unwanted gravity wave activity and to stabilize the numerical scheme. Stabilization might then be achieved with a much smaller amount of divergence damping by implementing the variance-limiting constraint in the data assimilation procedure; the VLKF would in this case act as an effective adaptive damping scheme, counteracting the model error.

Acknowledgments

We thank Craig Bishop and Jeffrey Kepert for pointing us to the possible application to simulations involving the mesosphere. We also thank the editor and three anonymous reviewers for valuable comments. GAG acknowledges support from the ARC.

REFERENCES

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224.

  • Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83.

  • Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.

  • Bengtsson, L., and Coauthors, 2007: The need for a dynamical climate reanalysis. Bull. Amer. Meteor. Soc., 88, 495–501.

  • Bergemann, K., G. Gottwald, and S. Reich, 2009: Ensemble propagation and continuous matrix factorization algorithms. Quart. J. Roy. Meteor. Soc., 135, 1560–1572.

  • Bocquet, M., C. A. Pires, and L. Wu, 2010: Beyond Gaussian statistical modeling in geophysical data assimilation. Mon. Wea. Rev., 138, 2997–3023.

  • Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 125, 2887–2908.

  • Charron, M., G. Pellerin, L. Spacek, P. L. Houtekamer, N. Gagnon, H. L. Mitchell, and L. Michelin, 2010: Toward random sampling of model error in the Canadian ensemble prediction system. Mon. Wea. Rev., 138, 1877–1901.

  • Compo, G. P., and Coauthors, 2011: The twentieth century reanalysis project. Quart. J. Roy. Meteor. Soc., 137, 1–28, doi:10.1002/qj.776.

  • Durran, D. R., 1999: Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. Springer, 482 pp.

  • Eckermann, S. D., and Coauthors, 2009: High-altitude data assimilation system experiments for the northern summer mesosphere season of 2007. J. Atmos. Sol.-Terr. Phys., 71, 531–551.

  • Ehrendorfer, M., 2007: A review of issues in ensemble-based Kalman filtering. Meteor. Z., 16, 795–818.

  • Evensen, G., 2006: Data Assimilation: The Ensemble Kalman Filter. Springer, 280 pp.

  • Fisher, M., M. Leutbecher, and G. A. Kelly, 2005: On the equivalence between Kalman smoothing and weak-constraint four-dimensional variational data assimilation. Quart. J. Roy. Meteor. Soc., 131, 3235–3246.

  • Gardiner, C. W., 2004: Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences. 3rd ed. Springer, 415 pp.

  • Golub, G. H., and C. F. Van Loan, 1996: Matrix Computations. 3rd ed. The Johns Hopkins University Press, 728 pp.

  • Gottwald, G. A., and I. Melbourne, 2005: Testing for chaos in deterministic systems with noise. Physica D, 212, 100–110.

  • Hamill, T. M., and J. S. Whitaker, 2005: Accounting for the error due to unresolved scales in ensemble data assimilation: A comparison of different approaches. Mon. Wea. Rev., 133, 3132–3147.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Harlim, J., and A. J. Majda, 2010: Catastrophic filter divergence in filtering nonlinear dissipative systems. Comm. Math. Sci., 8, 27–43.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–136.

  • Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. Quart. J. Roy. Meteor. Soc., 131, 3269–3289.

  • Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.

  • Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. Mon. Wea. Rev., 137, 2126–2143.

  • Ide, K., P. Courtier, M. Ghil, and A. C. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181–189.

  • Kalnay, E., 2002: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, 364 pp.

  • Kepert, J. D., 2004: On ensemble representation of the observation-error covariances in the ensemble Kalman filter. Ocean Dyn., 54, 561–569.

  • Kepert, J. D., 2009: Covariance localisation and balance in an ensemble Kalman filter. Quart. J. Roy. Meteor. Soc., 135, 1157–1176.

  • Leimkuhler, B., and S. Reich, 2005: Simulating Hamiltonian Dynamics. Cambridge University Press, 379 pp.

  • Li, H., E. Kalnay, and T. Miyoshi, 2009: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. Quart. J. Roy. Meteor. Soc., 135, 523–533.

  • Liu, J., E. J. Fertig, H. Li, E. Kalnay, B. R. Hunt, E. J. Kostelich, I. Szunyogh, and R. Todling, 2008: Comparison between local ensemble transform Kalman filter and PSAS in the NASA finite volume GCM—Perfect model experiments. Nonlinear Processes Geophys., 15, 645–659.

  • Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4DVAR. Quart. J. Roy. Meteor. Soc., 129, 3183–3203.

  • Lorenz, E. N., 1996: Predictability—A problem partly solved. Predictability, T. Palmer, Ed., European Centre for Medium-Range Weather Forecasts, 1–18.

  • Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414.

  • Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. Mon. Wea. Rev., 128, 416–433.

  • Neef, L., S. M. Polavarapu, and T. G. Shepherd, 2006: Four-dimensional data assimilation and balanced dynamics. J. Atmos. Sci., 63, 1840–1850.

  • Orrell, D., and L. Smith, 2003: Visualising bifurcations in high dimensional systems: The spectral bifurcation diagram. Int. J. Bifurcation Chaos, 13, 3015–3028.

  • Ott, E., B. Hunt, I. Szunyogh, A. Zimin, E. Kostelich, M. Corrazza, E. Kalnay, and J. Yorke, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Pires, C. A., O. Talagrand, and M. Bocquet, 2010: Diagnosis and impacts of non-Gaussianity of innovations in data assimilation. Physica D, 239, 1701–1717.

  • Polavarapu, S., T. G. Shepherd, Y. Rochon, and S. Ren, 2005: Some challenges of middle atmosphere data assimilation. Quart. J. Roy. Meteor. Soc., 131, 3513–3527.

  • Sankey, D., S. Ren, S. Polavarapu, Y. Rochon, Y. Nezlin, and S. Beagley, 2007: Impact of data assimilation filtering methods on the mesosphere. J. Geophys. Res., 112, D24104, doi:10.1029/2007JD008885.

  • Sasaki, Y., 1970: Some basic formalisms on numerical variational analysis. Mon. Wea. Rev., 98, 875–883.

  • Shutts, G. J., 2005: A stochastic kinetic energy backscatter algorithm for use in ensemble prediction systems. Quart. J. Roy. Meteor. Soc., 131, 3079–3102.

  • Simon, D. J., 2006: Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. John Wiley & Sons, Inc., 552 pp.

  • Szunyogh, I., E. Kostelich, G. Gyarmati, D. J. Patil, B. Hunt, E. Kalnay, E. Ott, and J. Yorke, 2005: Assessing a local ensemble Kalman filter: Perfect model experiments with the National Centers for Environmental Prediction global model. Tellus, 57A, 528–545.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • Wang, X., C. H. Bishop, and S. J. Julier, 2004: Which is better, an ensemble of positive–negative pairs or a centered spherical simplex ensemble? Mon. Wea. Rev., 132, 1590–1605.

  • Whitaker, J. S., G. P. Compo, X. Wei, and T. M. Hamill, 2004: Reanalysis without radiosondes using ensemble data assimilation. Mon. Wea. Rev., 132, 1190–1200.

  • Whitaker, J. S., G. P. Compo, and J. N. Thépaut, 2009: A comparison of variational and ensemble-based data assimilation systems for reanalysis of sparse observations. Mon. Wea. Rev., 137, 1991–1999.

  • Wolfram Research, Inc., 2008: Mathematica, Version 7.0. Wolfram Research, Inc., Champaign, IL. [Available online at http://www.wolfram.com/mathematica/.]

  • Zupanski, D., 1997: A general weak constraint applicable to operational 4DVar data assimilation systems. Mon. Wea. Rev., 125, 2274–2292.
1 The exposition is restricted to finite-dimensional Euclidean state spaces, but we note that the formulation can be generalized to Hilbert spaces.

2 We use bold font for matrices and vectors and regular font for scalars; it should be clear from the context whether a bold symbol refers to a matrix or a vector.

3 We actually compute ; however, since is diagonal for our choice of the matrices, positive definiteness of implies positive definiteness of .
