• Anderson, E., and Coauthors, 1992: LAPACK Users’ Guide. Society for Industrial and Applied Mathematics, 235 pp.

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642.

  • Axelsson, O., 1984: Iterative Solution Methods. Cambridge University Press, 644 pp.

  • Bell, B. M., and F. W. Cathey, 1993: The iterated Kalman filter update as a Gauss-Newton method. IEEE Trans. Automat. Contr., 38, 294–297.

  • Bishop, C., J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.

  • Brasseur, P., J. Ballabrera, and J. Verron, 1999: Assimilation of altimetric data in the mid-latitude oceans using the SEEK filter with an eddy-resolving primitive equation model. J. Mar. Syst., 22, 269–294.

  • Buizza, R., and A. Montani, 1999: Targeting observations using singular vectors. J. Atmos. Sci., 56, 2965–2985.

  • Cohn, S. E., 1997: Estimation theory for data assimilation problems: Basic conceptual framework and some open questions. J. Meteor. Soc. Japan, 75, 257–288.

  • Cohn, S. E., A. da Silva, J. Guo, M. Sienkiewicz, and D. Lamich, 1998: Assessing the effects of data selection with the DAO physical-space statistical analysis system. Mon. Wea. Rev., 126, 2913–2926.

  • Courtier, P., J-N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1388.

  • Daley, R., and R. Menard, 1993: Spectral characteristics of Kalman filter systems for atmospheric data assimilation. Mon. Wea. Rev., 121, 1554–1565.

  • Daley, R., and E. Barker, 2001: NAVDAS: Formulation and diagnostics. Mon. Wea. Rev., 129, 869–883.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte-Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10143–10162.

  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367.

  • Evensen, G., and P. J. van Leeuwen, 2000: An ensemble Kalman smoother for nonlinear dynamics. Mon. Wea. Rev., 128, 1852–1867.

  • Fisher, M., and P. Courtier, 1995: Estimating the covariance matrix of analysis and forecast error in variational data assimilation. ECMWF Tech. Memo. 220, 28 pp.

  • Gandin, L. S., 1963: Objective Analysis of Meteorological Fields. (in Russian). Gidrometeorizdar, 238 pp. [English translation by Israel Program for Scientific Translations, 1965, 242 pp.].

  • Gill, P. E., W. Murray, and M. H. Wright, 1981: Practical Optimization. Academic Press, 401 pp.

  • Golub, G. H., and C. F. van Loan, 1989: Matrix Computations. 2d ed. The Johns Hopkins University Press, 642 pp.

  • Gottwald, G., and R. Grimshaw, 1999a: The formation of coherent structures in the context of blocking. J. Atmos. Sci., 56, 3640–3662.

  • Gottwald, G., and R. Grimshaw, 1999b: The effect of topography on the dynamics of interacting solitary waves in the context of atmospheric blocking. J. Atmos. Sci., 56, 3663–3678.

  • Greenwald, T. J., S. A. Christopher, J. Chou, and J. C. Liljegren, 1999: Inter-comparison of cloud liquid water path derived from the GOES 9 imager and ground based microwave radiometers for continental stratocumulus. J. Geophys. Res., 104, 9251–9260.

  • Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Haugen, V. E. J., and G. Evensen, 2002: Assimilation of SLA and SST data into an OGCM for the Indian Ocean. Ocean Dyn., 52, 133–151.

  • Heemink, A. W., M. Verlaan, and J. Segers, 2001: Variance reduced ensemble Kalman filtering. Mon. Wea. Rev., 129, 1718–1728.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 376 pp.

  • Kalman, R., and R. Bucy, 1961: New results in linear prediction and filtering theory. Trans. AMSE J. Basic Eng., 83D, 95–108.

  • Keppenne, C. L., 2000: Data assimilation into a primitive-equation model with a parallel ensemble Kalman filter. Mon. Wea. Rev., 128, 1971–1981.

  • Keppenne, C. L., and M. M. Rienecker, 2002: Initial testing of massively parallel ensemble Kalman filter with the Poseidon isopycnal ocean general circulation model. Mon. Wea. Rev., 130, 2951–2965.

  • Langland, R. H., and Coauthors, 1999: The North Pacific Experiment (NORPEX-98): Targeted observations for improved North American weather forecasts. Bull. Amer. Meteor. Soc., 80, 1363–1384.

  • Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Part I: Theory and schemes. Mon. Wea. Rev., 127, 1385–1407.

  • Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.

  • Luenberger, D. L., 1984: Linear and Non-linear Programming. 2d ed. Addison-Wesley, 491 pp.

  • Majumdar, S. J., C. H. Bishop, B. J. Etherton, and Z. Toth, 2002: Adaptive sampling with the ensemble transform Kalman filter. Part II: Field program implementation. Mon. Wea. Rev., 130, 1356–1369.

  • Marchant, T. R., and N. F. Smyth, 2002: The initial-boundary problem for the Korteweg–de Vries equation on the negative quarter-plane. Proc. Roy. Soc. London, 458A, 857–871.

  • Menard, R., S. E. Cohn, L-P. Chang, and P. M. Lyster, 2000: Assimilation of stratospheric chemical tracer observations using a Kalman filter. Part I: Formulation. Mon. Wea. Rev., 128, 2654–2671.

  • Mitsudera, H., 1994: Eady solitary waves: A theory of type B cyclogenesis. J. Atmos. Sci., 51, 3137–3154.

  • Navon, I. M., X. Zou, J. Derber, and J. Sela, 1992: Variational data assimilation with an adiabatic version of the NMC spectral model. Mon. Wea. Rev., 120, 1433–1446.

  • Nocedal, J., 1980: Updating quasi-Newton matrices with limited storage. Math. Comput., 35, 773–782.

  • Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza, 1998: Singular vectors, metrics, and adaptive observations. J. Atmos. Sci., 55, 633–653.

  • Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center’s Spectral Statistical Interpolation Analysis System. Mon. Wea. Rev., 120, 1747–1763.

  • Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

  • Rabier, F., A. McNally, E. Andersson, P. Courtier, P. Unden, J. Eyre, A. Hollingsworth, and F. Bouttier, 1998: The ECMWF implementation of three dimensional variational assimilation (3D-Var). Part II: Structure functions. Quart. J. Roy. Meteor. Soc., 124A, 1809–1829.

  • Rabier, F., H. Jarvinen, E. Klinker, J-F. Mahfouf, and A. Simmons, 2000: The ECMWF operational implementation of four-dimensional variational assimilation. I: Experimental results with simplified physics. Quart. J. Roy. Meteor. Soc., 126A, 1143–1170.

  • Reichle, R. H., D. B. McLaughlin, and D. Entekhabi, 2002a: Hydrologic data assimilation with the ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.

  • Reichle, R. H., J. P. Walker, R. D. Koster, and P. R. Houser, 2002b: Extended versus ensemble Kalman filtering for land data assimilation. J. Hydrometeor., 3, 728–740.

  • Szunyogh, I., Z. Toth, A. V. Zimin, S. J. Majumdar, and A. Persson, 2002: Propagation of the effect of targeted observations: The 2000 Winter Storm Reconnaissance Program. Mon. Wea. Rev., 130, 1144–1165.

  • Tippett, M., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • van Leeuwen, P. J., 2001: An ensemble smoother with error estimates. Mon. Wea. Rev., 129, 709–728.

  • Verlaan, M., and A. W. Heemink, 2001: Nonlinearity in data assimilation applications: A practical method for analysis. Mon. Wea. Rev., 129, 1578–1589.

  • Vvedensky, D., 1993: Partial Differential Equations with Mathematica. Addison-Wesley, 465 pp.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.

  • Zou, X., Y-H. Kuo, and Y-R. Guo, 1995: Assimilation of atmospheric radio refractivity using a nonhydrostatic adjoint model. Mon. Wea. Rev., 123, 2229–2250.

  • Zou, X., H. Liu, J. Derber, J. G. Sela, R. Treadon, I. M. Navon, and B. Wang, 2001: Four-dimensional variational data assimilation with a diabatic version of the NCEP global spectral model: System development and preliminary results. Quart. J. Roy. Meteor. Soc., 127, 1095–1122.

  • Zupanski, M., 1993: Regional four-dimensional variational data assimilation in a quasi-operational forecasting environment. Mon. Wea. Rev., 121, 2396–2408.

  • Zupanski, M., D. Zupanski, D. Parrish, E. Rogers, and G. DiMego, 2002: Four-dimensional variational data assimilation for the Blizzard of 2000. Mon. Wea. Rev., 130, 1967–1988.

Fig. 1. Time integration of the KdVB model and observations: (a) targeted observations and (b) in situ observations. The triangles denote the observations. The horizontal axis represents the model domain, and the ordinate axis is the amplitude. The cycles shown are 1, 4, 7, and 10. Note how the targeted observations follow the solitons, whereas the in situ observations remain in one location.

Fig. 2. Chi-square statistics in the linear observation operator assimilation experiment, with (a) 10 and (b) 101 observations per cycle. The dashed line represents instant values of χ² from each analysis cycle, and the solid line represents a 10-cycle moving average.

Fig. 3. Innovation histogram in the linear observation operator assimilation experiment, with (a) 10 and (b) 101 observations per cycle. The solid line represents the normal distribution N(0, 1).

Fig. 4. The rms error in the control MLEF experiment, with quadratic observation operator and 10 observations (thin solid line). The horizontal axis denotes the analysis cycles, and the ordinate axis is the rms error. Also, the rms error in the no-assimilation experiment is shown (thick solid line).

Fig. 5. The analysis error covariance in the control MLEF experiment: analysis cycles (a) 1, (b) 4, (c) 7, and (d) 10. Each point represents the (i, j)th matrix element p_ij, with the horizontal axis denoting the i index and the ordinate axis denoting the j index. Dark-shaded area represents positive covariance, and the light-shaded area represents the negative covariance, using the threshold of ±1 × 10⁻⁴ nondimensional units. The contour interval is (a) 20, (b) 2, (c) 2, and (d) 2 nondimensional units.

Fig. 6. Innovation statistics in the control MLEF experiment: (a) χ² test and (b) PDF histogram. The notation is same as in Figs. 2 and 3.

Fig. 7. Impact of minimization on the MLEF performance, showing the rms errors of ensemble data assimilation without minimization (solid line) and those of the control MLEF for comparison (dashed line). The horizontal axis denotes the analysis cycles, and the ordinate axis is the rms error.

Fig. 8. Same as in Fig. 7, but for the experiment with five observations.

Fig. 9. Same as in Fig. 7, but for in situ observation experiment with 10 observations.

Fig. 10. Impact of observation location. The dashed line represents the rms errors obtained with in situ observations, and the solid line is the rms error from the control MLEF experiment (i.e., targeted observations).

Fig. 11. Same as in Fig. 5, but for the in situ assimilation experiment. The contour interval is (a) 20, (b) 2, (c) 2, and (d) 20 nondimensional units.


Maximum Likelihood Ensemble Filter: Theoretical Aspects

Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, Colorado

Abstract

A new ensemble-based data assimilation method, named the maximum likelihood ensemble filter (MLEF), is presented. The analysis solution maximizes the likelihood of the posterior probability distribution, obtained by minimization of a cost function that depends on a general nonlinear observation operator. The MLEF belongs to the class of deterministic ensemble filters, since no perturbed observations are employed. As in variational and ensemble data assimilation methods, the cost function is derived using a Gaussian probability density function framework. Like other ensemble data assimilation algorithms, the MLEF produces an estimate of the analysis uncertainty (e.g., analysis error covariance). In addition to the common use of ensembles in calculation of the forecast error covariance, the ensembles in MLEF are exploited to efficiently calculate the Hessian preconditioning and the gradient of the cost function. A sufficient number of iterative minimization steps is 2–3, because of superior Hessian preconditioning. The MLEF method is well suited for use with highly nonlinear observation operators, for a small additional computational cost of minimization. The consistent treatment of nonlinear observation operators through optimization is an advantage of the MLEF over other ensemble data assimilation algorithms. The cost of MLEF is comparable to the cost of existing ensemble Kalman filter algorithms. The method is directly applicable to most complex forecast models and observation operators. In this paper, the MLEF method is applied to data assimilation with the one-dimensional Korteweg–de Vries–Burgers equation. The tested observation operator is quadratic, in order to make the assimilation problem more challenging. The results illustrate the stability of the MLEF performance, as well as the benefit of the cost function minimization. The improvement is noted in terms of the rms error, as well as the analysis error covariance. 
The statistics of innovation vectors (observation minus forecast) also indicate a stable performance of the MLEF algorithm. Additional experiments suggest the amplified benefit of targeted observations in ensemble data assimilation.

Corresponding author address: Milija Zupanski, Cooperative Institute for Research in the Atmosphere, Colorado State University, Foothills Campus, Fort Collins, CO 80523-1375. Email: zupanskim@cira.colostate.edu

1. Introduction

Since the early 1960s, data assimilation in atmospheric and oceanographic applications has been based on Kalman filtering theory (Kalman and Bucy 1961; Jazwinski 1970). Beginning with optimal interpolation (Gandin 1963), and continuing with three-dimensional (Parrish and Derber 1992; Rabier et al. 1998; Cohn et al. 1998; Daley and Barker 2001) and four-dimensional variational data assimilation (Navon et al. 1992; Zupanski 1993; Zou et al. 1995, 2001; Courtier et al. 1994; Rabier et al. 2000; Zupanski et al. 2002), the data assimilation methodologies used operationally in atmospheric and oceanic applications can be viewed as efforts to approximate the Kalman filter/smoother theoretical framework (Cohn 1997). The approximations are necessary because the statistical properties of models and observations are poorly known, and because of the tremendous computational burden associated with the high dimensionality of realistic atmospheric and oceanic data assimilation problems. So far, the common approaches to realistic data assimilation have been to approximate (e.g., model) the error covariances and to avoid calculating the posterior (e.g., analysis) error covariance. These approximations share a common problem: they cannot use fully cycled error covariance information, as the theory suggests. The consequence is not only that the produced analysis is of reduced quality, but also that no reliable estimates of its uncertainties are available.

A novel approach to data assimilation in oceanography and meteorology pursued in recent years (Evensen 1994; Houtekamer and Mitchell 1998; Pham et al. 1998; Lermusiaux and Robinson 1999; Brasseur et al. 1999; Hamill and Snyder 2000; Evensen and van Leeuwen 2000; Keppenne 2000; Bishop et al. 2001; Anderson 2001; van Leeuwen 2001; Haugen and Evensen 2002; Reichle et al. 2002b; Whitaker and Hamill 2002; Anderson 2003; Ott et al. 2004), based on the use of ensemble forecasting in nonlinear Kalman filtering, offers for the first time the means to consistently estimate the analysis uncertainties. The price to pay is the reduced dimension of the analysis subspace (defined by the ensemble forecasts); thus there is a concern that this subspace may not be sufficient to adequately represent all important dynamical features and instabilities. Preliminary results show, however, that this may not always be a problem (e.g., Houtekamer and Mitchell 2001; Keppenne and Rienecker 2002). On the other hand, it is anticipated that the ensemble size will need to be increased as more realistic and higher-resolution models and observations are used. This, however, may be feasible even on currently available computers. With the advancement in computer technology, and in multiple processing in particular, which is ideally suited to the ensemble framework, the future looks promising for the continuing development and realistic application of ensemble data assimilation methodology.

In achieving that goal, however, there are still a few unresolved methodological and practical issues that will be pursued in this paper. Current ensemble data assimilation methodologies are broadly grouped into stochastic and deterministic approaches (Tippett et al. 2003). A common starting point for these algorithms is the solution form of the extended Kalman filter (e.g., Evensen 2003), obtained by assuming linearized dynamics and observation operators, together with a Gaussian assumption regarding the measurements and control variables (e.g., initial conditions). We refer to this as a linearized solution form. Since realistic observation operators are generally nonlinear, a common approach to nonlinearity in ensemble data assimilation is to use a first-order Taylor series assumption, that is, to use a difference between two nonlinear operators in place of a linearized observation operator. The use of a linearized solution form with nonlinear observation operators, however, creates a mathematical inconsistency in the treatment of nonlinear observation operators. An alternative way to deal with the nonlinearity of observation operators is to first pose a fully nonlinear problem, and then find the solution in the ensemble-spanned subspace. This is the approach adopted in this paper.

The proposed ensemble data assimilation method is based on a combination of the maximum likelihood and ensemble data assimilation, named the maximum likelihood ensemble filter (MLEF). The analysis solution is obtained as a model state that maximizes the posterior conditional probability distribution. In practice, the calculation of the maximum likelihood state estimate is performed using an iterative minimization algorithm, thus making the MLEF approach closely related to the iterated Kalman filter (Jazwinski 1970; Cohn 1997). Since the cost function used to define the analysis problem is arbitrarily nonlinear, the treatment of nonlinear observation operators is considered an advantage of the MLEF algorithm. The use of optimization in MLEF forms a bond between ensemble data assimilation and control theory. Like other ensemble data assimilation algorithms, MLEF produces an estimate of the analysis uncertainty (e.g., analysis error covariance). The idea behind this development is to produce a method capable of optimally exploiting the experience gathered in operational data assimilation and the advancements in ensemble data assimilation, eventually producing a qualitatively new system. The practical goal is to develop a single data assimilation system easily applicable to the simplest, as well as to the most complex nonlinear models and observation operators.

While the maximum likelihood estimate has a unique solution for unimodal probability density functions (PDFs), there is a possibility for a nonunique solution in the case of multimodal PDFs. This issue will be given more attention in the future.

The method is explained in section 2, algorithmic details are given in section 3, experimental design is presented in section 4, results in section 5, and conclusions are drawn in section 6.

2. MLEF methodology

From variational methods it is known that a maximum likelihood estimate, adopted in MLEF, is a suitable approach in applications to realistic data assimilation in meteorology and oceanography. From operational applications of data assimilation methods, it is also known that a Gaussian PDF assumption, used in derivation of the cost function (e.g., Lorenc 1986), is generally accepted and widely used. Although the model and observation operators are generally nonlinear, and observation and forecast errors are not necessarily Gaussian, the Gaussian PDF framework is still a state-of-the-art approach in meteorological and oceanographic data assimilation (e.g., Cohn 1997). This is the main reason why a Gaussian PDF framework is used in this paper.

The mathematical framework of the MLEF algorithm is presented in two parts, the forecast and the analysis steps, followed by a brief comparison with related data assimilation methodologies.

a. Forecast step

The forecast error covariance evolution of the discrete Kalman filter with Gaussian error assumption can be written (Jazwinski 1970) as
$$\mathbf{P}_f(k) = \mathbf{M}_{k-1,k}\,\mathbf{P}_a(k-1)\,\mathbf{M}_{k-1,k}^{\mathrm{T}} + \mathbf{Q}(k-1), \tag{1}$$
where $\mathbf{P}_f(k)$ is the forecast error covariance at time $t_k$, $\mathbf{M}_{k-1,k}$ is the linearized forecast model (e.g., Jacobian) from time $t_{k-1}$ to time $t_k$, $\mathbf{P}_a(k-1)$ is the analysis error covariance at time $t_{k-1}$, and $\mathbf{Q}(k-1)$ is the model error covariance at time $t_{k-1}$. The model error is neglected in the remainder of this paper. With this assumption, and after dropping the time indexing, the forecast error covariance is
$$\mathbf{P}_f = \mathbf{M}\mathbf{P}_a\mathbf{M}^{\mathrm{T}} = (\mathbf{M}\mathbf{P}_a^{1/2})(\mathbf{M}\mathbf{P}_a^{1/2})^{\mathrm{T}}. \tag{2}$$
Let us assume the square root analysis error covariance is a column matrix
$$\mathbf{P}_a^{1/2} = (\mathbf{p}_1\ \ \mathbf{p}_2\ \ \cdots\ \ \mathbf{p}_S), \qquad \mathbf{p}_i = (p_{1,i}\ \ p_{2,i}\ \ \cdots\ \ p_{N,i})^{\mathrm{T}}, \tag{3}$$
where the index $N$ defines the dimension of the model state (e.g., initial conditions), and the index $S$ refers to the number of ensembles. In practical ensemble applications, $S$ is much smaller than $N$. Using (3) in definition (2), the square root forecast error covariance is
$$\mathbf{P}_f^{1/2} = (\mathbf{b}_1\ \ \mathbf{b}_2\ \ \cdots\ \ \mathbf{b}_S), \qquad \mathbf{b}_i = M(\mathbf{x}_{k-1} + \mathbf{p}_i) - M(\mathbf{x}_{k-1}) \approx \mathbf{M}\mathbf{p}_i, \tag{4}$$
where $\mathbf{x}_{k-1}$ is the analysis from the previous analysis cycle, at time $t_{k-1}$. Note that each of the columns $\{\mathbf{b}_i: i = 1, \ldots, S\}$ has $N$ elements. The ensemble square root forecast error covariance $\mathbf{P}_f^{1/2}$ can be obtained from $S$ nonlinear ensemble forecasts, $M(\mathbf{x}_{k-1} + \mathbf{p}_i)$, plus one control forecast, $M(\mathbf{x}_{k-1})$ [e.g., (4)]. The forecast error covariance definition (4) implies the use of a control (deterministic) forecast instead of an ensemble mean, commonly used in other ensemble data assimilation methods. Ideally, the control forecast represents the most likely dynamical state; thus it is intrinsically related to the use of the maximum likelihood approach. In principle, however, the use of an ensemble mean instead of the most likely deterministic forecast is also possible.
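
The construction in (4) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `sqrt_forecast_covariance`, the toy model `toy_model`, and the sizes `N` and `S` are hypothetical and are not taken from the paper.

```python
import numpy as np

def sqrt_forecast_covariance(model, x_prev, Pa_sqrt):
    """Return the control forecast and the columns b_i = M(x + p_i) - M(x)."""
    control = model(x_prev)  # one control (deterministic) forecast
    columns = [model(x_prev + p) - control for p in Pa_sqrt.T]  # S ensemble forecasts
    return control, np.column_stack(columns)

# Hypothetical toy model (an assumption for illustration): shift plus weak nonlinearity.
def toy_model(x):
    return np.roll(x, 1) + 0.1 * x**2

rng = np.random.default_rng(0)
N, S = 5, 3                                    # state dimension, ensemble size
x_prev = rng.standard_normal(N)                # analysis from the previous cycle
Pa_sqrt = 1e-3 * rng.standard_normal((N, S))   # columns p_i of the analysis square root

x_b, Pf_sqrt = sqrt_forecast_covariance(toy_model, x_prev, Pa_sqrt)
print(Pf_sqrt.shape)
```

For small perturbations the columns returned here stay close to the linearized model applied to each analysis perturbation, which is the approximation implied by combining (2) and (3).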

It is important to note that the availability of an ensemble square root analysis error covariance $\mathbf{P}_a^{1/2}$, provided by the data assimilation algorithm, is critical for proper coupling between analysis and forecast. In addition to data assimilation cycles, the columns of $\mathbf{P}_a^{1/2}$ could be used as initial perturbations for ensemble forecasting, in agreement with (4).

b. Analysis step

In the MLEF method, the analysis solution is obtained as a maximum likelihood estimate, that is, a model state that maximizes the posterior probability distribution. With the Gaussian PDF assumption implied in the definition of the cost function, the maximum likelihood problem is redefined as the minimization of an arbitrary nonlinear cost function of the form (e.g., Lorenc 1986)
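$$J(\mathbf{x}) = \frac{1}{2}(\mathbf{x} - \mathbf{x}_b)^{\mathrm{T}}\,\mathbf{P}_f^{-1}\,(\mathbf{x} - \mathbf{x}_b) + \frac{1}{2}[\mathbf{y} - H(\mathbf{x})]^{\mathrm{T}}\,\mathbf{R}^{-1}\,[\mathbf{y} - H(\mathbf{x})], \tag{5}$$
where $\mathbf{x}$ is the model state vector, $\mathbf{x}_b$ denotes the prior (background) state, and $\mathbf{y}$ is the measurement vector. The background state $\mathbf{x}_b$ is an estimate of the most likely dynamical state; thus it is a deterministic forecast from the previous assimilation cycle. The nonlinear observation operator $H$ represents a mapping from model space to observation space, and $\mathbf{R}$ is the observation error covariance matrix.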

Note that the error covariance matrix $\mathbf{P}_f$ is defined in the ensemble subspace [e.g., (4)], and thus it has a much smaller rank than the true forecast error covariance. Therefore, the cost function definition (5) only has a form similar to that of the three-dimensional variational cost function (e.g., Parrish and Derber 1992); it is, however, defined in the ensemble subspace only. In strict terms, the invertibility of $\mathbf{P}_f$ in (5) is preserved only in the range of $\mathbf{P}_f$, implying that the cost function (5) is effectively defined in the range of $\mathbf{P}_f$ as well. The same reasoning and definitions are implicit in other ensemble data assimilation methods, with the exception of hybrid methods (e.g., Hamill and Snyder 2000).

Hessian preconditioning is introduced by a change of variable
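$$\mathbf{x} - \mathbf{x}_b = \mathbf{P}_f^{1/2}\,(\mathbf{I} + \mathbf{C})^{-\mathrm{T}/2}\,\boldsymbol{\zeta}, \tag{6}$$
where the vector $\boldsymbol{\zeta}$ is the control variable defined in ensemble subspace, and
$$\mathbf{C} = \mathbf{P}_f^{\mathrm{T}/2}\,\mathbf{H}^{\mathrm{T}}\,\mathbf{R}^{-1}\,\mathbf{H}\,\mathbf{P}_f^{1/2} = (\mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f^{1/2})^{\mathrm{T}}\,(\mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f^{1/2}). \tag{7}$$
The notation $\mathbf{P}_f^{\mathrm{T}/2} = (\mathbf{P}_f^{1/2})^{\mathrm{T}}$ is used in the above formula. A closer inspection reveals that the change of variable (6) is a perfect preconditioner in quadratic minimization problems (Axelsson 1984), that is, assuming linear observation operators. This means that, with the change of variable (6) and linear observation operators, the solution is obtained in a single step of minimization iteration. The matrix defined in (6) is the square root of an inverse Hessian of (5). The matrix $\mathbf{C}$ is commonly neglected in Hessian preconditioning in variational problems, because of high dimensionality and associated computational burden.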
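
A short derivation may help to see why the change of variable (6) leaves an identity Hessian. It is a sketch under explicit assumptions: the observation operator is linear (matrix $\mathbf{H}$), $\mathbf{P}_f^{1/2}$ has full column rank, and the increment is restricted to the range of $\mathbf{P}_f^{1/2}$. The symbols $\mathbf{w}$, $\mathbf{d}$, and $\mathbf{Z}$ are auxiliary names introduced here for illustration only.

```latex
\begin{aligned}
\mathbf{x} - \mathbf{x}_b &= \mathbf{P}_f^{1/2}\mathbf{w}, \qquad
\mathbf{d} = \mathbf{y} - \mathbf{H}\mathbf{x}_b, \qquad
\mathbf{Z} = \mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f^{1/2}, \qquad
\mathbf{C} = \mathbf{Z}^{\mathrm{T}}\mathbf{Z}, \\
J(\mathbf{w}) &= \tfrac{1}{2}\,\mathbf{w}^{\mathrm{T}}\mathbf{w}
  + \tfrac{1}{2}\,(\mathbf{R}^{-1/2}\mathbf{d} - \mathbf{Z}\mathbf{w})^{\mathrm{T}}
    (\mathbf{R}^{-1/2}\mathbf{d} - \mathbf{Z}\mathbf{w}), \\
\nabla^2_{\mathbf{w}} J &= \mathbf{I} + \mathbf{Z}^{\mathrm{T}}\mathbf{Z} = \mathbf{I} + \mathbf{C}, \\
\mathbf{w} = (\mathbf{I} + \mathbf{C})^{-\mathrm{T}/2}\boldsymbol{\zeta}
\;&\Longrightarrow\;
\nabla^2_{\boldsymbol{\zeta}} J
  = (\mathbf{I} + \mathbf{C})^{-1/2}\,(\mathbf{I} + \mathbf{C})\,(\mathbf{I} + \mathbf{C})^{-\mathrm{T}/2}
  = \mathbf{I}.
\end{aligned}
```

With an identity Hessian in $\boldsymbol{\zeta}$, a single gradient step of unit length reaches the minimum of the quadratic problem, which is the sense in which the preconditioner is perfect.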
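The practical problem is now to define the matrices appearing in (6). The square root forecast error covariance is calculated from previous ensemble forecasts [e.g., (4)]. The calculation of the matrix $(\mathbf{I} + \mathbf{C})^{-\mathrm{T}/2}$, however, requires some attention. Since the columns of the square root forecast error covariance are known, the $i$th column of the matrix appearing in (7) is
$$\mathbf{z}_i = (\mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f^{1/2})_i = \mathbf{R}^{-1/2}\mathbf{H}\mathbf{b}_i \approx \mathbf{R}^{-1/2}[H(\mathbf{x} + \mathbf{b}_i) - H(\mathbf{x})]. \tag{8}$$
Note that each of the column vectors $\mathbf{z}_i$ has the dimension of observation space. The matrix $\mathbf{C}$ can be then defined as
$$\mathbf{C} = \begin{pmatrix} \mathbf{z}_1^{\mathrm{T}}\mathbf{z}_1 & \mathbf{z}_1^{\mathrm{T}}\mathbf{z}_2 & \cdots & \mathbf{z}_1^{\mathrm{T}}\mathbf{z}_S \\ \mathbf{z}_2^{\mathrm{T}}\mathbf{z}_1 & \mathbf{z}_2^{\mathrm{T}}\mathbf{z}_2 & \cdots & \mathbf{z}_2^{\mathrm{T}}\mathbf{z}_S \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{z}_S^{\mathrm{T}}\mathbf{z}_1 & \mathbf{z}_S^{\mathrm{T}}\mathbf{z}_2 & \cdots & \mathbf{z}_S^{\mathrm{T}}\mathbf{z}_S \end{pmatrix}. \tag{9}$$
The matrix $\mathbf{C}$ is an $S \times S$ symmetric matrix; thus it has small dimensions defined by the number of ensembles. To calculate efficiently the inversion and the square root involved in $(\mathbf{I} + \mathbf{C})^{-\mathrm{T}/2}$, an eigenvalue decomposition (EVD) of the matrix $\mathbf{C}$ may be used. One obtains $\mathbf{C} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{\mathrm{T}}$, where $\mathbf{V}$ denotes the eigenvector matrix, and $\boldsymbol{\Lambda}$ is the eigenvalue matrix. Then
$$(\mathbf{I} + \mathbf{C})^{-\mathrm{T}/2} = \mathbf{V}\,(\mathbf{I} + \boldsymbol{\Lambda})^{-1/2}\,\mathbf{V}^{\mathrm{T}}. \tag{10}$$
Note that the definition of matrix $\mathbf{C}$ and subsequent EVD are equivalent to the matrix transform introduced in the ensemble transform Kalman filter (ETKF; Bishop et al. 2001). The change of variable (6) can be now easily accomplished. The use of ensembles is consistently introduced by (4) and (8).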
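
The steps from (8) to (10) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `hessian_preconditioner`, the quadratic observation operator `quadratic_obs`, and the numerical values are assumptions chosen for the example.

```python
import numpy as np

def hessian_preconditioner(obs_op, x, Pf_sqrt, R_inv_sqrt):
    """Build Z (columns z_i), C = Z^T Z, and G = (I + C)^{-T/2} via an EVD of C."""
    hx = obs_op(x)
    # z_i ~ R^{-1/2} [H(x + b_i) - H(x)]: a difference of two nonlinear operators
    Z = np.column_stack([R_inv_sqrt @ (obs_op(x + b) - hx) for b in Pf_sqrt.T])
    C = Z.T @ Z                      # S x S symmetric matrix
    lam, V = np.linalg.eigh(C)       # C = V diag(lam) V^T
    G = V @ np.diag(1.0 / np.sqrt(1.0 + lam)) @ V.T
    return Z, C, G

# Hypothetical quadratic observation operator acting on every other grid point.
def quadratic_obs(state):
    return state[::2] ** 2

rng = np.random.default_rng(1)
N, S = 6, 3
x = rng.standard_normal(N)
Pf_sqrt = 0.1 * rng.standard_normal((N, S))
R_inv_sqrt = np.eye(3) / 0.05        # assumed uncorrelated errors, std 0.05

Z, C, G = hessian_preconditioner(quadratic_obs, x, Pf_sqrt, R_inv_sqrt)
print(C.shape, G.shape)
```

Because only an $S \times S$ symmetric matrix is decomposed, the cost of this step scales with the ensemble size rather than with the state dimension.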
After successfully accomplishing the Hessian preconditioning, the next step in an iterative minimization is to calculate the gradient in ensemble-spanned subspace. One can first redefine the cost function (5) using the change of variable (6) and then calculate the first derivative
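$$\nabla_{\zeta}J = (\mathbf{I} + \mathbf{C})^{-1}\boldsymbol{\zeta} - (\mathbf{I} + \mathbf{C})^{-1/2}(\mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f^{1/2})^T\mathbf{R}^{-1/2}[\mathbf{y} - H(\mathbf{x}_b + \mathbf{P}_f^{1/2}(\mathbf{I} + \mathbf{C})^{-T/2}\boldsymbol{\zeta})] \tag{11}$$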
Note that the use of an adjoint (i.e., a transpose) in (11) is avoided by employing (8) in the calculation of the matrix $\mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f^{1/2}$.
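A minimal sketch of this adjoint-free gradient follows. It assumes the gradient has the form $(\mathbf{I} + \mathbf{C})^{-1}\boldsymbol{\zeta} - (\mathbf{I} + \mathbf{C})^{-1/2}\mathbf{Z}^T\mathbf{R}^{-1/2}[\mathbf{y} - H(\mathbf{x})]$, where the columns of 𝗭 are the nonlinear differences of (8) and 𝗥 is diagonal with a single standard deviation. All function names, sizes, and the toy data are hypothetical.

```python
import numpy as np

def obs_perturbations(x, Pf_sqrt, H, r_inv_sqrt):
    """Columns z_i = R^{-1/2} [H(x + p_i) - H(x)] for each column p_i of Pf_sqrt."""
    Hx = H(x)
    return np.column_stack([r_inv_sqrt * (H(x + p) - Hx) for p in Pf_sqrt.T])

def inv_sqrt(C):
    """(I + C)^{-1/2} through an eigenvalue decomposition of the symmetric matrix C."""
    lam, V = np.linalg.eigh(C)
    return (V / np.sqrt(1.0 + np.clip(lam, 0.0, None))) @ V.T

def mlef_gradient(zeta, xb, Pf_sqrt, T, y, H, r_inv_sqrt):
    """Gradient with respect to zeta; T = (I + C)^{-1/2} is held fixed."""
    x = xb + Pf_sqrt @ (T @ zeta)                  # change of variable
    Z = obs_perturbations(x, Pf_sqrt, H, r_inv_sqrt)
    d = r_inv_sqrt * (y - H(x))                    # normalized innovation
    return T @ (T @ zeta) - T @ (Z.T @ d)          # no transpose of H is ever formed

# Hypothetical toy problem: 6 state variables, 3 ensemble members.
rng = np.random.default_rng(1)
xb = rng.uniform(0.5, 1.5, 6)
Pf_sqrt = 0.1 * rng.standard_normal((6, 3))
quadratic = lambda u: u**2
r_inv_sqrt = 1.0 / 0.05
y = quadratic(xb) + 0.05 * rng.standard_normal(6)
Z0 = obs_perturbations(xb, Pf_sqrt, quadratic, r_inv_sqrt)
T = inv_sqrt(Z0.T @ Z0)
g0 = mlef_gradient(np.zeros(3), xb, Pf_sqrt, T, y, quadratic, r_inv_sqrt)
```

Only forward evaluations of the observation operator appear; the transpose acts on the small matrix of perturbation columns, not on the operator itself.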

As shown in appendix A, within a linear operator framework, the first minimization iteration calculated using the preconditioned steepest descent is equivalent to the ensemble-based reduced-rank Kalman filter (Verlaan and Heemink 2001; Heemink et al. 2001), or to the Monte Carlo–based ensemble Kalman filter (Evensen 1994). Although different in detail, the computational effort involved in calculation with ensemble-based Kalman filters is comparable to the calculation of ensemble-based Hessian preconditioning and the gradient present in the MLEF algorithm. In both the ensemble Kalman filters and the MLEF, the computational cost of the analysis step is dominated by a matrix inversion computation [e.g., (A5)–(A7), appendix A].
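The linear equivalence can be checked numerically. The sketch below is an illustration under stated assumptions rather than the derivation of appendix A: it uses a random linear observation operator `Hmat`, a scalar observation error standard deviation `r`, and arbitrary small dimensions, none of which come from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_obs, S = 8, 5, 3                        # hypothetical sizes
xb = rng.standard_normal(n)
Pf_sqrt = rng.standard_normal((n, S))        # columns of the square root forecast covariance
Hmat = rng.standard_normal((n_obs, n))       # linear observation operator
r = 0.5                                      # observation error standard deviation
y = rng.standard_normal(n_obs)

Z = (Hmat @ Pf_sqrt) / r                     # R^{-1/2} H P_f^{1/2}
C = Z.T @ Z
lam, V = np.linalg.eigh(C)
T = (V / np.sqrt(1.0 + lam)) @ V.T           # (I + C)^{-1/2}

# One preconditioned steepest-descent step from zeta = 0 with step length 1.
d = (y - Hmat @ xb) / r
grad0 = -T @ (Z.T @ d)
zeta1 = -grad0
x_mlef = xb + Pf_sqrt @ (T @ zeta1)

# Reduced-rank Kalman filter analysis with P_f = P_f^{1/2} P_f^{T/2}.
Pf = Pf_sqrt @ Pf_sqrt.T
K = Pf @ Hmat.T @ np.linalg.inv(Hmat @ Pf @ Hmat.T + r**2 * np.eye(n_obs))
x_kf = xb + K @ (y - Hmat @ xb)
print(np.max(np.abs(x_mlef - x_kf)))         # difference at round-off level
```

The MLEF route inverts only an S × S matrix, while the Kalman gain written this way inverts a matrix of observation-space dimension; the two analyses agree to round-off.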

In calculating the analysis error covariance, the MLEF employs a strategy somewhat different from other ensemble data assimilation methods. The MLEF calculates the analysis error covariance as the inverse Hessian matrix at the minimum (e.g., Fisher and Courtier 1995), generally available as a by-product of minimization. For a quasi-Newton minimization algorithm, one could use the inverse Hessian update produced by the minimization algorithm (e.g., Nocedal 1980). In applications with the conjugate-gradient algorithm (e.g., Gill et al. 1981; Luenberger 1984), used here, one would update the matrix 𝗖 using the solution at the minimum (i.e., the optimized analysis $\mathbf{x}_{\mathrm{opt}}$), and then calculate $\mathbf{P}_a^{1/2}$ according to
$$\mathbf{P}_a^{1/2} = \mathbf{P}_f^{1/2}[\mathbf{I} + \mathbf{C}(\mathbf{x}_{\mathrm{opt}})]^{-T/2} \tag{12}$$
The expression (12) has the same form as the analysis error covariance used in the ETKF (Bishop et al. 2001). An important difference, however, exists in applications with nonlinear observation operators. Because in the MLEF the inverse Hessian is calculated at the minimum, the Taylor expansion of a nonlinear Hessian operator is well approximated by its first-order (i.e., linear) term. This implies that the equivalence between the inverse Hessian and the analysis error covariance, which strictly holds only in a linear framework, is preserved for arbitrary nonlinear operators. For a linear observation operator, the analysis error covariance estimates from both algorithms would be the same. The columns of the matrix $\mathbf{P}_a^{1/2}$ are then used as initial perturbations for the next assimilation cycle, according to (3) and (4), and the cycling of analysis and forecast continues.
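For a linear observation operator, the square root update can be compared directly with the Kalman filter covariance update. The following sketch assumes the update has the form of the forecast square root multiplied by the inverse square root of 𝗜 + 𝗖; the operator `Hmat`, the scalar `r`, and the sizes are hypothetical. In the nonlinear case the same code would apply with 𝗖 recomputed around the optimal analysis.

```python
import numpy as np

rng = np.random.default_rng(6)
n, n_obs, S = 8, 5, 3                        # hypothetical sizes
Pf_sqrt = rng.standard_normal((n, S))
Hmat = rng.standard_normal((n_obs, n))       # linear observation operator for the check
r = 0.5                                      # observation error standard deviation

Z = (Hmat @ Pf_sqrt) / r                     # R^{-1/2} H P_f^{1/2}
lam, V = np.linalg.eigh(Z.T @ Z)             # EVD of C
Pa_sqrt = Pf_sqrt @ ((V / np.sqrt(1.0 + lam)) @ V.T)   # P_f^{1/2} (I + C)^{-T/2}

# Kalman filter covariance update built from the same reduced-rank P_f.
Pf = Pf_sqrt @ Pf_sqrt.T
K = Pf @ Hmat.T @ np.linalg.inv(Hmat @ Pf @ Hmat.T + r**2 * np.eye(n_obs))
Pa_kalman = Pf - K @ Hmat @ Pf
print(np.max(np.abs(Pa_sqrt @ Pa_sqrt.T - Pa_kalman)))  # round-off level
```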

c. The MLEF and related data assimilation methodologies

The MLEF method encompasses a few important existing methodologies and algorithms.

1) Variational data assimilation

The minimization of the cost function, used to derive the maximum likelihood estimate in MLEF, is inherently related to variational data assimilation algorithms. The difference is that, in the MLEF formulation, the minimization is performed in an ensemble-spanned subspace, while in the variational method the full model space is used. The issue of the number of degrees of freedom is problem dependent and will require consideration in future realistic applications. At present, one should note that there are ways to introduce complementary degrees of freedom and obtain a unique mathematical solution (e.g., Hamill and Snyder 2000). Also, there is a practical possibility to increase the degrees of freedom by introducing more ensemble members. All of these options require careful examination in problem-oriented applications.

2) Iterated Kalman filter

Another methodology related to the MLEF is the iterated Kalman filter (IKF; Jazwinski 1970; Cohn 1997), developed to solve the nonlinear problem iteratively. Bell and Cathey (1993) demonstrated that the IKF is a Gauss–Newton method. Like the MLEF, the IKF calculates the mode (i.e., it is a maximum likelihood approach) under an underlying Gaussian assumption. An obvious difference is that the MLEF is defined within an ensemble framework. The practical advantage of an iterative methodology, such as the IKF or MLEF, is fundamentally tied to the choice of minimization method. An integral part of the MLEF is the use of an unconstrained minimization algorithm, in the form of the nonlinear conjugate-gradient and the limited-memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) quasi-Newton methods (e.g., Gill et al. 1981; Luenberger 1984; Nocedal 1980). The unconstrained minimization approach allows a very efficient iterative solution of problems with significant nonlinearities and large residuals (e.g., Gill et al. 1981).

3) Ensemble transform Kalman filter

The matrix transform and eigenvalue decomposition used for the Hessian preconditioning in MLEF [(6)–(10)] is equivalent to the matrix transform introduced in the ETKF algorithm (Bishop et al. 2001). This approach allows an efficient reduction of the dimensions of a matrix to be inverted. Therefore, the MLEF algorithm can be viewed as a maximum likelihood approach to the ETKF (C. Bishop 2003, personal communication).

The idea behind the MLEF is to retain only the components and concepts deemed advantageous from other algorithms, while weak components are changed or improved. For example, the cost function minimization, used in variational methods and IKF, is characterized as beneficial: minimization allows the equivalence between the inverse Hessian and analysis error covariance to be valid even for arbitrary nonlinear observation operators. Modeling of forecast error covariance, Hessian preconditioning, and adjoint model development are all considered weak points of variational methods and are improved or avoided using an ensemble framework. Hessian preconditioning introduced in the ETKF is considered advantageous as well. The ensemble framework makes the probabilistic forecasting and data assimilation with realistic prediction models and observations feasible, which is not possible with IKF.

The end products of the MLEF algorithm are 1) deterministic analysis, corresponding to the model state that maximizes the posterior probability distribution, and 2) (square root) analysis error covariance, corresponding to an estimate of analysis uncertainty.

3. Algorithmic details

The MLEF algorithm is designed to exploit the data assimilation infrastructure in existing algorithms. For example, the innovation vectors (e.g., observation-minus-forecast residuals) are calculated as in existing data assimilation algorithms, and the minimization currently used in variational data assimilation can be used in MLEF. To optimize the MLEF performance in realistic applications, the multiple processor capability of parallel computing is made an important component of the algorithm.

As implied in the previous section, the underlying principle in the MLEF development was to improve the computational stability of the algorithm by using only square root matrices. There are five algorithmic steps in the MLEF.

a. Step 1: Ensemble forecasting from previous to new analysis cycle

A square root forecast error covariance is computed first. Normally, the initial ensemble perturbations are the columns of a square root analysis error covariance, available from a previous analysis cycle. At the very start of data assimilation, however, there is no previous analysis error covariance, and one needs to provide some initial ensemble perturbations to be used in (4). Among many feasible options, the following strategy is adopted in the MLEF: define random perturbations to the initial conditions some time into the past, say one to two assimilation cycles earlier, in order to form a set of perturbed initial conditions. Then use this set to initiate ensemble forecasting. The nonlinear ensemble forecast perturbations are computed as the difference between the ensemble forecasts and the control (i.e., unperturbed) forecast, valid at the time of the first data assimilation cycle. According to (4), these perturbations are then used as columns of a square root forecast error covariance, required for data assimilation.

Note that this step, common to all ensemble data assimilation algorithms, may contribute significantly to the computational cost of ensemble data assimilation in high-dimensional applications. It allows an efficient use of parallel computing, however, and thus the actual cost can be significantly reduced in practice.
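As a rough sketch of this step, the columns of the square root forecast error covariance can be formed as differences between perturbed and control forecasts. The function name and the toy forecast model below are hypothetical stand-ins, chosen only to make the example runnable.

```python
import numpy as np

def forecast_sqrt_covariance(model, x_analysis, Pa_sqrt):
    """Return the control forecast and the columns b_i = M(x + p_i) - M(x)."""
    x_forecast = model(x_analysis)                       # control (unperturbed) forecast
    # Each member is an independent model run, so this loop parallelizes trivially.
    columns = [model(x_analysis + p) - x_forecast for p in Pa_sqrt.T]
    return x_forecast, np.column_stack(columns)

# Hypothetical stand-in for the forecast model: a weakly nonlinear map.
toy_model = lambda x: 0.9 * x + 0.1 * np.sin(x)

rng = np.random.default_rng(4)
x_a = rng.standard_normal(6)
Pa_sqrt = 0.05 * rng.standard_normal((6, 3))             # 3 initial perturbations
x_f, Pf_sqrt = forecast_sqrt_covariance(toy_model, x_a, Pa_sqrt)
print(Pf_sqrt.shape)                                     # (state dimension, ensemble size)
```

For a linear model the columns reduce to the model matrix applied to each initial perturbation, which is a convenient sanity check.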

b. Step 2: Forward ensemble run to observation location—Innovation vector calculation

Once the ensemble forecasting step is completed, producing square root forecast error covariance columns, the analysis step begins. An essential component of this step is the calculation of innovation vectors, that is, the observation-minus-first-guess differences for each ensemble member. In practice, the vectors $\mathbf{z}_i$ [(8)] are computed as nonlinear ensemble perturbations of innovation vectors
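$$\mathbf{z}_i = \mathbf{R}^{-1/2}[H(\mathbf{x} + \mathbf{b}_i) - H(\mathbf{x})] = \mathbf{R}^{-1/2}[H(\mathbf{x} + \mathbf{b}_i) - \mathbf{y}] - \mathbf{R}^{-1/2}[H(\mathbf{x}) - \mathbf{y}] \tag{13}$$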
where the vectors $\mathbf{b}_i$ are obtained from previously completed ensemble forecasts [(4)]. This means that each ensemble forecast is interpolated to the observation locations, using the same observation operator available in an existing variational data assimilation algorithm. The calculation of innovation vector perturbations is done without communication between processors; thus it scales efficiently on parallel computers.

c. Step 3: Hessian preconditioning and 𝗖-matrix calculation

This step is performed only in the first minimization iteration. The matrix 𝗖 is computed from ensemble perturbations around the initial forecast guess and is used for Hessian preconditioning. The innovation vector perturbations calculated in step 2 are used to calculate the elements of the matrix 𝗖 [(9)]. The elements of 𝗖 are computed through inner products, and this represents the second most dominant computational effort in the MLEF (the most dominant being the ensemble forecasting). Note that an equivalent computational effort is involved in the ETKF algorithm. Although 𝗖 is an S × S symmetric matrix (S being the ensemble size), there are still S(S + 1)/2 inner products to be calculated. If parallel computing is available, each of the inner products can be calculated on a separate processor, essentially with no communication between the processors, thus significantly reducing the computational cost. The EVD of 𝗖 is of negligible cost, 𝗖 being a small-dimensional matrix. Standard EVD subroutines for dense matrices, commonly available in general mathematical libraries such as the Linear Algebra Package (LAPACK; Anderson et al. 1992), may be used. As shown by (10), the matrix inversion involved in the change of variable (6) is then easily accomplished.
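The inner-product count can be made concrete with a short sketch that fills only the upper triangle of the symmetric matrix and mirrors it. The function name and the random input are illustrative assumptions; in a parallel setting each inner product in the loop could be assigned to a separate processor.

```python
import numpy as np

def c_matrix_upper(Z):
    """Build the symmetric S x S matrix of inner products z_i^T z_j.

    Only the upper triangle is computed; returns the matrix and the
    number of inner products actually evaluated.
    """
    S = Z.shape[1]
    C = np.empty((S, S))
    count = 0
    for i in range(S):
        for j in range(i, S):
            C[i, j] = C[j, i] = Z[:, i] @ Z[:, j]   # one independent inner product
            count += 1
    return C, count

rng = np.random.default_rng(7)
Z = rng.standard_normal((10, 10))    # hypothetical: 10 observations, S = 10 members
C, count = c_matrix_upper(Z)
print(count)                         # S(S + 1)/2 inner products
```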

d. Step 4: Gradient calculation

The gradient calculation requires the innovation vector perturbations $\mathbf{z}_i$ to be recalculated in each minimization iteration, but without the need to update the matrix 𝗖. The components of the gradient vector in ensemble space [(11)] are essentially the control forecast innovation vector components projected on each ensemble perturbation. Given the good parallel scalability of the innovation vector calculation noted above, the cost of the gradient calculation is relatively small.

e. Step 5: Analysis error covariance

As stated earlier, the required square root of the analysis error covariance is obtained as a by-product of the minimization algorithm. The actual computation method depends on the minimization algorithm employed. For example, if a quasi-Newton algorithm is used, one could use the inverse Hessian update formula (e.g., Nocedal 1980) to update the analysis error covariance. In this work, however, we employed a nonlinear conjugate-gradient algorithm (e.g., Luenberger 1984), with the line-search algorithm as defined in Navon et al. (1992). To obtain a satisfactory square root analysis error covariance, the relation (12) is used, with 𝗖 computed around the optimal analysis. Otherwise, the calculation is identical to step 3. Because 𝗖 is computed close to the true minimum, the nonlinear part of the Hessian is negligible, and a good estimate of the analysis error covariance can be obtained. The columns of the square root analysis error covariance are then used as perturbations for ensemble forecasting in step 1, and the new analysis cycle begins.

Note that error covariance localization, not employed in the current MLEF algorithm, is an important component of most ensemble-based data assimilation algorithms (e.g., Houtekamer and Mitchell 1998; Hamill et al. 2001; Whitaker and Hamill 2002). The idea is that, if the forecast error covariance is noisy and has unrealistic distant correlations, these correlations should be removed. A noisy error covariance is anticipated if the number of ensemble members is very small. In the MLEF applications presented here, however, the initially noisy error covariances became localized after only a few analysis cycles, without any need for an additional localization procedure. For that reason, the issue of error covariance localization is left for future work.

4. Experimental design

The MLEF method will be used in a simple one-dimensional example, in order to illustrate the anticipated impact in realistic applications.

a. Model

The forecast model used in this paper is a one-dimensional Korteweg–de Vries–Burgers (KdVB) model
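$$\frac{\partial u}{\partial t} + 6u\frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = \nu\frac{\partial^2 u}{\partial x^2} \tag{14}$$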
where u is a nondimensional model state vector and ν is a diffusion coefficient. The numerical solution is obtained using centered finite differences in space and the fourth-order Runge–Kutta scheme for time integration (Marchant and Smyth 2002). The model domain has dimension N = 101, with grid spacing Δx = 0.5 nondimensional units, and the time step is Δt = 0.01 nondimensional units. Periodic boundary conditions are used. In the control experiment the diffusion coefficient is ν = 0.07.
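A minimal sketch of such an integration is given below. It assumes the common form of the KdVB equation, u_t + 6 u u_x + u_xxx = ν u_xx, discretized in advective form with standard centered stencils; the stencils, the function names, and the single-soliton initial condition are assumptions made for illustration and are not taken from the paper, which uses a two-soliton initial state.

```python
import numpy as np

N, DX, DT, NU = 101, 0.5, 0.01, 0.07      # values quoted in the text

def kdvb_rhs(u, dx=DX, nu=NU):
    """Right-hand side -6 u u_x - u_xxx + nu u_xx with periodic centered differences."""
    up1, um1 = np.roll(u, -1), np.roll(u, 1)     # u[i+1], u[i-1]
    up2, um2 = np.roll(u, -2), np.roll(u, 2)     # u[i+2], u[i-2]
    ux = (up1 - um1) / (2 * dx)
    uxx = (up1 - 2 * u + um1) / dx**2
    uxxx = (up2 - 2 * up1 + 2 * um1 - um2) / (2 * dx**3)
    return -6.0 * u * ux - uxxx + nu * uxx

def rk4_step(u, dt=DT):
    """One classical fourth-order Runge-Kutta step."""
    k1 = kdvb_rhs(u)
    k2 = kdvb_rhs(u + 0.5 * dt * k1)
    k3 = kdvb_rhs(u + 0.5 * dt * k2)
    k4 = kdvb_rhs(u + dt * k3)
    return u + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

x = DX * np.arange(N)
beta = 0.5
u0 = 2 * beta**2 / np.cosh(beta * (x - 25.0))**2   # single soliton, illustrative only
u = u0.copy()
for _ in range(200):                               # 200 steps = 2 nondimensional time units
    u = rk4_step(u)
```

With these stencils and periodic boundaries, the discrete sum of u is conserved up to round-off, which gives a quick check that the right-hand side is coded consistently.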

The KdVB model includes a few desirable characteristics, such as nonlinear advection, dispersion, and diffusion. It also admits solitary waves (i.e., solitons), including a nonlinear superposition of several waves, with damping due to diffusion. Various forms of this model are used in hydrodynamics, nonlinear optics, plasma physics, and elementary particle physics. An interesting weather-related application of a coupled KdV-based system of equations can be found in Gottwald and Grimshaw (1999a, b). As also implied by Mitsudera (1994) in applications to cyclogenesis, the KdV-based system supports baroclinic instability, and it realistically models a nonlinear interaction between the flow and topography.

In the experiments presented here, a two-soliton analytic solution of the Korteweg–de Vries equation (Vvedensky 1993) is chosen for the initial conditions:
i1520-0493-133-6-1710-e15
where x refers to distance and t to time. The parameters β1 and β2 reflect the amplitude of the two solitons and are chosen to be β1 = 0.5 and β2 = 1.0. The solitons progress with the speed proportional to their amplitude, and the specific choice of the parameters assures that the solitons will often interact during the time integration of the model.

Note that the model run defined as truth is using β1 = 0.5 and β2 = 1.0, and the initial conditions used in assimilation experiments are defined using β1 = 0.4 and β2 = 0.9, with the time parameter t lagging behind the truth by one time unit (e.g., 100 model time steps). The initial forecast error covariance is defined using ensemble forecasts [e.g., (4)], initiated from a set of random perturbations two cycles prior to the first observation time. The initial perturbations are formed by randomly perturbing parameters of the solution (15), such as the time and the β1 and β2 parameters, around the values used in assimilation run, that is, using β1 = 0.4 and β2 = 0.9.

b. Observations

The observations are chosen as random perturbations to the truth [i.e., the forecast run with initial conditions using β1 = 0.5 and β2 = 1.0 in (15)], with the error $\varepsilon_{\mathrm{obs}}$ = 0.05 nondimensional units. Note that such a choice implies a perfect model assumption. The observation error covariance 𝗥 is chosen to be diagonal (i.e., it contains only variances), with elements $\varepsilon_{\mathrm{obs}}^2$. There are approximately 10 irregularly spaced observations available at each analysis time. Two types of experiments are performed: (i) in situ observations, fixed at one location at all times, and (ii) targeted observations, with observations following the solitons’ peaks throughout the integration. Initially, however, both the in situ and targeted observations are chosen to be identical.

The observation operator is a quadratic transformation operator, defined as H(u) = u². The choice of a quadratic observation operator is influenced by a desire to test the algorithm with a relatively weakly nonlinear observation operator, not necessarily related to any meteorological observations. In practice, the observation operators of interest would include highly nonlinear observation operators, such as the radiative transfer model for a cloudy atmosphere (e.g., Greenwald et al. 1999), with extensive use of exponential functions. Also, radar reflectivity measurements of rain, snow, and hail are related to model-produced specific humidity and density through logarithmic and other nonlinear functions (M. Xue 2004, personal communication). The observations are taken at grid points to avoid the additional impact of interpolation. The case of the linear observation operator is less interesting, since then the MLEF solution is identical to the reduced-rank ensemble Kalman filter solution. The use of the linear observation operator, however, is important for algorithm development and initial testing. In that case, the MLEF solution is obtained in a single minimization step because of the implied perfect Hessian preconditioning.
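The observation setup can be sketched as follows. Several choices here are assumptions rather than details from the text: the noise is added in observation space (to the squared truth), the targeting rule simply picks the grid points with the largest amplitude, and the truth profile is an illustrative single soliton.

```python
import numpy as np

EPS_OBS = 0.05                       # observation error standard deviation from the text

def H(u):
    """Quadratic observation operator H(u) = u^2, applied pointwise at grid points."""
    return u**2

def targeted_indices(u, n_obs=10):
    """Hypothetical targeting rule: the n_obs grid points with the largest |u|."""
    return np.sort(np.argsort(np.abs(u))[-n_obs:])

def make_observations(u_truth, idx, rng):
    """Add N(0, EPS_OBS^2) noise to H(truth) at the observed grid points."""
    return H(u_truth[idx]) + EPS_OBS * rng.standard_normal(len(idx))

rng = np.random.default_rng(5)
x = 0.5 * np.arange(101)
u_truth = 2.0 / np.cosh(x - 25.0)**2           # illustrative single-soliton profile
idx = targeted_indices(u_truth)
y = make_observations(u_truth, idx, rng)
R = EPS_OBS**2 * np.eye(len(idx))              # diagonal observation error covariance
```

Fixed in situ observations would correspond to holding `idx` constant over all cycles, whereas the targeted case would recompute it from the truth at each analysis time.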

The observations are made available every two nondimensional time units. Given the model time step of 0.01 units, each analysis cycle implies 200 model time steps. The time integration of the control forecast and the observations are shown in Fig. 1. Note that no data assimilation is involved in creating these plots. The shown time evolution corresponds to the first 10 analysis cycles and illustrates the two-soliton character of the solution. The shown cycles correspond to the cycles that are shown in section 5.

c. Experiments

The control experiment includes 10 ensemble members (as compared with 101 total degrees of freedom) with 10 targeted observations, and employs a quadratic observation operator. The iterative minimization employed is the Fletcher–Reeves nonlinear conjugate-gradient algorithm (e.g., Luenberger 1984). In each of the MLEF data assimilation cycles, three minimization iterations are performed to obtain the analysis. In all experiments 100 analysis cycles are performed, by which time the amplitude of the solitary waves in the control forecast is reduced by one order of magnitude because of diffusion. Long assimilation also helps in evaluating the stability of the MLEF algorithm performance.
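A generic Fletcher–Reeves iteration limited to three steps can be sketched as below. This is not the minimization code used in the experiments: the Armijo backtracking is a simple stand-in for the line search of Navon et al. (1992), the steepest-descent restart is an added safeguard, and the Rosenbrock function is a hypothetical test cost unrelated to the MLEF cost function.

```python
import numpy as np

def fletcher_reeves(f, grad, x0, n_iter=3):
    """Nonlinear conjugate gradient with the Fletcher-Reeves coefficient."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    history = [f(x)]
    for _ in range(n_iter):
        if g @ g == 0.0:                  # already at a stationary point
            break
        if g @ d >= 0.0:                  # safeguard: restart along steepest descent
            d = -g
        step, fx, slope = 1.0, f(x), g @ d
        # Armijo backtracking: halve the step until sufficient decrease is obtained.
        while f(x + step * d) > fx + 1e-4 * step * slope and step > 1e-12:
            step *= 0.5
        x = x + step * d
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves coefficient
        d = -g_new + beta * d
        g = g_new
        history.append(f(x))
    return x, history

# Hypothetical test cost: the Rosenbrock function.
f = lambda p: (1 - p[0])**2 + 100 * (p[1] - p[0]**2)**2
grad = lambda p: np.array([-2 * (1 - p[0]) - 400 * p[0] * (p[1] - p[0]**2),
                           200 * (p[1] - p[0]**2)])
x_min, history = fletcher_reeves(f, grad, [-1.2, 1.0], n_iter=3)
print(history)                            # cost after each of the three iterations
```

In the MLEF setting the cost and gradient would be evaluated in the preconditioned ensemble space, where good conditioning is what makes so few iterations sufficient.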

The experiments are designed in such a way as to address two potentially important and challenging problems in realistic atmospheric and oceanic data assimilation: impact of minimization and impact of observation location.

d. Validation

To compare the results of various experiments, four validation methods are employed.

1) Root-mean-square (rms) error

In calculating the rms error, it is assumed that the true analysis solution, denoted $u^{\mathrm{true}}$, is given by the control forecast used to produce the observations. This is not completely true, being dependent on the relative influence of observation and forecast errors, but it is assumed acceptable. With this assumption, the rms error is calculated as
$$\mathrm{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(u_i - u_i^{\mathrm{true}}\right)^2} \tag{16}$$
As before, the index N defines the model state dimension (i.e., the number of grid points).
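As a small illustration, assuming the rms error is the square root of the mean squared difference over the N grid points, the calculation can be written as follows (the function name and sample values are hypothetical):

```python
import numpy as np

def rms_error(u, u_true):
    """Root-mean-square difference between an analysis u and the assumed truth."""
    u = np.asarray(u, dtype=float)
    u_true = np.asarray(u_true, dtype=float)
    return np.sqrt(np.mean((u - u_true)**2))

# Three grid points with one error of size 2: rms = sqrt(4/3).
print(rms_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```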

2) Analysis error covariance estimate

The analysis error covariance is an estimate obtained from an ensemble data assimilation algorithm, and it will be shown in terms of the actual matrix elements. This is the new information produced by ensemble data assimilation, generally not available in variational data assimilation. It requires special attention, since this information is directly transferred to ensemble forecasting and also estimates the uncertainty of the produced analysis.

3) The χ² validation test

The χ² validation diagnostic (e.g., Menard et al. 2000), developed to validate the Kalman filter performance, can also be used in the context of ensemble data assimilation. This diagnostic evaluates the correctness of the innovation (observation minus forecast) covariance matrix that employs a predefined observation error covariance 𝗥 and the MLEF-computed forecast error covariance $\mathbf{P}_f$. We adopt the definition used in Menard et al. (2000): χ² is defined in observation space and normalized by the number of observations, $N_{\mathrm{obs}}$:
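$$\chi^2 = \frac{1}{N_{\mathrm{obs}}}[\mathbf{y} - H(\mathbf{x})]^T(\mathbf{H}\mathbf{P}_f\mathbf{H}^T + \mathbf{R})^{-1}[\mathbf{y} - H(\mathbf{x})] \tag{17}$$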
In the MLEF algorithm, the above formula is rewritten as
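$$\chi^2 = \frac{1}{N_{\mathrm{obs}}}[\mathbf{y} - H(\mathbf{x})]^T\mathbf{R}^{-1/2}\mathbf{G}^{-1}\mathbf{R}^{-1/2}[\mathbf{y} - H(\mathbf{x})] \tag{18}$$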
where the matrix $\mathbf{G}^{-1}$ (as well as its square root) is defined in appendix B, y denotes the observations, and x is the model forecast. Because of the iterative estimation of the optimal analysis in the MLEF, the forecast x denotes the forecast from the last minimization iteration, and the matrix 𝗖 is calculated about the optimal state. For a Gaussian distribution of innovations and a linear observation operator H, the conditional mean of χ² defined by (18) should be equal to 1. As in Menard et al. (2000), the conditional mean is replaced by a time mean. In this paper, a 10-cycle moving average is computed, as well as the instantaneous values of χ², calculated at each assimilation cycle. Because of the use of a nonlinear model in the calculation of $\mathbf{P}_f$, and a statistically small sample (i.e., relatively few observations per cycle), one can expect values of χ² that are only close to 1 and not necessarily equal to 1.
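The behavior of the diagnostic can be illustrated with synthetic innovations drawn from the assumed innovation covariance. Everything in this sketch is hypothetical: the linear operator `Hmat`, the sizes, and in particular the form of 𝗚 (taken here as the identity plus the normalized forecast error covariance in observation space), since appendix B is not reproduced in this section.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_obs, S = 20, 10, 10                     # hypothetical sizes
Pf_sqrt = 0.3 * rng.standard_normal((n, S))
Hmat = rng.standard_normal((n_obs, n))       # linear observation operator
eps_obs = 0.05
R = eps_obs**2 * np.eye(n_obs)
innov_cov = Hmat @ Pf_sqrt @ Pf_sqrt.T @ Hmat.T + R   # H P_f H^T + R

def chi2(d):
    """Normalized chi-square of an innovation d using H P_f H^T + R directly."""
    return d @ np.linalg.solve(innov_cov, d) / n_obs

# Same quantity through G = I + Z Z^T with Z = R^{-1/2} H P_f^{1/2} (assumed form of G).
Z = Hmat @ Pf_sqrt / eps_obs
G = np.eye(n_obs) + Z @ Z.T

def chi2_G(d):
    dn = d / eps_obs                          # R^{-1/2} d
    return dn @ np.linalg.solve(G, dn) / n_obs

# One synthetic innovation per "cycle", consistent with the assumed covariance.
L = np.linalg.cholesky(innov_cov)
values = np.array([chi2(L @ rng.standard_normal(n_obs)) for _ in range(2000)])
moving = np.convolve(values, np.ones(10) / 10, mode="valid")   # 10-cycle moving average
print(values.mean())                          # close to 1 for consistent statistics
```

Individual values scatter noticeably with only 10 observations per cycle, which is why a time mean or moving average is the more informative quantity.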

4) Innovation vector PDF statistics

Another important statistical verification of an ensemble data assimilation algorithm, also related to innovation vectors, is the PDF of innovations (e.g., Reichle et al. 2002a). From (18), the normalized innovations are defined as
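$$\mathrm{Innov} = \frac{1}{\sqrt{N_{\mathrm{obs}}}}\mathbf{G}^{-1/2}\mathbf{R}^{-1/2}[\mathbf{y} - H(\mathbf{x})] \tag{19}$$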
With Gaussian filtering assumptions regarding the measurements and control variables, and for a linear dynamic system and linear observation operators, the resulting innovation PDF should have a standard normal distribution N(0, 1). Note that, if the innovations (19) are random variables with distribution N(0, 1), then (17)–(18) define a χ² distribution with $N_{\mathrm{obs}}$ degrees of freedom.

In our applications, because of the nonlinearity of the forecast model and the observation operator H and the relatively small statistical sample, only an approximate normal distribution should be expected.

5. Results

a. Linear observation operator experiments

When linear observation operators are employed, and a Gaussian error distribution is assumed, in principle there is no difference between the MLEF and any related EnKF algorithm. Formally, a single minimization iteration of the MLEF is needed, with step length equal to 1. These experiments are conducted in order to develop and test the MLEF algorithm, especially the statistics of the produced results, using the diagnostics defined in sections 4d(3) and 4d(4). Note that a perfect statistical fit cannot be expected, since the forecast model is still a nonlinear model, with diffusion, and the posterior statistics are not exactly Gaussian. An obvious consequence of having few observations per cycle is that the innovation statistics may not be representative of the true PDF statistics. Two experiments are performed, one with 10 (targeted) observations per cycle, and the other with all observations (i.e., 101 per cycle).

The χ² test is shown in Fig. 2. Although in both experiments the value of χ² is close to 1, much better agreement is obtained with more observations (Fig. 2b). It also takes more analysis cycles to converge to 1, which may be a sign of an increased difficulty of the KdVB model to fit numerous and noisy observations. Note that with all observations assimilated, there is a greater chance that some observations are negative, which would in turn impact the numerical stability of the model.

The innovation histogram is shown in Fig. 3 and indicates a similar impact of the statistical sample. Deviations from a Gaussian PDF are more pronounced when fewer observations are used (Fig. 3a) than in the case with all observations (Fig. 3b). There is also a notable right shift of the PDF, which could be the impact of diffusion [e.g., Daley and Menard (1993)] or the impact of model nonlinearity.

The results in Figs. 2a and 3a indicate what can be expected from the experiments with a quadratic observation operator and few observations per cycle. On the other hand, in future applications of the MLEF with real models and real observations, one could expect a much better innovation statistical sample, given the enormous number of satellite and radar measurements available today.

b. Quadratic observation operator

1) Control experiment

The rms result of the control MLEF experiment is shown in Fig. 4, together with the rms error from the experiment with no observations. Any acceptable data assimilation experiment should have a smaller rms error than the no-assimilation experiment. During the initial 11–12 cycles of the no-assimilation experiment, however, there is a pronounced increase of the rms error. This suggests that the particular initial perturbation (defined as a difference from the truth) is unstable during the initial cycles. As the cycles continue, the no-assimilation forecast converges to the true solution, indicating the ultimate stability of the initial perturbation. This is an artifact of diffusion, which would eventually force all forecasts to zero, therefore making all rms errors equal to zero. One can note good rms convergence in the MLEF experiment within the first few cycles. The final rms error is nonzero, since the defined truth (i.e., $u^{\mathrm{true}}$) is just a long-term forecast used to create the observations, not necessarily equal to the actual true analysis solution. Overall, the rms error indicates a stable MLEF performance.

The estimate of the analysis error covariance in the control MLEF experiment is shown in Fig. 5, for the analysis cycles 1, 4, 7, and 10. These cycles are chosen in order to illustrate an initial adjustment of the analysis. Each of the figures represents actual matrix elements, with the diagonal corresponding to the variance. All analysis error covariance figures have a threshold of ±1 × 10⁻⁴ nondimensional units, in order to ease the qualitative comparison between the results from different experiments. Although the true analysis error covariance is not known, it would have nonzero values, since the observations have a nonzero error. One can immediately note how the analysis error covariance became localized by the fourth cycle, without the need for any artificial error covariance localization. Also, the values of the covariance matrix remain relatively small through the cycles, moving with the solitons.

Statistics of innovation vectors are an important sign of the algorithm performance, especially useful when the truth is not known. The χ² test and the innovation histogram are shown in Fig. 6. As suggested earlier, because of increased nonlinearity and the small statistical sample, one should expect only approximate agreement. It is clear that the χ² value remains near 1 throughout the analysis cycles, again suggesting a stable performance of the MLEF algorithm. The innovation histogram shows a close resemblance to the standard normal PDF, confirming that the innovation statistics are satisfactory.

2) Impact of iterative minimization

The control MLEF experiment (with three minimization iterations) is compared with the ensemble data assimilation experiment with no explicit minimization. In both experiments 10 ensemble members and 10 observations are used, and a quadratic observation operator is employed. The only difference is that the MLEF employs an iterative minimization, while the no-minimization experiment is a single minimization iteration with the step length equal to 1 (e.g., appendix A). The experiment without minimization indirectly reflects the impact of linear analysis solution, implied in ensemble Kalman filters. It should be noted, however, that there are many other details of ensemble Kalman filters not captured in this experiment, and any direct comparison should be taken with caution.

The rms errors are shown in Fig. 7. It is obvious that without minimization, the ensemble-based reduced-rank data assimilation algorithm does not perform well. The explanation is that the MLEF algorithm is better equipped to handle nonlinearities of observation operators, and thus it creates smaller rms errors. Most of the difficulties in the no-minimization experiment occur during the first 11–12 cycles, coinciding with the rms increase noted in the no-observation experiment (Fig. 4).

With fewer observations, the positive impact of iterative minimization is still notable in terms of rms error, although the impact is somewhat smaller in magnitude (Fig. 8). Again, most of the differences occur during the first cycles, with both solutions reaching the same rms error in later cycles. The reduced number of observations does have a negative impact on the performance of both algorithms, as expected. The impact of minimization is also evaluated for the in situ observations, in terms of rms error (Fig. 9). As before, the positive impact of minimization is notable only in the first cycles, with both algorithms showing signs of difficulty. A hidden problem with the no-minimization experiments with five observations and with in situ observations was that a satisfactory solution was possible only for smaller initial ensemble perturbations. Therefore, the results shown in Figs. 8 and 9 imply smaller initial ensemble perturbations than in the experiments with 10 targeted observations (Fig. 7). This may be an indication of the sensitivity of the KdVB numerical solution to large perturbations, but it may also suggest a critical role of iterative minimization in situations with large innovation residuals. This issue will be further examined in the future, in applications with realistic models and real observations.

Overall, the use of iterative minimization in MLEF shows a positive impact in terms of the rms error. The impact appears to be stronger for a better observed system.

3) Impact of observation location

An interesting problem, related to the impact of observation location on the performance of ensemble data assimilation, is now considered. The issue of targeted observations, as means to improve the regular observation network, has been thoroughly discussed and evaluated (Palmer et al. 1998; Buizza and Montani 1999; Langland et al. 1999; Szunyogh et al. 2002; Majumdar et al. 2002). Here, we indirectly address this issue by examining the impact of observation location on the performance of the MLEF algorithm.

At the initial time, the in situ and targeted observations are chosen to be identical. The two solitons may be viewed as weather disturbances whose phase and amplitude are important to predict (for example, as the temperature associated with fronts). Since these systems move and interact with each other, it is instructive to evaluate the impact of targeted observations, intuitively associated with the location of the disturbances. Figure 10 shows the rms errors in the targeted and in situ MLEF experiments. There is a strong positive impact of targeted observations, which are able to resolve the two disturbances at all times. The particular location of the in situ observations does not allow the optimal use of observation information with regard to the two solitons. Only at cycles when the solitons are passing through the in situ observation network is the observation information adequately transferred and accumulated, eventually resulting in small rms errors.

The analysis error covariance associated with the in situ MLEF experiment is shown in Fig. 11 and should be compared with that of the control MLEF experiment (Fig. 5). The positive impact of targeted observations is now even more obvious. In the first cycle, the results are identical since the targeted and in situ observations are identical. As the cycles proceed, the in situ experiment produces much larger uncertainties than the control MLEF experiment, especially near the locations of the solitons. Although one should not draw strong conclusions from this simple experiment, the results appear to suggest that targeted observations amplify the beneficial impact of ensemble data assimilation.

6. Summary and conclusions

The maximum likelihood ensemble filter (MLEF) is presented in an application to the one-dimensional Korteweg–de Vries–Burgers equation with two solitons. The filter combines the maximum likelihood approach with the ensemble Kalman filter methodology to create a qualitatively new ensemble data assimilation algorithm with desirable computational features. The analysis solution is obtained as the model state that maximizes the posterior probability distribution, via an unconstrained minimization of an arbitrary nonlinear cost function. This creates an important link between control theory and ensemble data assimilation. Like other ensemble data assimilation algorithms, the MLEF produces an estimate of the analysis uncertainty (e.g., the analysis error covariance) and employs only the nonlinear forecast model and observation operators. The use of linearized models, or adjoints, required for variational methods, is completely avoided. The impact of the MLEF method is illustrated in an example with a quadratic observation operator. The innovation vector statistics (e.g., the χ² test and innovation histogram) indicate satisfactory, stable performance of the algorithm. Although in this paper the MLEF method is applied in a simple environment, all calculations and processing of observations are directly applicable to state-of-the-art forecast models and arbitrary nonlinear observation operators. Since the observations assimilated in the experiments presented here are just a single realization of infinitely many possible realizations, the obtained results also depend on the particular observation realization.

The impact of targeted observations is another important issue relevant to operational data assimilation and the use of ensembles. It appears that the targeted observation network amplifies the beneficial impact of ensemble data assimilation. This is certainly an issue worthy of further investigation.

The positive impact of iterative minimization, on both the rms error and the analysis error covariance, is obvious. The MLEF algorithm clearly benefits from the maximum likelihood component. The additional calculation involved in iterative minimization is almost negligible compared to the cost of the ensemble forecasts and the Hessian preconditioning calculations. Only two to three minimization iterations are anticipated in realistic applications, further easing possible concerns about using iterative minimization.

The positive impact of minimization in the case of a nonlinear observation operator suggests that an iterative minimization approach can also be used in other ensemble-based data assimilation algorithms that rely on calculating the conditional mean (e.g., the ensemble mean). Such an algorithm would be more robust with respect to nonlinear observation operators.

Because it uses a control deterministic forecast as the first guess, the MLEF method may be more appealing in applications where a deterministic forecast is of interest. The MLEF method offers a potential advantage when the computational burden forces the ensemble forecasts to be calculated at a coarser resolution than desired. One can still minimize the cost function defined at fine resolution and thus produce the control (maximum likelihood) forecast at fine resolution. Only the ensembles, used for the error covariance calculation, are defined at coarse resolution. Using the ensemble mean as the first guess, on the other hand, may be a limiting factor in that respect, since the data assimilation problem would be defined and solved only at the coarser resolution.

In a forthcoming paper, the model error and model error covariance evolution will be added to the MLEF algorithm. Applications to realistic models and observations are also underway. At a somewhat higher computational cost, the MLEF algorithm allows a straightforward extension to smoothing, which could be relevant in applications with a high temporal frequency of observations.

In future MLEF development, a non-Gaussian PDF framework and improved Hessian preconditioning are anticipated, to further extend the use of control theory in challenging geophysical applications. Both the conditional mean (e.g., minimum variance) and the conditional mode (e.g., maximum likelihood) are important PDF estimates (e.g., Cohn 1997). Future development of the MLEF will address these issues.

Acknowledgments

I thank Dusanka Zupanski for many helpful discussions and careful reading of the manuscript. My gratitude is also extended to Ken Eis for helpful comments and suggestions. I also thank Thomas Vonder Haar and Tomislava Vukicevic for their continuous support throughout this work. I am greatly indebted to Rolf Reichle and an anonymous reviewer for thorough reviews that significantly improved the manuscript. This research was supported by the Department of Defense Center for Geosciences/Atmospheric Research at Colorado State University under Cooperative Agreement DAAD19-02-2-0005 with the Army Research Laboratory.

REFERENCES

  • Anderson, E., and Coauthors, 1992: LAPACK Users’ Guide. Society for Industrial and Applied Mathematics, 235 pp.

  • Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.

  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642.

  • Axelsson, O., 1984: Iterative Solution Methods. Cambridge University Press, 644 pp.

  • Bell, B. M., and F. W. Cathey, 1993: The iterated Kalman filter update as a Gauss-Newton method. IEEE Trans. Automat. Contr., 38, 294–297.

  • Bishop, C., J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.

  • Brasseur, P., J. Ballabrera, and J. Verron, 1999: Assimilation of altimetric data in the mid-latitude oceans using the SEEK filter with an eddy-resolving primitive equation model. J. Mar. Syst., 22, 269–294.

  • Buizza, R., and A. Montani, 1999: Targeting observations using singular vectors. J. Atmos. Sci., 56, 2965–2985.

  • Cohn, S. E., 1997: Estimation theory for data assimilation problems: Basic conceptual framework and some open questions. J. Meteor. Soc. Japan, 75, 257–288.

  • Cohn, S. E., A. da Silva, J. Guo, M. Sienkiewicz, and D. Lamich, 1998: Assessing the effects of data selection with the DAO physical-space statistical analysis system. Mon. Wea. Rev., 126, 2913–2926.

  • Courtier, P., J-N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1388.

  • Daley, R., and R. Menard, 1993: Spectral characteristics of Kalman filter systems for atmospheric data assimilation. Mon. Wea. Rev., 121, 1554–1565.

  • Daley, R., and E. Barker, 2001: NAVDAS: Formulation and diagnostics. Mon. Wea. Rev., 129, 869–883.

  • Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte-Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10143–10162.

  • Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dyn., 53, 343–367.

  • Evensen, G., and P. J. van Leeuwen, 2000: An ensemble Kalman smoother for nonlinear dynamics. Mon. Wea. Rev., 128, 1852–1867.

  • Fisher, M., and P. Courtier, 1995: Estimating the covariance matrix of analysis and forecast error in variational data assimilation. ECMWF Tech. Memo. 220, 28 pp.

  • Gandin, L. S., 1963: Objective Analysis of Meteorological Fields (in Russian). Gidrometeorizdat, 238 pp. [English translation by Israel Program for Scientific Translations, 1965, 242 pp.]

  • Gill, P. E., W. Murray, and M. H. Wright, 1981: Practical Optimization. Academic Press, 401 pp.

  • Golub, G. H., and C. F. van Loan, 1989: Matrix Computations. 2d ed. The Johns Hopkins University Press, 642 pp.

  • Gottwald, G., and R. Grimshaw, 1999a: The formation of coherent structures in the context of blocking. J. Atmos. Sci., 56, 3640–3662.

  • Gottwald, G., and R. Grimshaw, 1999b: The effect of topography on the dynamics of interacting solitary waves in the context of atmospheric blocking. J. Atmos. Sci., 56, 3663–3678.

  • Greenwald, T. J., S. A. Christopher, J. Chou, and J. C. Liljegren, 1999: Inter-comparison of cloud liquid water path derived from the GOES 9 imager and ground based microwave radiometers for continental stratocumulus. J. Geophys. Res., 104, 9251–9260.

  • Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.

  • Haugen, V. E. J., and G. Evensen, 2002: Assimilation of SLA and SST data into an OGCM for the Indian Ocean. Ocean Dyn., 52, 133–151.

  • Heemink, A. W., M. Verlaan, and J. Segers, 2001: Variance reduced ensemble Kalman filtering. Mon. Wea. Rev., 129, 1718–1728.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.

  • Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory. Academic Press, 376 pp.

  • Kalman, R., and R. Bucy, 1961: New results in linear prediction and filtering theory. Trans. ASME J. Basic Eng., 83D, 95–108.

  • Keppenne, C. L., 2000: Data assimilation into a primitive-equation model with a parallel ensemble Kalman filter. Mon. Wea. Rev., 128, 1971–1981.

  • Keppenne, C. L., and M. M. Rienecker, 2002: Initial testing of massively parallel ensemble Kalman filter with the Poseidon isopycnal ocean general circulation model. Mon. Wea. Rev., 130, 2951–2965.

  • Langland, R. H., and Coauthors, 1999: The North Pacific Experiment (NORPEX-98): Targeted observations for improved North American weather forecasts. Bull. Amer. Meteor. Soc., 80, 1363–1384.

  • Lermusiaux, P. F. J., and A. R. Robinson, 1999: Data assimilation via error subspace statistical estimation. Part I: Theory and schemes. Mon. Wea. Rev., 127, 1385–1407.

  • Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.

  • Luenberger, D. L., 1984: Linear and Non-linear Programming. 2d ed. Addison-Wesley, 491 pp.

  • Majumdar, S. J., C. H. Bishop, B. J. Etherton, and Z. Toth, 2002: Adaptive sampling with the ensemble transform Kalman filter. Part II: Field program implementation. Mon. Wea. Rev., 130, 1356–1369.

  • Marchant, T. R., and N. F. Smyth, 2002: The initial-boundary problem for the Korteweg–de Vries equation on the negative quarter-plane. Proc. Roy. Soc. London, 458A, 857–871.

  • Menard, R., S. E. Cohn, L-P. Chang, and P. M. Lyster, 2000: Assimilation of stratospheric chemical tracer observations using a Kalman filter. Part I: Formulation. Mon. Wea. Rev., 128, 2654–2671.

  • Mitsudera, H., 1994: Eady solitary waves: A theory of type B cyclogenesis. J. Atmos. Sci., 51, 3137–3154.

  • Navon, I. M., X. Zou, J. Derber, and J. Sela, 1992: Variational data assimilation with an adiabatic version of the NMC spectral model. Mon. Wea. Rev., 120, 1433–1446.

  • Nocedal, J., 1980: Updating quasi-Newton matrices with limited storage. Math. Comput., 35, 773–782.

  • Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza, 1998: Singular vectors, metrics, and adaptive observations. J. Atmos. Sci., 55, 633–653.

  • Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center’s Spectral Statistical Interpolation Analysis System. Mon. Wea. Rev., 120, 1747–1763.

  • Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

  • Rabier, F., A. McNally, E. Andersson, P. Courtier, P. Unden, J. Eyre, A. Hollingsworth, and F. Bouttier, 1998: The ECMWF implementation of three dimensional variational assimilation (3D-Var). Part II: Structure functions. Quart. J. Roy. Meteor. Soc., 124A, 1809–1829.

  • Rabier, F., H. Jarvinen, E. Klinker, J-F. Mahfouf, and A. Simmons, 2000: The ECMWF operational implementation of four-dimensional variational assimilation. I: Experimental results with simplified physics. Quart. J. Roy. Meteor. Soc., 126A, 1143–1170.

  • Reichle, R. H., D. B. McLaughlin, and D. Entekhabi, 2002a: Hydrologic data assimilation with the ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.

  • Reichle, R. H., J. P. Walker, R. D. Koster, and P. R. Houser, 2002b: Extended versus ensemble Kalman filtering for land data assimilation. J. Hydrometeor., 3, 728–740.

  • Szunyogh, I., Z. Toth, A. V. Zimin, S. J. Majumdar, and A. Persson, 2002: Propagation of the effect of targeted observations: The 2000 Winter Storm Reconnaissance Program. Mon. Wea. Rev., 130, 1144–1165.

  • Tippett, M., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • van Leeuwen, P. J., 2001: An ensemble smoother with error estimates. Mon. Wea. Rev., 129, 709–728.

  • Verlaan, M., and A. W. Heemink, 2001: Nonlinearity in data assimilation applications: A practical method for analysis. Mon. Wea. Rev., 129, 1578–1589.

  • Vvedensky, D., 1993: Partial Differential Equations with Mathematica. Addison-Wesley, 465 pp.

  • Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.

  • Zou, X., Y-H. Kuo, and Y-R. Guo, 1995: Assimilation of atmospheric radio refractivity using a nonhydrostatic adjoint model. Mon. Wea. Rev., 123, 2229–2250.

  • Zou, X., H. Liu, J. Derber, J. G. Sela, R. Treadon, I. M. Navon, and B. Wang, 2001: Four-dimensional variational data assimilation with a diabatic version of the NCEP global spectral model: System development and preliminary results. Quart. J. Roy. Meteor. Soc., 127, 1095–1122.

  • Zupanski, M., 1993: Regional four-dimensional variational data assimilation in a quasi-operational forecasting environment. Mon. Wea. Rev., 121, 2396–2408.

  • Zupanski, M., D. Zupanski, D. Parrish, E. Rogers, and G. DiMego, 2002: Four-dimensional variational data assimilation for the Blizzard of 2000. Mon. Wea. Rev., 130, 1967–1988.

APPENDIX A

Equivalence of the Kalman Gain and Hessian Preconditioning-Gradient Calculation

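The preconditioned steepest descent is often used as the first iterative step in many gradient-based minimization algorithms, such as the conjugate-gradient, quasi-Newton, and truncated Newton algorithms (e.g., Gill et al. 1981):
$$\mathbf{x}^{a} = \mathbf{x}^{b} - \alpha\,\mathbf{E}^{-1}\mathbf{g}, \tag{A1}$$
where α is the step length, 𝗘 is the Hessian, and g is the gradient of the cost function (6) in the first minimization iteration. Denoting the Jacobian of the observation operator as 𝗛,
$$\mathbf{H} = \left.\frac{\partial H}{\partial \mathbf{x}}\right|_{\mathbf{x}^{b}}, \tag{A2}$$
the gradient of the cost function in the first iteration is
$$\mathbf{g} = -\mathbf{H}^{T}\mathbf{R}^{-1}\left[\mathbf{y} - H(\mathbf{x}^{b})\right], \tag{A3}$$
and the Hessian is
$$\mathbf{E} = \mathbf{P}_f^{-1} + \mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}. \tag{A4}$$
Substituting (A3) and (A4) in (A1) gives
$$\mathbf{x}^{a} = \mathbf{x}^{b} + \alpha\left(\mathbf{P}_f^{-1} + \mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^{T}\mathbf{R}^{-1}\left[\mathbf{y} - H(\mathbf{x}^{b})\right]. \tag{A5}$$
After employing the matrix identity (Jazwinski 1970),
$$\left(\mathbf{P}_f^{-1} + \mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^{T}\mathbf{R}^{-1} = \mathbf{P}_f\mathbf{H}^{T}\left(\mathbf{H}\mathbf{P}_f\mathbf{H}^{T} + \mathbf{R}\right)^{-1}, \tag{A6}$$
the analysis update in the first minimization iteration becomes
$$\mathbf{x}^{a} = \mathbf{x}^{b} + \alpha\,\mathbf{P}_f\mathbf{H}^{T}\left(\mathbf{H}\mathbf{P}_f\mathbf{H}^{T} + \mathbf{R}\right)^{-1}\left[\mathbf{y} - H(\mathbf{x}^{b})\right]. \tag{A7}$$
For a quadratic cost function, the step length α is equal to 1 (Gill et al. 1981). Therefore, for a quadratic cost function, the solution of the iterative minimization problem in the first iteration is identical to the extended Kalman filter solution (Jazwinski 1970). In this context, the matrix identity (A6) shows the equivalence between the Kalman gain calculation and the Hessian-gradient calculation in iterative minimization. For a nonquadratic cost function, the step length is different from 1, and the solution (A7) is not identical to the extended Kalman filter solution. The Kalman gain computation, however, is still the same as the Hessian-gradient computation.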
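The equivalence argued in this appendix can be checked numerically on a small example. The sketch below is only an illustration, not the paper's code: it assumes a linear observation operator (so the cost function is quadratic and α = 1), the standard quadratic cost function with forecast error covariance Pf and observation error covariance R, and made-up 2 × 2 numbers. All helper names (`matmul`, `inv2`, and so on) are hypothetical.

```python
# Illustrative 2x2 check: one preconditioned steepest-descent step (alpha = 1)
# versus the Kalman-gain update. All numbers below are invented for the example.

def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inv2(a):
    """Inverse of a 2x2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

Pf = [[2.0, 0.5], [0.5, 1.0]]   # assumed forecast error covariance
R = [[0.4, 0.0], [0.0, 0.9]]    # assumed observation error covariance
H = [[1.0, 0.0], [1.0, 2.0]]    # assumed linear observation operator (its own Jacobian)
xb = [[1.0], [-1.0]]            # first guess
y = [[1.5], [0.2]]              # observations

Ht = transpose(H)
Rinv = inv2(R)
innovation = sub(y, matmul(H, xb))

# Minimization route: gradient and Hessian at the first guess, then x = xb - E^{-1} g.
HtRinv = matmul(Ht, Rinv)
g = [[-v for v in row] for row in matmul(HtRinv, innovation)]
E = add(inv2(Pf), matmul(HtRinv, H))
x_min = sub(xb, matmul(inv2(E), g))

# Kalman route: x = xb + Pf H^T (H Pf H^T + R)^{-1} (y - H xb).
S = add(matmul(matmul(H, Pf), Ht), R)
K = matmul(matmul(Pf, Ht), inv2(S))
x_kf = add(xb, matmul(K, innovation))

max_diff = max(abs(x_min[i][0] - x_kf[i][0]) for i in range(2))
print("max difference between the two updates:", max_diff)
```

The two routes agree to rounding error here because the matrix identity holds exactly for any symmetric positive-definite Pf and R; with a nonlinear observation operator only the gain computation, not the full update, would match.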

APPENDIX B

Computation of the Matrix $\mathbf{G}^{-1/2}$

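The matrices $\mathbf{G}^{-1}$ [(18)] and $\mathbf{G}^{-1/2}$ [(19)] are needed for the computation of normalized innovations. An efficient algorithm for computing the inverse square root matrix $\mathbf{G}^{-1/2}$ is presented here. It relies on the Sherman–Morrison–Woodbury (SMW) formula, as well as on an iterative matrix square root calculation procedure. This matrix is used to calculate the normalized innovations [(19)], which are then used in calculating the χ² sum [(18)]. From (17) and (18), one can see that
$$\mathbf{G}^{-1} = \left(\mathbf{I} + \mathbf{R}^{-1/2}\mathbf{H}\mathbf{P}_f\mathbf{H}^{T}\mathbf{R}^{-1/2}\right)^{-1}. \tag{B1}$$
Using (13) to define the columns of the matrix $\mathbf{Z} = (\mathbf{z}_1\ \mathbf{z}_2\ \ldots\ \mathbf{z}_S)$, one can redefine $\mathbf{G}^{-1}$ as
$$\mathbf{G}^{-1} = \left(\mathbf{I} + \mathbf{Z}\mathbf{Z}^{T}\right)^{-1}. \tag{B2}$$
Note that the columns of the matrix 𝗭 are the same as those used in the MLEF algorithm [e.g., (8) and (13)] and are available at no additional cost. The issue is how to calculate the inverse in (B2), as well as the matrix square root required by (19). From the SMW formula (e.g., Golub and van Loan 1989),
$$\left(\mathbf{I} + \mathbf{Z}\mathbf{Z}^{T}\right)^{-1} = \mathbf{I} - \mathbf{Z}\left(\mathbf{I} + \mathbf{Z}^{T}\mathbf{Z}\right)^{-1}\mathbf{Z}^{T}. \tag{B3}$$
Note that $\mathbf{C} = \mathbf{Z}^{T}\mathbf{Z}$, where the matrix 𝗖 is the same as defined by (9) and (10) in the text. This means that (B3) can be rewritten as
$$\mathbf{G}^{-1} = \mathbf{I} - \mathbf{Z}\mathbf{V}\left(\mathbf{I} + \boldsymbol{\Lambda}\right)^{-1}\mathbf{V}^{T}\mathbf{Z}^{T}, \tag{B4}$$
where the eigenvector matrix 𝗩 and the eigenvalue matrix Λ are both available from the MLEF algorithm [(10)]. Therefore, all matrices on the right-hand side of (B4) are available. To calculate the square root of the positive-definite symmetric matrix $\mathbf{G}^{-1}$, one can exploit an iterative formula (Golub and van Loan 1989, p. 554, problem P11.2.4), which produces a unique symmetric positive-definite square root matrix $\mathbf{G}^{-1/2}$:
$$\mathbf{X}_{k+1} = \frac{1}{2}\left(\mathbf{X}_k + \mathbf{X}_k^{-1}\mathbf{G}^{-1}\right), \qquad \mathbf{X}_0 = \mathbf{I}. \tag{B5}$$
It is important to realize that the specific form of the matrix $\mathbf{G}^{-1}$ [e.g., (B4)], and the fact that 𝗩 is unitary (i.e., $\mathbf{V}^{T}\mathbf{V} = \mathbf{I}$), allow a simplification of the matrix inversion involved in (B5). To see that, it is convenient to write $\mathbf{G}^{-1}$ in the generic form
$$\mathbf{G}^{-1} = \mathbf{I} - \mathbf{Z}\mathbf{V}\boldsymbol{\Psi}_0\mathbf{V}^{T}\mathbf{Z}^{T}, \tag{B6}$$
where $\boldsymbol{\Psi}_0$ is a nonzero diagonal matrix, with known elements $\psi_i = 1/(1 + \lambda_i)$. After applying (B5) with formulation (B6), one obtains
$$\mathbf{X}_1 = \mathbf{I} - \mathbf{Z}\mathbf{V}\boldsymbol{\Sigma}_1\mathbf{V}^{T}\mathbf{Z}^{T}, \qquad \boldsymbol{\Sigma}_1 = \frac{1}{2}\boldsymbol{\Psi}_0. \tag{B7}$$
With the help of the SMW formula, and using $\mathbf{V}^{T}\mathbf{Z}^{T}\mathbf{Z}\mathbf{V} = \boldsymbol{\Lambda}$, the inverse is
$$\mathbf{X}_1^{-1} = \mathbf{I} + \mathbf{Z}\mathbf{V}\boldsymbol{\Gamma}_1\mathbf{V}^{T}\mathbf{Z}^{T}, \qquad \boldsymbol{\Gamma}_1 = \boldsymbol{\Sigma}_1\left(\mathbf{I} - \boldsymbol{\Lambda}\boldsymbol{\Sigma}_1\right)^{-1}. \tag{B8}$$
If the procedure is continued, it soon becomes clear that both $\mathbf{X}_k$ and $\mathbf{X}_k^{-1}$ keep the same form, and only the diagonal matrices $\boldsymbol{\Sigma}_k$ and $\boldsymbol{\Gamma}_k$ are updated during the iterations. This greatly reduces the computational burden of the matrix square root calculation. A recursive (iterative) algorithm for $\mathbf{G}^{-1/2}$ can then be defined:
$$\boldsymbol{\Sigma}_0 = \mathbf{0}, \qquad \boldsymbol{\Gamma}_0 = \mathbf{0}, \tag{B9}$$
$$\boldsymbol{\Sigma}_{k+1} = \frac{1}{2}\left(\boldsymbol{\Sigma}_k + \boldsymbol{\Psi}_0 - \boldsymbol{\Gamma}_k + \boldsymbol{\Gamma}_k\boldsymbol{\Lambda}\boldsymbol{\Psi}_0\right), \qquad \boldsymbol{\Gamma}_{k+1} = \boldsymbol{\Sigma}_{k+1}\left(\mathbf{I} - \boldsymbol{\Lambda}\boldsymbol{\Sigma}_{k+1}\right)^{-1}, \qquad k = 0, \ldots, N-1, \tag{B10}$$
$$\mathbf{G}^{-1/2} = \mathbf{I} - \mathbf{Z}\mathbf{V}\boldsymbol{\Sigma}_N\mathbf{V}^{T}\mathbf{Z}^{T}. \tag{B11}$$
The recursive formulas (B9)–(B11) are computationally very efficient, because the iterative procedure (B10) employs only diagonal matrices. Once it is determined that the algorithm has converged, the square root matrix is formed [e.g., (B11)].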

In experiments conducted in this paper, a satisfactory convergence was found after only three iterations [e.g., N = 3 in (B10)]. The above algorithm is stable and is convenient for the matrix square root calculations in the context of MLEF.
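A diagonal recursion of the kind described in this appendix can be sketched in a few lines of Python. The sketch rests on assumptions stated here rather than taken from the paper's equations: that $\mathbf{G}^{-1} = (\mathbf{I} + \mathbf{Z}\mathbf{Z}^{T})^{-1}$, that the square-root iteration is $\mathbf{X}_{k+1} = (\mathbf{X}_k + \mathbf{X}_k^{-1}\mathbf{G}^{-1})/2$ with $\mathbf{X}_0 = \mathbf{I}$, and that each iterate is written as $\mathbf{I} - \mathbf{Z}\mathbf{V}\,\mathrm{diag}(\sigma)\,\mathbf{V}^{T}\mathbf{Z}^{T}$. The coefficient update is derived from those assumptions, and the function names are hypothetical. The demonstration uses a single ensemble column, so that no eigensolver is needed.

```python
import math

def inverse_sqrt_coefficients(eigenvalues, n_iter=3):
    """Return sigma_i such that G^{-1/2} ~= I - Z V diag(sigma) V^T Z^T,
    given the eigenvalues lambda_i of Z^T Z (assumed form, see lead-in)."""
    psi = [1.0 / (1.0 + lam) for lam in eigenvalues]   # G^{-1} coefficients
    sigma = [0.0 for _ in eigenvalues]                 # X_0 = I
    for _ in range(n_iter):
        # Inverse of the current iterate: X_k^{-1} = I + Z V diag(gamma) V^T Z^T.
        gamma = [s / (1.0 - lam * s) for s, lam in zip(sigma, eigenvalues)]
        # X_{k+1} = (X_k + X_k^{-1} G^{-1}) / 2, expressed on the diagonal only.
        sigma = [0.5 * (s + p - g + g * lam * p)
                 for s, p, g, lam in zip(sigma, psi, gamma, eigenvalues)]
    return sigma

def exact_coefficient(lam):
    """Closed-form limit of the recursion for a single eigenvalue lam > 0."""
    return (1.0 - 1.0 / math.sqrt(1.0 + lam)) / lam

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Rank-one illustration: a single column z, so V = [1] and lambda = z^T z.
z = [1.0, 1.0, 1.0]
lam = sum(v * v for v in z)

def identity_minus(coeff):
    """Return I - coeff * z z^T as a nested list."""
    n = len(z)
    return [[(1.0 if i == j else 0.0) - coeff * z[i] * z[j]
             for j in range(n)] for i in range(n)]

sigma3 = inverse_sqrt_coefficients([lam], n_iter=3)[0]
sigma8 = inverse_sqrt_coefficients([lam], n_iter=8)[0]

G_inv = identity_minus(1.0 / (1.0 + lam))   # (I + z z^T)^{-1}
X = identity_minus(sigma8)                  # candidate G^{-1/2}
XX = matmul(X, X)
square_error = max(abs(XX[i][j] - G_inv[i][j])
                   for i in range(3) for j in range(3))

print("3 iterations:", sigma3, " 8 iterations:", sigma8,
      " closed form:", exact_coefficient(lam))
print("max |X X - G^{-1}|:", square_error)
```

Because only the diagonal coefficients are updated, the cost per iteration scales with the ensemble size rather than with the number of observations; the full matrix is formed once, after the loop.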

Fig. 1. Time integration of the KdVB model and observations: (a) targeted observations and (b) in situ observations. The triangles denote the observations. The horizontal axis represents the model domain, and the ordinate axis is the amplitude. The cycles shown are 1, 4, 7, and 10. Note how the targeted observations follow the solitons, whereas the in situ observations remain in one location.

Citation: Monthly Weather Review 133, 6; 10.1175/MWR2946.1

Fig. 2. Chi-square statistics in the linear observation operator assimilation experiment, with (a) 10 and (b) 101 observations per cycle. The dashed line represents instantaneous values of χ² from each analysis cycle, and the solid line represents a 10-cycle moving average.

Fig. 3. Innovation histogram in the linear observation operator assimilation experiment, with (a) 10 and (b) 101 observations per cycle. The solid line represents the normal distribution N(0, 1).

Fig. 4. The rms error in the control MLEF experiment, with a quadratic observation operator and 10 observations (thin solid line). The horizontal axis denotes the analysis cycles, and the ordinate axis is the rms error. The rms error in the no-assimilation experiment is also shown (thick solid line).

Fig. 5. The analysis error covariance in the control MLEF experiment: analysis cycles (a) 1, (b) 4, (c) 7, and (d) 10. Each point represents the (i, j)th matrix element p_ij, with the horizontal axis denoting the i index and the ordinate axis denoting the j index. The dark-shaded area represents positive covariance, and the light-shaded area represents negative covariance, using a threshold of ±1 × 10⁻⁴ nondimensional units. The contour interval is (a) 20, (b) 2, (c) 2, and (d) 2 nondimensional units.

Fig. 6. Innovation statistics in the control MLEF experiment: (a) χ² test and (b) PDF histogram. The notation is the same as in Figs. 2 and 3.

Fig. 7. Impact of minimization on the MLEF performance, showing the rms errors of ensemble data assimilation without minimization (solid line) and those of the control MLEF for comparison (dashed line). The horizontal axis denotes the analysis cycles, and the ordinate axis is the rms error.

Fig. 8. Same as in Fig. 7, but for the experiment with five observations.

Fig. 9. Same as in Fig. 7, but for the in situ observation experiment with 10 observations.

Fig. 10. Impact of observation location. The dashed line represents the rms errors obtained with in situ observations, and the solid line is the rms error from the control MLEF experiment (i.e., targeted observations).

Fig. 11. Same as in Fig. 5, but for the in situ assimilation experiment. The contour interval is (a) 20, (b) 2, (c) 2, and (d) 20 nondimensional units.
