  • Abramov, R. V., G. Kovacic, and A. J. Majda, 2003: Hamiltonian structure and statistically relevant conserved quantities for the truncated Burgers–Hopf equation. Commun. Pure Appl. Math., 56, 1–46.

  • Anthes, R. A., 1974: Data assimilation and initialization of hurricane prediction model. J. Atmos. Sci., 31, 702–719.

  • Athans, M., and P. L. Falb, 1966: Optimal Control. McGraw-Hill, 879 pp.

  • Bennett, A., 1992: Inverse Methods in Physical Oceanography. Cambridge University Press, 346 pp.

  • Bennett, A., and M. A. Thorburn, 1992: The generalized inverse of a nonlinear quasigeostrophic ocean circulation model. J. Phys. Oceanogr., 22, 213–230.

  • Bennett, S., 1996: A brief history of automatic control. IEEE Control Syst., 16, 17–25.

  • Bergman, K. H., 1979: Multivariate analysis of temperatures and winds using optimum interpolation. Mon. Wea. Rev., 107, 1423–1444.

  • Bergthórsson, P., and B. Döös, 1955: Numerical weather map analysis. Tellus, 7, 329–340.

  • Boltyanskii, V. G., 1971: Mathematical Methods of Optimal Control. Holt, Rinehart and Winston, 272 pp.

  • Boltyanskii, V. G., 1978: Optimal Control of Discrete Systems. John Wiley and Sons, 392 pp.

  • Bryson, A. E., 1996: Optimal control-1950 to 1985. IEEE Control Syst., 16, 26–33.

  • Bryson, A. E., 1999: Dynamic Optimization. Addison-Wesley, 434 pp.

  • Canon, M. D., C. D. Cullum Jr., and E. Polak, 1970: Theory of Optimal Control and Mathematical Programming. McGraw Hill, 285 pp.

  • Carrier, G. F., and C. E. Pearson, 1976: Partial Differential Equations: Theory and Techniques. Academic Press, 320 pp.

  • Catlin, D. E., 1989: Estimation, Control and the Discrete Kalman Filter. Springer-Verlag, 274 pp.

  • Dee, D. P., and A. M. da Silva, 1998: Data assimilation in the presence of forecast bias. Quart. J. Roy. Meteor. Soc., 124, 269–295.

  • Derber, J., 1989: A variational continuous assimilation technique. Mon. Wea. Rev., 117, 2437–2446.

  • Eliassen, A., 1954: Provisional report on the calculation of spatial covariance and autocorrelation of pressure field. Institute of Weather and Climate Research, Academy of Sciences Rep. 5, 12 pp. [Available from Norwegian Meteorological Institute, P.O. Box 43, Blindern, N-0313, Oslo, Norway.]

  • Friedland, B., 1969: Treatment of bias in recursive filtering. IEEE Trans. Autom. Control., 14, 359–367.

  • Gandin, L. S., 1965: Objective Analysis of Meteorological Fields. Israel Program for Scientific Translations, 242 pp.

  • Goldstein, H. H., 1950: Classical Mechanics. Addison-Wesley, 399 pp.

  • Goldstein, H. H., 1980: A History of the Calculus of Variations from the 17th through the 19th Century. Springer-Verlag, 410 pp.

  • Griffith, A. K., and N. K. Nichols, 2001: Adjoint methods in data assimilation for estimating model error. Flow, Turbul. Combust., 65, 469–488.

  • Hirsch, M. W., and S. Smale, 1974: Differential Equations, Dynamical Systems, and Linear Algebra. Academic Press, 358 pp.

  • Kalman, R. E., 1963: The theory of optimal control and calculus of variations. Mathematical Optimization Techniques, R. Bellman, Ed., University of California Press, 309–329.

  • Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation, and Predictability. Cambridge University Press, 341 pp.

  • Keller, H. B., 1976: Numerical Solution of Two Point Boundary Value Problems. Regional Conference Series in Applied Mathematics, Vol. 24, SIAM Publications, 69 pp.

  • Kuhn, H. W., and A. W. Tucker, 1951: Nonlinear programming. Proc. Second Berkeley Symp. on Mathematical Statistics and Probability, Berkeley, CA, University of California, Berkeley, 481–492.

  • Lakshmivarahan, S., and S. K. Dhall, 1990: Analysis and Design of Parallel Algorithm: Arithmetic and Matrix Problems. McGraw Hill, 657 pp.

  • Lakshmivarahan, S., and J. M. Lewis, 2013: Nudging: A critical overview. Data Assimilation for Atmospheric, Oceanic and Hydrologic Applications, Vol. 2, S. K. Park and L. Liang, Eds., Springer-Verlag, in press.

  • Lewis, F. L., 1986: Optimal Control. John Wiley and Sons, 362 pp.

  • Lewis, J. M., 1972: An operational upper air analysis using the variational methods. Tellus, 24, 514–530.

  • Lewis, J. M., and S. Lakshmivarahan, 2008: Sasaki’s pivotal contribution: Calculus of variation applied to weather map analysis. Mon. Wea. Rev., 136, 3553–3567.

  • Lewis, J. M., S. Lakshmivarahan, and S. K. Dhall, 2006: Dynamic Data Assimilation: A Least Squares Approach. Cambridge University Press, 654 pp.

  • Lynch, P., 2006: The Emergence of Numerical Weather Prediction: Richardson’s Dream. Cambridge University Press, 279 pp.

  • Majda, A. J., and I. Timofeyev, 2000: Remarkable statistical behavior for truncated Burgers–Hopf dynamics. Proc. Natl. Acad. Sci. USA, 97, 12 413–12 417.

  • Majda, A. J., and I. Timofeyev, 2002: Statistical mechanics for truncations of Burgers–Hopf equation: A model for intrinsic stochastic behavior with scaling. Milan J. Math., 70, 39–96.

  • Menard, R., and R. Daley, 1996: The application of Kalman smoother theory to estimation of 4DVAR error statistics. Tellus, 48A, 221–237.

  • Naidu, D. S., 2003: Optimal Control Systems. CRC Press, 433 pp.

  • Platzman, G. W., 1964: An exact integral of complete spectral equations for unsteady one-dimensional flow. Tellus, 16, 422–431.

  • Polak, E., 1997: Optimization. Springer, 779 pp.

  • Pontryagin, L. S., V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mischenko, 1962: The Mathematical Theory of Optimal Control Processes. John Wiley, 360 pp.

  • Roberts, S. M., and J. S. Shipman, 1972: Two-Point Boundary Value Problems: Shooting Method. Elsevier, 289 pp.

  • Rouch, H. E., F. Tung, and C. T. Striebel, 1965: Maximum likelihood estimates of linear dynamic systems. J. Amer. Inst. Aeronaut. Astronaut., 3, 1445–1450.

  • Sasaki, Y., 1958: An objective analysis based on the variational method. J. Meteor. Soc. Japan, 36, 77–88.

  • Sasaki, Y., 1970a: Some basic formulations in numerical variational analysis. Mon. Wea. Rev., 98, 875–883.

  • Sasaki, Y., 1970b: Numerical variational analysis formulated under the constraints as determined by longwave equations and low-pass filter. Mon. Wea. Rev., 98, 884–898.

  • Sasaki, Y., 1970c: Numerical variational analysis with weak constraint and application to surface analysis of severe storm gust. Mon. Wea. Rev., 98, 899–910.

  • Shen, J., T. Tang, and L. L. Wang, 2011: Spectral Methods. Springer-Verlag, 470 pp.

  • Wiener, N., 1948: Cybernetics: Control and Communication in the Animal and Machine. John Wiley, 194 pp.

  • Wiin-Nielsen, A., 1991: The birth of numerical weather prediction. Tellus, 43A, 36–52.

  • Zupanski, D., 1997: A general weak constraint applicable to operational 4DVAR data assimilation system. Mon. Wea. Rev., 125, 2274–2292.
  • A plot of the solution q(x, t) in (4.3) at times t = 0, 0.5, 1, 1.5, and 2.

  • Comparison of the error in the four-mode approximation and the error in the eight-mode approximation at t = 2.0. Fourier coefficients at t = 2 are given in Table 1.

  • Comparison of the four components of the uncontrolled error e0 = $\xi_k - z_k$ and the controlled error ec = $x_k - z_k$ for $p = 4$, $\mathbf{B} = \mathbf{I}_4$, and c = {100 000, 1000, 1}.

  • Comparison of four components of the control sequence $\{u_k\}$ for $p = 4$, $\mathbf{B} = \mathbf{I}_4$, and c = {100 000, 1000, 1}.

  • Comparison of the four components of the uncontrolled error e0 = $\xi_k - z_k$ and the controlled error ec = $x_k - z_k$ for $p = 1$, $\mathbf{B} = (1\ 1\ 1\ 1)^{T}$, and c = {100 000, 1000, 1}.

  • Comparison of the four components of the error $\zeta_k - z_k$ between the corrected but uncontrolled model state $\{\zeta_k\}$ in (6.10) and the observation $\{z_k\}$ with the error between the original uncorrected and uncontrolled model state in (2.1) and the observation $\{z_k\}$.

Data Assimilation as a Problem in Optimal Tracking: Application of Pontryagin’s Minimum Principle to Atmospheric Science

  • 1 School of Computer Science, University of Oklahoma, Norman, Oklahoma
  • 2 National Severe Storms Laboratory, Norman, Oklahoma, and Desert Research Institute, Reno, Nevada
  • 3 School of Computer Science, University of Oklahoma, Norman, Oklahoma

Abstract

A data assimilation strategy based on feedback control has been developed for the geophysical sciences: a strategy that uses model output to control the behavior of the dynamical system. Whereas optimal tracking through feedback control had its early history in application to vehicle trajectories in space science, the methodology has been adapted to geophysical dynamics by forcing the trajectory of a deterministic model to follow observations in accord with observation accuracy. Fundamentally, this approach is offline (all observations in a given assimilation window are assumed to be available) and is based on Pontryagin’s minimum principle (PMP), where a least squares fit of idealized path to dynamic law follows from Hamiltonian mechanics. This utilitarian process optimally determines a forcing function that depends on the state (the feedback component) and the observations. It follows that this optimal forcing accounts for the model error. From this model error, a correction to the one-step transition matrix is constructed. The above theory and technique are illustrated using the linear Burgers’ equation that transfers energy from the large scale to the small scale.

Corresponding author address: S. Lakshmivarahan, School of Computer Science, University of Oklahoma, 110 W Boyd St., Room DEH 230, Norman, OK 73019. E-mail: varahan@ou.edu

1. Introduction

Data assimilation as a means of constructing the initial conditions for dynamical prediction models in meteorology has 50+ yr of history. It began in the late 1940s–early 1950s as a response to anticipation of numerical weather prediction (NWP) that began in a research mode at Princeton’s Institute for Advanced Study (IAS) in 1946 [reviewed in Lynch (2006)]. By the mid-1950s, operational NWP commenced in Sweden and shortly thereafter in the United States (Wiin-Nielsen 1991). The first operational numerical weather map analysis or objective analysis as it was then called came from the work of Bergthórsson and Döös (1955)—the B–D scheme.

The pragmatic and utilitarian B–D scheme established the following guidelines that became central to development of meteorological data assimilation: 1) use of a background field that, in their case, was a combination of a forecast from an earlier time (12 h earlier) and climatology; and 2) interpolation of an “increment” field, the difference between the forecast and observation at the site of the observation, to grid points as a means of adjusting the background. Two optimal approaches to data assimilation came in the wake of the B–D scheme. The first was a stochastic method designed by Eliassen (1954) with further development and operational implementation by Gandin (1965) at the National Meteorological Center (NMC), United States [reviewed in Bergman (1979)]. The second was a deterministic scheme developed by Sasaki (1958, 1970a,b,c) with operational implementation by Lewis (1972) at the U.S. Navy’s Fleet Numerical Weather Center (FNWC). The subsequent advancement of these two approaches became known as three-dimensional variational data assimilation (3DVAR) and four-dimensional variational data assimilation (4DVAR), respectively. A comprehensive review of the steps that led to these developments is found in the historical paper by Lewis and Lakshmivarahan (2008). As currently practiced, both 3DVAR and 4DVAR make use of a background, a forecast from an earlier time, and thereby embrace a Bayesian philosophy (Kalnay 2003; Lewis et al. 2006).

The subject of automatic control and feedback control in particular came into prominence in the immediate post–World War II (WWII) period (Wiener 1948) when digital computers became available and control of ballistic objects such as missiles and rockets took center stage in the Cold War era (Bennett 1996; Bryson 1996). Development of mathematical algorithms to optimally track rockets and artificial satellites and to efficiently and economically change their course became a fundamental theme in control theory. One of the algorithms developed during this period became known as Pontryagin’s minimum principle (PMP) (Pontryagin et al. 1962; Boltyanskii 1971, 1978; Bryson 1996, 1999). This principle, developed by Lev Pontryagin and his collaborators, is expressed in the following form: In the presence of dynamic constraints (typically differential equations of motion), find the best possible control for taking a dynamic system from one state to another. Essentially, this principle embodies the search for minimization of a cost function that contains the Euler–Lagrange search for the minimum. As will be shown in section 3, 4DVAR is a special case of PMP. We will test this methodology and concept in meteorological problems where the task will be to force the system toward observations in much the same spirit as the nudging method (Anthes 1974)—but importantly, in this case, the process is optimal (Lakshmivarahan and Lewis 2013).

In this paper we succinctly review the basis for the PMP as it applies to the determination of the optimal control/forcing by minimizing a performance functional that is a sum of two quadratic forms representing two types of energy, where the given model is used as a strong constraint. The first term of this performance functional is the total energy of the error, that is, the difference between the observations (representing truth) and the model trajectory starting from an arbitrary initial condition. Minimization of this energy term has been the basis for the variational methods (Lewis et al. 2006). The second quadratic form represents the total energy in the control signal. It must be emphasized that the use of least energy to accomplish a goal is central to engineering design and distinguishes this approach from the traditional variational approaches to dynamic data assimilation. A family of optimal controls can be achieved by giving different weights to these two energy terms.

By introducing an appropriate Hamiltonian function, this approach based on PMP reduces the well-known second-order Euler–Lagrange equation to a system of two first-order canonical Hamiltonian equations, the like of which have guided countless developments in physics (Goldstein 1950, 1980). While Kuhn and Tucker (1951) extended the Lagrangian technique for equality constraints to include inequality constraints by developing the theory of nonlinear programming for static problems, Pontryagin et al. (1962) used this Hamiltonian formulation to extend the classical Euler–Lagrange formulation in the calculus of variations. This extension has been the basis for significant development of optimal control theory in the dynamical setting. The resulting theory is so general that it can handle both equality and inequality constraints on both the state and the control. Further, there is a close relationship between the PMP and Kuhn–Tucker condition. See Canon et al. (1970) for details.

Recall that the optimal control computed using the PMP forces the model trajectory toward the observations. Hence, it is natural to interpret this optimal control as the additive optimal model error correction. In an effort to further understand the impact of knowing this optimal sequence of model errors, we take PMP one step further. Given an erroneous linear model with $\mathbf{M}$ as its one-step state transition matrix, we have developed a flexible framework that consolidates the information in the optimal model error sequence into a correction matrix $\delta\mathbf{M}$ such that the corrected model governed by ($\mathbf{M} + \delta\mathbf{M}$) will match the optimal trajectory.

While the PMP approach to dynamic data assimilation in meteorology is new, there are conceptual and methodological similarities between this approach and the vast literature devoted to analysis of model errors. We explore some of the similarities. The contributions in the area of model error correction are broadly classified along two lines—deterministic or stochastic model and the model constraint that is strong or weak.

In a stimulating paper, Derber (1989) first postulates that the deterministic model error can be expressed as the product of an unknown temporally fixed spatial function φ and a prespecified time-varying function. Using the model as a strong constraint, he then extends the 4DVAR method to estimate φ along with the initial conditions which to our knowledge represents the first attempt to quantify the model errors using the variational framework. Griffith and Nichols (2001) again postulate that the model error evolves according to an auxiliary model with unknown parameters. By augmenting this empirical secondary model with the given model, they then estimate both the initial condition and the parameters using the 4DVAR, using the model as a strong constraint. The PMP-based approach advocated in this paper does not rely on empirical auxiliary models.

It is also appropriate to briefly mention the earlier efforts in control theory and meteorology to account for model error. See Rouch et al. (1965), Friedland (1969), Bennett (1992), Bennett and Thorburn (1992), and Dee and da Silva (1998). In the spirit of these contributions, the work by Menard and Daley (1996) made the first attempt to relate Kalman smoothing to PMP. The primary difference between our approach and the Menard and Daley (1996) approach is that we consider a deterministic strong constraint model with time-varying errors while they develop a weak constraint stochastic model formulation with stochastic error terms with known covariance structure. Zupanski’s (1997) discussion of advantages with the weak constraint formulation of the 4DVAR to assess systematic and random model errors is a meaningful complement to Menard and Daley (1996).

In section 2 we provide a derivation of the PMP for the general case of an (autonomous) nonlinear model where observations are (autonomous) nonlinear functions of the state. The computation of the optimal control sequence in this general case reduces to solving a nonlinear two-point boundary-value problem (TPBVP). We then specialize these results in section 3 to the case when both the model and observations are linear. In this case of linear dynamics, the well-known sweep method (Kalman 1963) is used to reduce the TPBVP to two initial-value problems. To illustrate the power of the PMP we have chosen the linear Burgers equation where the advection velocity is a sinusoidal function of the space variable; this linear model has many of the characteristics of Platzman’s (1964) classic study of Burgers’ nonlinear advection. Many of the key properties of this linear Burgers equation and its n-mode spectral counterpart [also known as the low-order model LOM(n)] obtained by using the standard Galerkin projection method (Shen et al. 2011) are described in section 4. Numerical experiments relating to the optimal control of LOM(4) are given in section 5. In a series of interesting papers, Majda and Timofeyev (2000, 2002) and Abramov et al. (2003) analyze the statistical properties of the solution of the n-mode spectral approximation to the nonlinear Burgers equation. Section 6 illustrates the computation of the consolidated correction matrix using the computed time series of optimal controls and the associated optimal trajectory. It is demonstrated that the uncontrolled solution of the corrected model ($\mathbf{M} + \delta\mathbf{M}$) indeed matches the optimal trajectory of the model. Section 7 contains concluding remarks. The three appendices provide supplementary results used in derivations found in the main body of the text.

2. Minimum principle in discrete form

a. Stepwise solution of the variational problem

In this section we summarize the celebrated Pontryagin minimum principle, which is based on expressing the classical Lagrangian function in terms of the Hamiltonian function. We follow the developments in Athans and Falb (1966), Lewis (1986), and Naidu (2003). Let
$$x_{k+1} = M(x_k) + \eta_k \tag{2.1}$$
be the given discrete time nonlinear model dynamics where $M: \mathbb{R}^n \to \mathbb{R}^n$, $x_k \in \mathbb{R}^n$ is the state of the time invariant dynamics, and $\eta_k \in \mathbb{R}^n$ is the given intrinsic physical forcing that is part of the model.
Pontryagin’s method calls for adding an external forcing term to the given model dynamics in (2.1). Let the resulting forced dynamics be given by
$$x_{k+1} = M(x_k) + \mathbf{B} u_k + \eta_k \tag{2.2}$$
where $1 \le p \le n$, $\mathbf{B} \in \mathbb{R}^{n \times p}$, and $u_k \in \mathbb{R}^p$ is the new control or decision vector. As an example, when $p = 1$ and $\mathbf{B} = (1, 1, \ldots, 1)^{T} \in \mathbb{R}^n$, the same (scalar) control $u_k$ is applied to each and every component of the state vector. At the other extreme, when $p = n$ and $\mathbf{B} = \mathbf{I}_n$, the identity matrix of order $n$, then $u_k \in \mathbb{R}^n$ and the $i$th component of $u_k$ is applied to the $i$th component of the state vector. It is assumed that the initial condition $x_0$ is specified. Let
$$z_k = h(x_k) + v_k \tag{2.3}$$
where $z_k \in \mathbb{R}^m$ for some positive integer $m$ denotes the observation vector at time $k$, $h: \mathbb{R}^n \to \mathbb{R}^m$ denotes the map (also known as the forward operator) that relates the model state $x_k$ to the observation $z_k$, and $v_k$ is the observation noise vector, which is assumed to be white and Gaussian. That is, $v_k \sim N(0, \mathbf{R}_k)$, where $\mathbf{R}_k \in \mathbb{R}^{m \times m}$ is a known positive definite matrix.
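Define a performance measure
$$J = \sum_{k=0}^{N-1} V_k \tag{2.4}$$
where $N$ is the number of observations, and the cost functional $V_k$ is a sum of two terms given by
$$V_k = V_k^{o} + V_k^{c} \tag{2.5}$$
with
$$V_k^{o} = \tfrac{1}{2} \langle z_k - h(x_k),\ \mathbf{R}_k^{-1} [z_k - h(x_k)] \rangle \tag{2.6}$$
$$V_k^{c} = \tfrac{1}{2} \langle u_k,\ \mathbf{C} u_k \rangle \tag{2.7}$$
The notation $\langle a, b \rangle$ indicates the standard inner product, and $\mathbf{C} \in \mathbb{R}^{p \times p}$ is a given symmetric and positive definite matrix. Clearly $V_k^{o}$ denotes the energy in the normalized forecast error
$$e_k = \mathbf{R}_k^{-1/2} [z_k - h(x_k)] \tag{2.8}$$
and $V_k^{c}$ accounts for the energy in the control input. The traditional variational methods use only $V_k^{o}$. For a given $\mathbf{R}_k^{-1}$, one can obtain a variety of tradeoffs between these two energy terms by appropriately choosing the matrix $\mathbf{C}$.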
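The performance measure can be evaluated directly along a forced trajectory. The following is a minimal sketch, assuming the additive form of the forced model (state map plus `B @ u` plus intrinsic forcing) and a cost that sums the weighted forecast-error energy and the control energy over k = 0, ..., N − 1. The function names, the toy model `M(x) = 0.5 x`, and all numerical values are hypothetical and chosen only to contrast the two extreme choices of the control matrix described above.

```python
import numpy as np

def forced_step(M, B, x, u, eta):
    """One step of the forced model: x_{k+1} = M(x_k) + B u_k + eta_k."""
    return M(x) + B @ u + eta

def performance_measure(M, h, B, R_inv, C, x0, controls, etas, obs):
    """Accumulate J = sum_k (V_k^o + V_k^c) along the forced trajectory."""
    x = np.asarray(x0, dtype=float)
    J = 0.0
    for u, eta, z in zip(controls, etas, obs):
        d = z - h(x)                 # forecast error z_k - h(x_k)
        J += 0.5 * d @ R_inv @ d     # energy in the weighted forecast error
        J += 0.5 * u @ C @ u         # energy in the control input
        x = forced_step(M, B, x, u, eta)
    return J

# Toy setup (hypothetical values): n = 2, full-state observations, N = 2.
n, N = 2, 2
M = lambda x: 0.5 * x
h = lambda x: x
R_inv = np.eye(n)
x0 = np.array([1.0, 0.0])
etas = [np.zeros(n)] * N
obs = [np.zeros(n)] * N

# p = n with B = I_n: one control per state component.
B_full, C_full = np.eye(n), 2.0 * np.eye(n)
J_full = performance_measure(M, h, B_full, R_inv, C_full, x0,
                             [np.array([1.0, 0.0]), np.zeros(n)], etas, obs)

# p = 1 with B = (1, ..., 1)^T: the same scalar control on every component.
B_one, C_one = np.ones((n, 1)), 2.0 * np.eye(1)
J_one = performance_measure(M, h, B_one, R_inv, C_one, x0,
                            [np.array([1.0]), np.zeros(1)], etas, obs)
print(J_full, J_one)
```

Increasing the control weight raises the price of forcing relative to the misfit, which is the tradeoff the text attributes to the choice of the control weighting matrix.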
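Define the Lagrangian $L$, obtained by augmenting the dynamical constraint in (2.2) with $J$ in (2.4), as follows:
$$L = \sum_{k=0}^{N-1} \left\{ V_k + \langle \lambda_{k+1},\ M(x_k) + \mathbf{B} u_k + \eta_k - x_{k+1} \rangle \right\} \tag{2.9}$$
where $\lambda_k \in \mathbb{R}^n$ for $1 \le k \le N$ denotes the set of $N$ undetermined Lagrangian multipliers or the costate variables. Now define the associated Hamiltonian function
$$H_k = H(x_k, u_k, \eta_k, \lambda_{k+1}) = V_k + \langle \lambda_{k+1},\ M(x_k) + \mathbf{B} u_k + \eta_k \rangle \tag{2.10}$$
Substituting (2.10) in (2.9), the latter becomes
$$L = \sum_{k=0}^{N-1} \left[ H_k - \langle \lambda_{k+1}, x_{k+1} \rangle \right] \tag{2.11}$$
By splitting the summation on the right-hand side of (2.11), we obtain
$$L = H_0 - \langle \lambda_N, x_N \rangle + \sum_{k=1}^{N-1} \left[ H_k - \langle \lambda_k, x_k \rangle \right] \tag{2.12}$$
Since $\eta_k$ is specified, no variation of $\eta_k$ is considered. Let $\delta L$ be the induced increment in $L$ resulting from the increments $\delta x_k$ in $x_k$ for $0 \le k \le N$, $\delta u_k$ in $u_k$ for $0 \le k \le N - 1$, and $\delta\lambda_k$ in $\lambda_k$ for $1 \le k \le N$. Since $H_k$ is a scalar valued function of the vectors $x_k$, $u_k$, $\eta_k$, and $\lambda_{k+1}$, from first principles (Lewis et al. 2006) we obtain
$$\delta L = \langle \nabla_x H_0, \delta x_0 \rangle + \langle \nabla_u H_0, \delta u_0 \rangle - \langle \lambda_N, \delta x_N \rangle + \sum_{k=0}^{N-1} \langle \nabla_\lambda H_k - x_{k+1},\ \delta\lambda_{k+1} \rangle + \sum_{k=1}^{N-1} \langle \nabla_x H_k - \lambda_k,\ \delta x_k \rangle + \sum_{k=1}^{N-1} \langle \nabla_u H_k,\ \delta u_k \rangle \tag{2.13}$$
where $\nabla_x H_k \in \mathbb{R}^n$, $\nabla_u H_k \in \mathbb{R}^p$, and $\nabla_\lambda H_k \in \mathbb{R}^n$ are the gradients of $H_k$ with respect to $x_k$, $u_k$, and $\lambda_{k+1}$, respectively.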
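The splitting that leads from (2.11) to (2.12) is an index shift in the costate term. A short worked step, assuming (2.11) has the form of a sum over k = 0, ..., N − 1 of the Hamiltonian minus the pairing of the costate with the next state:

```latex
\begin{align*}
\sum_{k=0}^{N-1} \langle \lambda_{k+1}, x_{k+1} \rangle
  &= \sum_{j=1}^{N} \langle \lambda_{j}, x_{j} \rangle
   = \langle \lambda_{N}, x_{N} \rangle + \sum_{k=1}^{N-1} \langle \lambda_{k}, x_{k} \rangle, \\
\sum_{k=0}^{N-1} H_k
  &= H_0 + \sum_{k=1}^{N-1} H_k .
\end{align*}
```

Subtracting the first line from the second pairs each interior state with the costate carrying the same index, which is what allows the variations with respect to the interior states to be collected into a single summation.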

Recall that $\delta L$ must be zero at the minimum, and in view of the arbitrariness of $\delta x_k$, $\delta u_k$, and $\delta\lambda_k$, we readily obtain a set of necessary conditions expressed as follows, all for $0 \le k \le N - 1$.

1) Condition 1: Model dynamics

The first summation, which is the fourth term on the right-hand side of (2.13), is zero when
$$x_{k+1} = \nabla_\lambda H_k \tag{2.14}$$
Now computing the gradient of $H_k$ in (2.10) with respect to $\lambda_{k+1}$ and substituting it in (2.14), the latter becomes
$$x_{k+1} = M(x_k) + \mathbf{B} u_k + \eta_k \tag{2.15}$$
which is in fact the model equation given in (2.2). In other words, Pontryagin’s method dictates that the sequence of states $x_k$ arises as a solution of the model used as a strong constraint.

2) Condition 2: Costate or adjoint dynamics

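The second summation, which is the fifth term on the right-hand side of (2.13), is zero when
$$\lambda_k = \nabla_x H_k \tag{2.16}$$
Computing the gradient of $H_k$ in (2.10) with respect to the model state $x_k$ and using it in (2.16), the latter becomes
$$\lambda_k = D_x^{T}(M)\, \lambda_{k+1} + \nabla_x V_k \tag{2.17}$$
where $\nabla_x V_k$ is the gradient of $V_k$ in (2.5) given by
$$\nabla_x V_k = -D_x^{T}(h)\, \mathbf{R}_k^{-1} [z_k - h(x_k)] \tag{2.18}$$
which is the normalized forecast error viewed from the model space,
$$D_x(h) = \left[ \frac{\partial h_i}{\partial x_j} \right] \in \mathbb{R}^{m \times n} \tag{2.19}$$
is the Jacobian of the forward operator $h(x)$ with respect to the model state, and
$$D_x(M) = \left[ \frac{\partial M_i}{\partial x_j} \right] \in \mathbb{R}^{n \times n} \tag{2.20}$$
is the Jacobian of the model map in (2.2).
Equation (2.17) is called the adjoint dynamics or the costate dynamics. By substituting (2.18) into (2.17), it takes a familiar form
$$\lambda_k = D_x^{T}(M)\, \lambda_{k+1} - D_x^{T}(h)\, \mathbf{R}_k^{-1} [z_k - h(x_k)] \tag{2.21}$$
which is well known in the literature on 4DVAR methods (Lewis et al. 2006, 408–411).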

3) Condition 3: Stationarity condition

Similarly, combining the third summation, which is the sixth term, with the second term on the right-hand side of (2.13), it follows that both of these terms vanish when
$$\nabla_u H_k = 0 \tag{2.22}$$
Again computing the gradient of $H_k$ in (2.10) with respect to the control $u_k$ and using it in (2.22), the latter becomes
$$\nabla_u V_k + D_u^{T}(M)\, \lambda_{k+1} = 0 \tag{2.23}$$
From (2.5) to (2.7) we get the gradient of $V_k$ with respect to $u_k$,
$$\nabla_u V_k = \mathbf{C} u_k \tag{2.24}$$
and from (2.2) we get
$$D_u(M) = \frac{\partial}{\partial u_k} \left[ M(x_k) + \mathbf{B} u_k + \eta_k \right] = \mathbf{B} \tag{2.25}$$
which is the Jacobian of the model in (2.2) with respect to the control $u_k$.
Now substituting (2.24) and (2.25) into (2.23), the structure of the optimal control is given by
$$u_k = -\mathbf{C}^{-1} \mathbf{B}^{T} \lambda_{k+1} \tag{2.26}$$
which is well defined since the matrix $\mathbf{C}$ in (2.7) is assumed to be positive definite. Notice that the second term on the right-hand side of (2.13) is already accounted for in (2.22). Thus, we are left with only the first and the third terms on the right-hand side of (2.13), which in turn provide the required boundary conditions.
Recall that $x_0$ is given but $x_N$ is free. Hence $\delta x_0 = 0$ and $\delta x_N$ is arbitrary. Thus, the first term on the right-hand side of (2.13) is automatically zero. The third term is zero by forcing
$$\lambda_N = 0 \tag{2.27}$$
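Returning briefly to the stationarity condition (2.22): it selects a minimum of the Hamiltonian over the control, not merely a stationary point, which is what gives the minimum principle its name. A short sketch of the reasoning, assuming the control enters the forced model additively and the control cost is the quadratic form in (2.7):

```latex
% Assumed form: H_k = V_k^o + (1/2)<u_k, C u_k> + <lambda_{k+1}, M(x_k) + B u_k + eta_k>
\begin{align*}
H_k(u_k) &= \tfrac{1}{2} \langle u_k, \mathbf{C} u_k \rangle
            + \langle \mathbf{B}^{T} \lambda_{k+1}, u_k \rangle
            + (\text{terms independent of } u_k), \\
\nabla_u H_k &= \mathbf{C} u_k + \mathbf{B}^{T} \lambda_{k+1},
\qquad
\nabla_u^{2} H_k = \mathbf{C} \succ 0 .
\end{align*}
```

Because the Hessian with respect to the control is the positive definite weighting matrix, the Hamiltonian is strictly convex in the control, so the root of the gradient is its unique minimizer for any given state and costate.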
The above analysis naturally leads to a framework for optimal control, which is stated below.
(i) Step 1: Compute the optimal control

The structure of the optimal control sequence $u_k$ is computed by solving the stationarity condition (2.22) and is given by (2.26).

(ii) Step 2: Solve the nonlinear TPBVP
Using the form of the optimal control in (2.26) in the model dynamics (2.15) or (2.2) and in the costate or adjoint dynamics in (2.21), we arrive at the TPBVP given by
$$x_{k+1} = M(x_k) - \mathbf{B} \mathbf{C}^{-1} \mathbf{B}^{T} \lambda_{k+1} + \eta_k \tag{2.28}$$
$$\lambda_k = D_x^{T}(M)\, \lambda_{k+1} - D_x^{T}(h)\, \mathbf{R}_k^{-1} [z_k - h(x_k)] \tag{2.29}$$
where the initial condition $x_0$ for (2.28) is given and the final condition $\lambda_N = 0$ is given for (2.29). Clearly, the solution of (2.28) and (2.29) gives the optimal trajectory. A number of observations are in order.

The importance of the Hamiltonian formulation of the Euler–Lagrange necessary condition for the minimum stems from the simplicity and conciseness of the two first-order equations (2.14) and (2.16) involving the state and the costate/adjoint variables. This Hamiltonian formulation has been the basis of countless developments in physics (Goldstein 1980).

b. Computation of optimal control

Equation (2.28), a representation of the model dynamics, is solved forward in time starting from the known initial condition $x_0$, whereas (2.29), representing the costate/adjoint dynamics, is solved backward in time starting from $\lambda_N = 0$. The two systems in (2.28) and (2.29) form a nonlinear coupled two-point boundary value problem, which in general does not admit a closed-form solution. A number of numerical methods for solving (2.28) and (2.29) have been developed in the literature, a sampling of which is found in Roberts and Shipman (1972), Keller (1976), Polak (1997), and Bryson (1999). A closed-form solution to the optimal control problem exists for the special case when the model dynamics is linear and the cost function $V_k$ is a quadratic form in the state $x_k$ and the control $u_k$. This special case is covered in section 3 of this paper.
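One simple way to attack the coupled problem numerically is steepest descent in control space. This is offered as an illustration and is not claimed to be one of the methods in the references cited above. The sketch assumes the additive forced model, a cost summed over k = 0, ..., N − 1, and the costate recursion run backward from a zero terminal costate, under which the gradient of the cost with respect to each control is `C u_k + B^T lambda_{k+1}`. All function names, the step size, and the iteration count are hypothetical.

```python
import numpy as np

def forward(M, B, x0, U, etas):
    """Integrate the forced model forward; returns states x_0, ..., x_N."""
    X = [np.asarray(x0, dtype=float)]
    for u, eta in zip(U, etas):
        X.append(M(X[-1]) + B @ u + eta)
    return X

def cost(h, R_inv, C, X, U, Z):
    """J = sum over k of forecast-error energy plus control energy."""
    J = 0.0
    for k in range(len(U)):
        d = Z[k] - h(X[k])
        J += 0.5 * d @ R_inv @ d + 0.5 * U[k] @ C @ U[k]
    return J

def control_gradient(DM, h, Dh, B, R_inv, C, X, U, Z):
    """Backward costate sweep from lambda_N = 0; returns dJ/du_k for every k."""
    N = len(U)
    lam = np.zeros_like(X[0])                # lambda_N = 0
    grads = [None] * N
    for k in range(N - 1, -1, -1):
        grads[k] = C @ U[k] + B.T @ lam      # lam holds lambda_{k+1} here
        # Costate update to lambda_k (the value at k = 0 is computed but unused).
        lam = DM(X[k]).T @ lam - Dh(X[k]).T @ R_inv @ (Z[k] - h(X[k]))
    return grads

def descend(M, DM, h, Dh, B, R_inv, C, x0, etas, Z, step=0.05, iters=200):
    """Fixed-step steepest descent on the control sequence, starting from zero."""
    U = [np.zeros(B.shape[1]) for _ in Z]
    for _ in range(iters):
        X = forward(M, B, x0, U, etas)
        G = control_gradient(DM, h, Dh, B, R_inv, C, X, U, Z)
        U = [u - step * g for u, g in zip(U, G)]
    return U
```

At a minimizer the gradient vanishes, which recovers the stationarity condition, so the converged controls together with the final forward and backward sweeps approximate a solution of the two-point boundary value problem. A fixed step is the crudest choice; a line search would be more robust.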

c. Connection to 4DVAR
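Consider the special case of an unforced model given by
$$x_{k+1} = M(x_k) \tag{2.30}$$
where the initial condition $x_0$ is arbitrary and
$$J(x_0) = \sum_{k=0}^{N-1} V_k^{o} \tag{2.31}$$
where $V_k^{o}$ is given by (2.6). Define the Lagrangian
$$L = \sum_{k=0}^{N-1} \left[ H_k - \langle \lambda_{k+1}, x_{k+1} \rangle \right] \tag{2.32}$$
where
$$H_k = V_k^{o} + \langle \lambda_{k+1}, M(x_k) \rangle \tag{2.33}$$
By repeating the above argument we obtain the analog of (2.13) as
$$\delta L = \langle \nabla_x H_0, \delta x_0 \rangle - \langle \lambda_N, \delta x_N \rangle + \sum_{k=0}^{N-1} \langle \nabla_\lambda H_k - x_{k+1},\ \delta\lambda_{k+1} \rangle + \sum_{k=1}^{N-1} \langle \nabla_x H_k - \lambda_k,\ \delta x_k \rangle \tag{2.34}$$
The necessary conditions 1–3 for this special case take the following form.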

1) Condition 1A: Model dynamics

Vanishing of the third term on the right-hand side of (2.34) when $\delta\lambda_{k+1}$ is arbitrary leads to the condition
$$x_{k+1} = \nabla_\lambda H_k$$
which in the light of (2.33) becomes the model equation
$$x_{k+1} = M(x_k) \tag{2.35}$$

2) Condition 2A: Costate/adjoint dynamics

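Since $\delta x_k$ is arbitrary, vanishing of the fourth term on the right-hand side of (2.34) gives
$$\lambda_k = \nabla_x H_k$$
which in the light of (2.33) becomes the costate dynamics given by
$$\lambda_k = D_x^{T}(M)\, \lambda_{k+1} + \nabla_x V_k^{o} \tag{2.36}$$
where $\nabla_x V_k^{o}$ is given in (2.18).
Since $x_N$ is free, $\delta x_N$ is arbitrary. Hence, vanishing of the second term in (2.34) requires
$$\lambda_N = 0 \tag{2.37}$$
Combining these, we readily see that (2.34) reduces to
$$\delta L = \langle \nabla_x H_0, \delta x_0 \rangle \tag{2.38}$$
So, from first principles and using (2.33), it follows that
$$\nabla_{x_0} L = \nabla_x H_0 = D_x^{T}(M)\, \lambda_1 + \nabla_x V_0^{o} \tag{2.39}$$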
The above development naturally leads to the standard 4DVAR algorithm (Lewis et al. 2006, 411–412), which is summarized below:
  1. Starting from an arbitrary $x_0$, compute the model solution $x_k$ by iterating (2.35).
  2. Using the observations $z_k$, compute
    $$\nabla_x V_k^{o} = -D_x^{T}(h)\, \mathbf{R}_k^{-1} [z_k - h(x_k)] \tag{2.40}$$
  3. Since $\lambda_N = 0$, using (2.40) in (2.36) iterate the adjoint dynamics backward to get the value of $\lambda_1$.
  4. Substitute $\lambda_1$ in (2.39) to get the gradient $\nabla_{x_0} L$.
It is easy to verify (Lewis et al. 2006, 386–389) that
$$\nabla_{x_0} L = \nabla_{x_0} J \tag{2.41}$$
where $J$ is given by (2.31).
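The four steps can be sketched in a few lines. This is a minimal illustration, assuming the cost sums the weighted forecast-error energy over k = 0, ..., N − 1, that the costate obeys `lambda_k = DM(x_k)^T lambda_{k+1} + grad_x V_k` with a zero terminal costate, and that the gradient with respect to the initial state is the same recursion evaluated at k = 0. The function name and argument layout are hypothetical.

```python
import numpy as np

def fourdvar_cost_and_gradient(M, DM, h, Dh, R_inv, x0, Z):
    """Return J(x0) and its gradient with respect to x0 via one adjoint sweep.

    M, h   : model map and forward operator (callables on state vectors)
    DM, Dh : their Jacobians (callables returning matrices)
    Z      : observations z_0, ..., z_{N-1}
    """
    N = len(Z)
    # Step 1: forward model run from x0 (states x_0, ..., x_{N-1} are needed).
    X = [np.asarray(x0, dtype=float)]
    for _ in range(N - 1):
        X.append(M(X[-1]))
    J = 0.0
    lam = np.zeros_like(X[0])               # terminal condition lambda_N = 0
    for k in range(N - 1, -1, -1):
        d = Z[k] - h(X[k])
        J += 0.5 * d @ R_inv @ d
        # Step 2: forecast error mapped back to model space.
        grad_V = -Dh(X[k]).T @ R_inv @ d
        # Step 3: adjoint recursion; lam moves from lambda_{k+1} to lambda_k.
        lam = DM(X[k]).T @ lam + grad_V
    # Step 4: after the k = 0 update, lam is DM(x_0)^T lambda_1 + grad_x V_0.
    return J, lam
```

A finite-difference comparison of the returned gradient against perturbed cost values is a cheap check that the adjoint is coded consistently with the forward model; the returned pair can then be handed to any gradient-based minimizer to update the initial state.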

3. Optimal tracking: Linear dynamics

In this section we apply the minimum principle developed in section 2 to solve the problem of finding explicit form for optimal control or forcing that will drive the dynamics to track or follow the given set of observations when the model is linear and the performance measure is a quadratic function of the state and the control (Kalman 1963; Catlin 1989).

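Let the deterministic dynamical model be given by
$$x_{k+1} = \mathbf{M} x_k + \mathbf{B} u_k + \eta_k \tag{3.1}$$
where $\mathbf{M} \in \mathbb{R}^{n \times n}$, $\eta_k \in \mathbb{R}^n$ is the intrinsic forcing term that is part of the model, and $\mathbf{B} \in \mathbb{R}^{n \times p}$; this is a special case of (2.2). Let the observations be given by
$$z_k = \mathbf{H} x_k + v_k \tag{3.2}$$
where $\mathbf{H} \in \mathbb{R}^{m \times n}$, $v_k \sim N(0, \mathbf{R})$, and $\mathbf{R} \in \mathbb{R}^{m \times m}$ is the known positive definite matrix denoting the covariance of $v_k$.
We consider the same cost functional given in (2.4)–(2.7). Substituting
$$M(x_k) = \mathbf{M} x_k \tag{3.3a}$$
and
$$h(x_k) = \mathbf{H} x_k \tag{3.3b}$$
in the expression for the Lagrangian in (2.9) and in the subsequent developments in section 2, it can be verified that the necessary conditions for this linear case reduce to the following: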
  1. Structure of optimal control. From the stationarity condition developed in (2.22)(2.26), it readily follows that the structure of the optimal control in this linear case is given by
    e3.4
    which is the same as in the nonlinear case treated in section 2.
  2. The linear two-point boundary value problem. Substituting the special form of the dynamics and the observation given by (3.1)(3.3) and the expression for uk given by (3.4) in (2.28) and (2.29), the latter pair of equations become
    e3.5
    e3.6
    where we have used the fact that $h(x) = Hx$ and $D_x(h) = H$. The initial condition for (3.5) is the given $x_0$ and the final condition for (3.6) is $\lambda_N = 0$. Again, recall that (3.6) is the well-known adjoint equation that routinely arises in the 4DVAR analysis (Lewis et al. 2006, 408–412). For later reference we rewrite (3.5) and (3.6) as
    e3.7
    where $E = B C^{-1} B^T$, $F = H^T R^{-1} H$, and $W = H^T R^{-1}$.

It turns out this special linear TPBVP can be transformed into a pair of initial value problems using the sweep method, which in turn can be easily solved. By exploiting the structure of (3.5) and (3.6), it can be verified (see appendix A for details) that λk is an affine function of the state xk.

Consequently, we posit that λk is related to xk via a general affine transformation of the type
e3.8
Substituting (3.8) in the state equation in (3.7) and simplifying, we get
e3.9
Again substituting (3.8) and (3.9) in the costate equation in (3.7), after simplifying we get
e3.10
Since (3.10) must hold for all $x_k$, we immediately obtain equations that define the evolution of the matrix $P_k$ and the vector $g_k$ as follows:
e3.11
which is a nonlinear matrix Riccati equation and
e3.12
Since λN = 0 and xN is arbitrary, from (3.8) it is immediately clear that
e3.13
Against this backdrop, we now state the algorithm for computing the optimal control and the optimal trajectory.
  • Step 1. Given (3.1)–(3.3), compute $E = B C^{-1} B^T$, $F = H^T R^{-1} H$, and $W = H^T R^{-1}$. Solve the nonlinear matrix Riccati difference equation in (3.11) for $P_k$ backward in time, starting from $P_N = 0$. Since this computation is independent of the observations, it can be precomputed and stored if needed.
  • Step 2. Solve the linear vector difference equation in (3.12) for $g_k$ backward in time, starting from $g_N = 0$. Notice that $g_k$ depends on the observations and on the intrinsic forcing $\eta_k$ that is part of the given model. It will be seen that the observations affect the optimal control through $g_k$.
  • Step 3. Once $P_k$ and $g_k$ are known, we can compose the optimal control using (3.4) and (3.8):
    e3.14
    Using (3.1) in (3.14), the latter becomes
    e3.15
    Premultiplying both sides by C and simplifying, we get an explicit expression for the optimal control as
    e3.16
    where the feedback gain matrix is given by
    e3.17
    and the feedforward gain matrix is given by
    e3.18
    From (3.1) and (3.16), the optimal trajectory is then given by
    e3.19
    or as
    e3.20
The second term on the right-hand side of (3.20) is the optimal forcing term $B u_k$, and it plays a dual role. First, it forces the model trajectory toward the observations, where the measure of closeness depends on the choice of $p$ (the dimension of the control vector $u_k$), the matrix $B \in \mathbb{R}^{n \times p}$, and the matrix $C \in \mathbb{R}^{p \times p}$; the observational error covariance matrix $R$ is assumed to be given. Second, $B u_k$ contains information about the model error. Thus, for a given $R$ and a prespecified measure of closeness between the observations and the model trajectory, one can, in principle, obtain a family of optimal controls $u_k$ that achieve this goal by suitably varying the integer $p$ ($1 \le p \le n$) and the matrices $B \in \mathbb{R}^{n \times p}$ and $C \in \mathbb{R}^{p \times p}$, with $C$ symmetric and positive definite.
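
Steps 1–3 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' code: the model is taken as $x_{k+1} = Mx_k + \eta_k + Bu_k$, the cost as $\sum_{k=0}^{N-1}[\tfrac{1}{2}(z_k - Hx_k)^T R^{-1}(z_k - Hx_k) + \tfrac{1}{2}u_k^T C u_k]$, and the costate as $\lambda_k = P_k x_k - g_k$ with $\lambda_N = 0$. The recursions in the code were rederived from these assumptions, so their signs and grouping may differ from (3.11)–(3.18); the names `sweep_tracking` and `tracking_cost` are hypothetical.

```python
import numpy as np

def sweep_tracking(M, B, H, R, C, x0, z, eta=None):
    """Backward sweep for P_k, g_k followed by a forward pass for u_k, x_k."""
    N, n = len(z), M.shape[0]
    eta = np.zeros((N, n)) if eta is None else eta
    Rinv = np.linalg.inv(R)
    E = B @ np.linalg.solve(C, B.T)          # E = B C^{-1} B^T
    F = H.T @ Rinv @ H                       # F = H^T R^{-1} H
    W = H.T @ Rinv                           # W = H^T R^{-1}
    I = np.eye(n)
    P = np.zeros((N + 1, n, n))              # P[N] = 0
    g = np.zeros((N + 1, n))                 # g[N] = 0
    for k in range(N - 1, -1, -1):           # steps 1 and 2: backward in time
        S = P[k + 1] @ np.linalg.inv(I + E @ P[k + 1])
        P[k] = M.T @ S @ M + F               # matrix Riccati recursion
        g[k] = M.T @ g[k + 1] - M.T @ S @ (eta[k] + E @ g[k + 1]) + W @ z[k]
    x = np.zeros((N + 1, n))
    x[0] = x0
    u = np.zeros((N, B.shape[1]))
    for k in range(N):                       # step 3: forward in time
        lhs = C + B.T @ P[k + 1] @ B
        rhs = B.T @ (g[k + 1] - P[k + 1] @ (M @ x[k] + eta[k]))
        u[k] = np.linalg.solve(lhs, rhs)     # feedback plus feedforward control
        x[k + 1] = M @ x[k] + eta[k] + B @ u[k]
    return x, u

def tracking_cost(M, B, H, R, C, x0, z, u, eta=None):
    """Cost of an arbitrary control sequence u, for checking optimality."""
    N, n = len(z), M.shape[0]
    eta = np.zeros((N, n)) if eta is None else eta
    Rinv, x, J = np.linalg.inv(R), np.asarray(x0, dtype=float), 0.0
    for k in range(N):
        r = z[k] - H @ x
        J += 0.5 * r @ Rinv @ r + 0.5 * u[k] @ C @ u[k]
        x = M @ x + eta[k] + B @ u[k]
    return J
```

Because the cost is strictly convex in the controls when $C$ is positive definite, any perturbation of the returned sequence `u` should not lower `tracking_cost`; that is a convenient sanity check on the sweep.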

4. Dynamical constraint: Linear Burgers’s equation

To illustrate Pontryagin’s method, we choose a dynamic constraint that follows the theme of Platzman’s classical study of Burgers’s equation (Platzman 1964). In that study, Platzman investigated the evolution of an initial single primary sine wave over the interval [0, 2π]. The governing dynamics described the transfer of energy from this primary wave to waves of higher wavenumber as the wave neared the breaking point. In a tour de force with spectral dynamics, Platzman obtained a closed form solution for the Fourier amplitudes and then analyzed the consequences of truncated spectral expansions. The contribution was instrumental in helping dynamic meteorologists understand the penalties associated with truncated spectral weather forecasting in the early days of numerical weather prediction.

We maintain the spirit of Platzman’s investigation but in a somewhat simplified form. Whereas the nonlinear dynamic law advects the wave with the full spectrum of Fourier components, we choose to advect with only the initial primary wave—sin(x). This problem retains the transfer of energy from the primary wave to the higher wavenumber components as the wave steepens, but the more complex phenomenon of folding over or breaking of the wave is absent in this linear problem.

a. Model and its analytic solution

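The governing dynamics for the linear Burgers’s equation is
\[ \frac{\partial q}{\partial t} + \sin(x)\, \frac{\partial q}{\partial x} = 0, \tag{4.1} \]
with initial condition q(x, 0) = sin(x) and boundary conditions q(0, t) = q(2π, t) = 0. Following Platzman (1964), we seek a solution to (4.1) by the method of characteristics (Carrier and Pearson 1976). The characteristics of (4.1) are given by
\[ \tan\!\left(\frac{x}{2}\right) = e^{t} \tan\!\left(\frac{x_0}{2}\right), \tag{4.2} \]
where $x_0$ is the intersection of a particular characteristic curve with the line of initial time (t = 0). Using the mathematical expression for the characteristics in concert with the initial condition, the analytic solution is found to be
\[ q(x, t) = \sin(x_0) = \sin\!\left\{ 2 \arctan\!\left[ e^{-t} \tan\!\left(\frac{x}{2}\right) \right] \right\}. \tag{4.3} \]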
From this analytic solution, the profiles of the wave at times t = 0, 0.5, 1.0, 1.5, and 2.0 are shown in Fig. 1. The slope of the wave is finite at x = π but approaches ∞ as t → ∞.
Fig. 1. A plot of the solution q(x, t) in (4.3) at times t = 0, 0.5, 1, 1.5, and 2.

Citation: Journal of the Atmospheric Sciences 70, 4; 10.1175/JAS-D-12-0217.1
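
The analytic solution can be checked numerically with a short NumPy sketch. It assumes that (4.1) has the form $q_t + \sin(x)\, q_x = 0$ with $q(x, 0) = \sin x$, for which the characteristics give $q = \sin(x_0)$ with $\tan(x_0/2) = e^{-t}\tan(x/2)$. The code uses the algebraically equivalent form $q = e^{-t}\sin x / [\cos^2(x/2) + e^{-2t}\sin^2(x/2)]$, which avoids the singularity of $\tan(x/2)$ at $x = \pi$. The names `q_exact` and `pde_residual` are hypothetical.

```python
import numpy as np

def q_exact(x, t):
    """Solution of q_t + sin(x) q_x = 0 with q(x, 0) = sin(x)."""
    e = np.exp(-t)
    return e * np.sin(x) / (np.cos(x / 2) ** 2 + (e * np.sin(x / 2)) ** 2)

def pde_residual(x, t, h=1e-5):
    """Central-difference estimate of q_t + sin(x) q_x; should be close to zero."""
    q_t = (q_exact(x, t + h) - q_exact(x, t - h)) / (2 * h)
    q_x = (q_exact(x + h, t) - q_exact(x - h, t)) / (2 * h)
    return q_t + np.sin(x) * q_x

if __name__ == "__main__":
    x = np.linspace(0.0, 2 * np.pi, 201)
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(t, np.max(np.abs(pde_residual(x, t))))
```

Near $x = \pi$ this form reduces to $q \approx e^{t}(\pi - x)$, so the slope there is $-e^{t}$: finite for every finite $t$ and unbounded as $t \to \infty$, in line with the remark above.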

Let $q_k(t)$ be the (exact) value of the kth Fourier coefficient of the solution q(x, t) in (4.3), given by
\[ q_k(t) = \frac{1}{\pi} \int_0^{2\pi} q(x, t) \sin(kx)\, dx. \tag{4.4} \]
Define the vector
\[ q(t) = [q_1(t), q_2(t), \ldots, q_n(t)]^{T} \tag{4.5} \]
of the first n Fourier coefficients of q(x, t). The values of the coefficients $q_k(t)$ (computed using a standard quadrature formula) for $1 \le k \le n = 8$ and $0 \le t \le 2.0$ in steps of Δt = 0.2 are given in the rows of Table 1.
Table 1. Values of the first eight Fourier coefficients of q(x, t) in (4.3) for various times, computed using a quadrature formula.
An n-mode approximation [resulting from spectral truncation to q(x, t)] is then given by
e4.6
A comparison of the exact solution q(x, t) with the four-mode approximation and the eight-mode approximation obtained from (4.6) with n = 4 and 8, respectively, at t = 2.0 is given in Fig. 2. As expected, the eight-mode approximation is closer to the true solution than the four-mode approximation. Further, the errors are greatest at the point of extreme steepness of the wave.
Fig. 2. Comparison of the error in the four-mode approximation and the error in the eight-mode approximation at t = 2.0. Fourier coefficients at t = 2 are given in Table 1.
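
The Fourier coefficients and the n-mode approximation can be computed along the following lines. The sketch assumes the normalization $q_k(t) = \frac{1}{\pi}\int_0^{2\pi} q(x,t)\sin(kx)\,dx$ (so that $q_1(0) = 1$), the closed form of the solution implied by $q_t + \sin(x) q_x = 0$, and a periodic trapezoidal rule as the quadrature; the paper does not say which quadrature formula was used. All function names are hypothetical, and no attempt is made to reproduce the entries of Table 1.

```python
import numpy as np

def q_exact(x, t):
    """Assumed closed form of the solution of q_t + sin(x) q_x = 0, q(x,0) = sin(x)."""
    e = np.exp(-t)
    return e * np.sin(x) / (np.cos(x / 2) ** 2 + (e * np.sin(x / 2)) ** 2)

def fourier_coeffs(t, n, npts=2048):
    """First n sine coefficients of q(., t) by the periodic trapezoidal rule."""
    x = 2 * np.pi * np.arange(npts) / npts
    k = np.arange(1, n + 1)[:, None]
    # (1/pi) * integral over [0, 2*pi] becomes (2/npts) * sum over grid points
    return (2.0 / npts) * (np.sin(k * x) @ q_exact(x, t))

def n_mode(x, coeffs):
    """Truncated sine series sum_k coeffs[k-1] * sin(k x)."""
    k = np.arange(1, len(coeffs) + 1)[:, None]
    return coeffs @ np.sin(k * x)

if __name__ == "__main__":
    x = 2 * np.pi * np.arange(2048) / 2048
    for n in (4, 8):
        err = n_mode(x, fourier_coeffs(2.0, n)) - q_exact(x, 2.0)
        print(n, np.sqrt(np.mean(err ** 2)), np.max(np.abs(err)))
```

On a uniform periodic grid the sampled sine functions are mutually orthogonal, so the truncated sums are nested least-squares fits and the grid RMS error cannot increase when going from four to eight modes.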


b. The low-order model

In demonstrating the power of Pontryagin’s method developed in sections 2 and 3, our immediate goal is to obtain a discrete time model representative of (3.1). There are at least two ways, in principle, to achieve this goal. The first is to directly discretize (4.1) by embedding a grid in the two-dimensional domain with 0 ≤ x ≤ 2π and t ≥ 0. The second is to project the infinite dimensional system in (4.1) onto a finite dimensional space using the standard Galerkin projection method and obtain a system of n ordinary differential equations (ODEs) describing the evolution of the Fourier amplitudes $q_i(t)$ in (4.4), $1 \le i \le n$. The resulting nth-order system is known as the low-order model (LOM). The LOM can then be discretized using one of several known methods. In this paper we adopt the latter approach using the LOM.

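The Fourier series of q(x, t) consists of an infinite series of sine waves given by
\[ q(x, t) = \sum_{i=1}^{\infty} q_i(t) \sin(ix). \tag{4.7} \]
An LOM(n) describing the evolution of the amplitudes of the spectral components is obtained by exploiting the orthogonality properties of the sin(ix) functions for $1 \le i \le n$. Substituting the series (4.7) into (4.1), multiplying both sides by sin(ix), and integrating both sides from 0 to 2π, we obtain the LOM(n) (also known as the spectral model):
\[ \frac{dq}{dt} = A q, \tag{4.8} \]
with
\[ q(0) = (1, 0, \ldots, 0)^{T} \tag{4.9} \]
as its initial condition and the matrix $A \in \mathbb{R}^{n \times n}$ given by
\[ A = \frac{1}{2} \begin{bmatrix} 0 & c_1 & & & \\ a_2 & 0 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{n-1} & 0 & c_{n-1} \\ & & & a_n & 0 \end{bmatrix}, \tag{4.10} \]
where $a_i = -(i - 1)$ and $c_i = (i + 1)$. An example for n = 4 is given by
\[ A = \frac{1}{2} \begin{bmatrix} 0 & 2 & 0 & 0 \\ -1 & 0 & 3 & 0 \\ 0 & -2 & 0 & 4 \\ 0 & 0 & -3 & 0 \end{bmatrix}. \tag{4.11} \]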
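
A small helper for building the LOM(n) matrix is sketched below. It assumes that (4.1) reads $q_t + \sin(x) q_x = 0$; projecting onto $\sin(ix)$ with the identity $\sin x \cos(jx) = \tfrac{1}{2}[\sin((j+1)x) - \sin((j-1)x)]$ then gives $dq_i/dt = \tfrac{1}{2}[a_i q_{i-1} + c_i q_{i+1}]$ with $a_i = -(i-1)$ and $c_i = (i+1)$. The factor $\tfrac{1}{2}$ comes from this derivation, and the name `lom_matrix` is hypothetical.

```python
import numpy as np

def lom_matrix(n):
    """Tridiagonal LOM(n) matrix with zero diagonal, sub-diagonal a_i/2, super-diagonal c_i/2."""
    A = np.zeros((n, n))
    for i in range(1, n + 1):                   # i is the 1-based wavenumber (row index)
        if i > 1:
            A[i - 1, i - 2] = -0.5 * (i - 1)    # a_i / 2 multiplies q_{i-1}
        if i < n:
            A[i - 1, i] = 0.5 * (i + 1)         # c_i / 2 multiplies q_{i+1}
    return A

if __name__ == "__main__":
    print(lom_matrix(4))
```

As a quick consistency check, the initial state $q(0) = (1, 0, 0, 0)^T$ gives $dq_2/dt = -\tfrac{1}{2}$ at $t = 0$, which agrees with $q_t = -\sin x \cos x = -\tfrac{1}{2}\sin 2x$ obtained directly from the assumed PDE.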

We now state a number of interesting properties of the solution of the LOM(n) in (4.8).

1) Conservation of energy

Consider a quadratic form E(q) representing generalized energy and given by
e4.12
where
e4.13
is a diagonal matrix with the indicated entries as its diagonal elements. It can be verified that the time derivative of E(q) evaluated along the solution of (4.8) is given by
e4.14
From the form of the diagonal matrix in (4.13) and the matrix in (4.10), it is an easy exercise to verify that their product is a skew-symmetric matrix given by
e4.15
where $s_i = i(i + 1)$ for $1 \le i \le n - 1$. Since the quadratic form associated with a skew-symmetric matrix vanishes identically, the right-hand side of (4.14) is zero, which in turn implies that the energy E(q) is conserved by (4.8); that is,
e4.16
An immediate consequence of (4.16) is that the solution q(t) of (4.8) always lies on the surface of an n-dimensional ellipsoid defined by
e4.17
Clearly, the length of the kth semiaxis of this ellipsoid is proportional to $1/\sqrt{k}$. Hence the volume of this ellipsoid is given by
e4.18
Since $n! = O(2^{n \log n})$, the volume of this ellipsoid goes to zero at an exponential rate as n increases, signaling degeneracy for large n.
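
The conservation property can be verified numerically with the sketch below. It assumes the LOM matrix $A$ with entries $a_i/2$ and $c_i/2$, the diagonal weight matrix $K = \mathrm{Diag}(1, 2, \ldots, n)$, and the energy $E(q) = \tfrac{1}{2} q^T K q$; the symbol $K$, the factor $\tfrac{1}{2}$, and all function names are assumptions made for this illustration. The exact LOM solution is evaluated through the eigendecomposition of $A$ rather than through the closed form of appendix B.

```python
import numpy as np

def lom_matrix(n):
    """Assumed LOM(n) matrix: zero diagonal, a_i/2 below and c_i/2 above it."""
    A = np.zeros((n, n))
    for i in range(1, n):
        A[i - 1, i] = 0.5 * (i + 1)     # c_i / 2
        A[i, i - 1] = -0.5 * i          # a_{i+1} / 2
    return A

def lom_exact(A, q0, t):
    """q(t) = exp(A t) q0, computed from the eigendecomposition of A."""
    lam, V = np.linalg.eig(A)
    coef = np.linalg.solve(V, np.asarray(q0, dtype=complex))
    return (V @ (np.exp(lam * t) * coef)).real

def energy(q, K):
    """Generalized energy 0.5 * q^T K q."""
    return 0.5 * q @ K @ q

if __name__ == "__main__":
    n = 4
    A, K = lom_matrix(n), np.diag(np.arange(1.0, n + 1))
    q0 = np.eye(n)[0]
    print("K A skew-symmetric:", np.allclose(K @ A, -(K @ A).T))
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, energy(lom_exact(A, q0, t), K))
```

The eigendecomposition route is legitimate here because $K^{1/2} A K^{-1/2}$ is skew-symmetric, so $A$ is diagonalizable.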

2) Solution of LOM(n) in (4.8)

Much like the PDE (4.1), its LOM(n) counterpart in (4.8) can also be solved exactly. The process of obtaining its solution is quite involved. To minimize the digression from the main development, we describe this solution process in appendix B, which discusses in detail the eigenstructure of $A$, its Jordan canonical form, and the form of the general solution of (4.8). Our goal is to use the exact solution of (4.8) given in appendix B to calibrate the choice of Δt, the time discretization interval to be used in the following section.

5. Numerical experiments

Discretizing the spectral model in (4.8) with n = 4 using the first-order Euler scheme, we obtain
\[ \xi_{k+1} = M \xi_k, \tag{5.1} \]
where $\xi_k$ is the discrete-time counterpart of $q(t = k\Delta t)$, Δt denotes the length of the time interval used in the time discretization, and
\[ M = I + \Delta t\, A, \tag{5.2} \]
where $A \in \mathbb{R}^{4 \times 4}$ is given in (4.11) and the initial condition in (4.9).
Pontryagin’s approach requires the addition of the forcing term to (5.1). The forced version of (5.1) is then represented as
\[ x_{k+1} = M x_k + B u_k, \tag{5.3} \]
where $u_k \in \mathbb{R}^p$ and $B \in \mathbb{R}^{n \times p}$. Clearly (5.3) is the same as (3.1) with $\eta_k \equiv 0$ and $x_0 = q(0)$.

Equation (5.3) defines the evolution of the spectral amplitudes. Compared to the original equation, the spectral model in (5.3) has two types of model errors: one from the spectral truncation in the Galerkin projection and one due to finite differencing in (4.8) using the first-order method.
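
The finite-differencing error can be isolated by comparing the Euler iterates with the exact LOM solution, as in the sketch below. It assumes $M = I + \Delta t\, A$ with the LOM matrix $A$ built from $a_i/2$ and $c_i/2$, and it evaluates the exact solution through the eigendecomposition of $A$; the function names and the energy weights $K = \mathrm{Diag}(1, \ldots, n)$ are assumptions made for this illustration.

```python
import numpy as np

def lom_matrix(n):
    """Assumed LOM(n) matrix: zero diagonal, a_i/2 below and c_i/2 above it."""
    A = np.zeros((n, n))
    for i in range(1, n):
        A[i - 1, i] = 0.5 * (i + 1)
        A[i, i - 1] = -0.5 * i
    return A

def lom_exact(A, q0, t):
    """q(t) = exp(A t) q0 via the eigendecomposition of A."""
    lam, V = np.linalg.eig(A)
    coef = np.linalg.solve(V, np.asarray(q0, dtype=complex))
    return (V @ (np.exp(lam * t) * coef)).real

def euler_trajectory(A, q0, dt, nsteps):
    """Iterate xi_{k+1} = (I + dt*A) xi_k and return all states as rows."""
    M = np.eye(len(q0)) + dt * A
    traj = [np.asarray(q0, dtype=float)]
    for _ in range(nsteps):
        traj.append(M @ traj[-1])
    return np.array(traj)

def final_error(A, q0, dt, T):
    """Norm of the difference between the Euler and exact states at time T."""
    nsteps = int(round(T / dt))
    return np.linalg.norm(euler_trajectory(A, q0, dt, nsteps)[-1] - lom_exact(A, q0, T))

if __name__ == "__main__":
    A, q0 = lom_matrix(4), np.eye(4)[0]
    for dt in (0.2, 0.1, 0.05, 0.01):
        print(dt, final_error(A, q0, dt, 2.0))
```

One consequence of the Euler step is that the generalized energy $\tfrac{1}{2}\xi^T K \xi$ is no longer conserved: each step adds $\tfrac{1}{2}\Delta t^2 (A\xi)^T K (A\xi) \ge 0$, so the discrete energy can only grow.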

Observations

We propose to use the exact Fourier coefficient vector at t = kΔt in (4.5) corrupted by additive noise as the observations in our numerical experiments. Define
e5.4
where $z_k \in \mathbb{R}^n$ and $\nu_k \sim N(0, R)$ with $R \in \mathbb{R}^{n \times n}$.

Comparing (5.4) with (3.2), it is immediate that m = n and $H = I_n$.

The form of the functional Vk is given by
e5.5
where $C \in \mathbb{R}^{p \times p}$ is a symmetric and positive definite matrix.
Applying the results from section 3, it follows that
e5.6
where
e5.7
with λN = 0.
The TPBVP in (5.3) and (5.7) is then solved using the sweep method described in section 3. Accordingly,
e5.8
where
e5.9
e5.10
where $E = B C^{-1} B^T$, $F = R^{-1}$, $W = R^{-1}$, $P_N = 0$, and $g_N = 0$.

Solving (5.9) and (5.10), we then assemble $u_k$ using (3.14)–(3.18). Substituting it in (5.1) we get the optimal solution.

1) Experiment 1

In this first experiment, we set n = m = p = 4, $B = I_4$, and $u_k \in \mathbb{R}^4$. The uncontrolled model is
e5.11
and the controlled model is
e5.12
with $M = (I + \Delta t\, A)$, where $A$ is given in (4.11).

Both models start from the same initial condition $\xi_0 = x_0 = (1.1, 0, 0, 0)^T$, which is different from the one that was used to generate the observations. Consequently, the solution to the unforced model in (5.11) inherits three types of errors: the first because of the spectral truncation, the second because of finite differencing, and the third owing to error in the initial condition. The power of Pontryagin’s approach is that it computes the optimal control $u_k$ such that the term $B u_k$ compensates for all three types of errors.

The observation vector $z_k \in \mathbb{R}^4$ is given by
e5.13
for 1 ≤ k ≤ 10, where the exact Fourier coefficients are given in Table 1, and $\nu_k \sim N(0, R)$.
It is further assumed that $R = \sigma^2 I_4$ and $C = c I_p$. Substituting these in the expression for $V_k$ in (2.5)–(2.7), it can be verified that
e5.14

A comparison of the evolution of the four components of the uncontrolled error, $e_0 = \xi_k - z_k \in \mathbb{R}^4$, and the corresponding components of the controlled error, $e_c = x_k - z_k \in \mathbb{R}^4$, when $\sigma^2 = 0.001$ is fixed and c is varied through $10^5$, $10^3$, and 1, is given in Figs. 3a–d. It is clear that the magnitudes of the individual components of the controlled error are uniformly (in time k) less than the magnitudes of the corresponding components of the uncontrolled error. Further, the magnitudes of the controlled error decrease as the control parameter c decreases.

Fig. 3. Comparison of the four components of the uncontrolled error $e_0 = \xi_k - z_k$ and the controlled error $e_c = x_k - z_k$ for p = 4, $B = I_4$, c = {100 000, 1000, 1}.


This behavior can be easily explained using (5.14). When the value of the control parameter c is large (for a fixed $R^{-1}$), the minimization process forces $u_k$ to be small. However, if c is small, the minimization allows for a larger value of $u_k$. This increased forcing moves $x_k$ closer to $z_k$. This observation is corroborated by the plots of the evolution of the four components of the control $\{u_k\}$ given in Figs. 4a–d. It is evident from Fig. 4 that the magnitude of the initial values of the control increases as the parameter c is decreased.

Fig. 4. Comparison of the four components of the control sequence $\{u_k\}$ for p = 4, $B = I_4$, and c = {100 000, 1000, 1}.


Standard measures of the closeness of the jth component of the controlled and uncontrolled model solutions to the jth component of the observations are given by
e5.15
and
e5.16
Table 2 gives the values of these measures for various combinations of the values of $\sigma^2$ and c. It is clear from Fig. 3 and Table 2 that for a given $\sigma^2$, RMS1 decreases as c decreases.
Table 2. Root-mean-square errors of the controlled and uncontrolled model solution with observations (p = 4, $B = I_4$).
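
The setup of experiment 1 can be mimicked with the self-contained sketch below. It is not a reproduction of Table 2 or Figs. 3 and 4: the observations are generated on the fly from an assumed closed form of the solution (by quadrature) plus pseudo-random noise with an arbitrary seed, the misfit is penalized at steps k = 0, ..., 10, and the sweep recursions are rederived for the special case $B = H = I$, $R = \sigma^2 I$, $C = cI$, $\eta_k = 0$ with the convention $\lambda_k = P_k x_k - g_k$. All function names are hypothetical.

```python
import numpy as np

def q_exact(x, t):
    """Assumed closed form of the solution of q_t + sin(x) q_x = 0, q(x,0) = sin(x)."""
    e = np.exp(-t)
    return e * np.sin(x) / (np.cos(x / 2) ** 2 + (e * np.sin(x / 2)) ** 2)

def fourier_coeffs(t, n, npts=2048):
    """First n sine coefficients of q(., t) by the periodic trapezoidal rule."""
    x = 2 * np.pi * np.arange(npts) / npts
    k = np.arange(1, n + 1)[:, None]
    return (2.0 / npts) * (np.sin(k * x) @ q_exact(x, t))

def lom_matrix(n):
    """Assumed LOM(n) matrix: zero diagonal, a_i/2 below and c_i/2 above it."""
    A = np.zeros((n, n))
    for i in range(1, n):
        A[i - 1, i] = 0.5 * (i + 1)
        A[i, i - 1] = -0.5 * i
    return A

def track(M, z, x0, sigma2, c):
    """Sweep solution for B = H = I, R = sigma2*I, C = c*I, eta = 0."""
    N, n = z.shape
    I = np.eye(n)
    P = np.zeros((N + 1, n, n))
    g = np.zeros((N + 1, n))
    for k in range(N - 1, -1, -1):           # backward Riccati and g recursions
        S = P[k + 1] @ np.linalg.inv(I + P[k + 1] / c)
        P[k] = M.T @ S @ M + I / sigma2
        g[k] = M.T @ (g[k + 1] - S @ g[k + 1] / c) + z[k] / sigma2
    x = np.zeros((N + 1, n))
    x[0] = x0
    for k in range(N):                       # forward pass with optimal control
        u = np.linalg.solve(c * I + P[k + 1], g[k + 1] - P[k + 1] @ M @ x[k])
        x[k + 1] = M @ x[k] + u
    return x

def rms(traj, z):
    """Per-component root-mean-square difference between trajectory and data."""
    return np.sqrt(np.mean((traj - z) ** 2, axis=0))

dt, n, sigma2 = 0.2, 4, 1e-3
rng = np.random.default_rng(0)               # arbitrary seed, not from the paper
z = np.array([fourier_coeffs(k * dt, n) for k in range(11)])
z = z + np.sqrt(sigma2) * rng.standard_normal(z.shape)
M = np.eye(n) + dt * lom_matrix(n)
x0 = np.array([1.1, 0.0, 0.0, 0.0])

free = [x0]
for _ in range(10):                          # uncontrolled model (5.11)
    free.append(M @ free[-1])
rms_free = rms(np.array(free)[1:], z[1:])
rms_ctrl = {c: rms(track(M, z, x0, sigma2, c)[1:11], z[1:]) for c in (1e5, 1e3, 1.0)}

if __name__ == "__main__":
    print("uncontrolled:", rms_free)
    for c, r in rms_ctrl.items():
        print("c =", c, r)
```

For this quadratic cost the total squared misfit cannot increase when c is decreased, and it cannot exceed the misfit of the uncontrolled run; the per-component values printed by the script will generally differ from those in Table 2 because the noise realization is different.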

2) Experiment 2

In this experiment we set p = 1 and $B = (1\ 1\ 1\ 1)^T$; all the other parameters are the same as in experiment 1. A comparison of the plots of the observations with the controlled and uncontrolled model solutions is given in Fig. 5. Table 3 provides a comparison of the RMS errors for various choices of $\sigma^2$ and c. Recall that when p = 1, the same control is applied to every component of the state vector, whereas when p = 4 different elements of the control vector affect the evolution of the different components of the state vector. Thus in experiment 1 (p = 4) the components of the control vector are customized to each component of the state vector, and hence the errors are smaller, as borne out by comparing the corresponding elements of Tables 2 and 3. Clearly, larger p is better.

Fig. 5. Comparison of the four components of the uncontrolled error $e_0 = \xi_k - z_k$ and the controlled error $e_c = x_k - z_k$ for p = 1, $B = (1\ 1\ 1\ 1)^T$, and c = {100 000, 1000, 1}.


Table 3. Root-mean-square errors of the controlled and uncontrolled model solution with observations [p = 1, $B = (1\ 1\ 1\ 1)^T$].

6. Identification of model errors

One of the lofty goals of dynamic data assimilation is to find a correction for model error—errors due to the absence or inappropriate parameterization of physical processes germane to the phenomenon under investigation, and/or incorrect specification of the deterministic model’s control vector (initial conditions, boundary conditions, and physical/empirical parameters). The theory developed in sections 2 and 3 and the illustrations in sections 4 and 5 demonstrate the inherent strength of Pontryagin’s minimum principle as a means of finding this correction.

In an effort to further understand the sources of model error, we take the Pontryagin procedure one step further: we attempt to find a correction matrix $\delta M \in \mathbb{R}^{n \times n}$ such that the solution of the corrected but unforced model, with matrix $(M + \delta M)$, matches the optimal trajectory from Pontryagin. That is,
\[ x_{k+1} = (M + \delta M)\, x_k, \tag{6.1} \]
where $\{x_k\}$ is the optimal trajectory of (5.3). Subtracting (5.3) from (6.1), we find
\[ \delta M\, x_k = y_k \tag{6.2} \]
for all $1 \le k \le N$, where $y_k = B u_k$. That is, given $\{x_k\}$ and the optimal input time series $\{y_k\}$, we seek a time-invariant linear operator $\delta M$ that maps $x_k$ to $y_k$ for all $1 \le k \le N$.
This inverse problem can be recast as an unconstrained minimization of a quadratic functional defined on $\mathbb{R}^{n \times n}$ by
e6.3
with respect to $\delta M \in \mathbb{R}^{n \times n}$, where
e6.4
and
e6.5
e6.6
e6.7
From appendix C it readily follows that the optimal $\delta M$ is given by
e6.8
where the superscript + denotes the generalized inverse.

Those familiar with the optimal interpolation method (Gandin 1965) will readily recognize that the first term on the right-hand side of (6.8) is akin to the cross covariance between $x_k$ and $y_k$ and the second term is akin to the inverse of the covariance of $x_k$ with itself. We now illustrate this idea in the following example.
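
The least-squares estimate of the correction matrix can be sketched as follows. The sketch assumes the functional $\sum_k \lVert y_k - \delta M\, x_k \rVert^2$, whose minimizer is $(\sum_k y_k x_k^T)(\sum_k x_k x_k^T)^{+}$; this matches the cross-covariance and covariance-inverse reading given above, but the scaling and notation of (6.3)–(6.8) may differ. The function name `correction_matrix` and the array layout (one state per row) are hypothetical.

```python
import numpy as np

def correction_matrix(X, Y):
    """Least-squares D with D @ x_k ~= y_k; rows of X and Y are x_k and y_k."""
    cross = Y.T @ X                  # sum_k y_k x_k^T  (cross-covariance-like term)
    gram = X.T @ X                   # sum_k x_k x_k^T  (covariance-like term)
    return cross @ np.linalg.pinv(gram)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D_true = rng.standard_normal((4, 4))
    X = rng.standard_normal((10, 4))         # stand-in for the optimal trajectory
    Y = X @ D_true.T                         # stand-in for y_k = B u_k
    print(np.max(np.abs(correction_matrix(X, Y) - D_true)))
```

The pseudoinverse keeps the estimate well defined when the trajectory does not span the whole state space; in that case the minimum-norm solution is returned.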

Example 6.1

Using the optimal control sequence $y_k = B u_k$ and its associated optimal trajectory $x_k$ found in example 5.1 (with n = 4), the value of $\delta M$ computed from (6.8) is given by
e6.9
The trajectory of the corrected but uncontrolled model is given by
e6.10
A comparison of $\zeta_k - z_k$, the error of the corrected but uncontrolled model in (6.10), and $\xi_k - z_k$, the error of the uncorrected and uncontrolled model in (5.11), is given in Fig. 6. It is evident from Fig. 6 that the trajectory of the corrected but uncontrolled model fits the observations better.
Fig. 6. Comparison of the four components of the error $\zeta_k - z_k$ between the corrected but uncontrolled model state $\{\zeta_k\}$ in (6.10) and the observation $\{z_k\}$ with the error between the original uncorrected and uncontrolled model state in (2.1) and the observation $\{z_k\}$.


We conclude this section with the following remarks:

  1. Define a vector $s(x) = (\sin x, \sin 2x, \sin 3x, \sin 4x)^T$ and define
    e6.11
    where $\xi_k$ is the (uncontrolled) model trajectory obtained from (5.1) using the matrix $M$ and $\zeta_k$ is the (uncontrolled) model trajectory obtained from (6.10) with the matrix $(M + \delta M)$. Clearly $q_1(x, k)$ and $q_2(x, k)$ are approximations to the exact solution q(x, t) in (4.3) at t = kΔt. It can be verified that
    e6.12
    where q(x, k) = q(x, t) at t = kΔt. That is, the one-step model error correction matrix forces the model solution closer to the true solution.
  2. Only for simplicity in exposition did we pose the inverse problem in (6.3) as an unconstrained problem. In fact, one could readily accommodate structural constraints on $\delta M$, such as requiring it to be a diagonal, tridiagonal, or lower-triangular matrix. Further, we could also readily impose inequality constraints on a selected subset of the elements of $\delta M$.
  3. Again, only for simplicity did we obtain a single matrix $\delta M$ that covers the entire assimilation window and maps $x_k$ to $y_k$ for all $1 \le k \le N$. In principle, we could divide the assimilation window into L subintervals and estimate a correction matrix $\delta M_q$ using only the $(x_k, y_k)$ pairs that reside in the qth subinterval. In this latter case, we will have a time-varying one-step transition correction matrix $\delta M_q$ for each subinterval, $1 \le q \le L$.

7. Conclusions

The essence of the PMP-based approach to dynamic data assimilation is the computation of the optimal control sequence $u_k \in \mathbb{R}^p$, where the parameter $1 \le p \le n$ denotes the “richness” of the control. It follows from experiments 1 and 2 that a larger value of p is better. When this sequence is applied to the deterministic model, it forces the model to track the observations as closely as desired, where the closeness is controlled by judicious choices of the relative weights of the two energy terms in the cost functional. More specifically, for a given observational noise covariance matrix $R$, a simple choice of $C = c I_p$ with a smaller value of the constant c provides a better fit between the model and the data. The computation of this optimal control sequence rests on the solution to a nonlinear TPBVP. While the solution to this latter class of problems can be a daunting task, especially for the large-scale problems of interest in the geosciences, several well-tested numerical methods for finding the solution are known and are available as components of several program libraries.

We have demonstrated the power of this approach by applying it to a nontrivial linear advection problem. For this linear problem, the TPBVP reduces to two initial value problems. In addition we have developed a flexible framework to consolidate the information from the optimal control sequence into a single correction matrix $\delta M$, which, when added to the given model matrix $M$, will generate a solution that closely matches the optimal trajectory computed using the PMP.

It should be interesting and valuable to compare the model error corrections obtained using the PMP with those obtained from using the model in a weak constraint formulation.

Acknowledgments

We are very grateful to Qin Xu and an anonymous reviewer for their comments and suggestions that helped to improve the organization of the paper. S. Lakshmivarahan’s efforts are supported in part by two grants: NSF EPSCOR Track 2 Grant 105-155900 and NSF Grant 105-15400.

APPENDIX A

On the Correctness of the Affine Relation between the Costate Variable λk and the State Variable xk Given in (3.8)

Since λN = 0, from the second equation in (3.7) we get
eA.1
which clearly shows that λN−1 is an affine function of xN−1. Substituting (A.1) back into the second equation in (3.7), we obtain
eA.2
But from the first equation in (3.7), it follows that
eA.3
Using (A.1) in (A.3) and simplifying we get
eA.4
Now substituting (A.4) into (A.2), it follows that
eA.5
which is clearly affine in xN−2.

Continuing inductively it can be easily verified that λk is an affine function of xk as posited in (3.8).

APPENDIX B

Solution of the LOM(n) in (4.8)

In this appendix we analyze the eigenstructure of the matrix $A$ in (4.8) leading to its Jordan canonical form, which, in turn, leads to the closed form solution of the LOM in (4.8).

a. Eigenstructure of the matrix $A$

Since the structure of the matrix $A$ in (4.8) is closely related to the tridiagonal matrix, we start this discussion by stating a well-known result relating to the recursive computation of the determinant of the tridiagonal matrix (Lakshmivarahan and Dhall 1990, 416–418).

Consider a general k × k tridiagonal matrix of the form
eB.1
for $1 \le k \le n$. Let $D_i$ denote the determinant of the principal submatrix consisting of the first i rows and i columns of this matrix. Then the determinant $D_k$ of the matrix in (B.1) is obtained by applying the Laplace expansion to its kth (last) row and is given by the second-order linear recurrence
eB.2
where D0 = 1 and D1 = b1.
Setting $b_i \equiv 0$ for all $1 \le i \le n$, $c_i = (i + 1)/2$ for $1 \le i \le (n - 1)$, and $a_i = -(i - 1)/2$ for $2 \le i \le n$ in (B.1), it can be verified that the tridiagonal matrix reduces to $A$ in (4.10). Substituting these values in (B.2), the latter becomes
eB.3
with D0 = 1 and D1 = 0. Iterating (B.3), it can be verified that
eB.4
Thus, $A$ in (4.8) is singular when n is odd. Henceforth we only consider the case when n is even. Refer to Table B1 for values of the determinant of $A$ for 2 ≤ n ≤ 10.
Table B1. Determinant, characteristic polynomial, and eigenvalues of the matrix $A$ for 2 ≤ n ≤ 10.
The characteristic polynomial of $A$ is found by setting $b_i = -\lambda$ in (B.1) while keeping the off-diagonal entries $a_i$ and $c_i$ as above. In this case, the determinant $D_n$ of the matrix in (B.1) represents the characteristic polynomial of $A$ in (4.8). Making the above substitutions in (B.2), the latter becomes
eB.5
with D0 = 1 and D1 = −λ. Iterating (B.5) leads to the sequence of characteristic polynomials of for various values of n. Table B1 also gives the characteristic polynomial and the eigenvalues of for 2 ≤ n ≤ 10. From this table it is clear that the absolute value of the largest eigenvalue increases and that of the smallest (nonzero) eigenvalue decreases with n. It turns out that the larger eigenvalues correspond to the high-frequency components and the smaller eigenvalues correspond to the low-frequency components that make up the solution q(t) of (4.8).
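
The determinant recurrence and the eigenvalue pattern can be checked numerically with the sketch below. It assumes the standard tridiagonal recurrence $D_k = b_k D_{k-1} - a_k c_{k-1} D_{k-2}$ with $b_k = 0$, $a_k = -(k-1)/2$, and $c_{k-1} = k/2$, that is, the LOM matrix with the factor $\tfrac{1}{2}$ absorbed into the off-diagonal entries. The function names are hypothetical, and the values in Table B1 are not reproduced here.

```python
import numpy as np

def lom_matrix(n):
    """Assumed LOM(n) matrix: zero diagonal, a_i/2 below and c_i/2 above it."""
    A = np.zeros((n, n))
    for i in range(1, n):
        A[i - 1, i] = 0.5 * (i + 1)
        A[i, i - 1] = -0.5 * i
    return A

def det_recurrence(n):
    """Determinant of the n x n LOM matrix from D_k = -a_k c_{k-1} D_{k-2} (b_k = 0)."""
    d_prev2, d_prev1 = 1.0, 0.0            # D_0 = 1, D_1 = b_1 = 0
    for k in range(2, n + 1):
        a_k, c_km1 = -0.5 * (k - 1), 0.5 * k
        d_prev2, d_prev1 = d_prev1, -a_k * c_km1 * d_prev2
    return d_prev1

if __name__ == "__main__":
    for n in range(2, 11):
        lam = np.linalg.eigvals(lom_matrix(n))
        print(n, det_recurrence(n), np.sort(np.abs(lam.imag)))
```

Under these assumptions the determinant vanishes for odd n, and all eigenvalues are purely imaginary because the matrix is similar to a skew-symmetric one.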

b. Jordan canonical form for $A$

Let n×n denote the matrix eigenvalues of and let n×n denote a nonsingular matrix of the corresponding eigenvectors; that is,
eB.6
Then
eB.7
and takes a special block diagonal form
eB.8
and
eB.9
for each complex conjugate pair ±iλi of eigenvalues of for . The matrix in (B.8) is known as the Jordan canonical form of (Hirsch and Smale 1974).

Solution of (4.8):

The general form of the solution q(t) of (4.8) is given by
eB.10
Using (B.7) in (B.10), it can be shown that
eB.11
or
eB.12
where q(t) and are related by the linear transformation
eB.13
By exploiting the structure of , it can be verified that
eB.14
where
eB.15
and
eB.16
Substituting (B.14)(B.16) into (B.12), we obtain . Clearly, is the solution of (4.8).

We conclude this appendix with the following.

Example (B.1).
Consider the case with n = 4 and
eB.17
with given by (4.8). From Table B1, the eigenvalues of (listed in the increasing order of their absolute values computed using MATLAB) are given by
eB.18
From (B.14)(B.16), we obtain
eB.19
and
eB.20
where
eB.21
and
eq3
eq4
Hence, is given by
eB.22
It can be verified that the matrix of eigenvector corresponding to above is given by
eB.23
Hence, the solution of (B.17) is given by
eB.24
Clearly, the general solution $q_i(t)$ for each i is a linear combination of the harmonic terms $\cos(\lambda_k t)$ and $\sin(\lambda_k t)$, where the coefficients of the linear combination are given by the elements of the ith row of the matrix of eigenvectors of $A$.

APPENDIX C

Gradient of () in (6.3)

Let : n×n be a functional defined over a set of n × n matrices. Then, by definition, the gradient () is a matrix given by
eC.1
For the gradient of α(, x) in (6.5), let
eC.2
be a row partition of . Then, the Grammian T can be expressed as
eC.3
Consequently,
eC.4
Hence the gradient of α(, x) with respect to the column vector is given by
eC.5
Taking transpose of both sides,
eC.6
By stacking these rows of derivatives, we get
eC.7

a. Gradient of β(, x, y) in (6.6)

From (6.6),
eC.8
Hence,
eC.9

b. Gradient of () in (6.3)

Combining (C.7) and (C.9) with (6.4)(6.7), it is immediate that
eC.10
and
eC.11
Hence, the minimizer of () in (6.3) is given by
eC.12
where + is the generalized inverse of .

REFERENCES

  • Abramov, R. V., G. Kovacic, and A. J. Majda, 2003: Hamiltonian structure and statistically relevant conserved quantities for the truncated Burger-Hopf equation. Commun. Pure Appl. Math., 56, 1–46.

  • Anthes, R. A., 1974: Data assimilation and initialization of hurricane prediction model. J. Atmos. Sci., 31, 702–719.

  • Athans, M., and P. L. Falb, 1966: Optimal Control. McGraw-Hill, 879 pp.

  • Bennett, A., 1992: Inverse Methods in Physical Oceanography. Cambridge University Press, 346 pp.

  • Bennett, A., and M. A. Thorburn, 1992: The generalized inverse of a nonlinear quasigeostrophic ocean circulation model. J. Phys. Oceanogr., 22, 213–230.

  • Bennett, S., 1996: A brief history of automatic control. IEEE Control Syst., 16, 17–25.

  • Bergman, K. H., 1979: Multivariate analysis of temperatures and winds using optimum interpolation. Mon. Wea. Rev., 107, 1423–1444.

  • Bergthórsson, P., and B. Döös, 1955: Numerical weather map analysis. Tellus, 7, 329–340.

  • Boltyanskii, V. G., 1971: Mathematical Methods of Optimal Control. Holt, Rinehart and Winston, 272 pp.

  • Boltyanskii, V. G., 1978: Optimal Control of Discrete Systems. John Wiley and Sons, 392 pp.

  • Bryson, A. E., 1996: Optimal control-1950 to 1985. IEEE Control Syst., 16, 26–33.

  • Bryson, A. E., 1999: Dynamic Optimization. Addison-Wesley, 434 pp.

  • Canon, M. D., C. D. Cullum Jr., and E. Polak, 1970: Theory of Optimal Control and Mathematical Programming. McGraw Hill, 285 pp.

  • Carrier, G. F., and C. E. Pearson, 1976: Partial Differential Equations: Theory and Techniques. Academic Press, 320 pp.

  • Catlin, D. E., 1989: Estimation, Control and the Discrete Kalman Filter. Springer-Verlag, 274 pp.

  • Dee, D. P., and A. M. da Silva, 1998: Data assimilation in the presence of forecast bias. Quart. J. Roy. Meteor. Soc., 124, 269–295.

  • Derber, J., 1989: A variational continuous assimilation technique. Mon. Wea. Rev., 117, 2437–2446.

  • Eliassen, A., 1954: Provisional report on the calculation of spatial covariance and autocorrelation of pressure field. Institute of Weather and Climate Research, Academy of Sciences Rep. 5, 12 pp. [Available from Norwegian Meteorological Institute, P.O. Box 43, Blindern, N-0313, Oslo, Norway.]

  • Friedland, B., 1969: Treatment of bias in recursive filtering. IEEE Trans. Autom. Control., 14, 359–367.

  • Gandin, L. S., 1965: Objective Analysis of Meteorological Fields. Israel Program for Scientific Translations, 242 pp.

  • Goldstein, H. H., 1950: Classical Mechanics. Addison-Wesley, 399 pp.

  • Goldstein, H. H., 1980: A History of the Calculus of Variations from the 17th through the 19th Century. Springer-Verlag, 410 pp.

  • Griffith, A. K., and N. K. Nichols, 2001: Adjoint methods in data assimilation for estimating model error. Flow, Turbul. Combust., 65, 469–488.

  • Hirsch, M. W., and S. Smale, 1974: Differential Equations, Dynamical Systems, and Linear Algebra. Academic Press, 358 pp.

  • Kalman, R. E., 1963: The theory of optimal control and calculus of variations. Mathematical Optimization Techniques, R. Bellman, Ed., University of California Press, 309–329.

  • Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation, and Predictability. Cambridge University Press, 341 pp.

  • Keller, H. B., 1976: Numerical Solution of Two Point Boundary Value Problems. Regional Conference Series in Applied Mathematics, Vol. 24, SIAM Publications, 69 pp.

  • Kuhn, H. W., and A. W. Tucker, 1951: Nonlinear programming. Proc. Second Berkeley Symp. on Mathematical Statistics and Probability, Berkeley, CA, University of California, Berkeley, 481–492.

  • Lakshmivarahan, S., and S. K. Dhall, 1990: Analysis and Design of Parallel Algorithm: Arithmetic and Matrix Problems. McGraw Hill, 657 pp.

  • Lakshmivarahan, S., and J. M. Lewis, 2013: Nudging: A critical overview. Data Assimilation for Atmospheric, Oceanic and Hydrologic Applications, Vol. 2, S. K. Park and L. Liang, Eds., Springer-Verlag, in press.

  • Lewis, F. L., 1986: Optimal Control. John Wiley and Sons, 362 pp.

  • Lewis, J. M., 1972: An operational upper air analysis using the variational methods. Tellus, 24, 514–530.

  • Lewis, J. M., and S. Lakshmivarahan, 2008: Sasaki’s pivotal contribution: Calculus of variation applied to weather map analysis. Mon. Wea. Rev., 136, 3553–3567.

  • Lewis, J. M., S. Lakshmivarahan, and S. K. Dhall, 2006: Dynamic Data Assimilation: A Least Squares Approach. Cambridge University Press, 654 pp.

  • Lynch, P., 2006: The Emergence of Numerical Weather Prediction: Richardson’s Dream. Cambridge University Press, 279 pp.

  • Majda, A. J., , and I. Timofeyev, 2000: Remarkable statistical behavior for truncated Burgers–Hopf dynamics. Proc. Natl. Acad. Sci. USA, 97, 12 41312 417.

    • Search Google Scholar
    • Export Citation
  • Majda, A. J., , and I. Timofeyev, 2002: Statistical mechanics for truncations of Burger-Hopf equation: A model for intrinsic stochastic behavior with scaling. Milan J. Math., 70, 3996.

  • Menard, R., and R. Daley, 1996: The application of Kalman smoother theory to estimation of 4DVAR error statistics. Tellus, 48A, 221–237.

  • Naidu, D. S., 2003: Optimal Control Systems. CRC Press, 433 pp.

  • Platzman, G. W., 1964: An exact integral of complete spectral equations for unsteady one-dimensional flow. Tellus, 16, 422–431.

  • Polak, E., 1997: Optimization. Springer, 779 pp.

  • Pontryagin, L. S., V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mischenko, 1962: The Mathematical Theory of Optimal Control Processes. John Wiley, 360 pp.

  • Roberts, S. M., and J. S. Shipman, 1972: Two-Point Boundary Value Problems: Shooting Method. Elsevier, 289 pp.

  • Rauch, H. E., F. Tung, and C. T. Striebel, 1965: Maximum likelihood estimates of linear dynamic systems. J. Amer. Inst. Aeronaut. Astronaut., 3, 1445–1450.

  • Sasaki, Y., 1958: An objective analysis based on the variational method. J. Meteor. Soc. Japan, 36, 77–88.

  • Sasaki, Y., 1970a: Some basic formulations in numerical variational analysis. Mon. Wea. Rev., 98, 875–883.

  • Sasaki, Y., 1970b: Numerical variational analysis formulated under the constraints as determined by longwave equations and low-pass filter. Mon. Wea. Rev., 98, 884–898.

  • Sasaki, Y., 1970c: Numerical variational analysis with weak constraint and application to surface analysis of severe storm gust. Mon. Wea. Rev., 98, 899–910.

  • Shen, J., T. Tang, and L. L. Wang, 2011: Spectral Methods. Springer-Verlag, 470 pp.

  • Wiener, N., 1948: Cybernetics: Control and Communication in the Animal and Machine. John Wiley, 194 pp.

  • Wiin-Nielsen, A., 1991: The birth of numerical weather prediction. Tellus, 43A, 36–52.

  • Zupanski, D., 1997: A general weak constraint applicable to operational 4DVAR data assimilation system. Mon. Wea. Rev., 125, 2274–2292.
