An optimal perturbation is an initial condition that optimizes some measure of amplitude growth over a prescribed time in a linear system. Previous studies have argued that optimal perturbations play an important role in turbulence. Two basic questions related to this theory are whether optimal perturbations necessarily grow in all turbulent background flows and whether the turbulent flow necessarily excites optimal perturbations at the rate required to account for the observed eddy variance. This paper shows that both questions can be answered in the affirmative for statistically steady turbulence. More precisely, it is shown that eddies in statistically stationary turbulence must project onto a class of amplifying perturbations called instantaneous optimals, which are defined as initial conditions that optimize the rate of change of energy associated with the dynamical system linearized about the time-mean flow. An analogous conclusion holds for potential enstrophy when the latter satisfies a similar conservation principle. It is shown that the growing instantaneous optimals imply the existence of growing finite-time singular vectors. Moreover, the average projection on the growing instantaneous optimals must be sufficient to balance the average projection on all other eddies. In contrast to most other types of optimal perturbations, the phase space spanned by the growing instantaneous optimals is independent of the norm used to measure the initial amplitude. This paper also proves that growing instantaneous optimals must exist and play a significant role in nonlinear vacillation phenomena. The argument put forward here follows essentially from statistical equilibrium and conservation of energy, and is independent of any closure theory of turbulence.
The concept of an optimal perturbation was introduced by Lorenz (1965) in his analysis of forecast error growth in dynamical systems. In essence, Lorenz posed the problem of finding the infinitesimal initial condition that maximized perturbation amplitude in a dynamical system at some prescribed lead time. The solution to this optimization problem, for any norm for measuring amplitude, can be reduced to a singular value decomposition of a suitable propagator. The singular vectors give an orthogonal set of initial conditions that optimize amplitude, ordered such that the leading vector maximizes amplitude growth over all initial conditions, the second vector maximizes amplitude growth over all initial conditions orthogonal to the first, and so on.
In a series of papers, Farrell (1982, 1984, 1985, 1988a, 1989) argued that optimal perturbations could account for a wide variety of observational features of cyclogenesis, including the appearance of an upper-level trough overtaking a surface depression prior to cyclogenesis, and growth rates in excess of unstable normal modes. Farrell and Ioannou (1993) argued that optimal perturbations could account for the development of eddies in viscous shear flow and for the maintenance of pipe turbulence. Farrell (1988b), Butler and Farrell (1992), Farrell and Ioannou (1996), and Schmid and Henningson (2001) have argued that optimal perturbations are important for the transition to turbulence.
Despite the wealth of evidence presented in the above studies, there still remain unresolved issues regarding the importance of optimal perturbations in turbulence. First, nothing in the calculation of optimal perturbations ensures that they necessarily grow in turbulent background flows. Although all studies of which we are aware indicate that growing optimal perturbations exist in realistic turbulent flows, there does not seem to be a proof that this must be so. Second, even if growing optimal perturbations exist, nothing in their calculation ensures that they will be excited at a rate necessary for them to play a significant role. Indeed, if optimal perturbations account for too little energy, then their importance to turbulence becomes questionable.
Whether growing optimal perturbations exist and whether they are excited at the rate required to account for the observed eddy variance depends on the eddy statistics of turbulent flows, for which no comprehensive theory exists. Consequently, it might appear that the above questions cannot be answered in the absence of a complete theory of turbulence. It turns out, however, that both questions can be answered in the affirmative for statistically steady turbulence. In essence, the result follows from conservation of energy: eddies can maintain their energy level against dissipation only if they extract energy from the background flow, but extraction of mean energy requires the presence of “instantaneous optimals,” which in turn implies the existence of finite-time singular vectors. Our argument is simple, but seems to have gone unnoticed. In the interest of presenting a self-contained argument, we first review the classical theory of optimal perturbations in the next section, then present our main argument in section 3. The results are generalized to time-dependent basic states in section 4. We conclude with a summary and discussion of the results. Some connections between this work and stochastic turbulence models are discussed in the review article by DelSole (2004).
2. Brief review of optimal perturbations
The concept of an optimal perturbation arises when one considers the maximum growth of an initial condition in a linear system. By linear system we mean that the state vector g is governed by a set of differential equations of the form

$$\dot{g} = \mathsf{L} g, \tag{1}$$
where the dot denotes a time derivative and 𝗟 is a dynamical operator. It suffices in this paper to consider only constant dynamical operators 𝗟. As is well known, the solution to the differential equation (1) at time t, $g_t$, given the initial condition $g_0$, can be written

$$g_t = \Phi g_0, \tag{2}$$
where Φ is a time-dependent matrix called the propagator. Without loss of generality, we choose the initial time to be t = 0. Among many possibilities, the amplitude of $g_t$ can be measured by the sum of the squared moduli of its elements, $g_t^H g_t$, where superscript H denotes the conjugate transpose. The growth of a perturbation then can be measured by the ratio of amplitudes:

$$\alpha = \frac{g_t^H g_t}{g_0^H g_0} = \frac{g_0^H \Phi^H \Phi g_0}{g_0^H g_0}. \tag{3}$$
This ratio, called the amplification factor, is in the form of a Rayleigh quotient. It follows from well-known properties of the Rayleigh quotient (Noble and Daniel 1988) that this ratio is optimized by the eigenvectors of $\Phi^H \Phi$. More precisely, the eigenvectors of $\Phi^H \Phi$ form an orthogonal set that can be ordered such that the first maximizes α, the second maximizes α over all vectors orthogonal to the first, and so on. The eigenvectors are identical to the right singular vectors of Φ, and the left singular vectors give the final states into which the corresponding right singular vectors evolve. These singular vectors are independent of the initial condition time, since 𝗟 is independent of time.
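The calculation described above can be sketched numerically. The following is a minimal illustration, not a procedure taken from this paper: the 2 × 2 operator `L`, the lead time `t`, and the helper `propagator` are all hypothetical, and the propagator is built from an eigendecomposition, which assumes that `L` is diagonalizable.

```python
import numpy as np

def propagator(L, t):
    """Return exp(L t) for a diagonalizable matrix L via its eigendecomposition."""
    evals, V = np.linalg.eig(L)
    return (V @ np.diag(np.exp(evals * t)) @ np.linalg.inv(V)).real

# Hypothetical operator, chosen to be stable (eigenvalues -1, -2) but nonnormal.
L = np.array([[-1.0, 10.0],
              [0.0, -2.0]])
t = 0.5  # hypothetical lead time

Phi = propagator(L, t)

# Right singular vectors (rows of Vh) are the optimal initial conditions;
# left singular vectors (columns of U) are the states they evolve into.
U, s, Vh = np.linalg.svd(Phi)

amplification = s**2   # values of the amplification factor (3), largest first
g0 = Vh[0]             # leading finite-time singular vector (unit amplitude)
gt = Phi @ g0          # its evolved state at lead time t

print("amplification factors:", amplification)
print("leading optimal initial condition:", g0)
```

In this example the leading amplification factor exceeds one even though both eigenmodes of `L` decay, which is the kind of transient growth that the singular vectors are designed to capture.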
In this paper, the above vectors will be called finite-time singular vectors. Another type of perturbation, called an instantaneous optimal, is the initial condition that maximizes the instantaneous growth rate at zero lead time (Farrell and Ioannou 1994; Schmid and Henningson 2001, p. 189). If the amplitude $g_t^H g_t$ at time t is E, then the tendency equation for E derived from (1) is

$$\dot{E} = g_t^H \left(\mathsf{L} + \mathsf{L}^H\right) g_t. \tag{4}$$
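The amplitude growth rate is then given by the ratio

$$\frac{\dot{E}}{E} = \frac{g_t^H \left(\mathsf{L} + \mathsf{L}^H\right) g_t}{g_t^H g_t}. \tag{5}$$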
The right-hand side of (5) is a Rayleigh quotient, from which it follows that the eigenvectors of $\mathsf{L} + \mathsf{L}^H$ define an orthogonal set of vectors that optimize Ė/E. For short lead time t, it can be shown that $\Phi^H \Phi \approx \mathsf{I} + t(\mathsf{L} + \mathsf{L}^H)$. Hence, instantaneous optimals can be interpreted as finite-time singular vectors in the limit of short lead time.
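A small numerical check may make these two statements concrete. This is only a sketch under stated assumptions: the operator `L` is a hypothetical stable, nonnormal matrix, and the short-time propagator is approximated by a second-order Taylor expansion of the matrix exponential rather than computed exactly.

```python
import numpy as np

# Hypothetical operator with decaying eigenmodes (eigenvalues -1 and -2).
L = np.array([[-1.0, 10.0],
              [0.0, -2.0]])

# Instantaneous optimals: eigenvectors of the Hermitian matrix L + L^H.
# eigh returns eigenvalues in ascending order, so the last one is the
# largest instantaneous growth rate.
growth_rates, optimals = np.linalg.eigh(L + L.conj().T)

# The Rayleigh quotient (5) of any vector lies between the extreme eigenvalues.
rng = np.random.default_rng(0)
g = rng.standard_normal(2)
rate = (g @ (L + L.T) @ g) / (g @ g)

# Short-lead-time check of Phi^H Phi ~ I + t (L + L^H).
t = 1e-4
Phi = np.eye(2) + t * L + 0.5 * t**2 * (L @ L)  # Taylor approximation of exp(L t)
lhs = Phi.conj().T @ Phi
rhs = np.eye(2) + t * (L + L.conj().T)

print("instantaneous growth rates:", growth_rates)
print("max deviation at short lead time:", np.abs(lhs - rhs).max())
```

The largest growth rate is positive here although every eigenmode of `L` decays, which is possible only because this `L` is nonnormal.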
Since 𝗟 is independent of time, the instantaneous optimals give bounds on the growth rate not only at zero lead time, but at any instant in a perturbation's evolution. This fact reveals an important property of instantaneous optimals: they define the complete subspace of perturbations capable of extracting net energy from the mean flow—any other perturbation orthogonal to this subspace must decay.
If all instantaneous optimals decay, then all finite-time singular vectors decay, regardless of lead time. To see this, suppose all instantaneous optimals decay for a given 𝗟. In this case, Ė is negative for all perturbations at all times, so amplitude decreases monotonically and only decaying solutions exist. Conversely, if at least one instantaneous optimal grows, then a growing finite-time singular vector also exists, at least for sufficiently short lead times. Finally, if 𝗟 supports unstable eigenmodes, then a growing instantaneous optimal must also exist. Furthermore, the unstable eigenmode must project on the growing instantaneous optimals. To see this, note that an unstable eigenmode must have positive energy growth rate (5), which can occur only if the projection on the growing instantaneous optimals weighted by the growth rates dominates the projection on all decaying instantaneous optimals. This result implies that unstable modes and growing instantaneous optimals share similar structures, so separating the role of these perturbations in turbulent flows can be problematic. It is probably worth noting that the above results do not generally hold for time-dependent 𝗟: if the basic state changes in time, a singular vector could decay initially but grow beyond some lead time.
3. Main result
Nothing in the above procedure for obtaining optimal perturbations guarantees that the resulting perturbations actually grow in the observed, time-mean state of a turbulent flow nor that they are a significant component of turbulence. To address these issues in a general framework, a concrete turbulence model must be defined. In this section, we define a general class of turbulence models; then we show that the equations linearized about the time-mean state must support growing optimal perturbations.
We consider only discretized fluid dynamical equations, which, following Lorenz (1963a), can be written (under suitable boundary conditions) in the form

$$\dot{g}_i = \sum_{j,k} N_{ijk}\, g_j g_k + \sum_j D_{ij}\, g_j + c_i, \tag{6}$$
where c is a vector associated with forcing, 𝗗 is a matrix associated with damping, and $N_{ijk}$ is associated with nonlinear interactions. (A minor technicality in geophysical flows is that 𝗗 also arises from advection of planetary vorticity, the so-called “beta effects,” but the associated operator is skew symmetric and hence does not affect energy.) A crucial property of the above system is that it conserves energy in the absence of forcing and dissipation. As Lorenz showed, this property ensures that the solutions of (6) are bounded. This class of turbulence models includes the incompressible Navier–Stokes equation and the quasigeostrophic equations but not models that contain nonlinear terms beyond second order (such as those that include moist processes and/or Richardson-number-dependent flux parameterizations). Thus, our conclusions drawn from (6) are intended to be interpreted as suggestive properties of classical shear flows.
For any coordinate system, energy can be written as $E = g_t^H \mathsf{M} g_t$, where 𝗠 is a positive definite matrix. The variable $g_t$ can be transformed into a new variable, $\tilde{g}_t$, such that energy is the Euclidean norm $E = \tilde{g}_t^H \tilde{g}_t$. Interestingly, it can be shown that $\tilde{g}_t$ still satisfies an equation of the same form as (6), but with suitably modified c, 𝗗, and $N_{ijk}$. Thus, without loss of generality, we choose a coordinate system such that $E = g_t^H g_t$; that is, 𝗠 = 𝗜.
The fact that $g_t^H g_t$ is conserved in the absence of forcing and dissipation implies

$$\sum_{i,j,k} N_{ijk}\, g_i g_j g_k = 0 \tag{7}$$
for all vectors $g_t$. Because only the part of $N_{ijk}$ that is symmetric in j and k contributes to (6), we may take $N_{ijk} = N_{ikj}$. Furthermore, by assuming $g_t$ to be nonzero only at one, two, or three arbitrarily selected components, we may prove that the energy conservation law (7) implies that $N_{ijk}$ vanishes if any two indices are equal, and that $N_{ijk} + N_{jki} + N_{kji} = 0$ for any three indices i, j, k.
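These index properties can be checked directly for a concrete low-order system. The sketch below uses the quadratic terms of the three-variable Lorenz convection model as a hypothetical example of (6); the variable ordering (x, y, z) and the even splitting of each product between the two symmetric entries are choices made here for illustration.

```python
import numpy as np

# Quadratic terms of the three-variable Lorenz model: dy/dt contains -x*z
# and dz/dt contains +x*y. Each product is split evenly between two entries
# so that N[i, j, k] = N[i, k, j].
N = np.zeros((3, 3, 3))
N[1, 0, 2] = N[1, 2, 0] = -0.5
N[2, 0, 1] = N[2, 1, 0] = 0.5

def nonlinear_energy_tendency(g):
    """Evaluate sum_ijk N_ijk g_i g_j g_k, the left-hand side of (7)."""
    return np.einsum("ijk,i,j,k->", N, g, g, g)

# Energy conservation by the nonlinear terms, tested on random states.
rng = np.random.default_rng(0)
samples = rng.standard_normal((5, 3))
tendencies = np.array([nonlinear_energy_tendency(g) for g in samples])

# Symmetry in the last two indices.
symmetric = np.allclose(N, np.einsum("ikj->ijk", N))

# The combination N_ijk + N_jki + N_kji for every index triple.
cyclic = N + np.einsum("jki->ijk", N) + np.einsum("kji->ijk", N)

# Entries with a repeated index, and the trace-like sum over N_iik.
repeated = [N[i, j, k] for i in range(3) for j in range(3) for k in range(3)
            if len({i, j, k}) < 3]
partial_trace = np.einsum("iik->k", N)

print("max |energy tendency|:", np.abs(tendencies).max())
print("max |cyclic sum|:", np.abs(cyclic).max())
```

For this particular model every quantity printed is zero to rounding error, and `partial_trace` vanishes, which is the property used later to equate the traces of 𝗟 and 𝗗.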
In quasigeostrophic systems, under certain boundary conditions, the nonlinear terms also conserve a second quadratic quantity called potential enstrophy. Enstrophy can be written as $S = g_t^H \mathsf{Z} g_t$, where 𝗭 is a positive definite matrix. By arguments similar to those used above, the variable $g_t$ may be transformed such that enstrophy equals a Euclidean norm and the governing equations are isomorphic to (6). Thus, for every property derived from energy conservation, an analogous property can be derived from enstrophy conservation when the latter holds. Also, the quasigeostrophic equations conserve any power of potential vorticity, but this conservation property holds only in the continuous limit, corresponding to an infinite set of equations. The models in this paper are finite dimensional and hence do not satisfy all conservation principles of the original equations.
If the flow is in statistical equilibrium, then it can be decomposed into a time mean and transient component. Let the mean be represented by 〈g〉 and deviations therefrom be $g'_t$. Substituting $g_t = \langle g \rangle + g'_t$ into the full equation (6) gives
The operator 𝗟 is the dynamical operator linearized about the time-mean flow and, hence, is constant in time. The trace of 𝗟 equals the trace of 𝗗 since Nijk vanishes if any two indices are equal. It follows that the trace of 𝗟 is negative since 𝗗 is negative definite owing to its association with dissipative processes.
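The eddy equation and its budget for a quadratic quantity follow from (6) in a few steps, which can be sketched as below. This is a reconstruction from (6) rather than a copy of the original displays, and it rests on assumptions that should be read as such: g is taken to be real valued, $N_{ijk} = N_{ikj}$, and the budget is written for a general quadratic measure $g'^H \mathsf{M} g'$ with symmetric 𝗠 so that energy (𝗠 = 𝗜) and potential enstrophy (𝗠 = 𝗭) can be treated together.

```latex
% Time mean of (6), using <g'> = 0:
0 = \sum_{j,k} N_{ijk}\left(\langle g_j\rangle\langle g_k\rangle
      + \langle g'_j g'_k\rangle\right)
    + \sum_j D_{ij}\langle g_j\rangle + c_i .

% Subtracting the mean equation from (6) gives the eddy equation,
% with L the operator linearized about the time-mean state:
\dot{g}'_i = \sum_j L_{ij}\, g'_j
    + \sum_{j,k} N_{ijk}\left(g'_j g'_k - \langle g'_j g'_k\rangle\right),
\qquad
L_{ij} = D_{ij} + 2\sum_k N_{ijk}\langle g_k\rangle .

% Budget for the quadratic quantity g'^H M g':
\frac{d}{dt}\left(g'^{H}\mathsf{M}\,g'\right)
  = g'^{H}\left(\mathsf{M}\mathsf{L} + \mathsf{L}^{H}\mathsf{M}\right)g'
  + 2\sum_{l,i,j,k} g'_l M_{li} N_{ijk}\, g'_j g'_k
  - 2\sum_{l,i,j,k} g'_l M_{li} N_{ijk}\,\langle g'_j g'_k\rangle .
```

In this form the second term on the right is third order in the eddies, and the last term is linear in $g'$ and therefore vanishes under a time average. The trace statement also follows directly: $\sum_i L_{ii} = \sum_i D_{ii} + 2\sum_{i,k} N_{iik}\langle g_k\rangle$, and the last sum is zero when $N_{ijk}$ vanishes for repeated indices.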
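For quasigeostrophic systems, the nonlinear interaction terms conserve energy and potential enstrophy, which we denote by $E = g_t^H g_t$ and $S = g_t^H \mathsf{Z} g_t$. Thus, the third-order terms (11) vanish in the case 𝗠 = 𝗜 (energy) and 𝗠 = 𝗭 (potential enstrophy). Under a time average, the left-hand side of (10) vanishes and the linear terms in (10) vanish, leaving the following equilibrium conditions:

$$\left\langle g'^H \left(\mathsf{L} + \mathsf{L}^H\right) g' \right\rangle = 0, \qquad \left\langle g'^H \left(\mathsf{Z}\mathsf{L} + \mathsf{L}^H \mathsf{Z}\right) g' \right\rangle = 0. \tag{12}$$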
These equations state the standard result that the energy and potential enstrophy extracted from the time-mean flow must balance the losses by the explicit damping.
To solve $\langle g'^H (\mathsf{L} + \mathsf{L}^H) g' \rangle = 0$, we note that $\mathsf{L} + \mathsf{L}^H$ is Hermitian, and hence its eigenvectors form a complete, orthonormal basis set. If the eddy state $g'_t$ is represented by a linear combination of the eigenvectors, then the expansion coefficient for the kth eigenvector $u_k$ is $a_k = u_k^H g'_t$. In this coordinate system, the first equation in (12) becomes

$$\sum_k \lambda_k \left\langle |a_k|^2 \right\rangle = 0, \tag{13}$$
where $\lambda_k$ denotes an eigenvalue of $\mathsf{L} + \mathsf{L}^H$. Assuming that $\mathsf{L} + \mathsf{L}^H$ does not have vanishing eigenvalues, the above equation has nontrivial solutions only if the matrix $\mathsf{L} + \mathsf{L}^H$ has at least one positive and one negative eigenvalue, and if the associated projection coefficients are nonzero. Furthermore, condition (13) constitutes a constraint on the amplitude variance of the projection coefficients. Recall that the eigenvectors of $\mathsf{L} + \mathsf{L}^H$ are the instantaneous optimals. Moreover, owing to the negative trace of 𝗟, the matrix $\mathsf{L} + \mathsf{L}^H$ always has at least one negative eigenvalue. It follows that (13) can be satisfied only if there exists a growing instantaneous optimal, and the eddies project on it.
The above considerations establish the following necessary conditions for the dynamical system (6) to have nontrivial, statistically steady solutions: 1) the dynamical operator linearized about the time-mean solution must possess at least one growing instantaneous optimal, and 2) the perturbations must project, on average, onto the growing instantaneous optimals with amplitudes sufficient to balance the energy decay by all other eddies. Exactly the same reasoning can be applied to the second balance condition (12) to establish analogous necessary conditions for potential enstrophy. These necessary conditions are exact and independent of any closure theory.
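The two necessary conditions can be illustrated with a short simulation. The sketch below is an illustration under stated assumptions, not a calculation from this paper: it uses the three-variable Lorenz convection model with the standard parameter values 10, 28, and 8/3, shifts the third variable by a constant so that the linear damping matrix has a negative definite symmetric part as in (6), integrates with a fixed-step fourth-order Runge–Kutta scheme, and replaces the time mean by a finite sample mean. All function names and run lengths are hypothetical choices.

```python
import numpy as np

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0

def tendency(g):
    """Lorenz model with z shifted by R + SIGMA, which puts it in the form (6)."""
    x, y, z = g
    return np.array([
        SIGMA * (y - x),
        -SIGMA * x - y - x * z,
        x * y - B * z - B * (R + SIGMA),
    ])

def rk4_step(g, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = tendency(g)
    k2 = tendency(g + 0.5 * dt * k1)
    k3 = tendency(g + 0.5 * dt * k2)
    k4 = tendency(g + dt * k3)
    return g + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def linearized_operator(mean):
    """Jacobian of the tendency evaluated at the (sample) time-mean state."""
    mx, my, mz = mean
    return np.array([
        [-SIGMA, SIGMA, 0.0],
        [-SIGMA - mz, -1.0, -mx],
        [my, mx, -B],
    ])

def simulate(n_spinup=2000, n_steps=20000, dt=0.01):
    """Discard a spin-up period, then store the trajectory."""
    g = np.array([1.0, 1.0, -20.0])
    for _ in range(n_spinup):
        g = rk4_step(g, dt)
    traj = np.empty((n_steps, 3))
    for n in range(n_steps):
        traj[n] = g
        g = rk4_step(g, dt)
    return traj

traj = simulate()
mean = traj.mean(axis=0)
eddies = traj - mean

L = linearized_operator(mean)
lam, U = np.linalg.eigh(L + L.T)        # growth rates and instantaneous optimals
a = eddies @ U                          # projection coefficients a_k
variance = (a**2).mean(axis=0)          # <|a_k|^2>

budget = lam * variance                 # individual terms of (13)
growth = budget[lam > 0].sum()          # gain from growing optimals
decay = budget[lam < 0].sum()           # loss from decaying optimals

print("growth rates of instantaneous optimals:", lam)
print("energy gain:", growth, " energy loss:", decay)
print("relative residual of (13):", abs(growth + decay) / growth)
```

With a finite record the residual of (13) is not exactly zero; it shrinks as the averaging period grows, because the time average of the eddy energy tendency is bounded by the difference between the final and initial eddy energies divided by the record length.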
The above conclusion is independent of the norm used to measure the initial amplitude. This follows from the fact that the conclusion is based on the sign of $g_t^H (\mathsf{L} + \mathsf{L}^H) g_t$, which is not altered by normalizing the amplitude of $g_t$.
If 𝗟 supports only decaying eigenmodes (i.e., all of its eigenvalues have negative real part), then 𝗟 must be nonnormal in stationary turbulence, since $\mathsf{L} + \mathsf{L}^H$ can have positive eigenvalues for stable 𝗟 only if 𝗟 is nonnormal. The situation is not so clear when 𝗟 supports unstable eigenmodes, since in this case 𝗟 need not be nonnormal to satisfy (13). Indeed, since growing instantaneous optimals must project on unstable eigenmodes, separating the role of these perturbations in turbulent flows can be problematic.
Many nonlinear systems exhibit not only turbulence, but also regular oscillations, called vacillations (Lorenz 1963b). In the above considerations, the only assumed properties of turbulence were that the eddies exist (i.e., have nonzero amplitude) and were in statistical equilibrium. Vacillation cycles trivially satisfy these assumptions since the tendency of any periodic quantity vanishes when averaged over a complete period. It follows then that instantaneous optimals defined with respect to the time-mean flow must exist and constitute a significant component of vacillation cycles too.
In the absence of forcing and dissipation, fluid dynamical equations are time reversible; that is, they are invariant with respect to the transformation t → −t and associated transformations on velocities and rotation parameters. In this case, time reversibility implies that, for every growing perturbation, there exists a corresponding decaying perturbation that decays at precisely the same rate. Adding dissipation to the governing equations breaks the time reversible symmetry and (usually) produces a preponderance of decaying perturbations. For sufficiently strong dissipation, no eddies grow. This raises the question as to whether the atmosphere resides in the weakly damped case, the strongly damped case, or somewhere in between.
A survey of the literature does not yield a decisive answer to the above question. Ehrendorfer and Errico (1995) computed the finite-time singular vectors for a realistic mesoscale model. They found that many optimals were associated with adjustment processes, which may not be representative of the perturbations that arise naturally in a nonlinear simulation. If, instead, the energy of rotational modes were used for the norm, the number of growing perturbations was less than 3%. Unfortunately, this conclusion does not pertain precisely to instantaneous optimals since the optimization time was 24 hours. Also, mesoscale models are not of the form (6), due to the higher-order nonlinear terms in the parameterization of convection and radiation, and in vertical advection in sigma coordinates. Reynolds and Palmer (1998) computed the 48-h singular vectors for a three-layer quasigeostrophic model and found approximately half of the singular vectors grew, corresponding to the weakly damped case. However, the basic state was not the time-mean flow of the fully nonlinear system. To our knowledge, no clean example seems to exist in the literature. A systematic investigation of instantaneous optimals in fully developed turbulence will be reported in a future paper.
4. Extension to time-dependent flows
We now briefly describe the extension of the above methodology to time-dependent basic states. To avoid confusion, we define a new eddy variable, ĝ, as the difference between the state and an arbitrary basic state g∗, where g∗ may depend on time:

$$\hat{g} = g - g^{*}. \tag{14}$$
Substituting this equation into (6) gives
where we have used the concise notation
This equation may be written as
The fact that energy is conserved in the absence of forcing and dissipation implies
Owing to this relation, the exact energy equation for the eddies is
If the basic state g∗ is an exact nonlinear solution of (6), then Γ vanishes, and
where 𝗟∗ is the tangent linear operator. This equation gives the total energy of perturbations about an arbitrary nonlinear trajectory. Since the operator 𝗟∗ depends on time, the instantaneous optimals could decay or grow depending on lead time.
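One way to see how (21) arises is to carry the substitution through explicitly. The sketch below is a reconstruction consistent with the text, not a copy of the original displays. It assumes real-valued g and $N_{ijk} = N_{ikj}$, and it takes Γ to be the residual of (6) evaluated at the basic state, which is an assumption about the notation.

```latex
% Substituting g = g^* + \hat{g} into (6):
\dot{\hat{g}}_i = \sum_j L^{*}_{ij}\,\hat{g}_j
    + \sum_{j,k} N_{ijk}\,\hat{g}_j \hat{g}_k + \Gamma_i ,

% with the operator linearized about g^* and the residual of (6) at g^*:
L^{*}_{ij} = D_{ij} + 2\sum_k N_{ijk}\, g^{*}_k ,
\qquad
\Gamma_i = \sum_{j,k} N_{ijk}\, g^{*}_j g^{*}_k
    + \sum_j D_{ij}\, g^{*}_j + c_i - \dot{g}^{*}_i .

% Energy conservation by the nonlinear terms removes the cubic term, so
\frac{d}{dt}\left(\hat{g}^{H}\hat{g}\right)
  = \hat{g}^{H}\left(\mathsf{L}^{*} + \mathsf{L}^{*H}\right)\hat{g}
  + 2\,\hat{g}^{H}\Gamma .

% If g^* satisfies (6) exactly, then \Gamma = 0 and
\frac{d}{dt}\left(\hat{g}^{H}\hat{g}\right)
  = \hat{g}^{H}\left(\mathsf{L}^{*} + \mathsf{L}^{*H}\right)\hat{g} .
```

The last line has the same structure as (4), except that the operator is evaluated along the nonlinear trajectory and therefore changes with time.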
The above equation has implications for forecast error growth. Forecast error may be defined as the difference between the “truth,” defined by some nonlinear trajectory, and the forecast. If total error energy amplifies, then (21) is positive, which requires that growing instantaneous optimals exist and dominate the error population. The full implications of this constraint cannot be appreciated without also considering the evolution of the error covariance matrix in the context of data assimilation, which involves discontinuous changes when observations are assimilated. These considerations lie outside the scope of the present paper.
Finally, we may define the basic state to be not merely a nonlinear solution, but a steady nonlinear solution. Such a solution is called a stationary point of the nonlinear equations. In this case, (21) still holds exactly, but the dynamical operator 𝗟∗ is now independent of time. If the statistics are stationary, then the left-hand side vanishes under a time average, leaving an equation of the same form as (12), but with 𝗟∗ being interpreted as the dynamical operator linearized about any stationary point of the governing equations. In this case, (21) implies that the eddies ĝ must project on the instantaneous optimals of all possible stationary points of the nonlinear equations. This situation might appear to be significant because the stationary points of a dynamical system can be computed prior to any simulation. Moreover, since the stationary states differ from the time-mean state, it might appear that the new equation provides additional constraints for the eddies. However, since ĝ has nonzero mean, the constraint that it must project on growing instantaneous optimals does not necessarily imply that eddy perturbations about the time mean also project onto the growing optimals. Rewriting the balance constraint in terms of the time mean and eddy, where ĝ = g′ + 〈g〉 − g∗, gives
Note that the right-hand side is independent of time. Further note that if g∗ is identified with the time mean, then the right-hand side vanishes and (12) is recovered. Thus, the above equation can be interpreted as a more general balance constraint. Nevertheless, we have examined the above equation in the special case of a Lorenz model, in which it yields three equations corresponding to the three stationary states, and found that it does not lead to any independent constraint that is not derivable from the mean equations and (12). We suspect that, in general, the above equation does not lead to independent constraints on turbulent eddies, though we have not been able to prove this conjecture.
5. Summary and discussion
This paper established some necessary conditions for statistically steady turbulence in a class of dynamical systems characterized by (at most) second-order nonlinear terms that conserve energy (or some other quadratic form). In particular, we proved that turbulence requires that the dynamical operator, linearized about the time-mean flow, support at least one growing instantaneous optimal. An instantaneous optimal is a perturbation that maximizes the instantaneous growth rate of energy. This result implies that both instantaneous optimals and finite-time singular vectors (under the energy norm) grow in all realizable turbulent systems. An analogous conclusion holds for potential enstrophy, when it is conserved in the absence of forcing and dissipation. This paper also showed that the instantaneous optimals must have amplitudes sufficient to balance the decay by all other eddies and, hence, must have physically significant amplitudes in the sense that they satisfy the balance condition (12).
The existence of growing instantaneous optimals implies the existence of growing, finite-time singular vectors. However, the amplitudes of the latter are not constrained as much as those of the former. At short lead times, the two perturbations are indistinguishable and hence obey the same constraints. At long lead times, however, finite-time singular vectors either grow or decay indefinitely, depending on the stability of the linearized dynamical operator and, hence, imply unrealistically large or small initial amplitudes to explain the observed eddy energy. These considerations suggest that the existence and significance of finite-time singular vectors at long lead times cannot be established.
A fundamental property of growing instantaneous optimals is that they span the complete subspace of all possible amplifying perturbations—any other perturbation orthogonal to this subspace can only decay. Hence, if no instantaneous optimal can grow, then no perturbation whatsoever can grow, and eddy energy must decay. This property is consistent with the fact that growing instantaneous optimals are the only perturbations capable of extracting (net) energy from the background flow.
Finite-time singular vectors sometimes are described as “unrealistic” in the sense that they do not resemble structures with significant variance (i.e., empirical orthogonal functions). Yet, the above results clearly establish that instantaneous optimals have physically significant amplitudes. The resolution of this paradox lies partly in the fact that instantaneous optimals generally have larger spatial scale than finite-time singular vectors, and partly in the fact that the optimals with large growth rates need relatively little energy to account for the observed eddy energy, owing to the inverse relation between amplitude and growth rate in (13).
The above results appear to address several criticisms about optimal perturbations. In contrast to finite-time singular vectors, growing instantaneous optimals do not involve an arbitrary lead time, and span a space that is independent of the norm used to measure the initial amplitude. The results also suggest that norms based on energy or potential enstrophy have a preferred status in turbulence theory, in the sense that the associated optimals are guaranteed to grow in all realizable turbulent flows and have physically significant amplitudes, in contrast to optimals for other norms that have no such a priori guarantees. Finally, an instantaneous optimal is a linear concept defined with respect to a time-mean flow. To some extent, then, instantaneous optimals justify the significance of linear concepts, even though turbulence is a nonlinear phenomenon, and justify the significance of the time-mean flow, even though the actual flow fluctuates in time. Although instantaneous optimals have been discussed in earlier research (e.g., Farrell and Ioannou 1994; Schmid and Henningson 2001, p. 189), the above fundamental properties do not seem to have been noted previously.
This paper showed that the instantaneous optimals of the tangent linear operator must play an important role in error growth in data assimilation, though the full implications of this result could not be examined. It should be recognized that the dynamics of errors at short lead times depend on the covariances of observation errors and the local characteristics of dynamical systems in phase space. This contrasts with the conclusion reached in this paper, which applies to the global characteristics of dynamical systems over long times. For instance, in the context of the three-variable Lorenz model, the theorem merely states that the attractor must have significant projection onto the growing instantaneous optimals of the Lorenz system. The theorem does not imply that the errors must have significant projection onto these same instantaneous optimals.
It should be recognized that the above results constitute a balance constraint for all instantaneous optimals and do not indicate how energy (or enstrophy) is partitioned among the optimals. It also should be recognized that the balance constraint gives little insight into how nonlinear interactions unfailingly excite instantaneous optimals. Some insight into these questions might be obtained from numerical simulations of turbulence.
It is a pleasure to thank D. Nolan, B. Kirtman, B. Fox-Kemper, and B. Cash for discussions regarding this work. I also would like to thank the anonymous reviewers for helpful suggestions. This research was supported by the NSF (ATM9814295), NOAA (NA96-GP0056), and NASA (NAG5-8202).
Corresponding author address: Timothy DelSole, Center for Ocean–Land–Atmosphere Studies, 4041 Powder Mill Rd., Suite 302, Calverton, MD 20705-3106.