## 1. Introduction

Gravity-wave-induced zonal momentum flux is believed on good evidence to account for most of the angular momentum transport into the mesosphere that is known to be crucial for many features of the observed global-scale circulation, especially in the summer mesosphere (e.g., Holton et al. 1995) and also in the upper stratosphere (e.g., Rosenlof 1996). Nonorographic sources need to be involved because orographic waves have critical layers near 20-km altitude (see Fig. 1) and cannot reach the mesosphere. One possible nonorographic source is Kelvin–Helmholtz (KH) shear instability above tropopause jets, forming pancake-shaped clear-air turbulence (CAT) layers with envelope scales of hundreds of meters in the vertical and tens of kilometers in the horizontal.

However, the horizontal wavelength of individual KH billows is typically far too short to reach the mesosphere or even to be emitted at all. As noted in the companion paper (Bühler and McIntyre 1999, hereafter Part II), waves long enough to reach the mesosphere, that is, long enough to avoid Doppler shifts up to the buoyancy frequency *N* and consequent back-reflection, need to have horizontal wavelength ≳30 km, implying that if KH emission is important, then it must be nonlinear emission on the envelope scale, somewhat as in the original suggestion by Fritts (1982, 1984) and Chimonas and Grant (1984).

Envelope-scale emission can occur either through nonlinear forcing of waves during the actively turbulent stage of the CAT layer, or during the collapse of the mixed or partially mixed layer that is left behind once the active turbulence has died away. A numerical study of the first mechanism based on nonlinear grouping events between individual KH billows has recently been undertaken by Scinocca and Ford (1999). The present paper focuses on the second mechanism for envelope-scale emission, namely, the mixed-layer collapse.

This is a formidable problem, and for tractability’s sake a severe idealization of it is studied. The problem is broken down into three stages, as indicated in Fig. 2. Stage A is the initial mixing event, that is, the rapid formation of a CAT layer in which the background gradients of potential temperature *θ*(*z*) and mean zonal wind *u*(*z*) are eroded, that is, weakened, by three-dimensional turbulent mixing. This provides the initial conditions for the linear wave emission problem studied in stage B. To allow the initial development of a slowly varying wave train, the mean shear must be neglected at this stage. The main subject of this paper is the analytic solution to this linear emission problem and a comparison of it with fully nonlinear two-dimensional numerical simulations of the same initial-value problem. The latter suggests that the linearization works surprisingly well, despite being far from strictly valid. In stage C the subsequent development of the wave train is studied using a ray-tracing scheme that takes into account mean shear and wave dissipation by viscous diffusion and infrared radiative damping. This is the subject of the companion paper (Part II). The overall result is that only waves with positive zonal pseudomomentum can reach the summer mesosphere and hence exert a one-signed, ratchetlike eastward force there, and that the average of this force due to a realistic ensemble of stratospheric CAT layers cannot safely be neglected for the global angular momentum budget.

## 2. Stage A: Formation of KH-induced CAT layer

A comprehensive model for KH and KH-induced CAT in the lower stratosphere would necessarily be very complicated and would also involve subtle fluid-dynamical issues such as the detailed structure of shear-induced gravity wave breaking. Therefore, only one very simple source mechanism of KH is considered in detail here, which we believe is characteristic of many naturally occurring situations. It involves a thin vertical layer of increased static stability (i.e., high buoyancy frequency *N*) that is rendered KH unstable as the result of transient, that is, temporary, tilting by large-scale motions, producing KH instability in the manner of the Thorpe (1973) tilted-tank laboratory experiments in which the tank was first tilted and then brought back to the horizontal. The large-scale tilting motion could itself be due to, for instance, passing large-scale gravity waves or quasi-horizontal vortices, but it is not necessary to consider a specific model for it.

It is a well-established, if surprising, fact that such tilting motion makes layers with high *N* the preferred sites of KH instability, despite the fact that for fixed background shear *u*_{z} the background Richardson number Ri = (*N*/*u*_{z})^{2} before the tilting is *larger,* and hence more stable, inside these layers. This is because the local shear that develops under large-scale tilting motion is not fixed but is proportional to the local *N*^{2} in the “tilting” situation. In turn, the local Richardson number becomes proportional to *N*^{−2}; that is, it becomes least stable in regions of strong stratification.^{1} This robust link between strong stratification and increased likelihood of KH is in accordance with many observations in the abovementioned tilted-tank experiments, and field studies such as the study of KH billows on the ocean thermocline in Woods (1968); also, e.g., Phillips (1977) and Fritts and Rastogi (1985).
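The scaling argument in the preceding paragraph can be sketched as a short worked equation. This is only a sketch under stated assumptions: the layer is tilted by a small, horizontally uniform angle, the shear present before the tilt is negligible, and the symbols γ (tilt angle) and Γ (its time integral) are introduced here for illustration and do not appear in the paper.

```latex
% Assumptions (illustrative): uniform small tilt angle \gamma(t), negligible
% initial shear; \gamma and \Gamma are symbols introduced for this sketch only.
\begin{align}
\left|\frac{\partial u_z}{\partial t}\right| &\approx N^2 \sin\gamma(t)
  && \text{(baroclinic generation of shear in the tilted layer)} \\
|u_z(t)| &\approx N^2\,\Gamma(t),
  \qquad \Gamma(t) \equiv \int_0^t \sin\gamma(t')\,dt' \\
\mathrm{Ri} &= \left(\frac{N}{u_z}\right)^{2} \approx \frac{1}{N^{2}\,\Gamma^{2}}
  \;\propto\; N^{-2}
\end{align}
```

For a given tilting history Γ, the layer with the largest *N* therefore reaches the lowest Ri first, which is the sense in which strongly stratified layers are the preferred sites of KH instability.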

It is now assumed that the KH billows eventually break down into three-dimensional small-scale turbulence via a sequence of secondary instabilities, whose exact details are not important for the purpose of this paper. It is further assumed that the three-dimensional turbulence achieves rapid mixing of potential temperature and horizontal velocity, eroding the mean gradients in both fields. When the transient large-scale tilt is approximately reversed, a well-mixed layer of decaying turbulence remains in which the background gradients of *θ*(*z*) and *u*(*z*) have been significantly eroded, this erosion being complete in the limiting case of perfect mixing that marks the outcome of a very strong mixing event. Although Ri values must be low during the tilt, we assume that they are relatively large afterward. Figure 3 shows a schematic of this mixed-layer scenario.

How well does this picture of the mixing process compare with field observations and numerical simulations? Observations such as those reported in Browning and Watkins (1970) show conspicuous evidence for the irreversible mixing of *θ* and the associated “rabbit-ear” structure of the *N*^{2} profile across the layer, which corresponds to homogenized values of *θ* in the core and to all the stratification being pushed into the edges where *θ* has to match up with the background temperature profile. Observations of velocity profiles before and after CAT events are much more scarce and are also difficult to interpret because of the unknown importance of time-dependent background shear in the vicinity of the CAT layer. For instance, the observations of Browning and Watkins (1970) showed little change in the velocity profile before and after the CAT event apart from some degree of overall smoothing.

Direct numerical simulations of KH and its turbulent breakdown into CAT are, at present, barely beginning to achieve a degree of complexity that makes them useful for comparison with real atmospheric events (e.g., Scinocca 1995; and a series of papers including Fritts et al. 1996 and Werne and Fritts 1999). Such simulations are still severely limited in Reynolds number and by the degree of three-dimensionality that they can achieve. Atmospheric CAT Reynolds numbers are at least two orders of magnitude larger than those in the simulations, and hence atmospheric CAT can be expected to be much more vigorous than the simulated turbulence. Lack of full three-dimensionality can lead to qualitatively different long-time behavior such as long-lived two-dimensional vortex rolls that do not break down, contrary to what is conspicuously observed in the atmosphere.

Such long-lived rolls would imply persistent mean shear in the layer. However, once three-dimensional turbulence sets in vigorously, then robust fluid-dynamical energy arguments predict that the mean shear must be eroded to supply the energy for the three-dimensional eddies. These robust arguments underpin, for instance, any theory based on turbulent eddy diffusivity for the mean velocity, provided only that the natural variation of eddy diffusivity across the vertical depth of the layer (i.e., large diffusivity in the well-mixed strongly turbulent core, small diffusivity near the stratified weakly turbulent edges) is taken into account. This leads to the prediction that direct numerical simulations at higher resolution and larger Reynolds numbers will eventually corroborate the assumption made here that potential temperature *θ* and horizontal velocity *u* are mixed in essentially the same way in a CAT layer.

## 3. Stage B: Analytic wave emission model

The general solution to the linear “Rossby adjustment” initial-value problem for a localized, compact initial disturbance embedded in an unbounded rotating Boussinesq system is derived. The compact initial conditions dissolve into two different flow components: propagating inertia–gravity waves and a residual flow in geostrophic balance. The solution is derived in detail for the most complicated three-dimensional rotating case, and, where appropriate, suitable modifications of the formulas to obtain other cases are indicated. Care is taken to obtain a form of the solution that is equally useful for calculating snapshots of the time evolution as well as for calculating the total asymptotic wave emission into different directions.

Sufficient detail is given for a numerical discretization of the solution using digital Fourier transforms (FTs). A nontrivial discretization difficulty arises for spectral modes with zero horizontal wavenumber, and appropriate limits have to be extracted carefully from the solution formulas to yield a robust discretization.

### a. General solution to linear initial-value problem

The linearized *f*-plane Boussinesq equations in a frame of reference moving with the local mean velocity are

$$\mathbf{u}_t + \mathbf{f}\times\mathbf{u} + \boldsymbol{\nabla}P = \sigma\,\hat{\mathbf{z}}, \tag{1}$$

$$\sigma_t + N^2 w = 0, \tag{2}$$

$$\boldsymbol{\nabla}\cdot\mathbf{u} = 0, \tag{3}$$

in which *x* and *y* are the horizontal coordinates, with *x* increasing eastward and *y* increasing northward, and in which **ẑ** is a vertical unit vector pointing upward in the direction of increase of the vertical coordinate *z*. The corresponding linearized velocity components are **u** = (*u, υ, w*), **f** = *f***ẑ** is the Coriolis parameter, *P* is the pressure in excess of the hydrostatic pressure divided by the constant Boussinesq reference density, *N* is the constant buoyancy frequency, and *σ* is the buoyancy acceleration. In the usual way, the no-divergence condition (3), together with suitable boundary conditions at infinity, determines **∇***P* uniquely, which effectively reduces the number of independent fields by one. Hence the initial-value problem is complete by posing initial conditions for the velocities and the buoyancy acceleration only, that is, by posing

$$\mathbf{u}(\mathbf{x},0) = \mathbf{u}_0(\mathbf{x}), \qquad \sigma(\mathbf{x},0) = \sigma_0(\mathbf{x}),$$

where **u**_{0} must satisfy (3). All initial fields are assumed to be localized in a finite region around the origin of the coordinate system, and they are also assumed to be proper disturbance fields in the sense that they must have zero mean value when integrated over the entire spatial domain.

For each wavenumber vector **k** = (*k, l, m*) in Fourier space the solution is described by the sum of three modes, namely, a steady mode in geostrophic balance and two time-dependent inertia–gravity-wave modes. The linear steady balanced mode for any given **k** can be extracted from the initial conditions in the usual way by “inverting” the potential vorticity (PV) of the initial fields, which is

$$Q \equiv \hat{\mathbf{z}}\cdot\boldsymbol{\nabla}\times\mathbf{u} + \frac{f\sigma_z}{N^2}$$

and which satisfies *Q*_{t} = 0. Because *Q* is time-independent, it is clear that only the steady balanced mode can be significant for *Q* once all the inertia–gravity waves have propagated away. Therefore, for each **k** the steady balanced mode is found by setting all time derivatives to zero in (1)–(3) and by assuming that the PV for this **k** is entirely due to the steady balanced mode. Denoting the balanced fields by a superscript *b*, this results in

[…]

where the hats denote the *n*-dimensional spatial FT pair:

[…] (9)

in which *g*(**x**) is any function of **x**.

The wave fields are denoted by a superscript *w,* that is,

$$(u^w,\ \upsilon^w,\ w^w,\ \sigma^w) \equiv (u - u^b,\ \upsilon - \upsilon^b,\ w,\ \sigma - \sigma^b). \tag{10}$$

The wave solution is expressed in terms of a single complex-valued function Φ(**k**) that allows treating all variables on an equal footing. In particular, the complex Φ neatly combines both real-valued wave modes, and |Φ|^{2} will turn out to be the relevant spectral density of wave energy. The definition of Φ is

[…] (12)

where *k*_{h} ≡ (*k*^{2} + *l*^{2})^{1/2} and *ω̂*(**k**) is the intrinsic frequency given by the positive branch of the Boussinesq inertia–gravity wave dispersion relation:

$$\hat\omega^2 = \frac{N^2 k_h^2 + f^2 m^2}{k_h^2 + m^2},$$

assuming *N* > |*f*|.

The FT of a real or purely imaginary function *g*(**x**) satisfies *ĝ*(**k**) = ±*ĝ**(−**k**), respectively, where the star denotes taking the complex conjugate. For instance, combining (12) and (16) at *t* = 0 gives *w*^{w} = ℜFT^{−1}{*ŵ*^{w}_{0} + *iω̂σ̂*^{w}_{0}/*N*^{2}}, and due to *ω̂*(**k**) = *ω̂*(−**k**) the second term results in a purely imaginary FT^{−1}, leaving only *w*^{w}_{0}.

The two-dimensional case is included by setting the respective wavenumber component to zero (e.g., *l* ≡ 0 if there is no *y* dependence) and using *n* = 2 in the definition of the FT pair in (9). Also, the nonrotating case is included by setting *f* = 0. The preceding analytic solution in terms of Φ can straightforwardly be evaluated numerically using digital FTs, except for modes with horizontal wavenumber *k*_{h} = 0. For these modes, it turns out to be vital for a robust and accurate numerical discretization that appropriate limiting values are carefully extracted from the preceding formulas, and this task is described in the appendix. Using these limiting values, it was sufficient to use 32 Fourier coefficients in each spatial direction to obtain convergence for the emission strength (i.e., convergence in the diagnostic 𝒫^{I}_{∗}).
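The PV inversion that extracts the balanced mode can be written out explicitly. The following is a sketch rather than the paper's own formulas: it assumes the standard linearized Boussinesq system restated in the first line, the PV definition *Q* ≡ **ẑ** · **∇** × **u** + *fσ*_{z}/*N*^{2}, and a Fourier sign convention in which spatial derivatives become multiplication by *i***k**; the balanced pressure is an auxiliary quantity that is eliminated.

```latex
% Assumed system: u_t + f z^ x u + grad P = sigma z^,  sigma_t + N^2 w = 0,  div u = 0.
% Assumed FT convention: d/dx -> ik, d/dy -> il, d/dz -> im.
\begin{align}
\partial_t(\upsilon_x - u_y) &= -f\,(u_x + \upsilon_y) = f\,w_z ,\\
\partial_t\!\left(\frac{f\sigma_z}{N^2}\right) &= -f\,w_z
  \quad\Longrightarrow\quad Q_t = 0 ,\\
% steady balance: all time derivatives set to zero
f\,\hat\upsilon^{b} &= ik\,\hat P^{b}, \qquad
f\,\hat u^{b} = -il\,\hat P^{b}, \qquad
\hat\sigma^{b} = im\,\hat P^{b}, \qquad
\hat w^{b} = 0 ,\\
\hat Q &= ik\,\hat\upsilon^{b} - il\,\hat u^{b} + \frac{f}{N^2}\,im\,\hat\sigma^{b}
        = -\frac{N^2 k_h^2 + f^2 m^2}{f N^2}\,\hat P^{b} ,\\
\hat u^{b} &= \frac{il\,N^2\,\hat Q}{N^2 k_h^2 + f^2 m^2}, \qquad
\hat\upsilon^{b} = -\frac{ik\,N^2\,\hat Q}{N^2 k_h^2 + f^2 m^2}, \qquad
\hat\sigma^{b} = -\frac{im f\,N^2\,\hat Q}{N^2 k_h^2 + f^2 m^2} .
\end{align}
```

Two quick checks on this sketch: for *f* = 0 the balanced buoyancy vanishes, and in the rotating two-dimensional case with PV due to *fσ*_{z}/*N*^{2} alone the wave part of the initial buoyancy is *σ̂*_{0} multiplied by the real factor *N*^{2}*k*^{2}/(*N*^{2}*k*^{2} + *f*^{2}*m*^{2}), which never exceeds unity.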

### b. Asymptotic wave emission

The solution derived in the last section describes in detail how compact initial conditions unfold into a residual (steady) balanced flow and a propagating train of inertia-gravity waves. Asymptotically, that is, after sufficiently long time, this wave train can simply be characterized in terms of a local wavenumber and a suitable local wave amplitude. It is now demonstrated how the function Φ(**k**) defined in (12) can be used to calculate these local wave train characteristics. This is useful for two reasons: first, it gives a precise answer to the question of how much wave activity is radiated in different directions; second, it provides the relevant initial conditions for ray-tracing schemes in which the subsequent propagation of the wave train under the influence of mean shear, radiative damping, etc., can be studied (cf. the companion paper).

Suppose that *s*(**x**, *t*) is a field with compact initial conditions and given in terms of a spectral amplitude *S*(**k**) as

$$s = \Re\,\mathrm{FT}^{-1}\{e^{-i\hat\omega t}\,S(\mathbf{k})\}.$$

If 𝒟(*t*) is a *time-varying* region in *n*-dimensional **x** space, the boundary of which is moving with the local group velocity **c**_{g} = **x**/*t,* then

[…]

where 𝒟_{k} is a *fixed* region in **k** space corresponding pointwise to 𝒟(*t*) through the group-velocity condition:

$$\mathbf{c}_g(\mathbf{k}) \equiv \frac{\partial\hat\omega}{\partial\mathbf{k}} = \frac{\mathbf{x}}{t}.$$

The same result holds with *s*^{2} replaced by the square of any of the wave fields defined in (10), and this allows evaluating the wave energy per unit mass contained in a region 𝒟(*t*) by replacing the **x** integral over 𝒟(*t*) of the wave energy density *E* by a **k** integral of a suitable spectral density over the corresponding region 𝒟_{k}:

[…]

where |Φ|^{2} is the spectral wave energy density. Global conservation of wave energy is included as a special case by letting the region cover the entire space. The initial disturbance energy, ℰ_{0}, can be written as the sum of the wave energy ℰ^{w} and the energy of the balanced flow ℰ^{b}. Specifically,

[…]

The asymptotic result remains valid if *E* is multiplied by a quantity that remains constant on group velocity rays. The local wavenumber vector **k** and the local frequency *ω̂*(**k**) are such quantities, and this allows the simple evaluation of the *x* component of horizontal pseudomomentum per unit mass that is contained in 𝒟(*t*) as

[…]

The pseudomomentum density *kE*/*ω̂* has the same sign as *k* and hence is not sign definite. This means that there is no obvious upper bound on the amount of pseudomomentum generated by given initial conditions.

Let 𝒫^{I}(*t*) denote the total amount of pseudomomentum found at time *t* upward and eastward of a compact source centered at the origin, that is,

[…] (26)

The asymptotic result then gives the limit of 𝒫^{I}(*t*) as *t* → ∞ as

[…] (27)

noting that *k* > 0 selects waves with group velocity (and phase velocity) directed eastward, and that *m* < 0 selects waves with positive vertical group velocity. A nondimensional quantity 𝒫^{I}_{∗} is defined by

[…]

where *H* is a length scale provided by the initial conditions. The quantity 𝒫^{I}_{∗} […]
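The statement that *k* > 0 and *m* < 0 select eastward and upward group velocity can be checked numerically. The sketch below assumes the standard Boussinesq inertia–gravity dispersion relation *ω̂*^{2} = (*N*^{2}*k*_{h}^{2} + *f*^{2}*m*^{2})/(*k*_{h}^{2} + *m*^{2}) and differentiates it analytically; the function names and the sample parameter values are illustrative choices and are not taken from the paper.

```python
import math


def intrinsic_frequency(k, l, m, N, f):
    """Positive branch of the assumed Boussinesq inertia-gravity dispersion relation."""
    kh2 = k * k + l * l              # squared horizontal wavenumber
    K2 = kh2 + m * m                 # squared total wavenumber (must be nonzero)
    return math.sqrt((N * N * kh2 + f * f * m * m) / K2)


def group_velocity(k, l, m, N, f):
    """Gradient of the intrinsic frequency with respect to (k, l, m).

    Undefined where the frequency vanishes (f = 0 together with k = l = 0).
    """
    omega = intrinsic_frequency(k, l, m, N, f)
    K2 = k * k + l * l + m * m
    horizontal = (N * N - omega * omega) / (omega * K2)
    vertical = -(omega * omega - f * f) / (omega * K2)
    return k * horizontal, l * horizontal, m * vertical


def radiates_upward_eastward(k, l, m, N, f):
    """True if both the zonal and the vertical group velocity are positive."""
    cgx, _, cgz = group_velocity(k, l, m, N, f)
    return cgx > 0.0 and cgz > 0.0


# Illustrative values only (assumed, in SI units).
N, f = 0.02, 1.0e-4
k, l, m = 2.0e-4, 0.0, -6.0e-3
print(radiates_upward_eastward(k, l, m, N, f))
```

Because *f* < *ω̂* < *N* for the propagating modes, the sign of the vertical group velocity is always opposite to the sign of *m*, and the sign of the zonal group velocity always matches the sign of *k*.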

### c. Mixed-layer initial conditions

The background gradients of buoyancy and zonal wind, *N*^{2} and −*N*/Ri^{1/2}, are eroded by the mixing, which determines *u*_{0} and *σ*_{0}, and the remaining velocity components *υ*_{0} and *w*_{0} are chosen to satisfy the continuity equation (3), that is, *υ*_{0y} + *w*_{0z} = −*u*_{0x}. In the three-dimensional case this does not determine *υ*_{0} and *w*_{0} uniquely, and an additional condition must be specified. We envisage that the turbulent mixing acts on the mean velocities like a comparatively rapid zonal force-pair that erodes the zonal mean shear. It is straightforward to show (e.g., Batchelor 1967) that in the limit of an impulsive zonal force (i.e., a force that acts over a time interval *t* ≪ 1/*N*), the (*υ*_{0}, *w*_{0}) response consists of irrotational (i.e., *υ*_{0z} = *w*_{0y}) two-dimensional motion in each *y–z* plane driven solely by a mass source–sink equal to −*u*_{0x}. This would lead to

[…] (29)

for *t* ≪ 1/*N.* However, for the flat, pancakelike initial conditions with small aspect ratio *h*/*b* considered here, a modified version can be shown to hold for longer time intervals *t* ≪ *b*/(*hN*). Significant values of *û*_{0} are then found only in regions of spectral space satisfying *k*_{h}/|*m*| ∼ *h*/*b* ≪ 1. This allows use of the hydrostatic approximation to (29), in which *O*(*k*^{2}_{h}/*m*^{2}) terms are neglected. This leads finally to

$$\hat\upsilon_0 = 0, \qquad \hat w_0 = -\frac{k\,\hat u_0}{m}.$$

The initial conditions in the (*x, y, z*) coordinate system are given by

[…] (31)

with the two-dimensional case obtained by setting *y* = 0. The mixing parameters *β* and *α* can range from 0 to 1 and measure the extent to which the respective background gradients have been eroded. The limiting case of perfect mixing at the center of the core corresponds to a value of 1. The Gaussian envelope is determined by the vertical and horizontal half-width scales *H* and *L.* The corresponding initial disturbance energy is

[…]

The vertical profile of *σ*_{0} (which is the same as *u*_{0} with a minus sign) is depicted in Fig. 4 for *β* = 1. The extent of the well-mixed core of the mixed region indicated by *h* and *b* in Fig. 3 must be related to envelope half-widths *H* and *L.* Assuming that *h*/*b* = *H*/*L* and defining *h* as the vertical distance between the two extremes of the initial disturbance profile in Fig. 4, the relation […] *H* can be used for that purpose. This leaves only the aspect ratio *H*/*L* to set. The choice *H*/*L* = *h*/*b* = […] is used throughout.

The remaining physical parameters are *N* and *f.* Using *N* to nondimensionalize the problem leaves open the choice of Prandtl’s ratio *f*/*N.* The following values for Prandtl’s ratio are used: *f*/*N* = […].

The mixing parameters *β* and *α* may vary independently in physical reality, but in the nondimensional linear problem only their ratio enters. This is because multiplying both parameters by the same number only multiplies all wave fields by that number, and in particular leaves 𝒫^{I}_{∗} unchanged. Moreover, *α, β,* and Ri can all be combined into a single parameter *α*/(*β*Ri^{1/2}). […]

In summary, the nondimensional diagnostic 𝒫^{I}_{∗} depends only on *α*/(*β*Ri^{1/2}), *f*/*N, H*/*L,* and the number of spatial dimensions. In comparison, the total pseudomomentum, as defined in (27), has some additional explicit dependence on the model parameters as follows. The total pseudomomentum has dimensions *H*^{4}*N* (or *H*^{3}*N* in two dimensions). Hence, if the size of the mixed region is doubled while the values of *f*/*N, H*/*L,* Ri, and *β, α* are kept fixed, then the total amount of pseudomomentum is increased by a factor of 16 (or 8 in two dimensions), highlighting the sensitive dependence of the pseudomomentum emission on layer size. Also, if *N* is doubled under the same conditions, then the total pseudomomentum is doubled. Finally, the total pseudomomentum is proportional to the squares of the mixing parameters, that is, doubling both *β* and *α* quadruples the total pseudomomentum.
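The scaling rules in this paragraph can be collected into one small helper. This is a minimal sketch: the function name and its arguments are hypothetical, and it only encodes the stated proportionalities (size to the fourth power in three dimensions or third power in two, linear in *N*, quadratic in a common rescaling of both mixing parameters), with *f*/*N, H*/*L,* and Ri held fixed.

```python
def pseudomomentum_scale_factor(size_ratio=1.0, n_ratio=1.0, mixing_ratio=1.0, dims=3):
    """Factor by which the total emitted pseudomomentum changes.

    size_ratio   -- factor applied to the size of the mixed region (H and L together)
    n_ratio      -- factor applied to the buoyancy frequency N
    mixing_ratio -- common factor applied to both mixing parameters (alpha and beta)
    dims         -- number of spatial dimensions, 2 or 3
    Assumes f/N, H/L and Ri are kept fixed, as in the text.
    """
    if dims not in (2, 3):
        raise ValueError("dims must be 2 or 3")
    size_exponent = 4 if dims == 3 else 3   # dimensions H^4 N (3D) or H^3 N (2D)
    return size_ratio ** size_exponent * n_ratio * mixing_ratio ** 2


# Doubling the layer size in three and in two dimensions.
print(pseudomomentum_scale_factor(size_ratio=2.0, dims=3))
print(pseudomomentum_scale_factor(size_ratio=2.0, dims=2))
```

The strong size dependence means that, in an ensemble of layers, the few largest events may dominate the total emitted pseudomomentum.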

### d. Model runs and upward–eastward pseudomomentum emission

The wave emission has been calculated in two and three dimensions and for the parameter combinations discussed above. It turns out that in all cases the upward–eastward pseudomomentum emitted due to joint velocity and density mixing is simply the sum of the upward–eastward pseudomomentum emitted due to either velocity or density mixing in isolation. In other words, the pseudomomentum fluxes due to velocity mixing and due to density mixing simply add. This is a nontrivial result, which depends on certain special (but quite reasonable) properties of the chosen initial conditions, as follows.

First, consider the spectral wave energy density |Φ|^{2}, defined by (12), in the nonrotating two-dimensional case. In this case all balanced fields are 0. For the chosen initial conditions in (31) the property *u*_{0} ∝ *σ*_{0} holds throughout the domain, and therefore (using *ŵ*_{0} = −*kû*_{0}/*m*) the complex FTs *û*_{0}, *ŵ*_{0}, and *σ̂*_{0} are real multiples of one another. As a result, |Φ|^{2} in (12) is given for each spectral wavenumber vector **k** as a sum of squares of initial velocity and density FTs, without cross-correlation terms between these velocity and density FTs. In other words, |Φ(**k**)|^{2} is simply a (weighted) sum of *α*^{2} and *β*^{2}.

A corresponding statement is trivially true for the rotating two-dimensional case as well as for the nonrotating three-dimensional case. In these cases the disturbance PV is due only to a single term, which is either *fσ*_{z}/*N*^{2} or −*u*_{y}, respectively. Using the expression for *σ̂*^{b}, it follows that *σ̂*^{w}_{0} = *σ̂*_{0} − *σ̂*^{b} is proportional to *σ̂*_{0} (in the *f* = 0 case simply because *σ*^{b} ≡ 0), with a real-valued factor of proportionality that is either less than or equal to unity, in the respective cases. Again, this implies that there are no velocity–density cross-correlation terms in |Φ(**k**)|^{2}.

On the other hand, in the rotating three-dimensional case the PV depends on a linear combination of velocity and density contributions, and this leads to an expression for *σ̂*^{w}_{0} that involves both *α* and *β.* Consequently, |Φ(**k**)|^{2} now includes a cross-correlation term ∝ −2*αβlmf.* Nevertheless, for initial conditions that factorize in the *y* coordinate (which includes those considered here) the contribution of this cross-term to an integral of |Φ|^{2} over all *y* wavenumbers *l* must vanish. This is because the cross-correlation term in (12) is then necessarily an odd function of *l,* as can be easily shown.

All of the above remains true if |Φ|^{2} is replaced by the spectral pseudomomentum density |Φ|^{2}*k*/*ω̂.* Both ℰ^{w} and the total pseudomomentum emitted upward–eastward involve an integral over all *l,* and hence both are simply additive over velocity and density mixing in all cases. One consequence of this is that ℰ^{w}/ℰ_{0} as well as 𝒫^{I}_{∗} […] *α*/(*β*Ri^{1/2}) […]

Consider now the numerical results for the ratio ℰ^{w}/ℰ_{0} = (ℰ_{0} − ℰ^{b})/ℰ_{0} in Table 1, which describes how ℰ_{0} is split between energy of the propagating gravity waves and energy of the balanced residual state. Basically, this ratio is diminished as the PV content of the initial conditions is increased, and vice versa. Hence, there is a general decrease of ℰ^{w}/ℰ_{0} as *f*/*N* is increased. In the two-dimensional cases ℰ^{w}/ℰ_{0} increases as the velocity mixing strength *α* is increased, which is because velocity does not contribute to the PV in these cases.

Consider now the spectral pseudomomentum density |Φ|^{2}*k*/*ω̂* in comparison with the spectral energy density |Φ|^{2}. A typical shape of |Φ|^{2}*k*/*ω̂* is illustrated by plots of |Φ|^{2}*k*/*ω̂* as a function of *k* and *m* for two cases, *f* = 0 on the left and *f* = *N*/100 on the right. The total pseudomomentum emitted upward–eastward is equal to the integral over the lower-right quadrant, in which *k* ≥ 0 and *m* ⩽ 0. Note that the pseudomomentum density takes both positive and negative values, which could not occur for the everywhere nonnegative energy density. The plots illustrate that the total pseudomomentum radiated upward (or downward) is 0, that is, initial conditions such as (31) emit a wave train with an eastward–westward symmetry. There is also a conspicuous difference between the rotating and nonrotating cases, namely, that in the rotating case (on the right) there is a well-defined maximum of the spectral pseudomomentum density in the interior of each quadrant at |*kH*| ≈ […], |*mH*| ≈ […], whereas in the nonrotating case there is a discontinuity at *k* = 0 that connects extreme values on either side of the discontinuity. This is due to the factor *k*/*ω̂,* which in the nonrotating case is discontinuous at *k* = 0, and in the rotating case goes to zero at *k* = 0, as illustrated in Fig. 6 and further discussed in the appendix.

Numerical results for 𝒫^{I}_{∗} are given in […]. They show that 𝒫^{I}_{∗} […] as *f*/*N* increases, and that, somewhat surprisingly, the differences between two-dimensional and three-dimensional values of 𝒫^{I}_{∗} are small except for *f* = 0. This can be compared with the values for ℰ^{w}/ℰ_{0} in Table 1, in which these differences are in many cases significant. This appears to be due to the factor *k*/*ω̂* […]

In summary, for *f*/*N* comparable with the aspect ratio *H*/*L,* there is little difference in 𝒫^{I}_{∗} between two and three dimensions. If *f*/*N* ≈ *H*/*L,* then, depending on the relative strength of the density and the velocity mixing, the values of 𝒫^{I}_{∗} […]. If *f*/*N* […] *H*/*L* […] 𝒫^{I}_{∗} […] *f* = 0, then the three-dimensional results must be used because the two-dimensional model greatly overpredicts the emission strength. The range of 𝒫^{I}_{∗} for *f* = 0 is […] *f*/*N* […] 𝒫^{I}_{∗} […]

### e. Sensitivity to small-scale details in the envelope structure

The Gaussian envelope structure of the initial conditions in (31) was chosen for its simplicity, and it is important to check whether the pseudomomentum emission could be sensitive to small-scale details of the envelope structure. Clearly, such small-scale details would be very hard to predict because they are likely to be, to some degree, dependent on the details of the actively turbulent phase of the CAT layer. The emitted wave energy ℰ^{w} is always bounded by ℰ_{0} and is hence insensitive to small-scale changes. However, as noted before, there is no such obvious bound for the emitted pseudomomentum 𝒫^{I}. It is the pseudomomentum that gives the wave-induced momentum flux, and hence its sensitivity needs to be considered.

The sensitivity is controlled by the behavior of the factor *k*/*ω̂,* and specifically by whether *k*/*ω̂* can become large when one of the components of **k** = (*k, l, m*) takes on large absolute values. Rotating and nonrotating cases need to be considered separately, and *f* = 0 is considered first. Waves with zonal wavenumber *k* larger than ≈2*π*/30 km will be back-reflected by the easterly jet before they reach the summer mesosphere (as shown in detail in Part II); hence only bounded values of *k* need to be considered. Large absolute values of meridional wavenumber *l* mean that *ω̂* ≈ *N* in *k*/*ω̂,* which remains bounded as long as *k* remains bounded. Hence, only waves with large absolute values of vertical wavenumber *m* need to be considered.

In this case, *k*/*ω̂* ≈ |*m*|*k*/(*k*_{h}*N*), which exhibits unbounded growth as |*m*| grows (cf. left panel in Fig. 6). This implies sensitivity of the pseudomomentum emission to vertical small-scale details, that is, to details of the vertical envelope shape. However, these large-|*m*| waves necessarily have low vertical group velocities, and hence spend a long time near their launch altitude where they are subject to dissipative processes such as radiative damping. Radiative damping acts very effectively on large-|*m*| waves (*provided* that background rotation is absent; this crucial condition is investigated in detail in Part II); hence these waves will be dissipated near their launch altitude and not reach the mesosphere.

The above points concerning the horizontal wavenumbers *k* and *l* apply equally in the rotating case *f* ≠ 0. In the rotating limit of large |*m*| the factor *k*/*ω̂* ≈ *k*/*f,* which is again bounded as long as *k* is bounded; hence there is no sensitivity of the pseudomomentum emission to vertical small-scale details in the rotating case. In summary, it can be concluded that the amount of pseudomomentum that reaches the summer mesosphere is probably quite insensitive to the small-scale details of the initial conditions.
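The two large-|*m*| limits of the factor *k*/*ω̂* discussed above can be compared against the full expression. The sketch below again assumes the standard dispersion relation *ω̂*^{2} = (*N*^{2}*k*_{h}^{2} + *f*^{2}*m*^{2})/(*k*_{h}^{2} + *m*^{2}); the function names and sample values are illustrative, and the rotating limit is written with |*f*| so that the frequency stays positive.

```python
import math


def k_over_omega(k, l, m, N, f):
    """Exact factor k / omega from the assumed dispersion relation."""
    kh2 = k * k + l * l
    omega = math.sqrt((N * N * kh2 + f * f * m * m) / (kh2 + m * m))
    return k / omega


def large_m_limit(k, l, m, N, f):
    """Leading-order form of k / omega for very large |m|.

    Nonrotating: |m| k / (k_h N), which grows without bound with |m|.
    Rotating:    k / |f|, which is bounded whenever k is bounded
                 (valid once |m| greatly exceeds N k_h / |f|).
    """
    if f == 0.0:
        return abs(m) * k / (math.hypot(k, l) * N)
    return k / abs(f)


# Illustrative values only (assumed, in SI units).
N, k, l = 0.02, 2.0e-4, 1.0e-4
for f, m in ((0.0, -0.5), (0.0, -1.0), (1.0e-4, -40.0), (1.0e-4, -80.0)):
    exact = k_over_omega(k, l, m, N, f)
    limit = large_m_limit(k, l, m, N, f)
    print(f, m, exact, limit)
```

In the nonrotating rows the factor doubles when |*m*| doubles, whereas in the rotating rows it has already saturated near *k*/|*f*|.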

## 4. Nonlinear numerical simulations

Linear theory is not formally valid in the early stage of the emission, when the wave train first develops from the mixed region. This is true especially in the limiting case of perfect mixing, in which isentropes acquire vertical slopes. For this reason, independent fully nonlinear simulations of the emission stage have been performed to test the accuracy of the linear theory. The simulations were two-dimensional, that is, their domain was restricted to the *x–z* plane, but a three-dimensional velocity field **u** = (*u, υ, w*) was considered. The flow fields are then *y* independent but, in the presence of nonzero background rotation *f,* the velocity component *υ* in the *y* direction is nontrivially coupled to the other components.

This allows, at reasonable computational cost, some form of dynamically significant nonlinear vortex stretching to take place, in addition to allowing nonlinear effects caused by the interplay of advection and buoyancy forces in the *x–z* plane. The latter effects may actually be stronger in two than in three dimensions, due to the comparatively less rapid amplitude decrease of the dispersing flow fields in two dimensions.

Various simulations with and without background rotation and with or without a balanced flow component have been performed, and good quantitative agreement with the linear theory could be demonstrated. Although these simulations provide a good test of the linear predictions, there remains one caveat, namely, that mean shear has again been neglected.

### a. Numerical model

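The numerical model integrates the nonlinear Boussinesq equations for the three-component velocity field **u** = (*u, υ, w*), that is,

[…]

in which all fields are *y* independent; **ẑ** is the upward unit vector; **∇** = (∂/∂*x,* 0, ∂/∂*z*); and *N, f,* and *ν* are constants.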

The vertical scale of the initial conditions *H* and the buoyancy frequency *N* were chosen as length and (inverse) timescales. The Reynolds number *H*^{2}*N*/*ν* was very large, of the order of 10^{6}, and hence the simulations were nearly inviscid. The resolution was always 400 × 400 grid points, and the (*x, z*) domain size was always (62.155, 6215.5)*H.* Outer boundary conditions were chosen as radiation conditions. This was achieved by applying a no-stress condition at the boundary, and by smoothly increasing Rayleigh damping in all model fields toward the outer boundaries of the computational domain. The simulations showed no sign of spurious back-reflection from the boundary.

### b. Pseudomomentum diagnostics

Consider Fig. 7, which shows a snapshot at a given time *t* of the nonlinear *σ* field in a particular simulation (case B in the summary below). Note that for comparison the corresponding linear solution using 256 × 256 spectral modes is also shown, indicating the good qualitative agreement in this case.

Now, the pseudomomentum radiation upward–eastward that has occurred up to time *t* is denoted by *P*_{rad}(*t*), and it can be estimated as the sum of two parts as follows. The first part is the pseudomomentum content at time *t* in the dashed–solid control rectangle located in the first quadrant of Fig. 7. The second part is the time-integrated (from the initial time up to time *t*) pseudomomentum flux across the two dashed control sides of this rectangle.

In terms of the pseudomomentum density p and its flux **F**, the diagnostic *P*_{rad}(*t*) is calculated as

$$P_{\mathrm{rad}}(t) = \iint_{\mathrm{rectangle}} \mathsf{p}\,dx\,dz \;+\; \int_{0}^{t} dt' \int_{\mathrm{dashed}} \mathbf{F}\cdot\mathbf{n}\,ds, \tag{43}$$

where the flux integral, in which **n** is the outward unit normal vector, is extended over the two dashed sides, which form *part* of the boundary of the control rectangle. This flux integration is further symbolized by the two arrows in Fig. 7, which highlight the fact that the flux integration leaves out the solid edges of the control rectangle. Clearly, in cases where linear theory is quantitatively applicable, *P*_{rad}(*t*) ≈ 𝒫^{I}(*t*), where 𝒫^{I}(*t*) is defined by the two-dimensional version of (26).
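The two-part diagnostic described above can be sketched numerically as follows. This is a minimal illustration, not the authors' code: it assumes that the dashed sides are the right and top edges of the control rectangle (so that the outward normals point in +*x* and +*z*), that fields are sampled on a uniform grid, and that fluxes are stored at equally spaced times. The function and argument names are hypothetical.

```python
import numpy as np

def p_rad_estimate(p_density, flux_right, flux_top, dx, dz, dt):
    """Estimate P_rad at the last stored time.

    p_density  : (nx, nz) pseudomomentum density inside the rectangle at time t
    flux_right : (nt, nz) x-component of F on the right (assumed dashed) side
    flux_top   : (nt, nx) z-component of F on the top (assumed dashed) side
    """
    # First part: pseudomomentum content of the control rectangle at time t.
    content = p_density.sum() * dx * dz
    # Outward flux rate through the two dashed sides at each stored time.
    rate = flux_right.sum(axis=1) * dz + flux_top.sum(axis=1) * dx
    # Second part: trapezoidal time integral from the initial time up to t.
    radiated = dt * (0.5 * rate[0] + rate[1:-1].sum() + 0.5 * rate[-1])
    return content + radiated

# Synthetic check with constant fields (illustrative values only).
p = np.ones((10, 10))
f_right = np.full((11, 10), 2.0)
f_top = np.full((11, 10), 1.0)
value = p_rad_estimate(p, f_right, f_top, dx=0.1, dz=0.1, dt=0.1)
```

Leaving the solid edges out of the flux sum mirrors the construction in the text: only pseudomomentum that leaves through the dashed sides is counted as radiated.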

There is, however, a problem that requires some discussion. Suitable nonlinear definitions for the pseudomomentum density p and its flux **F** in terms of the Eulerian flow variables *σ* and **u** are readily available in the nonrotating case *f* = 0 (e.g., Scinocca and Shepherd 1992). Here it is understood that “suitably defined” implies not only that the pseudomomentum is conserved, but also that the vertical pseudomomentum flux is equal to the leading-order wave-induced flux of zonal momentum in the usual way. However, it turns out that such a suitable Eulerian definition of pseudomomentum is not available in the rotating case *f* ≠ 0. This fact seems to be related to the essential need, in the rotating case, for additional Lagrangian particle displacement information in the sideways (*y*) direction in order to obtain a suitable pseudomomentum definition. This point is currently being investigated further by one of us (O. Bühler, 1999, unpublished manuscript).

Here p_{0} is the Eulerian pseudomomentum density of the nonrotating case *f* = 0 for uniform background fields (cf. Scinocca and Shepherd 1992), and *Q* is the PV in excess of the background PV divided by *N*^{2}, and hence *Q*_{t} + (**u** · **∇**)*Q* = 0.

Now, in all the numerical cases studied, p_{0} and the underlined parts of **F** in (45) and (46) have been used in (43) in order to calculate the diagnostic *P*_{rad}(*t*). It is straightforward to show that in the small-amplitude WKB limit both p_{0} and the underlined parts of **F** reduce to the standard WKB expressions *kE*/*ω̂* and **c**_{g}*kE*/*ω̂*, respectively, where **c**_{g} is the group velocity. Importantly, this is true in both rotating and nonrotating cases. Hence, provided that the dashed control edges in Fig. 7 lie in the far field where the flow has already dispersed into a small-amplitude slowly varying wave train, the underlined terms can be used as an approximation to **F**.
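For a single plane wave, the WKB expressions quoted above can be evaluated with the short Python sketch below. It assumes the standard two-dimensional Boussinesq inertia–gravity dispersion relation *ω̂*² = (*N*²*k*² + *f*²*m*²)/(*k*² + *m*²), which is not restated in this section, and takes the positive root; the function name and the sample wavenumbers are hypothetical.

```python
import numpy as np

def wkb_pseudomomentum(k, m, energy, N=1.0, f=0.0):
    """Return (p, F) with p = k*E/omega_hat and F = c_g * p for one plane wave.

    Assumes omega_hat**2 = (N**2 k**2 + f**2 m**2) / (k**2 + m**2), positive root.
    """
    kappa2 = k**2 + m**2
    omega = np.sqrt((N**2 * k**2 + f**2 * m**2) / kappa2)
    # Group velocity components (d omega/dk, d omega/dm) of that relation.
    cgx = k * m**2 * (N**2 - f**2) / (omega * kappa2**2)
    cgz = -(k**2) * m * (N**2 - f**2) / (omega * kappa2**2)
    p = k * energy / omega
    return p, np.array([cgx * p, cgz * p])

# Eastward (k > 0), upward-propagating (m < 0) wave in the nonrotating case.
p, F = wkb_pseudomomentum(k=1.0, m=-1.0, energy=1.0)
```

With *k* > 0 and *m* < 0 both flux components are positive, which corresponds to the upward–eastward radiation measured across the dashed control sides.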

Near the origin, on the other hand, p_{0} is not strictly conserved if *υQ* ≠ 0 [cf. (49)], and the use of p_{0} as the pseudomomentum density is hence only approximate. Still, the balanced part of p_{0} is clearly zero by inspection (because both *u*^{b} and *w*^{b} are zero even in nonlinear balance); and hence the approximation becomes better with increasing time *t.* This is because more and more waves will by then have propagated away from the initial seat of the disturbance, where *Q* might be nonzero. This is corroborated by case E of the numerical simulations.

### c. Run parameters and results

Six nonlinear runs have been performed. The same initial conditions as in the linear theory were used, that is, the initial fields were given by (31) with *y* = 0, *H*/*L* = 0.01, and with varying mixing and rotation parameter settings that are collected in Table 3. The notional background Richardson number has been set to Ri = 1 for simplicity. However, as in the linear theory, no mean shear has actually been present in the simulations.

The density mixing parameter *β* ranged between 0 and 1, and the velocity mixing parameter *α* was either 0 or chosen such that the minimum local Richardson number in the center of the mixed region [which is given there by (1 − *β*)/*α*^{2}] was equal to a quarter. This was done in order to achieve maximal initial disturbance amplitudes while avoiding unstable initial conditions. Only cases E and F had nonzero background rotation, and only case E had nonzero potential vorticity.
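The constraint on *α* can be made concrete with a small Python sketch that solves (1 − *β*)/*α*² = ¼ for *α*. Only the formula comes from the text; the function names, the choice of the positive root, and the sample value of *β* are illustrative assumptions and are not taken from Table 3.

```python
import math

def min_local_richardson(alpha, beta):
    """Minimum local Richardson number (1 - beta) / alpha**2 at the layer center."""
    if alpha == 0.0:
        return math.inf  # no velocity disturbance, hence no local shear
    return (1.0 - beta) / alpha**2

def alpha_for_min_richardson(beta, ri_min=0.25):
    """Positive alpha for which the minimum local Richardson number equals ri_min."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is expected to lie between 0 and 1")
    return math.sqrt((1.0 - beta) / ri_min)

# Hypothetical example: beta = 0.75 gives alpha = 1.
alpha = alpha_for_min_richardson(0.75)
```

Any larger |*α*| at the same *β* would push the local Richardson number below a quarter, which is the sense in which this choice maximizes the initial amplitude while avoiding unstable initial conditions.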

The results for *P*_{rad}(*t*) are collected in Fig. 8. The corresponding linear estimates from the previous section are in all cases indicated by a broken line. In all cases with zero *Q*, *P*_{rad}(*t*) quickly converges to an asymptotic value close to the linear estimate. On the other hand, in case E the convergence is markedly slower and also involves stronger oscillations. It has been checked that these strong oscillations are due to slowly decaying correlations between the balanced, steady part of *σ* and the wavelike, unsteady part of *u*_{z} − *w*_{x}. These correlations affect p_{0} as defined in (44), and hence the first part of (43). It has also been checked that these oscillations occur equally when diagnosing a time-dependent linear solution.

The agreement between the linear and nonlinear estimates for the emitted pseudomomentum has been remarkably good, suggesting an error margin of perhaps 20%. This provides a good check on the use of linear theory for the emission stage, if, as noted before, the effect of mean shear can indeed be neglected.

## 5. Concluding remarks

A simple linear model for the emission of inertia–gravity waves by shear-generated mixed layers of CAT in the lower stratosphere has been formulated. The solution to this mixed-layer linear initial-value problem in two and three dimensions has been derived, with due care over numerical details in order to achieve a robust discretization of it in both the rotating and the nonrotating cases, the latter being the relevant case in equatorial regions.

The total pseudomomentum emission upward–eastward from a mixed layer has then been computed over a range of parameters (as summarized in Table 2), and these results have been partially cross-checked by independent nonlinear numerical simulations that showed very good agreement with the linear theory. The formulated model therefore appears to capture most of the relevant features of the emission problem, with the possible exception of strong nonlinear effects associated with strong mean shear on the scale of the mixed region. It is conceivable that nonlinear effects due to strong shear might significantly disrupt the formation of the inertia–gravity wave train, with unclear implications for the resulting pseudomomentum fluxes. Hence, the impact of strong shear on mixed-layer emission remains a fruitful topic of further research, which will, however, require techniques completely different from the ones used here.

The companion paper (Part II) takes up the question of vertical propagation of the emitted wave spectrum to the mesosphere.

## Acknowledgments

OB thanks the Gottlieb Daimler and Karl Benz foundation in Germany and the U.K. Natural Environment Research Council (NERC) for research studentships, and NERC and the U.K. Engineering and Physical Sciences Research Council (EPSRC) for further postdoctoral funding under Grants GR9/01907 and GR/K75583, respectively. MEM thanks the EPSRC for generous support in the form of a senior research fellowship.

## REFERENCES

Andrews, D. G., and M. E. McIntyre, 1978a: An exact theory of nonlinear waves on a Lagrangian-mean flow. *J. Fluid Mech.,* **89,** 609–646.

——, and ——, 1978b: On wave-action and its relatives. *J. Fluid Mech.,* **89,** 647–664; Corrigendum, **95,** 796; also **106,** 331.

Batchelor, G. K., 1967: *An Introduction to Fluid Dynamics.* Cambridge University Press, 615 pp.

Browning, K. A., and C. D. Watkins, 1970: Observations of clear air turbulence by high power radar. *Nature,* **227,** 260–263.

Bühler, O., and M. E. McIntyre, 1999: On shear-generated gravity waves that reach the mesosphere. Part II: Wave propagation. *J. Atmos. Sci.,* **56,** 3764–3773.

Chimonas, G., and J. R. Grant, 1984: Shear excitation of gravity waves. Part II: Upscale scattering from Kelvin–Helmholtz waves. *J. Atmos. Sci.,* **41,** 2278–2288.

Clark, T. L., 1977: A small scale numerical model using a terrain following coordinate system. *J. Comput. Phys.,* **24,** 186–215.

Fritts, D. C., 1982: Shear excitation of atmospheric gravity waves. *J. Atmos. Sci.,* **39,** 1936–1952.

——, 1984: Shear excitation of atmospheric gravity waves. Part II: Nonlinear radiation from a free shear layer. *J. Atmos. Sci.,* **41,** 524–537.

——, and P. K. Rastogi, 1985: Convective and dynamical instabilities due to gravity wave motions in the lower and middle atmosphere: Theory and observations. *Radio Sci.,* **20,** 1247–1277.

——, T. L. Palmer, O. Andreassen, and I. Lie, 1996: Evolution and breakdown of Kelvin–Helmholtz billows in stratified compressible flows. Part I: Comparison of two- and three-dimensional flows. *J. Atmos. Sci.,* **53,** 3173–3191.

Holton, J. R., P. H. Haynes, M. E. McIntyre, A. R. Douglass, R. B. Rood, and L. Pfister, 1995: Stratosphere–troposphere exchange. *Rev. Geophys.,* **33,** 403–439.

Lighthill, M. J., 1978: *Waves in Fluids.* Cambridge University Press, 504 pp.

McIntyre, M. E., and W. A. Norton, 1990: Dissipative wave–mean interactions and the transport of vorticity or potential vorticity. *J. Fluid Mech.,* **212,** 403–435; Corrigendum, **220,** 693.

Phillips, O. M., 1977: *The Dynamics of the Upper Ocean.* 2d ed. Cambridge University Press, 336 pp.

Rosenlof, K., 1996: Summer hemisphere differences in temperature and transport in the lower stratosphere. *J. Geophys. Res.,* **101,** 19 129–19 136.

Scinocca, J. F., 1995: The mixing of mass and momentum by Kelvin–Helmholtz billows. *J. Atmos. Sci.,* **52,** 2509–2530.

——, and T. G. Shepherd, 1992: Nonlinear wave-activity conservation laws and Hamiltonian structure for the two-dimensional anelastic equations. *J. Atmos. Sci.,* **49,** 3–25.

——, and R. Ford, 1999: The nonlinear forcing of large-scale internal gravity waves by stratified shear instability. *J. Atmos. Sci.,* in press.

Thorpe, S. A., 1973: Experiments on instability and turbulence in a stratified shear flow. *J. Fluid Mech.,* **61,** 731–752.

Werne, J., and D. C. Fritts, 1999: Stratified shear turbulence: Evolution and statistics. *Geophys. Res. Lett.,* **26,** 439–442.

Whitham, G. B., 1974: *Linear and Nonlinear Waves.* Wiley Interscience, 620 pp.

Woods, J. D., 1968: Wave-induced shear instability in the summer thermocline. *J. Fluid Mech.,* **32,** 791–800.

## APPENDIX

### Discretization Formulas for Modes with *k*_{h} = 0

As *k*_{h} → 0 the appropriate limits for the balanced fields in (8), the amplitude function Φ in (12), and the wave fields in (14)–(17) must be extracted carefully. If *k*_{h} → 0 and *m* → 0 simultaneously, then Φ, as well as the balanced and the wave fields, goes to 0 because all initial fields are assumed to have zero mean value. Therefore, only the limit *k*_{h} → 0 and *m* ≠ 0 needs to be considered. It turns out that rotating and nonrotating cases must be considered separately, and that the rotating case is easier to deal with than the nonrotating case.

In the rotating case (*f* ≠ 0) the balanced fields *û*^{b}, *υ̂*^{b}, *ŵ*^{b}, and *σ̂*^{b} in (8) tend to limiting forms involving *σ̂*_{0} as *k*_{h} → 0 and *m* ≠ 0 [display equation not recovered]. The amplitude function Φ in (12) tends to a limiting form [using **∇** · **u** = 0 and (11); display equation not recovered] in which sgn(*m*) ≡ ±1 if *m* > 0 or *m* < 0, respectively. This shows that the limit of Φ as *k*_{h} → 0 is path-dependent; that is, it depends on the limiting values of *k*/*k*_{h} and *l*/*k*_{h}, and therefore Φ is discontinuous at this limit. However, |Φ|, as well as all the combinations appearing as integrands in (14)–(17), are in fact *independent* of the limiting values of *k*/*k*_{h} and *l*/*k*_{h}. Hence, any choice of limiting values will lead to the same spectral wave energy density |Φ|^{2} and wave fields. For definiteness, a convention for the limiting values of *k*/*k*_{h} and *l*/*k*_{h} is adopted in (A3), which fixes the limiting expression in terms of *m*, *û*^{w}_{0}, and *υ̂*^{w}_{0} [display equations not recovered].

In the nonrotating case (*f* = 0) the balanced fields *û*^{b}, *υ̂*^{b}, and *σ̂*^{b} tend to limiting forms that involve the limiting values of *k*/*k*_{h} and *l*/*k*_{h} [display equations not recovered]. The same convention as in (A3) is adopted, and hence Φ takes the limiting value given in (A7) [display equation not recovered].

The genuine discontinuity as *k*_{h} → 0 of the spectral wave energy density |Φ|^{2}, while not affecting the validity of the analytic solution, remains an obstacle for calculating 𝒫^{I}_{∗}, because |Φ|^{2} has to be evaluated (and integrated) pointwise in spectral space; and hence the suitability of the convention (A7) must be checked. In the two-dimensional nonrotating case the discontinuity of |Φ| affects the pseudomomentum estimate 𝒫^{I}_{∗} along the *m* axis at *k* = 0, as was illustrated in Figs. 5 and 6. The limiting value of Φ at *k* = 0 in (A7) is then consistent with the limit *k* → 0+, that is, with *k* = 0 being approached through positive values of *k*. Hence, the convention (A7) allows the desired pseudomomentum estimate for eastward-moving waves, which have *k* > 0, to integrate correctly. In the three-dimensional nonrotating case the discontinuity of |Φ| turns out not to affect 𝒫^{I}_{∗}.
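The different behavior of the rotating and nonrotating cases at *k* = 0 can be seen in one ingredient of the spectral integrand, the factor *k*/*ω̂*. The following short derivation assumes the standard two-dimensional Boussinesq inertia–gravity dispersion relation with the positive root for *ω̂*, which is not restated in this appendix; it is a sketch of where a jump can arise, not a reconstruction of the lost display equations.

$$\hat\omega^{2} = \frac{N^{2}k^{2} + f^{2}m^{2}}{k^{2} + m^{2}}, \qquad m \neq 0 .$$

For $f = 0$:

$$\frac{k}{\hat\omega} = \frac{k\sqrt{k^{2}+m^{2}}}{N|k|} = \operatorname{sgn}(k)\,\frac{\sqrt{k^{2}+m^{2}}}{N} \;\longrightarrow\; \pm\frac{|m|}{N} \quad \text{as } k \to 0\pm ,$$

so the factor jumps across the *m* axis. For $f \neq 0$:

$$\hat\omega \to |f| \quad\text{and hence}\quad \frac{k}{\hat\omega} \to 0 \quad \text{as } k \to 0 \text{ from either side},$$

so the factor is continuous there. Assigning the value at *k* = 0 consistently with *k* → 0+ therefore matters only in the nonrotating case, in line with the statement that the rotating case is easier to deal with.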

Three stages of KH-induced wave problem seen in receding zoom: (a) rapid formation of a CAT layer (stage A), (b) initial development of wave train (neglecting mean shear) (stage B), and (c) subsequent development of wave train using ray tracing (taking negative mean shear into account) (stage C). Stage C is the subject of the companion paper (Part II). See text for details.

Citation: Journal of the Atmospheric Sciences 56, 21; 10.1175/1520-0469(1999)056<3749:OSGGWT>2.0.CO;2


Outline of typical CAT layer whose horizontal scale *b* is much larger than its vertical depth *h.* (The relative vertical depth *h*/*b* is exaggerated for clarity.) (left) KH billows rolling up in the initial stage of the event; (right) profiles of potential temperature *θ*(*z*), buoyancy frequency *N*^{2}(*z*) ∝ *θ*_{z}, and zonal velocity *u*(*z*) before (thin lines) and after (bold lines) a strong turbulent mixing event. The layer itself is thought to be positioned in the lower summer stratosphere, i.e., above the tropopause jet, and hence inside a region of negative background zonal shear (cf. Fig. 1), as indicated.


(left) Initial *σ* disturbance over *x–y* plane at *z* = 0. Note that the direction into the paper corresponds to the vertical. (right) Initial *σ* disturbance over *z* at (*x, y*) = 0.


Spectral pseudomomentum density. Solid contours denote positive values, dotted contours denote negative values: (left) without background rotation; (right) with background rotation equal to *f* = *N*/100. In both cases *α* = *β.*


The factor *k*/*ω̂* over the quadrant (*k* ≥ 0, *m* ⩽ 0) of spectral space: (left) without background rotation; (right) with background rotation equal to *f* = *N*/100.


Contour plots of linear and nonlinear *σ* field after one period of time: *t* = 200*π*/*N.* Note the stretched vertical coordinate. The broken lines show the control sides across which the upward–eastward pseudomomentum flux is calculated. This is case B of the cases summarized below.


Nonlinear pseudomomentum diagnostics. The broken lines represent the linear estimates. Note the much larger vertical scale for case E. Each period corresponds to a time unit of 200*π*/*N.* The diagnostic “P flux” corresponds to the second, flux-related part of *P*_{rad}(*t*) in its definition (43).


Nondimensional wave energy ℰ^{w}/ℰ_{0}. Each column corresponds to a different value of *f*/*N* and each row corresponds to a different value of *α*/(*β* …). The ratio ℰ^{w}/ℰ_{0} does not depend on the signs of *α* and *β*. In the nonrotating two-dimensional case the PV is identically zero and, hence, ℰ^{w} = ℰ_{0} there. The differences between two-dimensional and three-dimensional values are in many cases significant.

Nondimensional pseudomomentum 𝒫^{I}_{∗} … *α*/(*β* …) … 𝒫^{I}_{∗} … *α* and *β*.

Parameters used in the nonlinear simulations. The velocity mixing rates in cases C, D, and F are chosen such that the minimum local Richardson number in the center of the mixed region is ¼. The notional background Richardson number Ri = 1. Only case E has nonzero potential vorticity.

^{1} For example, if Coriolis forces are negligible, then the local shear is proportional to *γ*(*t*)*N*^{2}, where *γ*(*t*) is the time integral of the time-dependent tilting angle *α*(*t*) of the layer, and hence Ri ∝ (*γN*)^{−2}. In the opposite-extreme case of geostrophic balance, the formula Ri ∝ (*γN*)^{−2} still holds but with *γ*(*t*) = *α*(*t*)/*f*, where *f* is the Coriolis parameter.