Spectral Preconditioners for Nonhydrostatic Atmospheric Models

Stephen J. Thomas National Center for Atmospheric Research, Boulder, Colorado

Search for other papers by Stephen J. Thomas in
Current site
Google Scholar
PubMed
Close
,
Joshua P. Hacker National Center for Atmospheric Research, Boulder, Colorado

Search for other papers by Joshua P. Hacker in
Current site
Google Scholar
PubMed
Close
,
Piotr K. Smolarkiewicz National Center for Atmospheric Research, Boulder, Colorado

Search for other papers by Piotr K. Smolarkiewicz in
Current site
Google Scholar
PubMed
Close
, and
Roland B. Stull Department of Earth and Ocean Sciences, University of British Columbia, Vancouver, British Columbia, Canada

Search for other papers by Roland B. Stull in
Current site
Google Scholar
PubMed
Close
Full access

Abstract

The elliptic problems in semi-implicit nonhydrostatic atmospheric models are difficult. Typically, they are poorly conditioned, nonseparable, contain cross-derivative terms, and are often nonsymmetric. Here, the resulting linear system is solved using a preconditioned Krylov subspace method—the generalized conjugate residual (GCR) algorithm. A horizontal spectral preconditioner is developed as an alternative to a more standard and much simpler line Jacobi relaxation scheme. However, the efficacy of the spectral preconditioner requires neglecting the cross-derivative terms and homogenization (e.g., averaging) metric coefficients over the computational domain. Because such a compromise causes a substantial departure of the preconditioner from the original elliptic operator, it is not obvious a priori whether it leads to a competitive solver. The robustness of the proposed approach over a broad range of representative meteorological applications is evaluated, in the context of a three-time-level semi-implicit semi-Lagrangian all-scale weather-prediction/research model.

Corresponding author address: Dr. Stephen J. Thomas, NCAR/SCD, P.O. Box 3000, Boulder, CO 80307-3000. Email: thomas@ucar.edu

Abstract

The elliptic problems in semi-implicit nonhydrostatic atmospheric models are difficult. Typically, they are poorly conditioned, nonseparable, contain cross-derivative terms, and are often nonsymmetric. Here, the resulting linear system is solved using a preconditioned Krylov subspace method—the generalized conjugate residual (GCR) algorithm. A horizontal spectral preconditioner is developed as an alternative to a more standard and much simpler line Jacobi relaxation scheme. However, the efficacy of the spectral preconditioner requires neglecting the cross-derivative terms and homogenization (e.g., averaging) metric coefficients over the computational domain. Because such a compromise causes a substantial departure of the preconditioner from the original elliptic operator, it is not obvious a priori whether it leads to a competitive solver. The robustness of the proposed approach over a broad range of representative meteorological applications is evaluated, in the context of a three-time-level semi-implicit semi-Lagrangian all-scale weather-prediction/research model.

Corresponding author address: Dr. Stephen J. Thomas, NCAR/SCD, P.O. Box 3000, Boulder, CO 80307-3000. Email: thomas@ucar.edu

1. Introduction

A semi-implicit time-stepping scheme (Robert 1969) is commonly applied to the terms responsible for fast waves in large-scale weather prediction and general circulation models. This removes the associated time step restrictions and makes computation practical. More recently, semi-implicit integrators have also been applied at mesoscale resolutions (Tapp and White 1976; Cullen 1990; Tanguay et al. 1990; Golding 1992; Yeh et al. 2002; Grabowski and Smolarkiewicz 2002) and convective scales (Robert 1993; Grabowski and Smolarkiewicz 2002). The resulting elliptic partial differential equations are nontrivial. They are poorly conditioned, nonseparable, contain cross-derivative terms, and are typically nonsymmetric. This is due to domain anisotropy, effects of planetary rotation, ambient stratification, the use of general curvilinear coordinates in the governing equations [e.g., the Gal-Chen and Somerville (1975) terrain-following transformation], or the imposition of partial-slip conditions along an irregular lower boundary. Among the most effective methods reported for solving such problems are the preconditioned nonsymmetric conjugate-gradient-type (alias Krylov subspace, hereafter Krylov) iterative schemes (e.g., Kapitza and Eppel 1992; Skamarock et al. 1997; Thomas et al. 1998, 2000). A number of alternative nonsymmetric Krylov solvers are used in computational research and engineering (Axelsson 1994; Greenbaum 1997). Although this paper focuses on a particular scheme, the issues discussed are relevant to other iterative methods as well. Our method of choice is the restarted generalized conjugate residual [GCR(k)] algorithm1 akin to the popular generalized minimum residual [GMRES(k)] solver (Eisenstat et al. 1983; Saad and Schultz 1986; Smolarkiewicz and Margolin 1994, 1997; Smolarkiewicz et al. 1997).2 For the reader's convenience, GCR(k) is summarized in appendix A, where we also introduce some terminology, notation, and a notion of preconditioning (left, as opposed to right) used throughout this paper. A brief discussion of line-relaxation preconditioners is included in appendix B.

Designing a suitable preconditioner is important, as it can dramatically accelerate solver convergence, thereby reducing the overall computational expense of a model. In principle, the preconditioner P can be any linear operator such that LP−1 is definite, where L symbolizes the original elliptic operator implied by the physical problem at hand. The role of the preconditioner is to substitute the governing elliptic problem L(Ψ) − Q = 0 with an auxiliary problem P−1[L(Ψ) − Q] = 0 that converges faster (than the original problem) because of a closer clustering of the eigenvalues of the auxiliary elliptic operator P−1L, where Ψ and Q symbolize the dependent variable and the rhs, respectively. For the preconditioner to be useful, the convergence of the auxiliary problem must be sufficiently rapid to overcome the effort associated with “inverting” the preconditioner itself [i.e., computing P−1(·)]. In general, the closer the preconditioner approximates the original operator, the faster the solver converges but the more difficult it is to compute P−1(·).

There is no general method for designing an optimal preconditioner (Axelsson 1994, section 7). Kadogliu and Mudrick (1992) argued that basic point-Jacobi relaxation provides an effective preconditioner for GMRES(k) applied to atmospheric flow problems. However, their conclusion is derived from solving the diagnostic quasigeostrophic omega equation in a periodic channel—a fairly simple elliptic problem that does not reflect the full complexity of a multiscale atmosphere. In particular, it does not extrapolate to prognostic problems when the shallowness of the earth's atmosphere dictates condition numbers κ(L) ∼ O(1010); recall that κ(L) can be interpreted as the squared ratio of the longest to the shortest wave present in the system. Because the asymptotic convergence rate of conjugate-gradient-type methods is proportional to κ(L)−1/2, a direct preconditioner in the vertical is the “categorical imperative” of an effective iterative solver for all-scale atmospheric models (e.g., Marshall et al. 1997; Skamarock et al. 1997; Thomas et al. 1998; Smolarkiewicz et al. 2001). In this paper we take this preconditioning strategy in the vertical for granted and focus on the optimization of the horizontal component of P.

Guided by the development of elliptic solvers for early anelastic models, we extend a fast direct preconditioning strategy to three spatial dimensions via horizontal spectral decomposition. This offers an alternative to alternating-direction-implicit (ADI) iterative methods (Skamarock et al. 1997), of which the line Jacobi preconditioner is a special case (appendix B). The use of the FFT approach has a long history in the design of elliptic solvers. Because FFTs are effective for separable problems, early elliptic solvers for meteorological models often employed operator decomposition (into a separable part and the residual) and a stationary block-iteration with the nonseparable residual lagging behind (cf. Schumann and Volkert 1987 and Schumann and Sweet 1988, and references therein). Bernardet (1995) investigated an alternative approach (conceptually the opposite) where the full operator is retained in a nonstationary Krylov iteration, but FFTs are used to invert the “flat” preconditioner obtained by neglecting orography. He used the classical, symmetric, conjugate-gradient solver of Hestenes and Stiefel (1952), thereby necessitating careful modifications to the boundary conditions (in order to assure a self-adjoint elliptic operator). His development addressed 2D mesoscale problems in the xz plane. Because of the inherent difficulties with extending this approach to 3D, Lafore et al. (1998) adopted an FFT-preconditioned Richardson iteration. Recently, Elman and O'Leary (1998) analyzed the performance of nonsymmetric Krylov solvers for indefinite 3D Helmholtz equations with Sommerfeld radiation boundary conditions and found that FFT methods can result in effective preconditioners even on parallel architectures. Because of the indefiniteness, they addressed a more difficult problem than those typically encountered in meteorological applications. On the other hand, their Helmholtz operator contained the standard Laplacian on isotropic grids and had constant coefficients. Both the coefficients and solutions were smooth.

Although there exists a substantial body of evidence indicating the potential of FFT preconditioners, a systematic study in the context of all-scale meteorological applications is needed. This is because modeling natural flows leads to elliptic problems characterized by highly inhomogeneous anistropic coefficients and full spectra forced on the rhs. We complement earlier work with a comprehensive study based on a fully compressible nonhydrostatic all-scale atmospheric model (Thomas et al. 1998). A semi-implicit time discretization leads to a nonsymmetric, but definite, Helmholtz problem. In order to design an effective FFT preconditioner, we derive a separable constant-coefficient approximation to the semi-implicit Helmholtz equation by dropping cross-derivative terms and averaging metric coefficients across the domain. Neumann boundary conditions are specified to close the problem, and thus a discrete cosine transform (DCT) is required to diagonalize the discrete horizontal Laplacian component of the preconditioner.

Robustness is vital for the elliptic solver when a numerical model admits a broad range of scales of motion. We assess the relative efficiency of line relaxation versus DCT preconditioners in both linear and nonlinear flow regimes (prototypes for, respectively, laminar and turbulent flows) throughout the range O(101)–O(107) m of spatial scales. Our study encompasses the simulation of shallow thermal convection, small and large mesoscale mountain flows, and synoptic-scale winter storm prediction. We measure the performance of the two preconditioners by comparing the amount of work (the iteration count and total CPU time of the model) required to achieve a specified convergence threshold. Because the residual error of the elliptic problem has a sense of Δt-normalized first-order spatial partial derivatives of velocity, the convergence criterion is based on the magnitude of the local Courant number and its variations. This specifies both the absolute admissible error and its magnitude compared to the flow. In effect, both numerical and physical error metrics are employed (cf. Smolarkiewicz et al. 1997).

The paper is organized as follows: Section 2 summarizes the governing equations of the model along with their discretization and formulation of the elliptic problem. Section 3 presents the derivation of a spectral preconditioner. Section 4 discusses the results of our simulations. Remarks in section 5 conclude the paper.

2. Model description

a. Analytic formulation

To focus the discussion, consider the standard compressible Euler model for a dry, adiabatic, stratified, rotating fluid. Cast in a noninertial frame rotating with angular velocity Ω, it consists of mass, momentum, and entropy evolution laws, and the equation of state:
i1520-0493-131-10-2464-e1
where D/Dt = ∂/∂t + v · ∇ is the material-derivative operator, v the velocity vector, p thermodynamic pressure, ρ density, and g combines gravitational and centrifugal accelerations. Also, Θ = T exp(−κq) is the potential temperature, where T denotes the temperature, κ = R/cp (with R and cp being the gas constant and specific heat at constant pressure), and q = ln(p/p00), with p00 = 100 kPa.3
In order to isolate the fast acoustic-gravity modes (for the sake of implicit numerical treatment), the prognostic equations in (1) are decomposed by introducing a horizontally homogeneous, static reference atmosphere (hereafter denoted with subscript 0). Thermodynamic variables are decomposed into a sum of the base state and perturbations
i1520-0493-131-10-2464-e2
where
i1520-0493-131-10-2464-e3
After grouping terms associated with “fast” linear waves on the lhs, while retaining the “slow” residual on the rhs and eliminating ρ using the gas law, the governing evolution equations take the form
i1520-0493-131-10-2464-e4
where B = gT′/T0 denotes the buoyancy, and f = 2Ω sinϕ is the Coriolis parameter at latitude ϕ. Also, γ = cp/cυ (with cυ = cpR denoting the specific heat at constant volume), and N0 and c0 are, respectively, the Brunt–Väisälä frequency and speed of sound:
i1520-0493-131-10-2464-e5
Equations (1) and (4), as written, are valid in a rotating Cartesian reference frame. To account for the curvature of the earth when simulating atmospheric flows and produce results consistent with standard weather maps, Eqs. (4) are written in terms of a Cartesian coordinate system projected onto the plane intersecting the sphere at reference latitude ϕ0 (cf. Tanguay et al. 1990). The resulting equations take the form
i1520-0493-131-10-2464-e6
where the (X, Y) conformal coordinate-based form of D/Dt, horizontal divergence DH(U, V) of the wind image (U, V), pseudokinetic energy K, and metric coefficient S are all detailed in appendix C for the sake of completeness. To account for the irregularity of the lower boundary, the system in (6) is transformed to the terrain-following curvilinear framework of Gal-Chen and Sommerville (1975). Optionally, variable spacing of vertical levels is allowed by composing the Gal-Chen transformation with a smooth stretching (Thomas et al. 1998). Rather than complicating (6) with details of the composition of these mappings, all metric terms associated with the vertical transformations are absorbed into the definitions of the partial derivatives (appendix C).

b. Numerical approximations

The semi-Lagrangian approach employed for approximating (6) on a discrete mesh (Robert 1993) aims at second-order accuracy in space and time.4 A staggered Arakawa C grid (Arakawa and Lamb 1977) is employed in the horizontal, and a staggered Tokioka B grid (Tokioka 1978) in the vertical. The associated spatial partial-derivative discrete operators are detailed in appendix D.

In general, the governing equations—here, either (1), (4), or (6)—can be viewed symbolically as
i1520-0493-131-10-2464-e7
where ψ stands for any of the kinematic or thermodynamic dependent variables of the model, and Fψ for the associated rhs. Then, semi-Lagrangian schemes can be interpreted as approximations to the trajectory integrals of the governing equations:
i1520-0493-131-10-2464-e8
where ψ and ψ0 denote, respectively, the values of ψ(X, t) at the two endpoints of the trajectory Γ connecting [X0(X, t1), t0] and (X, t1). For our study (of spectral preconditioners), we adopt the Mesoscale Compressible Community (MC2) model (Tanguay et al. 1990; Benoit et al. 1997; Laprise et al. 1997; Thomas et al. 1998, and references therein), which implements the semi-implicit semi-Lagrangian algorithm of Robert (1981). In the context of the integral representation, this scheme blends midpoint three-time-level and trapezoidal two-time-level quadrature rules. The midpoint and trapezoidal approximations result, respectively, in explicit (leapfrog) and implicit (Crank–Nicholson) time discretizations of the various forcing terms. In order to concisely present the resulting finite-difference equations, we define the discrete operators
i1520-0493-131-10-2464-e9
where ψn+1i denotes the unknown (sought) value of ψ at the grid point (Xi, tn+1), and ψn−10 is the value of ψ at the foot (X0, tn−1) of the trajectory arriving at (Xi, tn+1), evaluated using cubic interpolation from the neighboring grid points. With the definitions in (9), the compact discretized form of the equations (6) becomes (after some regrouping)
i1520-0493-131-10-2464-e10
where spatial partial-derivative and average operators (DX, DY, DZ, and μZ) are defined in appendix D, and the Fnψ forcings combine the Coriolis and slow-residual terms that are interpolated at the trajectory midpoint, (X, tn), with X ≡ (Xi + X0)/2 [cf. Thomas et al. (1998) for a discussion of the trajectory scheme)].

c. Elliptic problem

In order to derive the discrete Helmholtz problem that underlies the semi-implicit scheme in (10), all unknown tn+1 terms are grouped on the lhs, and all known tn and tn−1 terms are separated on the rhs. Denoting the rhs by Q and dropping all temporal superscripts (as there is no ambiguity at this point), the resulting linear system of algebraic equations (valid at each grid point) is written as follows:
i1520-0493-131-10-2464-e11
To arrive at the implicit equation for q′, all remaining dependent variables are eliminated from (11)–(15). Combining (14) and (15) eliminates the buoyancy B and results in
i1520-0493-131-10-2464-e16
where
i1520-0493-131-10-2464-e17
Next, taking the divergence of (12) and (13) and substituting for DH(U, V) in (11),
i1520-0493-131-10-2464-e18
where
QqQqtγDHQUQV
Following the original notation of Tanguay et al. (1990), we define the operators,
i1520-0493-131-10-2464-e20
and the coefficient N = (1 + Δt2N20). Then, solving directly for w in (16) and substituting the result in (18), we arrive at the discrete Helmholtz equation,
i1520-0493-131-10-2464-e21
where the modified forcing on the rhs takes the form
Q*qQqtD(2)ZN−1Qw

d. Boundary conditions

Lateral boundaries are open with inflow/outflow determined by the normal components of the velocity. From the perspective of the Helmholtz problem in (21), normal velocities specified at the lateral boundaries imply Neumann boundary conditions on the pressure via (12) and (13). The model top and bottom boundaries are impermeable. At the model top ztop, the terrain-following coordinate Z becomes flat [cf. (C7) in appendix C], whereupon the contravariant “vertical” velocity becomes identical to the vertical component of the physical velocity and they both vanish:
Żztopwztop
At the lower boundary, the normal contravariant velocity vanishes (Ż|0 = 0), implying that the physical vertical velocity w follows coordinate surfaces according to
w0GSG13UG23V0
The boundary conditions on q′ associated with (23) and (24) are derived by substituting the momentum equations (12) and (13) into, respectively, (23) and (24) and requiring that the resulting relations be consistent with (16). Then Qw and D(1)Z can be specified at Z = 0 and Z = ztop.5

3. Spectral preconditioner

The discrete Helmholtz equation (21) takes the form
LqQ*q
referred to early in the introduction. The symbol L on the lhs of (25) represents the entire operator appearing between the brackets {·} on the lhs of (21). For the terrain-following transformation and conformal coordinates, the generalized Laplacian-like operator DH(DX, DY) embedded in L includes cross-derivative terms and is nonseparable. The discrete horizontal divergence and 3D gradient, (D7) and (D4) respectively, couple the horizontal to the vertical operators through a complex blend of XZ and YZ derivatives and averages. The first step in constructing an effective preconditioner P (for atmospheric models) is to neglect these cross terms—to facilitate a direct inversion of P in the vertical (recall the discussion in the introduction). The resulting simplification leads to the preconditioned linear system of equations:
Per
or, equivalently,
i1520-0493-131-10-2464-e27
where 2H = δXX + δYY denotes the standard discrete horizontal Laplacian; e is the current iterate (of the Krylov solver) estimation of the solution error e = q′ − q′, with the overbar denoting the sought exact solution; and r is the current iterate value of the residual error r = Lq′ − Q*q (cf. appendix A). Because the operator on the lhs of (27) is a simple sum of the vertical and horizontal differential operators, iterative line-relaxation preconditioners (outlined in appendix B) can be easily applied at this point. Here, we advocate an alternate strategy, where a separable (in the horizontal) constant-coefficient approximation of the operator on the lhs of (27) is specified—thereby departing even further with P from the original operator L in (25)—enabling a fast direct inversion in the horizontal.
The horizontal component of P in (27) is nonseparable because the metric coefficient S = S(X, Y) is a function of both X and Y.6 In general, a separable operator is obtained by a cross-directional homogenization of metric coefficients—for example, by averaging S that multiplies the δXX and δYY parts of 2H, respectively, in the Y and X directions—whereupon the tensor-product method (Lynch et al. 1964) can be applied to construct a direct solver. This requires computing separately the eigenvalues and eigenvectors of the matrices repre-senting the discrete operators SYδXX and SXδYY, and diagonalization follows by combining two unitary similarity transformations. Insofar as stand-alone solvers for separable problems are concerned, this approach requires O(nzn3) operations and is not competitive with ADI schemes [an O(nzn2 log2n) operation count] except when a fast transform [e.g., FFT with O(nzn2 logn) operations] is available (Lynch et al. 1964). Here, n = nx = ny refers to the number of mesh points in either horizontal direction, and nz is the number of points in the vertical. Because (27) is merely a preconditioner, S can be globally averaged or extremized (Bernardet 1995; Smolarkiewicz and Margolin 2000), thereby enabling application of the FFT. Here, we replace S in (27) with the global mean; flat topography is specified to assure constant coefficients in the vertical derivatives. The discrete operator 2H, with homogeneous Neumann boundary conditions,7 is then readily diagonalized by the real part of the Fourier transform (via the discrete cosine transform; Sweet 1973). This results in nx × ny independent tridiagonal vertical problems:
i1520-0493-131-10-2464-e28
where
i1520-0493-131-10-2464-eq1
are the eigenvalues of 2H in the X and Y directions, and ˆ· denotes Fourier transform.

4. Numerical experiments

A suite of model runs8 is designed to compare the performance of the DCT preconditioner against the line Jacobi preconditioner, proven effective in meteorological applications (Thomas et al. 1998, 2000). The relative efficiency of the two preconditioners is measured with two Jacobi/DCT ratios [hereafter the performance ratios (PR)]: the ratio of total GCR solver iterations and the ratio of the total model CPU times. Both ratios are informative. For a given convergence criterion, the first one estimates the computational cost of the solver, independent of the implementation and machine, whereas the second assesses the total work required for the application at hand. Values of PR > 1 show efficiency gains by the DCT preconditioner. Because the residual error of the Helmholtz problem in (21) combines Δt-normalized first-order spatial partial derivatives of the velocity components, it motivates the stopping criterion:
LqQ*qC,L
In (29), C and L represent the Courant ‖ΔtvX‖ and Lipschitz ‖Δt(∂v/∂x)‖ numbers—correspondingly, the measures of the flow magnitude and its variations. Typically, in order to control the stability/accuracy of the Eulerian and semi-Lagrangian computations, CLO(1), whereupon ε specifies both the absolute admissible error and its magnitude compared to the flow.

The test cases considered here are designed to evaluate the robustness and efficiency of the DCT preconditioner over a wide range of meteorological applications. The first test is very special; it probes the reflexivity of our preconditioned Krylov approach. In all problems where the preconditioner is an exact solver, GCR should converge in one iteration to machine precision, regardless of the value of ε. In the MC2 model, all small- to mesoscale problems with map scale factor m ≡ 1 in (C1) and flat lower boundary with h(X, Y) ≡ const in (C7) belong to this class, which constitutes a broad range of research applications including studies of fully developed turbulence and deep moist convection. For illustration, we consider a full-spectrum turbulent thermal convection simulation.

The physical scenario is a buoyant thermal rising in a neutrally stratified environment—a 3D extension of the earlier experiment in Smolarkiewicz and Pudykiewicz (1992). The grid employed consists of 100 × 100 × 150 points with constant interval ΔX = ΔY = ΔZ = 10 m; Δt = 2.5 s. Initially, a spherical temperature anomaly of radius r0 = 250 m and Θ′ = 0.5 K is centered in the horizontal, r0 + ΔZ above the lower boundary. Figure 1 shows potential temperature (Θ) xz and xy cross sections through the center of the convective bubble and at z = 800 m, respectively; at the end of the integration, t = 180Δt, by which time the flow becomes turbulent and exhibits a full range of scales (see Fig. 2).

The solution in Figs. 1 and 2 is for the DCT preconditioner. It is independent of the stopping criterion in (29). This is documented in Fig. 3, which compares the rms divergence of the solution using the DCT and line Jacobi preconditioners for a range of ε. Following Skamarock et al. (1997), this diagnostic is used to measure the solution convergence in elastic models. For the Jacobi preconditioner, as ε → 0 the curve flattens, indicating the critical value ε = 10−3, below which the solution can be considered physically converged; that is, further solver iterations do not contribute to the overall accuracy of the results. Because the DCT preconditioner is a direct solver of the elliptic problem at hand, convergence is reached in a single GCR iteration regardless of the specified value of ε. The latter is not the case for the Jacobi preconditioner, which requires an increasing number of GCR iterations as ε → 0. Figure 4 demonstrates this by displaying the PR for iteration count and total CPU time as a function of ε. Here, the DCT-preconditioned solver converges in one iteration, and the PR iteration count is equivalent to the average number of GCR iterations (per model time step) using the Jacobi preconditioner. The PR CPU time documents that, even with the overhead associated with computing FFTs, the DCT preconditioner is at least an order of magnitude more efficient for all values of ε representing the convergent solutions.

The second test simulates a smooth, vertically propagating 3D mountain wave, identical to those in Saito et al. (1998) and Thomas et al. (2000). The resulting flow is characterized by small-amplitude gravity wave response at Froude number FU/Nh = 4.4, where the linear theory of Smith (1980) is qualitatively valid. To contrast with the buoyant thermal case, the linear flow regime is chosen because it emphasizes a low signal-to-noise ratio. Here, unlike in the previous case, the energy spectrum of a fully developed solution is compact, and the grid aspect ratio significantly departs from unity, reflecting the anisotropy of the problem. To quantify the effect of domain anisotropy (shallow versus deep flow scenarios) on solver performance, both hydrostatic and nonhydrostatic variants are addressed [cf. Skamarock et al. (1997) for a discussion]. The parameters of the experiment are as follows.

The bell-shaped mountain,
i1520-0493-131-10-2464-eq2
with the half-width a = 3ΔX = 3ΔY and the height h0 = 100 m is centered at the (30, 30) grid point of the 61 × 61 horizontal mesh. In the vertical, 31 exponentially stretched levels extend to 19 km, resulting in vertical grid increments ranging from 40 to 1180 m, respectively, for the bottom and top thermodynamic layers. The model is integrated over 480 time steps, when the solution is considered to have reached an approximate steady state. A constant inflow velocity of U = 8 m s−1 and an isothermal atmosphere with buoyancy frequency N = 0.018 s−1 are assumed. The Coriolis force and the curvature of the earth are neglected. No subgrid-scale parameterizations are employed. The Davies (1976) relaxation scheme is used at the lateral boundaries to minimize wave reflections.

In the hydrostatic case, the grid length ΔX = 2000 m and time step Δt = 60 s result in the advective Courant number C = 0.24. The horizontal scale of the mountain is a = 6000 m, so aN/U = 13.5. In the nonhydrostatic case, ΔX = 400 m and Δt = 15 s, giving C = 0.3. The horizontal scale decreases to a = 1200 m, so that aN/U = 2.7, whereupon nonhydrostatic effects are significant. For illustration, Fig. 5 displays the converged solution. Vertical velocity cross sections are shown through the vertical center plane and on the horizontal plane at z = 2100 m (3/4 of the vertical wavelength) for both cases. The maximum vertical velocities at this level are 0.06 and 0.2 m s−1 in the hydrostatic and nonhydrostatic cases, respectively. Note the downstream tilt of the wave packet (manifested by the downstream shift of the maxima) in the nonhydrostatic case. The corresponding energy spectra in the mean flow direction at z = 1800 m are shown in Fig. 6.

The iteration count of the Krylov solver using the DCT (solid line) and line Jacobi (dashed line) preconditioners is compared in Fig. 7 as a function of the convergence threshold ε in (29). The rms divergence (thick solid line) shows the critical value of ε = 10−5, below which the solution is considered physically converged.9 It is apparent that the number of iterations per time step required for the line Jacobi preconditioner is about 5(10) times that required for the DCT scheme in the hydrostatic (nonhydrostatic) case. However, this does not imply that the efficiency of the DCT preconditioner follows the same pattern. The computational cost of a single GCR iteration with the DCT preconditioner is much higher than with line Jacobi, whereupon the CPU time performance ratio is about 1.5 for the hydrostatic result and somewhat over 2 for the nonhydrostatic solution (see Fig. 8). The trend of decreasing relative efficacy of the DCT preconditioner with increasing hydrostacy of the problem is consistent with that observed for ADI preconditioners (Skamarock et al. 1997). It is likely related to the propagation characteristics of gravity waves present in the system.

Next, the complex Scandinavian orography is used for the lower boundary to probe the solver performance for natural mesoscale flow scenarios containing many scales of motion. For this case, ΔX = ΔY = 10 km, and ΔZ = 300 m, giving a grid aspect ratio of 0.01. The total horizontal domain size is 2000 km × 2000 km (200 × 200 horizontal grid points), with the model lid placed at 15 km. A uniform northwesterly ambient flow with speed of 10 m s−1 and buoyancy frequency N = 0.01 s−1 are assumed. Unlike in the previous cases, a β-plane approximation is employed, with the map scale factor m varying accordingly in (C1). The model is integrated over 24 h with 288 5-min time steps. The converged wind solution on the lowest momentum level (approximately 150 m above the ground) is shown in Fig. 9, for illustration. Based on experience gained from the earlier test cases, a safe value ε = 10−6 is assumed to assure physical convergence. Performance-ratio values are 4.04 and 5.46 for CPU time and iteration count, respectively, documenting much higher gains for the DCT preconditioner compared to the isolated mountain case discussed above. This is not surprising because compact spectra limit the number of significant eigenmodes present in the solution, thereby accelerating the convergence of Krylov solvers. For the sake of completeness, a 3D ADI preconditioner was also tested in this case. The number of GCR iterations required (for the given convergence threshold) is the same as with the DCT preconditioner, but the computational overhead is substantial, owing to multiple ADI iterations within the preconditioner itself.

Finally, we test the performance of the DCT preconditioner on a synoptic-scale, weather-prediction problem. An arbitrarily chosen forecast from 0000 UTC 16 September 2002 is run using initial and boundary conditions supplied by the National Centers for Environmental Prediction (NCEP) Eta Model. The MC2 model is integrated over 48 h with 480 6-min time steps. The computational domain is 200 × 160 × 19, with ΔX = ΔY = 50 km on a polar stereographic projection true at 60°N, and the vertically stretched grid ranges from ΔZ = 72 m near the surface to 3800 m near the top at 23 km. The resulting condition number of the Helmholtz operator is extreme, O(1012), and the energy spectrum full. As in the Scandinavian case, ε = 10−6 is assumed. The resulting performance ratios are 1.93 and 4.90 for CPU time and iteration count, respectively. The DCT preconditioner again requires fewer iterations, and the model integration rate is about twice as fast. This demonstrates that the DCT is far superior to line Jacobi, despite the large departure of the preconditioner from the full Helmholtz operator emphasized by a synoptic-scale forecast.

To assess the relative merits of alternative Krylov methods, we also implemented the right-preconditioned flexible GMRES (FGMRES) algorithm of Saad (1993).10 For comparison, we ran the MC2 model using FGMRES(4), with DCT preconditioner, for the Scandinavian topography simulation and winter storm synoptic forecast. In both cases, we found that FGMRES requires additional iterations because it tends to converge more slowly than GCR after a restart. The model run times were nearly identical, with FGMRES giving a slight advantage in the synoptic case. To compare solutions, we examined the maximum Courant numbers (equivalent to a maximum norm) and residual L2 norm (least squares error) of the elliptic problem over the computational domain for the GCR and FGMRES solvers. In both cases, these agree to six digits, and thus we conclude that the solutions are the same.

5. Concluding remarks

We have developed a spectral preconditioner for a generalized conjugate residual (GCR) Krylov solver applied to the Helmholtz problem arising in the semi-implicit time discretization of a compressible nonhydrostatic atmospheric model. Our approach offers an alternative to the more standard and much simpler line-relaxation, stationary block-iteration-type preconditioners used in meteorology. Despite substantial departures of the spectral preconditioner from the governing Helmholtz operator, we observed dramatic performance gains over these simpler schemes. The relative improvement in the model integration rate ranges from a factor of 2 to one order of magnitude, depending on the problem at hand.

The relative performance gains are, of course, greatest when the DCT preconditioner happens to be a direct solver for the governing Helmholtz problem, that is, zero orography and constant mapscale factor characteristic of small-scale dynamics applications. In a mesoscale flow scenario over Scandinavian topography, with a broad range of scales of motion present, a factor of 4 improvement was achieved. For smooth idealized mountain wave flows, frequently used to benchmark numerical model performance, the relative gains are more modest. But extrapolating these results to natural problems underestimates the potential gains because idealized flows often have compact spectra that favor simpler schemes. For synoptic weather prediction, where the shallowness of the atmosphere implies an extremely poorly conditioned problem, the performance improvement is still a factor of 2.

Spectral methods rely on global basis functions, and computing their expansion coefficients requires multiple evaluations of global sums. Consequently, spectral transform methods are criticized as being inappropriate for modern, distributed-memory parallel architectures. Nevertheless, in the context of FFT-based preconditioners for iterative Krylov methods, Elman and O'Leary (1998) documented overall parallel speedups despite the large data communication overheads. In meteorology, the vertical direction is physically distinct and typically not partitioned across the processors in numerical models. Thus, a massively parallel implementation of spectral preconditioners can be based on a data-transposition strategy (Thomas et al. 2002) similar to that used in the European Centre for Medium-Range Weather Forecasts (ECMWF) weather prediction model Integrated Forecast System (IFS; Skålin 1997; Barros and Kauranne 1994).

The performance of the next-generation (all-scale, nonhydrostatic semi-implicit, weather and climate prediction/research) numerical models in meteorology depends critically on the performance of elliptic solvers employed. It is widely recognized that the state-of-the-art methods rely on preconditioned nonsymmetric Krylov algorithms. Given the extensive body of mathematical literature, still relatively little numerical experience exists with such solvers in advanced meteorological applications (cf. Navon and Cai 1993). We have advocated a spectrally preconditioned GCR solver in the context of a particular model, representative of current trends in NWP. Further studies with a focus on alternative numerics, mesh refinement, complex geometries, and extreme meteorological scenarios will pose/define extreme challenges/benchmarks to the advocated approach.

Acknowledgments

The authors are grateful to Claude Girard and Michel Desgagné of RPN and Environment Canada, for maintaining and improving the MC2 model, and to Xingxiu Deng and Henryk Modzelewski from UBC, for preparing the synoptic case and adept computer administration, respectively. This work was supported by the National Science Foundation, the Department of Energy Climate Change Prediction Program (CCPP), the Canadian Natural Science and Engineering Council, RPN and Environment Canada, the Canadian Foundation for Innovation, BC Knowledge Development Fund, the Geophysical Disaster Computational Fluid Dynamics Center, and UBC Research Endowments. The comments of two anonymous referees helped to improve the presentation.

REFERENCES

  • Arakawa, A., and V. Lamb, 1977: Computational design of the basic dynamical processes of the UCLA general circulation model. Methods in Computational Physics, Vol. 17, J. Chang, Ed., Academic Press, 174–265.

    • Search Google Scholar
    • Export Citation
  • Axelsson, O., 1994: Iterative Solution Methods. Cambridge University Press, 654 pp.

  • Barros, S., and T. Kauranne, 1994: On the parallelization of global spectral weather models. Parallel Comput., 20 , 13351356.

  • Benoit, R., M. Desgagné, P. Pellerin, S. Pellerin, Y. Chartier, and S. Desjardins, 1997: The Canadian MC2: A semi-Lagrangian, semi-implicit wide-band atmospheric model suited for fine-scale process studies and simulations. Mon. Wea. Rev., 125 , 23822415.

    • Search Google Scholar
    • Export Citation
  • Bernardet, P., 1995: The pressure term in the anelastic model: A symmetric elliptic solver for an Arakawa C grid in generalized coordinates. Mon. Wea. Rev., 123 , 24742490.

    • Search Google Scholar
    • Export Citation
  • Bourke, W., 1974: A multi-level spectral model. I. Formulation and hemispheric integrations. Mon. Wea. Rev., 102 , 687701.

  • Cullen, M., 1990: A test of a semi-implicit integration technique for a fully compressible nonhydrostatic model. Quart. J. Roy. Meteor. Soc., 116 , 12531258.

    • Search Google Scholar
    • Export Citation
  • Davies, H., 1976: A lateral boundary formulation for multi-level prediction models. Quart. J. Roy. Meteor. Soc., 102 , 405418.

  • Eisenstat, S., H. Elman, and M. Schultz, 1983: Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 2 , 345357.

    • Search Google Scholar
    • Export Citation
  • Elman, H., and D. O'Leary, 1998: Efficient iterative solution of the three-dimensional Helmholtz equation. J. Comput. Phys., 142 , 163181.

    • Search Google Scholar
    • Export Citation
  • Freund, R., 1993: A transpose-free quasi-mimimum residual algorithm for non-Hermitian linear systems. SIAM J. Sci. Stat. Comput., 14 , 470482.

    • Search Google Scholar
    • Export Citation
  • Gal-Chen, T., and R. Sommerville, 1975: On the use of a coordinate transformation for the solution of the Navier–Stokes equations. J. Comput. Phys., 17 , 209228.

    • Search Google Scholar
    • Export Citation
  • Golding, B., 1992: An efficient nonhydrostatic forecast model. Meteor. Atmos. Phys., 50 , 89103.

  • Grabowski, W., and P. Smolarkiewicz, 2002: A multiscale model for meteorological research. Mon. Wea. Rev., 130 , 939956.

  • Greenbaum, A., 1997: Iterative Methods for Solving Linear Systems. Society of Industrial and Applied Mathematics, 220 pp.

  • Hestenes, M., and E. Stiefel, 1952: Method of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. (U.S.), 49 , 409436.

    • Search Google Scholar
    • Export Citation
  • Kadogliu, H., and S. Mudrick, 1992: On the implementation of the GMRES(m) method to elliptic equations in meteorology. J. Comput. Phys., 102 , 348359.

    • Search Google Scholar
    • Export Citation
  • Kapitza, H., and D. Eppel, 1992: The nonhydrostatic mesoscale model GESIMA. Part I: Dynamical equations and tests. Beitr. Phys. Atmos., 65 , 129146.

    • Search Google Scholar
    • Export Citation
  • Lafore, J., and Coauthors. 1998: The Meso–NH atmospheric modeling system. Part I: Adiabatic formulation and control simulations. Ann. Geophys., 16 , 90109.

    • Search Google Scholar
    • Export Citation
  • Laprise, R., D. Caya, G. Bergeron, and M. Giguère, 1997: The formulation of the André Robert MC2 (Mesoscale Compressible Community) model. Atmos.–Ocean, 35 , 127152.

    • Search Google Scholar
    • Export Citation
  • Lynch, R., J. Rice, and D. Thomas, 1964: Direct solution of partial difference equations by tensor product methods. Numer. Math., 6 , 185199.

    • Search Google Scholar
    • Export Citation
  • Machenhauer, B., and R. Daley, 1972: A baroclinic primitive equation model with a spectral representation in three dimensions. Tech. Rep. 4, Institute of Theoretical Meteorology, University of Copenhagen, 63 pp.

    • Search Google Scholar
    • Export Citation
  • Marshall, J., C. Hill, L. Perelman, and A. Adcroft, 1997: Hydrostatic, quasi-hydrostatic and non-hydrostatic ocean modeling. J. Geophys. Res., 102 , 57335752.

    • Search Google Scholar
    • Export Citation
  • Navon, I., and Y. Cai, 1993: Domain decomposition and parallel processing of a finite element model of the shallow water equations. Comput. Methods Appl. Mech. Eng., 106 , 179212.

    • Search Google Scholar
    • Export Citation
  • Prusa, J., P. Smolarkiewicz, and R. Garcia, 1996: On the propagation and breaking at high altitudes of gravity waves excited by tropospheric forcing. J. Atmos. Sci., 53 , 21862216.

    • Search Google Scholar
    • Export Citation
  • Roache, P., 1972: Computational Fluid Dynamics. Hermosa, 446 pp.

  • Robert, A., 1969: Integration of a spectral model of the atmosphere by the implicit method. Proc. WMO/IUGG Symp. on NWP, Vol. 7, Tokyo, Japan, Japan Meteorological Agency, 19–24.

    • Search Google Scholar
    • Export Citation
  • Robert, A., 1981: A stable numerical integration scheme for the primitive meteorological equations. Atmos.–Ocean, 19 , 3546.

  • Robert, A., 1993: Bubble convection experiments with a semi-implicit formulation of the Euler equations. J. Atmos. Sci., 50 , 18651873.

    • Search Google Scholar
    • Export Citation
  • Robert, A., T. Yee, and H. Ritchie, 1985: A semi-Lagrangian semi-implicit numerical integration scheme for multilevel atmospheric models. Mon. Wea. Rev., 113 , 388394.

    • Search Google Scholar
    • Export Citation
  • Saad, Y., 1993: A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Stat. Comput., 14 , 461469.

  • Saad, Y., 1996: Iterative Methods for Sparse Linear Systems. PWS, 447 pp.

  • Saad, Y., and M. Schultz, 1986: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7 , 856869.

    • Search Google Scholar
    • Export Citation
  • Saito, K., U. Schättler, and U. Steppler, 1998: 3D mountain waves by the Lokal-Modell of DWD and the MRI mesoscale nonhydrostatic model. Pap. Meteor. Geophys., 49 , 719.

    • Search Google Scholar
    • Export Citation
  • Schumann, U., and H. Volkert, 1987: Three-dimensional mass- and momentum-consistent Helmholtz-equation in terrain-following coordinates. Notes on Numerical Fluid Mechanics, W. Hackbusch, Ed., Vol. 10, Springer-Verlag, 109–131.

    • Search Google Scholar
    • Export Citation
  • Schumann, U., and R. Sweet, 1988: Fast Fourier transforms for direct solution of Poisson's equation with staggered boundary conditions. J. Comput. Phys., 75 , 123137.

    • Search Google Scholar
    • Export Citation
  • Skålin, R., 1997: Scalability of parallel gridpoint limited-area atmospheric models. Part II: Semi-implicit time-integration schemes. J. Atmos. Oceanic Technol., 14 , 442455.

    • Search Google Scholar
    • Export Citation
  • Skamarock, W., P. Smolarkiewicz, and J. Klemp, 1997: Preconditioned conjugate-residual solvers for Helmholtz equations in nonhydrostatic models. Mon. Wea. Rev., 125 , 587599.

    • Search Google Scholar
    • Export Citation
  • Smith, R., 1980: Linear theory of stratified hydrostatic flow past an isolated mountain. Tellus, 32 , 348364.

  • Smolarkiewicz, P., and J. Pudykiewicz, 1992: A class of semi-Lagrangian approximations for fluids. J. Atmos. Sci., 49 , 20822096.

  • Smolarkiewicz, P., and L. Margolin, 1994: Variational solver for elliptic problems in atmospheric flows. Appl. Math. Comput. Sci., 4 , 527551.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., and L. Margolin, 1997: On forward-in-time differencing for fluids: An Eulerian/semi-Lagrangian nonhydrostatic model for stratified flows. Atmos.–Ocean, 35 , 127152.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., and L. Margolin, 2000: Variational methods for elliptic problems in fluid models. Proc. ECMWF Workshop on Developments in Numerical Methods for Very High Resolution Global Models, Reading, United Kingdom, ECMWF, 137–159.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., and J. Prusa, 2002: VLES modelling of geophysical fluids with nonoscillatory forward-in-time schemes. Int. J. Numer. Methods Fluids, 39 , 779819.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., V. Grubisǐć, and L. Margolin, 1997: On forward-in-time differencing for fluids: Stopping criteria for iterative solutions of anelastic pressure equations. Mon. Wea. Rev., 125 , 647654.

    • Search Google Scholar
    • Export Citation
  • Sweet, R., 1973: Direct methods for the solution of Poisson's equation on a staggered grid. J. Comput. Phys., 12 , 422428.

  • Tanguay, M., A. Robert, and R. Laprise, 1990: A semi-implicit semi-Lagrangian fully compressible regional forecast model. Mon. Wea. Rev., 118 , 19701980.

    • Search Google Scholar
    • Export Citation
  • Tapp, M., and P. White, 1976: A nonhydrostatic mesoscale model. Quart. J. Roy. Meteor. Soc., 102 , 277296.

  • Thomas, S., C. Girard, R. Benoit, M. Desgagné, and P. Pellerin, 1998: A new adiabatic kernel for the MC2 model. Atmos.–Ocean, 36 , 241270.

    • Search Google Scholar
    • Export Citation
  • Thomas, S., C. Girard, G. Doms, and U. Schättler, 2000: Semi-implicit scheme for the DWD Lokal-Modell. Meteor. Atmos. Phys., 73 , 105125.

    • Search Google Scholar
    • Export Citation
  • Thomas, S., J. Hacker, M. Desgagné, and R. Stull, 2002: An ensemble analysis of forecast errors related to floating point performance. Wea. Forecasting, 17 , 898906.

    • Search Google Scholar
    • Export Citation
  • Tokioka, T., 1978: Some considerations on vertical differencing. J. Meteor. Soc. Japan, 56 , 98111.

  • van der Vorst, H., 1992: Bi-CGStab: A fast and smoothly converging variant of Bi-CG for the solution of non-symmetric linear systems. SIAM J. Sci. Stat. Comput., 12 , 631644.

    • Search Google Scholar
    • Export Citation
  • Yeh, K-S., J. Cote, S. Gravel, A. Methot, A. Patoine, M. Roch, and A. Staniforth, 2002: The CMC–MRB Global Environmental Multiscale (GEM) model. Part III: Nonhydrostatic formulation. Mon. Wea. Rev., 130 , 339356.

    • Search Google Scholar
    • Export Citation

APPENDIX A

Generalized Conjugate-Residual Approach

Here we describe the preconditioned GCR(k) algorithm used in this study. In general, we consider a linear elliptic problem,
i1520-0493-131-10-2464-ea1
with variable coefficients A, CIJ, DI, Q, and either periodic, Dirichlet, or Neumann boundary conditions and adopt the following notation: The discrete representation of a field on the grid is denoted by the subscript i; the discrete representation of the elliptic operator on the lhs of (A1) is denoted by Li(Ψ); and the inner product 〈ξζ〉 ≡ Σi ξiζi. The preconditioner P is a linear operator that approximates L to a greater or lesser degree but is easier to invert. Its role is to substitute (A1) with an auxiliary problem that converges faster (than the original problem) because of a closer clustering of the eigenvalues of the auxiliary operator resulting from the superposition of L and P−1. In this paper, we are primarily concerned with “left” preconditioning that substitutes (A1) with an auxiliary problem P−1[L(Ψ) − Q] = 0. Its accelerated convergence exploits spectral properties of P−1L. In general, “right” preconditioning is also possible. It augments (A1) with LP−1[P(Ψ)] = Q, and its convergence relies on reducing the spectral radius of LP−1. Left preconditioning assumes P constant during the iteration process, whereas right preconditioning allows for variable P as, for example, in the FGMRES solver of Saad (1993).
The GCR(k) method of Eisenstat et al. (1983) may be derived via variational arguments (cf. Smolarkiewicz and Margolin 1994, 2000). In essence, we augment (A1) with a kth-order damped oscillation equation:
i1520-0493-131-10-2464-ea2
Then we discretize (A2) in a pseudotime τ to form the affine discrete equation for the progression of the residual errors r and determine the optimal parameters T1, … , Tk−1 and integration increment Δτ (variable in τ) that assure minimization of the residual errors in the norm defined by the inner product 〈rr〉. This leads to the following algorithm.
For any initial guess Ψ0i, set r0i = Li0) − Qi, p0i = P−1i(r0); then iterate:
i1520-0493-131-10-2464-eqa1

For the convergence, the GCR(k) algorithm above requires LP−1 to be negative definiteA1 but not necessarily self-adjoint.A2 Direct evaluation of the elliptic operator on the grid takes place only once per iteration following the preconditioning e = P−1(rν+1), which provides an estimate of the solution error eν+1 = Ψν+1 − Ψexact.

APPENDIX B

Line-Relaxation Preconditioners

A distinctive feature of meteorological flows is their anisotropy in the vertical direction—the larger the ratio of the horizontal scale of the flow to the fluid depth, the stiffer the elliptic problem. A class of simple yet effective preconditioners that mitigate this aspect (of elliptic problems in meteorology) derives from an implicit Richardson iteration (Smolarkiewicz and Margolin 2000),
i1520-0493-131-10-2464-eb1
a realization of the preconditioning step e = P−1(rν+1) of the GCR(k) solver summarized in appendix A. Here, Ph and Pz are the horizontal and the vertical counterparts of the operator P, respectively; Δτ̃ is a parameter of the iteration (a pseudo-time step) based on spectral properties of Ph [viz., linear stability analysis of (B1)]; μ numbers successive Richardson iterations; and ν numbers the outer iterations of the Krylov solver. Equation (B1) leads to a linear problem,
Iτ̃Pzeμ+1μ
where μeμ + Δτ̃[Ph(eμ) − rν+1], that can be solved readily using the well-known tridiagonal algorithm (cf. appendix A in Roache 1972). Alternating implicit discretization between the vertical and horizontal counterparts of P in fractional steps of τ̃ leads to the ADI preconditioners of Skamarock et al. (1997).
The implicit Richardson preconditioner in (B1) can be further improved. Consider extending the Richardson diffusion scheme with respect to Ph to the diagonally preconditioned Duffort–Frankel type implicit algorithm,
i1520-0493-131-10-2464-eb3
where (−1)D stands for the diagonal coefficient embedded within the matrix representing Ph on the grid. Note that adding the relaxation term on the rhs of (B1) has the effect of replacing the Deμ term with D(eμ+1) in Ph(eμ) without complicating flux boundary conditions imposed in constructing Ph(e) (cf. Smolarkiewicz and Margolin 2000 for discussions). In the limit Δτ̃ → ∞, (B3) is equivalent to the block Jacobi preconditioner proven effective in meteorological applications (Thomas et al. 1998, 2000).

APPENDIX C

Coordinate Systems

For a polar stereographic projection, true at longitude ϕ0, m is the map scale factor and S = m2:
i1520-0493-131-10-2464-ec1
The components (U, V) of the wind “image” are defined in terms of the scaled and rotated components of the true horizontal velocity (u, υ) in local (x, y, z) coordinates,
i1520-0493-131-10-2464-ec2
where λ represents longitude. The (X, Y) components of the conformal coordinates are scaled and rotated according to
i1520-0493-131-10-2464-ec3
In conformal coordinates (X, Y, z), the Euler equations take the following form:
i1520-0493-131-10-2464-ec4
where K = (U2 + V2)/2 is the pseudohorizontal kinetic energy per unit mass. The material derivative is written as
i1520-0493-131-10-2464-ec5
The divergence of (U, V, w) is given by
i1520-0493-131-10-2464-ec6
where the horizontal divergence of (U, V) on the rhs is denoted by DH(U, V) throughout the body of the paper.
To account for the orography one can introduce a terrain-following coordinate system, where any monotonic function of geometrical height can be used as a vertical coordinate, denoted by Z. Throughout this paper we employ the classical transformation of Gal-Chen and Sommerville (1975), where
i1520-0493-131-10-2464-ec7
Because Z in (C7) is assumed to be time independent [cf. Prusa et al. (1996) for time-dependent extensions], the resulting Z-coordinate system represents a nondeformable system with surfaces of constant Z fixed in physical space—in contrast to various pressure-based vertical coordinates used in hydrostatic weather-prediction models. The mapping from conformal (X, Y, z) coordinates to curvilinear (X, Y, Z) coordinates is represented by the metric coefficients GIJ of the transformation:
i1520-0493-131-10-2464-ec8
Admitting an optional, variable spacing of vertical levels modifies the metric coefficients in (C8) via multiplicative factors (cf. Thomas et al. 1998). The terrain-following system is right-handed, where Z increases from the bottom of the model domain to the top, and the Z surface corresponding to the lower boundary is conformal to the terrain. Thus, by definition, G is always positive. Partial derivatives of any variable ψ are to be interpreted for terrain-following coordinates as follows:
i1520-0493-131-10-2464-ec9
The contravariant “vertical” velocity is related to the physical velocity components by the equation
WŻG−1wSG13UG23V

APPENDIX D

Numerical Operators

The centered averaging μζ and differencing δζ operators are defined as
i1520-0493-131-10-2464-ed1
where ψ is a dependent model variable that may be defined at the center of a grid cell or on the grid cell faces. Multiple averages are denoted by
i1520-0493-131-10-2464-ed3
Discrete horizontal and vertical derivative operators are defined as follows:
i1520-0493-131-10-2464-ed4
The discrete divergence operator D, acting on the contravariant wind components, can be written as follows:
i1520-0493-131-10-2464-ed5
The operator in (D5) satisfies the Gaussian divergence theorem in discrete form (Thomas et al. 1998). The Jacobian G of the coordinate transformation is located at grid cell vertices and D(U, V, W) at cell centers. The contravariant velocity component W, defined in (C10), is computed from
i1520-0493-131-10-2464-ed6
and the formal discrete flow divergence (D5) is then implemented as D(U, V, W) = DH(U, V) + δzw, where
i1520-0493-131-10-2464-ed7

Fig. 1.
Fig. 1.

Potential temperature θ (a) xz and (b) xy cross sections through the center of the convective bubble at t = 450 s. The xy cross sections are at z = 800 m, shown by the horizontal dashed line in (a). Contour interval is 0.1 K

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 2.
Fig. 2.

Spanwise (a) x-average and (b) y-average 1D power spectra at three height levels of the solution in Fig. 1. The thin solid line shows the “−5/3” slope for reference

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 3.
Fig. 3.

The rms divergence as a function of ε in Eq. (29) for the convective bubble case shown in Fig. 1. Results with both the Jacobi (solid line) and the DCT (dashed line) preconditioners are shown

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 4.
Fig. 4.

Performance ratios for CPU and iteration count, as a function of ε, for the convective bubble case shown in Fig. 1.

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 5.
Fig. 5.

Vertical velocity (left) xz and (right) x–y cross sections through the mountain wave simulations at time step 480: (top) hydrostatic case and (bottom) nonhydrostatic case. The xy cross sections are at z = 2100 m. Contour intervals are 0.01 m s−1 for the hydrostatic case and 0.04 m s−1 for the nonhydrostatic case

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 6.
Fig. 6.

One-dimensional power spectra through the centers of the right-hand panels in Fig. 5. Solid and dashed lines denote the hydrostatic and nonhydrostatic case, respectively

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 7.
Fig. 7.

The rms divergence and average number of GCR iterations per time step for the (a) hydrostatic and (b) nonhydrostatic mountain wave cases. Results with both the Jacobi (JAC) and DCT preconditioner are shown

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 8.
Fig. 8.

The CPU time performance ratio as a function of convergence criterion ε for both the hydrostatic and nonhydrostatic mountain wave cases

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

Fig. 9.
Fig. 9.

Wind solution for the first model level (Z = 150 m) over the Scandinavian topography at 24 h with the DCT preconditioner. Topography is also shown at 500-m intervals

Citation: Monthly Weather Review 131, 10; 10.1175/1520-0493(2003)131<2464:SPFNAM>2.0.CO;2

1

Throughout this paper GCR (4) is employed unless stated otherwise.

2

Other popular alternatives include the biconjugate gradients stabilized (BiCGStab; van der Vorst 1992) and transpose-free quasi-minimal residual (TFQMR; Freund 1993) algorithms.

3

Using the auxiliary “pressure” variable q, rather than the pressure p itself, facilitates isolating the 1/ρp nonlinearity in the numerical-model equations, analogous to q = lnps (surface pressure ps) in hydrostatic primitive equation models (cf. Machenhauer and Daley 1972; Bourke 1974; Robert et al. (1985; Tanguay et al. 1990). The Exner function Π = (p/p00)κ is an alternative (Skamarock et al. 1997).

4

Numerical dissipative terms absent from the analytic model (1)—absorbers, filters, subgrid-scale turbulence parameterizations, etc.—are split and integrated to first order.

5

The reader interested in implicit formulations of boundary conditions in the context of Krylov solvers is referred to Smolarkiewicz and Margolin (2000) for a discussion.

6

In models with XY cross differentiation [e.g., due to the coordinate mapping for the sake of mesh adaptivity (Smolarkiewicz and Prusa 2002)] either ADI or spectral preconditioners would require neglecting these cross terms.

7

Nonperiodic boundary conditions are always homogeneous in the preconditioner (Smolarkiewicz and Margolin 2000, section 3).

8

All calculations reported were performed with 64-bit real arithmetic.

9

For the exact steady-state solution, the rms divergence should vanish.

10

Computational complexity estimates for the various algorithms are provided by Saad (1996).

A1

An operator A is said to be definite if 〈ξA(ξ)〉 is either strictly positive (positive definite) or strictly negative (negative definite) for all ξ.

A2

An operator A is said to be self-adjoint if 〈ξA(ζ)〉 = 〈ζA(ξ)〉 for all ξ and ζ.

Save
  • Arakawa, A., and V. Lamb, 1977: Computational design of the basic dynamical processes of the UCLA general circulation model. Methods in Computational Physics, Vol. 17, J. Chang, Ed., Academic Press, 174–265.

    • Search Google Scholar
    • Export Citation
  • Axelsson, O., 1994: Iterative Solution Methods. Cambridge University Press, 654 pp.

  • Barros, S., and T. Kauranne, 1994: On the parallelization of global spectral weather models. Parallel Comput., 20 , 13351356.

  • Benoit, R., M. Desgagné, P. Pellerin, S. Pellerin, Y. Chartier, and S. Desjardins, 1997: The Canadian MC2: A semi-Lagrangian, semi-implicit wide-band atmospheric model suited for fine-scale process studies and simulations. Mon. Wea. Rev., 125 , 23822415.

    • Search Google Scholar
    • Export Citation
  • Bernardet, P., 1995: The pressure term in the anelastic model: A symmetric elliptic solver for an Arakawa C grid in generalized coordinates. Mon. Wea. Rev., 123 , 24742490.

    • Search Google Scholar
    • Export Citation
  • Bourke, W., 1974: A multi-level spectral model. I. Formulation and hemispheric integrations. Mon. Wea. Rev., 102 , 687701.

  • Cullen, M., 1990: A test of a semi-implicit integration technique for a fully compressible nonhydrostatic model. Quart. J. Roy. Meteor. Soc., 116 , 12531258.

    • Search Google Scholar
    • Export Citation
  • Davies, H., 1976: A lateral boundary formulation for multi-level prediction models. Quart. J. Roy. Meteor. Soc., 102 , 405418.

  • Eisenstat, S., H. Elman, and M. Schultz, 1983: Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal., 2 , 345357.

    • Search Google Scholar
    • Export Citation
  • Elman, H., and D. O'Leary, 1998: Efficient iterative solution of the three-dimensional Helmholtz equation. J. Comput. Phys., 142 , 163181.

    • Search Google Scholar
    • Export Citation
  • Freund, R., 1993: A transpose-free quasi-mimimum residual algorithm for non-Hermitian linear systems. SIAM J. Sci. Stat. Comput., 14 , 470482.

    • Search Google Scholar
    • Export Citation
  • Gal-Chen, T., and R. Sommerville, 1975: On the use of a coordinate transformation for the solution of the Navier–Stokes equations. J. Comput. Phys., 17 , 209228.

    • Search Google Scholar
    • Export Citation
  • Golding, B., 1992: An efficient nonhydrostatic forecast model. Meteor. Atmos. Phys., 50 , 89103.

  • Grabowski, W., and P. Smolarkiewicz, 2002: A multiscale model for meteorological research. Mon. Wea. Rev., 130 , 939956.

  • Greenbaum, A., 1997: Iterative Methods for Solving Linear Systems. Society of Industrial and Applied Mathematics, 220 pp.

  • Hestenes, M., and E. Stiefel, 1952: Method of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. (U.S.), 49 , 409436.

    • Search Google Scholar
    • Export Citation
  • Kadogliu, H., and S. Mudrick, 1992: On the implementation of the GMRES(m) method to elliptic equations in meteorology. J. Comput. Phys., 102 , 348359.

    • Search Google Scholar
    • Export Citation
  • Kapitza, H., and D. Eppel, 1992: The nonhydrostatic mesoscale model GESIMA. Part I: Dynamical equations and tests. Beitr. Phys. Atmos., 65 , 129146.

    • Search Google Scholar
    • Export Citation
  • Lafore, J., and Coauthors. 1998: The Meso–NH atmospheric modeling system. Part I: Adiabatic formulation and control simulations. Ann. Geophys., 16 , 90109.

    • Search Google Scholar
    • Export Citation
  • Laprise, R., D. Caya, G. Bergeron, and M. Giguère, 1997: The formulation of the André Robert MC2 (Mesoscale Compressible Community) model. Atmos.–Ocean, 35 , 127152.

    • Search Google Scholar
    • Export Citation
  • Lynch, R., J. Rice, and D. Thomas, 1964: Direct solution of partial difference equations by tensor product methods. Numer. Math., 6 , 185199.

    • Search Google Scholar
    • Export Citation
  • Machenhauer, B., and R. Daley, 1972: A baroclinic primitive equation model with a spectral representation in three dimensions. Tech. Rep. 4, Institute of Theoretical Meteorology, University of Copenhagen, 63 pp.

    • Search Google Scholar
    • Export Citation
  • Marshall, J., C. Hill, L. Perelman, and A. Adcroft, 1997: Hydrostatic, quasi-hydrostatic and non-hydrostatic ocean modeling. J. Geophys. Res., 102 , 57335752.

    • Search Google Scholar
    • Export Citation
  • Navon, I., and Y. Cai, 1993: Domain decomposition and parallel processing of a finite element model of the shallow water equations. Comput. Methods Appl. Mech. Eng., 106 , 179212.

    • Search Google Scholar
    • Export Citation
  • Prusa, J., P. Smolarkiewicz, and R. Garcia, 1996: On the propagation and breaking at high altitudes of gravity waves excited by tropospheric forcing. J. Atmos. Sci., 53 , 21862216.

    • Search Google Scholar
    • Export Citation
  • Roache, P., 1972: Computational Fluid Dynamics. Hermosa, 446 pp.

  • Robert, A., 1969: Integration of a spectral model of the atmosphere by the implicit method. Proc. WMO/IUGG Symp. on NWP, Vol. 7, Tokyo, Japan, Japan Meteorological Agency, 19–24.

    • Search Google Scholar
    • Export Citation
  • Robert, A., 1981: A stable numerical integration scheme for the primitive meteorological equations. Atmos.–Ocean, 19 , 3546.

  • Robert, A., 1993: Bubble convection experiments with a semi-implicit formulation of the Euler equations. J. Atmos. Sci., 50 , 18651873.

    • Search Google Scholar
    • Export Citation
  • Robert, A., T. Yee, and H. Ritchie, 1985: A semi-Lagrangian semi-implicit numerical integration scheme for multilevel atmospheric models. Mon. Wea. Rev., 113 , 388394.

    • Search Google Scholar
    • Export Citation
  • Saad, Y., 1993: A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Stat. Comput., 14 , 461469.

  • Saad, Y., 1996: Iterative Methods for Sparse Linear Systems. PWS, 447 pp.

  • Saad, Y., and M. Schultz, 1986: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7 , 856869.

    • Search Google Scholar
    • Export Citation
  • Saito, K., U. Schättler, and U. Steppler, 1998: 3D mountain waves by the Lokal-Modell of DWD and the MRI mesoscale nonhydrostatic model. Pap. Meteor. Geophys., 49 , 719.

    • Search Google Scholar
    • Export Citation
  • Schumann, U., and H. Volkert, 1987: Three-dimensional mass- and momentum-consistent Helmholtz-equation in terrain-following coordinates. Notes on Numerical Fluid Mechanics, W. Hackbusch, Ed., Vol. 10, Springer-Verlag, 109–131.

    • Search Google Scholar
    • Export Citation
  • Schumann, U., and R. Sweet, 1988: Fast Fourier transforms for direct solution of Poisson's equation with staggered boundary conditions. J. Comput. Phys., 75 , 123137.

    • Search Google Scholar
    • Export Citation
  • Skålin, R., 1997: Scalability of parallel gridpoint limited-area atmospheric models. Part II: Semi-implicit time-integration schemes. J. Atmos. Oceanic Technol., 14 , 442455.

    • Search Google Scholar
    • Export Citation
  • Skamarock, W., P. Smolarkiewicz, and J. Klemp, 1997: Preconditioned conjugate-residual solvers for Helmholtz equations in nonhydrostatic models. Mon. Wea. Rev., 125 , 587599.

    • Search Google Scholar
    • Export Citation
  • Smith, R., 1980: Linear theory of stratified hydrostatic flow past an isolated mountain. Tellus, 32 , 348364.

  • Smolarkiewicz, P., and J. Pudykiewicz, 1992: A class of semi-Lagrangian approximations for fluids. J. Atmos. Sci., 49 , 20822096.

  • Smolarkiewicz, P., and L. Margolin, 1994: Variational solver for elliptic problems in atmospheric flows. Appl. Math. Comput. Sci., 4 , 527551.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., and L. Margolin, 1997: On forward-in-time differencing for fluids: An Eulerian/semi-Lagrangian nonhydrostatic model for stratified flows. Atmos.–Ocean, 35 , 127152.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., and L. Margolin, 2000: Variational methods for elliptic problems in fluid models. Proc. ECMWF Workshop on Developments in Numerical Methods for Very High Resolution Global Models, Reading, United Kingdom, ECMWF, 137–159.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., and J. Prusa, 2002: VLES modelling of geophysical fluids with nonoscillatory forward-in-time schemes. Int. J. Numer. Methods Fluids, 39 , 779819.

    • Search Google Scholar
    • Export Citation
  • Smolarkiewicz, P., V. Grubisǐć, and L. Margolin, 1997: On forward-in-time differencing for fluids: Stopping criteria for iterative solutions of anelastic pressure equations. Mon. Wea. Rev., 125 , 647654.

    • Search Google Scholar
    • Export Citation
  • Sweet, R., 1973: Direct methods for the solution of Poisson's equation on a staggered grid. J. Comput. Phys., 12 , 422428.

  • Tanguay, M., A. Robert, and R. Laprise, 1990: A semi-implicit semi-Lagrangian fully compressible regional forecast model. Mon. Wea. Rev., 118 , 19701980.

    • Search Google Scholar
    • Export Citation
  • Tapp, M., and P. White, 1976: A nonhydrostatic mesoscale model. Quart. J. Roy. Meteor. Soc., 102 , 277296.

  • Thomas, S., C. Girard, R. Benoit, M. Desgagné, and P. Pellerin, 1998: A new adiabatic kernel for the MC2 model. Atmos.–Ocean, 36 , 241270.

    • Search Google Scholar
    • Export Citation
  • Thomas, S., C. Girard, G. Doms, and U. Schättler, 2000: Semi-implicit scheme for the DWD Lokal-Modell. Meteor. Atmos. Phys., 73 , 105125.

    • Search Google Scholar
    • Export Citation
  • Thomas, S., J. Hacker, M. Desgagné, and R. Stull, 2002: An ensemble analysis of forecast errors related to floating point performance. Wea. Forecasting, 17 , 898906.

    • Search Google Scholar
    • Export Citation
  • Tokioka, T., 1978: Some considerations on vertical differencing. J. Meteor. Soc. Japan, 56 , 98111.

  • van der Vorst, H., 1992: Bi-CGStab: A fast and smoothly converging variant of Bi-CG for the solution of non-symmetric linear systems. SIAM J. Sci. Stat. Comput., 12 , 631644.

    • Search Google Scholar
    • Export Citation
  • Yeh, K-S., J. Cote, S. Gravel, A. Methot, A. Patoine, M. Roch, and A. Staniforth, 2002: The CMC–MRB Global Environmental Multiscale (GEM) model. Part III: Nonhydrostatic formulation. Mon. Wea. Rev., 130 , 339356.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    Potential temperature θ (a) xz and (b) xy cross sections through the center of the convective bubble at t = 450 s. The xy cross sections are at z = 800 m, shown by the horizontal dashed line in (a). Contour interval is 0.1 K

  • Fig. 2.

    Spanwise (a) x-average and (b) y-average 1D power spectra at three height levels of the solution in Fig. 1. The thin solid line shows the “−5/3” slope for reference

  • Fig. 3.

    The rms divergence as a function of ε in Eq. (29) for the convective bubble case shown in Fig. 1. Results with both the Jacobi (solid line) and the DCT (dashed line) preconditioners are shown

  • Fig. 4.

    Performance ratios for CPU and iteration count, as a function of ε, for the convective bubble case shown in Fig. 1.

  • Fig. 5.

    Vertical velocity (left) xz and (right) x–y cross sections through the mountain wave simulations at time step 480: (top) hydrostatic case and (bottom) nonhydrostatic case. The xy cross sections are at z = 2100 m. Contour intervals are 0.01 m s−1 for the hydrostatic case and 0.04 m s−1 for the nonhydrostatic case