## 1. Introduction

Over the last few decades, global climate models have been widely used for the prediction of future climate trends. However, climate change will mostly affect the ecosystems and social and economic well-being at the regional scale (Watson et al. 1997). Hence, improving our capabilities in regional atmospheric simulations is of critical importance. It is recognized that the climate of the atmosphere is influenced by a disparate range of richly interacting spatiotemporal scales (Wheeler and Kiladis 1999). Indeed, localized flow structures, like hurricanes, may play an important role in obtaining the correct climate signal (Emanuel 2005; Mann and Emanuel 2006). Alas, the climate problem involves multiple physical scales as it couples with the oceans, the carbon cycle, ice sheets, and various other processes yielding a difficult multiscale and multiphysics problem.

Current models are incapable of representing the multiscale aspects of the climate system. This is due in part to the physical parameterizations involved. Historically, climate models are offshoots of weather models run on much longer time scales, at much lower resolution, with physics crudely parameterized (out of sheer necessity) instead of directly simulated (Slingo et al. 2009). Nowadays, with much more powerful computing facilities (http://www.top500.org) it is tempting to readdress some of those simplifying assumptions by attempting to run a weather model at weather resolutions on climate time scales. Still, it is not clear if such an approach will produce viable climate predictions. To reach such resolutions and generate a correct climate, various new alternative approaches to physics need to be investigated (e.g., see Grabowski 2001; Khairoutdinov and Randall 2001; Khouider et al. 2011). The goal of these approaches is to reproduce the correct multiscale wave patterns observed in nature directly. Apart from the multicloud approach described in Khouider et al. (2011), the so-called cloud resolving models are extremely costly to run compared with current parameterizations. Models will require more spatial resolution to eventually benefit from such improved physics. A very first step toward this goal consists of making them extremely scalable, while increasing their spatial resolution, regardless of the physics employed (Dennis et al. 2005b; Wehner 2008; Bhanot et al. 2008; Satoh et al. 2008). While this approach is promising, some flaws have recently been identified (e.g., see McClean et al. 2011). Nevertheless, even if a correct multiscale physics package were available, the remaining problem with such approaches would be mainly computational. Each uniform doubling of the horizontal resolution results in an increase in computational cost by a factor of 8, where a factor of 2 stems from each of the two horizontal dimensions and from time.
Physics are the computational bottleneck of a climate or weather model and their cost is unfortunately directly proportional to the total number of spatial grid nodes employed. One way to reduce the number of nodes in a model, while preserving the range of spatial scales involved, is to employ fully unstructured dynamically adaptive meshes. Such approaches have recently proven very successful on spherical geometries (Burstedde et al. 2010; Wilcox et al. 2010).

This article describes the development of a set of numerical techniques and demonstrates their potential for solving multiscale environmental flows in an efficient fashion. As a first step toward a more realistic three-dimensional model, it focuses on the shallow-water equations, presenting the major difficulties found in the horizontal aspects of three-dimensional global geophysical models (Williamson et al. 1992).

Designed for unstructured grids, the high-order discontinuous Galerkin (DG) method (Cockburn et al. 2000) is a good candidate to renew the dynamical cores employed in environmental flow models. It is conservative, accurate, and well suited for advection-dominated flows (Cockburn and Shu 2001). The DG method enjoys most of the strengths of finite-element and finite-volume schemes while avoiding most of their weaknesses. The polynomial interpolation used inside each element allows for a high-order representation of the solution. As for finite-volume methods, advection schemes take into account the characteristic structure of the equations. Moreover, no degree of freedom is shared between two geometric entities. This high level of locality considerably simplifies the implementation of the method and contributes to its parallel efficiency. Finally, the mass matrix is block diagonal, and for explicit time-stepping schemes no linear solver is needed. We also observe a growing interest in DG methods in marine modeling (Aizinger and Dawson 2002; Bernard et al. 2007; Kubatko et al. 2006; Blaise et al. 2010a,b; Comblen et al. 2010). For atmospheric modeling, the high-order capabilities of this scheme are attractive (Nair et al. 2005a,b; Giraldo 2006; Giraldo and Restelli 2008; St-Cyr and Neckels 2009; Nair et al. 2009), and the increasing use of DG follows the trend to replace the global spectral transform methods with local ones (Neale et al. 2010; Dennis et al. 2011, manuscript submitted to *Int. J. High Perform. Comput. Appl.*).

The combination of the high-order DG method with dynamic refinement allows the resolution to be enhanced statically in the areas of interest for regional climate simulations and dynamically where the unsteady dynamics are more demanding. Hence, the computational power is used more effectively by concentrating the CPU load where it is needed. Dynamic adaptation of the computational mesh to locally modify the resolution, also known as *h* adaptivity, has been introduced in the last decade to simulate shallow-water flows on the sphere (Remacle et al. 2006; Bernard et al. 2007; Läuter et al. 2007; Weller et al. 2009). While those studies are based on conforming local mesh modifications such as edge splitting, collapsing, swapping, and node movements, other approaches consider nonconforming mesh modifications (St-Cyr et al. 2008) or the more classic Berger–Oliger algorithm (Chen et al. 2011). The present work focuses on a *hybrid* Adaptive Mesh Refinement (AMR) method, consisting of recursive nonconforming element splitting of an initially unstructured mesh. It thus provides a fast and very localized adaptation procedure (Löhner and Baum 1991). Using a shallow-water DG code to simulate multiscale flows, Kubatko et al. (2009) showed the gain in efficiency obtained with *p* adaptation, which consists of modifying the local order of interpolation during the simulation. To the authors’ knowledge, only one very recent study has been published that uses *h* and *p* adaptation simultaneously to solve shallow-water problems (Eskilsson 2011). While the simulations presented by Eskilsson (2011) are simplified and characterized by smooth fields on planar domains, the present work describes the first parallel *hp-*adaptive simulation of complex realistic flows on the sphere such as tsunamis.

This work has been conducted with an overarching goal of building a flexible software framework for the multiscale simulation of various problems in the climate sciences [a Multiscale Unified Simulation Environment (MUSE) for geoscientific applications, http://muse.ucar.edu]. Hence, the model described herein is a generic high-order *hp-*capable model for the simulation of various conservation laws. It is flexible enough to test innovative numerical techniques while relying on efficient computational kernel implementations with special attention to parallel performance. Solving the shallow-water equations is part of this attempt and constitutes the main topic of this paper. While dynamic adaptation is used, the definition of optimal refinement criteria is outside the scope of this work and will only be briefly discussed. The paper is organized in two sections. The model is described in the first section, with its discretization in both space and time as well as a description of how the equations are constrained onto the sphere. A short presentation of the *hp-*adaptation process is given, with details about the parallel load balancing strategy. The second section is dedicated to the assessment of the model. It is validated on test cases accepted by the community. The first test considers a geostrophically balanced flow where the availability of an exact solution permits the computation of the high-order spatial convergence rates. It is followed by the simulation of a flow impinging on a mountain using the adaptive model, which is compared with a high-resolution solution. Finally, the simulation of a realistic global tsunami event is presented.

## 2. Model description

The model solves the shallow-water equations on the sphere using the nodal DG method with dynamic adaptivity. This section is devoted to the description of the model, including the continuous equations, their discretization, the handling of flows on the sphere, and the dynamic adaptivity procedure.

### a. The shallow-water equations

Here, *η*, **u**, and *H* are the elevation of the free surface, the depth-averaged horizontal velocity, and the total depth, respectively. The gravitational acceleration, the Coriolis parameter, the constant density, and the surface and bottom stresses are denoted *g*, *f*, *ρ*, *τ*^{s}, and *τ*^{b}, respectively. The vertical unit vector, pointing upward, is designated by **k**. The elevation gradient term of the momentum equation in (1) can be written as a sum of a flux term and a source term, leading to an equivalent form in which *h* = *H* − *η* is the depth at rest.

### b. Weak formulation

For the sake of completeness and to highlight the peculiarities of the described implementation, we provide here the full weak DG finite-element formulation for the shallow-water equations. The weak formulations of the different equations are derived separately: the momentum equation is first considered followed by the free-surface equation.

#### 1) Momentum equation

The domain **Ω** is divided into *N*_{e} elements Ω_{e}. Assuming now that each test function is nonzero in one element, and zero elsewhere, we can localize (4), obtaining the local problems (5), where *H*, **u**, and *η* are considered discontinuous at the boundaries of each element. The local problems in (5) have to be coupled with each other, which in DG methods is obtained by means of numerical fluxes, mimicking what is done for finite-volume formulations. To see how numerical fluxes can be introduced in (5), we integrate by parts and replace the boundary fluxes by numerical ones. With *H***u** the variable to be solved by the equation, the local Lax–Friedrichs unique flux is defined as in (6), where the superscripts *R* and *L* correspond to the values of the discontinuous fields at the right and left of the interface, respectively. The vector **n** is the rightward normal. Using the mean {*S*} = 0.5(*S*^{R} + *S*^{L}) and jump [*S*] = 0.5(*S*^{R} − *S*^{L}) operators, the Lax–Friedrichs flux in (6) can be rewritten in terms of these two operators. The resulting terms are as follows:

- elevation gradient:
- advection:
- Lax–Friedrichs penalization (expressed in the right-hand side):

Here, ∂Ω_{e} denotes the one-dimensional contour of the element Ω_{e}. The Coriolis and stress terms remain unchanged since no spatial differentiation operator appears in their expressions. It is possible to integrate the elevation gradient and advection terms by parts once again (Hesthaven and Warburton 2008), leading to the following:

- elevation gradient:
- advection:
#### 2) Free-surface equation

### c. Discrete formulation

Here, *p* is the number of nodes in the considered element, while *ϕ*_{l} are the associated shape functions. Using quadrilateral elements, they are obtained by the tensor product of Lagrange polynomials (Hesthaven and Warburton 2008). Following the usual Galerkin procedure, we select discrete components of the test functions belonging to the same space as the polynomial basis functions used to approximate the solution (Karniadakis and Sherwin 2005; Hesthaven and Warburton 2008). The unknown degrees of freedom associated with the node *l* are represented by the vector (*H***u**)_{l} for the transport and the scalar *η*_{l} for the free-surface elevation.

The two-dimensional integration rules are derived from the one-dimensional ones (see Deville et al. 2002). The model is able to use either the Gauss–Legendre (GL) or the Legendre–Gauss–Lobatto (LGL) quadrature rules (Fig. 1). The LGL discretization is faster and simpler, as it does not need any interpolation of the variables at the edges of the elements when the fluxes are computed. However, it is also less accurate: with *p* nodes, the GL quadrature rule integrates exactly polynomials of order (2*p* − 1), while the LGL rule only integrates exactly polynomials of order (2*p* − 3). Using underintegration with integration by parts once, Kopriva and Gassner (2010) obtained faster simulations for a given error with GL nodes. However, they showed that this conclusion depends on the considered application. In the case of complex flows, the higher order of integration provided by the GL integration rule may be needed to guarantee the absence of aliasing that can trigger oscillations, especially in the presence of additional metric terms associated with the resolution of the equations on the sphere. However, the use of the LGL quadrature nodes in the simulations described in this article did not generate any oscillation and did not significantly increase the error. Hence, we relied on the LGL rule, allowing us to perform faster simulations.
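The exactness orders quoted above can be checked numerically. The following is a minimal one-dimensional sketch, not the model's implementation: the helper names `gl_rule`, `lgl_rule`, and `max_exact_degree` are hypothetical, and the LGL weights use the standard closed-form expression, with *n* = *p* − 1.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gl_rule(p):
    """Gauss-Legendre nodes and weights with p points on [-1, 1]."""
    return L.leggauss(p)

def lgl_rule(p):
    """Legendre-Gauss-Lobatto nodes and weights with p >= 2 points on [-1, 1]."""
    n = p - 1                                   # degree of the Legendre polynomial P_n
    Pn = L.Legendre.basis(n)
    interior = Pn.deriv().roots()               # interior nodes are the zeros of P_n'
    x = np.concatenate(([-1.0], np.sort(interior.real), [1.0]))
    w = 2.0 / (n * (n + 1) * Pn(x) ** 2)        # standard LGL weight formula
    return x, w

def max_exact_degree(x, w, max_deg=40, tol=1e-12):
    """Largest d such that every monomial x**k with k <= d is integrated exactly."""
    for k in range(max_deg + 1):
        exact = 2.0 / (k + 1) if k % 2 == 0 else 0.0
        if abs(np.dot(w, x ** k) - exact) > tol:
            return k - 1
    return max_deg

if __name__ == "__main__":
    for p in (3, 4, 5):
        print(p, max_exact_degree(*gl_rule(p)), max_exact_degree(*lgl_rule(p)))
```

For each *p*, the two printed degrees should be 2*p* − 1 for GL and 2*p* − 3 for LGL, which is the two-order gap traded for the convenience of having nodes on the element edges.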

### d. Curvilinear transformations

High-order methods need a high-order mapping in order for them to be effective: the high-order discretization will benefit from a high-order representation of the computational domain. While recent techniques can integrate the two-dimensional shallow-water equations on a large class of manifolds (e.g., Bernard et al. 2009), we focus on a simpler method to operate on the sphere. To this aim, we construct a mapping stemming from the original basis used in the mesh generation process and, for exact surfaces, a projection mapping points on the original mesh to the analytical surface. Next, the Jacobi matrices involved in the transformation are generated numerically by employing the discrete differentiation matrix (in each parametric direction) on the reference element. With the Jacobi matrices, all geometric entities can be recovered.

The mapping from the physical coordinates [**x** = (*x*_{1}, *x*_{2}, *x*_{3})] to the reference element coordinates (*ξ*, *ζ*) ∈ [−1, 1] × [−1, 1] (see Fig. 1) is performed as follows. Considering the sphere as the exact surface, we define the analytical projection and the bilinear mapping (16), where the *ϕ*^{1D,1}s are the usual linear one-dimensional basis functions and the **x**_{ij} are the four nodes given by the mesh generator. If the mesh generator could give quadratic elements then nine nodes would be available and the tensor product of the basis functions on the right of (16) would be biquadratic. It is now possible to evaluate the projected mapping at (*p* + *k*)^{2} LGL nodes, *k* being an integer ≥ 1 (*k* = 2 is used in this work). Evaluating at those quadrature nodes gives a discrete vector expressed with the *ϕ*^{1D,p+k}s, the one-dimensional basis functions of order *p* + *k*. Applying the discrete differentiation matrix with *p* + *k* quadrature points in each direction (Deville et al. 2002) yields the discrete version of the Jacobi matrices, which are then transferred from the (*p* + *k*)^{2} points to the original *p*^{2} points. The Jacobi matrices are then used to compute all the geometric information in the model. Finally, notice that it is necessary to use *k* ≥ 1 for the design order of the scheme to be *p*.

### e. Temporal discretization

To advance the solution from the time step *n* to the next one *n* + 1 with the SSP33 scheme (18), the procedure is composed of three stages, where **y**^{i} is the vector of all discrete degrees of freedom at the stage *i*. Each stage uses the discrete right-hand side of (13)–(14) evaluated with the value of the variables at the stage *s*.

This method ensures nonlinear stability properties in the numerical solution of hyperbolic partial differential equations with discontinuous solutions (Gottlieb et al. 2011). It preserves strong stability properties of the spatial discretization, which is a great advantage for the simulation of geophysical flows, characterized by a wide range of resolved and unresolved phenomena. Note that many other explicit Runge–Kutta time integrators of different orders are available in the model (http://muse.ucar.edu) and others can be easily added by introducing the corresponding array of coefficients.
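As an illustration of such a three-stage scheme, the sketch below implements the classical third-order strong-stability-preserving Runge–Kutta method in Shu–Osher form. It is assumed, not confirmed by the text above, that this is the variant referred to as SSP33; the function names `ssp33_step` and `integrate` and the callable `rhs` are hypothetical.

```python
import numpy as np

def ssp33_step(rhs, y, dt):
    """One step of the three-stage, third-order SSP Runge-Kutta scheme (Shu-Osher form).

    Each stage is a convex combination of forward-Euler updates, which is what
    carries the stability of the spatial discretization over to the full step.
    """
    y1 = y + dt * rhs(y)
    y2 = 0.75 * y + 0.25 * (y1 + dt * rhs(y1))
    return y / 3.0 + 2.0 / 3.0 * (y2 + dt * rhs(y2))

def integrate(rhs, y0, t_end, n_steps):
    """Integrate dy/dt = rhs(y) from 0 to t_end with n_steps uniform steps."""
    y = np.asarray(y0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        y = ssp33_step(rhs, y, dt)
    return y

if __name__ == "__main__":
    # Linear decay dy/dt = -y as a stand-in for the discrete right-hand side.
    for n in (50, 100):
        err = abs(integrate(lambda y: -y, [1.0], 1.0, n)[0] - np.exp(-1.0))
        print(n, err)
```

Halving the time step should reduce the printed error by a factor close to 8, consistent with third-order accuracy.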

The marching schemes described here are subject to the Courant–Friedrichs–Lewy (CFL) condition, and their time step is controlled by the highest Courant number of the global domain. It is generally associated with the smallest elements of the mesh and can be much larger than the average Courant number because of the variable resolution resulting from the unstructured grid and dynamic adaptation. As a result, the time step used to update the fields in the whole domain is constrained by the size of the smallest element, which is rather inefficient. Although an implicit method is not subject to the stability constraint on the time step, it may be too diffusive for elements characterized by a high Courant number. Hence, for adaptive methods to deliver their full potential, they need to be used in conjunction with other approaches such as local time stepping (Lörcher et al. 2008). This method still needs to be implemented in the code.

### f. Solving the equations on the sphere

Solving geophysical flows on the sphere is not a straightforward task: even if the flow is two-dimensional, the velocity field is three-dimensional and must remain tangent to the surface of the sphere. The use of non-Cartesian coordinates to formulate the equations, restricting the domain to the surface of the sphere, is tempting but suffers from the existence of singularities at the Poles (Mohseni and Colonius 2000).

Discontinuous Galerkin methods were combined with several Cartesian techniques to handle the computation of flows on the sphere, such as the cubed sphere (Nair et al. 2005a,b; Dennis et al. 2005a), or local tangent bases (Comblen et al. 2009; Bernard et al. 2009).

In this work, we consider the method proposed by Côté (1988): the equations are solved in a three-dimensional Cartesian space, but an additional constraint ensures that the fluid particles remain on the surface of the sphere. It has been successfully used in the DG framework (Giraldo et al. 2002; Giraldo 2006). Although this technique requires the resolution of three equations instead of two for the momentum, it is probably the easiest way to transform a two-dimensional planar model to a model operating on the sphere.

Consider the fluid particles that, at the Runge–Kutta stage *s*, are at the normalized position **r** (normalized coordinates of the nodes). After one Runge–Kutta substep (stage *s* + 1), those particles will occupy the normalized position **r***. An additional term is added to the discrete momentum equation for each Runge–Kutta stage. It corresponds to a restoring force toward the center of the sphere, with its direction parallel to the semi-implicit position of the fluid particles originating from the nodes (Côté 1988), as expressed in (19), where Δ*t*^{s} corresponds to the Runge–Kutta subtime step at stage *s*; in the case of SSP33, its values follow from (18). The multiplier *μ* needs to be determined such that the updated transport field remains tangent to the sphere, that is, such that it satisfies the orthogonality condition [(*H***u**)^{s+1} · **r*** = 0]. Equation (19) can be rewritten accordingly. After multiplying (22) by **r***, using (20) and the orthogonality condition, we obtain the value of *μ*.
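The role of the multiplier can be illustrated with a simplified stand-in: given an unconstrained update of the transport and the position **r***, choose *μ* per node so that the corrected transport is orthogonal to **r***. This sketch only enforces the orthogonality condition; it is not the exact form of (19)–(22), and the function name `constrain_to_sphere` and its array layout are assumptions.

```python
import numpy as np

def constrain_to_sphere(hu_star, r_star):
    """Correct the transport so that it is tangent to the sphere at each node.

    hu_star : (n_nodes, 3) unconstrained transport after a Runge-Kutta substep
    r_star  : (n_nodes, 3) positions of the particles (need not be normalized)
    Returns the corrected transport and the multiplier mu at each node.
    """
    hu_star = np.asarray(hu_star, dtype=float)
    r_star = np.asarray(r_star, dtype=float)
    r_hat = r_star / np.linalg.norm(r_star, axis=1, keepdims=True)
    # mu solves (hu_star + mu * r_hat) . r_hat = 0 for each node
    mu = -np.einsum("ij,ij->i", hu_star, r_hat)
    return hu_star + mu[:, None] * r_hat, mu

if __name__ == "__main__":
    hu, mu = constrain_to_sphere([[1.0, 2.0, 3.0]], [[0.0, 0.0, 2.0]])
    print(hu, mu)  # the radial (third) component of the transport is removed
```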

### g. Dynamic adaptation

The main characteristic of the model is its ability to modify the mesh and the order of interpolation inside the elements during run time, thus efficiently concentrating the computational resources where they are needed in order to capture the key aspects of the flow. In what follows, both the *p* and *h* refinement are described. Both refinement strategies are used in conjunction with an error indicator that is also described below.

#### 1) *h* adaptation

In the model, we choose a hybrid mesh adaptation procedure. It combines an unstructured mesh with a quad-tree refinement strategy within each element. This approach closely follows the work described in Edwards (2002) and Stewart and Edwards (2002). The first assumption consists in denoting each element in the initial mesh as the *root* of a quad tree. A 2:1 constraint is then enforced between the trees in order to allow for smooth transitions between refined and unrefined regions: two neighboring elements can only be separated by a single level of refinement. This procedure can be pursued until the desired level of refinement is reached (Fig. 2). A similar approach was also used in the spectral element code presented in St-Cyr et al. (2008). The main advantage of such an approach is that it simplifies the construction of the projection operators between the changing meshes. The elements are represented using a high-order basis, with each newly created node having to be projected onto the geometry, in this study the surface of the sphere. The resulting mesh is nonconforming; in other words, some element corners face the center of neighboring element edges. This is not problematic, as the neighboring elements only interact through the fluxes between the edges, which can still be computed uniquely as described in Kopriva et al. (2002).

Adaptive mesh refinement process. Prolongation–restriction operators are needed to locally transfer the data between the different grids.

Citation: Monthly Weather Review 140, 3; 10.1175/MWR-D-11-00038.1


Consider a parent element Ω_{i} and its four child elements Ω_{ij} (Fig. 2). The prolongation (refinement) and restriction (coarsening) operators consist of an *L*^{2} minimization and require the resolution of a local projection problem for a field *A* approximated by *A*_{fine} on the refined (child) elements and *A*_{coarse} on the parent element.

As the node positions do not change during the refinement process, no numerical diffusion is introduced by the interpolation of the fields. The numerical diffusion introduced during mesh coarsening is minimal, resulting from the reduction of the discrete space. Hence, this adaptive mesh technique is usually less diffusive than other procedures based on remeshing or mesh modifications including nodes displacement, even local (Löhner and Baum 1991). The simple tree structure of the adapted mesh makes the adaptation process very fast, even in parallel [see Burstedde et al. (2010) for a *state-of-the-art* implementation].

#### 2) *p* adaptation

An *L*^{2} projection is used to interpolate the data from a discretization of the element Ω_{i} using *p* nodes to a discretization using *q* nodes. In this projection, the subscript *p* corresponds to the discretization using *p* nodes, while the subscript *q* corresponds to the *q*-nodes discretization. As for the mesh adaptation technique, numerical diffusion is only introduced when the order of interpolation is reduced, while an increase in polynomial order does not modify the fields.
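A one-dimensional sketch of such a transfer is given below. It is an illustration under stated assumptions rather than the model's operator: it works on the reference interval [−1, 1], uses Gauss–Legendre nodes for brevity, and relies on the orthogonality of Legendre polynomials, for which the *L*^{2} projection reduces to truncating or zero-padding the modal coefficients. The function name `project` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L

def project(values, p, q):
    """L2-project nodal values given at p Gauss-Legendre nodes onto q nodes.

    The degree p-1 interpolant is converted to Legendre coefficients; since the
    Legendre basis is orthogonal on [-1, 1], the L2 projection onto degree q-1
    keeps the first min(p, q) coefficients and discards (or zero-pads) the rest.
    """
    xp, _ = L.leggauss(p)
    xq, _ = L.leggauss(q)
    coef = L.legfit(xp, np.asarray(values, dtype=float), p - 1)
    new = np.zeros(q)
    m = min(p, q)
    new[:m] = coef[:m]
    return L.legval(xq, new)

if __name__ == "__main__":
    f = lambda x: x**3 - 0.5 * x + 1.0
    x4, _ = L.leggauss(4)
    v6 = project(f(x4), 4, 6)      # raising the order leaves the field unchanged
    v4 = project(v6, 6, 4)         # and the original nodal values are recovered
    v2 = project(f(x4), 4, 2)      # lowering the order discards the cubic mode
    print(np.max(np.abs(v4 - f(x4))), v2)
```

Raising the order reproduces the same polynomial at the new nodes, whereas lowering it removes the highest modes, which is the only source of numerical diffusion in the transfer.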

Although any order of interpolation from 0 to 15 can be used in the model, elements characterized by an even number of nodes are preferred and will be used in the applications to fully benefit from vectorized [Streaming SIMD Extensions (SSE)] instructions. With those instructions, two operations are executed simultaneously, and loops on arrays of even sizes become twice as fast. The case of an odd number of nodes (and thus an even polynomial degree) yields complications in the SSE execution. Notice that this can be remedied by introducing padding in the array (i.e., extending the array by 1 and performing the computation for both of the last entries).

#### 3) Adaptation strategy

The first step of the adaptation procedure is the identification of a set of elements to be refined or coarsened. This decision is based on an error indicator, which can be obtained in multiple ways. In this study, the jump in the values of the variables between each side of the interfaces of the elements is used as an error indicator, as described in Remacle et al. (2005). More details are given in the simulation descriptions.

The mean *μ* and standard deviation *σ* of the error estimator are computed. The refinement criterion is driven by (27), in which the parameters *f*_{r} and *f*_{u} can be tuned to improve the procedure. They are both set to 0.2 in this work. The error is computed as the average of the indicator in the considered element. When *h* adaptation is considered, *f*_{p} = 1. For *p* adaptation, it is set to *f*_{p} = (*p*/*q*)^{2}, where *p* is the current number of nodes in the element, while *q* is the number of nodes that will characterize the element if the refinement–unrefinement occurs. This is chosen principally because of the clustering of the LGL nodes near the edges of the elements. The closest nodes have a spacing of *h*_{LGL} ~ 1/*p*^{2}. Hence, a doubling of the number of LGL nodes corresponds to refining the element in each direction twice based on the smallest spacing (provided that the field is smooth). To avoid strong mesh inhomogeneities that could potentially generate spurious reflections, two neighboring elements can only be separated by one level of mesh refinement. However, no such restriction is enforced for the *p* refinement or derefinement procedure. Finally, since only time-dependent problems are considered, only one sweep of *h* refinement and *p* refinement is performed per adaptation step.
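A possible reading of this statistical flagging is sketched below. The exact form of criterion (27) is not reproduced here, so the thresholds *μ* + *f*_{r}*σ* for refinement and *μ* − *f*_{u}*σ* for coarsening, as well as the way *f*_{p} scales the element error, are assumptions made for illustration; the function names are hypothetical.

```python
import numpy as np

def p_factor(p, q):
    """Scaling f_p = (p / q)**2 for a change from p to q nodes (f_p = 1 for h adaptation)."""
    return (p / q) ** 2

def flag_elements(err, f_r=0.2, f_u=0.2, f_p=1.0):
    """Return the indices of elements flagged for refinement and for coarsening.

    err : per-element error (average of the indicator in each element)
    ASSUMPTION: an element is refined when f_p * err exceeds mu + f_r * sigma and
    coarsened when it falls below mu - f_u * sigma; criterion (27) may differ.
    """
    err = np.asarray(err, dtype=float)
    mu, sigma = err.mean(), err.std()
    scaled = f_p * err
    refine = np.flatnonzero(scaled > mu + f_r * sigma)
    coarsen = np.flatnonzero(scaled < mu - f_u * sigma)
    return refine, coarsen

if __name__ == "__main__":
    refine, coarsen = flag_elements([1.0, 1.0, 1.0, 1.0, 10.0])
    print(refine, coarsen, p_factor(4, 8))
```

In this synthetic example the single element with a large indicator is flagged for refinement, while the four well-resolved elements become candidates for coarsening.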

To reduce the cost of adaptation, the mesh is not modified every time step, but rather following a user-provided adaptation frequency. This frequency depends on the application considered. No significant difference was observed by adapting the mesh every 100 time steps rather than at each time step. For example, the maximum number of elements being at the wrong resolution (needing refinement–coarsening) because the refinement is not performed at each time step is always under 50 for the tsunami simulation presented in this article. It represents less than 1% of the total number of elements. However, to be more general, it may be useful to define a threshold of percentage of elements flagged for adaptation that would indicate when the adaptation procedure needs to be done.

In some situations, the key features of the flow may exit the refined zone, reducing considerably the accuracy of the solution. In those cases, it may be useful to refine in a broader area using a halo (St-Cyr et al. 2008) or more advanced (and more costly) techniques such as metrics advection to refine where the dynamics are complex, but also in the region where the flow will propagate (Wilson 2009).

#### 4) Load balancing

To all the roots of the trees involved in the base mesh corresponds a barycenter. Using this barycenter, it is possible to employ a recursive coordinate bisection (RCB) to distribute the roots (hence all their leaves) among processors. Thus, the smallest load-balanceable entities are the trees associated with the elements in the base mesh. Each root, corresponding to a single element in the base mesh, is weighted by summing the work performed by all of its leaves (the refined active elements). We pick *p*^{2}, the variable number of nodes on an element, as a measure of the work associated with a leaf in the tree. Hence, the barycenter associated with the root of a tree is weighted by the total work occurring in its leaves. The weighted RCB developed in Devine et al. (2002) and Boman et al. (2011) is then employed to calculate a new properly rebalanced partition (http://www.cs.sandia.gov/Zoltan). Finally, this new partition list is used to migrate the elements in parallel. Notice that the barycenter of each base element is fixed throughout the simulation and thus only recomputations of the weights are required. This makes the procedure entirely local and fast.
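The weighting and bisection steps can be sketched as follows. This is a simplified illustration, not the Zoltan implementation or its API: the function names `root_weights` and `weighted_rcb` are hypothetical, the number of parts is assumed to be a power of two, and each cut is placed along the longest coordinate extent at the weighted median.

```python
import numpy as np

def root_weights(leaf_nodes_per_root):
    """Work of each root: sum of p**2 over its active leaves (p = nodes per direction)."""
    return np.array([sum(p * p for p in leaves) for leaves in leaf_nodes_per_root],
                    dtype=float)

def weighted_rcb(centers, weights, n_parts):
    """Assign each root to one of n_parts (a power of two) by recursive bisection.

    centers : (n_roots, dim) fixed barycenters of the base-mesh elements
    weights : (n_roots,) work estimates, recomputed after each adaptation
    """
    centers = np.asarray(centers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    part = np.zeros(len(weights), dtype=int)

    def split(idx, first, count):
        if count == 1 or len(idx) == 0:
            part[idx] = first
            return
        pts = centers[idx]
        axis = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))  # longest extent
        order = idx[np.argsort(pts[:, axis], kind="stable")]
        cum = np.cumsum(weights[order])
        cut = int(np.searchsorted(cum, 0.5 * cum[-1])) + 1        # weighted median
        cut = min(max(cut, 1), max(len(order) - 1, 1))
        split(order[:cut], first, count // 2)
        split(order[cut:], first + count // 2, count - count // 2)

    split(np.arange(len(weights)), 0, n_parts)
    return part

if __name__ == "__main__":
    centers = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
    print(weighted_rcb(centers, [3.0, 1.0, 1.0, 1.0], 2))  # heavy root alone in part 0
```

Because the barycenters never change, only the `weights` array has to be recomputed before each repartition, which matches the locality argument made above.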

## 3. Validation

This section focuses on some of the test cases described by Williamson et al. (1992), chosen to validate the behavior of the model (theoretical order of convergence, accuracy for unsteady simulations, adaptation strategy). The section concludes with an application to a realistic tsunami propagation.

### a. Global steady-state nonlinear zonal geostrophic flow

This steady-state configuration is the second test problem described in Williamson et al. (1992). The velocity field corresponds to a solid-body rotation along an axis whose angle with the axis of rotation of the earth is *α*. In this run, the latter is set to *α* = *π*/4, such that the flow is not aligned with the grid (Fig. 3). The mesh employed is structured with a uniform static resolution. It consists of a cubed sphere (Ronchi et al. 1996; Rančić et al. 1996) made up of one element per cube face. The elements are then recursively split and projected onto the surface of the sphere to obtain finer meshes (Fig. 3).

The high-order meshes used for the different simulations are obtained by splitting recursively the elements of the initial cube mesh *R*_{0} and projecting them onto the surface of the sphere: (left) (top) *R*_{0} and (bottom) *R*_{1}; (right) (top) *R*_{3} and (bottom) *R*_{4}. (middle) Initial depth (colors) and transport field (arrows) for the test case 2 from Williamson et al. (1992).


The longitude and latitude are denoted *λ* and *θ*, respectively. The mean depth is set by *gh* = 2.94 × 10^{4} m^{2} s^{−2} with *g* = 9.806 16 m s^{−2}. The initial free-surface elevation is given in terms of *a* = 6.371 22 × 10^{6} m, the earth’s radius, and Ω = 7.292 × 10^{−5} s^{−1}, the rotation rate of the earth. The vectors and coordinates are expressed in a longitude–latitude basis, and are introduced in the Cartesian framework of the model using appropriate transformations.

Considering the availability of an analytical solution with continuous derivatives, this test case is convenient for checking the numerical convergence of the model. Using elements *Q*_{i} of order *i*, the error should theoretically converge to 0 at the rate *i* + 1. This optimal rate of convergence is observed for the different high-order elements tested in this study (Fig. 4). The meshes *R*_{0} to *R*_{3} have been used, corresponding to mesh resolutions at the equator varying from 90° to 11.25°. This mesh resolution is different from the overall spatial resolution, the latter being much finer thanks to the high-order representation within the elements (maximum resolution around 1.5°). For this convergence study, the time step was selected in such a way that the temporal error is negligible with respect to the spatial one. When the order of the spatial discretization becomes very high, such a requirement leads to prohibitively small time steps. Therefore, it was not possible to carry out the convergence analysis on the finest mesh with *Q*_{5} and *Q*_{7}.
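The expected rate *i* + 1 can be checked from errors measured on successive meshes. The sketch below computes the pairwise observed rates; the error values are synthetic (generated to follow the theoretical law exactly) and are not the paper's data, and the function name is hypothetical.

```python
import math

def observed_rates(mesh_sizes, errors):
    """Pairwise convergence rates log(e_k / e_{k+1}) / log(h_k / h_{k+1})."""
    rates = []
    for k in range(len(errors) - 1):
        rates.append(math.log(errors[k] / errors[k + 1])
                     / math.log(mesh_sizes[k] / mesh_sizes[k + 1]))
    return rates

# Equatorial mesh sizes of R_0 to R_3 in degrees (90 down to 11.25).
mesh_sizes = [90.0 / 2**level for level in range(4)]

# Synthetic errors obeying C * h**(i + 1) for Q_3 elements (i = 3).
# These values only exercise the formula; they are not measurements.
order = 3
errors = [1e-9 * h ** (order + 1) for h in mesh_sizes]

rates = observed_rates(mesh_sizes, errors)
```

With real data the rates fluctuate from one mesh pair to the next, which is why a mean rate is usually reported.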

Convergence of the *L*_{2} error for the elevation of the (left) free surface and (right) velocity, for the test case 2 from Williamson et al. (1992), after 5 days of simulation. The mean rate of convergence is indicated for each order of interpolation.


### b. Zonal flow over an isolated mountain

*α* = 0 and *u*_{0} = 20 m s^{−1}. The mean depth is modified to include an isolated mountain centered around (*λ*_{c} = 3*π*/2, *θ*_{c} = *π*/6):

where *h*_{0} = 5960 m. The mountain height is given by

where *R* = *π*/9 and *r*^{2} = min[*R*^{2}, (*λ* − *λ*_{c})^{2} + (*θ* − *θ*_{c})^{2}]. The maximum height of the mountain is set to *h*_{s0} = 2000 m.
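The mountain can be evaluated with a few lines of code. This sketch assumes the conical profile *h*_{s} = *h*_{s0}(1 − *r*/*R*) of the standard test case 5 in Williamson et al. (1992), with the radius entering the minimum as *R*^{2}, and a longitude expressed in [0, 2*π*). The function name is hypothetical.

```python
import math

LAMBDA_C = 3.0 * math.pi / 2.0  # longitude of the mountain centre (rad)
THETA_C = math.pi / 6.0         # latitude of the mountain centre (rad)
RADIUS = math.pi / 9.0          # R, angular radius of the mountain (rad)
H_S0 = 2000.0                   # maximum mountain height (m)

def mountain_height(lam, theta):
    """Height of the conical mountain at longitude lam and latitude theta."""
    # r is capped at R so that the height is zero outside the mountain.
    r_squared = min(RADIUS**2, (lam - LAMBDA_C)**2 + (theta - THETA_C)**2)
    return H_S0 * (1.0 - math.sqrt(r_squared) / RADIUS)
```

The height equals *h*_{s0} at the centre and decreases linearly to zero at an angular distance *R*.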

This test case allows a quantitative estimation of the discretization error thanks to the existence of a high-resolution solution from the German Weather Service made available online (http://icon.enes.org/swm/stswm/node5.html). This solution results from a T426 spectral simulation using the National Center for Atmospheric Research (NCAR) shallow-water spectral model (Jakob-Chien et al. 1995) with a resolution equivalent to about 31 km at the equator.

The simulations were performed with several meshes and orders of interpolation. When dynamic adaptation is used (Fig. 5), the mesh is refined locally where the flow is considered unresolved, in order to reduce the error generated by the model. The refinement criteria for *p* and *h* adaptation correspond to (27). It was previously shown in Bernard et al. (2007) that the jump in the values of the variables across the interfaces between elements is a good representation of the discretization error. Therefore, this jump can be used as an error indicator to identify where the mesh should be refined. Thus, in this simulation, we chose the jump of the free-surface elevation *η* as the error indicator. Other configurations may be more effective, for example using a combination of the jumps of different variables or taking into account the smoothness of the fields inside each element. The design of an optimal adaptation strategy remains to be achieved and will be the subject of future work. Simulations were also performed using a threshold on the absolute value of the relative vorticity as an adaptation criterion, as done by St-Cyr et al. (2008).
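The idea of a jump-based indicator can be sketched in one dimension. The example below is a simplified illustration and not the model's implementation: it flags an element when the jump of *η* on one of its interfaces exceeds a threshold. The threshold value and the function names are hypothetical, and the actual criterion (27) is not reproduced here.

```python
import numpy as np

def interface_jumps(left_traces, right_traces):
    """Absolute jump of a field across each interior interface of a 1D mesh.

    left_traces[k] is the value at the left end of element k and
    right_traces[k] the value at its right end.
    """
    return np.abs(left_traces[1:] - right_traces[:-1])

def flag_elements(left_traces, right_traces, threshold):
    """Mark an element when the jump on either of its interfaces exceeds threshold."""
    large = interface_jumps(left_traces, right_traces) > threshold
    flags = np.zeros(len(left_traces), dtype=bool)
    flags[:-1] |= large  # element on the left of each interface
    flags[1:] |= large   # element on the right of each interface
    return flags

# Five elements with a sharp front between elements 2 and 3.
eta_left = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
eta_right = np.array([0.0, 0.0, 0.1, 1.0, 1.0])
flags = flag_elements(eta_left, eta_right, threshold=0.5)
```

Only the two elements adjacent to the front are flagged, which is the behavior one wants from an indicator that follows a propagating wave.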

(top to bottom) Snapshots of the simulation for the test case 5 from Williamson et al. (1992) after 0, 5, 10, and 15 days. Two levels of dynamic mesh refinement are considered (*R*_{2−4}), with two different orders of interpolation (*Q*_{3−5}). The Cartesian domain has been mapped to longitude–latitude coordinates. (left) Depth contour lines from 5050 to 5950 m with an interval of 50 m. (right) Absolute error on the depth ‖*H* − *H*_{ref}‖, using the solution from the German Weather Service as the reference *H*_{ref}. The black lines represent the dynamically adapted mesh while the circle indicates the contour of the mountain.


The mesh is initially refined around the mountain (Fig. 5). Then, it is adapted every 100 time steps according to the distribution of the estimated error. During the first days of simulation, the dynamic refinement mainly occurs in the wake of the mountain where the flow is disturbed. After 10 and 15 days of simulation, the adaptation takes place in various areas of the domain that are affected by the complex structure of the flow. The adaptation criterion is the same for both *h* and *p* refinement in (27). Hence, the order of interpolation is adapted in the same areas as the mesh.

A very similar distribution of the difference with the solution obtained from the German Weather Service (Fig. 5) is observed in other studies such as Comblen et al. (2009). There is a correlation between the peaks of this difference and the areas where the mesh is refined. However, the mesh is not systematically refined where the difference with the reference solution is high. Indeed, if an error is accumulated and transported, it will not be visible in the jump of the variables at the interfaces between the elements. Refining the mesh would not improve the solution anyway as the error has been generated earlier.

Different simulations for this test case were performed: 1) using both fixed grids and orders of interpolation and 2) using dynamic adaptation. As expected, the model converges to the reference solution as the resolution is increased (Fig. 6). Increasing the order of interpolation is much more efficient than reducing the grid size by adding elements, because the convergence rate itself grows with the order, whereas it remains constant when only the element size is reduced. Using the vorticity as an error estimator (St-Cyr et al. 2008) generates a higher global error than using the jump of the free-surface elevation to identify the areas needing refinement. As pointed out by St-Cyr et al. (2008), an additional level of refinement is required for the adaptive simulation to surpass the static one. The resolution of the adaptive simulation reaches that of the nonadaptive simulation only in the areas where the dynamic refinement is at its maximum. It is lower everywhere else in the domain. Thus, an adaptive simulation will always be less accurate than the corresponding nonadaptive simulation based on the maximum level of refinement used by the adaptive one (e.g., the error associated with the simulation *Q*_{3−5}*R*_{2−3} is higher than the one obtained by the simulation *Q*_{5}*R*_{3}). However, an optimal adaptation strategy should greatly increase the accuracy of the adaptive simulation.

Evolution of the normalized *L*_{2} error for the test case 5 from Williamson et al. (1992), using the solution from the German Weather Service as the reference. Different meshes and orders of interpolation are considered. The dotted and dashed lines correspond to fixed orders of interpolation and meshes, while the solid lines are obtained using dynamic adaptation. The error estimator is either the jump of the free-surface elevation *η* at the interface between elements (blue) or the absolute value of the relative vorticity (red).


This test case, characterized by a large-scale flow with a similar complexity over the whole domain, is not multiscale enough to really benefit from mesh adaptation. A tsunami simulation will now be described to illustrate the computational potential of dynamic adaptation for realistic multiscale configurations.

### c. Global tsunami simulation

Although idealized test cases are useful to check the desirable properties of a numerical discretization, a more demanding application is required to assess the behavior of a model under realistic conditions, especially its efficiency for the simulation of multiscale problems. In this study, we consider the propagation through the global ocean of the 27 February 2010 tsunami in Chile. The bottom topography of the ocean is very steep, with a water depth varying from 0 m to more than 5000 m in a single element (Fig. 7). This application is thus a good test case to check the robustness of the model when applied to complex realistic configurations.

Water depth at rest and initial mesh before dynamic adaptation used for the tsunami simulation.


The bathymetry of the ocean is derived from the 1-minute gridded elevations–bathymetry for the world (ETOPO1) data (Amante and Eakins 2009). As no wetting-drying scheme is used, a minimum depth of 8 m is considered and the coasts correspond to closed reflective boundaries. There is no open boundary as the computational domain encompasses the global World Ocean. The initial condition is computed by assuming that the initial sea surface displacement is equal to the static sea floor uplift caused by an abrupt slip at the Nazca and South American plates interface. This uplift is obtained using Okada’s static dislocation formula (Okada 1985) with the coefficients described in Table 1, originating from the University of Bologna Tsunami Research Team (http://labtinti4.df.unibo.it/tsunami). The bottom stress is computed using the Manning formula with *n* = 0.03.
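For readers unfamiliar with the Manning formula, a minimal sketch is given below. It assumes the common shallow-water form *τ*/*ρ* = *g* *n*^{2} |**u**| **u** / *H*^{1/3}, which may differ in detail from the expression implemented in the model; the value of *g* is the one quoted for the test cases, and the function name is hypothetical.

```python
import math

G = 9.80616       # gravitational acceleration (m s^-2), assumed as in the test cases
MANNING_N = 0.03  # Manning coefficient n quoted in the text (s m^-1/3)
MIN_DEPTH = 8.0   # minimum depth used in the absence of wetting-drying (m)

def manning_bottom_stress(u, v, depth):
    """Kinematic bottom stress (m^2 s^-2) along the flow direction.

    Implements tau / rho = g n^2 |u| u / H^(1/3); the stress opposes the
    flow, so it is subtracted in the momentum equation.
    """
    depth = max(depth, MIN_DEPTH)  # clamp shallow areas to the minimum depth
    speed = math.hypot(u, v)
    coeff = G * MANNING_N**2 * speed / depth ** (1.0 / 3.0)
    return coeff * u, coeff * v
```

Because the stress scales with *H*^{−1/3}, friction matters mostly in shallow coastal areas and is weak in the deep ocean.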

Seismic parameters used to derive the earthquake initial uplift with the formula from Okada (1985).

The simulation is computed on an initial mesh built using the Gmsh software (http://www.geuz.org/gmsh; Geuzaine and Remacle 2009), with a resolution varying from 5 up to 1000 km (Fig. 7). The mesh is initially refined using the AMR technique where the initial free-surface elevation gradient exceeds a certain threshold (Fig. 8). The *hp* adaptation is then performed every 100 time steps using the jumps of the free-surface elevation *η* at the interface between elements as an error indicator and (27) as the adaptation criterion. With two recursive levels of refinement (*R*_{0} to *R*_{2}, meaning that each element of the initial mesh can be recursively split twice), the maximum mesh resolution during the simulation is close to 1 km. Because of the high-order shape functions (*Q*_{3} and *Q*_{5} elements, their distribution being driven by the adaptation procedure), the resolution of the discretization is much higher than the mesh resolution and very close to 250 m.
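The quoted resolutions follow from simple arithmetic, sketched below. The sketch assumes that each split halves the element size and that the nodes of a *Q*_{p} element are roughly equally spaced, which is only an approximation for high-order nodal sets; the function names are hypothetical.

```python
def mesh_resolution(initial_size, levels):
    """Element size after `levels` recursive splits (each split halves the size)."""
    return initial_size / 2**levels

def nodal_resolution(element_size, order):
    """Rough spacing between nodes inside an element of the given order."""
    return element_size / order

finest_initial = 5000.0                            # finest initial element size (m)
finest_mesh = mesh_resolution(finest_initial, 2)   # two levels: R_0 to R_2
finest_nodes = nodal_resolution(finest_mesh, 5)    # Q_5 elements
```

Two levels of refinement bring the 5-km elements down to 1.25 km, and fifth-order elements bring the effective spacing to about 250 m.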

(top to bottom) Propagation of the wave after 0, 5, and 10 h. (top of each panel) Free-surface elevation. (bottom of each panel) State of the mesh with order of interpolation (blue = *Q*_{3}, red = *Q*_{5}). The Cartesian domain has been mapped to longitude–latitude coordinates.


As long as the tsunami propagates through the Pacific Ocean, the order of interpolation and the mesh are adapted to precisely track the waves (Fig. 8). The computational power is used effectively by concentrating the load at the front of the wave, where it is needed to accurately resolve the propagation. At the end of the simulation, when the front of the wave is broken by hitting the coasts (Fig. 9), the adaptation procedure focuses on many reflections of the initial wave and the refined areas spread through the domain. Notice that those reflections would be reduced with the inclusion of a wetting-drying (runoff) model.

As in Fig. 8, but after 15, 20, and 25 h.


The elevation of the free surface has been compared with the Deep-ocean Assessment and Reporting of Tsunamis (DART) data from the National Oceanic and Atmospheric Administration (NOAA) Center for Tsunami Research at eight different stations (these data are available for download at http://nctr.pmel.noaa.gov/Dart). The model accurately estimates the time at which the tsunami reaches the different stations (Fig. 10). The amplitude of the waves is also well predicted, except for stations 5 and 8. Those stations are located in very shallow areas with an irregular bathymetry and several small islands. The resolution of the model is probably not sufficient to accurately reproduce the height of the waves in those regions.

Free-surface elevation at different DART stations: model data (blue) and DART data (red). The different plot boxes are aligned with the timeline to indicate the time at which the tsunami reaches the different stations (*t* = 0 at the moment of the initial earthquake).


After an initialization period, the total number of elements forming the mesh as well as the proportion of higher-order elements is rather stable and does not increase without bound (Fig. 11). As a consequence, the number of degrees of freedom behaves similarly. The abrupt drop in the number of elements after about 500 min corresponds to the moment when the tsunami leaves the vicinity of the earthquake, where the original Gmsh mesh was initially refined. Many elements then disappear simultaneously at the moment this area is coarsened.

(top) Evolution of the total number of elements during the adaptive tsunami simulation, with a separation for each order of interpolation. (bottom) Evolution of the total number of degrees of freedom (dof).


Statistics involving the computational cost were compared with a corresponding simulation that did not employ dynamic adaptation but was characterized by the same maximum resolution (Table 2). The static simulation was obtained by using the maximum order of interpolation (*Q*_{5}) and the maximum level of recursive mesh refinements (*R*_{2}) over the whole domain. In contrast, the adaptive simulation uses maximal resolution only where the dynamics are the most demanding. The comparison shows that the number of degrees of freedom can be reduced by a factor of almost 10 by using dynamic adaptation. The gain in efficiency, measured by the simulation time, is slightly smaller. This is a result of the cost of the adaptation procedure, which includes parallel load balancing. The simulation time for the configuration without adaptation has been obtained by computing only the first hour of physical time, because a complete simulation would take too long to run. When no adaptation is involved, the time step is constant, since it is constrained by the maximal speed of the gravity waves. Thus, the computational cost of each minute of physical time remains the same throughout the simulation.
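The constraint on the time step can be illustrated with a rough estimate based on the gravity wave speed √(*gH*). The sketch below is not the model's stability bound: the CFL number of 0.5 and the depth of 5000 m are illustrative assumptions, the true bound depends on the time integration scheme and the order of interpolation, and the function names are hypothetical.

```python
import math

G = 9.80616  # gravitational acceleration (m s^-2), assumed as in the test cases

def gravity_wave_speed(depth):
    """Phase speed of shallow-water gravity waves, sqrt(g H), in m/s."""
    return math.sqrt(G * depth)

def stable_time_step(node_spacing, max_depth, cfl=0.5):
    """Illustrative explicit time-step bound: cfl * dx / sqrt(g * H_max)."""
    return cfl * node_spacing / gravity_wave_speed(max_depth)

# Effective resolution of about 250 m over an ocean about 5000 m deep.
dt = stable_time_step(node_spacing=250.0, max_depth=5000.0)
```

Under these assumptions the time step is on the order of half a second. In a static configuration the finest spacing and the deepest point never change, so this bound stays fixed for the whole run.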

Comparison of the cost statistics between the simulation with dynamic adaptation and the corresponding simulation with fixed mesh and order of interpolation, characterized by the same maximum resolution. The ratio is an indicator of the gain of efficiency attributable to the adaptation procedure. The simulation times are obtained using 64 processors on a Cray XT5m for a simulation covering a physical time of 2000 min.

## 4. Conclusions

A discontinuous Galerkin shallow-water *hp-*adaptive model on the sphere has been described extensively. The resolution of the equations on the sphere as well as the adaptation procedure allow for the simulation of multiscale geophysical flows.

The model has been validated on classical shallow-water atmospheric test cases, producing the right order of convergence (test case 2) as well as results in agreement with a high-resolution reference solution (test case 5). The simulation of the February 2010 Chilean tsunami completed the validation procedure on a realistic configuration by predicting perturbations of the free surface of the ocean comparable to the DART buoy measurements.

The tsunami configuration demonstrates the potential of dynamic spatial adaptation as a means of improving the efficiency of multiscale simulations. The highest resolution remains focused on the front of the wave, where the fields are characterized by a sharp distribution. As a consequence, the adaptive tsunami simulation was shown to be almost 10 times faster than the statically refined simulation of an equivalent resolution. Because the computation runs faster than physical time on a reasonable number of processors, the adaptive model may be integrated in a high-resolution tsunami warning process.

It is interesting to note that, even for the highly unresolved tsunami simulation exhibiting sharp gradients, no filter was needed to keep the model stable. Moreover, no smoothing of the bathymetry was required. There is no additional diffusion apart from the numerical diffusion built into the scheme. Hence, the stabilization provided by the DG method is sufficient to make the model robust, even in the case of difficult configurations.

While a rather simple adaptation strategy was used herein, more advanced adaptation criteria will be investigated. In particular, the decision either to refine the mesh or to change the order of interpolation must be improved by using a distinct criterion. The way to obtain the error indicators may also need to be enhanced (e.g., by taking into account the field smoothness).

This paper constitutes a proof of concept for the efficient use of *hp* adaptation for atmospheric and oceanic simulations. Future work will focus on improved adaptation criteria, three-dimensional extensions, local time stepping (Lörcher et al. 2008), limiting strategies, and physical parameterizations in order to reach the goal of developing an effective multiscale model for environmental flows.

## Acknowledgments

Sébastien Blaise is a Postdoctoral Fellow with the Advanced Study Program of the National Center for Atmospheric Research. The present study was carried out within the scope of the project “Collaborative Research: A Multiscale Unified Simulation Environment for Geoscientific Applications,” which is funded by the National Science Foundation under the Peta-apps Grant 0904599. Support was provided by the Advanced Study Program and the Institute for Mathematics Applied to Geosciences of the National Center for Atmospheric Research. The authors are thankful to Natasha Flyer and Ramachandran Nair for their useful comments.

## REFERENCES

Aizinger, V., and C. Dawson, 2002: A discontinuous Galerkin method for two-dimensional flow and transport in shallow water. *Adv. Water Resour.*, **25**, 67–84, doi:10.1016/S0309-1708(01)00019-7.

Amante, C., and B. W. Eakins, 2009: ETOPO1 1 arc-minute global relief model: Procedures, data sources and analysis. Tech. Rep., NOAA Tech. Memo. NESDIS NGDC-24, 19 pp.

Bernard, P.-E., N. Chevaugeon, V. Legat, E. Deleersnijder, and J.-F. Remacle, 2007: High-order h-adaptive discontinuous Galerkin methods for ocean modelling. *Ocean Dyn.*, **57**, 109–121.

Bernard, P.-E., J.-F. Remacle, R. Comblen, V. Legat, and K. Hillewaert, 2009: High-order discontinuous Galerkin schemes on general 2D manifolds applied to the shallow water equations. *J. Comput. Phys.*, **228**(17), 6514–6535.

Bhanot, G., and Coauthors, 2008: Early experiences with the 360TF IBM Blue Gene/L platform. *Int. J. Comput. Methods*, **5**, 237–253.

Blaise, S., B. de Brye, A. de Brauwere, E. Deleersnijder, E. J. M. Delhez, and R. Comblen, 2010a: Capturing the residence time boundary layer–application to the Scheldt Estuary. *Ocean Dyn.*, **60**, 535–554, doi:10.1007/s10236-010-0272-8.

Blaise, S., R. Comblen, V. Legat, J.-F. Remacle, E. Deleersnijder, and J. Lambrechts, 2010b: A discontinuous finite element baroclinic marine model on unstructured prismatic meshes. Part I: Space discretization. *Ocean Dyn.*, **60**(6), 1371–1393, doi:10.1007/s10236-010-0358-3.

Boman, E., and Coauthors, cited 2011: Zoltan 3.0: Parallel partitioning, load-balancing, and data management services: User’s guide. Tech. Rep. SAND2007-4748W, Sandia National Laboratories, Albuquerque, NM. [Available online at http://www.cs.sandia.gov/Zoltan/ug_html/ug.html.]

Burstedde, C., O. Ghattas, M. Gurnis, T. Isaac, G. Stadler, T. Warburton, and L. Wilcox, 2010: Extreme-scale AMR. *Proc. 2010 ACM/IEEE Int. Conf. for High Performance Computing, Networking, Storage, and Analysis,* Washington, DC, IEEE Computer Society, 1–12.

Chen, C., F. Xiao, and X. Li, 2011: An adaptive multimoment global model on a cubed sphere. *Mon. Wea. Rev.*, **139**, 523–548.

Cockburn, B., and C.-W. Shu, 2001: Runge–Kutta discontinuous Galerkin methods for convection-dominated problems. *J. Sci. Comput.*, **16**(3), 173–261.

Cockburn, B., G. Karniadakis, and C.-W. Shu, 2000: *Discontinuous Galerkin Methods: Theory, Computation, and Applications*. Lecture Notes in Computational Science and Engineering, Vol. 11, Springer, 470 pp.

Comblen, R., S. Legrand, E. Deleersnijder, and V. Legat, 2009: A finite element method for solving the shallow water equations on the sphere. *Ocean Modell.*, **28**, 12–23, doi:10.1016/j.ocemod.2008.05.004.

Comblen, R., S. Blaise, V. Legat, J.-F. Remacle, E. Deleersnijder, and J. Lambrechts, 2010: A discontinuous finite element baroclinic marine model on unstructured prismatic meshes. Part II: Implicit/explicit time discretization. *Ocean Dyn.*, **60**(6), 1395–1414, doi:10.1007/s10236-010-0357-4.

Côté, J., 1988: A Lagrange multiplier approach for the metric terms of semi-Lagrangian models on the sphere. *Quart. J. Roy. Meteor. Soc.*, **114**, 1347–1352.

Dennis, J., A. Fournier, W. F. Spotz, A. St-Cyr, M. A. Taylor, S. J. Thomas, and H. Tufo, 2005a: High-resolution mesh convergence properties and parallel efficiency of a spectral element atmospheric dynamical core. *Int. J. High Perform. Comput. Appl.*, **19**(3), 225–235.

Dennis, J., M. Levy, R. D. Nair, H. M. Tufo, and T. Voran, 2005b: Towards an efficient and scalable discontinuous Galerkin atmospheric model. *Proc. 19th IEEE Int. Parallel and Distributed Processing Symposium (IPDPS’05),* Workshop 13, Vol. 14, Denver, CO, IEEE Computer Society, 257–263.

Dennis, J., and Coauthors, 2011: CAM-SE: A scalable spectral element dynamical core for the Community Atmosphere Model. *Int. J. High Perform. Comput. Appl.*, doi:10.1177/1094342011428142, in press.

Deville, M. O., P. F. Fischer, and E. H. Mund, 2002: *High-Order Methods for Incompressible Fluid Flow*. *Appl. Comput. Math. Monogr.,* No. 9, Cambridge University Press, 528 pp.

Devine, K., E. Boman, R. Heaphy, B. Hendrickson, and C. Vaughan, 2002: Zoltan data management services for parallel dynamic applications. *Comput. Sci. Eng.*, **4**(2), 90–97.

Edwards, H. C., 2002: SIERRA framework version 3: Core services theory and design. Tech. Rep. SAND2002-3616, Sandia National Laboratories, Albuquerque, NM, 97 pp.

Emanuel, K., 2005: Increasing destructiveness of tropical cyclones over the past 30 years. *Nature*, **436**, 686–688.

Eskilsson, C., 2011: An hp-adaptive discontinuous Galerkin method for shallow water flows. *Int. J. Numer. Methods Fluids*, **67**(11), 1605–1623, doi:10.1002/fld.2434.

Geuzaine, C., and J.-F. Remacle, 2009: Gmsh: A three-dimensional finite element mesh generator with built-in pre- and post-processing facilities. *Int. J. Numer. Methods Eng.*, **79**(11), 1309–1331.

Giraldo, F. X., 2001: A spectral element shallow water model on spherical geodesic grids. *Int. J. Numer. Methods Fluids*, **35**(8), 869–901.

Giraldo, F. X., 2006: High-order triangle-based discontinuous Galerkin methods for hyperbolic equations on a rotating sphere. *J. Comput. Phys.*, **214**(2), 447–465.

Giraldo, F. X., and M. Restelli, 2008: A study of spectral element and discontinuous Galerkin methods for the Navier–Stokes equations in nonhydrostatic mesoscale atmospheric modeling: Equation sets and test cases. *J. Comput. Phys.*, **227**, 3849–3877, doi:10.1016/j.jcp.2007.12.009.

Giraldo, F. X., J. S. Hesthaven, and T. Warburton, 2002: Nodal high-order discontinuous Galerkin methods for the spherical shallow water equations. *J. Comput. Phys.*, **181**, 499–525.

Gottlieb, S., and C.-W. Shu, 1998: Total variation diminishing Runge–Kutta schemes. *Math. Comput.*, **67**, 73–85, doi:10.1090/S0025-5718-98-00913-2.

Gottlieb, S., D. Ketcheson, and C.-W. Shu, 2011: *Strong Stability Preserving Runge–Kutta and Multistep Time Discretizations*. World Scientific Publishing Company, 188 pp.

Grabowski, W. W., 2001: Coupling cloud processes with the large-scale dynamics using the cloud resolving convection parameterization (CRCP). *J. Atmos. Sci.*, **58**, 978–997.

Hesthaven, J. S., and T. Warburton, 2008: *Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications*. Texts in Applied Mathematics, Vol. 54, Springer, 515 pp.

Jakob-Chien, R., J. J. Hack, and D. L. Williamson, 1995: Spectral transform solutions to the shallow water test set. *J. Comput. Phys.*, **119**, 164–187.

Karniadakis, G. E., and S. J. Sherwin, 2005: *Spectral/hp Element Methods for Computational Fluid Dynamics (Numerical Mathematics and Scientific Computation)*. Oxford University Press, 686 pp.

Khairoutdinov, M. F., and D. A. Randall, 2001: A cloud resolving model as a cloud parameterization in the NCAR community climate system model: Preliminary results. *Geophys. Res. Lett.*, **28**, 3617–3620.

Khouider, B., A. St-Cyr, A. J. Majda, and J. Tribbia, 2011: MJO and convectively coupled waves in a coarse resolution GCM with a simple multicloud parameterization. *J. Atmos. Sci.*, **68**, 240–264.

Kopriva, D. A., and G. Gassner, 2010: On the quadrature and weak form choices in collocation type discontinuous Galerkin spectral element methods. *J. Sci. Comput.*, **44**, 136–155.

Kopriva, D. A., S. L. Woodruff, and M. Y. Hussaini, 2002: Computation of electromagnetic scattering with a non-conforming discontinuous spectral element method. *Int. J. Numer. Methods Eng.*, **53**, 105–122.

Kubatko, E. J., J. J. Westerink, and C. Dawson, 2006: hp discontinuous Galerkin methods for advection dominated problems in shallow water flows. *Comput. Methods Appl. Mech. Eng.*, **196**, 437–451, doi:10.1016/j.cma.2006.05.002.

Kubatko, E. J., S. Bunya, C. Dawson, and J. J. Westerink, 2009: Dynamic p-adaptive Runge–Kutta discontinuous Galerkin methods for the shallow water equations. *Comput. Methods Appl. Mech. Eng.*, **198**(21–26), 1766–1774.

Läuter, M., D. Handorf, N. Rakowsky, J. Behrens, S. Frickenhaus, M. Best, K. Dethloff, and W. Hiller, 2007: A parallel adaptive barotropic model of the atmosphere. *J. Comput. Phys.*, **223**(2), 609–628.

Löhner, R., and J. Baum, 1991: Numerical simulation of time-dependent 3-d flows using adaptive unstructured grids. *Advances in the Free-Lagrange Method Including Contributions on Adaptive Gridding and the Smooth Particle Hydrodynamics Method,* H. Trease, M. Fritts, and W. Crowley, Eds., Vol. 395, Springer, 47–56.

Lörcher, F., G. Gassner, and C. Munz, 2008: An explicit discontinuous Galerkin scheme with local time-stepping for general unsteady diffusion equations. *J. Comput. Phys.*, **227**, 5649–5670.

Mann, M. E., and K. A. Emanuel, 2006: Atlantic hurricane trends linked to climate change. *Eos, Trans. Amer. Geophys. Union*, **87**(24), 223–244.

McClean, J. L., and Coauthors, 2011: A prototype two-decade fully-coupled fine-resolution CCSM simulation. *Ocean Modell.*, **39**, 10–30.

Mohseni, K., and T. Colonius, 2000: Numerical treatment of polar coordinate singularities. *J. Comput. Phys.*, **157**, 787–795.

Nair, R. D., S. J. Thomas, and R. D. Loft, 2005a: A discontinuous Galerkin global shallow water model. *Mon. Wea. Rev.*, **133**, 876–888.

Nair, R. D., S. J. Thomas, and R. D. Loft, 2005b: A discontinuous Galerkin transport scheme on the cubed sphere. *Mon. Wea. Rev.*, **133**, 814–828.

Nair, R. D., H.-W. Choi, and H. Tufo, 2009: Computational aspects of a scalable high-order discontinuous Galerkin atmospheric dynamical core. *Comput. Fluids*, **38**(2), 309–319.

Neale, R. B., and Coauthors, 2010: Description of the NCAR community atmosphere model (CAM 4.0). Tech. Rep., NCAR, 212 pp.

Okada, Y., 1985: Surface deformation due to shear and tensile faults in a half-space.

,*Bull. Seismol. Soc. Amer.***75**, 1135–1154.Pedlosky, J., 1986:

*Geophysical Fluid Dynamics*. 2nd ed. Springer, 710 pp.Rančić, M., R. J. Purser, and F. Messinger, 1996: A global shallow-water model using an expanded spherical cube: Gnomonic versus conformal coordinates.

,*Quart. J. Roy. Meteor. Soc.***122**, 959–982.Remacle, J.-F., X. Li, M. S. Shephard, and J. E. Flaherty, 2005: Anisotropic adaptive simulation of transient flows using discontinuous Galerkin methods.

,*Int. J. Numer. Methods Eng.***62**(7), 899–923.Remacle, J.-F., S. S. Frazao, L. Xiangrong, and M. S. Shephard, 2006: An adaptive discretization of shallow-water equations based on discontinuous Galerkin methods.

,*Int. J. Numer. Methods Fluids***52**(8), 903–992.Remaki, L., and W. G. Habashi, 2006: 3-d mesh adaptation on multiple weak discontinuities and boundary layers.

,*SIAM J. Sci. Comput.***28**, 1379–1397, doi:10.1137/S1064827503429879.Ronchi, C., R. Iacono, and P. S. Paolucci, 1996: The “Cubed Sphere”: A new method for the solution of partial differential equations in spherical geometry.

,*J. Comput. Phys.***124**, 93–114.Satoh, M., T. Matsuno, H. Tomita, H. Miura, T. Nasuno, and S. Iga, 2008: Nonhydrostatic icosahedral atmospheric model (NICAM) for global cloud resolving simulations.

,*J. Comput. Phys.***227**, 3486–3514, doi:10.1016/j.jcp.2007.02.006.Slingo, J., and Coauthors, 2009: Developing the next-generation climate system models: Challenges and achievements.

,*Philos. Trans. Roy. Soc. A: Math. Phys. Eng. Sci.***367**, 815–831.St-Cyr, A., and D. Neckels, 2009: A fully implicit Jacobian-free high-order discontinuous Galerkin mesoscale flow solver.

*Proc. Ninth Int. Conf. on Computational Science,*Baton Roue, LA, LSU Center for Computation and Technology, 243–252.St-Cyr, A., C. Jablonowski, J. M. Dennis, H. M. Tufo, and S. J. Thomas, 2008: A comparison of two shallow-water models with nonconforming adaptive grids.

,*Mon. Wea. Rev.***136**, 1898–1922.Stewart, J. R., and H. C. Edwards, 2002: SIERRA framework version 3:

*h*-adaptivity design and use. Tech. Rep. SAND2002-4016, Sandia National Laboratories, Albuquerque, NM, 28 pp.Watson, R. T., M. C. Zinyowera, R. H. Moss, and D. J. Dokken, 1997:

*The Regional Impacts of Climate Change: An Assessment of Vulnerability*. IPCC, 27 pp. [Available online at http://www.ipcc.ch/pdf/special-reports/spm/region-en.pdf.]Wehner, M., 2008: Towards ultra-high resolution models of climate and weather.

,*Int. J. High Perform. Comput. Appl.***22**(2), 149–156.Weller, H., H. G. Weller, and A. Fournier, 2009: Voronoi, Delaunay, and block-structured mesh refinement for solution of the shallow-water equations on the sphere.

,*Mon. Wea. Rev.***137**, 4208–4224.Wheeler, M., and G. N. Kiladis, 1999: Convectively-coupled equatorial waves: Analysis of clouds and temperature in the wavenumber-frequency domain.

,*J. Atmos. Sci.***56**, 374–399.Wilcox, L. C., G. Stadler, C. Burstedde, and O. Ghattas, 2010: A high-order discontinuous Galerkin method for wave propagation through coupled elastic-acoustic media.

,*J. Comput. Phys.***229**, 9373–9396.Williamson, D. L., J. B. Drake, J. J. Hack, R. Jakob, and P. N. Swarztrauber, 1992: A standard test set for numerical approximations to the shallow water equations in spherical geometry.

,*J. Comput. Phys.***102**, 211–224.Wilson, C., 2009: Modelling multiple-material flows on adaptive unstructured meshes. Ph.D. thesis, Imperial College London, 217 pp.

^{1} Streaming Single Instruction Multiple Data (SIMD) Extensions (SSE) to the x86 architecture.