## 1. Introduction

Spherical icosahedral grids are usually generated by repeated grid partitioning, starting from the icosahedron (e.g., Heikes and Randall 1995). They have recently begun to be widely adopted for atmosphere and ocean models (e.g., Ringler et al. 2000; Satoh et al. 2008; Skamarock et al. 2012). One reason may be their quasi-uniform grid structure, which enables higher computational efficiency on massively parallel computers. A global high-resolution simulation, which covered the entire sphere with cells of several-kilometer scale and thus required huge computing power, has been realized on such a grid (e.g., Miura et al. 2007). Although the durations of such high-resolution simulations have so far been limited to 1–3 months, increases in computer speed will hopefully enable high-resolution climate simulations in the near future. For climate simulations, accurate transport of water substances, aerosols, and chemical species is crucial for a realistic hydrological cycle and an accurate radiation balance. Therefore, accurate transport schemes are desirable to enhance the credibility of climate simulations.

Research on transport schemes may be regarded as one of the central topics in numerical modeling on spherical icosahedral grids. Stuhne and Peltier (1996) and Majewski et al. (2002) developed transport schemes using the advective-form equations and second-order polynomial fitting. It may be possible to develop higher-order schemes by making higher-order polynomial fits with larger stencils. However, it is very difficult for transport schemes in the advective form to ensure conservation of transported quantities, and thus they might be unsuitable for long-term simulations. In contrast, transport schemes in the flux form are well suited for conservation. The finite-volume method is widely adopted to develop flux-form transport schemes on spherical icosahedral grids because it can deal with unstructured grids in a relatively straightforward manner. There is a large body of literature on the finite-volume method on unstructured grids, and some approaches appear attractive for future transport-scheme developments (e.g., Iske and Sonar 1996; Friedrich 1998), but it is not the purpose of this study to give an overall review of these schemes. Here, we restrict our attention to the transport schemes developed for spherical icosahedral grids.

For multistage time-stepping schemes, Masuda and Ohnishi (1986) developed a spatial discretization method that was equivalent to a second-order-centered scheme on the regular rectangular grid, and Heikes and Randall (1995), Tomita et al. (2001), and Ringler and Randall (2002) followed their approach with some modifications. Lee and MacDonald (2009) and Weller et al. (2009) used upwind-biased polynomial interpolations to obtain improved results. Recently, Skamarock and Gassmann (2011) provided third- and fourth-order discretization methods. In contrast, for single-stage forward-in-time schemes, Thuburn (1997) extended the Uniformly Third-Order Polynomial Interpolation Algorithm (UTOPIA) scheme (Leonard et al. 1993; Rasch 1994) to spherical icosahedral grids, although distortions of grid cells were not taken into account; the interpolation constants derived on the perfect hexagonal grid were used. Lipscomb and Ringler (2005, hereafter LR05) and Yeh (2007) applied the piecewise linear approximation of van Leer (1977) and reconstructed piecewise linear profiles inside hexagonal or pentagonal cells. Miura (2007b, hereafter M07b) introduced two simplifications, which will be explained in the next section, to avoid complex conditional branching in the methods of LR05 and Yeh (2007). Recently, Skamarock and Menchaca (2010, hereafter SM10) improved the scheme of M07b by replacing the linear reconstruction with second- or fourth-order reconstructions. SM10 showed that the fourth-order reconstruction scheme was less diffusive than the second-order one, but that it had a much higher computational cost. We will present a review of the M07b and SM10 schemes in section 2.

In this paper, following M07b and SM10, we propose another second-order reconstruction-based scheme that produces more accurate simulations with only a moderate increase in computational cost. The second-order profile is reconstructed under two constraints. One is that the area integral of the profile inside a hexagonal or pentagonal cell is equal to the cell-center value multiplied by the cell area. The other is that the profile is the least squares fit to the cell-vertex values. We did not choose the fourth-order reconstruction because of its high computational cost demonstrated by SM10.
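The two constraints can be illustrated with a small planar sketch. It assumes a regular hexagon in a local *x–y* plane, a monomial basis (x, y, x², xy, y²), and NumPy's `lstsq`; the function names `polygon_moments`, `basis`, and `reconstruct` are hypothetical. The scheme itself works on distorted cells of the spherical grid, so this is only an illustration of the idea, not the algorithm of section 2c.

```python
import numpy as np

def polygon_moments(v):
    """Area of a counterclockwise polygon and the cell averages of x, y, x^2, xy, y^2."""
    x, y = v[:, 0], v[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)   # next vertex (cyclic)
    c = x * yn - xn * y                        # shoelace cross terms
    area = 0.5 * c.sum()
    mx = (c * (x + xn)).sum() / 6.0
    my = (c * (y + yn)).sum() / 6.0
    mxx = (c * (x * x + x * xn + xn * xn)).sum() / 12.0
    myy = (c * (y * y + y * yn + yn * yn)).sum() / 12.0
    mxy = (c * (x * yn + 2 * x * y + 2 * xn * yn + xn * y)).sum() / 24.0
    return area, np.array([mx, my, mxx, mxy, myy]) / area

def basis(p):
    """Non-constant monomials evaluated at points p (shape [n, 2])."""
    x, y = p[:, 0], p[:, 1]
    return np.column_stack([x, y, x * x, x * y, y * y])

def reconstruct(vertices, q_vertices, q_center):
    """Quadratic profile a0 + a . basis(x, y) satisfying both constraints.

    The integral constraint (cell average equals q_center) is used to
    eliminate the constant term; the remaining coefficients are the
    least squares fit to the vertex values.
    """
    _, mean_basis = polygon_moments(vertices)
    design = basis(vertices) - mean_basis
    a, *_ = np.linalg.lstsq(design, q_vertices - q_center, rcond=None)
    a0 = q_center - mean_basis @ a
    return a0, a

# Usage on a unit regular hexagon with made-up values.
theta = np.arange(6) * np.pi / 3.0
hexagon = np.column_stack([np.cos(theta), np.sin(theta)])
a0, a = reconstruct(hexagon, np.array([1.0, 1.2, 1.1, 0.9, 0.8, 0.9]), 1.0)
print(a0, a)
```

Eliminating the constant term first means the cell average is preserved exactly whatever the vertex values are, and the least squares step only shapes the profile within the cell.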

Section 2 first describes the schemes of M07b and SM10 and then introduces the algorithm of the new scheme. Results from the new scheme are compared with those from the schemes of M07b, SM10, and others in section 3. The schemes are subjected to a cosine-bell advection test (Williamson et al. 1992), the slotted-cylinder advection test of LR05, and a deformational flow test from Nair and Lauritzen (2010). Computational performance is also compared on three different computer architectures: Apple iMac, HP ProLiant, and HITACHI HA8000. Section 4 presents the summary.

## 2. Transport schemes

[…] *i*th cell face shared by the zeroth cell and the *i*th surrounding cell; […] *i*th cell face that points outward from the zeroth cell; and […] *i*th cell face […]

### a. Upwind-biased linear approximation (ULA) by M07b

[…] *i*th cell face ([…] *i*th cell face during a time step […] *i*th cell face by […] *i*th surrounding cell […] *x–y* coordinate with an origin […]

### b. The first upwind-biased quadratic approximation (UQA-1) by SM10

[…] *f* inside a hexagonal or a pentagonal cell is equal to the tracer amount contained in the cell. We confirmed that phase error was reduced and accuracy was improved by this modification (not shown). For the zeroth cell in Fig. 1, this constraint is written as […], where […] *c* means “corrected.” Because the method to compute the correction term […] *i*th cell vertex […] *i* is cyclic for […]

[…] $w_i$ can be precomputed before the time loop, the computational cost of this correction is small.

### c. The second upwind-biased quadratic approximation (UQA-2)

Next, we introduce a new scheme called the second upwind-biased quadratic approximation (UQA-2). UQA-2 might be regarded as a variant of the piecewise parabolic method (PPM) of Colella and Woodward (1984) because some ideas of PPM are used in its reconstruction procedure. The original PPM requires in its initial stage that the quadratic profiles be reconstructed for each cell so that they are continuous at the cell interfaces in smooth parts of solution (discontinuities at the cell interfaces are introduced later for monotonicity). The cell interface values are determined by an interpolation and, if the grid is equally spaced, they constitute a fourth-order transport scheme under uniform flow. On the spherical icosahedral grids, we do not have appropriate methods to reconstruct quadratic profiles that are continuous along all cell faces for any tracer distribution. Therefore, we consider reducing discontinuities at cell vertices instead. As a result, two differences exist between the algorithms of UQA-1 and UQA-2. The first is that UQA-2 uses mixing ratios at the cell vertices

[…] *i*th cell vertex, denoted by […] *i* is cyclic so that […] *I*, determining a scalar value […] (subscript 0) from the scalar values […] (subscripts 1, 2, and 3, respectively) of a triangle (Fig. 2b), is defined by […], where […] (subscripts 0, 2, 3), […] (subscripts 0, 3, 1), and […] (subscripts 0, 1, 2), respectively. If the hexagonal grid is regular, (20) could constitute a fourth-order accurate gradient operator and a fourth-order accurate transport scheme under uniform flow (Miura 2007a). We use the pair of the constants, […]

## 3. Test results

The three transport schemes, ULA, UQA-1, and UQA-2, have been subjected to a cosine-bell advection test (Williamson et al. 1992), a slotted-cylinder advection test of LR05, and a deformational flow test (Nair and Lauritzen 2010) to compare their behaviors on spherical icosahedral grids. The grids were generated by iteratively splitting the icosahedron (Heikes and Randall 1995) and optimized by the spring method of Tomita et al. (2001), as slightly modified by Miura and Kimoto (2005). After adjusting the positions of the grid nodes, the cell vertices were positioned at the barycenter of each triangle formed by three neighboring nodes. Computations were performed on grids having 2562, 10 242, 40 962, 163 842, and 655 362 hexagonal/pentagonal cells on the sphere. Instead of the “glevel” notation introduced by Tomita et al. (2001), we label them as H16, H32, H64, H128, and H256, from the number of hexagons plus one pentagon on each edge of the original icosahedron that is projected onto the sphere. Because it was incorporated into the derivations of ULA, UQA-1, and UQA-2, the single-stage forward-in-time scheme was used. The flux limiter of Thuburn (1995, 1996) was applied for monotonicity unless noted otherwise. The ZM-grid arrangement (Ringler and Randall 2002) was used; mixing ratios are defined at the cell centers and flow velocities at the cell vertices.
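The grid labels and cell counts are related by a simple formula: dividing each icosahedron edge into *n* segments gives 10*n*² + 2 cells, 12 of which are pentagons. A minimal sketch of that relation, with a hypothetical function name, is:

```python
def icosahedral_cell_count(n):
    """Number of hexagonal/pentagonal cells when each icosahedron edge has n segments."""
    return 10 * n * n + 2

# H16 ... H256: n doubles with each refinement, so the cell count roughly quadruples.
for n in (16, 32, 64, 128, 256):
    total = icosahedral_cell_count(n)
    print(f"H{n}: {total} cells ({total - 12} hexagons, 12 pentagons)")
```

Since the cell count roughly quadruples from one label to the next, the nominal grid spacing halves at each refinement.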

### a. Advection of a cosine-shaped bell

[…] *i*th cell node, respectively, and […]

First, we examine the dependence of the solution errors on the time interval. The grid used was H64 and the time intervals were

Next, we compare the convergence properties of ULA, UQA-1, and UQA-2. The grids used were H16, H32, H64, H128, and H256 and the time intervals were

The slower convergence can probably be explained as follows. The solution of UQA-2 is more accurate than those of ULA and UQA-1 on coarser grids because UQA-2 is much less diffusive than ULA and UQA-1, as is confirmed by the test results on the H32 grid (Figs. 6a–c). Comparing Fig. 6d with Fig. 6g, Fig. 6e with Fig. 6h, and Fig. 6f with Fig. 6i, we see that the flux limiter corrects diffusive error effectively but does not correct phase error. This suggests that UQA-2 has significantly smaller diffusive error but slightly larger phase error than UQA-1.

Compared to preexisting schemes, the phase error of UQA-2 is much smaller than that depicted in Fig. 6 of Heikes and Randall (1995) and Fig. 1 of Lee and MacDonald (2009), and is almost equivalent to that in Fig. 7 of Thuburn (1997), Fig. 5 of Mittal et al. (2007), Fig. 3 of Skamarock and Gassmann (2011), and Fig. 4 of SM10. The diffusive error of UQA-2 is somewhat smaller than those of all the schemes listed above. It can be seen in Fig. 5 that UQA-1 is almost equivalent to the second-order reconstruction scheme of SM10, and UQA-2 is generally more accurate than the fourth-order reconstruction scheme of SM10. The

### b. Advection of a slotted cylinder

Following LR05, a harder test that includes sharp discontinuities has been performed. The cosine bell in the previous test was replaced with a slotted cylinder that had an initial height of 1000 m. Two series of tests were performed with different settings in the time interval. One series used

Figure 7 shows that the

### c. Advection by a deformational flow

A test using a nondivergent deformational flow proposed by Nair and Lauritzen (2010) has been performed. The initial distribution of the tracer is given by a superposition of two Gaussian hills. The flow field is a superposition of a deformational flow and a zonal background flow. These scalar and velocity fields were computed by using Eqs. (14), (31), and (32) in Nair and Lauritzen (2010). Parameters not given in their paper were set to the same values as in Harris et al. (2011). The time intervals were

The initial pair of the Gaussian hills is strongly deformed until day 6, as depicted in Fig. 9a. Our result is similar to the reference solution given by Nair and Lauritzen (2010). UQA-2 produces a less diffusive solution than those of ULA and UQA-1 for H64 after 12 days (Figs. 9b–d). With the flux limiter (Fig. 10b), convergence rates of the

This degradation in convergence may be explained by considering diffusive and phase errors. When the horizontal resolution is not sufficient to resolve deformations of the Gaussian hills, the dominant source of error may be implicit diffusion. As suggested in the previous tests, UQA-2 appears to reduce diffusive error faster than phase error as the grid becomes finer, so the dominant source of error may become phase error on finer grids. We speculate that the convergence of phase error is nearly second order, while that of diffusive error is higher than second order. Some signatures of this phase error are seen for UQA-2 (Fig. 9f), but not for UQA-1 (Fig. 9e).
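The convergence rates discussed here are the observed orders between successive grids, on which the grid spacing halves from one label to the next. A minimal sketch of how such an order can be estimated from a sequence of error norms follows; the function name is hypothetical and the error values are made-up placeholders, not results from this study.

```python
import math

def observed_orders(errors, refinement=2.0):
    """Observed convergence order between successive grids.

    errors: error norms ordered from the coarsest to the finest grid.
    refinement: ratio of grid spacings between successive grids.
    """
    return [math.log(coarse / fine) / math.log(refinement)
            for coarse, fine in zip(errors, errors[1:])]

# Placeholder error norms for H16 ... H256 (illustrative only).
labels = ["H16", "H32", "H64", "H128", "H256"]
errors = [4.0e-1, 6.0e-2, 9.0e-3, 1.8e-3, 4.2e-4]
for coarse, fine, order in zip(labels, labels[1:], observed_orders(errors)):
    print(f"{coarse} -> {fine}: observed order {order:.2f}")
```

An order that drops toward 2 on the finest grids would be consistent with the interpretation that a second-order error component, such as the phase error discussed above, eventually dominates.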

### d. Computational cost

In this subsection, we compare the computational costs of ULA, UQA-1, and UQA-2. It should be noted that this comparison was made under limited conditions. Results may depend strongly not only on the environment, such as architectures and compilers, but also on coding skills. A single FORTRAN program coded for this work was run on three different computers. The computing environments were as follows. An Apple iMac with an Intel Core i7 processor was used for a single-process run; the Intel FORTRAN Compiler was used with the *-fast* option. An HP ProLiant with Intel Xeon processors was used for a small multiprocess run; the PGI FORTRAN Compiler was used with the *-fastsse* option and OpenMPI was used for parallelization. A HITACHI HA8000 with AMD Opteron processors was used for a large multiprocess run; the HITACHI FORTRAN Compiler was used with the *-Oss -noparallel* options and MPICH-MX was used for parallelization. The numbers of processor cores were 10 and 160 for the small and the large multiprocess runs, respectively.

The pair of deformational flow tests from section 3c was performed with and without the flux limiter and was repeated three times. The grids used were H32 for iMac and ProLiant and H256 for HA8000. Figure 11 shows the averages of the computing times. *Preprocess* includes computations of the normal velocity, determinations of quadrature point(s) and evaluations of the upwind side. *Flux divergence* includes interpolations to the vertices (UQA-2 only), profile reconstructions, estimations of the cell face values, and computations of the flux divergence and the flux limiter (if used).

Figure 11 indicates that ULA is the fastest and UQA-2 is the slowest, as expected. On iMac and ProLiant, computing costs were only weakly sensitive to the differences in architecture, compiler, and parallelization. *Preprocess* of UQA-1 and UQA-2 consumed about 40% more time than that of ULA because of the additional quadrature points. Without the flux limiter, the cost of UQA-1 was about 40% larger than that of ULA in total, and that of UQA-2 was about 10% larger than that of UQA-1. With the flux limiter, those differences became smaller because the significant cost of the flux limiter was common to all schemes. In this case, the cost of UQA-1 was about 25% larger than that of ULA in total, and that of UQA-2 was about 5% larger than that of UQA-1. The cost of UQA-2 is comparable to that of UQA-1 if the number of processor cores is small and if the flux limiter is used. On HA8000, *flux divergence* of UQA-2 was clearly more time consuming than that of UQA-1. This is because the interpolations to the cell vertices and the necessary data transfers are included in *flux divergence*. To reduce this cost, we may need to improve the code for more efficient data transfer.
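The relative costs quoted above are chained ratios, so the approximate cost of UQA-2 relative to ULA follows by multiplication. Because the inputs are rounded "about" figures, the products are only indicative:

$$
\begin{aligned}
\text{without limiter:}\quad & \frac{C_{\text{UQA-2}}}{C_{\text{ULA}}} \approx 1.40 \times 1.10 = 1.54 \quad (\text{about } 50\% \text{ more}),\\
\text{with limiter:}\quad & \frac{C_{\text{UQA-2}}}{C_{\text{ULA}}} \approx 1.25 \times 1.05 \approx 1.31 \quad (\text{about } 30\% \text{ more}).
\end{aligned}
$$

Here $C$ denotes total computing time, a symbol introduced only for this illustration. These products agree with the overall figures given in section 4.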

## 4. Summary

This study proposed a new upwind-biased forward-in-time transport scheme [second upwind-biased quadratic approximation (UQA-2)] for spherical icosahedral grids. UQA-2 basically follows the ideas of ULA by M07b and UQA-1 by SM10 and also benefits from basic ideas of PPM (Colella and Woodward 1984). The second-order tracer distribution on the upwind side of a cell face is reconstructed by imposing two constraints. The first is that the second-order polynomial is the least squares fit to the interpolated cell-vertex values. The second is that the area integral of the second-order polynomial over the cell on the upwind side is equal to the cell-averaged value times the cell area. By fitting the second-order polynomial to the cell-vertex values (which have been interpolated from the cell-center values), we significantly reduce the discontinuity at the cell edges and vertices; PPM enforces this continuity in its unlimited formulation.

The accuracy of UQA-2 was compared with those of ULA and UQA-1 through a cosine-bell advection test (Williamson et al. 1992), a slotted-cylinder advection test of LR05, and a deformational flow test (Nair and Lauritzen 2010). UQA-2 was more accurate than ULA and UQA-1 in most of the tests. UQA-2 showed nearly third-order convergence of the error norms for a C-infinity function in a lower-resolution range, although convergence rates degraded as the grid became finer. For discontinuities, UQA-2 reproduced sharper solutions than ULA and UQA-1. Because of its higher spatial resolution, UQA-2 is more suitable than ULA and UQA-1 for high-resolution atmospheric simulations that contain sharp boundaries between cloudy and cloud-free regions.

The computational cost of UQA-2 was compared with those of ULA and UQA-1 on three different architectures using different compilers. Single-process and multiprocess runs were also compared. General features of the results were not sensitive to the differences in architectures, compilers, and parallelization, but the performance of UQA-2 degraded as a result of the cost of data transfers when many processors were used. Without a flux limiter, UQA-2 was more costly than ULA and UQA-1 by about 50% and about 10%, respectively. With a flux limiter, the cost differences were smaller because of the significant cost of the flux limiter, but UQA-2 was still more costly by about 30% compared to ULA and by about 5% compared to UQA-1.

## Acknowledgments

Hiroaki Miura thanks Prof. David Randall for supporting his visit to Colorado State University; a part of this work was done during that visit. Dr. Takanobu Yamaguchi and Dr. Ross Heikes are also acknowledged for fruitful discussions. This work was supported by the Grant-in-Aid for Young Scientists (B) of MEXT (22740310). The HA8000 supercomputer of The University of Tokyo was used in a test.

## REFERENCES

Colella, P., and P. R. Woodward, 1984: The Piecewise Parabolic Method (PPM) for gas-dynamical simulations. *J. Comput. Phys.*, **54**, 174–201.

Friedrich, O., 1998: Weighted essentially non-oscillatory schemes for the interpolation of mean values on unstructured grids. *J. Comput. Phys.*, **144**, 194–212.

Golub, G. H., and C. Reinsch, 1970: Singular value decomposition and least squares solutions. *Numer. Math.*, **14**, 403–420.

Gross, E. S., L. Bonaventura, and G. Rosatti, 2002: Consistency with continuity in conservative advection schemes for free-surface models. *Int. J. Numer. Methods Fluids*, **38**, 307–327.

Harris, L. M., P. H. Lauritzen, and R. Mittal, 2011: A flux-form version of the conservative semi-Lagrangian multi-tracer transport scheme (CSLAM) on the cubed sphere grid. *J. Comput. Phys.*, **230**, 1215–1237.

Heikes, R., and D. A. Randall, 1995: Numerical integration of the shallow-water equations on a twisted icosahedral grid. Part I: Basic design and results of tests. *Mon. Wea. Rev.*, **123**, 1862–1880.

Iske, A., and T. Sonar, 1996: On the structure of function spaces in optimal recovery of point functionals for ENO-schemes by radial basis functions. *Numer. Math.*, **74**, 177–202.

Lee, J.-L., and A. E. MacDonald, 2009: A finite-volume icosahedral shallow-water model on a local coordinate. *Mon. Wea. Rev.*, **137**, 1422–1437.

Leonard, B. P., M. K. MacVean, and A. P. Lock, 1993: Positivity-preserving schemes for multidimensional advection. NASA Tech. Memo. 106055/ICOMP-93-05, Institute for Computational Mechanics in Propulsion, Lewis Research Center, Cleveland, OH, 62 pp.

Lipscomb, W. H., and T. D. Ringler, 2005: An incremental remapping transport scheme on a spherical geodesic grid. *Mon. Wea. Rev.*, **133**, 2335–2350.

Majewski, D., and Coauthors, 2002: The operational global icosahedral-hexagonal gridpoint model GME: Description and high-resolution tests. *Mon. Wea. Rev.*, **130**, 319–338.

Masuda, Y., and H. Ohnishi, 1986: An integration scheme of the primitive equations model with an icosahedral–hexagonal grid system and its application to the shallow water equations. *Short- and Medium-Range Numerical Weather Prediction,* T. Matsuno, Ed., Japan Meteorological Society, 317–326.

Mittal, R., H. C. Upadhyaya, and O. P. Sharma, 2007: On near-diffusion-free advection over spherical geodesic grids. *Mon. Wea. Rev.*, **135**, 4214–4225.

Miura, H., 2007a: A fourth-order-centered finite-volume scheme for regular hexagonal grids. *Mon. Wea. Rev.*, **135**, 4030–4037.

Miura, H., 2007b: An upwind-biased conservative advection scheme for spherical hexagonal–pentagonal grids. *Mon. Wea. Rev.*, **135**, 4038–4044.

Miura, H., and M. Kimoto, 2005: A comparison of grid quality of optimized spherical hexagonal–pentagonal geodesic grids. *Mon. Wea. Rev.*, **133**, 2817–2833.

Miura, H., M. Satoh, T. Nasuno, A. T. Noda, and K. Oouchi, 2007: A Madden-Julian oscillation event realistically simulated by a global cloud-resolving model. *Science*, **318**, 1763–1765.

Nair, R. D., and P. H. Lauritzen, 2010: A class of deformational flow test cases for linear transport problems on the sphere. *J. Comput. Phys.*, **229**, 8868–8887.

Niwa, Y., H. Tomita, M. Satoh, and R. Imasu, 2011: A three-dimensional icosahedral grid advection scheme preserving monotonicity and consistency with continuity for atmospheric tracer transport. *J. Meteor. Soc. Japan*, **89**, 255–268.

Rasch, P. J., 1994: Conservative shape-preserving two-dimensional transport on a spherical reduced grid. *Mon. Wea. Rev.*, **122**, 1337–1350.

Ringler, T. D., and D. A. Randall, 2002: A potential enstrophy and energy conserving numerical scheme for solution of the shallow-water equations on a geodesic grid. *Mon. Wea. Rev.*, **130**, 1397–1410.

Ringler, T. D., R. P. Heikes, and D. A. Randall, 2000: Modeling the atmospheric general circulation using a spherical geodesic grid: A new class of dynamical cores. *Mon. Wea. Rev.*, **128**, 2471–2490.

Satoh, M., T. Matsuno, H. Tomita, H. Miura, T. Nasuno, and S. Iga, 2008: Nonhydrostatic icosahedral atmospheric model (NICAM) for global cloud resolving simulations. *J. Comput. Phys.*, **227**, 3484–3514.

Skamarock, W. C., and M. Menchaca, 2010: Conservative transport schemes for spherical geodesic grids: High-order reconstructions for forward-in-time schemes. *Mon. Wea. Rev.*, **138**, 4497–4508.

Skamarock, W. C., and A. Gassmann, 2011: Conservative transport schemes for spherical geodesic grids: High-order flux operators for ODE-based time integration. *Mon. Wea. Rev.*, **139**, 2962–2975.

Skamarock, W. C., J. B. Klemp, M. G. Duda, L. Fowler, S.-H. Park, and T. D. Ringler, 2012: A multiscale nonhydrostatic atmospheric model using centroidal Voronoi tesselations and C-grid staggering. *Mon. Wea. Rev.*, **140**, 3090–3105.

Stuhne, G. R., and W. R. Peltier, 1996: Vortex erosion and amalgamation in a new model of large scale flow on the sphere. *J. Comput. Phys.*, **128**, 58–81.

Thuburn, J., 1995: Dissipation and cascades to small scales in numerical models using a shape-preserving advection scheme. *Mon. Wea. Rev.*, **123**, 1888–1903.

Thuburn, J., 1996: Multidimensional flux-limited advection schemes. *J. Comput. Phys.*, **123**, 74–83.

Thuburn, J., 1997: A PV-based shallow-water model on a hexagonal–icosahedral grid. *Mon. Wea. Rev.*, **125**, 2328–2347.

Tomita, H., M. Tsugawa, M. Satoh, and K. Goto, 2001: Shallow water model on a modified icosahedral geodesic grid by using spring dynamics. *J. Comput. Phys.*, **174**, 579–613.

van Leer, B., 1977: Towards the ultimate conservative difference scheme. IV. A new approach to numerical convection. *J. Comput. Phys.*, **23**, 276–299.

Weller, H., H. G. Weller, and A. Fournier, 2009: Voronoi, Delaunay, and block-structured mesh refinement for solution of the shallow-water equations on the sphere. *Mon. Wea. Rev.*, **137**, 4208–4224.

Williamson, D. L., J. B. Drake, J. J. Hack, R. Jakob, and P. N. Swarztrauber, 1992: A standard test set for numerical approximations to the shallow water equations in spherical geometry. *J. Comput. Phys.*, **102**, 211–224.

Yeh, K.-S., 2007: The streamline subgrid integration method: I. Quasi-monotonic second-order transport schemes. *J. Comput. Phys.*, **225**, 1632–1652.