## Abstract

Reanalysis data provide a good estimate of global atmospheric temperature and wind fields. However, the available reanalysis datasets reveal nonnegligible discrepancies in their mean state and temporal variability. In this study, the quality of eight reanalysis datasets is evaluated by examining their dynamical consistency in the extratropical stratosphere. The dynamical consistency is quantified by computing the residual of the zonal-mean momentum equation. The residual is generally small in the lower stratosphere, especially at and below 30 hPa, but increases significantly aloft in both hemispheres poleward of 45°, where the effect of parameterized gravity wave drag becomes important. However, at most levels, a large difference in the residual is found among the datasets. This interdata difference is mainly caused by an uncertainty in the Coriolis torque. The non-quasigeostrophic terms, such as those associated with the vertical motion, also play a nonnegligible role when the polar vortex accelerates or decelerates.

The latest reanalysis datasets exhibit smaller residuals than their earlier counterparts. For example, ERA-Interim is dynamically more consistent than ERA-40. This improvement over the generations is largely attributed to a better representation of the Coriolis torque. This is not likely achieved by the increase in satellite data observations over the past few decades. In fact, the dynamical consistency is only weakly sensitive to the analysis period. Instead, model-specific factors, such as data assimilation technique, model resolution, and physics, likely play a crucial role in improving the dynamical consistency.

## 1. Introduction

The zonal-mean momentum equation is often evaluated in reanalysis datasets with the assumption that reanalysis data provide a high degree of dynamical consistency. However, it is not clear how dynamically consistent they are. In fact, many factors, including the assimilation of observations and the parameterization of subgrid-scale processes, can affect the consistency of momentum diagnostics. If we define the dynamical consistency as the agreement between the momentum tendency and the forces influencing the momentum in a given dataset, a high level of consistency requires that the sum of all forcing terms in the momentum equation be equal to the wind tendency. In the Eulerian-mean framework, such forcings include the contributions of the Coriolis torque, the convergence of meridional and vertical fluxes of momentum as well as the advection of zonal momentum by the mean meridional and vertical circulations.

Previous studies have investigated monthly and globally averaged momentum. Although some regional differences are observed (Lehmann and Névir 2012), it has been documented that the globally averaged momentum is consistent across the reanalyses datasets (Berrisford et al. 2011), especially in the stratosphere (Paek and Huang 2012a,b), both in terms of interannual and decadal variability. One notable exception is the Twentieth Century Reanalysis (Compo et al. 2011), which does not have a realistic stratospheric mean state and variability.

A consistent momentum field among reanalysis datasets, however, does not guarantee a consistent momentum budget. Lu et al. (2015) have recently performed transformed Eulerian-mean (TEM) momentum budget analyses in the stratosphere using ERA-40 and ERA-Interim datasets and found substantial discrepancies in the forcing terms of the momentum budget in the two datasets, although the zonal momentum and its tendency in the stratosphere are generally similar (Dee et al. 2011; Lu et al. 2014). The discrepancies are mainly due to different vertical profiles of temperature in the stratosphere, which result in different Eliassen–Palm (EP) flux divergence. They also found a large imbalance between the wave forcing and the residual circulation, suggesting that a substantial fraction of the momentum forcing is unaccounted for in the momentum budget. This result suggests that our dynamical understanding of the zonal-mean circulation could be sensitive to the choice of reanalysis dataset.

An important source of uncertainty in stratospheric momentum diagnostics is the variability of the Brewer–Dobson circulation (BDC). The BDC is primarily responsible for the Coriolis torque, one of the leading terms of the momentum equation. While the BDC is predominantly wave driven, changes in radiative forcing can also alter the meridional circulation (Uppala et al. 2005). An improved overturning circulation in ERA-Interim is largely attributed to the improved assimilation of radiance observations (Dee et al. 2011). A similar improvement in radiation has also been documented in other reanalysis datasets. For example, the cold bias in the lower stratosphere in JRA-25 is reduced in JRA-55 through the improvement of the radiative transfer model (Kobayashi et al. 2015).

Another important source of dynamical inconsistency comes from the physical processes that occur at scales smaller than the model resolution. In the upper stratosphere, small-scale gravity waves are known to produce significant westward drag (e.g., Watanabe et al. 2008). Without consideration of gravity wave drag (GWD), the momentum diagnostics are bound to show large errors in the higher parts of the stratosphere. Discrepancies in GWD among reanalysis datasets may also affect the zonal-mean momentum budget through their role in driving the BDC (Seviour et al. 2012).

Some reanalysis datasets assimilate the wind fields that are retrieved from aircrafts, radiosondes, and cloud track drift measurements (e.g., Rienecker et al. 2011). However, such measurements have a minimal coverage in the stratosphere. Therefore, the stratospheric circulation is less constrained by observational input in the procedure of data assimilation in comparison to the tropospheric circulation, where radiosonde observations are more readily available. While the balanced component of the wind can be constrained from satellite-observed temperatures, the ageostrophic wind cannot be inferred from temperature observations and is therefore a potential source of uncertainty in the momentum diagnostics.

The purpose of this study is to evaluate the dynamical consistency of the reanalysis datasets and to quantify the progress made from the early to the latest generations of reanalysis products. Emphasis is placed on the daily zonal-mean variability in the extratropical stratosphere, which has been often explained using the zonal-mean momentum equation (e.g., Limpasuvan et al. 2004; Martineau and Son 2015). The climatology and the temporal evolution of the polar vortex are both examined. This approach is motivated by the Stratosphere–Troposphere Processes and Their Role in Climate (SPARC) Reanalysis Intercomparison Project (S-RIP) (Fujiwara et al. 2012). It therefore shares the same goals: to compare and understand the differences between reanalysis datasets and to provide guidance on the use of reanalyses in SPARC-related studies. Although a comparison could also be performed in the troposphere, we primarily focus on the winter polar stratosphere where the zonal wind evolves rapidly on a daily time scale.

To characterize the consistency of the momentum budget, we define the residual *R* as the zonal-mean zonal wind tendency that is unexplained by the momentum equation (Smith and Lyjak 1985; Lu et al. 2015). To highlight extratropical variability, the residual is integrated poleward of 45° in each hemisphere. This residual is compared among eight reanalysis datasets that offer an adequate representation of the stratosphere (Table 1). Individual terms of the zonal-mean momentum equation are also compared.

In section 2, we present the reanalysis datasets that are compared in this study. Section 3 provides a detailed description of the zonal-mean zonal-momentum diagnostics. In section 4, a climatology of the zonal wind and its variability in the Northern Hemisphere (NH) is presented. It is followed by a climatology of each term of the momentum equation in section 4a. The consistency of the momentum equation is then evaluated in section 4b. The dependence of the dynamical consistency to different stratospheric dynamical states is further examined in section 4c. The evolution of the dynamical consistency over time is investigated in section 5. Finally, a summary and concluding remarks are presented in section 6.

## 2. Data

This work considers all available global reanalysis datasets (Table 1), except for the NOAA Twentieth Century Reanalysis (20CR) and the ECMWF Twentieth-Century Reanalysis (ERA-20C). They are known to not have realistic winds in the stratosphere (Compo et al. 2011; Poli et al. 2013). Variables of interest include the temperature *T* and the three-dimensional wind field (*u*, *υ*, *ω*) on pressure levels. Monthly parameterized GWD is also investigated for J55 (see Table 1 for the definitions of the acronyms used for each reanalysis dataset). These variables are offered in various temporal, vertical, and horizontal resolutions. To make sure that the comparison is not affected by the data resolution, all reanalysis datasets are horizontally interpolated onto the same 2.5° by 2.5° grid, which is the coarsest resolution among all the reanalysis datasets considered in this study. The interpolation is carried before the computation of zonal averages and flux terms. This ensures that our methodology does not give an unfair advantage to the datasets that provide variables at a finer resolution. Likewise, 6-hourly data are used, although some datasets provide 3-hourly data. Only the pressure levels that are common to all datasets are kept for the calculations (i.e., 1000, 850, 700, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10, 7, 5, 3, 2, and 1 hPa) except for N–N and N–D, which provide data only up to 10 hPa. As described in appendix A, the overall results are not sensitive to the data resolution.

Because N–N and N–D are available only up to 10 hPa, 30 hPa is chosen as the reference level to allow an accurate measurement of vertical derivatives. It is important to note that N–N does not provide vertical motion in the stratosphere; therefore, all terms involving *ω* (*p* velocity) are null. As such, no results are shown for N–N in the momentum diagnostics based on the primitive equation.

## 3. Methodology

### a. Zonal-mean momentum equation

This study evaluates how well the zonal-momentum tendency is explained by the zonal-mean momentum equation. Although the TEM equation (see appendix B) is used frequently in the literature as it allows the evaluation of wave forcing on the zonal flow (Edmon et al. 1980), the Eulerian zonal-mean momentum equation is used in this study for several reasons. First, diagnostics from the Eulerian equation contain fewer vertical derivatives compared to the TEM equation and are therefore less sensitive to the computational errors introduced by discretized numerical derivative schemes. Second, using the momentum equation in its simplest form allows an easier attribution of the dynamical inconsistency to specific forcing terms. For instance, the EP-flux divergence is highly derived, and one has to consider both eddy fluxes and basic-state stability to explain the dynamical consistency (Lu et al. 2015). Third, the non-quasigeostrophic terms are found in both the residual circulation and EP-flux terms in the TEM equation, which makes the evaluation of their relative contributions more difficult.

The primitive equation of zonal-mean zonal momentum (see Andrews et al. 1987) can be written as below:

Here, overbars and primes denote the zonal mean and departure from the zonal mean, respectively. All symbols are standard. The first and second terms of the right-hand side (RHS) denote the Coriolis torque and the convergence of meridional eddy momentum fluxes. The third and fourth terms represent the meridional and vertical advection of zonal-mean zonal wind, and the fifth term denotes the convergence of vertical momentum fluxes. The last term of Eq. (1), *R*, is the residual term that quantifies the ability of the zonal-mean momentum equation to recover the actual wind tendency; *R* can comprise unresolved processes that are parameterized in the model, imbalances caused by incremental analysis during data assimilation, and numerical diffusion. It also includes errors in the interpolation from model levels to pressure levels as well as errors related to the numerical methods employed to evaluate the momentum diagnostics. In appendix A, the details of the numerical methods used in the budget analysis and the sensitivity of the results to the choice of numerical schemes are described.

In the extratropics, the momentum budget is often examined with the quasigeostrophic (QG) approximation. The QG version of the zonal-momentum equation neglects the terms related to ageostrophic advection as well as the vertical eddy momentum fluxes (e.g., Edmon et al. 1980). In this study, the following equation is used:

Compared to the primitive version, is the sum of the last four terms in the RHS of Eq. (1). For the QG approximation, we use the total wind instead of the geostrophic component, and we allow the Coriolis parameter to vary with latitude. As such, this equation is not purely a QG equation.

### b. Analysis domain and time period

Since this study aims to measure the accuracy of momentum diagnostics in the extratropical stratosphere, we primarily analyze the zonal momentum averaged over the NH (45°–85°N). We also briefly discuss the dynamical consistency in the Southern Hemisphere (SH) extratropics (45°–85°S). The polar limit at 85° is chosen to avoid numerical errors in the meridional derivatives. As mentioned earlier, only wintertime, when the stratospheric wind is characterized by large variability because of the vertical propagation of planetary-scale waves, is considered. Specifically, December–February (DJF) and June–August (JJA) are considered for the NH and SH, respectively. To compare all available reanalysis datasets (Table 1), the common period from 1979–2001 is primarily analyzed. However, analyses are also extended to 2012 or back to 1958 when data is available. The results are only weakly sensitive to the analysis period, as discussed in section 5.

## 4. Climatology

Figures 1a and 1b illustrate the climatological zonal-mean zonal wind in DJF in the NH for six datasets that provide data up to 1 hPa and all eight datasets, respectively. The interdata spread, which is quantified by the standard deviation at each grid point, is also presented with shading. A pronounced interdata spread is observed in the tropical upper stratosphere, tropical upper troposphere, and lower stratosphere (Figs. 1a,b), indicating that the tropical stratospheric wind is not well constrained in the reanalyses. This is likely caused by uncertainties in equatorial waves, the quasi-biennial oscillation (QBO), and the semiannual oscillation (SAO), which are model dependent (Randel et al. 2004). In contrast to the tropical wind, the interdata spread in the extratropical wind is remarkably small. The maximum difference between datasets does not exceed 1 m s^{−1}, and it is mostly located in the upper stratosphere (Fig. 1a).

Figures 1c and 1d further show the variability of the zonal-mean zonal wind during DJF computed as the standard deviation of 6-hourly values. Again, a significant spread is found in the tropical stratosphere. In general, the variability of the polar vortex is very similar in all datasets, although slightly weaker variability is found in N–N, N–D, and E-I (not shown). These results indicate that both the climatology and variability of the polar vortex are well constrained in the reanalysis datasets.

### a. Reference momentum budget

This section first presents the climatological zonal-mean momentum budget using J55 as a reference (Fig. 2). Only the spatial structure of individual terms of the momentum equation [Eq. (1)] is illustrated. Their temporal evolution and an intercomparison of reanalysis datasets are presented in the next sections.

It is clear from Fig. 2 that the Coriolis torque (Fig. 2a) and the meridional momentum flux convergence (Fig. 2b) dominate the zonal-momentum equation and are strongly opposed to each other, as expected from QG dynamics. Non-QG terms are generally small in the stratosphere (Figs. 2c,d,e). While the vertical advection term is small throughout the stratosphere (Fig. 2c), the meridional advection term shows moderate forcing in the polar stratosphere above 10 hPa (Fig. 2d). Vertical eddy momentum flux convergence also shows moderate forcing up to 2 m s^{−1} day^{−1} in the upper stratosphere (Fig. 2e). Note that non-QG terms are nonnegligible in the upper troposphere, especially in the tropical upper troposphere where the Hadley circulation is strong (Figs. 2d,e) and in the midlatitudes where the tropospheric jet is located (Fig. 2e).

In Fig. 2f, the parameterized GWD is illustrated, although this forcing is not considered in the momentum budget analysis [Eq. (1)]. The parameterized GWD is generally small in the low and middle stratosphere at high latitudes, with climatological values rarely exceeding 0.25 m s^{−1} day^{−1}, but it is larger in the upper stratosphere.

To quantify the daily variability of *R*, the root-mean-square residual (RMSR) is defined as

where *R*(*t*) is the residual sampled every 6 h and *n* is the sample size. Figures 2g and 2h show *R* and RMSR for J55. Although the climatological *R* is small in the high latitudes below 10 hPa, its temporal variability, as indicated by the RMSR, is substantial. As expected, both *R* and RMSR are smaller than and (Figs. 2i,j), especially on the equatorward side of the subtropical jet, likely resulting from the inclusion of the meridional advection of zonal wind in the primitive equation (Fig. 2d).

Although *R* is reasonably small in the lower stratosphere, it becomes more negative with height (Smith and Lyjak 1985; Lu et al. 2015). A substantial fraction of *R* in the upper stratosphere can be explained by the exclusion of parameterized GWD (Fig. 2f) in the momentum budget. However, not all residuals are explained by the GWD (cf. Figs. 2f,g). Other processes, such as the imbalance caused by incremental analysis during data assimilation, numerical diffusion, and computational errors, may also contribute to the residual.

Figure 2h shows that the RMSR tends to increase toward the pole and the upper stratosphere despite the small values of *R*. This suggests that the momentum budget is less reliable at high latitudes and high altitudes and that the dynamical inconsistency results largely from the temporal fluctuation of *R* that vanishes when considering the long-term climatology. Although not shown, qualitatively similar *R* and RMSR are found in other datasets. In most reanalysis datasets, large *R*s are found right above the tropospheric jet around 30°N and in the vicinity of the polar vortex above 10 hPa, as in Fig. 2g. However, their magnitudes vary widely among the reanalysis datasets. For example, E40 exhibits the largest *R* with a zigzag pattern in the vertical direction (see also Lu et al. 2015). In contrast, ME shows the smallest *R,* as discussed in the next section.

### b. Comparison of reanalysis datasets

The dynamical consistency of each reanalysis dataset is quantified and compared in this section. As an example, Fig. 3 shows the evolution of 30-hPa zonal-mean zonal wind averaged from 45° to 85°N during the sudden stratospheric warming (SSW) event of December 1998. Throughout the event, the zonal wind and its tendency are reasonably similar across reanalysis datasets, although the strength of the polar vortex is slightly underestimated in N–N and N–D (Figs. 3a,b). This result again indicates that the zonal-mean momentum in the extratropical stratosphere is well constrained in reanalysis data.

Does the momentum equation faithfully reproduce the zonal-momentum tendency shown in Fig. 3b? The zonal wind tendency derived from the forcing terms of the momentum equation [Eq. (1)] is illustrated in Fig. 3c. Large biases are found in some datasets. For example, the momentum budget of E40 overestimates the actual tendency in early and late December (cf. pale blue lines in Figs. 3b,c), with a pronounced temporal oscillation occurring twice daily (Fig. 3c). Other reanalysis datasets, especially those that were released before 2009 (i.e., N–N, N–D, and J25), also present some discrepancies between the actual wind tendency and the momentum diagnostics, including a weak oscillatory behavior. This high-frequency variability partly results from semidiurnal tides that are not completely removed by the temporal smoothing (see appendix A).

The inspection of individual momentum forcing terms, RHS of Eq. (1), reveals that the interdata spread in the momentum budget, and the oscillatory behavior, largely comes from the Coriolis torque (Fig. 3d). Although further analyses are needed, the high-frequency oscillations could also result from the analysis increments during data assimilation. The meridional momentum flux convergence, which has comparable magnitude to the Coriolis torque, is generally more consistent across reanalysis datasets (Fig. 3e), although some notable differences are seen in N–N and N–D. Consistent with Fig. 2, the contribution of non-QG terms is rather minor (Fig. 3f). The residual, shown in Fig. 3g, oscillates around 0 in most reanalyses, except in E40, where it shows large excursions of several meters per second per day.

The above analysis is extended to the whole analysis period and to all levels above 300 hPa. The vertical profiles of *R* and RMSR are presented in Figs. 4a,b. In general, the absolute *R* is reasonably small at and below 30 hPa but grows rapidly with height in several reanalysis datasets, including the latest ones. It is important to note that *R* is negative in most datasets, especially above 10–20 hPa. This may result from parameterized GWD, which is not taken into account in the budget analysis (see also Fig. 2f). Small-scale gravity waves are responsible for a strong deceleration of zonal wind in the upper stratosphere (Watanabe et al. 2008) and should therefore contribute to negative *R* values according to Eq. (1).

The RMSRs of the latest reanalysis datasets are typically smaller than 0.5 m s^{−1} day^{−1} from 300 to 30 hPa but increase rapidly above 30 hPa (Fig. 4b). Such a decrease in dynamical consistency with height, especially above 10 hPa, is again partly due to unresolved GWD. By comparing Figs. 4b and 4d, we also observe that the latest reanalysis datasets benefit more strongly from the non-QG terms to improve the dynamical consistency above 30 hPa (i.e., J55, E-I, and ME). Here, it is noteworthy that, despite some reanalysis datasets showing very small climatological residual [e.g., E-I between 10 and 3 hPa (dark blue line in Fig. 4a)], the RMSR can be quite substantial (Fig. 4b). This suggests that, although the momentum equation is consistent in its long-term mean, the residual can vary significantly on a daily time scale. Diagnostics of *R* and RMSR offer therefore complementary information that should not be neglected.

The reduction of the RMSR by including non-QG terms can be partly attributed to a generally less negative *R* in comparison to in most datasets (Fig. 4c). An exception is found in E40, which exhibits a comparable and *R* in the stratosphere. This result indicates that non-QG processes are not well represented in E40. Further examination of Figs. 4a and 4c reveals that the interdata spread of is comparable to that of *R*. This indicates that the interdata spread in the climatological *R* is mainly caused by the QG terms instead of the non-QG terms.

The relative importance of individual forcing terms and their relationships to *R* at 30 hPa, where parameterized GWD likely plays a minimal role in driving the zonal-momentum tendency (e.g., Fig. 2f), are illustrated in Fig. 5. Figure 5a shows a near-one-to-one relationship between the Coriolis torque and *R*, with a correlation coefficient of −0.97. This result indicates that the interdata spread in *R* is primarily caused by differences in the Coriolis torque among the reanalysis datasets. When the Coriolis force is more positive (as in E40), it is compensated for by a more negative residual according to Eq. (1). Although a negative correlation is also found between the meridional eddy momentum flux convergence and *R*, the slope is weak (Fig. 5b). The relationship between the non-QG terms and the residual is also weaker than that of the Coriolis torque (Fig. 5c). This result, which is consistent with the case study illustrated in Fig. 3, indicates that, although *R* is reduced by considering the non-QG terms, its interdata spread is primarily dependent upon the QG terms. A similar relationship between the residual and the Coriolis torque is also found in the upper stratosphere (not shown).

### c. Vortex state

Does the residual depend on the stratospheric circulation? To address this question, the dynamical consistency is further evaluated for different stratospheric states. We characterize the dynamical state of the stratospheric polar vortex using the magnitudes of zonal-mean zonal wind and its tendency averaged from 45° to 85°N at 30 hPa. The two reference quantities are first smoothed over a 5-day window to remove spurious fluctuations and to focus on time scales ranging from several days to longer.

A vortex state is then defined in the and space, as illustrated in Fig. 6a. The reference point, that is the center of the vortex state diagram (Fig. 6a), is set as the wintertime median values of and . Although the reference point can be defined for each dataset individually, the same point is used for all datasets by first taking the multidata median values of and . This allows a direct comparison among datasets.

With respect to the reference point, new coordinates *x* and *y* are described as

They represent the normalized distributions of zonal wind and zonal wind tendency anomalies with respect to the winter median values ( tilde) and standard deviations (*σ*). Again, the multidata median values of and are used to define unique *x* and *y* that are independent of individual datasets. The phase angle is then defined as the angle obtained by transforming *x* and *y* from Cartesian to polar coordinates according to

where atan2 is the four-quadrant inverse tangent. The phase angle *θ* (illustrated with radial lines in Fig. 6a) is not the standard angle used in mathematics but is instead defined as rotating clockwise to represent an idealized evolution of the polar vortex. Note that by using the wintertime median values of and as the origin of the vortex state diagram (intersection of radially oriented lines in Fig. 6a) instead of the seasonal mean, the sample density in *θ* is made more uniform. For example, using the seasonal mean value results in an uneven distribution of data points in *θ* because of the skewness in the distribution of .

Here it is important to state that vortex trajectories do not always rotate around the reference point (, ). To illustrate this fact, the 1998 SSW event (Fig. 3) is shown in Figs. 6a. In the course of this event, individual reanalyses generally show small discrepancies in zonal wind and zonal wind tendencies, except N–N and N–D, which stand out slightly more from the others. The median of all reanalysis datasets, which is used to compute *θ* and *r*, is not sensitive to the two outliers and follows more closely the evolution of the other reanalyses. The 1998 SSW event starts with a vortex that is already unusually weak (around 180° of phase angle, just left of the reference point) and then further weakens over time, as indicated by the trajectories moving toward negative acceleration values and a weaker vortex strength (e.g., 120° in *θ*). The deceleration eventually ceases, and the vortex attains its weakest state while the phase angle is 180°. The polar vortex finally recovers, as shown by the trajectories moving toward positive acceleration values and stronger vortex speeds. As illustrated in Fig. 6a, the vortex strength is weaker than usual throughout the evolution of the 1998 SSW events. As such, this event can be exclusively characterized by phase angles from 100° to 240° and does not revolve around the reference point. This clearly illustrates that the phase angle does not necessarily capture the evolution of the vortex in time. In fact, no single definition of the vortex state would ensure that all vortex weakening or strengthening events revolve around the reference point. It is nonetheless useful to characterize different states of the polar vortex. For example, an unusually strong vortex that is accelerating (270°–0°) and decelerating (0°–90°) and an unusually weak vortex that is decelerating (90°–180°) and accelerating (180°–270°) can be easily described with the phase angle.

To examine the momentum budget in different states of the polar vortex, the variables of interest are averaged within a 20°-wide bin of *θ*. For example, the composite field at 0° represents an average of all data points between θ = −10° and θ = 10°. Ambiguous data points with a radius *r* smaller than 0.5 are discarded, as illustrated by the red circle in Fig. 6a. For all phase angles, the zonal wind and its tendency are similar across reanalysis datasets (Figs. 6c,d). However, depending on the phase, the range of *R* among the datasets reaches up to 1 m s^{−1} day^{−1} (Fig. 7a), which is substantially larger than the interdata differences in zonal wind tendency (Fig. 6d).

Figure 7a shows that most datasets exhibit a weak phase dependence of *R* except in J25 and E40. As discussed earlier, E40 shows the largest *R*, especially around 210°. The phase dependence of the residual is more evident in the QG momentum budget (Fig. 7b) with large negative over 75° and 270°. For all datasets, is more negative than *R*. This result indicates that the non-QG terms play a nonnegligible role in reducing both the mean residual and the phase-dependent residual (Fig. 7c). A greater reduction of the phase-dependent residual is seen in the latest reanalyses (i.e., J55 and ME), suggesting a better representation of the non-QG terms in those datasets.

To better understand the interdata spread of *R*, the evolution of momentum forcing terms is further evaluated as a function of the phase angle (Fig. 8). The Coriolis torque is generally negative, with a minimum value around 80° (Fig. 8a). Again, E40 is an outlier, especially from 150° to 270°. The convergence of meridional eddy momentum flux is similar across reanalysis datasets (Fig. 8b), although the values are smaller in magnitude in N–N and N–D during the deceleration phase. The non-QG terms are generally negative but rather small, except around 70° and 270° (Fig. 8c; the range of the *y* axis in Fig. 8c is only 20% of the other panels), explaining the phase dependence of seen in Fig. 7b.

## 5. Decadal variability and long-term trend

In this section, we evaluate if the increase in satellite observations over time has affected the dynamical consistency of reanalysis datasets in the stratosphere. Figure 9 presents the time series of *R* and RMSR at 30 hPa since 1958. The *R* and RMSR are computed individually for each year in both the NH and SH extratropics.

In the NH, most datasets show small *R*s and RMSRs with no notable changes over time (e.g., E-I, N–C, and ME). A clear exception is E40 which exhibits a strong interannual and decadal variability in *R* and RMSR. Interestingly, J55 shows a steady progression of *R* over time from negative to neutral values (Fig. 9a). A similar behavior is also observed in J25 and N–D with a pronounced decadal variability, especially in J25. These datasets show a decrease in the RMSR from the 1980s to the 2000s (Fig. 9b). Other datasets also show a weak hint of the decreased RMSR over time.

The typical magnitudes of *R* in the SH are similar to those in the NH. Most modern reanalyses show small and nearly constant residuals over time. Again, J55 shows a trend. However, unlike in the NH winter, the *R* in the SH winter becomes increasingly negative. The opposite is found in N–D: that is, *R* increases since the 2000s. The associated positive trends of RMSR in J55 and N–D are in stark contrast to their RMSR trends in the NH winter. Further analyses are needed to understand these hemispheric differences.

Although the quality and availability of satellite measurements has undoubtedly improved over time, one may expect some discontinuity in the dynamical consistency when satellite observations were first introduced, especially in older reanalyses, which employ data assimilation techniques that may be less favorable to a balance of the dynamical equations. However, Fig. 9 shows no marked changes in RMSR across 1979 both in J55 and E40. Although not shown, no changes in are found in N–N. This is surprising, especially in the SH, where satellite observations are important. This may indicate that the incremental analysis resulting from satellite data assimilation does not cause a significant imbalance in the momentum budget. There is no evidence either of a systematic change of the Coriolis torque over time in the NH (not shown). As discussed earlier, it may be difficult to constrain the meridional wind in the stratosphere with satellite observations.

## 6. Summary and discussion

This study evaluates the dynamical consistency of the reanalysis datasets in the context of wintertime stratospheric polar vortex variability. It is important to note that this work does not evaluate the accuracy of reanalyses in comparison to independent observational datasets.^{1} Instead, the consistency is evaluated for both the primitive and quasigeostrophic (QG) versions of the zonal-mean momentum equation. Specifically, the sum of individual forcing terms of the momentum equation is compared to the zonal wind tendency in the reanalysis datasets. Their difference is defined as the residual *R*. Although the residual may have a longitudinal structure, the comparison is limited to zonal-mean quantities averaged over the extratropics.

The residual is generally small in the lower stratosphere and becomes increasingly negative with height (Smith and Lyjak 1985; Lu et al. 2015). A large residual in the upper stratosphere is partly explained by the parameterized GWD, which is not included in the budget analysis. However, other processes, such as the imbalance caused by incremental analysis during data assimilation, numerical diffusion, and computational errors, may also play a role. Although the climatological residual is small in some reanalysis datasets in the higher stratosphere, the root-mean-square residual (RMSR) can be substantial, indicating large temporal variations in residual. As expected, the magnitude of the residual evaluated using the primitive equation is smaller than its QG counterpart at all levels. An exception is E40, in which the QG residual is comparable to the primitive equation residual. This result indicates that one needs to be cautious when analyzing daily zonal-mean momentum budget in E40 (Lu et al. 2015).

A detailed momentum budget analysis and an interdata comparison are performed for the NH extratropical stratosphere at 30 hPa, a level where the effect of parameterized GWD is relatively small. The *R* and the RMSR, integrated poleward of 45°, are reasonably small at that level. In general, the datasets released before 2009 (e.g., E40, N–D, and J25) are characterized by a more negative *R* and a larger RMSR than the latest ones (e.g., E-I, N–C, J55, and ME) both in the primitive and QG momentum budget. More importantly, at all levels above 100 hPa, the latest reanalysis datasets show smaller RMSR compared to older datasets from the same institution; i.e., from E40 to E-I, from J25 to J55, and from N–D to N–C. This result indicates that the dynamical consistency of the reanalysis datasets has improved over the generations.

It is further found that the interdata spread in *R* is mostly due to the uncertainty in the Coriolis torque, which depends only on the meridional winds. The other dominant term of the momentum equation, the eddy momentum flux convergence, is generally similar from one reanalysis to the other. Although non-QG terms are important to reduce *R*, especially when the polar vortex accelerates or decelerates, they do not explain much of the interdata spread of *R*. It is important to note that the differences in the zonal-mean meridional wind between datasets are small in comparison to the interdata spread in the zonal-mean zonal wind. While the interdata spread in is on the order of 0.02 m s^{−1}, the spread of is typically around 0.2 m s^{−1}. These values, however, represent about 50% and 1% of typical climatological values of and , respectively. The strong dependence of the residual on the Coriolis torque is explained mainly by the large Coriolis frequency in the high latitudes, which makes the forcing highly sensitive to small differences in the mean meridional motion.

The consistency does not change much during the satellite era, suggesting that the availability of satellite observations is not a major factor determining *R*. This finding rather suggests that the improved dynamical consistency over generations of reanalysis data is likely due to the improved models and data assimilation techniques, while the improved quantity and quality of observations played a minor role. This result is not necessarily surprising: while data assimilation acts to bring the model evolution closer to the true state of the atmosphere, it does not ensure a more consistent momentum budget. In fact, a model that does not assimilate observations can be very consistent as per our definition as long as the wind tendency is well accounted for by the forcing terms, which should be the case since the model follows its own dynamics.

A definitive identification of the sources of discrepancy in the Coriolis torque is difficult within the context of this study. The meridional circulation is known to result from wave drag and radiative processes within the boundaries of the atmosphere (e.g., Eliassen 1951). An improved representation of these processes is often accomplished from one reanalysis to the following generation. For example, E-I presents a more realistic radiation budget than E40 (Dee et al. 2011). This can explain, along with progress in assimilation techniques, the improvements in stratospheric circulation (Monge-Sanz et al. 2013). E40, the predecessor of E-I, is known to be characterized by an overly strong BDC, which may result in excessive forcing by the Coriolis torque in the stratosphere. It is also noted in Lu et al. (2015) that the improvements in the BDC could be related to the improvements in data assimilation method and the parameterization of gravity wave drag (e.g., Seviour et al. 2012).

We suggest that continued improvement of the representation of meridional wind in reanalysis datasets is important to reduce uncertainties in the momentum budget. In this regard, we recommend that reanalysis centers provide forcings from radiative processes and parameterized gravity wave drag as a standard daily output. It would allow a more precise attribution of the interdata variability in the meridional circulation to specific biases in forcing, in addition to closing the momentum budget in the upper stratosphere, where the drag exerted by gravity waves is significant.

## Acknowledgments

The authors appreciate the constructive comments of three anonymous reviewers. This research is supported by the Korea Meteorological Administration Research and Development Program under Grant 2015-2091-1 and the Seoul National University (SNU)–Yonsei Research Cooperation Program through SNU in 2015.

### APPENDIX A

#### Sensitivity to Data Resolution and Numerical Schemes

In this study, temporal derivatives are evaluated using a five-point stencil method (e.g., Durran 2010):

the error of which is on the order of (Δ*t*)^{4}. Temporal derivatives using this scheme exhibit some level of noise. To reduce this noise and to reduce the impact of inaccuracies in our numerical methods, a four-point running mean, equivalent to taking daily averages, is applied to 6-hourly data before computing the residual. Each term of the momentum equation [Eq. (1)] is smoothed by the same method. This time filtering also reduces temporal oscillations found in some terms of the momentum equation that may be due to reanalysis increments and semidiurnal tides (see Fig. 3d). The effect of time filtering on momentum diagnostics is illustrated in Fig. A1 for the month of December 1998. At all levels, the RMSR without temporal smoothing (SM-N) is more than twice as large as the reference RMSR. This indicates that temporal filtering is crucial to obtain a dynamically consistent momentum budget. The large RMSR found in unfiltered data may be explained in part by the use of the five-point stencil to compute wind tendency. This stencil filters out semidiurnal oscillations, while they are still present in some forcing terms, increasing the 6-hourly residuals. Here we note that the temporal smoothing only affects the RMSR; the time-averaged *R* is largely unaffected.

Vertical and meridional derivatives use a three-point stencil:

where *x* represents either pressure or latitude. This scheme, which has an accuracy on the order of (Δ*x*)^{2}, is chosen for its ability to evaluate derivatives close to the boundaries. Higher-order schemes, which are more accurate, do not allow the evaluation of derivatives close to the domain boundaries. This is also why we present diagnostics at 30 hPa, two levels lower than the upper limit of N–N and N–D. Since the pressure levels are not evenly spaced, the centered difference in the vertical is first computed for half levels and then linearly interpolated back to the original pressure levels. For this reason, we cannot expect vertical derivatives to be exactly of (Δ*x*)^{2} accuracy. To examine the sensitivity to the numerical schemes, variables of interest are first interpolated to the appropriate levels before performing vertical derivatives (VD-2 in Fig. A1a). The resulting RMSR is quantitatively similar to the reference RMSR. Other schemes of vertical derivatives are also considered (e.g., ln*p* interpolation instead of linear interpolation, derivation simply by using neighboring levels without any interpolation, etc.). Although not shown, the results are not sensitive to the details of the vertical derivative.

The sensitivity of the results to data resolution is also tested by repeating the momentum diagnostics with different spatial resolutions. Specifically, horizontal and vertical resolutions are altered (Fig. A1b). The horizontal-resolution test compares the RMSR using the original grid (1.25° × 1.25°; HR-H) with the reference RMSR (REF), which is derived from the interpolated grid (2.5° × 2.5°). Likewise, the vertical-resolution test compares the original levels (VR-H) to the common levels used in this work. In general, the horizontal resolution has a minimal impact on the residual with weak sensitivity in the upper stratosphere (cf. REF and HR-H in Fig. A1b). The vertical resolution also shows a weak sensitivity in the lower stratosphere and upper troposphere region (cf. REF and VR-H), but this does not significantly alter our interpretation of the results.

### APPENDIX B

#### The Transformed Eulerian-Mean Equation

Wave-mean flow interaction in the stratosphere is commonly evaluated using the TEM momentum equation, where the Eulerian-mean circulation is transformed into the residual circulation (, ), equivalent to Lagrangian-mean motion under linear assumptions (e.g., Dunkerton 1978). The residual circulation is expressed as

When substituting for and in Eq. (1), one obtains the following equation:

which is the TEM momentum equation (Andrews et al. 1987). Equation (B2) expresses the zonal-mean zonal wind tendency in terms of the Coriolis torque, which is proportional to the meridional component of the residual circulation, the wave driving expressed as the EP-flux divergence, and the meridional and vertical advection of momentum by the residual circulation. In spherical coordinates, the divergence operator is expressed as

where the EP-flux vector takes the following form:

In Eq. (B2), the residual *R* appears again and is identical to that in Eq. (1) since the transformation only redistributes forcing terms. Therefore, diagnostics of *R* are valid not only for the Eulerian-mean momentum equation but also for the TEM momentum equation.

In the TEM framework, the QG momentum equation (e.g., Edmon et al. 1980) becomes

The EP flux then takes the following form:

where the meridional and vertical components are functions of meridional momentum and heat fluxes, respectively. Not surprisingly, there is a large balance between wave drag () and the Coriolis torque (, not shown).

Although not shown, most diagnostics were also performed for the QG TEM equation. Similar to the Eulerian-mean equation, it is found that the interdata difference in *R* is largely explained by the Coriolis torque due to the residual circulation (, Fig. B1a). More specifically, this is caused by the Eulerian-mean rather than the vertical derivative of heat fluxes [see Eq. (B1)]. Figure B1b, in fact, shows that the vertical component of EP-flux divergence, which is essentially the same as the second term of in the QG limit [Eqs. (B1) and (B6)], is more similar among reanalysis datasets than (Fig. 5a) and is only weakly correlated to the residual. The meridional component of EP-flux divergence is the same term as the convergence of eddy momentum flux in the Eulerian equation (Fig. 5b) and shows small spread among reanalysis datasets. Our interpretation of the sources of *R* is therefore not dependent on the dynamical framework used.

## REFERENCES

*Middle Atmosphere Dynamics*. International Geophysics Series, Vol. 40, Academic Press, 489 pp.

*Numerical Methods for Fluid Dynamics*. Texts in Applied Mathematics, Vol. 32, Springer, 516 pp., doi:.

*SPARC Newsletter*, No. 38, SPARC Office, Zurich, Switzerland, 14–17.

## Footnotes

^{1}

Comparisons of reanalysis datasets with observations have been carried out with independent observations that are not assimilated in the reanalysis data. For instance, Kishore Kumar et al. (2015) found that the zonal winds of ME over India agreed reasonably well with rocketsonde wind measurements below 0.1 hPa, while meridional winds were not as well represented. Similarly, Kozubek et al. (2014) found a robust correspondence between midlatitude stratospheric winds and radiosondes for the zonal wind component. Other wind components show greater discrepancy, mostly occurring during boreal winter. Such discrepancy is stronger in E40 compared to N–N or E-I.