The Eliassen–Palm (E-P) flux divergences derived from ERA-40 and ERA-Interim show significant differences during northern winter. The discrepancies are marked by vertically alternating positive and negative anomalies at high latitudes and are manifested via a difference in the climatology. The magnitude of the discrepancies can be greater than the interannual variability in certain regions. These wave forcing discrepancies are only partially linked to differences in the residual circulation but they are evidently related to the static stability in the affected regions. Thus, the main cause of the discrepancies is most likely an imbalance of radiative heating.
Two significant sudden changes are detected in the differences between the eddy heat fluxes derived from the two reanalyses. One of the changes may be linked to the bias corrections applied to the infrared radiances from the NOAA-12 High-Resolution Infrared Radiation Sounder in ERA-40, which is known to be contaminated by volcanic aerosol from the 1991 eruption of Mt. Pinatubo. The other change may be due in part to the use of uncorrected radiances from the NOAA-15 Advanced Microwave Sounding Units by ERA-Interim since 1998. These sudden changes have the potential to alter the wave forcing trends in the affected reanalysis, suggesting that extreme care is needed when one comes to extract trends from the highly derived wave forcing quantities.
The equator to pole circulation in the winter stratosphere is primarily driven by the upward propagating waves from the troposphere. This large-scale dynamical process is called the Brewer–Dobson (B-D) circulation and can be studied using the transformed Eulerian mean (TEM) equations (Edmon et al. 1980; Andrews et al. 1987; Holton et al. 1995; Shepherd 2007; Birner and Bonisch 2011). The Eliassen–Palm (E-P) flux divergence, which represents the wave forcing that acts on the mean flow to cause wind and temperature variations, is the most important quantity in the TEM equations. Numerous studies have used the TEM equations together with the E-P flux divergence to study the interannual variation and long-term trends of the B-D circulation (Edmon et al. 1980; Seviour et al. 2012), the behavior of planetary wave activity (Hu and Tung 2002; Karpetchko and Nikulin 2004; Hu et al. 2005), the variability of the polar vortex (Waugh et al. 1999; Newman et al. 2001), the momentum balance of the stratosphere (Dima and Wallace 2007; Monier and Weare 2011), and the annual cycle in tropical tropopause temperature (Kerr-Munslow and Norton 2006; Randel et al. 2008; Randel and Jensen 2013), among other topics. The fidelity of these studies relies crucially on the accuracy and homogeneity of the datasets that are used to derive the E-P flux divergence and the residual circulation that approximates the B-D circulation.
The E-P flux divergence is a highly derived quantity. Its calculation requires nonlocal information such as the spatial and temporal departures of the primary variables (i.e., winds and temperatures) from their mean fields. It is therefore extremely difficult to estimate the E-P flux divergence directly using station-based measurements. In addition, the calculation involves not only estimates of high-frequency fluctuation of the wave fluxes at different altitudes and latitudes but also differential operators that are applied to the slowly varying background temperature gradient. All these complications can potentially cause biases in the climatology, interannual variability, and/or long-term trends of E-P flux divergence. The nonlocal and nonlinear operators may also amplify small errors that are associated with the primary variables to much larger errors in the E-P flux divergence. It is therefore important to gauge the uncertainties in estimating the E-P flux divergence.
The most commonly used tools to derive the E-P flux divergence are reanalysis datasets, which are normally constructed by a variety of observations that are assimilated by using numerical weather prediction models to give a coherent representation of the global atmosphere with uniform spatial and temporal coverage (Uppala et al. 2005; Dee and Uppala 2009). A major concern with the use of reanalyses is their accuracy and homogeneity in representing both the underlying dynamics and long-term trends (e.g., Sterl 2004; Bengtsson et al. 2007; Thorne and Vose 2010). In particular, regions with relatively large analysis increments (defined as the reanalysis minus the model first guess that is based on the 6-hourly model forecast) can induce errors in estimating radiative balance and temperature (Uppala et al. 2005; Dee and Uppala 2008, 2009). In addition to model errors and drifts, studies have also shown that reanalysis datasets tend to differ from each other, especially in regard to long-term trends (e.g., Bengtsson et al. 2007; Kobayashi et al. 2009). This is because low-frequency and trend uncertainties may be induced by observational errors, including instrument biases and changes in geographical coverage. Sudden changes induced by incorporating newly available radiance measurements are of a particular concern in causing biases in low-frequency variation (Simmons et al. 2014).
ERA-40 and ERA-Interim, the two major consecutive reanalysis datasets produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), have been widely used for the study of atmospheric circulation and processes (Dee and Uppala 2009; Uppala et al. 2005). ERA-Interim, the newest reanalysis product of ECMWF, is known to have many improvements over ERA-40 (Dee and Uppala 2008, 2009; Dee et al. 2011a). It has much smaller analysis increments during winter at high latitudes, more realistic temperature trends and radiative budget, and more reliable low-frequency variability (Dee and Uppala 2009; Dee et al. 2011a; Screen and Simmonds 2011; Bracegirdle and Marshall 2012; Cornes and Jones 2013; Simmons et al. 2014). It also has better representations of the hydrological cycle in the tropics and subtropics and a more realistic B-D circulation in the stratosphere (Schoeberl et al. 2003; van Noije et al. 2004; Monge-Sanz et al. 2007, 2013; Dee et al. 2011b). Studies have yet to be undertaken to evaluate how the improvement may have affected the wave forcing estimates. Because it is extremely difficult to compare the wave forcing estimates directly against the observations, a comparative study may provide some insights into the uncertainties of estimating wave forcing based on reanalysis datasets.
This study undertakes a comparative study between ERA-40 and ERA-Interim to quantify the discrepancies in wave forcing, measured by the E-P flux divergence and the associated wave fluxes. We choose to compare these two ECMWF reanalyses mainly because of the well-documented improvements of ERA-Interim over ERA-40; these help in diagnosing the possible causes of the discrepancies. Our focus is on the height region from the upper troposphere to the upper stratosphere (500–1 hPa), where the zonal mean wave forcing is the main driver of the large-scale circulation, and the Northern Hemisphere (NH) winter mean of December–February (DJF), when both the wave amplitude and variability are largest. We first detect the regions with the largest E-P flux divergence discrepancies and identify the key wave fluxes that contribute the most to them. We then examine to what extent the E-P flux divergence discrepancies are linked to discrepancies in the residual circulation. Finally, we apply a changepoint detection method called the penalized maximal t test (PMT) to investigate the temporal consistency of the poleward eddy heat flux in these reanalyses.
2. Data and methods
The 40-yr ECMWF Re-Analysis (ERA-40) was generated by using the ECMWF Integrated Forecast System (IFS) model and its 6-hourly three-dimensional variational data assimilation (3D-Var) system (Uppala et al. 2005). It covered the period from September 1957 to August 2002 and incorporated observations from in situ measurements, including balloons, radiosondes, dropsondes, aircraft, and ships, along with satellite observations, which only provided global coverage of radiance measurements from 1979 onward. The data ingestion involved approximately 7–9 × 106 observations at each time step. The assimilation model used had a spectral T159 grid, corresponding to a 1.125° grid spacing in latitude and longitude and 60 levels in the vertical between the surface and 0.1 hPa (~65 km). Analysis products on the 23 standard pressure surfaces from 1000 to 1 hPa are available for general use.
Covering the data-rich satellite era of 1979–present the Interim ECMWF Re-Analysis (ERA-Interim) is the ECMWF’s current comprehensive atmospheric reanalysis (Dee and Uppala 2009; Dee et al. 2011a). It makes use of the same observations as ERA-40 before September 2002, supplemented with ECMWF operational data afterward (Berrisford et al. 2011; Simmons et al. 2014) but with major improvements over ERA-40. Especially, the ECMWF’s operational four-dimensional variational data assimilation (4D-Var) system couples the dynamic variables more cohesively with the humidity and radiation than its previous 3D-Var analysis system. This ensures a realistic interaction of temperature, vertical velocity, and humidity both temporally and spatially. Improved correction of biases in satellite radiance data is also achieved through the use of an automated variational bias correction system that optimizes the consistency of multiple measurements (Dee 2005; Dee and Uppala 2009; Dee et al. 2011a). In addition, the ERA-Interim assimilation model has a spectral T255 grid, corresponding to a ~0.70° grid spacing in latitude and longitude. It represents a higher spatial resolution than ERA-40; hence smaller-scale waves are resolved explicitly. The increase in spatial resolution is one of the key factors contributing to the reduction of analysis increments of temperatures as well as to a more realistic representation of the B-D circulation, in addition to many other improvements, including better physical parameterization schemes for radiative transfer, data quality control, subgrid-scale orographic drag, humidity analysis, clouds, and surface/soil processes (Dee and Uppala 2009; Dee et al. 2011a). The ERA-Interim assimilation model uses the same vertical levels as ERA-40 but the data are made available at 37 levels between 1000 and 1 hPa, including the standard 23 levels used by ERA-40.
Our analysis is based on the overlapping 22 winters (i.e., the winters of 1979/80–2001/02) that are shared by both ERA-40 and ERA-Interim. For clarity and simplicity, the definition of a winter is based on January across this paper; for example, the DJF mean of the 1979/80 winter is numbered and stated as 1980 hereafter.
b. TEM equation and the E-P flux divergence
The momentum balance in the TEM framework provides a theoretical account of large-scale dynamics by linking the mean flow acceleration to the residual circulation and large-scale wave forcing (Andrews et al. 1987). In spherical coordinates, it is expressed as
where u, υ, and w are Eulerian zonal, meridional, and vertical winds, a is the mean radius of Earth, ϕ is latitude, is the standard density in log-pressure coordinates, is the sea level reference density, z is the log-pressure height coordinate, H is the mean scale height (=7 km), f is the Coriolis parameter, the overbar denotes zonal average, and subscripts denote the derivatives with respect to the given variable. In Eq. (1), and represent the residual mean meridional and vertical winds; they can be expressed as
where θ is potential temperature and primes denote the departure from zonal mean. The term on the right-hand side of Eq. (1) is the E-P flux divergence and represents other nonconservative mechanical forcing, such as parameterized subgrid processes including gravity wave drag. Equation (1) states that the acceleration or deceleration of zonal mean zonal wind is affected by the residual mean meridional circulation (the sum of the first and second terms on the right-hand side, which is denoted as hereafter), the resolved or large-scale wave forcing that drives the circulation to depart from its radiative equilibrium (the third term on the right-hand side, which is denoted as hereafter), and the contribution from other nonconservative processes (the term). In this context, a significant difference in signifies inconsistency of wave forcing between these two datasets while a significant difference in suggests a different behavior in the B-D circulation. The E-P flux divergence that is the key to estimating can be further expanded into
where the meridional and vertical components of the wave forcing can be calculated as
Equation (1) is assembled in this form so that the net effect of the wave forcing on the mean flow can be quantified. Its individual terms, however, may show contrasting or opposite behaviors (Edmon et al. 1980; Palmer 1981). Here, to identify the key flux terms that contribute most to the total wave forcing discrepancies, we not only analyze the total E-P flux divergence term but also look into the individual contributing terms separately. In the latter case, we effectively employ an Eulerian approach by expanding the total wave forcing term into five additive terms according to Eqs. (4)–(6). The five terms are , , , , and . Theoretically, and should be the dominant terms that contribute to the total in the extratropics according to the quasigeostrophic approximation (Andrews et al. 1987). In the extratropical lower stratosphere where the wave forcing is primarily dominated by the vertical propagation of planetary waves from the troposphere, is central to the total wave forcing calculation (Newman and Nash 2000). Near the tropics or in the regions where plane-parallel gravity waves are present, the contribution from the vertical momentum flux term may also play an important role (Andrews et al. 1987).
All the wave forcing quantities are calculated using data archived at 2.5° × 2.5° grid spacing and at the 23 pressure levels that are common to both reanalyses. As a result, the wave forcing estimated from this coarse resolution should primarily be dominated by the effect of large-scale Rossby waves. The derivatives involved in the E-P flux divergence and other quantities in Eq. (1) are all calculated using centered differences except for those at the top and bottom boundaries (i.e., 1000 and 1 hPa), where one-sided differences were employed. As such, the results at the boundaries are less reliable. In addition, all the calculations are performed on daily mean winds and temperatures and then averaged over the DJF season. We chose to use daily averages rather than the 6-hourly instantaneous records because the very high-frequency waves such as diurnal tides should make a negligible contribution toward the wave driving B-D circulation.
c. Diagnostic tools
ERA-40 and ERA-Interim describe the same circulation of Earth’s atmosphere. Ideally, there should be no difference between them in all the wave forcing quantities and in the residual circulation term . In reality, the reanalysis datasets differ from each other due to the dissimilarity in bias correction, physical parameterization, model resolution, and/or the ways of assimilating observations. The wave forcing as well as the circulation parameters therefore differ as a consequence of these sources of dissimilarity. We use composite analysis with a two-sided Student’s t test to diagnose regions with significant differences in their climatological means across their common period (i.e., 1980–2002). The composite differences are all performed as ERA-40 minus ERA-Interim and denoted as ERA40 − ERAInt hereafter.
We apply the penalized maximal t test (Wang et al. 2007) to detect a significant sudden shift of mean in the wave forcing differences between the two reanalyses. A brief description of the method can be found in the appendix. To examine the principal contributor to the discontinuity, the PMT identification is separately applied to the total, stationary, and transient components of the wave forcing. This is because stationary waves are excited by the topography as well as land–sea heating contrast while transient waves are dominated by synoptic-scale weather patterns (Newman and Nash 2000). At each grid point, the total eddy heat flux is calculated by multiplying the daily zonal departures of meridional wind υ and temperature T (i.e., υ′ and T′) and averaging the multiplied quantity over DJF. To obtain the stationary component , we first average DJF meridional wind υ and temperature T at each grid point and then multiply the zonal departures of the seasonally averaged quantities. The transient component is estimated simply by . These three components are then zonally averaged in order to obtain their zonal mean fields. Also, when a winter is found to contain a significant sudden jump (i.e., a changepoint), composite analysis based on the detected changepoint winter is then used to investigate the spatial characteristics of the discontinuity. It is worth noting that the results reported here are case studies that demonstrate the usefulness of the detection technique rather than exhausting all possible discontinuities in both datasets.
a. Discrepancies in E-P flux divergence
Figures 1a and 1b show the climatology of DJF mean E-P fluxes (arrows) and E-P flux divergence term Ψ (contours) estimated from ERA-40 and ERA-Interim respectively. Both climatologies show that the wave forcing is marked by the upward and equatorward propagation of the E-P fluxes that are associated with the mainly negative E-P flux divergence term Ψ. There are two distinct peak regions of Ψ, one in the upper troposphere [~(200–300) hPa] and another in the upper stratosphere [~(1–3) hPa]. Another smaller peak can also be observed at high latitudes around 5–10 hPa.
Figure 1c shows the composite difference in the DJF mean E-P fluxes and E-P flux divergence term Ψ between the two reanalyses. The main feature of is the vertically alternating positive and negative anomalies in the extratropics, which intensify and expand more toward the equator with increasing altitude. As a result, the largest appears in the upper stratosphere, where the negative Ψ anomalies cover poleward of 20°N. It is worth noting that the magnitudes of are as large as 20%–40% of the climatological Ψ in this region. Anomalously upward E-P flux vectors are found in the lower and upper stratosphere, suggesting an overall stronger wave forcing in ERA-40 than ERA-Interim in the stratosphere.
Figure 2 shows the time series of DJF mean total E-P flux divergence term Ψ that are area-averaged over 45°–75°N at 3, 10, 50, and 100 hPa (top–bottom). At 3 hPa, noticeable discrepancies in both interannual variability and climatological mean can be observed with more negative Ψ values in ERA-40 than ERA-Interim. At 10 hPa, the difference is due mainly to the climatological mean with the ERA-Interim Ψ being more negative overall than that of ERA-40. At 50 hPa, a generally similar behavior to that at 3 hPa can be seen though the magnitude of the discrepancy is relatively smaller. At 100 hPa, the discrepancy is again dominated by a difference in the climatological mean with the ERA-40 Ψ being less negative than that of ERA-Interim. Over these four pressure levels, the climatological means of Ψ estimated from ERA-40 and ERA-Interim alternately exceed each other. The discrepancies are comparable to 15% of the interannual variability of Ψ at 10 hPa; this value increases to 45% at 100 hPa. There are also apparent trends in Ψ45–75N, especially at 100 hPa where upward trends are clearly noticeable in both ERA-40 and ERA-Interim estimates, and the trend of ERA-40 Ψ45–75N,100 hPa is distinctly steeper than that of ERA-Interim Ψ45–75N,100 hPa.
Figure 3 shows the composite differences of four of the individual terms that add up to the differences of the total wave forcing term Ψ. Because the climatology of is one order of magnitude smaller than the those of the other terms and no significant differences between the two reanalyses are detectable for , the difference plot of is not shown. It is immediately clear that is the main contributor to the vertically alternating positive and negative anomalies shown in Fig. 1c. In the upper stratosphere, and also play a role in addition to . At high latitudes, the discrepancies have an opposite sign to those of while the discrepancies have the same sign as those of . The combined effect of , , and forms the negative in the high-latitude upper stratosphere. At midlatitudes (i.e., 20°–45°N), plays a major role in causing the wave forcing discrepancies.
In the middle-to-low latitude upper troposphere (0°–50°N, 200–500 hPa), and are the main contributors to . At low latitudes, the discrepancies are dominated by the effects of , which are marked by the vertically alternating negative and positive anomalies that are very similar to those of in the extratropics. These tropical discrepancies are associated with the vertical momentum flux to which is negatively proportional. In the tropical and subtropical tropopause, the term also contributes to discrepancies mainly by enhancing the anomalies. In the midlatitude upper troposphere [~(25°–50°N), 300 hPa], significant discrepancies are found in both and , with positive differences partially counteracting the negative differences. Their combined effect is insignificant negative in this region.
The poleward eddy potential heat flux is the most important quantity that is used to estimate in the middle-to-high latitude stratosphere. To examine the extent to which contributes to the high-latitude , Fig. 4 shows the climatology of estimated from ERA-Interim as well as the composite difference between the two datasets, . The climatological increases with height, with larger values in the middle-to-high latitudes. Note that is also mostly positive and exhibits larger values in the extratropical upper stratosphere, where accounts for ~10% of its climatology. Vertically alternating positive and negative anomalies are found at high latitudes although they are not statistically significant. The term is statistically significant mainly below 200 hPa and away from the high latitudes. These results suggest that the alternating positive and negative shown in Figs. 1c and 3b cannot be explained by the differences in alone.
Figure 5 elaborates this point further by showing the temporal variation of poleward eddy heat flux and its long-term trends at 100 and 10 hPa respectively. It is noted that, considering each pressure level in isolation, is proportional to the poleward eddy potential heat flux , so similar behavior would also be seen in . In general, both datasets follow each other exceedingly well in terms of interannual variability; this holds true for the total, stationary, and transient components of both at 100 and 10 hPa. No significant trends are observed in the total either in ERA-40 or ERA-Interim estimates, although there is a noticeable difference in the climatological mean in the total ERA-40 estimates. However, an upward trend is shown in the stationary while a downward trend is associated with the transient component, with the ERA-40 trends being generally steeper than those of ERA-Interim. Similar positive and negative trends in stationary and transient are observable at 10 hPa, except that at this level the ERA-Interim trend is slightly steeper than that of ERA-40. Nevertheless, we find that the stationary and transient components of show consistent trends throughout the stratosphere, in contrast to the rather confusing trend behavior of Ψ45–75N in the stratosphere (see Fig. 2). These results suggest that the two datasets agree with each other better for than for Ψ in the extratropical stratosphere. They also imply that something other than the eddy heat flux is responsible for the discrepancies in Ψ45–75N.
The eddy heat fluxes and are not responsible for the vertically alternating feature of discrepancies, but Fig. 3b indicates that is the main contributor to . The other possible cause is the vertical gradient of the potential temperature . Figure 6 shows the latitude–height plane of the DJF mean of the vertical gradient of potential temperature , and . Note that is the same as except for being fixed as a constant that is equal to the ERA-Interim climatology. Vertically alternating anomalies are clearly noticeable in all three variables. The discrepancies in and are found not only at high latitudes but also at low latitudes while the discrepancies in are mostly confined to the middle to high latitudes. Wright and Fueglistaler (2013) recently found that net diabatic heating directly above the tropical convective regions is noticeably stronger in ERA-Interim than other reanalyses; this may be linked to the negative at 70–100 hPa and positive at 150–300 hPa. However, comparing the discrepancies in with those in (Fig. 3b), the magnitude of the discrepancies is at most half of those of in the lower-to-middle stratosphere and differences of opposite sign are found near 1 hPa. These results suggest that differences in the vertical temperature gradient between these two datasets play the most important role in explaining the vertically alternating positive and negative anomalies of and those of at high latitudes. The anomalies in the eddy heat fluxes are nevertheless not negligible in terms of their magnitudes; their contribution may be comparable to that from static stability in the upper stratosphere. Nonlinear interaction between these two may also play a role.
Figure 7 shows the vertical profile of DJF zonal mean temperature climatology (Fig. 7a) and differences (Fig. 7b) for the 0°–35°N, 35°–75°N, and 75°–90°N latitude bands. In general, decreases from the surface to the tropopause and then increases from the tropopause to the upper stratosphere. Also, the vertical temperature gradient decreases with latitude with temperature gradient being the steepest at 0°–35°N. For all three latitude bands, the magnitude of is relatively small (<1.5 K) below 10 hPa, but it increases sharply above 10 hPa (to ~5 K). Vertically alternating positive and negative anomalies are clearly visible at low and high latitudes; the effect is less clear for the midlatitude band below 10 hPa.
Figure 8 shows NH polar plots of DJF mean temperature differences at various pressure levels. Two common features that are found at all the pressure levels except for 850 hPa are that 1) at a given pressure level, tends to have the same sign hemispherically, and 2) significant are mostly confined to the low and high latitudes with little signal visible at midlatitudes. At 7 and 20 hPa, the low-latitude signal peaks over the Pacific and Atlantic Oceans where the signal extends more northward. At 70 hPa, the signal is relatively small and spatially patchy. At 100 hPa, the pattern of is broadly similar to that at 20 hPa though it is more zonal and more confined to the tropics. At 500 hPa, there is a lack of significant over most of the Pacific and relatively weaker over the North Atlantic. At 850 hPa, the significant differences are found mainly over the two ocean basins and near the tropics. These results suggest that the temperature differences are zonally symmetric at some levels (i.e., 7 and 100 hPa) and asymmetric at other pressure levels. More importantly, they show that over the Pacific and Atlantic Oceans contributes most toward the vertical zigzag behavior of zonal mean temperature gradient difference in the low latitudes.
As well as showing significant discrepancies in the extratropical stratosphere, Fig. 3 also shows significant discrepancies in the upper troposphere. To illustrate the temporal variation of these tropospheric discrepancies, Fig. 9 shows the time series of DJF mean ERA-40 and ERA-Interim and area-averaged over 25°–50°N at 300 hPa. The discrepancies in both and are due mainly to a difference of climatological mean and the magnitude of the discrepancy is larger than its interannual variability. Also, the interannual variability of is much larger than that of ; may play a dominant role in the total waving forcing for a particular winter such as 1989 in this region.
Figure 3 shows that is also responsible for the total wave forcing discrepancies in the low-latitude upper troposphere. To illustrate the temporal variation of these discrepancies, Fig. 10 shows the time series of ERA-40 and ERA-Interim that are area-averaged over 0°–10°N at 300 hPa (left) and 0°–10°N at 100 hPa (right). The discrepancies are again dominated by a difference in climatological mean. The climatological difference in at 300 hPa is again larger than its interannual variability, implying that there is large uncertainty associated with the momentum budget in this region. Apart from the dominant climatological mean difference, there are also noticeable disagreements in the interannual variability in at 300 hPa. It is noted that ERA-Interim departs farther away from the zero line than ERA-40 at both 100 and 300 hPa, implying a larger magnitude of the vertical eddy flux in ERA-Interim than ERA-40. Small-scale processes such as gravity waves play an important role in (Lindzen 1981), and differences in model resolution and parameterization are the likely sources for the discrepancies. It has been shown that the vertical velocity of ERA-Interim is less noisy than that of ERA-40 (Dee and Uppala 2008; Iwasaki et al. 2009). This may also contribute to the larger magnitude of (or ) in ERA-Interim than ERA-40.
b. Effect on the B-D circulation
This section investigates the extent to which the resolved wave forcing term Ψ is linked to the discrepancies in the B-D circulation by examining the momentum budget of the TEM equation. The first row of Fig. 11 shows the climatology of the DJF mean residual mean meridional circulation (arrows) and the residual mean meridional circulation term Θ (contours) estimated from ERA-40 and ERA-Interim (Figs. 11a,b), as well as the composite differences between these two datasets (Fig. 11c). The main climatological feature of the residual mean meridional circulation in both ERA-40 and ERA-Interim is the upward movement of streamlines of the flow at low latitudes that is followed by poleward movement at midlatitudes and finally downward movement at high latitudes. The residual meridional circulation term Θ is mainly positive, reflecting eastward (or westerly) acceleration and a predominantly northward apparent force on the fluid parcels. In the stratosphere, Θ peaks in the extratropical upper stratosphere and it is in approximate balance with the Ψ peak in the same region (see Figs. 1a,b). The tropospheric Θ peaks poleward of 40°N where Θ is also in rough balance with Ψ. However, Θ is not in balance with Ψ near the tropospheric subtropical jet, where gravity wave drag and upgradient eddy transport (McFarlane 1987; Birner et al. 2013) play an important role. In the TEM formulation [Eq. (1)], the effect of these processes is accounted for by the nonconservative term rather than the resolved wave forcing term Ψ, suggesting that the effects of parameterized processes such as the gravity wave drag and numerical approximation play an important role in this region.
The key feature of the discrepancies in the residual circulation is the broadly positive between 2 and 200 hPa together with the poleward arrows in the same region (Fig. 11c). This implies a stronger residual circulation in ERA-40 than ERA-Interim, which is consistent with other studies (e.g., Iwasaki et al. 2009; Monge-Sanz et al. 2013). However, the regions with significant positive do not generally coincide with the regions of significant negative or vice versa (see Fig. 1c). The only exceptions are the midlatitude upper stratosphere (20°–40°N, 2–7 hPa) and the high-latitude upper troposphere and lower stratosphere (poleward of 70°N, 500–30 hPa), where partially cancels . Therefore, the discrepancies in the E-P flux divergence can only partially explain the discrepancies in the residual circulation.
The second row of Fig. 11 shows the climatology of the nonconservative term (contours) calculated as from ERA-40 and ERA-Interim (Figs. 11d,e), as well as the composite differences of between these two datasets (Fig. 11f). Above 100 hPa, the climatology of is mainly negative for both datasets. This implies that other processes, such as small-scale wave forcing, are also involved in driving the residual meridional circulation (Seviour et al. 2012). Note that Seviour et al. (2012) found smaller magnitudes of the nonconservative term than those shown in Fig. 11e for ERA-Interim. This is likely because our results are based on daily averaged data at 2.5° × 2.5° resolution and for the period of 1979–2002 while Seviour et al. (2012) used 6-hourly instantaneous records at 3.75° × 2.5° resolution for the period 1989–2009.
The magnitude of stratospheric differs between these two datasets; it is nearly twice as large in ERA-40 than in ERA-Interim. This results in hemisphere-wide significant negative above 200 hPa, except for the high-latitude upper stratosphere where gravity waves may play an important role (Holton 1983). The stratospheric is broadly in balance with (see Fig. 11c), implying that the balance between terms other than is better achieved in ERA-Interim than ERA-40. Because the zonal wind tendency term for the DJF mean is at least one order of magnitude smaller than either Ψ or Θ in terms of both the climatology and the differences (not shown) and the differences in the zonal mean zonal wind between these two datasets are negligibly small (Dee et al. 2011a; Lu et al. 2014), results shown in Figs. 1c and 11c,f indicate that the discrepancies in none of Θ, Ψ, or have corresponding differences in zonal mean flow.
In the upper troposphere, is largely in balance with Ψ in terms of climatology (see Figs. 1a,b). Especially, both datasets show a good balance between and Ψ at 15°–55°N, 150–300 hPa. As such, the TEM budget based on the resolved wave forcing becomes inadequate for the assessment of the forced variability of zonal wind in this region. Figure 11f suggests that this nonlinear interaction appears to occur lower in altitude in ERA-40 than ERA-Interim, resulting in the positive centered at 20°–50°N, 300 hPa; the difference may be attributed to the stronger convective motion and therefore more effective vertical heat transport in ERA-Interim (Wright and Fueglistaler 2013).
In the tropical upper troposphere, is mostly in balance with , implying that ERA-40 and ERA-Interim account for large-scale wave forcing and the nonconservative processes differently in this region. Similar to those at 15°–55°N at 150–300 hPa, the discrepancies are closely associated with analysis increments of temperature in the region, where the interaction of temperature, vertical velocity, and humidity is better captured by ERA-Interim than ERA-40 (Dee and Uppala 2009; Dee et al. 2011a). Differences in gravity wave parameterization may also contribute to these tropical discrepancies (McFarlane 1987).
c. Sudden change of mean in the eddy heat fluxes
Up to this point, the diagnostics have been based on the composite differences between the two datasets for their common period; they therefore do not address the discrepancies in long-term trends. Inhomogeneity in either temperature gradient or wave fluxes can induce trend uncertainty in the wave forcing. Because the poleward eddy heat flux is the most important quantity for assessing the impact of tropospheric waves propagating into the stratosphere, it is chosen here to identify possible discontinuities that are induced by a change of instruments, or quantity and quality of observations over time. A similar analysis could also be performed for the temperature gradient , but only the results of are reported here as a demonstration.
In this section, we use the PMT technique to detect any significant sudden departure of difference between the two reanalyses [i.e., ]. The reason that we use rather than or for the detection is because the PMT technique requires that the time series under consideration is normally distributed and does not have a physically real trend. It is more likely that satisfies the “no-trend” assumption because any apparent trend in is more likely to be caused by a discontinuity of observations or an inhomogeneity in the treatment of observations by the data assimilation procedure in one or both of the datasets. Conversely, and are more likely to combine physically real trends with instrumental-induced sudden changes, violating the no-trend requirement of the PMT. For the same reason, is more likely to obey a normal distribution due to the random nature of the observational errors, except for the sudden changes. Most importantly, for each individual time series or , the magnitudes of the discontinuity and the trend are much smaller than that of the interannual variability, making it statistically harder to detect the changepoint. But because the two time series are very strongly covarying (see, e.g., Fig. 5), taking the difference allows us to effectively remove the interannual variability and thus to detect small discontinuities.
Figure 12 shows the time series of DJF mean total, stationary, and transient eddy heat flux averaged over 10°–30°N, 100 hPa, from ERA-40 (blue dashed), ERA-Interim (red dash-dotted), and their difference (black solid). It appears that both total and stationary in ERA-40 have long-term downward trends, which becomes noticeably steeper after the 1991 winter; those in ERA-Interim , however, have no obvious trends. An immediate sudden departure between ERA-40 and ERA-Interim in both total and stationary can be clearly seen after the 1991 winter with ERA-Interim estimates being consistently larger than those of ERA-40 after this time. A different behavior can be observed for the transient component, with ERA-40 estimates being consistently larger than those of ERA-Interim before the 1997 winter and the two estimates becoming more nearly identical to each other after 1997.
According to the three significance measures, a significant changepoint in the wave forcing difference is detected in the 1991 winter, after which the total and its stationary values dropped significantly. The drop is most noticeable in the stationary component, which has a zero mean for the period 1980–91 but a mean value of −1.5 K m s−1 in the period 1992–2002. The drop is about half of the amplitude of its interannual variability. There is another possible changepoint in the winter of 1997, after which the transient component of appeared to drop suddenly. For the 1997 changepoint, however, only two of the three p values are significant at the 0.05 level.
Figure 13a shows a NH polar plot of DJF mean eddy heat flux at 100 hPa estimated using ERA-Interim while Fig. 13b shows the composite difference of at 100 hPa between the periods 1992–2002 and 1980–91 (i.e., later minus earlier periods). The climatological peaks at 45°–75°N and attains its maximum value (~80 K m s−1) over the North Pacific Ocean. The overall pattern of resembles those of previous studies and it is known to be controlled by the stationary component that is related to near-surface topography and topographically induced perturbations (e.g., Plumb 1985; Newman and Nash 2000).
Figure 13b shows the difference plot of total after and before 1991 (later minus earlier periods). The sudden change of in 1991 is mainly marked by a longitudinal belt of negative anomalies centered on 20°N. The largest jump is located near the vicinity of Mt. Pinatubo and there are significant negative anomalies almost everywhere in radiosonde-data-sparse ocean regions in the latitude band of 10°–30°N. The stationary component accounts for almost all of these negative anomalies (not shown). In the mid to high latitudes, the values of are predominately positive. The magnitude of those positive anomalies is found to be noticeably enhanced in the stationary component while significant negative anomalies of transient are found at 70°–80°N (not shown). These high-latitude positive (negative) differences in the stationary (transient) component imply that the 1991 sudden jump induced an upward (downward) trend in the respective component of ERA-40 (see Fig. 5).
Figure 14 shows the time series of DJF mean total, stationary, and transient averaged over 45°–75°N at 10 hPa, estimated from ERA-40, ERA-Interim, and their difference. The total, stationary, and transient show similar behaviors as (see the right-hand panels of Fig. 5), with the two datasets closely resembling each other. However, based on the three significance measures, a significant changepoint is detected in 1998 winter for the difference between these two datasets . After the 1998 winter, the total and stationary dropped significantly. The transient also dropped after 1998 although the third measure does not attain a p value ≤ 0.05. However, the detection of a change after 1998 at 10 hPa involves a comparison between a 4-yr interval that exhibits large anomalies (i.e., 1999–2002) with a 19-yr period of relatively small anomalies (i.e., 1980–98). The relative size of the drop, at ~(5%–7%) of the climatological mean , is much smaller than that at 10–30 or 100 hPa. Thus, the sudden change detection at 10 hPa may not be reliable and the effect of this sudden change on the wave forcing estimates is less of a concern than that at 100 hPa.
Figure 15 shows the spatial characteristics of the 1998 sudden change of at 10 hPa, displayed in a similar way as that for the 1991 changepoint at 100 hPa (i.e., Fig. 13). The climatological flux peaks at 45°–80°N, 90°E−180°, with a maximum value of ~180 K m s−1. Weaker positive fluxes are also present in the region 40°–80°N, 150°W–90°E. The effect of the 1998 changepoint in is mainly confined to the region poleward of 40°N. They are marked by negative differences of over land surfaces at 0°–120°E and 30°–150°W. These high-latitude negative differences imply that the 1998 sudden jump induces an apparent upward trend in the stationary component of ERA-Interim , which might consequently contribute in part to the steeper upward trend of the E-P flux divergence in this region. This may partially explain why the stationary in ERA-Interim has a steeper upward trend than that in ERA-40 (see Fig. 5).
4. Conclusions and discussion
We have here reported that significant discrepancies exist in the wave forcing estimated from ERA-40 and ERA-Interim during NH winter. When measured by the E-P flux divergence, three key regions are identified as having significant discrepancies. They are the entire high latitudes, the upper troposphere, and the extratropical upper stratosphere. The discrepancies in the high latitudes are marked by vertically alternating positive and negative anomalies of the E-P flux divergence. They are manifested as differences in the climatological mean between the two datasets and can account for up to 15%–45% of the interannual variability in the affected regions. Such discrepancies are due mainly to differences in the vertical gradient of potential temperature .
Similar vertically alternating positive and negative anomalies were previously found in the analysis increments of temperature in many reanalysis datasets and are known to be caused by the presence of systematic bias between the data assimilation model and the satellite measurements (Uppala et al. 2005; Dee and Uppala 2009; Kobayashi et al. 2009). Such a bias has a larger magnitude and is more persistent in ERA-40 than ERA-Interim (Simmons et al. 2007; Dee and Uppala 2009). Recent studies indicate much closer agreement to observations by ERA-Interim compared to ERA-40, which is attributed to the advances in the ERA-Interim assimilation system, especially the various improvements of the ERA-Interim 4D-Var system over the previous 3D-Var system that was used by ERA-40 (e.g., Fueglistaler et al. 2009; Dee et al. 2011a; Simmons et al. 2010, 2014; Bracegirdle and Marshall 2012). For this reason, we suggest that the E-P flux divergence discrepancies at high latitudes are most likely due to the model drift induced by the data assimilation system, rather than observational errors.
In the middle-to-low latitude upper troposphere, the discrepancies in the E-P flux divergence are due largely to the bias in the vertical momentum flux . It has been suggested that imbalance of radiative heating induced by assimilation of the observational radiance data tends to introduce noise in the vertical velocity (Schoeberl et al. 2003; Fueglistaler et al. 2009). More importantly, because small-scale processes such as convection and gravity waves may contribute significantly to the momentum budget in addition to resolved wave forcing, differences in model resolution and parameterization of subgrid processes by ERA-40 and ERA-Interim can induce discrepancies in the E-P flux divergence in this region. This may explain why the discrepancies are marked by a cancellation between the E-P flux divergence term Ψ and the nonconservative term . These discrepancies may also be linked to the large bias of analysis increments in the tropical upper troposphere (Dee and Uppala 2009). Furthermore, we have noted that the discrepancy in the E-P flux divergence climatology can be larger than the amplitude of its interannual variation in this region; such large uncertainty strongly discourages merging these two reanalysis products to study wave–mean flow interaction.
In the upper stratosphere, the E-P flux divergence discrepancies involve all the relevant flux terms and are associated with substantial differences in temperature as well as static stability. These discrepancies may be attributed to the relatively larger model bias in the region, where observations are sparse and model errors are large (Dee and Uppala 2009). Nevertheless, we find that the discrepancies between these two datasets become much reduced both in terms of interannual variability, climatological mean, and long-term trend if the wave forcing is measured by the poleward eddy heat flux .
Based on the TEM momentum budget, we have shown that a stronger residual circulation is associated with ERA-40 than ERA-Interim, agreeing with previous studies (e.g., van Noije et al. 2004; Monge-Sanz et al. 2007; Dee and Uppala 2009; Monge-Sanz et al. 2013). However, the discrepancies in the residual circulation are only partially associated with the discrepancies in the resolved large-scale wave forcing. The majority of the discrepancies in the residual circulation are associated with the nonconservative term , suggesting that the bias in wave forcing is not the main cause for the excessively strong B-D circulation in ERA-40. The excessively strong B-D circulation in ERA-40 was in part attributed to systematic analysis increments in stratospheric temperature that resulted from the biases induced by 3D-Var (Uppala et al. 2005). Apart from radiative heating, improvements in the treatment of the effects of volcanic aerosols, gravity wave drag, and better radiation schemes may also have led to an improved representation of the B-D circulation in ERA-Interim (Dee et al. 2011a). Especially, it is known that gravity wave drag plays a considerable role in driving the B-D circulation (McLandress and Shepherd 2009; Butchart et al. 2010; Seviour et al. 2012) and the effect of smaller-scale wave drag is better resolved by ERA-Interim than ERA-40 (Dee and Uppala 2009; Dee et al. 2011a). A recent study shows that the largest differences in radiative heating in the tropical upper troposphere between several reanalysis products are due primarily to differences in cloud radiative heating as well as localized biases in heating and cooling associated with parameterized turbulent mixing (Wright and Fueglistaler 2013).
The thermodynamic balance in the stratosphere is largely a balance between the radiative heating and the dynamical heating from the advection of the residual circulation (Andrews et al. 1987). Because the dynamical heating term in the thermodynamic budget of the TEM equations is the product of the residual velocity and the temperature gradients, an enhanced residual circulation should be associated with cooling in the low-latitude stratosphere and warming at high latitudes if the radiative heating is constant. However, the temperature difference in the tropical lower stratosphere (70–100 hPa) is characterized by significant warm anomalies at 0°–35°N (see Fig. 8) and an enhanced B-D circulation (see Fig. 11). This is opposite to what we would expect from the augment of dynamical heating. Thus, the discrepancies in the dynamical behavior between these two datasets are more likely the result of a dynamical adjustment to a difference in radiative balance. Because the wave forcing discrepancies are mostly confined to the regions where analysis increments of temperatures are known to be largest, we suggest that they are likely to have originated from an imbalance in radiative heating that is introduced during the ingestion of observational data. However, an enhanced B-D circulation and a warmer tropical lower stratosphere previously reported for the ERA-40 cannot be explained only by differences in the static stability or radiative heating in the high-latitude stratosphere and/or in the low-latitude upper troposphere. The differences may also have originated from the bias in the forecast model. For instance, the forecast models may have different radiative transfer scheme and/or they generate different distributions of clouds, ozone, and water vapor, which then leads to different radiative heating. Further studies are required to investigate the implications for tracer transport and ozone tendencies extracted from the reanalysis datasets.
A sudden drop of the eddy heat flux difference is detected after the 1991 winter over the subtropical ocean at ~(10°–30°N) at 100 hPa. This drop could be due in part to the contamination effects of volcanic aerosols on the infrared radiances measured by the High Resolution Infrared Radiation Sounder (HIRS) on board the NOAA-12 satellite following the eruption of Mt. Pinatubo in June 1991 (Uppala et al. 2005). Because the radiative transfer model that was used by ERA-40 did not include the effect of volcanic aerosols, the aerosol contaminated radiance measurements were absorbed by the bias corrections of the 3D-Var system, which is known to result in excessive humidity/rainfall in the tropics and subtropics (Uppala et al. 2005; Dee and Uppala 2009). A revised thinning, channel selection, and quality control of HIRS radiances assimilation was used by ERA-Interim, in which the 4D-Var analysis system couples the humidity and the dynamic variables cohesively to help ensure a realistic interaction of temperature, vertical velocity, and humidity (Dee and Uppala 2009; Dee et al. 2011a).
A subtler sudden drop in eddy heat flux difference is also detected in the midlatitudes at 10 hPa. This drop may be linked to the discontinuity in upper-stratospheric temperatures associated with the radiance measurements from the Advanced Microwave Sounding Unit-A (AMSU-A) since August 1998 in ERA-Interim (Dee and Uppala 2008). While the ERA-40 reanalyzed temperatures in the upper stratosphere inherited a consistent warm bias from the assimilation model, ERA-Interim began using uncorrected radiance data from AMSU-A channel 14 from August 1998 (Dee and Uppala 2008). This change is known to have induced a jump of the global mean temperature above 10 hPa in ERA-Interim (Dee and Uppala 2008). It remains unknown whether or not this change caused a jump of in ERA-Interim or if instead the detected sudden drop of after the 1998 winter is due mainly to a change of correlation coefficient between temperature T and meridional velocity υ.
Several studies have found significant trends in stratospheric wave forcing (Newman and Nash 2000; Randel et al. 2002; Hu and Tung 2002) while others have found that the trends reverse in early and later winter with no significant trend in midwinter (Karpetchko and Nikulin 2004; Hu et al. 2005). Here, we have found that trends in the E-P flux divergences differ substantially between these two datasets. Sudden changes in either temperature gradient or eddy fluxes that are induced by inhomogeneity of observations are able to alter the respective trends and low-frequency variability in the wave forcing. Because of the highly derived nature of the E-P flux divergence, the trends estimated from such a quantity should be treated with extreme caution.
Nevertheless, we have found that the trends in the eddy heat flux are more consistent than those in the E-P flux divergence. In both ERA-40 and ERA-Interim in the midlatitude stratosphere there is a positive trend in the stationary and a negative trend in the transient , generally in agreement with previous studies (Newman and Nash 2000; Randel et al. 2002). It must be noted that the general circulation model (GCM) used in both ERA-40 and ERA-Interim does not include the effect of solar variability; any decadal- to multidecadal-scale variation comes solely from the observations (Dee et al. 2011a). During solar maxima, the background state that is predicted by the GCM of the data assimilation system is generally biased compared to the observations, so systematic analysis increments may emerge as a result. This can affect the low-frequency variation and the trends for both datasets. Thus, further confirmation based on longer records and other reanalysis datasets is needed before we can go further into the physical explanations of the opposite trends in the stratosphere in terms of the stationary and transient .
This comparative study of wave forcing estimated from ERA-40 and ERA-Interim provides an additional perspective for evaluating dynamic processes in the stratosphere and upper troposphere. It is noted that a comparative study like this cannot make a quantitative attribution in terms of which dataset is better and/or by how much. Our results nevertheless show that bias in static stability induced by temperature differences and/or radiative heating imbalance can potentially cause large uncertainty in the E-P flux divergence, endorsing the importance of reducing the analysis increments, especially the model drift, in assimilating temperatures. Our results also demonstrate the importance of the recently established Stratospheric Processes and their Role in Climate (SPARC) Reanalysis/Analysis Intercomparison Project (S-RIP) (Fujiwara et al. 2012; Fujiwara and Jackson 2013).
This study is part of the British Antarctic Survey Polar Science for Planet Earth Programme funded by the Natural Environment Research Council. We acknowledge use of ECMWF reanalysis datasets and documentation at http://www.ecmwf.int. We would also like to thank the three reviewers for their detailed and constructive comments.
Penalized Maximal t Test
Let denote a time series with zero true trend (but potentially containing a sudden change that may give rise to an apparent linear trend in ) and identically and independently distributed (IID) Gaussian errors. To detect a changepoint in is to test the null hypothesis
against the alternative
where and stands for follows an IID Gaussian distribution of mean μ and variance . When is true, the entire time series can be viewed as two independent samples from two normal distributions of the same unknown variance , one for all t ≤ k and another for all t > k, where the point–time t = k is called a changepoint, and is called the magnitude of the mean shift.
To detect the most probable value of k and to test whether the means of these two samples are statistically significantly different from each other, the test statistic for detecting a mean shift by penalized maximal t test is
and is an empirical penalty function that is constructed via Monte Carlo simulation according to N and k (Wang et al. 2007). The functional form of is constructed to give the same level of confidence on the detected changepoints regardless of their position in the time series and to have the same false-alarm rate for all points if happens to be a homogeneous series. The empirical weight function is used to diminish the effect of unequal sample sizes on the power of detection only based on so that the false-alarm rate of PMT is evenly distributed across all candidate changepoints. The detailed formulation of is given in Wang et al. (2007).
Here, we use three measures to evaluate the significance of any detected changepoint. The first measure is the chance of a changepoint occurring at the detected position, the second measure is the significance of the mean-shift magnitude Δμ, and the third measure is the significance of the maximum value of the penalized t statistics . To calculate these three measures, 10 000 synthetic random time series that share the same distribution of are constructed based on Monte Carlo simulations. To calculate the first measure, the PMT detection is performed to find the position k in the time series where the maximum value of exists for each synthetic series. A distribution of the random trial–based k values is then constructed accordingly. The k value estimated from the original time series is compared to this distribution and the rank of the actual k among these randomized trials determines its significance level. To calculate the second measure, we rank the true Δμ values against the distribution of Δμ calculated from the 10 000 synthetic series. Similarly, the third measure is established by ranking the maximum value of of the actual series against those from the synthetic series. When the ranks for all the three measures fall in the 5% tail ends of their associated distributions, the changepoint is regarded as statistically significant. For simplicity, we call the proportional ranks as p values in order to align with the traditional terminology of significance tests.