Poleward migration of the latitudinal edge of the tropics of 0.25°–3.0° decade−1 has been reported in several recent studies based on satellite and radiosonde data and reanalysis output covering the past ~30 yr. The goal of this paper is to identify the extent to which this large range of trends can be explained by the use of different data sources, time periods, and edge definitions, as well as how the widening varies as a function of hemisphere and season. Toward this end, a suite of tropical edge latitude diagnostics based on tropopause height, winds, precipitation–evaporation, and outgoing longwave radiation (OLR) are analyzed using several reanalyses and satellite datasets. These diagnostics include both previously used definitions and new definitions designed for more robust detection. The wide range of widening trends is shown to be primarily due to the use of different datasets and edge definitions and only secondarily due to varying start–end dates. This study also shows that the large trends (>~1° decade−1) previously reported in tropopause and OLR diagnostics are due to the use of subjective definitions based on absolute thresholds. Statistically significant Hadley cell expansion based on the mean meridional streamfunction of 1.0°–1.5° decade−1 is found in three of four reanalyses that cover the full time period (1979–2009), whereas other diagnostics yield trends of −0.5°–0.8° decade−1 that are mostly insignificant. There are indications of hemispheric and seasonal differences in the trends, but the differences are not statistically significant.
In contrast to the subtropics, the tropics are a region characterized by less outgoing longwave radiation (OLR), large-scale uplift, low column ozone amounts, a higher and colder tropopause, and a greater frequency of convective clouds and precipitation. A geographic definition of the “tropical belt” is the region between the tropic of Cancer (23°26′N) and the tropic of Capricorn (23°26′S), with the latitudinal limits defined by the axial tilt of the earth, which varies with a periodicity of approximately 41 000 years. However, from an atmospheric perspective, there is no similarly simple definition of the latitudinal extent of the tropical belt. Instead, the tropical edge latitudes have been quantitatively diagnosed from observations and global models by identifying various thresholds or local extrema in certain properties of the atmosphere as they change from their tropical to extratropical values (see Fig. 1).
Multiple independent analyses using chemical constituent measurements, meteorological observations, and reanalysis fields have identified changes in the latitudinal extent and character of the tropical belt during the past ~30 years. These studies have noted poleward migration using a variety of definitions, including the Hadley cell edge as defined by OLR and the meridional mass streamfunction (Hu and Fu 2007; Johanson and Fu 2009; Mitas and Clement 2005), the region of high-altitude tropical tropopause (Lu et al. 2009; Seidel and Randel 2007, hereafter SR07; Seidel et al. 2008), the region of “tropical”-like low column ozone amount (Hudson et al. 2006), and subtropical jet location based on tropospheric and lower-stratospheric temperature retrievals from satellite microwave sounders (Fu and Lin 2011; Fu et al. 2006).
Other studies based on reanalyses have suggested changes in both the strength and position of the subtropical and polar jet streams (Archer and Caldeira 2008, hereafter AC08; Strong and Davis 2007), and a poleward shift in storm tracks (Fyfe 2003; McCabe et al. 2001; Yin 2005). Although the eddy-driven jets do not exist at tropical latitudes and are not strictly related to the other tropical edge diagnostics listed above, we include them here as part of our loosely defined suite of “tropical edge” diagnostics because our overarching concern is with potential atmospheric circulation changes and also because there is some evidence that the jet latitudes and Hadley cell edge are indeed correlated (Kang and Polvani 2011).
Tropical widening and poleward migration of the jets have been detected in climate model simulations with anthropogenic forcings, and model control runs indicate that the magnitude of the late twentieth-century widening cannot be explained by natural variability alone (Johanson and Fu 2009; Lu et al. 2009; Yin 2005). Furthermore, although the inclusion of greenhouse gas increases and ozone changes (primarily polar ozone depletion) in the models produces tropical widening, the rate of widening is greater in observations than in models for the few diagnostics that have been tested (Johanson and Fu 2009). For example, the late twentieth-century poleward expansion rates from several Hadley cell diagnostics span a range of 0.6°–1.8° decade−1, whereas comparable model estimates are 0.1°–0.2° decade−1 (Hu and Fu 2007; Johanson and Fu 2009).
A better understanding of the dynamical mechanisms controlling the tropical width is very important for assessing the relative importance of ozone depletion and anthropogenic greenhouse gas forcing of both past and future tropical width changes. Several mechanisms have been proposed for explaining the poleward movement of the tropics and jets, and in general these mechanisms involve interactions between the atmospheric thermal structure–gradients, winds, and wave breaking (e.g., Chen and Held 2007; Frierson et al. 2007; Lau et al. 2008; Lorenz and Deweaver 2007; Lu et al. 2007; Polvani and Kushner 2002; Polvani et al. 2011; Simpson et al. 2009). As previously published observation and reanalysis-based widening trends cover a large range from ~0.25° decade−1 (AC08) to ~3° decade−1 (SR07), it is not clear to what extent this range reflects inherent differences in differing aspects of the circulation and its drivers, versus the use of different reanalyses, datasets, time periods, and details of tropical edge definitions.
In addition, it is not known to what extent reanalyses accurately capture trends in the width of the tropical belt. Reanalysis trends can be biased to reflect changes in both the quality as well as the quantity of the underlying data being assimilated (e.g., Kistler et al. 2001; Sturaro 2003), and assessing their accuracy is a difficult and expansive task that is beyond the scope of this paper. As pointed out by Kistler et al. (2001), “agreement between two reanalyses in the climate trend is an important necessary but not sufficient condition for confidence in climate trends.” As a necessary first step toward this end, it is the goal of this paper to examine the sensitivity of tropical widening estimates to the use of different reanalyses, time periods, and definitions.
In the next section, we define the tropical edge latitude diagnostics. These diagnostics are a combination of previously published definitions and also include new and more objective definitions. Then, we present time series and trends of tropical belt widths from the different diagnostics and datasets, partitioned by hemisphere and season. In the results section, we compare the different classes of metrics to one another, identifying some of the advantages and disadvantages of different types of metrics for tropical widening detection. We then compare the trend estimates to previous work, offering some explanations for the differences between our estimates and the previous work.
In this section, the methodology for identifying tropical belt edge latitudes is described. Except for the OLR metric, tropical belt widths are calculated from reanalysis output. To facilitate future comparisons with climate model–derived tropical widths, tropical edge latitudes are identified using monthly-mean fields. Except for the tropopause metrics, which use the model-level data, all tropical widths are calculated using reanalysis output on specified pressure levels. A schematic of the tropical edge diagnostics applied to monthly-mean, zonal-mean fields is shown in Fig. 1. Note that some metrics, such as the mean meridional streamfunction, can only be defined from zonal-mean quantities, whereas others can also be defined as a function of longitude. For the diagnostics that can be defined either way, we first identify the edge latitude as a function of longitude, and then compute an area-weighted mean latitude. This method produces almost identical results to identifying the edge latitude from the zonal-mean quantity but allows for the investigation of the longitudinal dependence of the widening.
The reanalyses used here are the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis (1948–present; Kalnay et al. 1996), the NCEP Climate Forecast System Reanalysis (CFSR, 1979–present; Saha et al. 2010), the Japanese Climate Data Assimilation System [an extension of the Japanese 25-year reanalysis (JRA), 1979–present] (Onogi et al. 2007), the Modern Era Retrospective Analysis for Research and Applications (MERRA, 1979–present; Rienecker et al. 2008), 40-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40, 1957–2002; Uppala et al. 2005), and the ECMWF Interim Reanalysis (ERA-Interim, 1989–present; Simmons et al. 2007).
For reference, Table 1 gives a summary of the metrics described below, including a brief description of how each metric is defined and what quantities are used to calculate it. Table 2 lists the horizontal and vertical resolution of the reanalyses for both the pressure-gridded and the model-level data, including the number of model levels in the vicinity of the tropopause that are used in the tropopause calculation.
a. Tropopause-based tropical belt
In this section, four tropopause height–based metrics for identifying the tropical belt edges are described. One of these metrics is similar to one that has been used in several previous studies, whereas three new metrics offer an alternative, and more objective, means by which to quantify the meridional change in tropopause height from tropical to extratropical values. The tropopause is defined here according to the World Meteorological Organization (WMO) lapse rate definition (WMO 1957), which is the lowest level at which the lapse rate decreases to 2 K km−1 or less, provided that the average lapse rate between this level and all higher levels within 2 km does not exceed 2 K km−1. For all reanalyses, the tropopause (geopotential) height (hereafter zTP) was calculated with 6-hourly model-level output, which typically has double the vertical resolution of the pressure-level output. Monthly means of the 6-hourly tropopause fields are then used for diagnosing the tropical belt edges.
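As an illustration of the lapse rate criterion, the following is a minimal sketch for a single temperature profile. It is not the code used for the reanalyses: the function name, the idealized profile, and the lower search bound `min_height_km` are assumptions introduced here, and the sketch returns the qualifying level itself rather than interpolating between levels.

```python
import numpy as np

def wmo_tropopause_height(z_km, temp_k, min_height_km=5.0):
    """Return the height (km) of the lowest level meeting the WMO criterion.

    z_km and temp_k describe one profile, ordered from the surface upward.
    min_height_km is an assumed lower search bound (not from the text) that
    keeps near-surface inversions from being mistaken for the tropopause.
    """
    z = np.asarray(z_km, dtype=float)
    t = np.asarray(temp_k, dtype=float)
    for i in range(len(z) - 1):
        if z[i] < min_height_km:
            continue
        # Lapse rate (K/km) in the layer just above level i.
        lapse = -(t[i + 1] - t[i]) / (z[i + 1] - z[i])
        if lapse > 2.0:
            continue
        # Average lapse rate from level i to each higher level within 2 km.
        above = (z > z[i]) & (z <= z[i] + 2.0)
        mean_lapse = -(t[above] - t[i]) / (z[above] - z[i])
        if np.all(mean_lapse <= 2.0):
            return z[i]
    return np.nan

# Idealized profile: 6.5 K/km lapse rate up to 11 km, isothermal above.
z = np.arange(0.0, 21.0, 1.0)
temp = np.where(z <= 11.0, 288.0 - 6.5 * z, 288.0 - 6.5 * 11.0)
z_tp = wmo_tropopause_height(z, temp)
```

For this idealized profile the criterion is first met at the base of the isothermal layer. Applied to 6-hourly profiles, the resulting heights would then be averaged to monthly means before any edge latitude is diagnosed.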
The first method of defining the tropical belt using tropopause height is to specify a threshold height (e.g., zTP = 15 km) and then, for each longitude and hemisphere, determine the latitude at which the tropopause drops to the threshold height using linear interpolation. After computing the tropical edge latitude at each longitude (φi) for both the Northern and Southern Hemispheres, an area-equivalent latitude (φeq) is computed for each hemisphere such that the area equatorward of φeq is equal to the area equatorward of the region bounded by φi,
$$\phi_{eq} = \arcsin\left[\frac{1}{n}\sum_{i=1}^{n}\sin(\phi_i)\right], \tag{1}$$
where n is the number of longitude points in the grid.
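The area-equivalence condition can be sketched in a few lines. On a sphere, the area between the equator and latitude φ is proportional to sin φ, so matching areas amounts to averaging sin φi over longitude. The function name and the sample edge latitudes are hypothetical, and equally spaced longitudes are assumed.

```python
import math

def equivalent_latitude(edge_lats_deg):
    """Area-equivalent latitude (degrees) of per-longitude edge latitudes.

    Assumes equally spaced longitudes and edge latitudes taken from a
    single hemisphere.
    """
    n = len(edge_lats_deg)
    mean_sin = sum(math.sin(math.radians(lat)) for lat in edge_lats_deg) / n
    return math.degrees(math.asin(mean_sin))

edges = [28.0, 31.5, 35.0, 30.0]   # hypothetical NH edge latitudes (degrees)
phi_eq = equivalent_latitude(edges)
```

Because sin φ is concave between the equator and the pole, the area-equivalent latitude is never poleward of the arithmetic mean of the φi, and the two coincide when the edge latitude is the same at every longitude.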
This tropical edge definition is a modified version of the one introduced by SR07, who defined the tropical edge as the latitude at which the tropopause height is greater than 15 km for x days per year (where x = 100, 200, or 300). Because SR07 used a log–pressure height with a 7-km scale height (W. Randel 2010, personal communication), their 15-km tropopause threshold is equivalent to a pressure threshold of ~120 hPa, which was also adopted by Lu et al. (2009) (with x = 200 days yr−1).
As discussed in Birner (2010), trends from the SR07 definition are sensitive to the choices of the two arbitrary thresholds employed (i.e., the height and the number of days per year). As applied in SR07, the tropical width is calculated as a yearly and zonal mean, although it could be applied on shorter time/spatial scales. Our modified SR07 definition used here involves only one arbitrary threshold (e.g., zTP = 15 km) and is conceptually simpler to apply on shorter time (e.g., monthly, seasonally) and spatial scales.
In addition to the simple tropopause height threshold, we also consider a definition of the tropical belt based on the latitude at which the tropopause height falls to 1.5 km below the tropical average (15°S–15°N) tropopause height (hereafter ΔzTP = 1.5 km). This definition gives an absolute width of the tropical belt that is similar to the zTP = 15 km definition, but identifying the tropical belt edge relative to a tropical-average tropopause height removes the effect of globally uniform variations in tropopause height (e.g., a long-term global trend in tropopause height). In contrast, choosing a fixed height threshold for defining the tropical edge, as in the zTP method described above, would result in an increase (decrease) in the latitude of the tropical edge for a globally uniform rise (fall) of the tropopause.
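The contrast between the two definitions can be illustrated with an idealized tropopause profile. The sketch below is not the analysis code: the tanh profile, the function names, and the 0.2-km uniform rise are assumptions chosen for illustration, and the tropical average is taken over 0°–15°N of a single hemisphere without area weighting.

```python
import numpy as np

def threshold_edge(lat, field, threshold):
    """Latitude where `field` first falls to `threshold`, scanning poleward.

    `lat` must run from the equator toward the pole of one hemisphere.
    """
    for j in range(len(lat) - 1):
        f0, f1 = field[j], field[j + 1]
        if f0 >= threshold > f1:
            frac = (f0 - threshold) / (f0 - f1)   # linear interpolation
            return lat[j] + frac * (lat[j + 1] - lat[j])
    return np.nan

def tropical_mean(lat, field, max_lat=15.0):
    """Unweighted mean of `field` equatorward of max_lat (a simplification)."""
    return field[np.abs(lat) <= max_lat].mean()

lat = np.linspace(0.0, 90.0, 37)                      # 2.5-degree grid
z_tp = 13.5 - 2.5 * np.tanh((lat - 33.0) / 6.0)       # idealized zTP (km)
z_tp_raised = z_tp + 0.2                              # globally uniform rise

abs_edge = threshold_edge(lat, z_tp, 15.0)
abs_edge_raised = threshold_edge(lat, z_tp_raised, 15.0)
rel_edge = threshold_edge(lat, z_tp, tropical_mean(lat, z_tp) - 1.5)
rel_edge_raised = threshold_edge(
    lat, z_tp_raised, tropical_mean(lat, z_tp_raised) - 1.5
)
```

In this idealized case the uniform rise moves the zTP = 15 km edge poleward by several tenths of a degree, whereas the ΔzTP = 1.5 km edge is unchanged because the threshold rises with the profile.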
The ΔzTP definition is a similar, but simplified, version of the objective tropopause definition used by Birner (2010). The Birner definition involves a height threshold that varies by month for each hemisphere based on the tropopause height that occurs least frequently (i.e., the height threshold for which the edge latitude changes least with threshold choice). The ΔzTP definition, like the Birner definition, gives latitudes that are insensitive to globally uniform changes in tropopause height. The Birner objective threshold varies seasonally between about 13 and 14.5 km, and comparable thresholds from the ΔzTP method are 14.5–15.5 km.
Finally, because both the zTP and ΔzTP methods involve assigning arbitrary thresholds, we also compute an objectively-based tropical edge metric based on the latitude of the maximum value of the tropopause meridional gradient (∂zTP/∂φ). As can be seen in Fig. 1, the meridional drop in tropopause height from its tropical to extratropical value undergoes a maximum rate of change in the vicinity of the subtropical jet; this region is often referred to as the tropopause break.
Unlike the tropopause-change metrics above, where the latitude can be found by interpolation to a threshold value, it is the maximum of ∂zTP/∂φ that is of interest. While conceptually simple, identifying the latitude of the maximum in ∂zTP/∂φ restricts the values of φi to the horizontal resolution of the given reanalysis and is not numerically robust for longitudes where there is a relatively weak maximum in ∂zTP/∂φ. For these reasons, tropical edge diagnostics based on identification of extrema (max or min) may suffer from poor noise characteristics, leading to larger uncertainties in trend estimates and longer times for climatic signal detection. Identifying a simple maximum also neglects potential movement at latitudes other than the location of the peak. Because of these issues, we calculate the mean area-weighted latitude of the tropopause gradient,
$$\phi_{mean} = \frac{\int \phi\,(\partial z_{TP}/\partial\phi)\cos\phi\,d\phi}{\int (\partial z_{TP}/\partial\phi)\cos\phi\,d\phi}. \tag{2}$$
This “mean” method, which is the first moment of the area-weighted latitudinal distribution of tropopause gradient, is very similar to the method employed by AC08 for jet streams, although they did not use an area weighting in their definition. Finally, we also compute the latitude of maximum ∂zTP/∂φ for comparison with the mean ∂zTP/∂φ diagnostic.
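A minimal sketch of the first-moment calculation on a discrete latitude grid follows. It also illustrates the resolution argument made above: when an idealized gradient profile is moved poleward by less than one grid spacing, the latitude of the maximum does not change but the mean latitude does. The Gaussian profile, the 15°–70° range, and the function names are assumptions made for illustration.

```python
import numpy as np

def mean_latitude(lat_deg, weight):
    """First moment of cos(lat)-weighted `weight` over the supplied latitudes.

    Assumes equally spaced latitudes and a non-negative `weight`
    (e.g., the magnitude of the tropopause gradient).
    """
    lat = np.asarray(lat_deg, dtype=float)
    w = np.asarray(weight, dtype=float) * np.cos(np.radians(lat))
    return np.sum(lat * w) / np.sum(w)

lat = np.linspace(15.0, 70.0, 23)            # 2.5-degree grid

def idealized_gradient(center):
    """Hypothetical bell-shaped gradient magnitude centered at `center`."""
    return np.exp(-((lat - center) / 8.0) ** 2)

grad_a = idealized_gradient(35.0)
grad_b = idealized_gradient(35.5)            # same shape, 0.5 deg poleward

max_shift = lat[np.argmax(grad_b)] - lat[np.argmax(grad_a)]
mean_shift = mean_latitude(lat, grad_b) - mean_latitude(lat, grad_a)
```

Here `max_shift` is zero because both maxima fall on the same grid point, while `mean_shift` recovers most of the imposed 0.5° displacement.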
b. Hadley cell edge diagnostics
The Hadley cell edges are diagnosed from the mean-meridional streamfunction (ψ), precipitation − evaporation (P − E) fields, and OLR, using definitions described in previous work (Hu and Fu 2007; Johanson and Fu 2009; Lu et al. 2007; Previdi and Liepert 2007). Briefly, the mean-meridional streamfunction is defined as
$$\psi(\phi, p) = \frac{2\pi a\cos\phi}{g}\int_{0}^{p}[v]\,dp', \tag{3}$$
where a is the radius of the earth, g is the acceleration due to gravity, and [v] is the zonal-mean meridional wind. The tropical edge using ψ is found by interpolating to the latitude at which ψ changes from positive to negative at 500 hPa (i.e., ψ500 = 0 kg s−1). The tropical edges in ψ are defined in the zonal-mean sense only and are calculated using monthly-mean zonal-mean wind fields.
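The streamfunction calculation and the ψ500 = 0 edge can be sketched as follows for the Northern Hemisphere. This is an illustration under stated assumptions rather than the code used here: the vertical integral uses the trapezoid rule and starts at the top pressure level instead of p = 0, the wind field is an idealized two-cell pattern, and all names and constants are introduced for the example.

```python
import numpy as np

A_EARTH = 6.371e6   # m, mean Earth radius
GRAV = 9.81         # m s-2

def streamfunction(vbar, p_pa, lat_deg):
    """Mass streamfunction (kg/s) from zonal-mean meridional wind.

    vbar has shape (n_pressure, n_lat); p_pa increases downward from the top
    level. Integration starts at the top level rather than p = 0, which
    assumes a negligible mass flux above it.
    """
    dp = np.diff(p_pa)[:, None]
    layers = 0.5 * (vbar[1:] + vbar[:-1]) * dp            # trapezoid rule
    integral = np.vstack(
        [np.zeros((1, vbar.shape[1])), np.cumsum(layers, axis=0)]
    )
    coslat = np.cos(np.radians(lat_deg))[None, :]
    return 2.0 * np.pi * A_EARTH * coslat / GRAV * integral

def zero_crossing(lat, psi):
    """First latitude, scanning poleward, where psi goes from positive to <= 0."""
    for j in range(len(lat) - 1):
        if psi[j] > 0.0 >= psi[j + 1]:
            frac = psi[j] / (psi[j] - psi[j + 1])
            return lat[j] + frac * (lat[j + 1] - lat[j])
    return np.nan

# Idealized NH circulation: poleward flow aloft and equatorward flow below
# between 0 and 30N, with the sense reversed between 30N and 60N.
p = np.linspace(1000.0, 100000.0, 100)                    # Pa
lat = np.arange(1.25, 60.0, 2.5)                          # degrees north
vbar = np.cos(np.pi * p / 1.0e5)[:, None] * np.sin(np.pi * lat / 30.0)[None, :]

psi = streamfunction(vbar, p, lat)
psi500 = psi[np.argmin(np.abs(p - 50000.0))]
hadley_edge = zero_crossing(lat, psi500)
```

With this sign convention the NH Hadley cell has positive ψ, so the edge is the first positive-to-negative crossing when scanning poleward; in the Southern Hemisphere the signs are reversed.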
Similarly, for P − E, the tropical edge is found by interpolating to the latitude where P − E = 0. The rationale for this is that in the subtropics, evaporation exceeds precipitation, and thus the latitude where P − E = 0 represents the poleward edge of the subtropical dry zone in the zonal profile of P − E.
For OLR, the tropical belt has been defined as the location at which OLR drops to a threshold of 250 W m−2 on the poleward side of the subtropical maximum in each hemisphere (Hu and Fu 2007). We consider this definition, but also add a definition similar to our ΔzTP definition to guard against any potential global OLR trends present in the datasets. This definition involves finding the first latitude poleward of the subtropical maximum at which OLR drops to 20 W m−2 below its peak value (ΔOLR = 20 W m−2).
In steady state radiative equilibrium and absent changes in incoming solar irradiance, the global OLR trend will be zero. A global OLR trend, if present in the datasets discussed below, could be caused by these conditions not being met or by artifacts related to changes in satellite instrumentation or sampling. The ΔOLR metric is used to focus on changes in the latitudinal pattern of OLR and guard against potential OLR trends, regardless of their cause.
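A short sketch of the ΔOLR = 20 W m−2 search for one hemisphere is given below. The parabolic OLR profile and the function name are assumptions made for illustration, and the subtropical maximum is simply taken as the largest value in the supplied latitude range.

```python
import numpy as np

def delta_olr_edge(lat, olr, drop=20.0):
    """Latitude poleward of the OLR maximum where OLR falls `drop` below it.

    `lat` must run from the equator toward the pole of one hemisphere.
    """
    k = int(np.argmax(olr))            # subtropical maximum in this range
    target = olr[k] - drop
    for j in range(k, len(lat) - 1):
        if olr[j] >= target > olr[j + 1]:
            frac = (olr[j] - target) / (olr[j] - olr[j + 1])
            return lat[j] + frac * (lat[j + 1] - lat[j])
    return np.nan

lat = np.linspace(0.0, 50.0, 21)                 # 2.5-degree grid
olr = 275.0 - 0.1 * (lat - 20.0) ** 2            # idealized profile (W m-2)

edge = delta_olr_edge(lat, olr)
edge_offset = delta_olr_edge(lat, olr + 3.0)     # uniform 3 W m-2 increase
```

Because the threshold is tied to the peak value, a uniform offset in OLR leaves `edge_offset` equal to `edge`, which is the property the ΔOLR metric is designed to have.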
The OLR datasets used here include the three used by Hu and Fu: International Satellite Cloud Climatology Project (ISCCP; Zhang et al. 2004), High Resolution Infrared Radiation Sounder (HIRS; Lee et al. 2007), and version 2.5 of the Global Energy and Water Experiment surface radiation budget dataset (GEWEX; Stackhouse et al. 2004). As these data do not extend past 2005, we also use the National Oceanic and Atmospheric Administration (NOAA) interpolated OLR dataset from the NOAA polar-orbiting satellites, which is provided continuously from 1979 to present as a monthly mean at 2.5° × 2.5° horizontal resolution (Liebmann and Smith 1996).
c. Wind-based jet metrics
Several different jet metrics based on wind fields are considered here. First, we consider the mean latitude of the mass-weighted wind between 100 and 400 hPa at each longitude (mean u400–100), similar to the definition described by AC08, but with the area weighting as in (2). The mean u400–100 latitude time series presented here are calculated over the same latitudes as the subtropical jet definitions in AC08 (i.e., 15°–40°S and 15°–70°N).
We also consider the mean latitude of the zonal-mean zonal wind at 850 hPa (mean u850) using (2). This metric (or a similar one using surface winds) has been used to diagnose the midlatitude eddy-driven jet location (Lorenz and Deweaver 2007; Lu et al. 2008; Son et al. 2009; Son et al. 2010), in contrast to the u400–100 latitude, which is affected by a combination of the eddy-driven jet and angular momentum–conserving branch of the Hadley circulation.
In addition to the use of the mean metrics for the u400–100 and u850 winds, we also consider the latitude at which the maximum occurs (i.e., max u400–100 and max u850). The max u850 metric is identical to that used by Polvani et al. (2011). In general, the latitudes at which the mean and maximum metrics occur are not necessarily the same because of the asymmetry in the latitudinal profile and the area-weighting used in (2).
d. Tropical width time series analysis
In the next section, an analysis of trends in tropical width time series is presented. We also wish to compare trends calculated from different reanalyses, metrics, hemispheres, and seasons. The methodology for this analysis is described below.
The trends presented below are linear least squares fits to the annual mean tropical width time series, and the 95% confidence intervals are calculated using an adjusted standard error and adjusted degrees-of-freedom based on the lag-1 autocorrelation of the residuals (Santer et al. 2000; Wilks 2006). In general, the yearly time series do not contain significant autocorrelation, so these corrections are relatively small. We refer to trends as being statistically significant when their 95% confidence interval does not include zero.
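The trend calculation can be sketched as follows, in the spirit of the adjusted standard error approach of Santer et al. (2000). The particular lag-1 autocorrelation estimator, the function name, and the synthetic width series are assumptions made for illustration; the sketch does not guard against an effective sample size of 2 or less, which can occur for strongly autocorrelated residuals.

```python
import numpy as np

def trend_with_adjusted_se(years, values):
    """Least squares trend, adjusted standard error, effective sample size.

    The effective sample size is n_eff = n (1 - r1) / (1 + r1), where r1 is
    the lag-1 autocorrelation of the regression residuals.
    """
    t = np.asarray(years, dtype=float)
    y = np.asarray(values, dtype=float)
    n = len(t)
    tc = t - t.mean()
    slope = np.sum(tc * (y - y.mean())) / np.sum(tc ** 2)
    resid = y - y.mean() - slope * tc
    r1 = np.sum(resid[:-1] * resid[1:]) / np.sum(resid ** 2)
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    resid_var = np.sum(resid ** 2) / (n_eff - 2.0)
    slope_se = np.sqrt(resid_var / np.sum(tc ** 2))
    return slope, slope_se, n_eff

# Synthetic annual-mean width (degrees): 1 degree per decade plus variability.
years = np.arange(1979, 2010)
i = np.arange(len(years))
width = 60.0 + 0.1 * i + 0.3 * np.sin(1.7 * i)

slope, slope_se, n_eff = trend_with_adjusted_se(years, width)
trend_per_decade = 10.0 * slope
se_per_decade = 10.0 * slope_se
```

A 95% confidence interval would then be `trend_per_decade` plus or minus the critical t value for `n_eff - 2` degrees of freedom (obtainable, for example, from `scipy.stats.t.ppf(0.975, n_eff - 2)`) times `se_per_decade`.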
We also wish to evaluate whether pairs of trends are statistically different from one another to assess trend differences between different reanalyses, hemispheres, etc. To assess trend differences, we use both the confidence interval method and the difference series method discussed in Santer et al. (2000). In the confidence interval method, a one-tailed, two-sample t test (assuming unequal variance) is performed on the trend difference, using the null hypothesis that the trend difference is zero. Trends are considered different from one another when the p values from this test are <0.05, which is equivalent to rejecting the null hypothesis that the trends are the same (95% confidence level).
It is worth pointing out that, in general, the outcome of the confidence interval test is not the same as a simple visual inspection of whether error bars about the trend overlap one another (Lanzante 2005). If the error bars do not overlap, the difference of the trends is statistically significant, but the inverse is not necessarily true. If the error bars overlap, it is still possible that the difference between the trends is statistically significant.
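The distinction between the two approaches can be made concrete with a small numerical sketch. The test statistic follows the form used in the confidence interval method, the difference of the trends divided by the square root of the sum of their squared standard errors. The trend values, standard errors, and the critical value of 2.0 are hypothetical; in practice the critical value depends on the degrees of freedom and on whether the test is one- or two-tailed.

```python
import math

def trend_difference_statistic(b1, se1, b2, se2):
    """Difference of two trends normalized by their combined standard error."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

def error_bars_overlap(b1, se1, b2, se2, t_crit):
    """True if the intervals b +/- t_crit * se of the two trends overlap."""
    lo1, hi1 = b1 - t_crit * se1, b1 + t_crit * se1
    lo2, hi2 = b2 - t_crit * se2, b2 + t_crit * se2
    return lo1 <= hi2 and lo2 <= hi1

# Hypothetical trends (degrees per decade) and adjusted standard errors.
b1, se1 = 1.0, 0.2
b2, se2 = 0.3, 0.2
t_crit = 2.0   # assumed critical value, for illustration only

overlap = error_bars_overlap(b1, se1, b2, se2, t_crit)
d_stat = trend_difference_statistic(b1, se1, b2, se2)
difference_significant = abs(d_stat) > t_crit
```

In this example the two intervals overlap, yet the normalized difference exceeds the critical value, because the combined standard error is smaller than the sum of the two individual standard errors.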
For comparing trend differences among time series that are nominally of the same quantity, such as the same metric across different reanalyses, we use the difference series method. In this method, the trend (and its statistical significance) in the time series of the difference between the two tropical widths is calculated. By differencing the time series, interannual variability common to both time series is removed, allowing for more sensitive detection of trend differences than in the confidence interval method.
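The benefit of differencing can be seen in a synthetic example in which two width time series share the same interannual variability but have different trends. The series, their amplitudes, and the function name are assumptions made for illustration, and the significance test on the difference trend is omitted for brevity.

```python
import numpy as np

def slope_and_residual_std(t, y):
    """Least squares slope and the standard deviation of the residuals."""
    tc = t - t.mean()
    slope, intercept = np.polyfit(tc, y, 1)
    resid = y - (slope * tc + intercept)
    return slope, resid.std(ddof=2)

years = np.arange(1979, 2010).astype(float)
i = np.arange(len(years))
common = 0.8 * np.sin(2.1 * i)                        # shared variability

width_a = 0.05 * i + common + 0.1 * np.cos(0.9 * i)   # hypothetical dataset A
width_b = 0.02 * i + common + 0.1 * np.sin(1.3 * i)   # hypothetical dataset B

slope_a, std_a = slope_and_residual_std(years, width_a)
slope_b, std_b = slope_and_residual_std(years, width_b)
slope_d, std_d = slope_and_residual_std(years, width_a - width_b)
```

The trend of the difference series equals the difference of the individual trends, but its residual scatter is much smaller than that of either series, so a given trend difference is easier to detect.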
a. Tropical belt time series and trends
In this section, we present annual time series and trends of tropical belt widths using the various metrics and statistical techniques discussed above. Figure 2 shows a plot of the annual mean tropical edge latitude time series in each hemisphere, spanning 1979–2009. Figure 3 is a very similar plot showing the combined global tropical width time series [i.e., NH and SH, combined using (1)]. In each of these plots, the different colors represent different reanalyses or OLR datasets, and each column of plots contains a different category of metric as described in the following section. The top two rows in Figs. 2 and 3 contain time series for the different tropopause height– and OLR-based metrics, and the bottom two rows contain the Hadley cell, P − E, and wind-based metrics.
Similar to the schematic presented in Fig. 1, tropical edges determined from the different metrics span a range of about 20° in latitude. Annual mean edge latitudes determined from tropopause height, OLR, and ψ500 metrics are around 30°–35° in each hemisphere, whereas the peak tropopause gradient and jets occur poleward of 40°. For most diagnostics, the reanalyses have similar annual mean values and exhibit similar interannual variability, suggesting that such variability is real and is not merely noise associated with the tropical edge identification algorithms.
Some of the time series in Figs. 2 and 3 indicate long-term changes in the width of the tropical belt, as identified in previous studies. It is evident that the tropical belt width trends depend on the reanalysis, hemisphere, diagnostic, and time period considered. The tropical belt growth visually apparent in Figs. 2 and 3 is borne out by the trend summary presented in Fig. 4 and Table 3, which show the total (NH + SH) decadal trends and their error bars, along with estimates from previous studies.
The data in Fig. 4 include the four reanalyses that span 1979–2009, starting when satellite data became available. Although discontinuities at the beginning of 1979 in the time series are not obvious (not shown), we restrict our trend analysis to the period from 1979 onward for more direct comparison with previous studies, and also out of caution, as temperature discontinuities due to the introduction of satellite data have been noted (Kistler et al. 2001; Sturaro 2003; Tennant 2004; Trenberth et al. 2001) that could affect the circulation in subtle ways. Absent from Fig. 4 are the ERA-40 (1979–2001) and ERA-Interim (1989–present) reanalyses, which do not cover the full period from 1979 to 2009. Trends over the limited time periods are discussed below in the inter-reanalysis comparison.
b. Comparison of metrics by category
There are often multiple ways of defining a tropical edge or jet latitude from a given quantity of interest, such as winds or tropopause height. In this section, we categorize the methodology of the metrics used here and in previous studies, in the hope of illuminating some of the advantages and disadvantages of the different ways of defining edge latitudes. To this end, we classify each of the metrics presented here into one of five categories: absolute threshold (zTP = 15 km, OLR = 250 W m−2), relative threshold (ΔzTP = 1.5 km, ΔOLR = 20 W m−2), zero-crossing threshold (ψ500 = 0, P − E = 0), mean (or first moment; mean ∂zTP/∂φ, mean u400–100, mean u850), and extrema (max ∂zTP/∂φ, max u400–100, max u850). The first two of these categories can be considered subjective, in that they involve an arbitrarily chosen threshold, whereas the latter three categories are objective, as they involve identifying the latitude at which a physically meaningful quantity occurs.
Several previous studies have used absolute, subjective thresholds in tropopause height and OLR for defining the tropical edges, and these studies have also been the ones that reported the largest tropical belt trends (e.g., Fig. 4). One potential drawback to the use of these types of definitions is that a (hypothetical) uniform trend in the quantity under consideration will show up as a trend in the tropical edge latitude that is independent of any actual poleward shift in the latitudinal pattern. For this reason, we calculated relative threshold metrics (i.e., ΔzTP = 1.5 km, ΔOLR = 20 W m−2) in addition to the previously used absolute threshold metrics (i.e., zTP = 15 km, OLR = 250 W m−2).
As can be seen in Fig. 4, the trends from these relative threshold metrics are all smaller than from their absolute counterparts. To highlight the relationship between trends in zTP–OLR and widening in their absolute threshold metrics, Table 4 lists the global and “tropical” (45°S–45°N) trends in zTP–OLR, and differences between the absolute and relative threshold trends.
For the tropopause height metrics, the differences in tropical widening between the absolute and relative threshold definitions are significant only for NCEP and CFSR, and for these reanalyses the trend differences are 0.5°–0.7° decade−1. NCEP and CFSR also contain statistically significant global trends in tropopause height, as well as significant trends in tropopause height in the 45°S–45°N region. Thus, these tropopause height trends help explain, in part, the difference in trends between the absolute and relative threshold metrics.
Similarly, for OLR, trend differences between absolute and relative threshold metrics are significant in the HIRS and ISCCP datasets, which have 1°–2° decade−1 larger trends in the absolute metrics compared to the relative ones. HIRS and ISCCP are also the two datasets that contain the largest trends in OLR (1–2 W m−2 decade−1), although the other datasets also contain rather uniform, significant increases in OLR in the region around their subtropical maxima. These changes are sufficient to cause a trend in the latitude at which OLR = 250 W m−2, even though the overall pattern is not apparently shifting poleward. The use of a relative threshold for the OLR metrics causes a dramatic change to the trends from Hu and Fu, changing the range of trends from 0.9°–1.8° decade−1 (all significant) to −0.3°–0.8° decade−1 (none significant). These results suggest that the use of subjective, absolute thresholds can be misleading, and should be abandoned in favor of relative threshold or objective metrics where possible.
Another issue explored here is how mean (first moment) metrics compare to extrema metrics. Identifying simple maxima in quantities such as wind speed offers a conceptually simple manner in which to diagnose jet latitudes, but there are several drawbacks. First, under noisy or weak peak conditions, extrema identification can be misleading. Unless some form of interpolation is used to fit points around the peak, extrema metrics are also constrained to occur on a model grid box. For lower-resolution models such as NCEP–NCAR (2.5° × 2.5°), identifying changes of 0.1°–1° decade−1 may be difficult or impossible. One possible way to get around this limitation is to apply the metric at higher temporal resolution (e.g., on 6-hourly output), but this may not be possible in some cases (e.g., when dealing with large climate model output datasets), and at higher temporal resolution increased noise in the signal may partially negate the effects of averaging over a larger sample.
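As an illustration of the kind of interpolation referred to above, a three-point parabolic fit around the gridpoint maximum gives a peak latitude that is not restricted to the grid. This refinement is not part of the metrics used in this study; it is a generic technique, and the function name and the idealized wind profile are assumptions made for the example.

```python
import numpy as np

def refined_peak_latitude(lat, field):
    """Peak latitude from a parabola through the maximum and its neighbors.

    Assumes equally spaced latitudes. Falls back to the gridpoint maximum
    at the ends of the array or if the three points are collinear.
    """
    k = int(np.argmax(field))
    if k == 0 or k == len(lat) - 1:
        return lat[k]
    y0, y1, y2 = field[k - 1], field[k], field[k + 1]
    curvature = y0 - 2.0 * y1 + y2
    if curvature == 0.0:
        return lat[k]
    spacing = lat[k + 1] - lat[k]
    return lat[k] + 0.5 * spacing * (y0 - y2) / curvature

lat = np.linspace(15.0, 70.0, 23)                 # 2.5-degree grid
u = 30.0 - 0.05 * (lat - 35.7) ** 2               # idealized jet profile (m/s)

grid_peak = lat[np.argmax(u)]
refined_peak = refined_peak_latitude(lat, u)
```

For this exactly parabolic profile the fit recovers the true peak latitude, whereas the gridpoint maximum is offset by 0.7°. For noisy or weakly peaked profiles the fit inherits the sensitivity of the maximum itself.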
Also, from an information content perspective, movements in the peak only reveal information about a limited part of a distribution. By using first-moment metrics, which detect a latitudinal shift in the “center of mass” of a distribution over a wide range of latitudes, one is inherently considering information from more latitudes and is less constrained by model horizontal resolution. Overall, it is expected that time series of mean metrics should contain less noise than extrema metrics, and thus be more appropriate for detecting potentially small climate trends on a noisy background. This improvement in detection can be seen clearly by comparing the mean and maximum time series in Figs. 2 and 3. The interannual variability in the mean time series is a factor of 2–4 less than that of the maximum time series, leading to a better ability to detect small trends. As the trends were virtually identical for the mean and maximum metrics, with the only difference being larger error bars on the maximum metrics, we excluded the maximum-metric trend estimates from Fig. 4.
c. Inter-reanalysis comparison
In this section, we assess how well the trends computed from the different observations and reanalyses agree with one another. We first focus on NCEP/NCAR, NCEP CFSR, MERRA, and JRA, which span the period 1979–2009, and then briefly discuss the shorter time periods covered by the two ECMWF reanalyses. For each metric, the trend differences between different datasets can be seen visually in Fig. 4; however, as discussed above, the overlap cannot be relied upon for statistical significance testing. For this reason, the trends in the difference time series for global tropical width metrics are listed in Table 5a. The same information for the NH and SH data is listed in Tables 5b,c.
In many cases, the trends from the different reanalyses are statistically different from one another, most notably in the tropopause and P − E = 0 metrics. For the global tropopause metrics, the NCEP CFSR shows significantly larger trends in the ΔzTP = 1.5 km and mean ∂zTP/∂φ metrics than all other reanalyses except ERA-40. Considering the hemispheric trends separately, the NCEP CFSR shows larger trends in each hemisphere than the other reanalyses, with the differences between CFSR and the other reanalyses greatest in the SH. It is worth noting that we calculated the tropopause heights on model levels for all reanalyses except CFSR. For CFSR we used the tropopause fields provided by NCEP. At the time of writing, the CFSR model-level data were not publicly available, so a discrepancy due to a subtle difference in the CFSR tropopause calculation cannot be ruled out.
In addition to the tropopause metrics, the P − E = 0 metric also shows significant differences in the trend estimates. Globally, the JRA contains a much larger trend in P − E = 0 than the other reanalyses that cover the full time period. This is due to a poleward expansion in the SH that is not in agreement with the other three reanalyses. As can be seen in Fig. 2, the poleward trend is primarily due to a very large step change of ~5° in 1987 in the JRA. This step change coincides with the introduction of precipitable water vapor data from the Special Sensor Microwave Imager (SSM/I) midway through 1987. Introduction of SSM/I data led to a significant improvement in precipitation forecast skill in the model (Onogi et al. 2007), and as such, this discontinuity, and the associated poleward trend in SH P − E = 0 latitude, is very likely spurious.
To evaluate the agreement of the two ECMWF reanalyses with the others, we also consider the time periods 1979–2001 (to include ERA-40) and 1989–2009 (to include ERA-Interim). Over the 1989–2009 time period, the ERA-Interim trends are statistically different from the other reanalyses for some metrics. One notable difference is with the CFSR tropopause metrics in the SH. As with the other reanalyses, ERA-Interim shows less poleward movement in these metrics than CFSR. In addition, the ERA-Interim NH P − E = 0 trends are significantly positive and larger than those of all other reanalyses, although the differences are only significant with ERA-40 and JRA. Interestingly, the ERA-Interim shows significantly more widening than ERA-40 in the NH for the P − E = 0, ΔzTP = 1.5 km, and mean u850 metrics. For ERA-40, the NH tropopause widening is significantly less than in CFSR, but there are not any large systematic differences relative to the other reanalyses.
d. Differences in hemispheric and seasonal trends
It is possible that the signature of different drivers on tropical widening could lead to different hemispheric and/or seasonal trends among the various metrics. Indeed, several studies have suggested that tropopause widening in the SH is larger than in the NH (Birner 2010; Lu et al. 2009; SR07), leading to the interpretation that Antarctic ozone depletion is key (Son et al. 2009, 2010), although hemispheric differences are not present in the Hadley cell metrics (Hu and Fu 2007). To address this issue, Figs. 5 and 6 show trends for each metric by hemisphere and season, respectively.
Consistent with previous studies, the tropopause-based metrics unanimously show more poleward migration in the SH than in the NH for 1979–2009, although (using the difference time series method) the differences are only significant in NCEP–NCAR and NCEP CFSR. In contrast, the OLR metrics all show greater widening trends in the NH than the SH, with only the NCEP and GEWEX data significant at the 5% level (for ΔOLR = 20 W m−2). Other metrics do not exhibit statistically significant hemispheric differences in the trends.
We also wish to assess whether trends vary by season for a given metric, reanalysis system, or hemisphere. The trends in Fig. 6 show a weak seasonality of the trends in some metrics. For example, as pointed out by Hu and Fu (2007), the growth in ψ500 = 0 seems to be predominantly in the respective summer–fall seasons of each hemisphere. Also, as pointed out by Polvani et al. (2011), widening in the SH wind metrics seems to occur predominantly in DJF.
To assess the seasonality quantitatively, for each trend in Fig. 6 we tested whether it differed from the other three seasonal trends for the same hemisphere, reanalysis, and metric. In no cases were the seasonal trends different from one another at the 5% level, although the qualitative observations regarding the ψ500 and wind metrics are borne out at the 15% level.
e. Comparison to previous studies
Here, we briefly compare our results to previous studies of tropical belt growth, whose decadal rates of change are shown in the lower panel of Fig. 4. Our NCEP OLR = 250 W m−2 trends for 1979–2009 are within the range of the three different OLR datasets considered by Hu and Fu (2007), which end in 2003/04. The trends for the Hu and Fu datasets cover the same range as in their paper and are essentially identical to their results. For ψ500, our results for NCEP–NCAR are very close to those from Hu and Fu, even though the time periods are slightly different (NCEP–NCAR data in their analysis ended in 2005). As illustrated in Fig. 4 and previously pointed out by Johanson and Fu (2009), the satellite OLR = 250 W m−2 and reanalysis ψ500 = 0 latitude trends are significantly larger than the same metrics diagnosed from climate model simulations run over a similar time period including natural and anthropogenic forcings.
SR07 reported large trends (2°–3° decade−1) in NCEP–NCAR using their zTP = 15 km metric that are much greater than the values presented here (~0.3° decade−1). To test the reason for the differences, we calculated trends using the same time period (1979–2005) and methodology as in SR07, with a threshold of 200 days yr−1. From this, we get a trend of 0.99° ±0.45° decade−1, which is statistically different from their trend of 1.8° ±0.6° decade−1. However, SR07 used a constant scale height of 7 km for calculating z (W. Randel 2010, personal communication) rather than the geopotential height used here. This explains the difference between our trend and theirs, as our implementation of their method using log–pressure height reproduced their results identically.
By using a 7-km scale height, their definition of a 15-km tropopause height is equivalent to a pressure threshold of 117 hPa, and is close to the value of 120 hPa used by Lu et al. (2009). It turns out that the trends are quite sensitive to the combination of height (pressure) and days-per-year thresholds chosen in SR07, as evidenced by the large difference between the Lu et al. (2009) trends (~0.85° decade−1 using 120 hPa and 200 days yr−1 thresholds) and the SR07 trends (1.8° decade−1 using 117 hPa and 200 days yr−1). The extreme sensitivity of the trends in the SR07 method to threshold choices is addressed comprehensively by Birner (2010), so it is not explored in detail here.
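The equivalence between the 15-km log–pressure height and the 117-hPa pressure threshold can be checked with the standard log–pressure relation z = H ln(p0/p). The sketch below assumes a reference pressure p0 of 1000 hPa, which is not stated in the text; the 7-km scale height is the value attributed to SR07 above.

```python
import math

H_KM = 7.0       # constant scale height attributed to SR07
P0_HPA = 1000.0  # reference pressure (assumed here, not stated in the text)

def log_pressure_height_km(p_hpa):
    """Log-pressure height z = H * ln(p0 / p)."""
    return H_KM * math.log(P0_HPA / p_hpa)

def pressure_hpa(z_km):
    """Inverse relation p = p0 * exp(-z / H)."""
    return P0_HPA * math.exp(-z_km / H_KM)

p_at_15km = pressure_hpa(15.0)              # pressure equivalent of the 15-km threshold
z_at_120hpa = log_pressure_height_km(120.0)  # height equivalent of the 120-hPa threshold
print(f"15 km  -> {p_at_15km:.1f} hPa")
print(f"120 hPa -> {z_at_120hpa:.2f} km")
```

With these values the 15-km threshold corresponds to about 117 hPa, and the 120-hPa threshold to a log–pressure height of roughly 14.8 km, so the two thresholds differ by less than 0.2 km.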
In addition to methodology differences, we also note that end date differences explain a relatively small part of the difference between the SR07 trends and those from our zTP = 15 km definition. For example, using the NCEP–NCAR reanalysis, the zTP = 15 km trend is 0.31° ±0.34° decade−1 for 1979–2009 as opposed to 0.53° ±0.38° decade−1 for the 1979–2005 period used in SR07.
In this paper, we presented updated time series and trends of tropical belt widths calculated from multiple reanalyses and satellite observations and using multiple definitions for the tropical edge latitudes. With the exception of the ψ500 = 0 and absolute threshold metrics (zTP = 15 km, OLR = 250 W m−2), the various datasets yield tropical widening rates that are mostly positive (i.e., poleward), less than 1° decade−1, and not statistically significant. For many of the metrics considered, statistically significant differences were found in the trends from different datasets.
The trends in ψ500 = 0 updated through 2009 indicate a continuing, statistically significant poleward movement of the Hadley cell edge at a rate of ~1° decade−1 or more in all reanalyses except the CFSR, which does not show widening in this metric. Also, except for the CFSR, which has a trend of 0.65° (0.78°) decade−1 in ΔzTP = 1.5 km (mean ∂zTP/∂φ), the reanalyses do not contain statistically significant tropical widening in the tropopause metrics. This is perhaps partly due to a slowdown in growth around 2000, which may be related to previously documented changes in stratospheric circulation and composition (Randel et al. 2006). However, as there are many factors that affect tropopause heights, this is an area where further investigation is warranted.
The most systematic discrepancies between the trends from different reanalyses were found for tropopause diagnostics (with CFSR the outlier) and P − E metrics (with JRA the outlier). Because we calculated the tropopause heights for all reanalyses except CFSR, it is possible that the differences in tropopause widening may be due to a different implementation of the tropopause calculation in CFSR. For the JRA, the large trend in P − E = 0 latitude is almost certainly due to the introduction of SSM/I data in 1987, although the negative trend in MERRA is not obviously due to a discontinuity in the time series that can be associated with a particular event.
Our results are qualitatively consistent with previous studies that have noted hemispheric and seasonal differences in tropical belt growth. The most notable hemispheric differences are in the tropopause diagnostics, which show stronger expansion in the SH, and OLR, which shows stronger expansion in the NH. For the P − E metric, the trends are much more consistent in the NH (0.1°–0.5° decade−1) than in the SH (−0.7°–1.9° decade−1). This is possibly a reflection of the relative dearth of assimilated measurements in the SH that can affect the hydrological cycle, as well as changes in time in the number and quality of such measurements. Finally, tropical belt growth is largest in summer–fall in each hemisphere for the ψ500 and u850 metrics, as recognized by previous studies (Hu and Fu 2007; Polvani et al. 2011), although the seasonal differences are not significant at the 95% level.
The data presented here show that the use of relative threshold definitions for OLR and tropopause-based tropical widths yields trends that are smaller than previous estimates based on absolute thresholds. With the exception of the ψ500 estimates, this leads to a lowering of the range of tropical expansion to be less than 1° decade−1. These results also highlight the importance of abandoning the use of subjective tropical edge definitions based on absolute thresholds. Trend estimates using these definitions are potentially sensitive to arbitrary threshold choices, and, by definition, a uniform global-mean change in the quantity under consideration results in apparent widening.
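The last point can be illustrated with an idealized example. The sketch below uses a hypothetical tanh-shaped tropopause height profile for one hemisphere and compares an absolute threshold edge (zTP = 15 km) with a relative one. The relative definition used here (the latitude at which the tropopause is 1.5 km below the unweighted 0°–15° mean) is an assumption made for illustration and may differ in detail from the ΔzTP = 1.5 km metric defined earlier in the paper.

```python
import numpy as np

lat = np.linspace(0.0, 90.0, 9001)  # one hemisphere, 0.01 degree spacing

def tropopause_height_km(lat_deg, offset_km=0.0):
    """Hypothetical profile: about 16.5 km in the tropics, 9 km near the pole."""
    return 9.0 + 3.75 * (1.0 - np.tanh((lat_deg - 35.0) / 8.0)) + offset_km

def first_crossing(lat_deg, z, threshold):
    """First latitude, moving poleward, at which z drops below threshold."""
    idx = int(np.argmax(z < threshold))
    z0, z1 = z[idx - 1], z[idx]
    frac = (z0 - threshold) / (z0 - z1)  # linear interpolation between grid points
    return lat_deg[idx - 1] + frac * (lat_deg[idx] - lat_deg[idx - 1])

def absolute_edge(lat_deg, z, z_abs=15.0):
    return first_crossing(lat_deg, z, z_abs)

def relative_edge(lat_deg, z, dz=1.5):
    tropical_mean = z[lat_deg <= 15.0].mean()  # unweighted mean, a simplification
    return first_crossing(lat_deg, z, tropical_mean - dz)

z_before = tropopause_height_km(lat)
z_after = tropopause_height_km(lat, offset_km=0.2)  # uniform 0.2 km rise everywhere

abs_shift = absolute_edge(lat, z_after) - absolute_edge(lat, z_before)
rel_shift = relative_edge(lat, z_after) - relative_edge(lat, z_before)
print(f"absolute-threshold edge shift: {abs_shift:.2f} deg")
print(f"relative-threshold edge shift: {rel_shift:.2e} deg")
```

In this idealized case the shape of the profile is unchanged, yet the absolute-threshold edge moves poleward by more than half a degree, while the relative-threshold edge stays where it was to within rounding error.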
Ultimately, the differences in growth rates among the tropical belt diagnostics are a manifestation of the differing physics represented by the various diagnostics. As such, future work comparing tropical belt growth rates from multiple diagnostics between models and observations–reanalyses may help provide important clues as to how well represented different physical processes are in the models, and what processes are of primary importance for driving observed and future changes in the tropical width.
We wish to thank Dian Seidel, Thomas Birner, Qiang Fu, Lorenzo Polvani, Eric Ray, Paul Young, Susan Solomon, Celeste Johanson, and Bill Randel for helpful discussions and reviews. NCEP–NCAR reanalysis data were provided by NOAA/OAR/ESRL PSD, Boulder, Colorado, from their Web site at http://www.cdc.noaa.gov/. ERA-40 and ERA-Interim data were provided by the Data Support Section of the Computational and Information Systems Laboratory at the National Center for Atmospheric Research in Boulder, Colorado (http://dss.ucar.edu). The JRA data used in this study are from the JRA-25 long-term reanalysis cooperative research project carried out by the Japan Meteorological Agency (JMA) and the Central Research Institute of Electric Power Industry (CRIEPI). The CFSR data were developed by NOAA’s National Centers for Environmental Prediction (NCEP). The data for this study are from NOAA’s National Operational Model Archive and Distribution System (NOMADS), which is maintained at NOAA’s National Climatic Data Center (NCDC). We also acknowledge the Global Modeling and Assimilation Office (GMAO) and the GES DISC for the dissemination of MERRA data.