The historical global “best track” records of tropical cyclones extend back to the mid-nineteenth century in some regions, but formal analysis of these records is encumbered by temporal heterogeneities in the data. This is particularly problematic when attempting to detect trends in tropical cyclone metrics that may be attributable to climate change. Here the authors apply a state-of-the-art automated algorithm to a globally homogenized satellite data record to create a more temporally consistent record of tropical cyclone intensity within the period 1982–2009, and utilize this record to investigate the robustness of trends found in the best-track data. In particular, the lifetime maximum intensity (LMI) achieved by each reported storm is calculated and the frequency distribution of LMI is tested for changes over this period.
To address the unique issues in regions around the Indian Ocean, which result from a discontinuity introduced into the satellite data in 1998, a direct homogenization procedure is applied in which post-1998 data are degraded to pre-1998 standards. This additional homogenization step is found to measurably reduce LMI trends, but the global trends in the LMI of the strongest storms remain positive, with amplitudes of around +1 m s−1 decade−1 and p value = 0.1. Regional trends, in m s−1 decade−1, vary from −2 (p = 0.03) in the western North Pacific, +1.7 (p = 0.06) in the south Indian Ocean, +2.5 (p = 0.09) in the South Pacific, to +8 (p < 0.001) in the North Atlantic.
Data describing past tropical cyclone frequency, track, and intensity have been collected using methods of paleotempestology (e.g., Frappier et al. 2007) and through investigation of historical written records (e.g., Chenoweth and Divine 2008), but the majority of data collected globally over the past 150 years constitute what is known as the “best track” (Knapp et al. 2010). These data were, and are, generally collected in an operational forecasting setting and represent the best technology and analysis protocols of the time (e.g., Hagen et al. 2012). Because the technology and analysis protocols have progressively changed over time, the data naturally contain temporal heterogeneities. For example, prior to the meteorological satellite era, which began in the 1960s, tropical cyclones that never approached land or encountered a ship had a greater chance of nondetection, and consequently the record of storm frequency may contain unphysical upward trends because more storms have been detected during the satellite era (e.g., Vecchi and Knutson 2011).
As with frequency, estimates of tropical cyclone intensity in the best track are also heterogeneous (Kossin et al. 2007; Hagen and Landsea 2012). For example, regular in situ intensity measurements from aircraft reconnaissance into tropical cyclones began around 1948 but were terminated in the western North Pacific in 1987, which introduced a discontinuity into intensity data in the regional best track (Martin and Gray 1993). Similarly, methods of intensity estimation using satellite data evolved considerably during the 1970s and 1980s owing to the introduction of, and subsequent improvements to, the Dvorak technique (Dvorak 1973, 1984). This technique will be discussed in greater depth in section 2.
In addition to heterogeneity in the best-track intensities from each tropical cyclone region, there are interregion heterogeneities that can become problematic when combining regional best tracks for global analyses.1 For example, some regions report intensity as sustained 10-min winds while others report these as 1-min sustained winds, and converting between the two retrospectively is not straightforward (Harper et al. 2008). Additionally, some regions benefit from regular aircraft reconnaissance, while others do not. As noted above, reconnaissance data were utilized in constructing the western North Pacific best track from 1948 to 1987, while the North Atlantic best track has benefited from regular aircraft data from 1948 to present. Aircraft measurements of near-surface winds are often indirectly derived from measurements of minimum sea level pressure, but the methods for performing the pressure-to-wind conversion are not consistently applied in the best track (e.g., Knaff and Zehr 2007). Sporadic increases in the amount and quality of available data, including aircraft data, can occur during field experiments in any ocean basin. Different regions are also viewed from different geostationary satellites, and these satellites can differ in the spatial and temporal resolution of their instruments and in their position relative to the regions of high tropical cyclone activity. Satellite instrument resolution changes have also occurred at differing times in each region as technology has improved. All of these changes can affect the quality of tropical cyclone intensity estimates produced from application of the Dvorak technique using geostationary satellite data (e.g., Velden et al. 2006). These estimates make up the primary input into best-track data in regions without active aircraft reconnaissance.
The data issues briefly described here (which by no means comprise a complete list) introduce uncertainty and reduce confidence in analyses of the data and thus complicate the detection of tropical-cyclone-related trends attributed to climate change (Knutson et al. 2010; Seneviratne et al. 2012; Lee et al. 2012). There is an expectation that global warming forced by CO2 will increase the mean state of tropical cyclone potential intensity in the tropics (Emanuel 1987; Henderson-Sellers et al. 1998; Bister and Emanuel 2002), and consequently the relative frequency distribution of tropical cyclone intensity is expected to shift toward greater intensities (Emanuel 2000; Wing et al. 2007; Elsner et al. 2008).2 But the uncertainty in the best-track data has led to a situation where any trends identified in the best-track record of intensity, whether global or regional, are suspect and debated (e.g., Webster et al. 2005; Landsea et al. 2006). When comparing CO2 warming scenarios with nonwarming control simulations, numerical and statistical models generally project increases of mean tropical cyclone intensity (Knutson et al. 2010). But the associated emergence time scales, which describe the time needed for the signal of a trend to become identifiable above the noise of natural variability at some prespecified level of confidence (usually 90%–95%), can be long [multidecadal or more, e.g., Knutson and Tuleya (2004)]. If this is indeed the case, then the best-track data should, at a bare minimum, be temporally homogeneous on a similar time scale in order for greenhouse-gas-induced trends to be detectable with an acceptable level of confidence. On the other hand, there is mounting evidence that regional aerosol forcing, both natural and anthropogenic, can introduce detectable changes on shorter time scales than the more globally uniform forcing from well-mixed greenhouse gas (e.g., Baines and Folland 2007; Mann and Emanuel 2006; Evan et al. 2009; Booth et al. 2012; Evan et al. 2012; Villarini and Vecchi 2013; Dunstone et al. 2013). In this case, there is the potential for detectable trends that are attributable to anthropogenic forcing on shorter (multidecadal or less) time scales but are more region specific.
Here we attempt to mitigate a number of the data issues summarized above by applying a state-of-the-art objective intensity estimation model to a homogenized global record of satellite data and then perform trend analyses with these new data, globally and within individual regions. In section 2, we will describe the data and the algorithm used to estimate intensity, and our method for removing a known discontinuity in the satellite data in the region of the Indian Ocean. In section 3, we will show and discuss the results of the global and regional trend analyses and perform a heuristic exercise in which observed global trends in tropical cyclone potential intensity are used to create synthetic time series of tropical cyclone lifetime maximum intensity (LMI) that are then subjected to trend analysis. The purpose of this exercise is to better establish what global trends in tropical cyclone intensity might be theoretically expected given the observed changes in tropical climate. We make no attempt to formally establish attribution for any observed changes found in the new data, but in section 4 we will discuss potential connections to both the expected long-term trends caused by increasing CO2 emissions and the shorter-term variability expected to be associated with regional aerosol forcing.
2. Data and method
The global best-track data comprise 6-hourly estimates of the location (latitude and longitude) and intensity (usually measured as a wind speed) of every recorded tropical cyclone (e.g., Knapp et al. 2010) during varying periods of record in each region. As mentioned in the previous section, a substantial proportion of intensity estimates in the global best-track data are strongly influenced by estimates provided operationally by the Dvorak technique (Dvorak 1973, 1984). A thorough review of this technique and its strengths and weaknesses is found in Velden et al. (2006) and Knaff et al. (2010). In the hands of a trained specialist, the Dvorak enhanced infrared (EIR) technique (Dvorak 1984) is widely considered the “gold standard” for estimating tropical cyclone intensity from infrared satellite imagery. However, the technique relies in part on subjective decision making and is not always consistently applied from one specialist to the next, or one forecast office to the next, or one tropical cyclone season to the next. To remove the subjective decision making from the technique, the advanced Dvorak technique (ADT) was introduced (Velden et al. 1998), and this is the algorithm that we will apply here. A thorough description of the ADT and its error characteristics is found in Olander and Velden (2007). Briefly, the ADT operates in a similar way to the original Dvorak EIR technique, using infrared satellite imagery to identify a “scene type” (such as “curved band” or “eye” scenes) and then applying different statistical/empirical-based models according to scene type. For example, if an eye scene is identified, then a model is applied that relates storm intensity to the infrared brightness temperature of the eye and the cold cloud tops of the surrounding eye wall. If a curved-band scene is identified, then a model is applied that fits a log spiral to the bands and relates intensity to this fit. All scene identification and intensity estimation is fully automated in the ADT.
The Dvorak EIR technique and the ADT can be applied to infrared data from polar orbiting or geostationary satellites. However, to achieve regular temporal consistency, geostationary data are almost exclusively used. As noted in the previous section, there have been a number of changes in geostationary satellite instrumentation and their orbital positions since their introduction as weather satellites in the 1970s. To create a more homogeneous record of tropical cyclone intensity using the ADT, a homogenized satellite data record is needed. Here we use the Hurricane Satellite (HURSAT)-B1 data (Knapp and Kossin 2007; Kossin et al. 2007; Knapp et al. 2011), which is a global conglomeration of available geostationary satellite imagery since the late 1970s. The data are parsed from International Satellite Cloud Climatology Project (ISCCP)-B1 data (Knapp and Kossin 2007; Knapp 2008a,b) and are centered on the storms recorded in the global best track. Figure 1 shows the evolution of the available satellite data in the ISCCP-B1 record.
The HURSAT-B1 data (hereafter referred to simply as HURSAT data) have been reprocessed and recalibrated, and later data have been subsampled both spatially and temporally to be homogeneous with earlier data (~8 km spatial and 3 h temporal resolution). Although the HURSAT record begins in 1978, there are missing storms in the first few years of the record owing to limited geostationary data availability (Fig. 2), and our period of analysis is constrained to the 28-yr period 1982–2009 when imagery from all regions/basins became routinely available. Within this period, there can still be occasional data gaps within individual storms that exceed the normal 3-hourly resolution of the HURSAT data (e.g., during satellite eclipse periods), but the frequency and duration of these gaps exhibit no trends in time (not shown) and should not substantially affect the trend analyses in the next section.
Previous studies by Kossin et al. (2007) and Elsner et al. (2008) formed intensity estimates from the HURSAT record using standard linear regression models. These models were relatively simple to implement and allowed for temporal consistency, but with the understanding that there would often be large errors in individual estimates. In other words, accuracy was sacrificed for consistency and model simplicity. Another issue with these previous estimates is introduced by a known discontinuity in the quality of satellite data in the region of the north and south Indian Oceans in 1998. Prior to 1998, satellite views of Indian Ocean tropical cyclones were highly oblique. In 1998 a satellite was repositioned over the Indian Ocean, and the satellite view angle, hence the quality of the imagery, was greatly improved (Fig. 1, see also Knapp et al. 2011). This discontinuity was introduced as a caveat but not addressed in Kossin et al. (2007). In Elsner et al. (2008), the discontinuity was addressed retroactively by removing a constant mean bias in the pre-1998 satellite-based predictors of the regression model (see their supplementary information). Here we mitigate this discontinuity proactively by disallowing all imagery from the post-1998 satellites that are positioned over the Indian Ocean. Specifically, we removed all Meteosat-5 data after 1 July 1998, all Meteosat-7 data after 17 July 2006, and all Feng Yun (FY)-2 data from the HURSAT record. This essentially degrades the post-1998 data to pre-1998 standards, removing the discontinuity in a direct proactive manner rather an indirect retroactive manner. To quantify the effect that this additional homogenization step has on the trend analyses in the next section, we will repeat many of our analyses without this step so that the trends can be compared.
As mentioned earlier, the satellite imagery in the HURSAT record relies on prior knowledge of the tracks of the storms, and this information is taken from the global best track. Thus, there is no new information to be provided here regarding potential heterogeneity in the record of storm frequency, and we will concentrate solely on providing a more homogeneous record of storm intensity, which will be treat as an intensive (bulk) property independent of frequency. For each 3-hourly image for each storm in the HURSAT record, we apply the fully automated ADT to estimate intensity in terms of wind speed. This record will be referred to as the ADT-HURSAT record. Then for each storm in the ADT-HURSAT record, we identify the lifetime maximum intensity achieved. We also calculate the LMI from the best track. Some storms have multiple best-track entries because there is overlap among the agencies that contribute to the global best track. When this occurs, we simply choose the greatest of the reported LMI values.
The global frequency distributions of LMI calculated from the best track and the new ADT-HURSAT estimates are shown in Fig. 3. One feature that is immediately apparent is a high degree of kurtosis in the ADT-HURSAT LMI distribution, with a peak at 55–65 kt (1 kt = 0.514 ms−1) that has no obvious counterpart in the best-track LMI distribution. This is attributed to a known issue with the ADT (Olander and Velden 2012): as an incipient tropical cyclone develops and intensifies, a “cirrus shield” is typically maintained over the central convective region of the system. This shield is opaque in infrared satellite imagery, which precludes the ADT scene-typing algorithm from accurately identifying increases in convective organization that are associated with an intensifying storm. As a storm intensifies toward Category-1 hurricane strength (i.e., 65 kt on the Saffir–Simpson scale3), an eye typically begins to develop, but this often takes place gradually while obscured under the cirrus shield. In this case, the ADT can “get stuck” at a steady intensity just below hurricane strength. If the eye that has been forming below the cirrus shield becomes apparent in the infrared imagery, it often occurs suddenly as the cirrus above the eye quickly dissipates via local subsidence. The ADT will interpret this as a rapid change in the scene type and will rapidly intensify the storm. However, if an eye never appears in the infrared imagery, then the LMI is recorded below hurricane strength, which is often erroneously weak. This process is likely a major contributor to the artificially high kurtosis seen in Fig. 3, although there may be other contributing factors that have not been identified yet.
A new experimental version of the ADT has addressed this issue by incorporating passive microwave satellite imagery into the algorithm (Olander and Velden 2012). Passive microwave sensors can “see through” the cirrus shield and provide information related to convective organization, allowing the ADT to provide a more uniform evolution of intensity prior to eye formation in hurricanes. However, for our purposes, reliance on microwave imagery introduces severe limitations on the homogenization process as well as substantially reducing the time span of our available data. To mitigate the spurious kurtosis in our ADT-HURSAT LMI sample, here we will only consider storms that reached or exceeded 65-kt intensity in their lifetimes. This choice provides additional benefit in that the Dvorak-based satellite methods, including the ADT, are substantially more accurate in developed hurricanes than developing tropical storms (Olander and Velden 2007). For our period 1982–2009, 1105 of the 2513 recorded storms reached hurricane strength in the ADT-HURSAT and, since we are primarily interested in identifying changes in the most intense storms, constraining our analyses to this subsample should not introduce any significant interpretive limitations.
Before proceeding with formal trend analyses of the new ADT-HURSAT record, an additional homogeneity test is provided here. As discussed above, the Dvorak EIR type of intensity estimation methods, such as the ADT, rely on statistical/empirical models that relate intensity to features in infrared satellite imagery. For example, when an “eye scene” is identified, warmer infrared brightness temperatures in the eye region and colder temperatures above the convective eye wall region both correlate statistically to greater wind speeds. But there is no guarantee that these relationships are stationary under a changing climate. If these relationships are not stationary, then there is no guarantee that an identified trend in the ADT-HURSAT estimates is capturing a physical trend in intensity. To test this, we use best-track data that are within 3 h of a low-level aircraft reconnaissance “fix” as ground truth and calculate the error in the ADT-HURSAT data for each year (Fig. 4). We find no trends in the central tendency of the errors. A reduction in the interquartile range is evident in Fig. 4 (blue lines), but these trends are not significant, nor are the trends at the higher and lower quantiles (red lines). We also analyzed errors in operational intensity estimates provided by both the NOAA Tropical Analysis and Forecast Branch (TAFB) and the Satellite Analysis Branch (SAB) and found no significant trends (not shown). Thus, within the limitations of the available data, there is no evidence to suggest nonstationarity in the statistical relationships that either the ADT or the Dvorak EIR technique is based on.
3. Trend analysis
Following Elsner et al. (2008), we will apply the method of quantile regression (Koenker and Bassett 1978; Koenker and Hallock 2001; Koenker 2005; Jagger and Elsner 2009; Koenker 2011) to the LMI samples from the best-track and ADT-HURSAT records to estimate the median and other quantiles of the response variable LMI conditional on the covariate time (given by year). In our application, quantile regression provides information on how the frequency distribution of hurricane lifetime maximum intensity is changing over our time period of analysis. The analysis will be applied to the global data and then to the data from each region. In our 28-yr time period of analysis, there are only 39 storms that reached hurricane intensity (LMI ≥ 65 kt) in the north Indian Ocean best track, and 38 in the ADT-HURSAT data in that region. We found that this was insufficient for regional trend analysis, and here we will necessarily limit our discussion to trends in the remaining regions.
Before introducing the more rigorous statistical analysis provided by quantile regression, it is useful to consider simplified time series that can provide a visual sense of the 28-yr trend signals in the ADT-HURSAT LMI values relative to their interannual variability. The remainder of this section will then be devoted to a more formal quantification of the amplitudes and levels of confidence of the trends. For each region, we calculated the median and higher quantiles of LMI from each year in our period of data. The LMI values are from the ADT-HURSAT with the additional homogenization correction to account for the discontinuity in the HURSAT data over the Indian Ocean region. The results are shown in Fig. 5. Increasing, but weak, trends are found in the global ADT-HURSAT data, indicating a subtle shift of LMI toward stronger storms. The different trend amplitudes seen at the different quantiles indicate that the shift is not uniform across the frequency distribution, and the trend is greatest at around the 0.6 quantile of the distribution (~60 m s−1) and vanishes at the 0.9 quantile (~73 m s−1). In the North Atlantic, very strong positive trends are found, while negative trends are found in the LMI from the eastern Pacific region. No clear positive trends are seen in the western Pacific, although there is a weak negative trend at the 0.9 quantile. Contrarily, the LMI from both the South Pacific and south Indian Ocean exhibit positive trends at most quantiles. As noted, these simple time series are based on a reduction of each year’s data to single values representing the various quantiles for that year, and the trend line is based on an ordinary least squares fit to these 28 single values. This is useful for providing a visual sense of the trends and variability, but the analyses in the following sections using quantile regression represents a more rigorous treatment of the trends and their levels of confidence.
a. Global trends
Global trends deduced using quantile regression are shown in Fig. 6. In the best track, the trend in the mean LMI is about +2 m s−1 decade−1 (solid red line in Fig. 6) and is statistically significant (dashed red lines). The mean LMI trends in the ADT-HURSAT, both with and without the additional homogenization correction, are substantially smaller and are not significant. At higher quantiles, however, the trends increase in both the best-track and ADT-HURSAT data, in agreement with Elsner et al. (2008). Maximum trends in the best track are ~+3.3 [±0.7 standard error (se)] m s−1 decade−1. Trends in the ADT-HURSAT with the additional homogenization correction are almost exclusively positive above the 0.4 quantile of LMI (52 m s−1), and achieve a maximum of ~+1.1 (±0.6 se) m s−1 decade−1 near the median LMI (57 m s−1). The trends in ADT-HURSAT LMI shown in Fig. 6 are not significant at the 95% (two sided) confidence level, except near the median LMI, where they are marginally significant. In comparison, the analysis of the regression-based intensity estimates of Elsner et al. (2008) identified significant global trends as large as 3.0 (±0.9 se) m s−1 decade−1, but those trends were based on the changes in quantiles calculated from a different LMI frequency distribution that included tropical storms (with a lower bound of 17 m s−1), whereas here we are only considering hurricanes (with a lower bound of 33 m s−1). The behavior of the part of the sample with LMI between 17 and 33 m s−1 affects how the higher quantiles change, making direct comparisons difficult. As a side note for comparison, the median LMI in our global sample of hurricanes is roughly equal to the 0.9 quantile of LMI in the Elsner et al. (2008) study that included storms weaker than hurricane strength.
The intensity estimates provided by the Dvorak-based models are not continuous (they provide values from a discretized set of “current intensity numbers” that are then converted to a wind speed), and the best-track intensity data are generally provided in 5-kt increments. A common procedure before analyzing such discretized data is the addition of random noise to the data points. In particular, when considering quantiles, the random noise serves as a “tie breaker” when multiple discrete points have the same value. In Elsner et al. (2008), random noise sampled from a uniform distribution on the interval ±5 kt (±2.6 m s−1) was added to the intensity estimates prior to applying quantile regression. Here we document how this noise can affect the trend and confidence bounds in the quantile regression. In our case, where the significance of the global trends in the homogenized ADT-HURSAT data is marginal, this becomes especially important for accurate interpretation.
We considered quantiles above the median LMI from the global ADT-HURSAT record, both with and without the additional homogenization step, and added random noise sampled from a uniform distribution on the interval ±1.0 m s−1. Figure 7a shows the trend coefficients from quantile regression applied to the ADT-HURSAT LMI data with the additional homogenization step, repeated 1000 times with different random noise added each time. The sensitivity of the trend and its confidence level to the addition of random noise is evident in Fig. 7a. The trend ranges from +1.6 to +0.6 m s−1 decade−1, with a mean and standard deviation of +1 and 0.15 m s−1 decade−1, respectively. The associated p values range from 0.014 to 0.219, with a mean of 0.096.
To better identify the effect of the final homogenization step that we applied to remove the discontinuity in the satellite data in the region of the Indian Ocean, Fig. 7b repeats the exercise above with the uncorrected ADT-HURSAT LMI data. The discontinuity in the satellite data is seen to measurably inflate the trends and their level of statistical significance, which suggests that the ADT applied at more oblique satellite view angles is more likely to underestimate intensity in hurricanes. This can be caused by the resulting differences in the measured infrared brightness temperatures that the ADT uses as input or by a reduced ability of the ADT scene-typing algorithm to accurately recognize the presence of a warm eye. This will be explored further in section 3c.
b. Comparison of observed LMI changes to theoretical expectations based on observed climate trends and variability: A heuristic exercise
When interpreting the global trend analysis above, it is instructive to consider the changes in the LMI frequency distribution that might be theoretically expected given an increasing trend in potential intensity (PI) in the tropics. We explore this here as an idealized heuristic exercise: we assume that the mean tropical cyclone–season PI in the tropics is described by an autoregressive AR(1) process with autocorrelation coefficient a, annual mean μ, standard deviation σ, and an increasing trend δ calculated over N years. Mean annual global hurricane frequency in the best track since 1982 is about 49 per year with a standard deviation of about 7 storms, consistent with the expectations of a Poisson process, and we assume that the global number of hurricanes per year is described by a Poisson process with rate parameter λ. Then we assume that any storm that has reached hurricane intensity has a uniform probability of achieving any LMI between hurricane intensity and its potential intensity (Emanuel 2000). This allows us to form synthetic time series of annual mean PI and annual samples of LMI (described more formally in the appendix).
The parameters used to create the synthetic time series are based on observed annual mean PI in the Modern-Era Retrospective Analysis for Research and Applications (MERRA) (Rienecker et al. 2011), spanning the 32-yr period 1979–2010 (Fig. 8). The Northern Hemisphere values are based on August–October (ASO) means, and the Southern Hemisphere values are based on January–March (JFM) means. As estimated from the global tropical values shown in Fig. 8, we use a = 0.4, μ = 75 m s−1, σ = 3 m s−1, and δ = 1.5 m s−1 decade−1. As noted above, the annual rate of global hurricanes in the global best track record is λ = 49. With these values fixed, we can form time series of any length N (see the appendix) and test them for the presence of significant trends.
We formed 1000 time series of PI based on Eqs. (A1) and (A2) and then used these to calculate 1000 time series of LMI based on Eq. (A3). We then tested each time series for linear trends. The PI time series were tested for trends in the mean using ordinary least squares regression and the LMI time series were tested using quantile regression applied to the 0.5 and 0.9 quantiles. We choose N = 28 yr, which is the length of the time series in the ADT-HURSAT record analyzed above.
Figure 9 shows the results of the experiment. For the mean PI time series, the trend coefficients are distributed normally about 1.5 m s−1 decade−1, as expected, and range from about −1.8 to +4.8 m s−1 decade−1. About half of the PI time series do not show positive trends that are significant at the 95% confidence level. For the median of the LMI time series, the trend coefficients are distributed normally about 0.7 m s−1 decade−1 and range from about −2.4 to +3.3 m s−1 decade−1. Roughly 80% of the median LMI time series do not show significant positive trends at the 95% confidence level. For the 0.9 quantile of LMI time series, the trend coefficients are distributed normally about ~1.3 m s−1 decade−1 and range from about −2.1 to +5.3 m s−1 decade−1. About 40% of the time series do not show significant positive trends at the 95% confidence level. That is, based on the observed global seasonal variability and trend in PI over the past 32 years, and assuming that PI and LMI behave approximately according to Eqs. (A1)–(A3), there is about 60% probability of the emergence of a detectable trend at the 0.9 quantile of LMI in a 28-yr time series. In particular, the observed positive trends of ~1 m s−1 decade−1 and uncertainty bounds calculated from the ADT-HURSAT record with the additional homogenization step (Fig. 7a) could be quite plausibly argued to have come from any of a large number of these synthetic time series in which the trend signal has not yet emerged above the observed natural variability. There are a number of highly simplifying assumptions being made in this exercise, but it illustrates the present limitations of homogenized tropical cyclone intensity data constrained to the modern satellite era. As the HURSAT record is continually appended in the future, it is expected to be of increasing utility, but at present the time series analyses presented here should be considered in this context.
c. Regional trends
When narrowing the focus of our analysis from global to regional scales, there is generally an expectation of greater uncertainty in the quantification of natural variability, which makes formal detection of trends more difficult (e.g., Knutson et al. 2010). A further cautionary note was provided by Callaghan and Power (2010), who showed that a collection of individual 30-yr trends in severe tropical cyclone landfall frequency in eastern Australia parsed from the period 1872–2010 exhibited no consistency in either amplitude or sign (their), although they found a decreasing trend over the entire period (their). With these tempering points in mind, here we will document the regional trends in the hurricane LMI frequency distributions from the North Atlantic, eastern North Pacific, western North Pacific, South Pacific, and south Indian Oceans as calculated from the ADT-HURSAT data. The results are summarized in Table 1. Trends in the North Atlantic, eastern and western North Pacific, and South Pacific Oceans are shown in Fig. 10 and described below. Trends in the south Indian Ocean are then discussed separately.
1) North Atlantic Ocean
As expected from previous studies (e.g., Kossin et al. 2007; Elsner et al. 2008), the trends in the homogenized North Atlantic data are positive and significant, exhibiting large amplitudes (Fig. 10). The mean LMI (shown by the constant-value red line) is increasing at a statistically significant rate of about +4 m s−1 decade−1, while trends as large as +8 m s−1 decade−1 are found at LMI quantiles above 50 m s−1. In comparison, the North Atlantic best-track data show similar, but somewhat weaker, trends on average. The smaller (but still positive and significant) trends at the uppermost quantiles of the ADT-HURSAT LMI gives an indication of how the shape of the North Atlantic LMI frequency distribution has been changing. In this case there is not a uniform shift toward stronger hurricanes, and the quantiles near the median hurricane LMI (~50 m s−1) have been shifting most rapidly.
2) Eastern and western North Pacific and South Pacific Oceans
In the eastern North Pacific best-track data, there is no trend found in the mean hurricane LMI, but significant positive trends are found at the very highest quantiles (Fig. 10). However, this is not supported by the ADT-HURSAT data, in which there are no significant trends found at any of the quantiles. In the western North Pacific best-track data, significant positive trends are found in the mean LMI and in a range of quantiles, but again this is not well supported by the ADT-HURSAT data and, in fact, the highest quantiles of the ADT-HURSAT LMI exhibit marginally significant negative trends. In the South Pacific, the trends in the best track are similar to those found in the North Atlantic best track. The mean trend is positive and significant, and trends at the higher quantiles as large as +7 m s−1 decade−1 are found. In the ADT-HURSAT data, the trends are substantially smaller in amplitude but remain positive and marginally significant at the highest quantiles. As with the marginally significant global trends in the ADT-HURSAT shown in Fig. 6, we tested the trends in the western North Pacific and South Pacific ADT-HURSAT data for robustness when subjected to the repeated addition of random noise to the LMI values (as described in section 2a above). When we do this, we find that the mean trend at the highest quantiles in the western North Pacific is −2 m s−1 decade−1 with a mean p value of 0.03, and in the South Pacific +2.5 m s−1 decade−1 with a mean p value of 0.09 (Table 1).
3) Indian Ocean
As discussed in section 2, the Indian Ocean region poses a unique challenge in our analysis owing to the introduction of the Meteosat-5 geostationary satellite into the region in 1998 (Fig. 1). Prior to this, geostationary satellite views of tropical cyclones in the north and south Indian Oceans were often highly oblique, affecting the estimates provided by satellite-based intensity models such as the Dvorak technique and ADT, or the regression-based models of Kossin et al. (2007) and Elsner et al. (2008). It was shown in section 3a and Fig. 7 that the discontinuity introduced into the satellite data can act to inflate trends and associated confidence levels in the uncorrected ADT-HURSAT LMI estimates. Here we will look more carefully at the Indian Ocean and analyze trends both with and without the additional homogenization correction. As noted earlier, there is insufficient data in the north Indian Ocean for trend analysis, and here we limit our discussion to trends in the south Indian Ocean.
The trends in the south Indian Ocean are shown for ADT-HURDAT data, both with and without the additional homogenization correction, and for the best track in the region (Fig. 11). In the best track, the trend in the mean LMI is about +2.5 m s−1 decade−1 (solid red line in Fig. 11) and is statistically significant (dashed red lines). The mean LMI trend in the ADT-HURSAT without the additional homogenization correction is roughly the same as found in the best track. In the ADT-HURSAT with the additional homogenization correction, the amplitude of the mean trend remains positive but smaller, and it is not significant. At higher quantiles, the trends are generally higher in the best track but are consistently positive in the ADT-HURSAT (with and without the correction). The trends at the uppermost quantiles in the ADT-HURSAT with the additional homogenization correction are marginally significant, so here again we repeat the analyses performed for the global ADT-HURSAT to test the trends for robustness. The analysis shown earlier in Fig. 7 is repeated and shown in Fig. 12. Similar to what we found in the global data, the trends in the south Indian Ocean are reduced in amplitude and associated confidence levels when the final correction is made to account for the discontinuity in the regional satellite data. The mean trend for the uppermost quantiles in the ADT-HURSAT LMI with the additional homogenization correction is +1.7 m s−1 decade−1 with a mean associated p value of 0.06.
Some specific examples of how the additional homogenization correction can affect the ADT-HURSAT intensity estimates in Indian Ocean cyclones are shown in Fig. 13. Cyclone Gonu (2007) is an example of how the more oblique satellite view angles associated with the HURSAT data with the additional homogenization correction can impede the ability of the ADT to accurately identify the development of an eye scene. In the uncorrected data, which is based on satellite imagery with a view angle of 20°–30° from nadir, an eye scene was identified earlier than the homogenized data, which was based on much more oblique angles of 70°–80° from nadir. This resulted in an earlier rapid intensification stage, which ultimately resulted in a greater LMI. This is, arguably, what would be expected when comparing the behavior of a Dvorak-type algorithm, as the accuracy of the method would be expected to be compromised at more oblique satellite view angles. However, this is not always the case, as seen in Cyclone Sidr in the same year. In Sidr, the ADT applied to the HURSAT data with the additional homogenization correction actually provided a higher LMI than the uncorrected data. The identification of an eye scene occurred roughly concurrently, but the infrared brightness temperatures differed at the different view angles in such a way that the intensity was greater at the more oblique angles. Note that, in addition to the view angles as an absolute measure of obliqueness, different satellites view the storms from very different directions, which also affects the measured brightness temperatures.
South Indian Ocean Cyclones Bindu (2001) and Pancho (2008) are also shown in Fig. 13 to illustrate more extreme examples of how the different view angles of the corrected HURDAT data versus the uncorrected data can affect the intensity estimates provided by the ADT. Again, we find that the more oblique satellite view can provide either greater or lesser LMI, and these differences can be large. As noted above, this is a somewhat unexpected result and emphasizes the variability in the relationship between satellite view and the intensity estimates provided by Dvorak-type algorithms. This is also seen in Fig. 14, which shows the differences in LMI estimated by the ADT applied to the corrected versus the uncorrected HURSAT data. When all named storms (LMI ≥ 35 kt) in the Indian Ocean during the period 1998–2009 are analyzed, the mean difference is about zero but the range of differences spans −65 to +65 kt, which underscores some of the complexity in the relationship between the ADT estimates and the various satellite images that the ADT is applied to. The more simple retroactive bias correction applied by Elsner et al. (2008), while very easy to implement and reasonably supported by the data, would be less likely to represent this variability well.
Another point that Figs. 13 and 14 bring out is that the ADT is remarkably robust even when presented with highly oblique view angle situations. It is clear from our analyses that the correction for the 1998 discontinuity in the HURSAT data is important for trend analyses, but this robustness provides additional confidence that ADT applied to the HURSAT data with the additional homogenization correction is still doing a reasonable job of capturing LMI, and hence LMI trends, in the Indian Ocean.
4. Concluding remarks
Our analyses using a new homogenized record of tropical cyclone intensity suggest that the stronger tropical cyclones, globally, have become more intense at a rate of about +1 m s−1 decade−1 during the 28-yr period 1982–2009, but the statistical significance of this trend is marginal. Dramatic changes in the frequency distribution of lifetime maximum intensity (LMI) have occurred in the North Atlantic, while smaller changes are evident in the South Pacific and South Indian Oceans, and the stronger hurricanes in all of these regions have become more intense. There are no significant changes noted in the eastern North Pacific, and negative changes are found in the western North Pacific, that is, the strongest hurricanes have become weaker. There are insufficient data to determine trends in the distribution of LMI in north Indian Ocean hurricanes.
The 28-yr length of the new homogenized record places strong constraints on the interpretation of the observed trends, and a heuristic exercise suggests that trends in the LMI of the strongest storms caused by observed trends in tropical cyclone potential intensity could easily be obscured by random variability within this time period. Interpretation is further challenged by the fact that observed regional climate variability comprises a number of factors, both natural and anthropogenic, and the response of tropical cyclones to each factor is not yet well understood. Long-term trends in tropical climate due to increasing greenhouse gas can be regionally dominated by shorter-term decadal variability forced by both internal and external factors such as changes in natural and anthropogenic aerosol concentrations. For example, pollution aerosols can affect regional sea surface temperature and SST gradients (Chung and Ramanathan 2006; Mann and Emanuel 2006; Baines and Folland 2007; Evan et al. 2009; Ting et al. 2009; Zhang and Delworth 2009; Chang et al. 2011; Solomon et al. 2011; Booth et al. 2012; Villarini and Vecchi 2013; Zhang et al. 2013) and can affect regional circulation patterns (e.g., Meehl et al. 2008; Evan et al. 2011b). Similarly, mineral aerosols (dust) and volcanic aerosols can affect regional SST in the tropics, as well as upper-level conditions, on interannual to decadal time scales (Thompson and Solomon 2009; Evan et al. 2011a; Evan 2012; Evan et al. 2012; Emanuel et al. 2013). In concert with these natural and anthropogenic external forcings, internal variability can play a substantial, and possibly dominant, role in regional decadal variability (e.g., Ting et al. 2009; Zhang et al. 2013). Thus, when interpreting the global and regional changes in tropical cyclone intensity shown in the present work, it is clear that framing the changes only in terms of linear trends forced by increasing well-mixed greenhouse gasses is most likely not adequate to provide a complete picture of the potential anthropogenic contributions to the observed changes.
At present, detection and attribution of changes and trends in tropical cyclone activity remains a significant challenge, but increases in our physical understanding of the causes of regional climate variability and its effect on tropical cyclones, particularly on decadal time scales, together with attempts to homogenize the historical tropical cyclone records provide a way forward.
The continued development of the Advanced Dvorak Technique has been supported by the Naval Research Lab in Monterey, California, and the Office of Naval Research, with additional support from NOAA/NESDIS. Quantile regressions were performed using the software environment R (http://www.r-project.org) and the quantile regression package quantreg: Quantile regression (Koenker 2011). We thank Kerry Emanuel for providing us the potential intensity data used in Fig. 8 and Chris Velden for his input and support of the application of the Advanced Dvorak Technique to the HURSAT-B1 data.
Synthetic Time Series
We describe an AR(1), or red noise, process as
where x is a standardized variable with zero mean and standard deviation of unity, a is the lag-1 (Δt) autocorrelation, and ɛ is a random number drawn from a normal distribution at each time step (white noise). We then form synthetic time series of potential intensity as
where μ and σ are the specified constant annual mean and standard deviation, respectively, of PI, and δ(t) = m(t − t0) is a specified linear trend with slope m. Assuming that any storm that has reached hurricane intensity has a uniform probability of achieving any lifetime maximum intensity (LMI) between hurricane intensity and its potential intensity (Emanuel 2000), we can form synthetic time series of annual LMI samples as
where LMImin = 33 m s−1 (minimal hurricane intensity) and γ(t) is a collection of random numbers drawn from a uniform distribution on the interval (0, 1). For each year t, the number of values in γ(t) is a random number drawn from a Poisson distribution with specified rate parameter λ.
These regions are typically separated into North Atlantic, eastern North Pacific, western North Pacific, South Pacific, north Indian, and south Indian Oceans.
Potential intensity describes the maximum intensity that a tropical cyclone can attain, as determined by the thermodynamic state of the local environment (Bister and Emanuel 1998).
For simplicity, we use the 65-kt threshold to denote hurricane strength winds, although these winds and their related thresholds are measured differently in the best-track data from different regions. We will also refer to these storms collectively as “hurricanes” regardless of geographic region.