Hurricanes cause drastic social problems as well as generate huge economic losses. A reliable forecast of the level of hurricane activity covering the next several seasons has the potential to mitigate against such losses through improvements in preparedness and insurance mechanisms. Here a statistical algorithm is developed to predict North Atlantic hurricane activity out to 5 yr. The algorithm has two components: a time series model to forecast average hurricane-season Atlantic sea surface temperature (SST), and a regression model to forecast the hurricane rate given the predicted SST value. The algorithm uses Monte Carlo sampling to generate distributions for the predicted SST and model coefficients. For a given forecast year, a predicted hurricane count is conditional on a sampled predicted value of Atlantic SST. Thus forecasts are samples of hurricane counts for each future year. Model skill is evaluated over the period 1997–2005 and compared against climatology, persistence, and other multiseasonal forecasts issued during this time period. Results indicate that the algorithm will likely improve on earlier efforts and perhaps carry enough skill to be useful in the long-term management of hurricane risk.
Hurricanes are a serious social and economic threat to the United States. Strong winds, heavy rainfall, and storm surge kill people and destroy property. Hurricane activity over the Atlantic basin varies widely on intraseasonal to multidecadal time scales. Several factors are known to influence this activity including the El Niño–Southern Oscillation (ENSO), the North Atlantic Oscillation (NAO), and Atlantic sea surface temperature (SST).
Seasonal forecasts of hurricane activity are routinely issued by government agencies and academics. There appears to be some general skill against climatology with these forecasts, especially at lead times of a few months or less. However, the usefulness of forecasts is limited by the lack of regional specificity and by the rather short lead times (forecast horizons). Approaches are available to forecast activity near the coast (Elsner et al. 2006; Saunders and Lea 2005; Elsner and Jagger 2004, 2006; Lehmiller et al. 1997). However, little has been done to address the limited forecast horizon. One exception is the multiseason model developed in Elsner et al. (1998). The model (with slight modifications) has been used to forecast the basin-wide hurricane count since 1997. However, as anticipated by the authors, the forecast skill has been only marginal against climatology.
The objective of this paper is to show how improvements can be made in multiseason forecasts of Atlantic hurricane activity. The idea is to make multiyear forecasts of Atlantic SST and then predictions of hurricane activity conditional on a forecast of the SST. A skillful model that spans multiple years will help with long-term emergency management planning as well as with financial planning including capital allocation and investment strategies.
We begin in section 2 by examining the track record of multiseason predictions. The skill level is put into the context of the routinely issued seasonal forecasts. In section 3 we examine time series of both Atlantic SST and North Atlantic hurricane counts. In section 4 we perform diagnostics on the SST and hurricane records. Time series diagnostics are run to help develop the time series model and regression diagnostics are run to help develop the regression model. In section 5 we tie the two models together into a hierarchical prediction algorithm. In section 6, the algorithm is used to retrodict hurricane activity over the period 1997–2005 and the hindcasts are compared with persistence and with actual forecasts made over this period. In section 7, we provide a summary of the method and results.
2. Past research
The present work is motivated by Elsner et al. (1998; hereafter ENT98) who use a time series model for the annual North Atlantic hurricane counts. The ENT98 model is based on a singular spectrum analysis (SSA) performed in Elsner et al. (1999). The results show important variations in North Atlantic hurricanes on subdecadal time scales that to some degree are predictable. Here we choose a different approach that uses a time series model for the SST variation and a regression of hurricane activity onto the predicted SST values. The main difference between the present approach and the approach of ENT98 is the parametric time series model for SST that replaces the empirical model for annual hurricane counts.
With the exception of the El Niño years of 1997 and 2002, hurricane activity over the Atlantic basin has been above normal every year since 1995. As expected, predictive skill of the ENT98 model has been marginally better than climatology. Interestingly, the overall track record of the multiseasonal ENT98 model is better than the track record of forecasts issued by Gray et al. at Colorado State University (CSU) over this time period. (Fig. 1). The correlation between predicted and actual hurricane counts using the CSU June, April, and December forecasts are +0.30, −0.12, and −0.47, respectively. The correlation between predicted and actual hurricane counts using the ENT98 one-, two-, and 3-yr forecasts are +0.40, +0.31, and +0.47, respectively.
Hurricane counts are used directly in building the ENT98 model. It is our hypothesis that improvements on this strategy can be made by forecasting rates rather than counts. Thus our perspective is more statistical in that we assume the counts to be random conditional on the underlying rate. Another limitation to the ENT98 model is the lack of covariate information. Here we make use of the fact that, in general, hurricane activity responds to changes in SST. We thus develop a time series model of the SST, rather than a time series model of the raw counts as in ENT98, and then we use the predicted values as covariate information to forecast the hurricane rates. It is anticipated that, with both of these improvements, the skill level of multiseason projections of Atlantic hurricane activity can be increased.
The change from a period of below-normal activity to a period of above-normal activity is considered the result of low-frequency climate variations driven in part by SST over the North Atlantic (Goldenberg et al. 2001). There is also a perspective that since SSTs are influenced by climate change, the next lull in hurricane activity will not be as low as in the past (Emanuel 2005; Webster et al. 2005).
3. Atlantic SST and hurricanes
Hurricanes derive their power from the ocean through moist enthalpy and instability. Other things being equal, the amount of energy available depends on the SST and the depth of the ocean mixed layer. Cold water below a surface layer of warm SST (shallow mixed layer) will limit the energy available for hurricane maintenance. In the absence of information about the mixed layer depth and on interannual and longer time scales, we use SST as a proxy for surface forcing of hurricane activity.
Statistical analysis confirms that over the Atlantic basin warmer SST tends to result in more and stronger hurricanes. Saunders and Harris (1997) using linear regression showed that SST over the tropical Atlantic is the dominating influence behind the interannual variance in Atlantic hurricane numbers. Kimberlain and Elsner (1998) showed that the increase in hurricane activity since 1995 could be related to warmer SSTs to the east of the Lesser Antilles. Goldenberg et al. (2001) show that the number and strength of Atlantic hurricanes have a multidecadal variation which they suggest is related to SST changes resulting from North Atlantic Ocean currents.
Modeled SST and National Oceanic and Atmospheric Administration (NOAA) optimal interpolated SST datasets are used to compute Atlantic SST anomalies north of the equator in Enfield et al. (2001). Anomalies (in °C) are computed by month using the base period 1951–2000. Data are obtained from the NOAA/Cooperative Institute for Research in Environmental Sciences (CIRES) Climate Diagnostics Center back to 1871. For this study we average the SST anomalies over the hurricane-season months of August through October.
Figure 2 displays the August–October averaged value of the Atlantic SST by year over the period 1871–2005. The data show large year-to-year fluctuations, multidecadal variability, and a positive trend. The exact cause of the low-frequency variation is open to debate, but it has been suggested that is part of a multidecadal mode [Atlantic Multidecadal Oscillation (AMO)] that influences hurricane formation through changes in local atmospheric boundary layer processes and vertical wind shear, particularly in the main development region of the Atlantic (Bell and Chelliah 2006; Enfield and Mestas-Nuñez 1999).
The distribution of the individual values is also displayed with a kernel density plot. The distribution appears symmetric and normal. Formally we check for normality using the Shapiro–Wilks statistic and find that the test statistic has a value of 0.985 corresponding to a P value of 0.719 and providing no evidence against the null hypothesis of normally distributed SST data.
A series of the annual hurricane counts over the same time period is also displayed in Fig. 2. Similar to the record of SST, the annual hurricane counts show large interannual fluctuations superimposed on low-frequency variations. The case for a trend in the counts is less compelling than for the SST, especially when considering the fact that data prior to 1946, when reconnaissance flights began, is generally accepted as somewhat incomplete. The distribution of annual counts is discrete and follows a Poisson distribution. A χ2 goodness-of-fit test with 3 degrees of freedom gives a P value of 0.662 providing no evidence against the null hypothesis of Poisson distributed counts.
4. Data analysis and modeling
In this section we perform analysis and modeling of the data to help us decide on the overall structure of the prediction algorithm, which is outlined in the next section. We begin with time series analysis of the Atlantic SST record and continue with a regression analysis of the hurricane counts.
a. Time series model of Atlantic SST
We employ standard tools from time series analysis to help us model Atlantic SST. Our goal is a prediction model for late summer/early fall Atlantic SST values. Regardless of how well we understand the nature of the phenomenon behind the identified time series patterns, if we properly model them, we can extrapolate the patterns to predict future values.
The first step in developing a time series model is to check for stationarity. Stationarity refers to the property that the mean, variance, and autocorrelation of the time series are roughly constant. Stationarity can be checked with various statistics. The behavior of the autocorrelation function can also be used. For example, a slow decay of the autocorrelation values with increasing lag is often evidence of nonstationarity. We begin with the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test.
The KPSS test is a commonly used direct method for checking stationarity in time series (Kwiatkowski et al. 1992). It is a one-sided right-tailed test so that one rejects the null hypothesis of stationarity at the 100α% level if the KPSS statistic is greater than the corresponding quantile value. For the late summer/early fall SST data the KPSS statistic is 1.249 under the null hypothesis of no trend. The corresponding 99th percentile value is 0.762, so we reject the null hypothesis of stationarity about a constant value (no trend). However, if we test for stationarity under the null hypothesis of a linear trend, the KPSS statistic drops to 0.125 (the corresponding 99th percentile value is 0.229), so we fail to reject the null hypothesis and we proceed as if the time series is stationary after a linear trend removal.
Trend removal is typically performed by fitting a curve and subtracting the fitted values from the original data. However, Box and Jenkins (1976) recommend differencing the series to achieve stationarity and that is what is done here. The time series of the first difference (SSTt − SSTt−1) is shown in Fig. 3. Note that the long-term trend is removed with this procedure. A stationary time series has time invariant first and second statistical moments. Thus, given a sample time series yt (for t = 1, . . . , T), the lag j sample autocorrelation is defined as
where y = (1/T)ΣTt=1yt is the sample mean. The sample autocorrelation function is a plot of ρ̂j against lag j.
The autocorrelation function provides information that can be used to model a time series. For comparison we plot the values of the autocorrelation function for the original series and for the differenced series in Fig. 3. The dashed lines indicate the 95% confidence levels. Large autocorrelation values (above the 95% level) extend to lag 11 for the raw time series. In contrast, the difference series shows only significant correlation at lags 1, 11, and 12. The nonstationarity in the raw series as evidenced by the slow decay in the autocorrelation function makes it difficult to choose the order of the time series model as the trend contributes to the short lag autocorrelations.
A common way to specify a univariate time series is to use an autoregressive (AR) model. Let yt be a zero-mean time series, then
Thus an autoregressive model is a linear regression of the current value of the series against one or more prior values of the series. The value of p is called the order of the AR model, and is denoted AR(p). The AR integrated moving average [ARIMA(p, d, q)] models extend the scope of the AR models to include moving average (MA) terms (to order q) and differencing (to order d), where the “I” in ARIMA stands for “integrated” since the predictions must be integrated (antidifferenced) to forecast the original time series. Once the orders are determined, the model coefficients are estimated using the method of maximum likelihood.
For comparisons we entertain two time series models. The first is an ARIMA(3, 0, 0) model [or equivalently, a AR(3) model on the raw time series] and the second is a ARIMA(12, 1, 1) model with coefficients for all lags except 1, 11, and 12 set to zero. The models can be expressed as
where ɛt = zt + m1zt−1.
The ARIMA model is somewhat more complicated than the AR model. Note that the ARIMA model contains a component corresponding to an 11–12-yr cycle in the differenced series. We offer no physical explanation for this, only noting its frequency correspondence with the 11-yr solar cycle. Elsner and Kara (1999) provide a historical summary of studies related to the sun–hurricane connection and describe a few mechanisms for a possible extraterrestrial influences on Atlantic hurricane activity, including cosmic rays (Elsner and Kavlakov 2001). We note that August–October SST values are generally lower (higher) during years of low (high) sunspot numbers, which correspond to a “cooler” (warmer) sun. For example, the periods 1963–66, 1971–77, 1983–87, and 1992–97 featured reduced sunspot activity and an average hurricane-season North Atlantic SST anomaly of −0.066 which compares with an anomaly of +0.146 for the years not included in the above list during the period 1957–2003. Under the assumption of independence (not entirely valid), the p value for the difference in mean anomalies is 0.0013 (N = 46) indicating a statistical relationship.
A qualitative comparison of the two competing models is made by examining the partial autocorrelation functions of the model residuals (Fig. 3). The partial autocorrelation at lag j is the autocorrelation between SSTt and SSTt−j that is not accounted for by lags 1 through j − 1. In general, partial correlation values are larger for the AR model. Specifically, a significant partial correlation value (above the 95% line) at lag 10 is noted for the AR model residuals indicating that the model might be inadequate in describing all the important serial correlation in the time series. In contrast, no significant lag correlations are noted in the partial autocorrelation function of the residuals from the ARIMA model. Another comparison is made by examining the Akaike information criteria (AIC) given by
The information criterion captures the trade-off between model fit and parsimony. Models with too many parameters are penalized by lack of parsimony. A model with a lower AIC value is the better model accounting for fit and number of parameters. For the AR model the AIC is −86 which compares with −96 for the ARIMA model. Based on the autocorrelation analysis and the AIC values we choose the ARIMA(12, 1, 1) model for predicting the SST out to 5 yr.
More explanation is needed on the chosen model. The SSTt−23 term arises from the product expansion of an ARIMA(12, 1, 1) and an ARIMA(11, 1, 1) model resulting in autoregressive terms with 1, 11, 12, and 23 lags. The coefficient on the SST t−12 term is based on examining another model where the lag 11 and lag 12 terms are added simultaneously (separate coefficients). The chosen model, where the coefficient on the lag 12 term contains the lag 1, lag 11, and lag 12 coefficients, had a smaller AIC while still removing the significant correlations and partial coefficients at lags 11 and 12. The model also contains a moving average term ɛt containing a first-order coefficient m1 where zt is a random uncorrelated value (innovation) at year t.
In summary, our time series model for Atlantic late summer/early fall SST contains an autoregressive component to capture the interannual persistence in SST values, a moving average component to capture the possibility that a random fluctuation may propagate to future values through feedbacks, a cyclic component possibly related to changes in solar insolation, and a trend component to capture the low-frequency variability possibly related to climate change. The model is from a family of models referred to as Box–Jenkins models after Box and Jenkins (1976).
b. Regression model of hurricanes on SST
Since hurricanes derive energy from the heat and moisture of the underlying ocean, theoretically the maximum potential intensity of a hurricane increases with increasing SST, all else being equal (Emanuel 1987; Holland 1997). Thus, as a first approximation, a season with a uniform distribution of maximum intensities over an ocean of uniform temperature will result in more hurricanes if the ocean warms even if the number of tropical cyclones remains constant. In fact, at least over the Atlantic, the year-to-year variation in hurricane activity is statistically linked (positively correlated) to Atlantic SST (Saunders and Harris 1997; Kimberlain and Elsner 1998; Goldenberg et al. 2001). More realistically the thermal structure of the ocean’s mixed layer (top several hundred meters) is a better representation of the available energy for hurricanes against mixing and overturning generated by the strong winds. Thus, SST serves as a proxy for oceanic heat content. Figure 4 shows time series’ plots of the August– October Atlantic SST and annual Atlantic hurricane counts.
The interannual variation in the pattern of hurricane activity appears to coincide with the interannual variation in the pattern of SST. This association is confirmed quantitatively in the scatterplot. The regression line indicates that SST explain a statistically significant (P value <0.001) 17% of the interannual variability in hurricane counts. Considering the complexity of hurricane genesis and the fact that SST is only a proxy for oceanic heat content, the result is encouraging. Indeed it is well understood that, regardless of surface warmth, atmospheric winds creating a shearing environment can inhibit hurricane activity. This is why seasonal forecast models can be improved by including a variable characterizing ENSO, which tends to modulate Atlantic hurricane activity through its effect on environmental wind shear and subsidence.
On longer time scales, the effect of ENSO can be considered a “random” component to hurricane activity, whereas the autocorrelative structure of the SST data as described in the previous section suggests that the relationship with hurricane activity likely will be stronger. The scatterplot of the smoothed hurricane counts against the smoothed SST verifies this. Here we find that SST explains 32% of the variability in hurricane activity. The series’ are smoothed using a local regression with a span of 5 yr. These diagnostics provide statistical validation for our approach to modeling hurricane counts from SST. However, since interest is in future hurricane counts we use a regression model appropriate for discrete data.
The canonical model for hurricane count data is the Poisson regression (Elsner and Schmertmann 1993; Solow and Moore 2000; Elsner et al. 2001; Jagger et al. 2002; Elsner 2003; McDonnell and Holbrook 2004). It is based on the Poisson distribution, which is a discrete distribution defined on the nonnegative integers. It is derived from the distribution of wait times between successive events. For our purpose the Poisson regression model is used to model a set of basin-wide hurricane counts ht ∈ 0, 1, 2, . . . , ∞ = Z+ on the nonnegative integers for a set of observed years t = 1, . . . , T. Additionally we have a set of SST values one for each year. Thus the Poisson regression is
where λt is the hurricane rate for year t, β0 is the intercept, and β1 is SST coefficient. The symbol ∼ refers to a stochastic relationship and indicates that the variable on the left-hand side is a random draw (sample) from a distribution specified on the right-hand side. The equal sign indicates a logical relationship with the variable on the left-hand side algebraically related to variables on the right-hand side.
5. Prediction algorithm
Diagnostic results from the previous section provide us with the background to build a multiyear prediction algorithm for annual hurricane counts. In short, the algorithm generates a predictive distribution of annual hurricane counts out to 5 yr by generating predictive samples of SST values and predictive samples of regression model coefficients and combining them mathematically through Eq. (6). Specifically the algorithm uses Monte Carlo (MC) sampling within a hierarchy. A schematic of the hierarchy is shown in Fig. 5. At the top are the SST and hurricane data records. The SST data are used alone to build a univariate time series model and are used together with the hurricane data to build a regression model. Details of the time series and regression models were given in the previous section.
Given values for the coefficients of the ARIMA model and random uncorrelated innovations for each year, the time series model together with the historical series of SST values produces a single sample of predicted SST values with a single value for each year out to 5 yr. Using 100 different innovation series provides a sample consisting of 100 5-yr SST predictions. Given the historical series of SST values and corresponding hurricane counts, the regression model produces a single set of model coefficients (offset and slope), representing expected values from a Student’s t distribution. The regression model specifies a Poisson rate for a single year conditional on a value of SST. Choosing a random set of model coefficients from their respective distributions together with a single SST value produces a single rate for a single year. The procedure is repeated for each forecast year and for 100 sampled SST predictions and 1000 sampled regression coefficients for a total of 500 000 hurricane counts. To get a single predicted rate we average over all samples for a given forecast year. This MC approach combines the uncertainty inherent in the time series predictions with the uncertainty associated with the regression of hurricanes on SST to give an assessment of overall predictive uncertainty.
Model skill is examined by estimating the time series and regression parameters on data over the period 1871–1996 (training period), then applying the prediction algorithm to forecast hurricane counts from 1997–2005 (forecast period). The length of the training period (initially 116 yr) is dictated by a trade-off between having enough data to produce an accurate model and leaving enough data to get reliable validation statistics. The choice of years for the training period is made partly based on the fact that comparisons can be made with actual forecasts from the ENT98 model. The first forecast in the validation period is issued for 5 consecutive seasons 1997–2001 and is initialized in December of 1996. The training period is then updated to include the observed number of hurricanes for 1997 and the model parameters reestimated to make a forecast for the period 1998–2002. This is repeated 7 times so that the last forecast period is 2003–05.
Figure 6 shows various results from the validation exercise. First, we compare the observed relative hurricane rate with forecast relative rates. Relative rates are computed by taking the ratio of the short-term (5 yr) mean hurricane count to the long-term mean hurricane count. A relative rate of 1 indicates a long-term climatological mean number of hurricanes. The long-term mean is the average number of hurricanes starting with 1871 and ending with the year before the first forecast year. For example, in comparing the 5-yr forecast issued beginning with the 1997 season, we use a long-term mean of 5.14 hurricanes based on the period 1871–1996. The observed short-term mean is 7.60 hurricanes so the observed relative rate is 1.48. Forecasts made with the present model are converted to relative rates by taking the ratio of the 5-yr forecast mean to the observed long-term mean. The same procedure is used to convert the forecasts from the ENT98 model and a persistence model that uses the previous 10 yr as a forecast for the next 5 yr. Note that with forecasts made starting with the 2002 and 2003 seasons, the number of forecast years is 4 and 3, respectively.
The observed short-term hurricane rates have exceeded the long-term mean rate in each of the 7 forecast periods. The three forecast models do a good job in predicting the above-normal activity with the SST model performing the best, although all three forecast models fall short in predicting the magnitude of the recent high level of hurricane activity. The correlation between the observed and predicted number of hurricanes over each of the 5-yr prediction intervals for the SST model and the ENT98 model is also shown. The correlation are generally above 0.5 for the SST model and better than the ENT98 model, although the ENT98 model proved superior with its 2001 and 2002 forecasts. Note that the persistence model provides a 5-yr mean rate forecast rather than forecasts of annual counts, so correlation as a measure of skill is meaningless. The root-mean-squared error statistics show similar results, with the SST model providing more skill than the ENT98 model with the exception of the 2001 forecast when the ENT98 model predicted 11 hurricanes (observed = 15) for the 2005 season 5 yr in advance.
The amount by which the model underestimates the observed activity over the verification period might be partly due to a low bias in hurricane counts in the years prior to about 1944. Recent work in trying to quantify this bias (Landsea 2007) could lead to a more accurate forecast model. This can be done by using the bias-corrected set of hurricane counts to recalibrate the Poisson regression.
In general the validation results indicate skill for the SST model. Thus we use the model to predict the relative rates out to the year 2010. The rates are integrated over the forecast period. The forecast calls for a continuation of an active hurricane period with expected relative rates at levels about 50% above the long-term mean. The 95% confidence interval ranges between 10% and 100% of the long-term rate.
The 2006 Atlantic hurricane season, which featured 5 hurricanes, came and went during the course of the research and review process. This is slightly below the long-term (1871–2005) mean number of 5.34 and well below the forecast mean relative rate of 1.46. Model failure is attributed to El Niño. The El Niño produced greater wind shear and subsidence across the tropical Atlantic that is not accounted for in the model. However, to put this forecast into context, a mean relative rate of 1.46 translates into 7.8 hurricanes. For comparison, Klotzbach and Gray (2006) forecast 9 hurricanes for 2006 as late as 1 June 2006, while Saunders and Lea (2006) forecast 8.2 hurricanes for 2006 as late as 4 April 2006.
Hurricanes are capable of generating large financial losses to the insurance industry. Predictions of the level of hurricane activity have been around since the mid-1980s, and currently there are at least a half dozen groups regularly issuing such forecasts, but with the exception of ENT98 outlooks covering a longer time horizon of several years have not been routinely attempted.
Here we develop a statistical algorithm to predict basin-wide North Atlantic hurricane counts out to 5 yr that improves on the strategy of ENT98. The algorithm has two components: an ARIMA time series model to forecast average hurricane-season Atlantic SST, and a regression model to forecast the annual hurricane counts conditional on the predicted SST. The algorithm is developed based on results from diagnostic tests for goodness of fit, stationarity, component order, and significance of the regression coefficients. The algorithm uses Monte Carlo sampling to generate predictive samples of SST and samples of the regression coefficients. In this way forecasts are samples of hurricane counts that combine uncertainty in the predictive SST values with uncertainty in the regression model of hurricanes on SST.
Forecast skill is evaluated over the period 1997–2005 and compared with skill from a persistence forecast and the skill from the original multiseasonal forecast model. Results indicate the algorithm has sufficient skill to be useful in the long-term guidance of hurricane risk. The model contains no term to account for higher-frequency fluctuations associated with, for example, the inhibiting or enhancing conditions of ENSO, so forecasts will fail in years when such conditions dominate. A forecast to the year 2010 indicates a continuation of the recent above-normal activity. The connection between regional North Atlantic SSTs and hurricanes could lead to the development of space–time multiseason forecast models.
We thank Stilian Kavlakov for discussions on the possible role of extraterrestrial influences on Atlantic SST. Support for this study was provided by the National Science Foundation (ATM-0086958, ATM-0435628, and BCS-0213980) and by the Risk Prediction Initiative (RPI-05001). The views expressed within are those of the authors and do not reflect those of the funding agencies.
Corresponding author address: James B. Elsner, Dept. of Geography, The Florida State University, Tallahassee, FL 32306. Email: email@example.com