The Tropical Meteorology Project at Colorado State University currently issues seasonal forecasts for Atlantic basin hurricane activity in early April, June, and August. This paper examines the potential for issuing an additional seasonal forecast on 1 July, using a two-predictor forecast model. The two predictors are selected from the ECMWF Interim Re-Analysis (ERA-Interim) and explain over 60% of the cross-validated variance in post–30 June accumulated cyclone energy over the hindcast period from 1979 to 2012. The two predictors selected are May–June-averaged 2-m temperatures in the eastern tropical and subtropical Atlantic along with May–June 200-mb zonal winds in the tropical Indian Ocean. The May–June-averaged 2-m temperatures are shown to strongly correlate with August–October 2-m temperatures in the main development region, while the 200-mb zonal wind flow over the tropical Indian Ocean is shown to strongly correlate with El Niño–Southern Oscillation. In addition, each predictor is shown to correlate significantly with accumulated cyclone energy, both during the hindcast period of 1979–2012 and with an independent period from 1948 to 1978.
The Tropical Meteorology Project (TMP) at Colorado State University has issued seasonal hurricane forecasts in early June and early August since 1984 (Gray 1984; Gray et al. 1993, 1994), and an early season forecast has been issued in early April since 1995 (Klotzbach and Gray 2013). The early June and early August forecasts have shown skill when compared with several no-skill metrics (Klotzbach and Gray 2009), with the early April forecast showing less skill (Klotzbach and Gray 2013). The statistical modeling behind the TMP’s forecasts has undergone revisions in recent years (e.g., Klotzbach 2011), with the newly developed early August seasonal forecast explaining approximately 80% of the cross-validated hindcast variance for integrated seasonal metrics such as net tropical cyclone (NTC) activity (Gray et al. 1994) and accumulated cyclone energy (ACE; Bell et al. 2000) during the most recent thirty years. Given the improvement in real-time prediction skill from early June to early August (Klotzbach and Gray 2009), development of an intermediate forecast model issued in early July seems like a logical extension of previous work.
From the mid-1980s to the late 1990s, the TMP utilized primarily weather station and radiosonde data (e.g., Gray et al. 1994). During the past 10–15 yr, the TMP has utilized the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis dataset as its predominant source for statistical model development (Kistler et al. 2001). In recent years, the Climate Forecast System Reanalysis (Saha et al. 2010) and the European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis (ERA-Interim; Dee et al. 2011) have been developed and continue to be updated in near–real time. These reanalysis products likely provide a more realistic estimate of the actual conditions observed for a particular part of the globe than earlier reanalysis products because of the improved resolution and data assimilation techniques. The ERA-Interim product will be utilized in the development of this early July statistically based seasonal forecast scheme. Section 2 discusses the data utilized to develop the forecast, while section 3 describes the predictor selection process and evaluates the hindcast skill of the predictors over the developmental period from 1979 to 2012, along with an evaluation of skill during an earlier period from 1948 to 1978. Section 4 discusses the physical links between the two predictors and Atlantic basin tropical cyclone (TC) activity. Section 5 summarizes the manuscript and provides some ideas for future work.
All tropical cyclone (TC) statistics are calculated from the National Hurricane Center’s (NHC) second generation best-track data, which is available online (http://www.nhc.noaa.gov/data/#hurdat; see Landsea and Franklin 2013). This file provides estimates of 1-min maximum sustained wind and central pressure for every 6-h period where the NHC deems that a TC is present. ACE was calculated by utilizing the approach outlined in Bell et al. (2000) and is defined as the sum of the maximum 1-min sustained wind speed squared for each 6-hourly interval when the NHC declares a tropical or subtropical cyclone exists in the Atlantic basin. Seasonal ACE is an integrated measure that approximates the kinetic energy generated by all TCs in the Atlantic basin for a particular season.
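The ACE definition above can be sketched in a few lines. This is a minimal illustration, not the author's processing code. The record layout is hypothetical, and three conventions are assumptions that the text does not state: winds in knots, a 34-kt tropical storm threshold, and scaling by 1e-4.

```python
from typing import Iterable, Tuple

# Assumptions not stated in the text: winds are in knots, only fixes at or
# above tropical storm strength (34 kt) are counted, and the sum is scaled
# by 1e-4, following the commonly used ACE convention.
TS_THRESHOLD_KT = 34.0
ACE_SCALE = 1e-4
SYNOPTIC_HOURS = (0, 6, 12, 18)

# Hypothetical record layout: (hour_utc, max_wind_kt, is_tropical_or_subtropical)
Fix = Tuple[int, float, bool]


def accumulated_cyclone_energy(fixes: Iterable[Fix]) -> float:
    """Sum squared 6-hourly maximum winds over qualifying best-track fixes."""
    total = 0.0
    for hour_utc, wind_kt, is_tc in fixes:
        if hour_utc not in SYNOPTIC_HOURS:
            continue  # skip off-cycle entries so each 6-h interval counts once
        if not is_tc or wind_kt < TS_THRESHOLD_KT:
            continue  # skip non-TC stages and fixes below the assumed threshold
        total += wind_kt ** 2
    return total * ACE_SCALE
```

Post–30 June ACE would follow from passing only the fixes dated 1 July or later; the date filter is left to the caller here.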
While ACE is likely reliable since 1970, there are significant questions about data quality in the National Oceanic and Atmospheric Administration’s (NOAA) revised Atlantic hurricane database (HURDAT2) prior to that time period. There was likely an overestimation of major hurricane intensity from the 1940s through the 1960s (Landsea 1993). These overestimations are currently being addressed by the Atlantic Hurricane Database Reanalysis Project (Hagen et al. 2012); however, at this point, only years up to 1945 have been reanalyzed. While these overestimations likely led to inflated ACE values during the 1940s through the 1960s, TC intensity in the eastern part of the Atlantic basin may have been underestimated prior to the advent of continuous satellite monitoring in the mid-1960s. Several approaches have been outlined to account for these underestimates for both named storms (Vecchi and Knutson 2008) and hurricanes (Vecchi and Knutson 2011), but no approach has yet been outlined to account for underestimates in ACE. The overestimation of major hurricane ACE and the underestimation of eastern Atlantic ACE may partially offset one another.
The ERA-Interim product (Dee et al. 2011) is utilized as the dataset for predictor selection. This reanalysis is performed on a finer-scale grid (1.5° latitude–longitude) than the original NCEP–NCAR reanalysis (2.5° latitude–longitude) and utilizes an advanced four-dimensional variational data assimilation scheme compared with a three-dimensional data assimilation scheme for the NCEP–NCAR reanalysis. The ERA-Interim product is available from 1979 to the near present.
ERA-Interim data are updated with an approximately 4-month lag to real time, so estimates of reanalysis values must be utilized for real-time prediction. Real-time ECMWF operational data are available for research purposes from The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) website (http://apps.ecmwf.int/datasets/data/tigge/?levtype=pl&type=fc). These operational data are available beginning in October 2006. To estimate reanalysis predictor values in real time, operational values are calculated and then converted to “reanalysis” values. This conversion involves calculating standardized anomalies of the operational values relative to the 2007–12 operational average and standard deviation, and then generating reanalysis values by multiplying the operational standardized anomalies by the 2007–12 reanalysis standard deviation and adding the 2007–12 reanalysis average.
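The operational-to-reanalysis conversion described above amounts to a standardize-then-rescale step. The sketch below is illustrative only; the function and argument names are hypothetical. The text does not say whether a sample or population standard deviation was used, so the sample form is an assumption here.

```python
from statistics import mean, stdev
from typing import Sequence


def to_reanalysis_scale(
    operational_value: float,
    operational_base: Sequence[float],  # operational predictor values, 2007-12
    reanalysis_base: Sequence[float],   # reanalysis predictor values, 2007-12
) -> float:
    """Rescale an operational predictor value to reanalysis-equivalent units."""
    # Standardized anomaly relative to the operational base period.
    z = (operational_value - mean(operational_base)) / stdev(operational_base)
    # Map the anomaly onto the reanalysis mean and spread.
    return z * stdev(reanalysis_base) + mean(reanalysis_base)
```

The sample-versus-population choice does not change the result as long as the same form is used for both datasets, since the two scale factors appear as a ratio over equal-length base periods.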
Four fields are analyzed for predictors: 2-m temperature [2mT, which is closely correlated with sea surface temperatures (SST) over water], sea level pressure (SLP), 850-mb zonal wind (U850), and 200-mb zonal wind (U200). All four of these fields have been shown in previous research (Klotzbach 2011) to significantly impact Atlantic basin TC activity. The predictors are selected based on ERA-Interim data from 1979 to 2012 and are then tested using the NCEP–NCAR reanalysis from 1948 to 1978.
All predictor calculations were made utilizing the Climate Explorer website (http://climexp.knmi.nl/).
3. July seasonal forecast model development
Correlation maps were constructed between the post–30 June ACE and the May–June large-scale fields discussed in the previous section (i.e., 2mT, SLP, U850, and U200). Figure 1 displays a correlation map between May–June 2mT and post–30 June ACE. The positive correlation outlined in the box in the eastern Atlantic was selected as the first predictor in the forecast scheme (i.e., May–June-averaged 2mT over 10°–50°N, 30°–10°W). Following the selection of this predictor, a preliminary prediction was run and a residual time series (i.e., observed minus hindcast) was created. The residual correlation map was compared with the original correlation map to find areas that significantly correlated both with Atlantic basin ACE and with the residual ACE remaining after the eastern Atlantic 2mT predictor was considered (Fig. 2). Tropical Indian Ocean U200 (10°S–5°N, 60°–90°E) during the months of May and June was selected as a secondary predictor. After this predictor was selected, no other large-scale areas showed significant correlations with both basin-wide and residual ACE and, consequently, the forecast model was completed with two predictors.
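The residual-based screening step can be illustrated for a single candidate time series. This is a sketch of the idea only: the function names are hypothetical, the significance threshold `r_crit` is supplied by the caller, and the actual analysis was carried out on gridded correlation maps rather than on one series at a time.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen_candidate(
    candidate: Sequence[float],
    ace: Sequence[float],
    first_hindcast: Sequence[float],
    r_crit: float,
) -> Tuple[float, float, bool]:
    """Check a candidate against both observed ACE and the residual ACE."""
    # Residual = observed minus hindcast from the first-predictor model.
    residual = [obs - hind for obs, hind in zip(ace, first_hindcast)]
    r_ace = pearson(candidate, ace)
    r_residual = pearson(candidate, residual)
    keep = abs(r_ace) >= r_crit and abs(r_residual) >= r_crit
    return r_ace, r_residual, keep
```

Requiring both correlations to pass is what keeps a candidate from being selected merely because it duplicates information already carried by the first predictor.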
The linear correlation between each predictor and ACE is significant at the 5% level when using a two-tailed Student’s t test and assuming that each year represents an individual degree of freedom over both the dependent period from 1979 to 2012 and the earlier period from 1948 to 1978. Predictor 1 correlates with ACE at 0.71 from 1979 to 2012, while the correlation from 1948 to 1978 is 0.49. Predictor 2 correlates with ACE at −0.49 from 1979 to 2012, while the correlation from 1948 to 1978 is −0.39.
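The significance statement above can be checked with the usual t statistic for a correlation coefficient. The sketch assumes n − 2 degrees of freedom for n years (34 years for 1979–2012 and 31 for 1948–78), which is one reading of "each year represents an individual degree of freedom". The critical values are taken from standard two-tailed 5% tables rather than computed.

```python
from math import sqrt
from typing import Tuple

# Two-tailed 5% critical values of Student's t from standard tables,
# keyed by degrees of freedom (assumed here to be n_years - 2).
T_CRIT_5PCT = {29: 2.045, 32: 2.037}


def correlation_t_statistic(r: float, n_years: int) -> Tuple[float, int]:
    """Return the t statistic and degrees of freedom for a correlation r."""
    dof = n_years - 2
    return r * sqrt(dof) / sqrt(1.0 - r * r), dof


def is_significant_5pct(r: float, n_years: int) -> bool:
    """Two-tailed test of r against zero at the 5% level."""
    t_stat, dof = correlation_t_statistic(r, n_years)
    return abs(t_stat) > T_CRIT_5PCT[dof]


# Correlations quoted in the text, paired with the period length in years.
for r, n in [(0.71, 34), (0.49, 31), (-0.49, 34), (-0.39, 31)]:
    print(r, n, is_significant_5pct(r, n))
```

Under these assumptions all four quoted correlations exceed the tabulated threshold. Serial correlation in the time series would reduce the effective degrees of freedom, which this sketch does not address.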
Over the 2007–12 period, the operational and reanalysis values of predictor 1 correlate at greater than 0.99, while the operational and reanalysis values of predictor 2 correlate at 0.92, indicating the utility of the TIGGE ECMWF operational dataset for estimating predictor values in real time.
The two predictors are combined using a linear regression approach. When this is done, and the year being hindcast is left out of the equation development (typically referred to as jackknifing or cross validation; Efron and Tibshirani 1993), the model correlates with observed post–30 June ACE at 0.80. Figure 3 displays the year-by-year cross-validated hindcast from 1979 to 2012. For the period from 2007 to 2012, both the hindcasts from ERA-Interim and the hindcasts that would have been obtained from the ECMWF operational dataset are displayed. The correlation between the hindcasts based on the ERA-Interim data and the ECMWF operational data from TIGGE is 0.95.
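A leave-one-year-out hindcast of a two-predictor linear regression can be sketched as follows. This is an illustration under stated assumptions, not the author's code: ordinary least squares with an intercept is assumed, the function names are hypothetical, and only the regression coefficients are refit in each fold (predictor selection is not repeated inside the loop).

```python
from typing import List, Sequence, Tuple


def fit_two_predictor(
    x1: Sequence[float], x2: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, float]:
    """Ordinary least squares fit of y = a + b1*x1 + b2*x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 * s12  # zero if the predictors are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def leave_one_out_hindcasts(
    x1: Sequence[float], x2: Sequence[float], y: Sequence[float]
) -> List[float]:
    """Hindcast each year from coefficients fit on all other years."""
    hindcasts = []
    for i in range(len(y)):
        keep = [j for j in range(len(y)) if j != i]
        a, b1, b2 = fit_two_predictor(
            [x1[j] for j in keep], [x2[j] for j in keep], [y[j] for j in keep]
        )
        hindcasts.append(a + b1 * x1[i] + b2 * x2[i])
    return hindcasts
```

Correlating the returned hindcasts with the observations gives the cross-validated correlation; squaring a value of 0.80 gives 0.64, consistent with the "over 60%" of variance quoted in the text.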
When the two predictors are combined for 1948–78 using the equations developed over the 1979–2012 period, the correlation is 0.50. This correlation degradation is likely a combination of observational issues with the earlier-period NCEP–NCAR reanalysis data as well as the observational issues with ACE discussed previously. Upper-level zonal winds prior to the International Geophysical Year in 1957 were infrequently observed (Kistler et al. 2001) and consequently are subject to larger errors than in more recent years. In addition, there is the potential that the relationship between the two predictors and ACE was somewhat different during the earlier period from 1948 to 1978 than during the more recent period from 1979 to 2012.
As was done in Klotzbach (2011), the model was tested against several no-skill metrics over the developmental period from 1979 to 2012 as well as the independent period from 1948 to 1978. Equations were rederived over the independent period, since the model based on equations developed over 1979–2012 significantly overestimated ACE during the earlier period, potentially resulting from underestimates in observed ACE in the earlier period. The no-skill metrics examined for this analysis were the 1979–2012 climatology and the previous 3-, 5-, and 10-yr means, the last of which is currently the recommended World Meteorological Organization (WMO) no-skill metric (WMO 2002). The same no-skill metrics were examined for 1948–78, except using the 1948–78 climatology. Table 1 displays a variety of skill metrics and compares the hindcasts over both periods with these metrics. The mean absolute error (MAE; the average of the absolute differences between observation and hindcast) and the mean squared error (MSE; the average of the squared differences between observation and hindcast) were evaluated. The model shows improved skill over all of the no-skill metrics for both time periods, although the improvement is considerably larger for 1979–2012.
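The error measures and the previous-N-yr-mean reference forecasts can be written compactly. The sketch below uses hypothetical function names; how years without a full N-yr history were handled is not stated in the text, so returning `None` for them is an assumption.

```python
from typing import List, Optional, Sequence


def mean_absolute_error(obs: Sequence[float], fcst: Sequence[float]) -> float:
    """Average of |observation - forecast|."""
    return sum(abs(o - f) for o, f in zip(obs, fcst)) / len(obs)


def mean_squared_error(obs: Sequence[float], fcst: Sequence[float]) -> float:
    """Average of (observation - forecast)^2."""
    return sum((o - f) ** 2 for o, f in zip(obs, fcst)) / len(obs)


def trailing_mean(series: Sequence[float], window: int) -> List[Optional[float]]:
    """No-skill forecast for each year: mean of the previous `window` years."""
    out: List[Optional[float]] = []
    for i in range(len(series)):
        if i < window:
            out.append(None)  # assumption: no forecast without a full window
        else:
            out.append(sum(series[i - window:i]) / window)
    return out


def percent_improvement(model_error: float, reference_error: float) -> float:
    """Percentage reduction in error relative to a no-skill reference."""
    return 100.0 * (1.0 - model_error / reference_error)
```

For example, a model MSE that is 51% of the reference MSE corresponds to a 49% improvement.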
Tropical Storm Risk (TSR; http://www.tropicalstormrisk.com) also issues real-time seasonal hurricane forecasts of Atlantic basin ACE from early July (Saunders and Lea 2013). Their forecasts improve upon the MSE of the previous 10-yr mean by 46% over the period from 1980 to 2012, using a replicated real-time forecast. Effectively, this uses equations developed on data from 1948 to 1979 to forecast 1980, from 1948 to 1980 to forecast 1981, etc. The forecast outlined here results in a 49% improvement in MSE over 1980–2012, indicating a slight improvement in skill over that documented by TSR.
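The "replicated real-time" procedure described above uses an expanding training window: each target year is forecast from equations fit only on earlier years. The sketch below shows that pattern with a single-predictor least squares fit for brevity. It is not TSR's model, whose details are not given here, and the function names are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def replicated_real_time(
    years: Sequence[int],
    x: Sequence[float],
    y: Sequence[float],
    first_target_year: int,
) -> Dict[int, float]:
    """Forecast each target year using only data from preceding years.

    Assumes `years` is sorted ascending and that at least two years with
    differing predictor values precede the first target year.
    """
    forecasts: Dict[int, float] = {}
    for i, year in enumerate(years):
        if year < first_target_year:
            continue
        intercept, slope = fit_line(x[:i], y[:i])  # training data end at year-1
        forecasts[year] = intercept + slope * x[i]
    return forecasts
```

Unlike leave-one-out cross validation, no information from the target year or later years enters the fit, which is closer to what a forecaster would have had available at the time.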
4. Physical relationships between predictors and Atlantic basin TC activity
Both of the predictors that are listed in Table 1 displayed significant correlations with ACE, both over the periods 1948–78 and 1979–2012, indicating that they likely modulate physical features during the peak of the TC season that drive fluctuations in Atlantic activity. This section examines each predictor’s correlations with large-scale fields during the peak of the Atlantic hurricane season from August to October.
a. Predictor 1: May–June 2mT (10°–50°N, 30°–10°W)
SSTs have been analyzed in a similar region for previous August seasonal forecast models (Klotzbach 2007, 2011). Figure 4 displays the correlation between predictor 1 and 2mT, U200, U850, and SLP during August–October. The strongest correlation between predictor 1 and 2mT is located directly over the Atlantic main development region (MDR; 7.5°–22.5°N, 75°–20°W; see Fig. 4a). The correlation between predictor 1 and the Atlantic MDR 2-m temperature averaged over August–October is 0.79, explaining over 60% of the variance. Smirnov and Vimont (2012) have shown that this should be expected, as SST anomalies tend to propagate equatorward and westward with time, due primarily to wind–evaporation–SST feedback mechanisms. Anomalously warm 2mT in the eastern tropical and subtropical Atlantic is also strongly correlated with anomalous upper-level easterlies (Fig. 4b), anomalous lower-level westerlies (Fig. 4c), and reduced SLP (Fig. 4d). These wind anomalies counteract the prevailing upper-level westerlies and lower-level easterlies that predominate in the tropical Atlantic, consequently resulting in weaker vertical wind shear, which has been shown in many studies to be favorable for an active Atlantic basin hurricane season (e.g., Gray 1984). Anomalously low SLPs throughout the tropical Atlantic have also been documented in previous research to be associated with active Atlantic hurricane seasons (Knaff 1997; Klotzbach 2007).
b. Predictor 2: May–June 200-mb zonal wind (10°S–5°N, 60°–90°E)
Upper-level easterly anomalies in the equatorial Indian Ocean during May–June are associated with an active onset of the Indian monsoon (Webster and Yang 1992). Anomalous easterly flow over the Indian Ocean tends to be associated with an active overall Indian–Asian monsoon system as well, which is typically experienced during La Niña events. Figure 5a displays the strong negative correlation between predictor 2 and eastern tropical Pacific 2mT, indicating that upper-level easterly anomalies in the equatorial Indian Ocean are associated with colder eastern tropical Pacific temperatures. Note that predictor 2’s values have been inverted for easy comparison with predictor 1. In addition, as would be expected given the correlation with ENSO, anomalous upper-level easterly flow is generally experienced across the Caribbean (e.g., Gray 1984) and the tropical Atlantic when predictor 2 is anomalously out of the east (Fig. 5b). Correlations between predictor 2 and the low-level flow along with SLP are mostly insignificant across the tropical Atlantic (Figs. 5c and 5d). The positive correlation with SLP across the tropical eastern Pacific and the negative correlation with SLP across the tropical western Pacific are to be expected, given the positive Southern Oscillation index that typically exists when La Niña conditions are present (Walker 1923).
In general, predictor 1 correlates very strongly with Atlantic MDR SSTs, which are critical for Atlantic TC development (Saunders and Lea 2008). Predictor 2 correlates very strongly with ENSO conditions, which through their teleconnected impacts on upper-level winds (Gray 1984) and upper-tropospheric temperature and column stability (Tang and Neelin 2004) also significantly impact Atlantic TC activity.
5. Conclusions and future work
This manuscript documents a first attempt by the TMP at issuing a 1 July seasonal forecast. By using a simple two-predictor model evaluating 2-m temperature in the eastern Atlantic along with 200-mb zonal wind over the tropical Indian Ocean, over 60% of the cross-validated variability in post–30 June Atlantic ACE can be explained over the period 1979–2012. The predictors also show robust correlations with large-scale physical features known to impact Atlantic TCs during the peak of the Atlantic hurricane season from August to October. Positive values of predictor 1 are closely coupled with local anomalous warming in the tropical Atlantic. Negative values of predictor 2 are shown to correlate strongly with La Niña conditions, which then impact the Atlantic through reductions in upper-level westerlies, thereby reducing the vertical wind shear.
DelSole and Shukla (2009) provided a criticism of the methodology utilized by earlier forecasts issued by the TMP. Their criticism was primarily focused upon the screening methodology that was utilized, and the overfitting of the forecast model. When predictors are selected based upon the full time series, cross validation does not provide an accurate view of what kind of skill can be expected in real-time prediction.
Since the analysis by DelSole and Shukla (2009) (which utilized data through the 2007 Atlantic hurricane seasonal forecast), the TMP has extensively revised all of its forecast models, taking into account many of the criticisms outlined in their manuscript. Earlier seasonal forecast models used 6–10 predictors, while new models use 2–4 predictors. In addition, predictors must continue to show significant correlations with the predictand (in this case ACE) over an independent period, to prevent the screening issues discussed in DelSole and Shukla (2009). In the prediction model outlined here, both predictors showed significant correlations over the 1948–78 time period. The strong physical linkages between each predictor and TC activity are also stressed, to ensure that each predictor is individually tied to hurricane activity through a physical mechanism. These more stringent predictor selection criteria have likely led to the improvements in real-time forecast skill that have been documented in the TMP’s real-time forecasts from 2008 to 2012 (real-time verifications available online at http://tropical.atmos.colostate.edu/Includes/Documents/Publications/forecast_verifications.xls).
In the future, the author plans to revise the early June and early April seasonal forecast schemes using ERA-Interim products. An interim forecast model issued in early May will also be constructed. It is hoped that using more recent period data and a more reliable reanalysis product will increase the hindcast, as well as real-time forecast, skill of all of the TMP’s seasonal predictions.
The author would like to acknowledge helpful discussions with William Gray and Eric Blake that significantly improved the manuscript. Comments provided by Chris Landsea and an additional anonymous reviewer also improved the paper significantly.