A simple linear discriminant analysis scheme using climatological predictors is derived for the Atlantic basin as a no-skill baseline for operational phase forecasts from the National Hurricane Center (NHC). When applied to independent data, the model correctly classifies 80% of the cases at 12 h, and this value decreases to about 45% by 60 h, remaining steady thereafter. For the same cases, NHC-issued phase predictions were correct more frequently than the baseline, so the NHC forecasts are said to have skill.
1. Introduction
The National Hurricane Center (NHC) provides tropical cyclone (TC) track, intensity, structure, and phase forecasts in the Atlantic and eastern Pacific basins once a TC has developed; it later provides postprocessed best estimates (the best track) of these parameters for verification. Specifically, the track forecast is the TC center latitude and longitude; the intensity forecast is for the maximum sustained surface (10 m) wind and gust speeds; the structure forecasts contain maximum gale-, storm-, and hurricane-force surface (10 m) wind speed radii in four quadrants. The phase forecasts now contain information as to whether the TC is expected to be tropical, extratropical,¹ subtropical, a remnant low or wave, or dissipated (including being absorbed into a larger, extratropical system or frontal zone). Though no-skill baselines to evaluate the track [Climatology and Persistence (CLIPER); Aberson (1998)], intensity [Statistical Hurricane Intensity Forecast model (SHIFOR); Knaff et al. (2003)], and structure (McAdie 2004; Knaff et al. 2007) forecasts have been derived, no equivalent for the phase is available. These baselines are not designed to be accurate, nor are they meant as guidance for operational forecasts, though they are sometimes used as such when forecasters believe their other guidance is unlikely to have skill; their purpose is to provide a baseline for assessing the skill of other forecast techniques and official forecasts. Since skill is defined relative to a simple statistical model based on climatology and persistence, a linear discriminant analysis scheme using the same climatological predictors as CLIPER and SHIFOR (to be consistent with those models) is derived for phase forecasts for the Atlantic basin.
Since track, intensity, and structure are numerical quantities, multiple linear regression techniques provide a simple framework for these baselines except in Knaff et al. (2007), which uses a statistical-parametric scheme. Since the phase is a nonnumeric classifier, and the no-skill baseline is defined as a simple statistical technique, linear discriminant analysis (Wilks 2006; Aberson 1997; Miller 1962) is used for the climatological (nonnumeric) phase scheme. The next section describes the training dataset (climatology) used in developing the model. The model itself, along with examples and an operational forecast skill assessment, is shown in section 3, followed by the conclusions.
2. Tropical cyclone phase change climatology
Best-track data from a recent 31-yr period (1980–2010) were chosen to train the discriminant analysis; this period is seen as a compromise between dataset quality and quantity due to the difficulty in assessing TC track, intensity, and phase before regular satellite monitoring began. The predictors are the same as those in CLIPER and SHIFOR (i.e., the current TC-center latitude and longitude, the current maximum sustained surface wind speed, changes in these three quantities during the previous 12 h, and the current Julian day). All initial times in which the seven predictors are available (starting 12 h after the first best-track data for each system) and in which the phase is tropical at the initial time are used, a total of 7043 cases.
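As an illustration of how the seven predictors could be assembled from two best-track fixes 12 h apart, a minimal Python sketch follows. The `Fix` record, its field names, the units, and the sample values are hypothetical; they are not taken from the best-track file format or from any real storm.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Fix:
    """One best-track fix (hypothetical record layout, for illustration only)."""
    time: datetime
    lat: float   # degrees north
    lon: float   # degrees east (negative in the Atlantic)
    vmax: float  # maximum sustained surface wind speed, kt (assumed unit)


def seven_predictors(now: Fix, prev: Fix) -> list[float]:
    """Return the CLIPER/SHIFOR-style predictors for one initial time.

    `prev` must be the fix 12 h before `now`.
    """
    hours = (now.time - prev.time).total_seconds() / 3600.0
    if hours != 12.0:
        raise ValueError("previous fix must be exactly 12 h earlier")
    day_of_year = now.time.timetuple().tm_yday  # "Julian day" of the initial time
    return [
        now.lat,
        now.lon,
        now.vmax,
        now.lat - prev.lat,    # 12-h change in latitude
        now.lon - prev.lon,    # 12-h change in longitude
        now.vmax - prev.vmax,  # 12-h change in intensity
        float(day_of_year),
    ]


# Made-up fixes, not from any real storm.
prev = Fix(datetime(2011, 8, 21, 0), 15.0, -59.0, 45.0)
now = Fix(datetime(2011, 8, 21, 12), 16.5, -61.5, 50.0)
print(seven_predictors(now, prev))
```

Requiring the earlier fix mirrors the statement that cases start 12 h after the first best-track entry for each system, since the three change predictors are undefined before then.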
Figure 1 shows the proportion of TC cases in each of the five possible phases in time. The proportion of cases that remains tropical decreases to about half by 84 h from the initial time, and the proportion that dissipates increases to more than one-third by that time. The proportions of cases that transition to extratropical cyclones or weaken to remnant lows or waves peak at 84 and 60 h, respectively, though they remain low.² Less than 1% of the cases transition to subtropical cyclones.
Figure 2 shows the mean group values of the seven predictors in time; values for subtropical cyclones are not shown because the number of cases is small. TCs that remain tropical or degenerate into remnant lows are farther southeast than those that dissipate or undergo extratropical transition (ET), since these systems tend to remain in the tropics during the 5-day period. TCs that undergo ET are moving through midlatitudes and recurving during that time. Those TCs that dissipate are also in midlatitudes, though not as far north as the transitioning systems, and farther west than the other groups, showing the possible influence of the American continent in causing dissipation. TCs that remain tropical or undergo ET are stronger than those that degenerate or dissipate, though those that remain tropical are, on average, intensifying whereas those that undergo ET are weakening. The dissipating and degenerating classes are generally weak or weakening systems. No large differences in the average time of year are seen in these groups.
3. Predictive discriminant analysis
Predictive discriminant analysis is a statistical technique used to find an optimal linear combination of predictors (discriminators) to separate a set of objects into multiple classes. The procedure provides posterior probabilities that the event resides within each possible classification, with the actual classification being the one with the highest posterior probability (Morrison 1969; Perrone and Lowe 1986; Mason and Mimmack 2002; Hennon and Hobgood 2003; Kerns and Zipser 2009; Kerns and Chen 2013). Best-track data from the 31-yr period 1980–2010 train the discriminant analysis, and the resulting discriminant functions are tested on an independent dataset from 2011. For 2011, operational values of the seven predictors are used instead of the postprocessed best-track values in order to mimic what would be provided in real time. The result is a set of no-skill phase predictions for the 19 Atlantic TCs identified operationally, initiated every 6 h and verifying every 12 h through 120 h, a total of 405 cases.
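The classification rule described above can be sketched in a few lines. The following is a minimal, standard-library Python illustration of linear discriminant analysis, assuming a pooled within-class covariance matrix and class priors equal to the training frequencies. It is not the IMSL routine used for this study, and the two-predictor training sample is invented purely to keep the example small.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_lda(samples, labels):
    """Return {class: (weights, constant)} for the linear discriminant functions."""
    classes = sorted(set(labels))
    p = len(samples[0])
    groups = {k: [x for x, y in zip(samples, labels) if y == k] for k in classes}
    means = {k: [sum(x[j] for x in g) / len(g) for j in range(p)]
             for k, g in groups.items()}
    # Pooled within-class covariance (assumed common to all classes).
    pooled = [[0.0] * p for _ in range(p)]
    for k, g in groups.items():
        for x in g:
            d = [x[j] - means[k][j] for j in range(p)]
            for i in range(p):
                for j in range(p):
                    pooled[i][j] += d[i] * d[j]
    dof = len(samples) - len(classes)
    pooled = [[v / dof for v in row] for row in pooled]
    model = {}
    for k in classes:
        prior = len(groups[k]) / len(samples)  # climatological frequency
        w = solve(pooled, means[k])            # inverse covariance times class mean
        c = -0.5 * sum(wi * mi for wi, mi in zip(w, means[k])) + math.log(prior)
        model[k] = (w, c)
    return model


def posteriors(model, x):
    """Posterior probability of each class for predictor vector x."""
    scores = {k: sum(wi * xi for wi, xi in zip(w, x)) + c
              for k, (w, c) in model.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    expd = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(expd.values())
    return {k: v / total for k, v in expd.items()}


def classify(model, x):
    """Assign the class with the highest posterior probability."""
    post = posteriors(model, x)
    return max(post, key=post.get)


# Invented two-predictor sample (latitude, wind speed); not real best-track data.
X = [[15, 60], [18, 70], [20, 65], [38, 50], [42, 45], [40, 55],
     [30, 25], [33, 20], [28, 30]]
y = ["tropical"] * 3 + ["extratropical"] * 3 + ["dissipated"] * 3
model = fit_lda(X, y)
print(classify(model, [17, 68]), posteriors(model, [17, 68]))
```

In a setting like the one described here, one such model would be fitted per verification time (12 h, 24 h, and so on), each using the same seven predictors at the initial time and the best-track phase at that lead time as the class label.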
a. Dependent data (1980–2010)
The descriptive discriminant analysis classifies the 7043 dependent cases at each forecast time, and the classifications are compared to the best track. The scheme is able to correctly classify almost all the cases from data 12 h earlier (Fig. 3), but the numbers decline to about ⅔ of the cases by 72 h before leveling off. Therefore, climatology alone is able to classify about ⅔ of the cases or more through 120 h. Since this is a baseline analogous to those of CLIPER and SHIFOR, the skill is, by definition, zero.
Example classification tables at 72 and 120 h (Table 1) show that remnant lows and waves, in particular, rarely occur and are even more rarely predicted. The likelihood that a particular case will be extratropical decreases in time, likely because the extratropical phase itself does not exist for long periods. The number of cases classified as tropical (dissipated) decreases (increases) in time, as expected.
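A classification table of the kind summarized in Table 1, and the fraction of correct classifications of the kind plotted in Fig. 3, can be tallied as sketched below. The phase labels and the short verification sample are invented for illustration and do not reproduce the values in Table 1.

```python
from collections import Counter

PHASES = ["tropical", "extratropical", "subtropical", "remnant low", "dissipated"]


def classification_table(observed, predicted):
    """Return nested lists: rows = observed phase, columns = predicted phase."""
    counts = Counter(zip(observed, predicted))
    return [[counts[(o, p)] for p in PHASES] for o in PHASES]


def fraction_correct(table):
    """Share of cases on the diagonal of the classification table."""
    total = sum(sum(row) for row in table)
    hits = sum(table[i][i] for i in range(len(table)))
    return hits / total


# Invented sample of eight verifying cases at one forecast time.
obs = ["tropical", "tropical", "tropical", "dissipated",
       "dissipated", "extratropical", "remnant low", "tropical"]
fcst = ["tropical", "tropical", "dissipated", "dissipated",
        "tropical", "extratropical", "dissipated", "tropical"]
table = classification_table(obs, fcst)
for phase, row in zip(PHASES, table):
    print(f"{phase:>13}: {row}")
print(f"fraction correct = {fraction_correct(table):.3f}")
```

Rows that are nearly empty, as for the subtropical and remnant low phases in this toy sample, correspond to the rarely observed and rarely predicted classes noted above.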
b. Independent data (2011)
The 2011 Atlantic season was unusual in that 4 of the 19 TCs identified operationally underwent ET, and 9 of the 19 degenerated into a remnant low before dissipation. Tropical Storm Lee was one of only a few TCs that ever transitioned to a subtropical cyclone before dissipation. Despite the unusual independent data, the predictive discriminant analysis is able to accurately forecast about 80% of the cases at 12 h, and this accuracy decreases in time to about 45% by 60 h before leveling off. Any forecasting technique must be able to perform better than this to be said to have skill.
Classifications from four of the 2011 TCs are examined in depth to show how the scheme works. Tropical Storm Arlene was a short-lived June TC in the Gulf of Mexico that moved inland soon after developing. Because of its location surrounded by land, it would climatologically be expected to dissipate after landfall, possibly with a period as an extratropical cyclone before dissipation. The climatological forecasts from the predictive discriminant analysis suggested that Arlene would remain a TC for 1–3 days longer than in reality, but they correctly forecast the lack of ET (Fig. 4). The probabilities for the case with the longest delayed dissipation (initial time 0600 UTC 30 June as highlighted in Fig. 4) show that the probability of either dissipation or ET rose through the forecast period, but remained below that for the tropical phase until 120 h. Though ET is climatologically possible for Gulf of Mexico systems, its probability leveled off below that for the tropical phase by 4 days into the forecast.
Tropical Storm Lee also developed in the Gulf of Mexico and soon moved inland, so it had similar climatological aspects to Arlene. Lee behaved differently by transforming to a subtropical cyclone before becoming extratropical and then dissipating. The predictive discriminant analysis was unable to correctly predict the transformation to a subtropical cyclone because this is a rare event, and it also did not predict ET (Fig. 5). The probabilities for the 0600 UTC 3 September case (not shown) are similar to those shown for the Arlene case. The probability that Lee would become subtropical was minuscule during this time, while the probability that Lee would undergo ET was relatively large, suggesting this as a strong possibility.
Hurricane Irene was a strong storm that originated from a tropical wave but did not develop until reaching the western Atlantic Ocean. The predictive discriminant analysis incorrectly forecast a quick transition to the extratropical phase and dissipation during the first day of the lifetime of Irene, though the probabilities that Irene would remain tropical suggested this as a possibility (Fig. 6). After Irene turned more to the west and intensified, the scheme correctly predicted that Irene would remain tropical and then transition to an extratropical cyclone. The predicted ET timing was within 2 days, but the scheme incorrectly predicted that Irene would remain an extratropical cyclone for a long time, when in fact Irene was absorbed by a large, powerful extratropical cyclone. The dissipation probability increased during this period, suggesting dissipation as a strong climatological possibility.
The predictive discriminant analysis also predicted an extratropical phase during the early lifetime of Katia, a long-lived, strong, Cape Verde system, as it intensified over the far eastern Atlantic Ocean (Fig. 7). These predictions were similar to those for Irene, though the probability of Katia remaining tropical was higher than that for Irene. The scheme correctly predicted a long period of Katia being a TC, then ET within about 2 days of its actual transition. The scheme at first correctly predicted dissipation as Katia was weakening after recurvature, but then predicted Katia to remain an extratropical cyclone for the forecast periods. The final two predictions incorrectly suggested that Katia would become subtropical over the North Atlantic, and the probabilities were overwhelmingly toward this solution with a slight chance of dissipation predicted, likely due to the high intensity of Katia as it reached high latitudes.
c. Comparison to operational (OFCL) forecasts
A forecast technique must perform better than a simple scheme based upon climatology in order to be considered skillful. All OFCL forecasts issued during the 2011 Atlantic season for systems that were identified as tropical in the best track at the start of the forecast are compared to those from this scheme, a total of 373 cases. The OFCL forecasts have a considerably higher percentage of correct classifications than do those from the climatological scheme (Fig. 3) and, thus, can be said to have skill. The Panofsky and Brier skill scores⁴ (Aberson 1997) are about 0.3 at all forecast times, and the classification tables are significantly different from chance at the 99% level using a chi-square test.
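The two verification measures mentioned above can be illustrated with a short Python sketch. The skill score is written here in the common form S = (C - E)/(T - E), where C is the number of correct forecasts, E the number correct for the reference, and T the total. Whether this matches the exact formulation behind the reported scores is an assumption, and all counts below are hypothetical rather than taken from the 2011 verification.

```python
def skill_score(correct, reference_correct, total):
    """Skill relative to a reference, assumed form (C - E) / (T - E).

    0 means no better than the reference; 1 means every forecast was correct.
    """
    return (correct - reference_correct) / (total - reference_correct)


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    Assumes every row and column total is nonzero.
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # count expected by chance
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, dof


# Hypothetical counts for one forecast time (not the values behind Fig. 3).
total = 100
ofcl_correct = 72
baseline_correct = 60
print(skill_score(ofcl_correct, baseline_correct, total))

# Hypothetical 2 x 2 table: rows = observed yes/no, columns = forecast yes/no.
table = [[40, 10], [15, 35]]
stat, dof = chi_square(table)
CRITICAL_99_1DOF = 6.635  # chi-square critical value for 1 degree of freedom, 99% level
print(stat, dof, stat > CRITICAL_99_1DOF)
```

A statistic above the critical value indicates that the agreement between forecast and observed categories is unlikely to have arisen by chance at the stated level.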
4. Conclusions
A simple statistical scheme based on climatology has been developed as a baseline to assess TC phase forecast skill in the Atlantic basin. The model, based on linear discriminant analysis, uses best-track data from 1980 to 2010 and correctly classifies 45%–80% of the cases depending on forecast time using independent data from the 2011 Atlantic hurricane season. During the same season, NHC issued correct phase predictions more frequently than the baseline at all forecast times, so their forecasts may be said to have skill.
In 2011, the statistical-dynamical Statistical Hurricane Intensity Prediction Scheme (SHIPS) and the Logistic Growth Equation Model (LGEM) began predicting TC phase (though they do not allow for the “remnant low” phase). The current procedure allows for an assessment of the skill of those forecasts. Though dynamical numerical models do not directly forecast TC phase (except dissipation), postprocessing of model fields using techniques such as the cyclone phase space (Hart 2003) can be used for this purpose. Comparison with these climatological forecasts can be used to assess their skill. Also, SHIFOR does not allow for dissipation. Since standard TC intensity forecast errors are quantified only when both the forecast and best-track intensities are available, the current technique can be used in addition to alternative techniques (such as described in Aberson 2008) to assess intensity forecast skill.
Acknowledgments
Thanks go to John Kaplan and Jason Dunion for comments on an earlier manuscript draft, and to Mike Jankulak for his expert editing. The routines used to perform the discriminant analysis are from Roguewave’s IMSL Libraries, and special thanks go to their support staff, especially Trudi Schweizer, for help in getting the routines to work properly.
Footnotes
¹ Both extratropical systems and remnant lows are now operationally classified by NHC as “posttropical.”
² Hart and Evans (2001) showed that ~47% of Atlantic TCs undergo extratropical transition. These statistics show that ~10% of individual cases undergo ET. The current sample includes cases in which the TC did not undergo ET within 120 h of the initial time but may have later. Thus, these two values are not contradictory.
⁴ This skill score only accounts for yes–no forecasts. No skill score that can evaluate multiple category forecasts has come to the attention of the author.