Catastrophic impacts associated with tropical cyclone (TC) activity mean that the accurate and timely provision of TC outlooks is important to people, places, and numerous sectors in Australia and beyond. In this study, we apply a Poisson regression statistical framework to predict TC counts in the Australian region (AR; 5°–40°S, 90°–160°E) and its four subregions. We test 10 unique covariate models, each using different representations of the influence of El Niño–Southern Oscillation (ENSO), Indian Ocean dipole (IOD), and southern annular mode (SAM), and use an automated covariate selection algorithm to select the optimum combination of predictors. The performance of preseason TC count outlooks generated between April and October for the AR TC season (November–April) and in-season TC count outlooks generated between November and January for the remaining AR TC season are tested. Results demonstrate that skillful TC count outlooks can be generated in April (i.e., 7 months prior to the start of the AR TC season), with Pearson correlation coefficient values between r = 0.59 and 0.78 and covariates explaining between 35% and 60% of the variance in TC counts. The dependence of models on indices representing Indian Ocean sea surface temperature highlights the importance of the Indian Ocean for TC occurrence in this region. Importantly, generating rolling monthly preseason and in-season outlooks for the AR TC season enables the continuous refinement of expected TC counts in a given season.
Globally, 80–90 tropical cyclones (TCs) on average occur across the tropical oceans each year (Emanuel 2003). However, there is a high degree of year-to-year variability in the number of TCs that occur, with Henderson-Sellers et al. (1998) estimating this variability to be ~10% globally and up to 40% for some regions. The Australian region (AR; 5°–40°S, 90°–160°E) is no exception with an average of 10.9 TCs per season, and a range of between 5 and 20 TCs occurring during the AR TC season (November–April) between 1980 and 2009 (Chand et al. 2019).
The temporal variability of TCs, and the uncertainties associated with TC outlooks, hinder preparedness and can amplify TC-related impacts and associated costs and damage. For example, in Australia, TCs accounted for the largest share of normalized insurance losses (29%) between 1966 and 2017, compared with other natural perils including hail (27%), flood (15%), bushfire (12%), storm (10%), and earthquake (5%) (McAneney et al. 2019). As such, the accurate and timely provision of TC outlooks for Australia is important for a range of people, industries, and services across the nation.
Early efforts to produce seasonal TC forecasts for the Australian region include those of Nicholls (1979), who generated a statistical forecast scheme to predict TCs using pressure anomalies over Darwin, Australia. This technique was subsequently updated to include sea surface temperature around the tropical Pacific Ocean and northern Australia as a predictor of TC activity (Nicholls 1985). Subsequent prediction schemes have since followed (Liu and Chan 2012; McDonnell and Holbrook 2004a,b; Nicholls 1992; Nicholls et al. 1998; Solow and Nicholls 1990; Werner and Holbrook 2011; Wijnands et al. 2015), all of which use an index or indices representing El Niño–Southern Oscillation (ENSO) as model predictors. The well-established ENSO–TC relationship has been comprehensively investigated for the AR (Basher and Zheng 1995; Chand et al. 2013; Dowdy and Kuleshov 2012; Evans and Allan 1992; Goebbert and Leslie 2010; Kuleshov et al. 2008a; Ramsay et al. 2008; Dare and Davidson 2004). During El Niño conditions, the eastward migration of the favorable conditions required for TC genesis (Gray 1975) results in a systematic shift of TCs toward the northeast, typically resulting in fewer TCs occurring in the AR. During La Niña, enhanced TC activity is observed around eastern Australia and off the northwest coast of Australia (Kuleshov and de Hoedt 2003). The northeast/southwest shifts in TC activity during El Niño/La Niña phases (e.g., Magee et al. 2017) make ENSO an important predictor of TC activity in the AR.
Since approximately 1998, a decline in the predictive skill of ENSO-derived AR TC forecasts has been reported (Dowdy 2014). Ramsay et al. (2017) found that including Indian Ocean SST variability alongside predictors of ENSO can assist in mitigating any loss in the predictive skill of ENSO-driven TC forecasts since 1998. Other studies have also investigated the role of Indian Ocean SST variability on TCs in the AR (Liu and Chan 2012; Saha and Wasimi 2013) and areas of the overlapping southwestern Pacific (SWP) basin (135°E–120°W; Magee and Verdon-Kidd 2018). For the AR and subregions, Wijnands et al. (2015) found that the dipole mode index (DMI) was the index most frequently used in support vector regression models, when considered among a host of other indices, including eight ENSO indices. This is also consistent with Magee et al. (2020), in which indices representing Indian Ocean SST variability were most frequently selected for predicting TC counts for island and regional-scale TC outlooks across the SWP region.
Other climate influences are also known to affect TC activity in the Australian region, including the Madden–Julian oscillation (MJO; Hall et al. 2001; Lavender and Dowdy 2016) and the interdecadal Pacific oscillation (IPO; Grant and Walsh 2001). One climate influence that has not been explored for the AR, in terms of its influence on TC behavior, but has been for the neighboring (and partly overlapping) SWP, is the southern annular mode (SAM). Diamond and Renwick (2015) found that, through a synergy between SAM and ENSO, an increased number of SWP TCs undergo extratropical transition and reach farther south during positive SAM and La Niña conditions, which may have implications for the east coast of the AR, where exposure is high. Magee et al. (2020) demonstrated that SAM was an important contributor to improving the skill of TC outlooks in the SWP. SAM has previously been shown to be associated with seasonal hydroclimatic variability across Australia (Hendon et al. 2007; Gillett et al. 2006; Risbey et al. 2009; Kiem and Verdon-Kidd 2009; Gallant et al. 2012), which suggests SAM should also be considered for inclusion in TC prediction schemes for the AR.
The Australian Bureau of Meteorology (BoM) produces a statistically driven operational TC outlook for the AR and its subregions, including the eastern region (AR-E; 5°–40°S, 142.5°–160°E), the northern region (AR-N; 5°–40°S, 125°–142.5°E), the northwest subregion (AR-NW; 5°–25°S, 105°–130°E), and the western region (AR-W; 5°–40°S, 90°–125°E). Outlooks are derived in October for the coming November–April TC season using two linear discriminant models that are based on the July–September average of the Southern Oscillation index (SOI) and Niño-3.4 index. Operational outlooks produced by the BoM provide a lead time of less than 1 month before the official start of the AR TC season but provide essential information for decision-makers, stakeholders, and a host of other industries in TC impacted areas.
While previous studies have investigated methods to improve the statistical skill of TC forecasts in the AR (Werner and Holbrook 2011; Liu and Chan 2012; Wijnands et al. 2014, 2015, 2016; Ramsay et al. 2017), few investigate how the dynamics associated with co-occurring Pacific, Indian, and Southern Ocean climate variability [i.e., ENSO, Indian Ocean dipole (IOD), and SAM] can be used in a predictive modeling framework, or how increasing outlook lead time can impact the predictive skill of TC outlooks. In this study, we train and validate multivariate Poisson regression models to predict TCs in the AR and its four subregions. We evaluate different indicators of ENSO, IOD, and SAM and utilize an automated model selection algorithm to determine the most statistically significant combination of predictors. We also explore whether it is possible to generate skillful TC outlooks up to 6 months ahead (from April) of when current TC outlooks are generated (October) for the November–April TC season and investigate the usefulness of producing rolling monthly, in-season TC outlooks to refine predictions for the latter and most active half (e.g., Chand et al. 2019) of the Australian TC season.
2. Data and methods
a. TC data and study area
TC data are accessed from the Australian Best-Track Tropical Cyclone Database that is managed and maintained by the BoM (BoM 2020a). This dataset contains historical best tracks for TC events south of the equator between 90° and 160°E from 1906 to present. Only TCs with a minimum central pressure of 995 hPa or less that occur within the official AR TC season (November–April) between 1970 and 2019 are considered in this analysis (Nicholls et al. 1998; Kuleshov et al. 2008b, 2012; Wijnands et al. 2014). Given that the Australian TC season straddles the Gregorian calendar year, events that occur during November and December are considered to be part of the following year's TC season (see Dowdy 2014). For example, TCs that occur in November or December 2018 are considered to be part of the 2019 TC season. A total of 50 TC seasons (1970–2019) are analyzed in this study.
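The season-labeling rule described above can be expressed as a small helper, where the season label is the calendar year in which the season ends. This is an illustrative sketch; the function name and interface are ours, not part of the BoM dataset:

```python
def tc_season(year: int, month: int) -> int:
    """Return the TC season label (the year in which the season ends)
    for an event occurring in the given calendar year and month.

    November and December events belong to the *following* year's
    season; January-April events belong to the current year's season.
    """
    if month not in (11, 12, 1, 2, 3, 4):
        raise ValueError("month is outside the official November-April TC season")
    return year + 1 if month >= 11 else year

# Example from the text: a TC in November or December 2018 belongs to
# the 2019 season, as does a TC in March 2019.
print(tc_season(2018, 12))  # -> 2019
print(tc_season(2019, 3))   # -> 2019
```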
TC outlooks are derived for each of the five Australian TC outlook regions (see Fig. 1; BoM 2020b) including AR (5°–40°S, 90°–160°E), AR-E (5°–40°S, 142.5°–160°E), AR-N (5°–40°S, 125°–142.5°E), AR-NW (5°–25°S, 105°–130°E), and AR-W (5°–40°S, 90°–125°E).
b. Model covariates used for TC outlooks
Ten predictor models are considered in this analysis. Oceanic (e.g., Niño-3.4), atmospheric (e.g., SOI), and coupled oceanic–atmospheric [e.g., coupled ENSO index (CEI)] ENSO indices are included (see Table 1 for a detailed list of model covariates and Fig. 2 for a diagrammatic summary). Although numerous ENSO indices exist, there is no consensus on which one best captures the ENSO phenomenon (Hanley et al. 2003) or on which method should be used to classify ENSO events (Kiem and Franks 2001). To avoid multicollinearity between selected indices, which would be detrimental to model performance (O'Brien 2007; Villarini et al. 2012), each predictor model contains one unique ENSO index, coupled with the following indices: SAM, IOD east (E), IOD west (W), and DMI (see Table 2).
Each predictor model contains monthly covariate values that vary depending on the model initialization month: six monthly lags of each covariate are included per model, with the combination of lags determined by the month in which the model is initialized. Ten model initialization months are considered, including seven "preseason" models, initialized on a monthly basis between April and October (for the coming November–April TC season), and three "in season" models that are initialized in November, December, and January, providing predictions for the remaining December–April, January–April, and February–April TC seasons, respectively.
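The lag structure can be illustrated as follows: for a given initialization month, the model draws on the six most recent complete months of each covariate. The helper and the monthly index values below are invented for illustration, and the assumption that the initialization month itself is excluded is ours:

```python
def six_monthly_lags(series, init_index):
    """Return the six monthly covariate values immediately preceding
    the initialization month.

    series     -- chronologically ordered monthly values
    init_index -- index of the initialization month within `series`
    """
    if init_index < 6:
        raise ValueError("need at least six months of prior data")
    return series[init_index - 6:init_index]

# Hypothetical monthly index values, January-December of one year:
nino34 = [0.1, 0.3, 0.5, 0.6, 0.8, 0.9, 1.0, 1.1, 0.9, 0.7, 0.5, 0.2]

# An outlook initialized in October (index 9) uses the April-September values:
print(six_monthly_lags(nino34, 9))  # -> [0.6, 0.8, 0.9, 1.0, 1.1, 0.9]
```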
c. Model development and variable selection
For rare, discrete events, such as TC counts (Wilks 2011), Poisson regression has been shown to be an effective methodology to model expected TC counts using one or more geophysical variables (predictors) (Elsner and Schmertmann 1993; McDonnell and Holbrook 2004a,b; Sabbatelli and Mann 2007; Magee and Verdon-Kidd 2018; Magee et al. 2020). Consistent with these studies, we use a model based on Poisson regression, in which the probability of observing n TCs in season i is

P(N_i = n) = \frac{e^{-\mu_i} \mu_i^n}{n!}, \quad n = 0, 1, 2, \ldots, \quad (1)

with the expected count linked to the covariates through the log-linear relation

\ln(\mu_i) = \beta_0 + \sum_j \beta_j x_{ij}. \quad (2)

Here, μi is the expected number of TC counts with covariate values xij for the j predictors on the ith observation; βj refers to the regression coefficient for each covariate, and β0 refers to the intercept.
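Numerically, the expected count and the associated Poisson probabilities follow directly from the log link. A minimal sketch, using invented coefficients rather than any fitted in the paper:

```python
import math

def expected_count(beta0, betas, x):
    """Expected TC count mu = exp(beta0 + sum_j beta_j * x_j)."""
    return math.exp(beta0 + sum(b * xi for b, xi in zip(betas, x)))

def poisson_pmf(k, mu):
    """Probability of observing exactly k TCs given expected count mu."""
    return mu ** k * math.exp(-mu) / math.factorial(k)

# Hypothetical coefficients for two covariates (say, an ENSO index and the DMI):
beta0, betas = 2.3, [-0.25, -0.10]

# A season with a warm ENSO anomaly (1.2) and a positive IOD anomaly (0.4):
mu = expected_count(beta0, betas, [1.2, 0.4])
print(round(mu, 2))                   # -> 7.1 (expected seasonal TC count)
print(round(poisson_pmf(10, mu), 3))  # probability of observing exactly 10 TCs
```

Note that negative regression coefficients on warm-ENSO covariates would reduce the expected count, consistent with the suppressed AR TC activity during El Niño described in the introduction.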
Using the entire predictor model dataset (each model containing five covariates with six monthly lags, i.e., 30 covariates in total), the stepAIC (Akaike information criterion) R function (MASS package; Ripley et al. 2020) was used to perform a backward and forward stepwise search in order to select the most appropriate combination of covariates for a given scenario (outlook region and model initialization period). StepAIC uses the AIC (Burnham and Anderson 2004) as a selection criterion for determining when the variable elimination procedure should stop. Using the selected predictors from this stage, a predicted TC time series is derived.
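The selection procedure can be sketched in pure Python: fit a Poisson regression for each candidate covariate subset (here by simple gradient ascent rather than R's `glm`/`stepAIC` machinery) and drop covariates while doing so lowers the AIC. The toy data, backward-only search, and convergence settings are our illustrative assumptions, not the paper's implementation:

```python
import math

def fit_poisson(X, y, lr=0.01, iters=6000):
    """Fit ln(mu) = b0 + sum_j b_j * x_j by gradient ascent on the
    Poisson log-likelihood. Returns (intercept, coefficients, loglik)."""
    p = len(X[0])
    beta0, beta = 0.0, [0.0] * p
    mus = [1.0] * len(y)
    for _ in range(iters):
        mus = [math.exp(beta0 + sum(b * xi for b, xi in zip(beta, row)))
               for row in X]
        resid = [yi - mi for yi, mi in zip(y, mus)]   # score uses (y - mu)
        beta0 += lr * sum(resid)
        for j in range(p):
            beta[j] += lr * sum(r * row[j] for r, row in zip(resid, X))
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
                 for yi, mi in zip(y, mus))
    return beta0, beta, loglik

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def backward_stepwise(data, y):
    """Backward elimination: repeatedly drop the covariate whose removal
    lowers the AIC, stopping when no removal helps."""
    names = list(data)
    def model_aic(subset):
        X = [[data[nm][i] for nm in subset] for i in range(len(y))]
        _, _, ll = fit_poisson(X, y)
        return aic(ll, len(subset) + 1)  # +1 for the intercept
    current = model_aic(names)
    improved = True
    while improved and names:
        improved = False
        for nm in list(names):
            trial = [m for m in names if m != nm]
            a = model_aic(trial)
            if a < current:
                current, names, improved = a, trial, True
                break
    return names

# Toy data: counts driven by x1 only; x2 is balanced, uninformative noise.
x1 = [-1.0, -0.5, 0.0, 0.5, 1.0] * 2
x2 = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y = [1, 2, 3, 4, 6] * 2  # roughly round(exp(1 + 0.8 * x1))
selected = backward_stepwise({"x1": x1, "x2": x2}, y)
print(selected)  # -> ['x1']  (the AIC penalty removes the noise covariate)
```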
The performance of each prediction is evaluated using the following metrics:
Pearson correlation coefficient (r; predicted TCs vs observed TCs),
coefficient of determination R2,
root-mean-square error (RMSE),
strike rate: exact (SR-E), the percentage of seasons for which the prediction matches the observation,
strike rate ± 1 (SR ± 1), the percentage of seasons for which the prediction is ±1 from the observation,
finite-sample corrected Akaike information criterion (AICc; Burnham and Anderson 2004), a measure to assess model overfitting and used to estimate the quality of a model relative to another, and
skill score (SS; as per Roebber and Bosart 1996).
Skill score is defined as

SS = 1 - \frac{\mathrm{MSE}}{\mathrm{MSE}_c}, \quad (3)

where the mean-square error of the predictions is

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (f_i - O_i)^2, \quad (4)

and f_i and O_i are the ith prediction and observation, respectively. Similarly, MSE_c is defined by the substitution of climatological values for the predictions f_i in Eq. (4), such that

\mathrm{MSE}_c = \frac{1}{n} \sum_{i=1}^{n} (\bar{f} - O_i)^2, \quad (5)

where the climatological prediction is defined by

\bar{f} = \frac{1}{n} \sum_{i=1}^{n} O_i. \quad (6)
An SS of 1.0 (100%) indicates a perfect prediction, 0.0 (0%) indicates a prediction as good as climatology, and negative values indicate predictions that are less accurate than climatology (Roebber and Bosart 1996). In this analysis, a total of 500 model runs were performed (10 model initialization months, 10 predictor models, and 5 forecast regions, including the AR, AR-E, AR-N, AR-NW, and AR-W).
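The verification metrics above can be computed directly. A short sketch, where the five seasons of predicted and observed counts are invented for illustration:

```python
def verification_metrics(pred, obs):
    """Compute MSE, climatological MSE, skill score (SS), exact strike
    rate (SR-E), and strike rate within +/-1 TC (SR+/-1)."""
    n = len(obs)
    clim = sum(obs) / n                      # climatological prediction
    mse = sum((f - o) ** 2 for f, o in zip(pred, obs)) / n
    mse_c = sum((clim - o) ** 2 for o in obs) / n
    ss = 1.0 - mse / mse_c                   # 1 = perfect, 0 = climatology
    sr_e = sum(f == o for f, o in zip(pred, obs)) / n * 100
    sr_1 = sum(abs(f - o) <= 1 for f, o in zip(pred, obs)) / n * 100
    return {"MSE": mse, "MSE_c": mse_c, "SS": ss, "SR-E": sr_e, "SR+/-1": sr_1}

# Five invented seasons of observed vs (rounded) predicted TC counts:
obs = [10, 12, 8, 15, 9]
pred = [11, 12, 9, 13, 9]
m = verification_metrics(pred, obs)
print(m)  # SS is about 0.81, SR-E = 40.0, SR+/-1 = 80.0
```

A negative SS would mean the model loses to simply forecasting the long-term mean, which is why SS is a natural headline metric for comparing predictor models.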
d. Cross validation: Evaluating outlook skill
To cross validate each model, we train models on the first 40 TC seasons (1970–2009) and evaluate outlook skill by validating model performance over the most recent 10 TC seasons (2010–19). This method of cross validation is preferable in this case over other methods, such as leave-one-out cross validation (LOOCV), as it enables validation over longer continuous time periods, which is particularly important for time series with a significant linear trend (Magee et al. 2020). For the 1970–2009 training period, the seven performance metrics (as outlined in section 2c) are calculated to enable comparison with models that have been trained over the entire 50 TC season (1970–2019) time period. For the validation period (2010–19), only SR-E and SR ± 1 are calculated, because n = 10 is insufficient to calculate the other performance metrics outlined in section 2c.
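The split can be sketched as follows: rather than leave-one-out folds, the most recent 10 seasons are held out as one contiguous block. The helper is illustrative; season labels follow the paper's 1970–2019 convention:

```python
def contiguous_split(seasons, n_validation=10):
    """Split a chronologically ordered list of season labels into a
    training block and a single contiguous validation block at the end."""
    if n_validation >= len(seasons):
        raise ValueError("validation block must be shorter than the record")
    return seasons[:-n_validation], seasons[-n_validation:]

seasons = list(range(1970, 2020))                # 50 TC seasons, 1970-2019
train, validate = contiguous_split(seasons)
print(len(train), train[0], train[-1])           # -> 40 1970 2009
print(len(validate), validate[0], validate[-1])  # -> 10 2010 2019
```

Keeping the holdout contiguous means any linear trend in the series must be extrapolated by the model rather than leaked into the training folds, which is the rationale given above for preferring this scheme over LOOCV.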
3. Results

a. Predictor model performance for October-initiated outlooks
Comparing predictor model skill shows substantial variability between each of the 10 models tested (see Table 3) and outlook regions (Fig. 3). In Fig. 3, model statistics are summarized in the left panels for models trained on the entire 1970–2019 period and in the right panels for models trained/calibrated on the first 40 TC seasons (1970–2009) and validated on the most recent 10 TC seasons (2010–19).
For models trained on the entire 1970–2019 period (left panels in Fig. 3), models demonstrate good performance in predicting TC counts for the AR and its subregions. For the AR (μ = 10.6; σ = 3.6), TC outlooks are correlated with observations between r = 0.6 and 0.81, have an SS between 49% and 64%, and an SR-E between 10% and 26%. An SR-E of 26% means that predictions match the observations for 13 of 50 seasons, and an SR ± 1 of 58% means that predictions match the observations (±1) for 29 of 50 seasons.
Models are also found to perform well for other subregional outlook regions. For example, in the AR-E region (μ = 3.5; σ = 1.9) an SR-E of up to 40% suggests that covariate model 2 (see Table 2) correctly predicts TC counts for 20 of the 50 TC seasons. Further, an SR ± 1 of 70% for the same model suggests that the prediction was correct (±1 TC) for 35 of the 50 TC seasons. For the AR-N (μ = 3.8; σ = 1.6), models were able to successfully predict TC counts for 17 of 50 TC seasons (SR-E = 34%) and ±1 TCs for 40 of 50 TC seasons (SR ± 1 = 80%). For other regions, SR-E (SR ± 1) values of 24% (58%) for AR-NW (μ = 5.1; σ = 2.2), and 30% (66%) for AR-W (μ = 6.5, σ = 2.4) also indicate that models perform well when validated on the entire time series.
Evaluating cross-validation statistics (Fig. 3, right panels) indicates comparable model performance between the full calibration period (1970–2019) (Fig. 3, left panels) and the shorter 1970–2009 calibration period. Validating model performance on the most recent TC seasons (2010–19), AR-E and AR-N region models have an SR-E of 50%, meaning models are able to successfully predict TCs for 5 of 10 seasons, increasing to up to 90% for SR ± 1 (AR-N). For other regions, validation SR-E (SR ± 1) statistics of 30% (70%) for the AR and 40% (80%) for AR-NW and AR-W also indicate that models retain useful skill when trained on the first 40 seasons and validated on the remaining 10 seasons.
Selecting a superior model
Given 10 predictor models are assessed in this study, a thorough comparison is required to define the superior model. To reduce the number of models and to compare models for further analysis, we use the following two methods:
In method A, models with the highest SS when trained on the entire 1970–2019 period (Fig. 3, left panels) are selected.
In method B, models with the highest SR-E for the 2010–19 validation period (Fig. 3, right panels) are selected. Where more than one model has the highest SR-E, the following conditions are considered: (i) models with the highest SR ± 1 (2010–19 validation) and (ii) model with the highest SS (1970–2009 calibration/training period). The second condition for method B is only applied if two models share the same SR ± 1 from the first condition.
By applying these selection criteria, different models are selected for the AR, AR-NW, and AR-W. For AR-E and AR-N, the same model is selected under both methods, as it happens to fulfill both criteria.
For the AR, the model identified using method A (model 1) has an SR-E (SR ± 1) of 22.5% (57.5%) for the 1970–2009 calibration period (40 TC seasons), successfully predicting 9 (23) TC seasons (Fig. 4a, red line). This is better than the model identified using method B (model 3; blue line), selected for its superior SR-E performance over the 2010–19 validation period, which has an SR-E (SR ± 1) of 20% (45%) for the 1970–2009 calibration period, successfully predicting 8 (18) TC seasons. Comparing SR-E and SR ± 1 for other regions during the 1970–2009 calibration period suggests that, for AR-NW (Fig. 4j) and AR-W (Fig. 4i), SR-E for method A (red lines) is higher (25% for both regions) than for the method B models (blue lines; 20% and 12.5%, respectively). For AR-E (Fig. 4c) and AR-N (Fig. 4e), the models identified using methods A and B are the same.
Not surprisingly, for the 2010–19 validation period, models selected using method B are found to perform better than models identified using method A in terms of strike-rate statistics. For the AR, method A has an SR-E (SR ± 1) of 10% (50%) versus 30% (70%) for method B. For the AR-NW, method A has an SR-E (SR ± 1) of 30% (50%) versus 40% (80%) for method B. For the AR-W, differences between the two models are most apparent. Using method A, an SR-E of 0% means that, for the 10 TC seasons considered in the validation period, not one prediction matched the observation, as compared with SR-E = 30% for method B. However, method A has a higher SR ± 1 (70%) relative to method B (60%).
Models identified using either method are useful in predicting TC counts. However, models using method A (highest SS for entire 1970–2019 period) were selected for the following reasons:
It is important to have a model that performs well across the entire time series, not just the most recent period. For all regions, a downward trend in TC counts is observed, up to −1.04 TCs decade−1 for the AR (Fig. 4). Evaluating model skill/performance on the entire time series assesses a model’s ability to reproduce this downward trend in TC counts.
Use of the first condition associated with method B does not consider overall model performance, and if, by chance, SR-E was high for a given model, there is a possibility that the model would not perform well over the entire time series.
In the case of the 1970–2009 calibration period, method A better captures extremes. For the AR, this is particularly evident for relatively inactive TC seasons, where method B tends to overestimate TC counts. A comparison of standard deviations for the 1970–2009 period indicates that method A (σ = 3.1) better captures the variability of observed TCs (σ = 3.7) than method B (σ = 2.9). For AR-NW/AR-W, method A (σ = 1.4/1.8) also better captures the variability of observed TCs (σ = 2.3/2.4) than method B (σ = 1.1/1.8).
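The downward trend cited in the first reason can be estimated by an ordinary least-squares slope over the seasonal counts. A brief sketch, where the ten seasons of counts are invented (not the observed AR record, whose trend the text gives as up to −1.04 TCs per decade):

```python
def trend_per_decade(years, counts):
    """Least-squares slope of annual TC counts, expressed per decade."""
    n = len(years)
    my = sum(years) / n
    mc = sum(counts) / n
    slope = (sum((x - my) * (y - mc) for x, y in zip(years, counts))
             / sum((x - my) ** 2 for x in years))
    return slope * 10  # convert TCs per year to TCs per decade

# Invented counts with a gentle decline:
years = list(range(1970, 1980))
counts = [12, 13, 12, 11, 12, 11, 11, 12, 11, 11]
print(round(trend_per_decade(years, counts), 2))  # -> -1.45
```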
Using method A, predicted TC counts from models trained on the entire 1970–2019 period are compared with observations for the five regions considered in this analysis (Fig. 4; right panels). Confidence intervals (5%–95%) indicate that, for nearly all seasons, observations fall within these bounds. Summary statistics for these outlook models are provided in Table 3. For the AR, the selected covariates from model 1 (Table 2) explain 65% of the variance in seasonal TC frequency, successfully predicting TC counts for 9 of 50 seasons (SR-E = 18%) and ±1 TCs for 27 of 50 seasons (SR ± 1 = 54%). For all other regions, selected covariates explain between 34% and 52% of the variance in seasonal TC frequency. Investigation of model undercount and overcount does not suggest a bias toward consistently underestimating or overestimating TC counts in a given region (Fig. 5), nor any bias toward underestimating/overestimating TC counts according to the phase of ENSO (when categorized according to Niño-3.4).
b. Evaluating predictor model performance for rolling monthly preseason TC outlooks
Producing TC outlooks in October provides limited lead time for those who depend on TC outlooks for planning and decision-making. To establish how model skill varies with lead time, Fig. 6 compares model performance metrics (for models trained between 1970 and 2019) for initialization months between April and October. For each model initialization month, the same model fitting and evaluation process is conducted (as outlined in Fig. 3 and section 3a). As such, the predictor model used to generate the outlook may differ between months. Summary statistics for each region and month of model initialization, as well as model cross-validation statistics, are summarized in the online supplemental material.
Using SS as a metric to determine which month provides the most skillful outlook, Fig. 6 shows that the most skillful outlooks for the AR (70.2%), AR-E (57.4%), and AR-W (61%) are initiated in July. For the AR-NW, the most skillful outlook is produced in June (54.9%), while for the AR-N, optimum performance occurs in August or September (57.9%). Relative to outlooks initiated in October, improvements in SS of between 5.33% (AR) and up to 20.57% (AR-NW) are achieved by initializing outlooks up to five months before the start of the AR TC season. Strike-rate performance (SR-E and SR ± 1) also suggests that generating outlooks before October can still provide useful information for the coming TC season. In fact, for four of five regions (AR, AR-N, AR-NW, and AR-W), higher SR-E statistics are observed for outlooks initiated between April and September than for those initiated in October. For example, for AR outlooks, the highest SR-E is achieved for outlooks initiated in June (26%; 13 in 50 seasons) as compared with October (18%; 9 in 50 seasons). Other examples include the AR-N, where SR-E in June is 38% (19 in 50 seasons) compared with 32% (16 in 50 seasons) for October-initiated outlooks. AR-E is the only region where the SR-E for October-initiated outlooks (40%; 20 in 50 seasons) is higher than in the preceding six months (maximum SR-E of 36% for April-initiated outlooks; 18 in 50 seasons). Analysis of cross-validation statistics, specifically for the 2010–19 validation period, shows that outlooks generated earlier than October also provide useful and skillful forecast information (see Fig. S1 in the online supplemental material).
The relationship between observed TCs conditional on the obtained predictions is summarized for each region and lead time for November–April TC predictions in Fig. 7. If the linear regression line (black line) is not close to the 1:1 (y = x) line, this is indicative of forecast bias. For all outlook regions and initialization months, the two lines lie in close proximity, indicating no obvious overall forecast bias. However, in scenarios where models underpredict TC counts, particularly for active TC seasons (points above the 1:1 line toward the upper right), models are more likely to underpredict particularly active TC seasons than to overpredict them (also observed in Fig. 4). For the AR, this is particularly obvious: models consistently underestimated counts in up to four TC seasons with observed TC counts ≥16, over a number of outlook lead times. For the AR-N (TCs ≥ 6) and AR-NW (TCs ≥ 8), underestimation across a number of outlook lead times is also observed. Given that each monthly outlook is retrained with a different combination of monthly predictor models, the persistent occurrence of this underestimation suggests a forecast bias may be present for particularly active TC seasons.
To cross validate the impact of outlook lead time on model performance, we compare observed TC counts with predicted TCs (Fig. 8). Two predictions for the 2010–19 period are compared: (i) those using the full 1970–2019 period as a training set, with 5%–95% confidence intervals applied (red line); and (ii) those using the 1970–2009 training set, where the remaining 2010–19 period is the validation set (blue dots). For all cases (350 unique predictions), TC counts from the 2010–19 validation set fall within the 5%–95% confidence intervals of the predicted TCs from the full training set. Moreover, the differences between the two predicted TC time series are not affected by increasing outlook lead time. Figure 8 confirms that the methods used to derive a statistical TC outlook for Australia and its subregions, for lead times up to six months earlier than current operational outlooks, are not sensitive to the choice of training period.
In summary, models perform well and demonstrate considerable skill when predicting November–April TC counts with increasing lead times. The implications of increased lead times and the potential of this outlook are discussed in section 4.
c. Evaluating the performance of in-season TC outlooks
While it is useful to predict seasonal TC counts before the start of the TC season, in-season outlooks can provide further insight into how many TCs may occur before the end of the TC season. Also, given that the second half of the Australian TC season is more active than the first half (Chand et al. 2019), this method provides an opportunity to refine TC outlooks for the remaining TC season.
In this analysis, outlooks are initiated in November, December, and January for the remaining December–April, January–April, and February–April TC season, respectively. Figure 9 compares model skill for outlooks initiated between October and January. Note that the statistics for October are the same as in Fig. 6 but provide an important reference for how skill varies according to model initiation month for in-season TC outlooks. With time, in-season model SS statistics generally decrease because sample sizes and the associated variance decrease. For AR-N and AR-NW, small increases in SS are observed in December (for January–April outlooks) but swiftly decline by January, likely a result of the reduced sample size associated with the shortened TC season for which outlooks are generated. Also, for every region, the strike-rate statistics SR-E and SR ± 1 are higher in January than in October. For the AR, modest increases in SR-E between October (18%; 9 in 50 seasons) and January (22%; 11 in 50 seasons) are observed, while a more substantial increase in SR-E is observed for the subregion with the smallest October SR-E: 18% (9 in 50 seasons) versus 38% (19 in 50 seasons). Large improvements in SR ± 1 are also observed for all regions between October and January. Cross-validation statistics for in-season outlooks are summarized in Fig. S2 and Tables S1–S5 of the online supplemental material.
d. Analysis of covariate selection for preseason and in-season TC outlooks
The stepAIC function selected, on average, seven covariates for models initialized between April and January. Per model run, 30 covariates (six monthly lags of five model covariates; see Table 1) were available for selection. For all models, AICc is calculated for both the fitted model and an intercept-only model, and the comparison demonstrates a substantial improvement in model quality (i.e., the risk of model overfitting is minimized in this analysis; see Table 3 and Tables S1–S5 in the online supplemental material). While some models may be more complex and less parsimonious than others, differences in AICc of ≥10 between the intercept-only and fitted models suggest models are not overfitted.
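The AICc comparison between a fitted model and an intercept-only model can be written down directly from the Burnham and Anderson (2004) correction. The log-likelihood values below are invented for illustration, not taken from the paper's models:

```python
def aicc(loglik, k, n):
    """Finite-sample corrected AIC:
    AICc = AIC + 2k(k+1)/(n - k - 1), with AIC = 2k - 2*loglik,
    where k counts estimated parameters and n is the sample size."""
    aic = 2 * k - 2 * loglik
    return aic + (2 * k * (k + 1)) / (n - k - 1)

n = 50  # TC seasons in the record

# Hypothetical log-likelihoods for a fitted 8-parameter model
# (intercept plus seven selected covariates) and an intercept-only model:
fitted = aicc(loglik=-110.0, k=8, n=n)
intercept_only = aicc(loglik=-135.0, k=1, n=n)

# A large AICc improvement over the intercept-only baseline suggests the
# extra covariates carry real information rather than overfitted noise:
print(round(intercept_only - fitted, 1))  # -> 32.6
```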
Analysis of the proportionality of model covariates for each outlook region is summarized in Table 4. Indices representing Indian Ocean SST variability (IOD E, IOD W, and DMI) account for between 38.3% (AR-W) and 53.9% (AR-E) of predictor model covariates. IOD E and IOD W, which make up the individual poles of the DMI, were more likely to have been selected than the DMI itself. The prevalence of indices representing Indian Ocean SST variability is consistent with Wijnands et al. (2015), who found that the DMI was the most commonly occurring predictor for modeling TCs in the AR and its subregions.
For ENSO indices, the trans-Niño index (TNI) was preferred for AR-N and AR-W. For the other regions, the Niño-1+2, Niño-4, and Niño-3 oceanic predictors were most frequently selected (though not by an overwhelming majority) for the AR, AR-E, and AR-NW regions, respectively. The selection of multiple ENSO indices highlights the diversity that including multiple ENSO indices can add to predictive modeling (Kiem and Franks 2001; Tozer and Kiem 2017; Tozer et al. 2017). The contribution of SAM (Marshall 2003) is also notable, accounting for between 15% (AR) and 22.5% (AR-NW) of selected covariates, demonstrating that SAM is a useful addition to the covariate set.
4. Discussion and conclusions
This study derives statistically driven TC outlooks for Australia and its subregions. An automated covariate selection algorithm is used to determine the best combination of predictors to predict TCs. Compared to other studies that explore methods to derive TC outlooks/predictive models for Australia, including simple linear regression approaches (Nicholls et al. 1998), Bayesian regression (Werner and Holbrook 2011), Poisson regression (McDonnell and Holbrook 2004a,b) and machine learning algorithms (Wijnands et al. 2015), this study is unique for the following reasons:
An automated covariate selection algorithm is used that allows the model to consider multiple ENSO, IOD, and SAM indicators, with six monthly lags of each, enabling the model to consider as many relevant parameters as possible.
Given that ENSO is the dominant mode of variability in the Pacific and that the ENSO–TC frequency relationship is well established (Nicholls 1979; Chand et al. 2013; Ramsay et al. 2008), 10 unique combinations of covariates are tested, each of which contains a different ENSO index. The subjective choice of one, or a few, ENSO indices can potentially limit a model's prediction potential, and given there is no consensus on defining or quantifying ENSO variability in one index (Hanley et al. 2003), considering as many ENSO indices as possible enables selection of the best-performing model per region and outlook month.
The skill of rolling monthly outlooks is tested, showing that skillful preseason TC outlooks can be generated up to seven months before the start of the TC season. In-season outlooks are also tested and are found to provide useful insights for the remaining TC season, particularly given the latter half of the Australian TC season is the most active (Chand et al. 2019).
Producing skillful TC outlooks that resolve the year-to-year variability in Australian TC counts, and the ocean–atmosphere interactions that drive it, has the potential to reduce the vulnerability and impacts associated with TC activity. The models presented in this analysis replicate both this variability and the decreasing trend in TC counts (Fig. 4). For the AR, outlooks generated in October and trained on the entire 1970–2019 period are significantly correlated with observations (up to r = 0.80), with an SS of 64.8% and an SR-E (SR ± 1) of 18% (54%). Models for the other regions also perform strongly, particularly in terms of strike-rate statistics.
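A hedged sketch of how verification statistics of this kind could be computed follows. The exact definitions used in the study are assumed here: SS as the percentage reduction in mean squared error relative to a climatological forecast, SR-E as the fraction of seasons whose count is predicted exactly, and SR ± 1 as the fraction predicted to within one TC.

```python
# Assumed definitions of the verification statistics (Pearson r, SS,
# and exact / within-one strike rates); not necessarily the paper's exact formulas.
import numpy as np

def verification_stats(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    r = np.corrcoef(obs, pred)[0, 1]              # Pearson correlation
    mse_clim = np.mean((obs - obs.mean()) ** 2)   # climatology baseline error
    mse_pred = np.mean((obs - pred) ** 2)
    ss = 100.0 * (1.0 - mse_pred / mse_clim)      # skill score vs climatology (%)
    err = np.abs(np.rint(pred) - obs)             # error in whole TC counts
    sr_exact = 100.0 * np.mean(err == 0)          # exact strike rate (%)
    sr_pm1 = 100.0 * np.mean(err <= 1)            # strike rate within +/- 1 TC (%)
    return r, ss, sr_exact, sr_pm1

# Example with illustrative counts: five observed vs predicted seasons
r, ss, sr_e, sr_p1 = verification_stats([10, 12, 8, 11, 9], [10, 11, 9, 12, 9])
```

For these illustrative counts the predictions are within one TC in every season (SR ± 1 = 100%) and exactly right in two of five (SR-E = 40%), with SS = 70%.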
TC outlooks for the coming November–April TC season are typically issued in October. However, producing seasonal TC outlooks as early as possible is desirable so that decision-makers, natural resource companies, agribusiness, and the insurance industry (among others) can plan for the TC season ahead. We show that generating monthly TC outlooks from April onward enables the continuous refinement of expected TC counts as the TC season approaches (and even after it has begun). This additional six-month lead time relative to current operational outlooks enables improved planning and preparation, particularly for seasons in which above- or below-average TC counts are expected. In-season TC outlooks also perform well, allowing expected TC counts to be refined as the season unfolds. While a trade-off between outlook lead time and predictive skill is a recognized issue in TC forecasting (Leroy and Wheeler 2008), it does not appear to be a significant limitation in this analysis.
While the modeling framework used in this study permits the inclusion of any predictor, including indices of multidecadal or extratropical climate variability, the covariates considered here were chosen for a number of practical reasons. First, care was taken to choose covariates that are easily accessible and regularly updated, since values for the month prior to model initialization must be available for the model to generate a prediction. As noted in the introduction, inclusion of an IPO index may have improved model skill; however, the IPO was excluded because of (i) multicollinearity between Niño-3/Niño-3.4 and unfiltered IPO indices (Westra et al. 2015) and (ii) the unavailability of the most recent monthly values of filtered IPO indices, which are needed to run each TC outlook model. The MJO was also not considered as a covariate because calculating a monthly average of the MJO, as would be required for it to conform with the other monthly averaged covariates, would average its location and intensity across multiple phase spaces.
The methods and results presented in this paper offer a step forward for TC outlooks in the AR and demonstrate considerable skill in predicting TC counts, especially for outlooks generated up to seven months before the official start of the AR TC season. Further work could investigate how the relationships between covariates and TC counts evolve with time and whether the addition of other covariates, such as the MJO, improves the robustness of the models.
For Australia, the new TC outlook presented here, the Long-Range Tropical Cyclone Outlook for Australia (TCO-AU), provides a complementary perspective to regional outlooks produced by the BoM and other agencies. For both preseason and in-season outlooks, rolling monthly updates of expected TC counts will be offered so that the most recent changes in ocean–atmosphere variability can be taken into account. Before each monthly update, TCO-AU is retrained to incorporate TC counts from the most recent TC season, increasing the training sample size and potentially reducing outlook uncertainty over time. TCO-AU represents an additional source of information that a range of end users can draw on to enhance decision-making in the months leading up to (and within) a TC season. TCO-AU guidance will be updated and made freely available online (https://www.tcoutlook.com).
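The rolling-retrain scheme described above can be sketched as a simple hindcast loop, illustrative only: the fitting routine, covariates, and counts below are hypothetical stand-ins, not the operational TCO-AU system.

```python
# Illustrative rolling-retrain hindcast: before each season, refit the model
# on all seasons observed so far, then predict the coming season's count.
import numpy as np

def fit_poisson(X, y, iters=50):
    """Log-link Poisson GLM via Newton's method (minimal sketch)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start at the climatological mean rate
    for _ in range(iters):
        mu = np.exp(X @ beta)
        beta = beta + np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
    return beta

def rolling_hindcast(X, y, min_train=30):
    """Predict season t from a model retrained on seasons 0..t-1 only."""
    preds = []
    for t in range(min_train, len(y)):
        Xt = np.column_stack([np.ones(t), X[:t]])  # intercept + covariates
        beta = fit_poisson(Xt, y[:t])
        preds.append(float(np.exp(np.r_[1.0, X[t]] @ beta)))
    return np.array(preds)

# Synthetic record: 50 seasons, two placeholder covariates
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = rng.poisson(np.exp(2.3 + 0.3 * X[:, 0])).astype(float)
preds = rolling_hindcast(X, y)  # one out-of-sample count per remaining season
```

Because each prediction uses only seasons already observed, a loop like this mimics how the outlook's training sample grows, and its uncertainty potentially shrinks, with every completed TC season.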
The authors are grateful to all of those referenced in Table 4 who continue to update the climate indices used in this study. These include NOAA_ERSST_V5 data provided by the NOAA/OAR/ESRL/Physical Sciences Laboratory (https://psl.noaa.gov/), the NOAA Climate Prediction Center, Gareth Marshall (British Antarctic Survey) for providing the SAM index, and Christina Patricola and Ian Williams (Iowa State University) for updating the ELI. Monthly updates of each index allow TCO-AU to generate rolling monthly outlooks. The authors are also grateful to the Australian Bureau of Meteorology for maintaining the Australian Best-Track Tropical Cyclone Database and to Mr. Kim Colyvas at the University of Newcastle, Australia, who advised on some aspects of the method.