## Abstract

A new and potentially skillful seasonal forecast model of tropical cyclone formation [tropical cyclogenesis (TCG)] is developed for the Australian region. The model is based on Poisson regression using the Bayesian approach. Predictor combinations are chosen using a step-by-step predictor selection. The three-predictor model based on derived indices of June–August average convective available potential energy, May–July average meridional winds at 850 hPa (*V*_{850}), and July–September geopotential height at 500 hPa produces the smallest standard error (se = 0.36) and root-mean-squared error (RMSE = 5.20) for the leave-one-out cross-validated TCG hindcasts over the 40-yr record between 1968/69–2007/08. The corresponding correlation coefficient between observed annual TCG totals and cross-validated model hindcasts is *r* = 0.73. Using *four*fold cross validation, model hindcast skill is robust, with 85% of the observed seasonal TCG totals hindcast within the model standard deviations. Seasonal TCG totals during ENSO events are typically well captured with RMSE = 5.14 during El Niño, and RMSE = 6.04 during La Niña years. The model is shown to be valuable in hindcasting seasonal TCG totals in the eastern Australian subregion (*r* = 0.73) and also provides some skill for the western Australian region (*r* = 0.42), while it not useful for the northern region. In summary, the authors find that the three-predictor Bayesian model provides substantial improvement over existing statistical TCG forecast models, with remarkably skillful hindcasts (forecasts) of Australian region and subregional seasonal TCG totals provided one month ahead of the TC season.

## 1. Introduction

Australia’s tropical climate is dominated by the El Niño–Southern Oscillation (ENSO), which is driven largely from the Pacific basin (e.g., Allan et al. 1996). The relationship between ENSO and Australian region tropical cyclone (TC) formation [tropical cyclogenesis (TCG)] has been reported extensively (e.g., Nicholls 1984; Basher and Zheng 1995; Kuleshov and de Hoedt 2003; Ramsay et al. 2008; Kuleshov et al. 2009). Ramsay et al. (2008) argue that, next to sea surface temperature, vertical zonal wind shear from 850 to 200 hPa and low-level relative vorticity are the main ENSO-related factors affecting the Australian region TCG. Kuleshov et al. (2009) confirmed these results but added relative humidity in the midtroposphere as a major contributor. Seasonally, the Australian monsoon trough (the intertropical convergence zone) plays an important role in TCG in this region, McBride and Keenan (1982). It has also been shown that the effect of ENSO-linked dynamics on TCG occurs through a strong relationship between the monsoon trough and ENSO via atmospheric bridge processes (Evans and Allan 1992).

Since the late 1970s/early 1980s, a number of statistical seasonal forecast schemes have been developed and improved to predict TC activity in various basins and subbasins (Klotzbach et al. 2011). In particular, seasonal forecast modeling of TC activity was first undertaken by Nicholls (1979) for the Australian region and Gray (1984) for the North Atlantic. In later studies by Gray et al. (1992, 1994), climatic relationships with hurricane activity in the North Atlantic are based on metrics, such as the quasi-biennial oscillation and African rainfall. A link between intense hurricanes and the Sahel monsoon rainfall was also established (Landsea and Gray 1992). The skill of Gray’s operational Atlantic seasonal TC forecasts for the analyzed period from 1984–2001 relative to climatology and persistence was confirmed and improved (e.g., Owens and Landsea 2003; Saunders and Lea 2005; Klotzbach 2007). Other relevant North Atlantic statistical forecasts include model predictions of hurricane counts using Poisson regression models (e.g., Elsner and Schmertmann 1993; Lehmiller et al. 1997). The Poisson method was later extended using a Bayesian approach to investigate seasonal TC counts and landfall over the United States (e.g., Elsner and Jagger 2004, 2006). This approach has also been used most recently to improve multiseason forecasting of Atlantic hurricane activity (Elsner et al. 2008) and seasonal forecasting of TCs affecting the Fiji, Samoa, and Tonga regions (Chand et al. 2010) and the central North Pacific (Chu and Zhao 2007). In the northwest Pacific, projection pursuit regression has been used to forecast seasonal TC totals and associated TC predictands (e.g., Chan et al. 1998; Chan and Shi 1999; Chan et al. 2001). A statistical scheme based on ENSO-related indices was later developed for predicting the annual number of TCs making landfall along the south China coast (Liu and Chan 2003). Most recently, modes from an empirical orthogonal analysis of climate factors have been used as predictors of TC behavior in a statistical model also for the South China region (Goh and Chan 2010).

For the Australian region, Nicholls (1979) showed that the austral winter to spring anomalies of sea level pressure at Darwin are highly correlated with early season Australian region tropical cyclone activity and, to a lesser extent, with total seasonal TC activity. Subsequent research, and operational testing, confirmed the strong link with the ENSO metric, the Southern Oscillation index (SOI) (Nicholls 1984, 1985, 1992; Drosdowsky and Woodcock 1991; Ready and Woodcock, 1992). Solow and Nicholls (1990) presented the first Poisson regression-based statistical forecast model for the Australian region. They used the SOI as the predictor of Australian region total TC counts. More recently, a Poisson regression model using SOI and the September lead saturated equivalent potential temperature gradient between 1000 and 500 hPa was developed to forecast upcoming season TCG totals across the Australian region (McDonnell and Holbrook 2004a,b). This model has also been applied to forecast subregional TCG totals in the eastern Indian Ocean, northern Australia, and southwest Pacific regions (e.g., McDonnell et al. 2006). On intraseasonal time scales, Leroy and Wheeler (2008) developed a logistic regression model for TC development in the Australian region. As predictors, they used the two dominant varimax-rotated modes of SST anomalies for the Indo-Pacific region, as well as an index describing variations in the Madden–Julian oscillation.

This paper presents a new Australian region statistical seasonal TCG forecasting scheme that shows considerable promise based on a comprehensive assessment of its cross-validated hindcast skill, with high correlations identified between hindcast and observed seasonal TCG counts (*r* = 0.73) and a low standard error (se = 0.36). This is a substantial improvement in cross-validated hindcast skill over previous studies with correlations between cross-validated hindcasts and observations of TC counts ranging from *r* = 0.44 to *r* = 0.60 (e.g., Solow and Nicholls 1990; Nicholls 1992; McDonnell and Holbrook 2004b). Following previous successful studies across different basins (e.g., Elsner and Jagger 2006; Chu and Zhao 2007; Chand et al. 2010), the model developed here is based on the Poisson regression using a Bayesian approach. The Bayesian inference enables us to characterize the uncertainties of the model parameters by a posterior distribution after taking observed data into account. Predictors for the model are carefully selected indices of atmospheric parameters known to affect TC formation. A step-by-step predictor selection based on the RMSE calculated from the cross-validated hindcasts ensures the best combination of predictors. We show that this model makes significant advances on previous statistical schemes used in the Australian region.

The paper is structured as follows. Section 2 describes the data and variables used. Section 3 introduces the prediction schemes applied as the basis for predictor selection. The model setup and the techniques used are outlined in section 4. Section 5 presents the model results, and finally section 6 discusses and summarizes the quality and improvements of the models presented over existing models.

## 2. Data

### a. Tropical cyclone observations

This study takes advantage of the global TC best-track dataset, the International Best Track Archive for Climate Stewardship version 2 (IBTrACS.v02) (Knapp et al. 2010), provided by the U.S. National Oceanic and Atmospheric Administration. TCG is defined to occur when and where a tropical storm system with winds exceeding 34 kt (17.5 m s^{−1}) is first recorded.

The Australian (tropical cyclone) region is defined here as spanning between 0° and 30°S, 90° and 170°E. Following Dare and Davidson (2004), the Australian region is also divided into three subregions: a western region from 90° to 125°E, a northern region from 125° to 142.5°E, and an eastern region from 142.5° to 170°E (Fig. 1). TCG occurrences identified poleward of 30°S, or over land, have been removed in the quality assessment process. Only storms during the Australian TC season from November to April are taken into account. Overall, a total of 570 TCs during the 40-yr period from 1968/69 to 2007/08 are analyzed following the quality control. Figure 1 shows the spatial distribution of quality-assured TCG points and corresponding time series of seasonal TCG totals in the 40-yr record.

### b. Oceanic and atmospheric data

In this study, SST data were taken from uniformly gridded temperature observations provided in the Hadley Centre Global Sea Ice and Sea Surface Temperature (HadISST1) dataset (Rayner et al. 2003), compiled by the Met Office Hadley Centre. HadISST1 is a combination of global monthly SST fields and sea ice concentrations on a 1° × 1° grid from 1870 to the present.

The atmospheric data analyzed in this study are from the National Centers for Environmental Prediction (NCEP)–National Center for Atmospheric Research (NCAR) monthly mean upper-air reanalyses, with 2.5° horizontal resolution on 17 pressure levels (Kalnay et al. 1996). In total, eight variables were analyzed as potential TCG predictors describing the thermodynamic and dynamic condition of the ocean and atmosphere across a large portion of the Indo-Pacific region from 30°N to 50°S, 30°E to 70°W. Monthly anomalies of all variables were determined against a 30-yr base period of 1970–99. Statistical significances of the correlation coefficients are based on the reduced effective number of degrees of freedom method outlined by Davis (1976).

### c. Thermodynamic and dynamic parameters

#### 1) Thermodynamic parameters

Four thermodynamic parameters were selected. These are SST, geopotential height (GPH) at 500 hPa, convective available potential energy (CAPE) calculated between 850 and 300 hPa, and the (nonsaturated) equivalent potential temperature (EPT) gradient between 1000 and 500 hPa. The ocean temperature, and therefore the potentially available moist convection due to evaporation, is described by the SST (°C or K). To identify the low to midtroposphere temperature, we examined the GPH (m). The stability of the troposphere, which characterizes the likelihood of deep convection, is measured by CAPE (m^{2} s^{−2}) defined as

where *z*(850 hPa) is assumed the approximate level of free convection and *z*(300 hPa) the level of neutral buoyancy, *T* is a function of pressure level height (*z*) and is defined as the environmental temperature, *T _{m}* is the temperature of an idealized rising air parcel that is assumed to be saturated at the 850-hPa level, and

*g*= 9.81 m s

^{−2}is the standard gravity constant.

EPT (K) describes the enthalpy between two levels and, in the present context, the likelihood to form cloud clusters. It is defined as

with the vapor pressure

where *T* (K) is the temperature, Θ(K) the potential temperature, and RH (%) the relative humidity—all variables dependent on the pressure level. The constants are the latent heat of vaporization *L _{v}* = 2.5 × 10

^{6}J kg

^{−1}, the specific heat at constant pressure

*c*= 1004 J kg

_{p}^{−1}K

^{−1}, the saturation vapor pressure at triple point

*e*

_{str}= 6.11 hPa, the gas constant for vapor

*R*= 461 J kg

_{v}^{−1}K

^{−1}, and the triple point temperature

*T*

_{tr}= 273.16 K. Relatively moist layers in the lower and midtroposphere (RH between 700 and 500 hPa) are essential, as dry midlevels suppress the continuing development of widespread deep convection; that is, EPT is highly dependent on RH.

#### 2) Dynamic parameters

Four dynamic parameters were also analyzed. These are the zonal and meridional winds at 850 hPa (*u*_{850}, *υ*_{850}), the environmental vertical wind shear (EVWS) between 850 and 200 hPa, and the relative vorticity (RV) at 850 hPa. The 850-hPa pressure level was chosen as the lower dynamic level to avoid effects of the boundary layer and thus focus on atmospheric interior geostrophic flows. It is imperative that EVWS (m s^{−1}), defined as

is weak, otherwise convection within the TC eyewall cannot develop or persist. Also an existing negative (Southern Hemisphere) RV such as a small cyclonic atmospheric disturbance, a tropical wave, or a monsoonal trough with convergence—is a necessary initial factor to develop a TC. RV (s^{−1}) is defined as

where *u* (*υ*) (m s^{−1}) are the east (north) components of velocity, and *x* (*y*) (m) are the east (north) Cartesian displacement directions. Attendant strong upper-level divergence supports the development of deep convection, which in turn intensifies the disturbance into a low pressure system.

### d. ENSO definitions and effects on TCG

Ramsay et al. (2008) found the Niño-4 SST anomaly (SSTA) index to be the strongest ENSO predictor of interannual TC frequency in the Australian region and is therefore also included in the present study. Niño-4 is defined as the SSTA time series averaged spatially between 5°S and 5°N, 160°E and 150°W.

ENSO events are classified according to the definition used by the former U.S. National Weather Service (available online at http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml) using the three-month running mean in the Niño-3.4 region. El Niño (La Niña) events are defined by the Niño-3.4 SSTA exceeding thresholds of ±0.5°C for a minimum of five consecutive three-month average overlapping periods (see appendix A).

## 3. Prediction schemes

Figure 2 shows spatial correlation maps between Australian region annual TCG totals and the individual predictor variables shown for the June–August (JJA) period prior to the upcoming TC season. The correlation patterns with the thermal variables EPT, CAPE, and SST (see corresponding panels in Fig. 2) are characterized by the developing ENSO (e.g., Drosdowsky and Chambers 2001) “boomerang” pattern across the tropical/subtropical Pacific. This pattern defines regions of enhanced convection, contrasted by suppressed convection over the Indian Ocean. GPH shows a strong negative correlation pattern throughout the tropics that is maximized in the central Indian Ocean. The pattern is consistent with variations in the ITCZ, whereby colder lower- and midtroposphere air masses support convection and the development of TCs. The positive correlations in the central subtropical South Pacific are located at the southern tip of the South Pacific convergence zone and the warm advection region of the semipermanent South Pacific subtropical high. However, interannual variations of the air column in that region are mainly driven by changes of the trough in the midlatitude westerlies between 15° and 45°S in the central South Pacific (Van Loon and Shea 1985). The dynamic variables tend to show similar patterns as CAPE, EPT, and SST with a change of sign at the eastern boundary of the Australian TC region (bottom four panels of Fig. 2). These patterns describe changes in the Walker and Hadley circulations due to changes in ENSO phases. The correlation pattern in the tropical Pacific and Indian Ocean between annual TCG totals and *u*_{850} imply enhanced TCG in the Australian region with a strengthening of the Walker circulation and trade winds. The *υ*_{850} correlation patterns north of the equator in the far eastern Pacific and western Indian Ocean suggest enhanced TCG with increased meridional surface inflow into the equatorial regions of the eastern and central Pacific but also into the tropical western Indian Ocean. In contrast, the correlation patterns of *υ*_{850} also describe enhanced TCG with weakening of the Hadley circulation in the Pacific warm pool and west Pacific regions. The correlation pattern of EVWS and RV with Australian region TCG totals mostly reflects the correlation pattern of *u*_{850} over the tropical central Pacific. Our results imply a stronger (weaker) Walker circulation leading to increased (decreased) convection in the western Pacific warm pool area, which is more (less) favorable to Australian region TCG.

To achieve climatologically relevant predictors, we derived indices for each climate predictor variable on the basis of persistence and strength of preseasonal spatial correlation coefficients. Spatial correlation maps for October–December (OND) (Fig. 3) help to identify patterns that continue to be significant after the austral winter till the start of the Australian TC season. The most persistent correlations observed for CAPE, EPT, and GPH are all located in the subtropical central South Pacific, while the used SSTA index (Niño-4), *u*_{850} and EVWS describe variations in the equatorial central Pacific region. The *υ*_{850} and RV indices correspond to the northern inflow region of the Walker cell in the tropical East Pacific.

Figure 4 presents the correlation coefficients from all three-month-overlapping periods from January–March (JFM), ahead of the following TC season, through to OND at the start of the TC season. We found that all thermal predictors generate a strong increase of correlations from March–May (MAM) to JJA and stay highly correlated with around *r* = ±0.5 to the start of the TC season. Correlation coefficients between annual TCG totals and CAPE, EPT, and Niño-4 are all statistically significant (at the 95% confidence level) from early spring on, while correlation coefficients with GPH only reach 95% significance by May–July (MJJ). The two wind component indices generate increased correlations with annual TCG totals from April–June (AMJ, *υ*_{850}) and MJJ (*u*_{850}), and peak during MJJ and June–September (JAS), respectively. Correlations between TCG totals and *u*_{850} are statistically significant from AMJ while the ones with *υ*_{850} only reach correlation coefficients significant on 90% confidence level from AMJ on. EVWS shows a strong increase in the correlation coefficients from AMJ to JJA (significant at the 95% confidence level from MJJ on) and peaks during JAS, slowly decreasing thereafter. RV reaches its maximum correlations during JAS (significant at the 95% confidence level from AMJ on) and reduces subsequently. The definition of each predictor index, as well as the three-month mean of each index best relating to variations in TCG totals during the upcoming season, are provided in Table 1. In total, eight predictor indices were further investigated. Possible combinations of predictors should be complementary and contribute information that adds value. For that reason, predictor combinations with correlations of more than ±0.8 with each other were not included in any further analysis due to their collinearity. Here collinearity was observed between RV and *υ*_{850}, as well as between CAPE and EPT.

## 4. Bayesian regression model

### a. Poisson regression

The Poisson distribution is often used to model the occurrence of rare, discrete events, such as tornado counts and the occurrences of droughts or cold spells (e.g., Wilks 1995). The Poisson distribution also restricts the possible outcomes to nonnegative integers, making it ideal for modeling tropical cyclone occurrences (Elsner and Schmertmann 1993).

Following previous studies (e.g., Solow and Nicholls 1990; Elsner and Schmertmann 1993; McDonnell and Holbrook 2004a,b; Elsner and Jagger 2006; Chu and Zhao 2007; Chand et al. 2010), we applied a Poisson regression approach to model TCG totals satisfying

where

Here, if *Y* has a Poisson distribution, the logarithm of the expected number of TCG occurrences, *μ*, can be modeled as a linear combination of the predictors *x _{ij}*, with

*j*being the specified predictor during the season

*i*;

*β*is the corresponding Poisson regression coefficient,

_{j}*β*the intercept, and

_{0}*y*the observed TCG count. In a Poisson model, the variance is equal to the mean (

*μ*), with the standard deviation being the square root of

*μ*.

### b. Bayesian analysis

The Bayes theorem represents a quantification of uncertainty provided by probabilities. By comparison, in the Frequentist approach, probabilities are seen in terms of frequencies of random repeatable events (Wilks 1995). In this study, the Bayesian approach is used to predict the seasonal number of TCG occurrences. The Bayes theorem is applied to find the best possible model coefficient representation and this information is then used to predict the seasonal TCG totals. The observed predictor set is denoted as *x*_{1:T} = {*x*_{1}, … , *x _{T}*} and the corresponding seasonal number of TCG occurrences, or TCG probability, as

*y*

_{1:T}= {

*y*

_{1}, … ,

*y*} during the observations 1:

_{T}*T*. We are interested to find the conditional probability of

*y*

_{T}_{+1}given

*x*

_{T}_{+1}and the model coefficients

*β*,

*p*(

*y*|

*x*

_{T}_{+1},

*β*). The model coefficients

*β*will be estimated using the posterior distribution

*p*(

*β*|

*x*

_{1:T},

*y*

_{1:T}).

The assumptions about the prior knowledge of the model coefficients are stated before observing the data *x _{i}* and

*y*in the form of a prior probability

_{i}*p*(

*β*). As we have no, or only little, prior information on the climatic effects of our chosen predictors on TCG occurrences, we chose the conservative way of defining the prior probability of the model coefficients as almost flat priors. The priors are defined as a Gaussian distribution

with the mean selected as *μ* = 0, and the standard deviation *σ* = 100, with *β _{j}* representing the

*j*model coefficients. For the TCG occurrences model, we consider the Poisson distribution

Following the Bayes rule we get the posterior distribution

The posterior probability [Eq. (10)] allows us to take uncertainties into account and predictions are then obtained from

with *N* being the total number of obtained samples from the posterior distribution.

To usefully apply the Bayesian approach, and obtain the appropriate values for the model coefficients, the use of a sampling method like the Markov Chain Monte Carlo (MCMC) (Hastings 1970) method is indispensable (Larget and Simon 1999). The MCMC simulates direct draws from a probability distribution. There, the previous sample values generate randomly the next sample value, which means generating a Markov chain. Unlike previous studies (e.g., Elsner and Jagger 2004, 2006; Chu and Zhao 2007; Chand et al. 2010), which applied the Gibbs sampler via the open source software WinBUGS, we instead used the multivariate slice sampler (Neal 2003), which is a form of auxiliary variable technique (Roberts and Rosenthal 1999; see appendix B). The model standard deviation was then calculated from the expected number of events as estimated using the average of the mean *μ* obtained from the MCMC samples. The model standard deviation *σ* is then defined as

where the first term is the average process standard deviation and the second term is the coefficient uncertainty (Peters et al. 2009). The hindcasted TCG totals are taken as the number with the maximum probability from the hindcast distribution.

To discard the effects of the chosen initial conditions we applied a model burn-in of 500 iterations. This relatively short iterative burn-in achieves a quick convergence (see Fig. 5). Also, to avoid high autocorrelations and gain statistically independent samples out of the iteration process, the samples were thinned so that only every fifth sample was taken into account. Finally, 5000 samples were used to estimate the model coefficients and obtain the predictions.

### c. Model skill

To compare the skill of the possible model predictors, a leave-one-out cross validation (e.g., Stone 1974; Elsner and Schmertmann 1993) was performed. In this method, the model gets trained using *n* − 1 seasons to hindcast the number of storms expected for the one season that has been withheld from the training dataset. The train-and-test approach is successively repeated to hindcast every season across the 40-yr dataset. This enables us to perform an independent hindcast of every season. For a better understanding of the robustness of the model results over time, we also performed a *k*fold cross-validation technique with *k* = 4. The method splits the data into *k* equal-sized subsets (e.g., Stone 1974; Efron and Gong 1983). For the *k*th subset, the model is developed using the other *k* − 1 data subsets, and then the fitted model is used to predict the *k*th data subset. To validate the skill of the models, the rms error (RMSE) of each model hindcast was calculated (e.g., Elsner and Jagger 2006; Chand et al. 2010). The MSE is defined as

where *n* is the total number of seasons, *i* is the particular season being hindcast and used for cross validation, *i* = 1, … , *n*. *p _{i}* is the predicted probability that

*k*TCs develop in the hindcast season

*i*, and

*o*is the number of TCs that were actually observed to form in that season. The RMSE is a commonly used metric for the potential utility of a predictor or predictor combination in a probabilistic model, with small values indicating a good model. It is calculated using the probabilities of the independent hindcasts in the leave-one-out cross-validation method.

_{i}The standard error (se) and the cross correlation (*r*) between the predicted and observed number of TCs formed in each region provide a measure of the overall hindcast skill at the first-order level of the seasonal time series. The final model coefficients were estimated based on data from the training period between 1968/69 and 2007/08.

## 5. Results

### a. Model predictor selection

The kernel distributions of the estimated posterior densities of the model coefficients aid verification of the likely merit of the predictors. They give an indication of the quality of each of the tested predictors. In the ideal case, all sampled coefficients lie on either side of the zero line. This shows that the chosen predictor is playing a significant role in predicting the events given the sign of the chosen predictor.

The posterior densities of the climatology and the tested single predictors over the 40-yr record are shown in Fig. 6. We find that all indices in the present study are suitable as TCG predictors. Using the leave-one-out cross-validation technique, we calculated the RMSE to help evaluate the model skill. The climatology-only model was determined as the uncertainties of the intercept and hindcasts of TCG numbers are close to the observational long-term average with a high RMSE = 6.20 and se = 0.38. The climatology-only model does not contain any predictive power beyond the background state. The best single predictor is CAPE (RMSE = 5.39, se = 0.33), while the predictor with the least utility as a single predictor was found to be GPH (RMSE = 5.70, se = 0.41; Table 2a).

In an attempt to further improve the model, we investigated predictor combination models using a step-by-step predictor selection based on the calculated RMSEs. CAPE was used as the key single predictor based on its lowest RMSE and strongest correlation with TCG. With CAPE as the base, we found that the two-predictor combinations, CAPE + *υ*_{850}, CAPE + EVWS, and CAPE + RV provided further reductions to the RMSE to 5.21, 5.33, and 5.21, respectively. The three-predictor model CAPE + *υ*_{850} + GPH provided the lowest errors, with RMSE = 5.20 and se = 0.36 (Tables 2b,3a). Figure 7 shows the 40-yr leave-one-out cross-validated (hereafter CV40) hindcasts for that model plotted against the observed total number of TCs formed in each season. The hindcasted annual TCG count is the number of TCG occurrences with the maximum probability in the hindcast distribution as obtained by Eq. (11). We find that the CAPE + *υ*_{850} + GPH model captures the variability in number of cyclones formed within its boundaries of standard deviation with an 80% success rate of the CV40 hindcasts and very favorable performance against the observed total of TCG occurrences, the correlation coefficient being *r* = 0.73 (Table 2b). This is a substantial improvement in cross-validated hindcast skill of at least 21.5% over previous models with correlations between cross-validated hindcasts and observations of TC counts ranging from *r* = 0.44 to *r* = 0.60 (e.g., Solow and Nicholls 1990; Nicholls 1992; McDonnell and Holbrook 2004b).

### b. Fourfold cross validation

For a better evaluation of the robustness of the predictor set, we applied a *four*fold cross-validation technique. In this method, the data are split into *four* consecutive 10-yr subsets. The model is then trained on three of the four subsets to hindcast the left-out 10 years. This procedure was used to hindcast the 10-yr periods 1968/69–1977/78, 1978/79–1987/88, 1988/89–1997/98, and 1998/99–2007/08. The RMSEs for each independent 10-yr hindcast were calculated for the CAPE + *υ*_{850} + GPH model and compared to the average RMSE of the CV40 hindcast results (Fig. 8a). We found that the RMSEs calculated on the hindcasts did not deviate strongly from the CV40 hindcasts. We also found that the lowest RMSE corresponded to the 1988/89–1997/98 10-yr hindcast. The highest RMSE was for the 1968/69–1977/78 decade. Figure 8b shows the annual RMSEs for the CAPE + *υ*_{850} + GPH model as obtained from the *four*fold cross-validated hindcast distributions. We note that there is a slight trend toward reduced errors of the model hindcasts over time. The model captures ENSO seasons quite well, except for the 1985–86 La Niña event year (RMSE = 7.67), when the TCG totals of 19 storms were substantially underestimated and the 1987–88 El Niño event year (RMSE = 7.44) in which the low TCG occurrences of 6 was not reproduced. The highest RMSEs can be observed during the La Niña of 1971–72 (RMSE = 9.47) in which a rather low number of events (15) was observed and 1974/75 (RMSE = 8.26), for which the RMSE is particularly high due to the high number of TCG and therefore broad hindcast probability. The other periods showing very high RMSEs (>7) are the underestimated ENSO-neutral years of 1979–80 and 1993–94 and the 1973–74 La Niña year, which had a high number of TCG occurrences and broad hindcast probability.

Figures 9 and 10 present the hindcast distributions obtained from the fourfold cross validation. We find that the hindcast distributions of Australian region TC seasons with an expected high number of TCG events are broader than the ones for seasons with a lower number of occurrences hindcasted. This is particuarly the case in the first hindcast decade and explains the high RMSE for the 1968/69–1977/78 decade in which 6 out of 10 seasons had 19 TCG occurrences or more. Also during this decade, we observe large values of standard deviation, especially in the process uncertainty term (not shown), indicating lower skill of the model hindcasts. Figure 11 shows the four independent fourfold 10-yr hindcasts, as well as the fitted 30-yr hindcasts on which the model was trained. The results are compelling with the model displaying considerable skill. The independent hindcasts of the left-out 10 years are mostly very accurate, with only 6 out of 40 (15%) of the seasonal hindcasts outside of the model standard deviation. The model lacks skill when an unusually high number of TCGs occurred during ENSO-neutral or El Niño conditions (1979/80, 1993/94; see Figs. 9 and 10). Another issue is correctly capturing the very low number of cyclones during the identified unusual weak seasons (1982/83, 1987/88, and 2003/04)—a recognized issue in trying to capture extremes using regression modeling. Only the 1985–86 La Niña event year were not well presented by the model.

In the following section, we demonstrate that despite ENSO years being by their nature somewhat “extreme,” the model performs very well in capturing the TCG occurrences across most of these events.

### c. ENSO years

To assess the skill of hindcasting TCG totals during strong El Niño and La Niña years between 1968/69–2007/08, we evaluate the model performance based on the CV40 hindcasts (refer Fig. 7) for the CAPE + *υ*_{850} + GPH model. We find that the model performs very well during ENSO event years (Fig. 8a) with RMSEs only little higher than the 40-yr average. The model skillfully hindcasts the low TCG numbers during El Niño years (RMSE = 5.14), as well as the high TCG numbers during La Niña years (RMSE = 6.04; Fig. 8). Figure 12 shows the hindcast distribution for TCG totals using the CAPE + *υ*_{850} + GPH model for all El Niño and La Niña event years. The model accurately hindcasts the number of TCG occurrences during El Niño years (79% success) and is particularly skilful in hindcasting TCG totals during La Niña event years, with 84% of the La Niña year TCG hindcasts falling within hindcast standard deviation statistics. We note that the higher RMSE for La Niña compared to El Niño years results from a broader hindcast distribution of the annual hindcasts due to the higher numbers of TCG occurrences, rather than a poor performance of the model.

### d. Subregional hindcasts

We also examined the skill of the CAPE + *υ*_{850} + GPH model for the three Australian subregions: in the west, north, and east. To achieve that aim, the model was trained on the subregional seasonal TCG totals with the model skills verified using the CV40 approach (Fig. 13). In Table 3 the subregional model coefficients, as well as their standard deviations, are listed. We find that the model performs well at the subregional scale for the western and eastern Australian region with 87.5% of the observed seasonal TCG counts being hindcast within the model standard deviation. For the northern region, TCG hindcast skill was found to be little better than climatology-only. Due to the different averages of annual TCG counts in each region, the RMSEs and standard errors cannot be used as a comparative measure, as both are directly linked to the average or the standard deviation of the TCG totals and hence have different amplitudes for each subregion. Correlating the CV40 hindcasts with the observed seasonal TCG totals for each subregion, however, provides a measure of the model’s skill in capturing the correct phase of the hindcast variability. Correlation coefficients in the eastern region are as high as *r* = 0.73 (se = 0.22). For the western region, the model is also strong with correlations between the hindcasts and observations of *r* = 0.42 (se = 0.27). The northern region TCG totals, however, are hindcast poorly with model outputs varying marginally from the average TCG occurences (Fig. 13b).

## 6. Summary and discussion

This study represents a substantial improvement in the potential for more accurate statistical seasonal forecasting of tropical cyclone formation (genesis, TCG) for the Australian region. It is well understood that TCs in the Australian region are strongly affected by the phase of ENSO. Nevertheless, models based purely on ENSO metric predictors typically fail to forecast seasonal TCG totals in both the eastern Indian and southwest Pacific Ocean regions.

Our approach comprises an extended analysis using well-known climate indices together with the identification of new and potentially skilful indices that represent metrics important for predicting TC formation. These include 1) three subtropical central South Pacific indices of the convective available potential energy between 850 hPa and 300 hPa, the (unsaturated) equivalent potential temperature gradient between 1000 hPa and 500 hPa, and geopotential height at 500 hPa; 2) two tropical central Pacific indices of the zonal winds at 850 hPa (*u*_{850}) and environmental vertical wins shear between 850 and 200 hPa; and 3) two tropical northeast Pacific indices including the meridional wind at 850 hPa (*υ*_{850}) and low-level relative vorticity at 850 hPa (see Fig. 3). Additionally, the ENSO SST anomaly index Niño-4 was used. As a result of correlating preseasonal three-month climate index means with TCG totals in the upcoming season, we found that three-month austral winter mean indices showed the greatest overall potential for forecasting Australian region TCG totals in the upcoming TC season. The best eight predictor variables (Table 1) were incorporated into a Poisson regression model developed on TCG totals. Following recent studies (e.g., Elsner and Jagger 2004, 2006; Chu and Zhao 2007; Elsner et al. 2008; Chand et al. 2010), we applied a Bayesian approach using the Markov Chain Monte Carlo method. The Bayesian approach is beneficial as it allows for incorporation of prior beliefs and is convenient to account for the uncertainties in model parameters. The final model runs generated a total of 5000 independent samples using a slice sampler with a burn-in of 500 and a thinning of five.

Using a three-predictor Bayesian model on key indices of CAPE, *υ*_{850}, and GPH, we have been able to create a substantial improvement in cross-validated hindcast skill over previous studies. Previous published studies produced correlations between cross-validated hindcasts and observations of TCG counts ranging from *r* = 0.44 to *r* = 0.60 (e.g., Solow and Nicholls 1990; Nicholls 1992; McDonnell and Holbrook 2004b). Nicholls (1992) also found correlations of *r* = 0.72 between September–November (SON) values of the Southern Oscillation index and the “first differences” of consecutive seasons instead of the total number of TC counts between 1959/60 and 1990/91. An earlier study by Nicholls (1984) showed correlations of *r* = 0.78 between SON SST in a region north of Australia (5°–15°S, 120°–160°E) and Australian region TC counts from 1964–82. However, this relationship has been shown to degrade over time and is not robust for the more recent years (Ramsay et al. 2008). With the leave-one-out cross-validated hindcast approach we achieve correlations of *r* = 0.73 and a standard error of se = 0.36 using a three-predictor model of CAPE + *υ*_{850} + GPH for the period 1968/69–2007/08. In comparison to the skill of the climatology-only model, the CAPE + *υ*_{850} + GPH model shows a 19% improvement in RMSE. Using a fourfold cross-validation approach, we achieved highly skilled hindcasts for entire decades that were left out, highlighting the robustness and potential skill of the model.

The CAPE, *υ*_{850}, and GPH indices derived in this study are ENSO linked. Combining convective available potential energy and geopotential height from the subtropical central South Pacific with the lower-troposphere meridional tropical inflow from the Northern Hemisphere in the eastern Pacific was found to produce a valuable and complementary predictor set. Figure 14 shows the anomalies of all model-relevant predictor variables (CAPE, wind flow at 850 hPa, GPH) between active TC seasons (TCGs ≥ 17) and rather inactive TC seasons (TCGs ≤ 11) and indicates the index locations used in the three-predictor model. Active and inactive seasons were here chosen to represent around 25% of the investigated seasons, respectively. The CAPE index is located in the subtropical central South Pacific and embedded in the South Pacific convergence zone of ENSO-related convection. CAPE, as a measure of the instability of the atmosphere, provides a likelihood metric of deep convection within the zone of TC formation. The relationship between TCG occurrence and the ITCZ-linked North Pacific index, *υ*_{850}, is negative, which means a strengthening of the low-level equatorward trade wind inflow (Walker circulation) leads to enhanced TCG. Finally, the positive sign of the GPH model coefficient, as an indication for heating and atmospheric layer expansion, relates warmer air masses in the lower and midtroposphere of the subtropical central Pacific during austral winter to enhanced TCG in the following season. The GPH index is located in the warm advection zone of the South Pacific subtropical high. Van Loon and Shea (1985) identified this region to be linked to the Southern Oscillation. In autumn (fall) and winter prior to a warm event the trough in the westerlies, located west of New Zealand, fails to develop to its usual amplitude, allowing colder air from the midlatitudes intrude the Australian Pacific region. These conditions are suppressed in the year after a warm event took place.

Based on the assessment of various quality controls in the leave-one-out and fourfold cross-validated hindcasts, we found that the three-predictor CAPE + *υ*_{850} + GPH Poisson model is the most skillful in our model suite. The lowest RMSE corresponded to the 1988/89–1997/98 decade (RMSE = 4.41), while the largest corresponded to the 1968/69–77/78 decade (RMSE = 6.34). During ENSO event years (El Niño or La Niña), the three-predictor model demonstrated high skill, performing generally very well during El Niño, with the RMSE slightly below the average [RMSE(CV40) = 5.20; RMSE(El Niño) = 5.14]. The higher than average RMSE during La Niña [RMSE(La Niña) = 6.04] is due to a broader hindcast distribution of high numbers of TCG totals rather than a deficiency in the model. In fact, all except two of the hindcasts during La Niña event years captured the observed seasonal TCG totals within its standard deviation. Further, the independent hindcasts of the four separate decades using fourfold cross validation is impressive, with only six seasons (15%) out of a total of 40 being outside of the model standard deviation. We believe these results are quite compelling.

Finally, we tested the three-predictor model on three smaller subregions: west, north, and east. We found that the model is adaptable to forecast seasonal TCG totals in the eastern region (*r* = 0.73), while there is also some skill for TCG in the western region (*r* = 0.42). For the northern region, however, the hindcasts are not very different from the average number of seasonal TCG occurrences adding up to correlations between hindcasted and observed TCG totals of *r* = −0.23. The regional-scale complexities associated with this shallow sea region are problematic for our large-scale model. It may also be that a component of the poor performance in this northern region is due to northern Australian TCG being inherently unpredictable for statistical schemes.

In summary, we have developed a new and potentially skillful seasonal forecast model of tropical cyclogenesis for the Australian region. We find that a three-predictor CAPE + *υ*_{850} + GPH Poisson model produces remarkably skillful hindcasts of Australian region seasonal TCG totals by September of each year, one month prior to the onset of the Australian region TC season (November–April). The predictor variables identified in this study are physically meaningful and appropriate to condition the model forecasts of TCG. By combining information from useful dynamic and thermal variables as predictors in a Bayesian approach Poisson model system, we are able to demonstrate skillful cross-validated hindcasts of Australian region seasonal TCG totals with high correlation (*r* = 0.73) and low RMSE (5.20) against a 40-yr record of observations.

## Acknowledgments

A. Werner was supported by a MQRES Scholarship from Macquarie University, Australia. AW also thanks the School of Geography and Environmental Studies at the University of Tasmania for providing computing resources and office space during her visits to UTAS. Finally, we wish to thank Dr. John McBride and Dr. Pavel Shevchenko for their constructive comments and helpful discussions on this work.

### APPENDIX A

#### ENSO Event Years

This study defined ENSO events as defined by the Climate Prediction Center of the National Weather Service from three-month SSTA means in the Niño-3.4 region:

El Niño—1968/69, 1969/70, 1972/73, 1976/77, 1977/78, 1982/83, 1986/87, 1987/88, 1991/92, 1994/95, 1997/98, 2002/03, 2004/05, 2006/07; and

La Niña—1970/71, 1971/72, 1973/74, 1974/75, 1975/76, 1983/84, 1984/85, 1985/86, 1988/89, 1995/96, 1998/99, 1999/2000, 2000/01, 2007/08.

### APPENDIX B

#### Slice Sampling

The slice sampler (Neal 2003) avoids specifying the proposal densities as in Metropolis–Hastings algorithms (e.g., Hastings 1970; Gelman 1992) or the introduction of methods for sampling from the conditional distributions as in the Gibbs sampler (e.g., Gelfand and Smith 1990). Instead the slice sampler defines a joint distribution over the sampling variable *β* and an auxiliary real variable *u*, which is defined by

where with is proportional to *β*. The resulting marginal distribution over *β* is given by

so we can sample from *p*(*β*) by sampling *β* and u alternately from each other from and then ignoring the *u* values. That means, given the value of *β* we evaluate and then sample *u* uniformly in the range , after *u* is uniformly fixed from the “slice” through the distribution defined by . Slice sampling is applied to multivariate distributions by repeatedly sampling each of the *n* variables in turn, in the manner of the Gibbs sampling (Bishop 2006), where one needs *n* iterations to get from *β _{j}*

^{(i)}to

*β*

_{j}^{(i+1)}