1. Introduction
The use of dynamical models for the prediction of tropical cyclones (TCs) at the intraseasonal time range (from 10 to 60 days) has not been documented so far, although dynamical models are used routinely to produce short- and medium-range forecasts of TC tracks and seasonal outlooks of TC activity. It is only very recently that a few statistical models have been developed to predict the genesis or occurrence of TCs in the intraseasonal time range (Leroy et al. 2004; Frank and Roundy 2006; Leroy and Wheeler 2008). This is mostly due to the fact that the source of predictability at the intraseasonal time range has not been as well established as it has for medium-range or seasonal forecasting. For medium-range forecasting, predictability comes mostly from the atmospheric initial conditions. On the other hand, seasonal forecasting of TCs is mostly based on the impact of sea surface temperature (SST) anomalies on TC activity (e.g., Gray 1984; Shapiro 1987; Goldenberg and Shapiro 1996; Saunders and Harris 1997). SST anomalies are also a source of predictability for TCs at the intraseasonal time scale. For instance, the probability of higher-than-normal TC activity in the central Pacific during an El Niño event extends down to the multiweek time scale as well (Leroy and Wheeler 2008). However, it is the impact of the Madden–Julian oscillation (MJO; Madden and Julian 1971) on TCs that has triggered the recent interest in subseasonal TC prediction.
The impact of the MJO on TC activity has been documented in observational studies for the western North Pacific (Nakazawa 1988; Liebmann et al. 1994), the eastern North Pacific (Molinari et al. 1997; Maloney and Hartmann 2000a), the Gulf of Mexico (Maloney and Hartmann 2000b; Mo 2000), the South Indian Ocean (Bessafi and Wheeler 2006; Ho et al. 2006), and the Australian region (Hall et al. 2001). The modulation of the number of TCs by the MJO can be as high as 4:1 in some locations (Hall et al. 2001; Maloney and Hartmann 2000b) and largely exceeds the modulation due to SST variability. The MJO also has a significant impact on the probability of TC landfall over the United States and Australia (Vitart 2009). According to Camargo et al. (2009), the impact of the MJO on observed TCs comes mainly from midlevel relative humidity and low-level absolute vorticity. Maloney and Hartmann (2000a) attribute the impact of the MJO on TCs to its impact on cyclonic low-level relative vorticity and vertical wind shear.
The MJO is therefore a key predictor of the statistical model developed by Leroy and Wheeler (2008) that is used to issue weekly probabilities of TC activity over large areas of the Southern Hemisphere. The skill of this statistical model is higher when there is an active MJO. Other statistical methods for predicting TC probabilities at the intraseasonal time scale include an empirical method developed by P. Roundy (State University of New York; see online at http://www.atmos.albany.edu/facstaff/roundy/tcforecast/tcforecast.html), which includes the effects of a wide variety of wave modes and climate signals to forecast local daily probabilities of TCs (Frank and Roundy 2006), and the prediction of individual months of TC activity by the Colorado State University team (Blake and Gray 2004; Klotzbach and Gray 2003). Camargo et al. (2006) provide a review of some of these methods.
Modeling studies (e.g., Vitart and Anderson 2001) have simulated the impact of local SSTs and El Niño–Southern Oscillation (ENSO) on model TCs, which explains the success of dynamical seasonal forecasts of TCs (Vitart and Stockdale 2001; Camargo et al. 2005; Vitart et al. 2007). However, the success of the intraseasonal prediction of TCs is likely to depend largely on the ability of the model to predict MJO events and their impact on model TCs. Slingo et al. (1996) and Lin et al. (2006) have shown that general circulation models (GCMs) often have difficulty adequately representing MJO events. The representation of TCs also varies widely from one numerical model to another (Vitart 2006). However, Vitart (2009) showed that a set of 46-day hindcasts using a recent version of the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecast System (IFS) adequately simulates the impact of the MJO on model TCs, which suggests that this “monthly” forecast system may have some skill in predicting the intraseasonal variability of TCs. Furthermore, Elsberry et al. (2010, manuscript submitted to Asia-Pac. J. Atmos. Sci.) showed that the operational ECMWF monthly forecasts provided useful guidance for the genesis of typhoons a few weeks in advance during the combined Tropical Cyclone Structure (TCS08)/The Observing System Research and Predictability Experiment (THORPEX) Pacific Asian Regional Campaign (T-PARC).
The main goal of the present paper is to evaluate the skill of the ECMWF monthly forecast system to predict TC occurrences over the Southern Hemisphere, and compare it to the skill of the statistical model developed by Leroy and Wheeler (2008) using the same verification framework. The same set of hindcasts as in Vitart (2009) will be used. The verification will focus on the Southern Hemisphere where most TCs develop during November–April, which coincides with the season of strongest MJO activity.
Section 2 will describe the series of hindcasts produced with the ECMWF model and the model’s ability to simulate TCs and forecast MJO events. Section 3 will present the main characteristics of the statistical model from Leroy and Wheeler (2008), which will be used as a benchmark for the skill of the dynamical model. The verification methodology will be discussed in section 4 and the reliability and skill of the forecasts will be evaluated in section 5. Sections 6 and 7 will present results obtained respectively after applying a calibration to the ECMWF forecasts and after combining the statistical and calibrated dynamical forecasts. ROC score maps will be presented in section 8. Finally, section 9 will summarize the main results of this paper.
2. Dynamical weekly TC prediction in the ECMWF forecast system
A series of 46-day hindcasts starting on the 15th of each month was performed for the 20-yr period 1989 to 2008 with the ECMWF model version Cy32r3 (Bechtold et al. 2008) that was operational from November 2007 until June 2008. Each hindcast consists of an ensemble of 15 members (a control and 14 perturbed forecasts) integrated for 10 days with a T399 resolution (about 50-km resolution) and 62 vertical levels that is forced by persisted SST anomalies. At day 10, the horizontal resolution is lowered to T255 (about 80-km resolution) and the model is coupled to the Hamburg Ocean Primitive Equation (HOPE) ocean GCM (Wolff et al. 1997) every 3 h. During the first 10 days, the ocean model is forced by the fluxes provided by each atmosphere-only integration, but the atmosphere is not sensitive to the ocean model state. The persisted SST anomaly product used to force the atmosphere is also used to constrain the SST of the ocean model to avoid inconsistencies between the atmospheric state at day 10 and the underlying SSTs. The fact that the atmospheric model is not coupled to an ocean model during the first 10 days impacts the prediction of the MJO (see Fig. 10 in Vitart et al. 2008) and presumably other coupled ocean–atmosphere disturbances. Future plans include coupling the atmosphere to the ocean model from day 0.
The atmospheric initial conditions are taken from the 40-yr ECMWF Re-Analysis (ERA-40; Uppala et al. 2005) until 2001 and from the ECMWF operational analysis after 2001. The atmospheric perturbations are produced using the singular vector method (Buizza and Palmer 1995) and by randomly perturbing the tendencies in the atmospheric physics during the model integrations (Buizza et al. 1999; Palmer 2001). Different ocean initial conditions are produced by applying a set of wind stress perturbations during the ocean data assimilation (Vialard et al. 2005). More details about the ECMWF monthly forecasts can be found in Vitart et al. (2008).
The TCs are tracked in the model hindcasts using the methodology described in Vitart et al. (1997) and revised in Vitart et al. (2003). As in observations, model TCs are defined as systems with a maximum 10-m wind velocity exceeding 17 m s−1. The climatology of model TCs is generally consistent with observations, although the TC activity is too high in the model compared to observations (Fig. 1). Vitart (2009) showed that the TC track density simulated by this set of hindcasts is modulated by the MJO as in observations.
The ability of the model to predict the MJO is assessed using the method outlined by Gottschalck et al. (2010). This involves the calculation of the Wheeler and Hendon (2004) MJO index for all the model hindcasts and comparison with the index computed from the ERA Interim Reanalysis (Simmons et al. 2007) over the same period. This index is calculated by projecting the forecasts or analyses onto the two leading combined empirical orthogonal functions of observed outgoing longwave radiation (OLR), zonal wind at 200 and 850 hPa averaged between 15°N and 15°S. The index has been applied to daily anomalies relative to the 1989–2008 hindcast climatology to remove the impact of the seasonal cycle. In addition, the 120-day running mean, preceding the forecast day, has been subtracted to remove the variance associated with ENSO.
Previous work has indicated that the MJO simulated by the model tends to be too strong by about 20% after forecast day 10, and its propagation is often slower than observed (Vitart and Molteni 2009). Here we further examine its skill through computation of the bivariate correlation and root-mean-square error as in Lin et al. (2008) and Rashid et al. (2010). The bivariate correlation between the observations and the ensemble mean forecasts falls to 0.6 at about day 19 (Fig. 2a), and the bivariate root-mean-square error of the ensemble mean forecast reaches climatology around day 20 (Fig. 2b). Thus, the dynamical model has skill in predicting the evolution of the Madden–Julian oscillation out to about 20 days, and should therefore have skill in predicting the TC activity over the Southern Hemisphere during the first weeks of the forecast.
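As an illustration of the two MJO skill measures used above, the following minimal sketch computes a bivariate correlation and a bivariate root-mean-square error from pairs of index values at a fixed lead time. The formulas are the commonly used definitions, which we assume match those of the cited studies; the function names and the input layout (a list of two-component index pairs) are illustrative choices, not part of the original analysis.

```python
import math

def bivariate_correlation(obs, fcst):
    """obs, fcst: sequences of (index1, index2) pairs valid at one lead time."""
    num = sum(a1 * b1 + a2 * b2 for (a1, a2), (b1, b2) in zip(obs, fcst))
    norm_obs = math.sqrt(sum(a1 * a1 + a2 * a2 for a1, a2 in obs))
    norm_fcst = math.sqrt(sum(b1 * b1 + b2 * b2 for b1, b2 in fcst))
    return num / (norm_obs * norm_fcst)

def bivariate_rmse(obs, fcst):
    """Root-mean-square distance between observed and forecast index pairs."""
    n = len(obs)
    sq = sum((a1 - b1) ** 2 + (a2 - b2) ** 2
             for (a1, a2), (b1, b2) in zip(obs, fcst))
    return math.sqrt(sq / n)
```

If each observed index component has zero mean and unit variance, a forecast of zero amplitude (a climatological forecast) has a bivariate root-mean-square error of about the square root of 2, which is one way to interpret the statement that the error "reaches climatology."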
3. Statistical weekly TC prediction
For comparison purposes, we also test the skill of hindcasts generated with a purely statistical scheme. The scheme uses the known statistical relationships between weekly TC activity and various large-scale modes of climate variability, including the MJO, ENSO, and the climatological seasonal cycle. It is the same as the scheme described by Leroy and Wheeler (2008), except for two important differences: (i) its generalization to a grid of multiple overlapping regions in the Southern Hemisphere; and (ii) the use of different predictors of interannual TC variability. A maximum of six possible predictors are used—two predictors are for the MJO: the pair of real-time multivariate MJO indices of Wheeler and Hendon (2004); three predictors are for interannual variability related to tropical SSTs: the Niño-3.4 index, the Trans-Niño index (Trenberth and Stepaniak 2001), and the Indian dipole mode index (Saji et al. 1999); and the final predictor is the climatological seasonal cycle of TC activity.
As in Leroy and Wheeler (2008), the statistical scheme is based on logistic regression and predicts the probability of TC formation or occurrence in a specified region. However, the regions have been reduced in size and we concentrate on the occurrence of TCs (as opposed to TC formation) in each of the regions.
Hindcasts with the statistical scheme are generated for the same weeks as are available from the dynamical model. These are generated using a cross-validation method whereby the season being predicted is left out of the set of data used for the development of the logistic regression equation (as in Leroy and Wheeler 2008). This includes the recalculation of the climatological seasonal cycle of TC activity for each year being forecast by excluding the season predicted. A different logistic regression model is thus developed for each year, as well as a different model for each region and forecast lead time. As in Leroy and Wheeler (2008), the predictors are lagged by an appropriate amount to reflect their availability in real time.
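The cross-validation described above can be sketched as follows. The logistic form is the standard one; the coefficient values, the season labels, and the function names are hypothetical and serve only to show how the predicted season is withheld from the training set. Fitting the coefficients is not shown.

```python
import math

def logistic_probability(intercept, coeffs, predictors):
    """Probability from a logistic regression with given coefficients."""
    z = intercept + sum(b * x for b, x in zip(coeffs, predictors))
    return 1.0 / (1.0 + math.exp(-z))

def leave_one_season_out(seasons):
    """Yield (target season, training seasons) with the target withheld."""
    for target in seasons:
        training = [s for s in seasons if s != target]
        yield target, training
```

In such a setup, one regression would be fitted per target season, region, and lead time, using only the corresponding training seasons.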
This statistical scheme represents the state-of-the-art for statistical intraseasonal TC prediction. The original version has been run operationally at Météo-France (Nouméa) since the 2006/07 TC season, and the improved version is ready for use in 2009/10, with forecasts out to week 3 provided (see online at http://www.meteo.nc/espro/previcycl/cyclA.php). Tests of the skill and reliability for the independent seasons of 2006/07 and 2007/08, provided at the Web page listed above, show that it has achieved overall positive skill for these seasons, with very good reliability, when averaged over all four original regions and forecast leads. However, the statistical model relies on the presence of moderately strong climate signals (e.g., the MJO and ENSO) for success, as were present in 2006/07 and 2007/08 (Fawcett 2007; Wheeler 2008).
4. Verification methodology
For verification purposes, the probability of occurrence of TCs is predicted by both models over the same 60 regions in the Southern Hemisphere. Each region covers an area of 15° latitude by 20° of longitude, and is overlapped by 7.5° in latitude and 10° in longitude. The entire domain stretches from 0° to 30°S and 30°E to 120°W, which makes a grid of 20 boxes in longitude by 3 boxes in latitude. Smaller regions are preferable from the perspective of users of forecast information, but a compromise must be reached between the smallness and the number of TC events within each box that can be used for the development of the statistical scheme and verification of the models. The regions are large enough that the occurrence of a TC is not an extremely rare event, which would be difficult to verify with just 20 yr of observations. For boxes of this size, the average observed probability of a TC occurring in a week is 14% when averaged across all boxes and across all weeks for November–April 1982–2008 (see below).
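The layout of the 60 boxes follows directly from the numbers given above. The sketch below builds the grid and checks the count; longitudes are written on a 0 to 360 axis (so 120°W becomes 240), and the function and constant names are illustrative assumptions.

```python
BOX_LON, BOX_LAT = 20.0, 15.0        # box size (degrees longitude, latitude)
STEP_LON, STEP_LAT = 10.0, 7.5       # spacing between box origins (half-box overlap)
LON_WEST, LON_EAST = 30.0, 240.0     # 30E to 120W on a 0-360 longitude axis
LAT_NORTH, LAT_SOUTH = 0.0, -30.0    # equator to 30S

def make_boxes():
    """Return boxes as (west, east, south, north) tuples."""
    n_lon = int((LON_EAST - LON_WEST - BOX_LON) / STEP_LON) + 1
    n_lat = int((LAT_NORTH - LAT_SOUTH - BOX_LAT) / STEP_LAT) + 1
    boxes = []
    for j in range(n_lat):
        north = LAT_NORTH - j * STEP_LAT
        for i in range(n_lon):
            west = LON_WEST + i * STEP_LON
            boxes.append((west, west + BOX_LON, north - BOX_LAT, north))
    return boxes

def in_box(lon, lat, box):
    """True if a TC position (degrees east, degrees north) lies in the box."""
    west, east, south, north = box
    return west <= lon % 360.0 <= east and south <= lat <= north
```

Because of the half-box overlap, a TC position away from the domain edges falls in up to four boxes at once, so forecasts for neighboring boxes are not independent.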
The weeks are defined as days 1–7 (week 1), days 8–14 (week 2), days 15–21 (week 3), days 22–28 (week 4), and days 29–35 (week 5). A set of hindcasts has been produced with the statistical model for weeks 1–3, starting each day from 1 November to the 30 April during the years 1969–2006. The hindcasts stop at week 3, since the statistical model shows only very modest skill after week 3. To compare the skill of the dynamical and statistical models, the forecast starting dates that are common to both the statistical and dynamical hindcasts have been selected: hindcasts starting on 15 November and December 1989–2005 and 15 January, February, March, and April 1989–2006, representing a total of 106 starting dates. For the dynamical model, the probabilistic forecasts of TC occurrence are computed from the fraction of the 15 ensemble members that have a forecast TC track going through each of the 60 domains. Overall this represents 6360 forecasts issued by both the statistical and the dynamical forecasting systems for each weekly period. The weekly forecasts produced by the dynamical and statistical models are then verified against TC observations over the Southern Hemisphere issued by the Joint Typhoon Warning Center (JTWC; see online at http://www.usno.navy.mil/JTWC).
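The bookkeeping in this section can be checked with a short sketch: the list of common start dates, the mapping from forecast day to forecast week, and the ensemble-fraction probability. The function names and the boolean-per-member input format are assumptions made for illustration.

```python
def start_dates():
    """(year, month) of the common hindcast start dates (15th of each month)."""
    dates = [(y, m) for y in range(1989, 2006) for m in (11, 12)]
    dates += [(y, m) for y in range(1989, 2007) for m in (1, 2, 3, 4)]
    return dates

def week_of(day):
    """Forecast week (1-5) for a forecast day in the range 1-35."""
    return (day - 1) // 7 + 1

def ensemble_probability(member_hits):
    """Fraction of members whose TC track crosses the box during the week.

    member_hits holds one boolean per ensemble member.
    """
    return sum(member_hits) / len(member_hits)
```

With 15 members, the dynamical probabilities can only take multiples of 1/15, so 3 members out of 15 with a track in the box give a probability of 0.2.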
5. Forecast reliability and skill
The dynamical model will be referred to hereafter as ECMWF, and the statistical forecast as STAT. The probabilistic skill scores of ECMWF and STAT will also be compared to the scores of climatology (CLIM; 14% chance of a TC occurrence for all starting dates and for each domain) and to the probabilistic skill scores of a spatially and temporally varying climatology. This variable climatology, which will be referred to hereafter as “Variable CLIM,” has been computed in a cross-validated way as the observed probability of TC occurrence during a specific week (of the year) and over a specific domain during the period 1982 to 2008 with the JTWC dataset. Unlike CLIM, Variable CLIM varies from one domain to another and from one weekly period to another.
a. Statistical model
The reliability of probabilistic forecasts is well demonstrated with the use of a reliability diagram (Wilks 1995), which is a display of the observed frequency as a function of the forecast probability. For a perfectly reliable forecast, the graph should lie along the 45° diagonal, whereas values along a horizontal line indicate a no-skill forecast. Forecasts produced by STAT for week 1 are reliable since their reliability diagram is close to the diagonal (Fig. 3a). However, this forecast does not produce probabilities larger than 70%. STAT also rarely predicts 0% probabilities. In fact, the most populated probability bin is the 10%–20% bin that contains CLIM. This indicates that STAT produces very reliable forecasts, but with a relatively low resolution (ability of the forecast to sort or resolve the set of events into subsets with different frequency values). This is typical of statistical models, which in general do not predict very high or very low probabilities. In the two following weeks (Figs. 3b,c), the STAT forecasts remain very reliable, but at week 3, the sharpness (ability of the forecast to deviate from the climatological mean) and resolution are reduced, with few cases of forecast probabilities of TC occurrence exceeding 30%.
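The points of a reliability diagram can be obtained by binning the forecasts, as in the following minimal sketch. The choice of ten equal-width bins and the function name are assumptions; the binning used for Fig. 3 is not specified here.

```python
def reliability_table(probs, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) per bin.

    probs are forecast probabilities in [0, 1]; outcomes are 0 or 1.
    Empty bins are omitted.
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    prob_sum = [0.0] * n_bins
    for p, o in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)   # p = 1.0 goes in the last bin
        counts[k] += 1
        hits[k] += o
        prob_sum[k] += p
    return [(prob_sum[k] / counts[k], hits[k] / counts[k], counts[k])
            for k in range(n_bins) if counts[k]]
```

Plotting the second element against the first gives the reliability curve, and the counts show how the forecasts are distributed over the bins, which is how sharpness appears in such a diagram.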
The skill of the probabilistic forecasts can be measured using the Brier skill score (BSS; Wilks 2005), which in this paper will use CLIM as the reference forecast. According to Table 1, STAT displays positive BSS during the three weeks. At week 3, the BSS is quite low, but the difference relative to CLIM is still statistically significant within the 5% level of confidence using a 10 000 bootstrap resampling procedure. For the subset of forecasts analyzed here, the skill of STAT is not statistically significantly different from the skill of Variable CLIM in week 3 (cf. BSS of 0.04 to BSS of 0.046). In this section, the BSS is computed between 30°E and 120°W. Leroy and Wheeler (2008) found that there are areas in the Southern Hemisphere where STAT displays more skill than Variable CLIM in week 3.
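For reference, the Brier skill score relative to a reference forecast can be sketched as below, using the standard definitions (mean squared probability error, and one minus the ratio to the reference score). The function names are illustrative; the reference would be the constant 14% CLIM forecast in the setting above.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0/1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_skill_score(probs, outcomes, ref_probs):
    """BSS = 1 - BS / BS_ref; positive values beat the reference forecast."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(ref_probs, outcomes)
```

A perfect forecast gives a BSS of 1, and a forecast identical to the reference gives 0.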
The relative operating characteristic (ROC) score (Stanski et al. 1989; Mason and Graham 1999) has also been used to assess the skill of the probabilistic forecasts. Hit rates and false alarm rates are computed for different probability thresholds giving multiple points on a graph of hit rate (vertical axis) against false alarm rate (horizontal axis). In the present study, the interval between each probability threshold is 10%. The ROC score is the area below the ROC curve. Whereas a ROC score equal to 1.0 would be perfect, a no-skill forecast, such as CLIM, has a ROC score of 0.5. For week 1, STAT has a ROC diagram well above the diagonal (Fig. 4), with a ROC score of 0.73 (Table 2). As for the BSS, the ROC score of STAT is significantly higher than the ROC scores of CLIM and Variable CLIM during the first two weeks.
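The construction of the ROC curve and its area can be sketched as follows, with probability thresholds every 10% as stated above. The trapezoidal integration and the small tolerance used when comparing a probability with a threshold are assumptions of this sketch, and the function names are illustrative.

```python
def roc_points(probs, outcomes, n_thresholds=10):
    """Return (false alarm rate, hit rate) points, one per probability threshold.

    Requires at least one event and one non-event in outcomes (0/1 values).
    """
    n_event = sum(outcomes)
    n_non = len(outcomes) - n_event
    points = []
    for k in range(n_thresholds + 1):
        thr = k / n_thresholds
        # A warning is issued when the forecast probability reaches the threshold.
        hits = sum(1 for p, o in zip(probs, outcomes) if o and p >= thr - 1e-9)
        false_alarms = sum(1 for p, o in zip(probs, outcomes)
                           if not o and p >= thr - 1e-9)
        points.append((false_alarms / n_non, hits / n_event))
    points.append((0.0, 0.0))   # "never warn" end point of the curve
    return points

def roc_area(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((f1 - f0) * (h0 + h1) / 2.0
               for (f0, h0), (f1, h1) in zip(pts, pts[1:]))
```

A constant forecast such as CLIM yields only the two end points of the curve, so its area is 0.5, consistent with the no-skill value quoted above.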
These results are overall consistent with Leroy and Wheeler (2008), who provide a more complete verification of this statistical model based on a larger number of hindcasts. However, Leroy and Wheeler (2008) found a larger decrease in the skill of STAT between weeks 1 and 2. The smaller decrease in this study may be because the verification periods do not overlap. Since the forecasts start on the 15th of each month, week 1 covers the period from the 15th to the 21st, whereas week 2 covers the period from the 22nd to the 28th. Therefore, the scores for different forecast weeks are verified over different calendar periods and cannot be compared as directly. This difference in forecast verification periods also explains why the Brier skill score of Variable CLIM varies from one forecast week to another.
b. Uncalibrated ECMWF dynamical model
The ECMWF hindcasts have been verified in exactly the same way as STAT. The ECMWF forecasts (red lines in Fig. 3) for weeks 1, 2, and 3, respectively, seem somewhat reliable with the observed frequency increasing with higher forecast probabilities. However, the reliability graphs of ECMWF are flatter than the 45° diagonal, which indicates that the dynamical model is overconfident because it too often predicts both very high and very low probabilities, which is often the case with dynamical models. The strong overestimation of the high risks is a consequence of the model bias in that the TC activity is overpredicted in the ECMWF model (Fig. 1 and discussion in section 2). The number of TC occurrences is about 30% larger in the model than is observed over the Southern Hemisphere for the period November–April, so that the probabilities of TC occurrences are likely to be inflated in the model with a significant number of false alarms. In this sense the ECMWF model is less reliable than STAT during the first three forecast weeks, but the ECMWF forecasts have much more resolution and sharpness than STAT (Fig. 3). For instance, the dynamical model forecasts include a very large number of 0% and 100% probabilities of a TC occurrence in week 1 (Fig. 3a) and still produce high probabilities in week 3 (Fig. 3c).
The ECMWF forecasts have higher ROC scores than STAT for the first three forecast weeks (Table 2). These differences in ROC scores between ECMWF and STAT for the three forecast weeks are statistically significant within the 5% level of confidence using a 10 000 bootstrap resampling procedure. ECMWF also has significantly higher ROC scores than Variable CLIM, even for week 3. The difference of ROC scores is particularly large for week 1 (0.86 for ECMWF compared to 0.73 for STAT), which may be partially explained by the fact that the dynamical forecasts of week 1 include the knowledge of the presence of TCs via the initial conditions, which is not the case for STAT.
These results suggest that the ECMWF forecast system can produce useful probabilistic forecasts of TC occurrences over the Southern Hemisphere up to at least 3 weeks ahead. It may not be necessary to integrate the model for 3 weeks if the skill of the model for week 2 or 3 originates from the persistence of the previous week. To check if this is the case, the forecast probabilities for week 1 have been used to predict the occurrence of TCs in week 2 and the forecast probabilities of week 2 have been used to predict the occurrence of TCs in week 3. The ROC score of persisting week 1 probabilities to predict week 2 TC occurrences is only 0.68, which is statistically significantly lower than the ROC score of 0.8 for the ECMWF forecast of week 2 (Table 2). Similarly, assuming a persistence of week 2 to forecast week 3 leads to a ROC score of 0.66, which is also significantly lower than the ROC score for the ECMWF forecast of week 3 (0.74). These results indicate that integrating the dynamical forecast for at least 21 days is useful for the prediction of TCs in the Southern Hemisphere.
The BSS of the ECMWF forecasts for week 1 is also higher than the BSS of STAT (Table 1). The difference is statistically significant within the 5% level of confidence using a 10 000 bootstrap resampling procedure. However, the BSS of ECMWF for weeks 2 and 3 is lower than the BSS for STAT (Table 1), which is due to the ECMWF forecast being much less reliable than STAT during those weeks, despite having more resolution. As discussed earlier, this lack of reliability is likely due to the ECMWF model being overactive in the tropics and generating too many TCs.
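The details of the bootstrap procedure are not given here, so the following is only one plausible sketch of how a BSS difference between two forecast systems could be tested: forecast cases are resampled with replacement, the difference is recomputed for each resample, and a 95% interval is read from the sorted differences. The resampling unit, the interval construction, and all names are assumptions.

```python
import random

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0/1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def bootstrap_bss_difference(probs_a, probs_b, ref_probs, outcomes,
                             n_boot=10000, seed=0):
    """Approximate 95% interval for BSS(a) - BSS(b) by case resampling."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [outcomes[i] for i in idx]
        bs_ref = brier_score([ref_probs[i] for i in idx], obs)
        if bs_ref == 0.0:
            continue  # BSS is undefined for this resample
        bs_a = brier_score([probs_a[i] for i in idx], obs)
        bs_b = brier_score([probs_b[i] for i in idx], obs)
        # BSS(a) - BSS(b) simplifies to (BS_b - BS_a) / BS_ref.
        diffs.append((bs_b - bs_a) / bs_ref)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[int(0.975 * len(diffs)) - 1]
    return lower, upper
```

Under this sketch, the difference would be called significant at the 5% level when the interval excludes zero. Resampling individual cases treats them as independent, which the overlapping boxes do not strictly satisfy; resampling whole start dates would be a more conservative variant.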
6. Calibrating the ECMWF model outputs (CECMWF)
The reliability of the ECMWF forecast system for the prediction of the probability of TC occurrences may be improved by calibrating the forecasts. As a first step, a crude calibration is to set the ECMWF forecast probability to zero over areas and periods in which TCs have not been observed during the climatological period 1982–2008 (excluding the actual year of the forecast). This calibration eliminates the problem of the model predicting TCs too far to the east in the South Pacific where no TCs are observed (Fig. 1). In addition, the forecast probabilities are multiplied by 0.77, so that the climatological number of occurrences in the model is equal to the observed climatology (the model generates 30% more TC occurrences than in observations). These calibrated forecasts will be referred to as CECMWF. A more sophisticated calibration of the ECMWF forecasts has been tested where the calibration is performed over each domain and each time of the year independently. The results obtained were not better than those obtained with CECMWF, probably because of sampling issues when performing a calibration that is time and domain dependent with just 19 yr of data.
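The crude calibration described above amounts to two operations per forecast, sketched below. The function and argument names are illustrative; `clim_prob` stands for the cross-validated observed climatological probability for the same box and week, which is assumed to be available.

```python
SCALE = 0.77   # about 1 / 1.3, offsetting the ~30% excess of model TC occurrences

def calibrate(prob, clim_prob, scale=SCALE):
    """Return the calibrated probability for one box and one week.

    prob: raw ensemble probability of a TC occurrence.
    clim_prob: observed climatological probability (forecast year excluded).
    """
    if clim_prob == 0.0:
        return 0.0          # no TC ever observed here at this time of year
    return prob * scale
```

Since the raw probability cannot exceed 1, the calibrated probability cannot exceed 0.77, which is consistent with the absence of calibrated probabilities above 80%.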
In the reliability diagrams (Fig. 3), the calibrated forecasts (black curves) lie much closer to the 45° diagonal than the uncalibrated forecasts, which indicates that the calibrated forecasts are much more reliable than the uncalibrated forecasts. Indeed, the calibrated forecasts have higher BSSs for each of the three forecast weeks (Table 1) than the uncalibrated ECMWF forecasts. This suggests that the overprediction of tropical cyclones in the Southern Hemisphere was the likely cause for the low reliability of the ECMWF forecasts. In addition, the CECMWF also has higher BSSs than STAT and Variable CLIM for weeks 2 and 3, although CECMWF still has less reliability than STAT and Variable CLIM. The BSS difference is statistically significant within the 5% level of confidence. However, the better reliability and BSS of CECMWF compared to ECMWF is at the expense of the sharpness. Note in Fig. 3 that CECMWF does not produce probabilities higher than 80%, which may be an issue for the forecast of week 1, in which very high probabilities are often due to the presence of TCs in the initial conditions. For weeks 2 and 3 this is less of a problem, since the very high probabilities of TC occurrences in the uncalibrated ECMWF forecasts at those time ranges are unrealistic. The ROC scores of CECMWF are not significantly different from the ROC scores of ECMWF.
7. A multimodel combination
Previous studies have shown that combining different models can lead to better forecasts (e.g., Palmer et al. 2004). Vitart (2006) has shown that combining several models leads to better seasonal forecasts of TCs over most ocean basins. Therefore, the STAT and CECMWF forecasts are combined by simply averaging the forecast probabilities produced by the two models, with each model being given the same weight. The multimodel (referred to as MULTI) has higher BSSs than CECMWF (Table 3). Although the difference for week 1 is not statistically significant, the difference of BSSs for weeks 2 and 3 is statistically significant within the 5% level of confidence. The multimodel combination improves the reliability of CECMWF (Fig. 5), but at the expense of less sharpness. The ROC scores are slightly lower in MULTI than in CECMWF (not shown), but the difference is not statistically significant.
Other multimodel forecasts have been tested, giving different weights to CECMWF and STAT that are a function of the BSSs of the individual models computed in a cross-validated way. The model with the higher BSS is given the larger weight. However, the variable weight results were not significantly different from the results in which the models were equally weighted.
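The equal-weight combination is a simple average of the two probabilities. The sketch below also includes one hypothetical skill-based weighting rule (weights proportional to the positive part of each model's BSS); the exact weighting function used in the tests described above is not specified, so that rule is an assumption for illustration only.

```python
def combine(p_stat, p_cecmwf, w_stat=0.5):
    """Weighted average of the STAT and CECMWF probabilities."""
    return w_stat * p_stat + (1.0 - w_stat) * p_cecmwf

def skill_based_weight(bss_stat, bss_cecmwf):
    """Hypothetical rule: weight of STAT proportional to its positive BSS."""
    s = max(bss_stat, 0.0)
    c = max(bss_cecmwf, 0.0)
    if s + c == 0.0:
        return 0.5          # fall back to equal weights when neither has skill
    return s / (s + c)
```

For example, probabilities of 0.2 from STAT and 0.6 from CECMWF average to 0.4 with equal weights.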
A cost/loss model of economic decision making (Murphy 1977; Richardson 2000) can be used to address the potential benefit of the probabilistic forecasts of TC occurrences for different users. In this model, a decision maker has two alternatives of taking action or doing nothing depending on his or her belief that a given weather event will occur or not. Taking action incurs a cost C irrespective of the outcome. If the event does occur and no action has been taken, then the decision maker incurs a loss L. It is convenient to consider the expenses of the various courses of action in terms of the “cost/loss” ratio (C/L). In this model, the value V of the forecast is defined as the savings made by using the TC probabilistic forecasts as a fraction of the potential saving that would be achieved with perfect knowledge of the future. A value V = 0 indicates that the forecast has no more value than climatology. The potential economic value diagrams of the multimodel combination of CECMWF and STAT for the 3 forecast weeks in Fig. 6 confirm that the multimodel forecasts of TCs in weeks 1 to 3 have some value for a large range of cost/loss ratios.
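The value V defined above can be computed directly from mean expenses, as in the following sketch. The loss L is normalized to 1 so that the cost C equals the cost/loss ratio; the user is assumed to act whenever the forecast probability reaches a chosen threshold, and the climatological strategy is assumed to be the cheaper of always acting and never acting. All names are illustrative.

```python
def economic_value(probs, outcomes, cost_loss_ratio, threshold):
    """Value V of acting when prob >= threshold, for 0 < C/L < 1.

    outcomes are 0/1 and must contain both events and non-events.
    """
    n = len(outcomes)
    c = cost_loss_ratio                      # cost C with the loss L set to 1
    base_rate = sum(outcomes) / n
    # Mean expense with the forecast: pay C when acting, L for each miss.
    expense_fcst = sum(c if p >= threshold else float(o)
                       for p, o in zip(probs, outcomes)) / n
    expense_clim = min(c, base_rate)         # always act or never act
    expense_perfect = base_rate * c          # act only when the event occurs
    return (expense_clim - expense_fcst) / (expense_clim - expense_perfect)
```

A curve of potential value as a function of C/L could then be obtained by taking, for each ratio, the largest V over the available probability thresholds; whether Fig. 6 was built in exactly this way is an assumption.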
8. ROC score maps for the ECMWF model
In the previous sections, the probabilistic skill scores have been evaluated over the whole Southern Hemisphere. The ROC scores can also be computed for a 1° × 1° grid over the Southern Hemisphere to get more detailed information about the regions in which the ECMWF model has skill. For each grid point, the probability of a TC occurrence in a 20° × 15° domain centered on the grid point is calculated for all the November–April forecasts available from the dynamical model. As expected, these ROC scores decrease from weeks 1 to 5 (Fig. 7). Even at week 5, the ROC score is larger than 0.5 over most grid points, which indicates a better performance than CLIM. During the first 3 weeks of the forecast, the ECMWF model has higher ROC scores than the Variable CLIM (not shown) over both the Indian Ocean and the South Pacific. For week 4, a clear difference exists between the South Indian Ocean where the ROC scores exceed 0.6 (higher than the ROC score obtained with Variable CLIM) and the South Pacific where the ROC scores are generally lower than 0.6 (Fig. 7).
9. Conclusions and discussion
In this paper, the skill of the ECMWF model in forecasting the weekly occurrence of TCs in a 20° × 15° domain has been compared to the skill of a statistical model (STAT) during the first three forecast weeks. The ECMWF forecast system has higher ROC scores than STAT during the first three forecast weeks, but has lower Brier skill scores than STAT after week 1 because of its poor reliability resulting from an overactive production of TCs. A simple calibration applied to the dynamical model leads to higher Brier skill scores than both the uncalibrated forecasts and STAT, with more resolution and sharpness, but less reliability than STAT. Therefore, a state-of-the-art NWP model can be useful for the prediction of TCs over the Southern Hemisphere in the intraseasonal time scale with skill that can be competitive with, if not better than, the skill of a state-of-the-art statistical model. A second conclusion of this study is that the statistical model of Leroy and Wheeler (2008) can serve as a useful benchmark to validate the skill of a dynamical model. Combining the dynamical and statistical forecasts in a multimodel forecast results in additional skill.
The skill of the dynamical model in predicting the evolution of the MJO up to about 20 days is likely contributing to its skill to predict Southern Hemisphere TC occurrences. The result that the skill of the dynamical model in week 4 seems to be limited to only over the Indian Ocean may be because this dynamical model has difficulty in propagating an MJO across the Maritime Continent (Vitart and Molteni 2009), which means that the skill of the model to predict an MJO event in the western South Pacific is lower than its skill to predict an MJO over the South Indian Ocean, at this time range. In this version of the ECMWF model, the MJO is about 20% too strong after about 10 days of model integrations. This has been partially solved in the more recent versions of IFS (Vitart and Molteni 2009) that are not as overactive as Cy32r3. Another important problem is the too-slow eastward propagation of the MJO in the model simulations (Figs. 12 and 13 in Vitart and Molteni 2009), which can have a negative impact on the skill of the forecast in predicting TC occurrences in week 4. Solving those issues should extend the MJO skill beyond day 20, which should then translate into an extended skill of the ECMWF model to predict TC occurrences in week 4 and beyond. Another area of improvement is the use of a high vertical resolution ocean mixed-layer model (Woolnough et al. 2007), which may further extend the skill of the dynamical model to predict the evolution of the MJO by a few days to a week depending on how realistic the intensity and propagation speed of the MJO simulated by the atmospheric model is in the first place.
Monthly forecasts are currently produced operationally once a week at ECMWF (Vitart et al. 2008). Those forecasts consist of a 51-member ensemble of 32-day integrations. The present study suggests that this forecasting system can produce skillful weekly predictions of TC occurrences. As discussed in section 6, these dynamical forecasts of TCs need to be calibrated to become more reliable. The calibration depends on the version of the ECMWF model. However, the set of hindcasts that is produced operationally every week at ECMWF (Vitart et al. 2008) could be used to calibrate the weekly forecasts of TC occurrences, as discussed in section 6. Experimental weekly forecasts of TCs are currently produced routinely in real time at ECMWF.
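One simple way in which a hindcast set could be used for such a calibration is sketched below. This is an assumed, illustrative scheme (a lookup table of observed frequencies per forecast-probability bin) and not necessarily the calibration of section 6; the function names and the number of bins are hypothetical.

```python
def fit_bin_calibration(hindcast_probs, hindcast_outcomes, n_bins=5):
    """Observed event frequency per forecast-probability bin, from hindcasts.

    Bins that contain no hindcast cases fall back to the bin centre,
    i.e. the raw probability is left essentially uncorrected there.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for p, o in zip(hindcast_probs, hindcast_outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[k] += o
        counts[k] += 1
    return [sums[k] / counts[k] if counts[k] else (k + 0.5) / n_bins
            for k in range(n_bins)]


def calibrate(prob, table):
    """Replace a raw forecast probability by the hindcast frequency of its bin."""
    n_bins = len(table)
    return table[min(int(prob * n_bins), n_bins - 1)]
```

Because the table is estimated from hindcasts made with the same model version as the real-time forecast, it would have to be refitted whenever the model version changes, consistent with the dependence of the calibration on the model version noted above.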
The results presented in this paper also suggest the potential for skillful intraseasonal prediction of TC landfall. Indeed, some initial results with the ECMWF model indicate skill for TC landfall over the major TC-prone land regions in the Southern Hemisphere up to at least week 2, and up to week 4 over western Australia, but we leave the presentation of these results to a future study.
Acknowledgments
We are grateful to Rob Hine, who helped to improve the quality of the figures, and to Paul Roundy and two anonymous reviewers, whose comments proved invaluable in improving the presentation of the material.
REFERENCES
Bechtold, P., M. Koehler, T. Jung, P. Doblas-Reyes, M. Leutbecher, M. Rodwell, and F. Vitart, 2008: Advances in simulating atmospheric variability with the ECMWF model: From synoptic to decadal time-scales. Quart. J. Roy. Meteor. Soc., 134, 1337–1351.
Bessafi, M., and M. C. Wheeler, 2006: Modulation of South Indian Ocean tropical cyclones by the Madden–Julian oscillation and convectively coupled equatorial waves. Mon. Wea. Rev., 134, 638–656.
Blake, E. S., and W. M. Gray, 2004: Prediction of August Atlantic basin hurricane activity. Wea. Forecasting, 19, 1044–1060.
Buizza, R., and T. N. Palmer, 1995: The singular-vector structure of the atmospheric general circulation. J. Atmos. Sci., 52, 1434–1456.
Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 125, 1887–1908.
Camargo, S. J., A. G. Barnston, and E. Zebiak, 2005: A statistical assessment of tropical cyclone activity in atmospheric general circulation models. Tellus, 57A, 589–604.
Camargo, S. J., M. Ballester, A. G. Barnston, P. Klotzbach, P. Roundy, M. A. Saunders, F. Vitart, and M. C. Wheeler, 2006: Short-term climate (seasonal and intra-seasonal) prediction of tropical activity and intensity. Workshop Topic Reports, Sixth Int. Workshop on Tropical Cyclones (IWTC-VI), Topic 4.3, TMRP 72, San José, Costa Rica, WMO, 493–499.
Camargo, S. J., M. C. Wheeler, and A. H. Sobel, 2009: Diagnosis of the MJO modulation of tropical cyclogenesis using an empirical index. J. Atmos. Sci., 66, 3061–3074.
Elsberry, R. L., M. S. Jordan, and F. Vitart, 2010: Predictability of tropical cyclone events on intraseasonal timescale with the ECMWF monthly forecast model. Asia-Pac. J. Atmos. Sci., 46, 135–153.
Fawcett, R. J. B., 2007: Seasonal climate summary Southern Hemisphere (summer 2006–07): The end of the 2006–07 El Niño. Aust. Meteor. Mag., 56, 309–319.
Frank, W. M., and P. E. Roundy, 2006: The role of tropical waves in tropical cyclogenesis. Mon. Wea. Rev., 134, 2397–2417.
Goldenberg, S. G., and L. J. Shapiro, 1996: Physical mechanism for the association of El Niño and West African rainfall with Atlantic major hurricane activity. J. Climate, 9, 1169–1187.
Gottschalck, J., and Coauthors, 2010: A framework for assessing operational model MJO forecasts: A project of the CLIVAR Madden–Julian oscillation working group. Bull. Amer. Meteor. Soc., 91, in press.
Gray, W. M., 1984: Atlantic seasonal hurricane frequency. Part I: El Niño and 30 mb quasi-biennial oscillation influences. Mon. Wea. Rev., 112, 1649–1668.
Hall, J. D., A. J. Matthews, and D. J. Karoly, 2001: The modulation of tropical cyclone activity in the Australian region by the Madden–Julian oscillation. Mon. Wea. Rev., 129, 2970–2982.
Ho, C-H., J-H. Kim, J-H. Jeong, H-S. Kim, and D. Chen, 2006: Variation of tropical cyclone activity in the South Indian Ocean: El Niño–Southern Oscillation and Madden–Julian Oscillation effects. J. Geophys. Res., 111, D22101, doi:10.1029/2006JD007289.
Klotzbach, P. J., and W. M. Gray, 2003: Forecasting September Atlantic basin tropical cyclone activity. Wea. Forecasting, 18, 1109–1128.
Leroy, A., and M. C. Wheeler, 2008: Statistical prediction of weekly tropical cyclone activity in the Southern Hemisphere. Mon. Wea. Rev., 136, 3637–3654.
Leroy, A., M. C. Wheeler, and B. Timbal, 2004: Statistical prediction of the weekly tropical cyclone activity in the Southern Hemisphere. Bureau of Meteorology and Meteo France Internal Rep., 66 pp. [Available online at http://cawcr.gov.au/bmrc/clfor/cfstaff/matw/abstracts/Leroyetal04.html].
Liebmann, B., H. H. Hendon, and J. D. Glick, 1994: The relationship between tropical cyclones of the western Pacific and Indian Oceans and the Madden–Julian oscillation. J. Meteor. Soc. Japan, 72, 401–411.
Lin, H., G. Brunet, and J. Derome, 2008: Forecast skill of the Madden–Julian oscillation in two Canadian atmospheric models. Mon. Wea. Rev., 136, 4130–4149.
Lin, J-L., and Coauthors, 2006: Tropical intraseasonal variability in 14 IPCC AR4 climate models. Part I: Convective signals. J. Climate, 19, 2665–2690.
Madden, R. A., and P. R. Julian, 1971: Detection of a 40–50-day oscillation in the zonal wind in the tropical Pacific. J. Atmos. Sci., 28, 702–708.
Maloney, E. D., and D. L. Hartmann, 2000a: Modulation of eastern North Pacific hurricanes by the Madden–Julian oscillation. J. Climate, 13, 1451–1460.
Maloney, E. D., and D. L. Hartmann, 2000b: Modulation of hurricane activity in the Gulf of Mexico by the Madden–Julian oscillation. Science, 287, 2002–2004.
Mason, S. J., and N. E. Graham, 1999: Conditional probabilities, relative operating characteristics, and relative operating levels. Wea. Forecasting, 14, 713–725.
Mo, K. C., 2000: The association between intraseasonal oscillations and tropical storms in the Atlantic basin. Mon. Wea. Rev., 128, 4097–4107.
Molinari, J., S. Knight, M. Dickinson, D. Vollaro, and S. Skubis, 1997: Potential vorticity, easterly waves, and eastern Pacific tropical cyclogenesis. Mon. Wea. Rev., 125, 2699–2708.
Murphy, A. H., 1977: The value of climatological, categorical, and probabilistic forecasts in the cost-loss ratio situation. Mon. Wea. Rev., 105, 803–816.
Nakazawa, T., 1988: Tropical super clusters within intraseasonal variations over the western Pacific. J. Meteor. Soc. Japan, 66, 823–839.
Palmer, T. N., 2001: A nonlinear dynamical perspective on model error: A proposal for nonlocal stochastic dynamic parameterization in weather and climate prediction models. Quart. J. Roy. Meteor. Soc., 127, 279–304.
Palmer, T. N., and Coauthors, 2004: Development of a European Multimodel Ensemble System for Seasonal-to-Interannual Prediction (DEMETER). Bull. Amer. Meteor. Soc., 85, 853–872.
Rashid, H., H. H. Hendon, M. C. Wheeler, and O. Alves, 2010: Prediction of the Madden–Julian Oscillation with the POAMA dynamical prediction system. Climate Dyn., in press.
Richardson, D., 2000: Skill and relative value of the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 126, 647–667.
Saji, N. H., B. N. Goswami, P. N. Vinayachandran, and T. Yamagata, 1999: A dipole mode in the tropical Indian Ocean. Nature, 401, 360–363.
Saunders, M. A., and A. R. Harris, 1997: Sea warming as a dominant factor behind near-record number of Atlantic hurricanes. Geophys. Res. Lett., 24, 1255–1258.
Shapiro, L. J., 1987: Month-to-month variability of the Atlantic tropical circulation and its relationship to tropical storm formation. Mon. Wea. Rev., 115, 2598–2614.
Simmons, A. J., S. Uppala, D. P. Dee, and S. Kobayashi, 2007: ERA-Interim: New ECMWF reanalysis products from 1989 onwards. ECMWF Newsletter, No. 110, ECMWF, Reading, United Kingdom, 25–35.
Slingo, J. M., and Coauthors, 1996: Intraseasonal oscillations in 15 atmospheric general circulation models: Results from an AMIP diagnostic subproject. Climate Dyn., 12, 325–357.
Stanski, H. R., L. J. Wilson, and W. R. Burrows, 1989: Survey of common verification methods in meteorology. World Weather Watch Tech. Rep. 8, WMO Tech. Doc. 358, 114 pp.
Trenberth, K. E., and D. P. Stepaniak, 2001: Indices of El Niño evolution. J. Climate, 14, 1697–1701.
Uppala, S. M., and Coauthors, 2005: The ERA-40 Re-Analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.
Vialard, J., F. Vitart, M. A. Balmaseda, T. Stockdale, and D. L. T. Anderson, 2005: An ensemble generation method for seasonal forecasting with an ocean–atmosphere coupled model. Mon. Wea. Rev., 133, 441–453.
Vitart, F., 2006: Seasonal forecasting of tropical storm frequency using a multi-model ensemble. Quart. J. Roy. Meteor. Soc., 132, 647–666.
Vitart, F., 2009: Impact of the Madden–Julian Oscillation on tropical storms and risk of landfall in the ECMWF forecast system. Geophys. Res. Lett., 36, L15802, doi:10.1029/2009GL039089.
Vitart, F., and J. L. Anderson, 2001: Sensitivity of Atlantic tropical storm frequency to ENSO and interdecadal variability of SSTs in an ensemble of AGCM integrations. J. Climate, 14, 533–545.
Vitart, F., and T. N. Stockdale, 2001: Seasonal forecasting of tropical storms using coupled GCM integrations. Mon. Wea. Rev., 129, 2521–2527.
Vitart, F., and F. Molteni, 2009: Simulation of the MJO and its teleconnections in an ensemble of 46-day EPS hindcasts. ECMWF Tech. Memo. 597, ECMWF, 60 pp. [Available online at http://www.ecmwf.int/publications/library/do/references/list/14].
Vitart, F., J. L. Anderson, and W. F. Stern, 1997: Simulation of interannual variability of tropical storm frequency in an ensemble of GCM integrations. J. Climate, 10, 745–760.
Vitart, F., D. Anderson, and T. Stockdale, 2003: Seasonal forecasting of tropical cyclone landfall over Mozambique. J. Climate, 16, 3932–3945.
Vitart, F., and Coauthors, 2007: Dynamically-based seasonal forecast of Atlantic tropical storm activity issued in June by EUROSIP. Geophys. Res. Lett., 34, L16815, doi:10.1029/2007GL030740.
Vitart, F., and Coauthors, 2008: The new VAREPS-monthly forecasting system: A first step towards seamless prediction. Quart. J. Roy. Meteor. Soc., 134, 1789–1799.
Wheeler, M. C., 2008: Seasonal climate summary Southern Hemisphere (summer 2007–08): mature La Niña, an active MJO, strongly positive SAM, and highly anomalous sea-ice. Aust. Meteor. Mag., 57, 379–393.
Wheeler, M. C., and H. H. Hendon, 2004: An all-season real-time multivariate MJO Index: Development of an index for monitoring and prediction. Mon. Wea. Rev., 132, 1917–1932.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences: An Introduction. Academic Press, 464 pp.
Wilks, D. S., 2005: Statistical Methods in the Atmospheric Sciences. 2nd ed. Elsevier, 627 pp.
Wolff, J. O., E. Maier-Reimer, and S. Legutke, 1997: The Hamburg ocean primitive equation model. Deutsches Klimarechenzentrum Tech. Rep. 13, Hamburg, Germany, 98 pp.
Woolnough, S. J., F. Vitart, and M. A. Balmaseda, 2007: The role of the ocean in the Madden–Julian Oscillation: Implications for MJO prediction. Quart. J. Roy. Meteor. Soc., 133, 117–128.
Density of TCs (×1000) in (top) observations and (bottom) model for the period November–April 1989–2008. The density of TCs is defined as the number of TCs per day passing within 2° latitude (about 220 km).
Citation: Monthly Weather Review 138, 9; 10.1175/2010MWR3343.1
(a) Bivariate correlation skill and (b) root-mean-square error of the MJO index for the ECMWF forecast ensemble mean as a function of lead time for the period November–April 1989–2008 for all levels of MJO activity (black lines). The thin lines represent the 0.5 and 0.6 correlations in (a) and the dashed line in (b) represents the RMSE obtained with climatology.
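For reference, the two scores shown in this figure can be sketched as follows, assuming that the MJO index has two components (RMM1 and RMM2, as in the index of Wheeler and Hendon 2004) and that the scores follow the usual bivariate definitions. The function names are hypothetical, and each input is a sequence of (RMM1, RMM2) pairs valid at the same lead time.

```python
import math


def bivariate_correlation(obs, fcst):
    """Bivariate correlation between observed and forecast (RMM1, RMM2) pairs."""
    num = sum(a1 * b1 + a2 * b2 for (a1, a2), (b1, b2) in zip(obs, fcst))
    den_obs = math.sqrt(sum(a1 ** 2 + a2 ** 2 for a1, a2 in obs))
    den_fcst = math.sqrt(sum(b1 ** 2 + b2 ** 2 for b1, b2 in fcst))
    return num / (den_obs * den_fcst)


def bivariate_rmse(obs, fcst):
    """Root-mean-square distance between the two points in the RMM1-RMM2 plane."""
    n = len(obs)
    squared = sum((a1 - b1) ** 2 + (a2 - b2) ** 2
                  for (a1, a2), (b1, b2) in zip(obs, fcst))
    return math.sqrt(squared / n)
```

With these definitions, a forecast whose MJO phase is a quarter cycle away from the observed phase has a bivariate correlation of zero, whatever its amplitude.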
Reliability diagrams of the probability of TC occurrence in 20° × 15° domains in the Southern Hemisphere for the forecast range days (a) 1–7 (week 1), (b) 8–14 (week 2), and (c) 15–21 (week 3). The blue line corresponds to STAT, the red line corresponds to the ECMWF model, and the black line corresponds to CECMWF. In this graphic, the area of the symbols (octagons) for each probability bin is proportional to the number of cases populating that bin. The error bars (95% level of confidence) were computed from a 10 000 bootstrap resampling procedure.
ROC diagrams of the probability of TC occurrence in 20° × 15° domains in the Southern Hemisphere for the forecast range days 1–7 (week 1). The red line corresponds to ECMWF, the blue line corresponds to STAT, and the black line corresponds to CECMWF. The numbers in the figure correspond to the probability thresholds.
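A minimal sketch of how the points of such a ROC diagram and the corresponding ROC area could be obtained from probabilistic forecasts is given below. The function names are hypothetical, a "yes" forecast is assumed to be issued when the probability reaches the threshold, and the area is estimated with the trapezoidal rule; none of these choices is taken from the paper.

```python
def roc_points(probs, outcomes, thresholds):
    """Return one (false_alarm_rate, hit_rate) pair per probability threshold."""
    events = sum(outcomes)
    non_events = len(outcomes) - events
    points = []
    for t in thresholds:
        hits = sum(1 for p, o in zip(probs, outcomes) if p >= t and o == 1)
        false_alarms = sum(1 for p, o in zip(probs, outcomes) if p >= t and o == 0)
        points.append((false_alarms / non_events, hits / events))
    return points


def roc_area(points):
    """Trapezoidal area under the ROC curve, anchored at (0, 0) and (1, 1)."""
    curve = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))
```

An area of 0.5 corresponds to a forecast with no more discrimination than climatology, and an area of 1 to a perfect forecast. Because the ROC depends only on hit and false alarm rates, it is insensitive to the reliability errors that penalize the Brier skill score.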
Reliability diagrams of the probability of TC occurrence in 20° × 15° domains in the Southern Hemisphere for the forecast range week (a) 1, (b) 2, and (c) 3. The black line corresponds to CECMWF and the gray line corresponds to the multimodel combination of CECMWF and STAT. In this graphic, the area of the symbols (octagons) for each probability bin is proportional to the number of cases populating that bin. The error bars (95% level of confidence) were computed from a 10 000 bootstrap resampling procedure.
Potential economic value diagram of the multimodel probability of TC occurrence in 20° × 15° domains in the Southern Hemisphere for the forecast range week 1 (solid black curve), 2 (dotted black curve), and 3 (solid gray curve).
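The quantity plotted in this kind of diagram can be sketched with the standard static cost-loss model (e.g., Murphy 1977; Richardson 2000), which is assumed here to be the model behind the figure. The function and argument names are hypothetical. Expenses are expressed in units of the loss L, so that the cost-loss ratio C/L is the only user-dependent parameter.

```python
def potential_economic_value(hit_rate, false_alarm_rate, base_rate, cost_loss_ratio):
    """Relative value V of a yes/no forecast for a user with a given C/L.

    V = 1 for a perfect forecast and V = 0 for a forecast worth no more than
    climatology. Requires 0 < base_rate < 1 and 0 < cost_loss_ratio < 1.
    """
    s, a = base_rate, cost_loss_ratio
    # Climatology: always protect (expense a) or never protect (expense s).
    expense_climate = min(a, s)
    # Perfect forecast: protect only when the event occurs.
    expense_perfect = s * a
    # Actual forecast: cost for false alarms and hits, loss for misses.
    expense_forecast = (false_alarm_rate * (1.0 - s) * a
                        + hit_rate * s * a
                        + (1.0 - hit_rate) * s)
    return (expense_climate - expense_forecast) / (expense_climate - expense_perfect)
```

For a probabilistic forecast, each probability threshold gives its own hit and false alarm rates, and the curve in the diagram is usually taken as the maximum of V over all thresholds for each cost-loss ratio. When the cost-loss ratio equals the base rate, V reduces to the hit rate minus the false alarm rate.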
ROC score gridpoint map of the probability of TC occurrence in 20° × 15° domains in the Southern Hemisphere for the forecast range weeks 1–5 of the ECMWF model. Shaded areas indicate a ROC score larger than 0.5 (higher scores than climatology). White areas indicate a ROC score smaller than 0.5 (less skill than climatology).
BSSs of the probability of the occurrence of a TC in 20° × 15° domains in the Southern Hemisphere for the forecast range days 1–7 (week 1), 8–14 (week 2), and 15–21 (week 3) for CLIM, Variable CLIM, STAT, ECMWF, and CECMWF (calibrated ECMWF). The numbers in italics represent the 95% confidence interval that is calculated using a 10 000 bootstrap resampling procedure.
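A confidence interval of this kind could be estimated along the following lines. This is only a sketch under stated assumptions: forecast cases are resampled individually with replacement and the interval is read from percentiles of the resampled scores, whereas the resampling unit used in the paper is not specified here. The function names and the fixed random seed are hypothetical.

```python
import random


def bss_from_indices(probs, outcomes, ref_probs, indices):
    """Brier skill score computed over the selected cases only."""
    bs = sum((probs[i] - outcomes[i]) ** 2 for i in indices)
    bs_ref = sum((ref_probs[i] - outcomes[i]) ** 2 for i in indices)
    return 1.0 - bs / bs_ref


def bootstrap_bss_interval(probs, outcomes, ref_probs,
                           n_resamples=10000, level=0.95, seed=0):
    """Percentile bootstrap interval for the BSS (cases resampled with replacement)."""
    rng = random.Random(seed)
    n = len(probs)
    scores = []
    for _ in range(n_resamples):
        indices = [rng.randrange(n) for _ in range(n)]
        # Skip the rare resample for which the reference score is exactly zero.
        if sum((ref_probs[i] - outcomes[i]) ** 2 for i in indices) > 0.0:
            scores.append(bss_from_indices(probs, outcomes, ref_probs, indices))
    scores.sort()
    lower = scores[int((1.0 - level) / 2.0 * len(scores))]
    upper = scores[min(int((1.0 + level) / 2.0 * len(scores)), len(scores) - 1)]
    return lower, upper
```

Resampling individual cases treats them as independent; if forecasts for neighbouring domains or consecutive weeks are correlated, resampling whole start dates or seasons would give wider and more realistic intervals.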
BSS of the probability of the occurrence of a TC in 20° × 15° domains in the Southern Hemisphere for the forecast weeks 1, 2, and 3 for CECMWF and the multimodel combination of CECMWF with STAT (MULTI).