The Forecast Skill of Tropical Cyclone Genesis in Two Global Ensembles

Xiping Zhang (a,b), Juan Fang (b), and Zifeng Yu (a)

(a) Shanghai Typhoon Institute, and Key Laboratory of Numerical Modeling for Tropical Cyclone, China Meteorological Administration, Shanghai, China
(b) Key Laboratory for Mesoscale Severe Weather, School of Atmospheric Sciences, Nanjing University, Nanjing, China

Abstract

Tropical cyclone (TC) genesis forecasts during 2018–20 from two operational global ensemble prediction systems (EPSs) are evaluated over three basins in this study. The two ensembles are from the European Centre for Medium-Range Weather Forecasts (ECMWF-EPS) and the MetOffice in the United Kingdom (UKMO-EPS). The three basins are the northwest Pacific, the northeast Pacific, and the North Atlantic. The ensemble members in each EPS show a good level of agreement in forecast skill, but their forecasts are complementary: the probability of detection (POD) can be doubled by taking all member forecasts in the EPS into account. Even when an ensemble member does not produce a hit forecast, it may still predict the presence of cyclonic vortices. Statistically, a hit forecast has more nearby disturbance forecasts in the ensemble than a false alarm. Based on this analysis, we grouped the nearby forecasts at each model initialization time to define ensemble genesis forecasts and verified these forecasts to represent the performance of the ensemble system. The resulting PODs are more than twice those of the individual ensemble members at most lead times, reaching about 59% and 38% at the 5-day lead time in UKMO-EPS and ECMWF-EPS, respectively, while the success ratios are smaller than those of the ensemble members. In addition, predictability differs among basins: genesis events in the North Atlantic are the most difficult to forecast, with PODs at the 5-day lead time of only 46% and 23% in UKMO-EPS and ECMWF-EPS, respectively.

Significance Statement

Operational forecasting of tropical cyclone (TC) genesis relies greatly on numerical models. Compared with deterministic forecasts, ensemble prediction systems (EPSs) can provide uncertainty information for forecasters. This study examined the predictability of TC genesis in two operational EPSs. We found that the forecasts of ensemble members complement each other, and the detection ratio of observed genesis roughly doubles when the forecasts of all members are considered, as the multiple simulations conducted by the EPS partially reflect the inherent uncertainties of the genesis process. Successful forecasts are surrounded by more cyclonic vortices in the ensemble than false alarms, so this vortex information is used to group the nearby forecasts at each model initialization time into ensemble genesis forecasts when evaluating ensemble performance. The results demonstrate that global ensemble models can serve as a valuable reference for TC genesis forecasting.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Juan Fang, fangjuan@nju.edu.cn

1. Introduction

Tropical cyclones (TCs) are among the most catastrophic weather systems affecting humans. Accurate genesis forecasting is essential for mitigating risk. With the remarkable progress in computational power, numerical weather prediction (NWP) models are beginning to play an important role in TC genesis forecasting. Accordingly, numerous studies have been conducted to assess the reliability of model-indicated genesis (e.g., Liang et al. 2021; Halperin et al. 2013, 2016, 2020; Yamaguchi and Koide 2017; Li et al. 2016; Wang et al. 2018; Jaiswal et al. 2016; Komaromi and Majumdar 2014; Elsberry et al. 2011; Chan and Kwok 1999).

The assessment of genesis forecasting in deterministic models has been an active area of research (e.g., Liang et al. 2021; Halperin et al. 2013; Nakano et al. 2015; Cheung and Elsberry 2002; Chan and Kwok 1999). Halperin et al. (2013, 2016, 2017, 2020) carried out a series of studies to evaluate the performance of major global models out to 5 days in the North Atlantic and northeast Pacific (EP). Their results revealed that global models were increasingly able to predict TC genesis and that the probability of detection was greater over the EP basin. Liang et al. (2021) evaluated the performance of the European Centre for Medium-Range Weather Forecasts (ECMWF) deterministic forecast in predicting TC genesis out to 10 days prior to genesis in the northwest Pacific (WP), and found that about 70% (46%) of TCs could be predicted at a forecast lead time of 48 h (120 h). Nakano et al. (2015) investigated the predictability of TC genesis in the WP basin using a global nonhydrostatic atmospheric model, and their results suggested that the successful simulation of large-scale fields was the key to extended-range genesis forecasts. However, the thresholds used to identify model TCs and the definition of a successful forecast differed among these studies. For example, Liang et al. (2021) adopted 9 m s−1 at 925 hPa as the wind threshold to identify model genesis, Halperin et al. (2016) adopted 15.9 m s−1 at 925 hPa for the ECMWF model in the EP basin, while Chan and Kwok (1999) only checked the sea level pressure and vorticity; for the allowable position error when judging a successful forecast, Liang et al. (2021) required the forecast to lie within 5° of the observed genesis position, while Nakano et al. (2015) allowed 10°. The prediction ability of a model may vary under different thresholds and definitions of metrics (Magnusson et al. 2021).

In ensemble prediction systems (EPSs), multiple simulations with slightly different initial conditions or with different parameterizations are conducted to partially overcome the inherent uncertainties in weather systems (Pedlosky 1987) and model imperfections. The disagreement among ensemble members at a specific forecast can provide information about genesis uncertainty, like it does in TC track forecasting (Zhang and Yu 2017; Zhang et al. 2015). If a larger number of ensemble members predict the development of a tropical disturbance, then the probability of it actually developing into a tropical storm is greater (Yamaguchi and Koide 2017; Tsai et al. 2020; Jaiswal et al. 2016). Compared with deterministic forecasts, the forecasts by EPSs provide uncertainty information for operational forecasters. Therefore, it is of both scientific and forecasting interest to explore the prediction skill of TC genesis in EPSs.

However, studies on genesis forecasting in EPSs are comparatively few, partly because of the huge data volumes involved (Yamaguchi et al. 2015; Nakano et al. 2015). Li et al. (2016) and Wang et al. (2018) investigated genesis forecasting in the National Centers for Environmental Prediction’s (NCEP) Global Ensemble Forecasting System (GEFS) using the GEFS Reforecast version 2 dataset. This ensemble captured the climatological seasonality of TC genesis over different ocean basins well (Li et al. 2016), and the predictability of genesis associated with tropical transition was lower in the Atlantic (Wang et al. 2018). However, their results were based on the analysis of the ensemble mean, and the uncertainty contained in ensemble forecasts was not explored. Majumdar and Torn (2014) used the ratio of ensemble members that predict genesis at each forecast time to represent the uncertainty information, and Tsai et al. (2011) used the number of ensemble tracks surrounding the formation location to measure the uncertainty. Case-to-case variability was noted. However, limited by data availability, the evaluation by Majumdar and Torn (2014) was performed only on model genesis with initial disturbances present out to 6 days, and the skill test by Tsai et al. (2011) was conducted on TC occurrences (the whole life cycle). Komaromi and Majumdar (2014, 2015) examined the predictability of environmental conditions favorable to genesis in the Atlantic using the ECMWF ensemble forecasts, which was about a week, but they did not give statistics on successful forecasts and false alarms. The above-mentioned studies have enriched our understanding of genesis forecasting in EPSs. Nevertheless, a detailed assessment of genesis forecasts in global ensemble prediction systems, like the work Halperin et al. (2016) did for deterministic models, is still needed.

The objective of this paper is to give a comprehensive assessment of genesis forecasting in the operational ensembles from ECMWF and the U.K. MetOffice (UKMO) over the Pacific and Atlantic basins. The assessment takes into account the vortex information contained in the ensemble members, i.e., the divergence among the forecasts of the ensemble members. We do not perform vortex tracking ourselves; instead, we use the vortex-tracking output for the model data archived by the THORPEX Interactive Grand Global Ensemble (TIGGE) project, so this study does not involve determining optimal thresholds for TC trackers. The rest of this paper is organized as follows. In section 2, the dataset and methodology used in this study are introduced. In section 3, examples of genesis predictability in ensemble forecasts are shown and an overall evaluation of ensemble members is given. The difference between successful forecasts and false alarms in ensemble systems is also analyzed. Based on these results, the evaluation of EPS performance as a whole is given in section 4. The thresholds used to define a successful forecast are discussed in section 5. Major results are summarized in section 6.

2. Dataset and methodology

a. Dataset

Global ensembles of ECMWF and UKMO from 2018 to 2020 are used in this study (hereafter ECMWF-EPS and UKMO-EPS). Details about the two EPSs can be found in Table 1. UKMO-EPS runs four times each day, but only the forecasts initialized at 0000 and 1200 UTC are used, to match the initialization times of ECMWF-EPS. In addition, the 36-member ensemble of UKMO-EPS is generated by time lagging over 12 h from one control member and 17 perturbed members (MetOffice 2022). The forecast data of the two EPSs are decoded from the archive of the TIGGE project (Swinbank et al. 2016). In this study, genesis forecasts over three basins are investigated (Fig. 1): the northwest Pacific (WP), northeast Pacific (EP), and North Atlantic (AL). Geographically, the EP basin is a combination of the eastern North Pacific and central North Pacific basins.

Fig. 1. TC genesis locations during the period of 2018–20. Borders of the three TC basins (WP, EP, and AL) are indicated with green lines. Geographically, the EP basin includes the eastern North Pacific basin and central North Pacific basin.

Table 1. Specifications for ECMWF-EPS and UKMO-EPS during 2018–20.

The best track data are obtained from the Japan Meteorological Agency (JMA) for WP, from the National Hurricane Center (NHC) for the AL and eastern North Pacific basins, and from the Central Pacific Hurricane Center (CPHC) for the central North Pacific basin. They are all decoded from the IBTrACS dataset (Knapp et al. 2010). The time of genesis is defined as the first declaration of a tropical storm [35 kt (1 kt ≈ 0.51 m s−1)] in the best track data. A total of 202 storms formed during the 3-yr period of 2018–20 (Fig. 1), of which 80 (39.6%) were in WP, 63 (31.2%) were in AL, and 59 (29.2%) were in EP.

b. Methodology

1) The identification of TC genesis forecasts in EPS datasets

The track forecast data are archived in CXML format (Swinbank et al. 2016), which includes the 6-hourly center location and intensity estimates (wind speed and central pressure) of each cyclonic vortex during the 10-day forecast window. These vortices include those that may develop into TCs as well as TCs already named at the model initialization time. The vortex tracking algorithms for UKMO-EPS and ECMWF-EPS can be found in the work of Heming (2017) and Magnusson et al. (2021), respectively. Since warm-core information is not provided in the dataset, we rely only on wind speed to identify genesis forecasts. Our evaluation mainly focuses on the tropical and subtropical oceans (Fig. 1), and the extratropical cyclones at the later stage of TCs can easily be excluded by location and name, so the interference from extratropical cyclones is limited. A maximum wind speed of 18 m s−1 in the forecast is used to exclude preexisting TCs, and 16.5 m s−1 (about 32 kt, instead of 35 kt) is used to judge model genesis, a threshold adjusted downward to account for the model data resolution (Walsh et al. 2007; Wang et al. 2018). To be more specific, a predicted vortex is considered a TC genesis forecast if its wind speed is smaller than 18 m s−1 at the model initialization time and it maintains a wind speed larger than 16.5 m s−1 for at least 24 h. The latter condition can be satisfied at any time during a forecast cycle, and the start of the period during which the wind exceeds this threshold for at least 24 h is taken as the time the model TC forms.
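As a concrete illustration of this rule, the sketch below (Python) checks one tracked vortex against the two wind thresholds under an assumed data layout (6-hourly forecast hours and maximum winds); the function name and inputs are illustrative and are not part of the operational trackers.

    # Minimal sketch of the genesis-identification rule, assuming 6-hourly
    # tracker output for a single vortex; not the operational code.
    def find_model_genesis_time(times_h, vmax_ms, preexist_thresh=18.0,
                                genesis_thresh=16.5, persist_h=24, dt_h=6):
        """Return the forecast hour at which model genesis occurs, or None."""
        if vmax_ms[0] >= preexist_thresh:   # already a TC at initialization
            return None
        need = persist_h // dt_h + 1        # 5 consecutive 6-hourly records span 24 h
        run = 0
        for t, v in zip(times_h, vmax_ms):
            if v > genesis_thresh:
                run += 1
                if run >= need:
                    return t - persist_h    # start of the qualifying 24-h window
            else:
                run = 0
        return None

    # Example: a vortex that stays above 16.5 m/s from forecast hour 48 onward.
    hours = list(range(0, 126, 6))
    winds = [12.0] * 8 + [17.5] * 13
    print(find_model_genesis_time(hours, winds))   # prints 48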

2) Genesis verification

A model genesis forecast is defined as a hit if the model TC forms within 72 h of the observed TC genesis time and within a 5° radius of the observed position (Table 2), otherwise it is categorized as a false alarm. A miss is defined as the case when the model or an ensemble member fails to produce a hit for an observed TC. At a model initialization time, if multiple model genesis forecasts meet the hit criteria for the same observed genesis event, which happens about 15 times for each ensemble member during 2018–20, we will choose between the forecast with the smallest position error and the forecast with the smallest time error. Specifically, if the time error of the latter is more than 48 h smaller than that of the former and its position error is less than 1° larger than that of the former, then the latter will be regarded as a hit, and the rest as false alarms; otherwise, the former will be considered a hit, and the rest are false alarms. The selection of 5° mainly refers to previous studies (e.g., Liang et al. 2021; Halperin et al. 2020; Wang et al. 2018). More discussion on the selection of time and radius thresholds can be found in section 5.
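For clarity, the following sketch applies the hit criteria and the tie-break rule above to one member's genesis forecasts for a single observed event; the data layout (dictionaries with time errors in hours and positions in degrees) is assumed for illustration, and the angular separation is a simple approximation rather than necessarily the exact distance measure used operationally.

    import math

    def sep_deg(lat1, lon1, lat2, lon2):
        # Approximate angular separation in degrees (adequate at these scales).
        dlat = lat2 - lat1
        dlon = (lon2 - lon1) * math.cos(math.radians(0.5 * (lat1 + lat2)))
        return math.hypot(dlat, dlon)

    def classify_for_event(forecasts, obs_lat, obs_lon, max_dt_h=72.0, max_deg=5.0):
        """Return (hit_index, other_candidate_indices) for one observed event.

        forecasts: list of dicts with 'lat', 'lon', and 'dt_h' (forecast genesis
        time minus observed genesis time, in hours). Forecasts matching no
        observed event at all are counted as false alarms elsewhere.
        """
        cands = [(i, abs(f['dt_h']), sep_deg(f['lat'], f['lon'], obs_lat, obs_lon))
                 for i, f in enumerate(forecasts)]
        cands = [c for c in cands if c[1] <= max_dt_h and c[2] <= max_deg]
        if not cands:
            return None, []                         # a miss for this event
        best_pos = min(cands, key=lambda c: c[2])   # smallest position error
        best_time = min(cands, key=lambda c: c[1])  # smallest time error
        hit = best_pos
        # Prefer the smallest-time-error forecast only if it is >48 h closer in
        # time and <1 deg farther in space than the smallest-position-error one.
        if best_pos[1] - best_time[1] > 48.0 and best_time[2] - best_pos[2] < 1.0:
            hit = best_time
        others = [c[0] for c in cands if c[0] != hit[0]]   # become false alarms
        return hit[0], others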

Table 2. A summary of the definitions in this study.

Two metrics are employed to evaluate the predictive skill of the two EPSs, including the probability of detection (POD) and success ratio (SR). Their definitions are as follows:
\mathrm{POD} = \frac{N_{\mathrm{hit}}}{N_{\mathrm{hit}} + N_{\mathrm{miss}}},  (1)
\mathrm{SR} = \frac{N_{\mathrm{hit}}}{N_{\mathrm{hit}} + N_{\mathrm{false\_alarm}}},  (2)
where N_hit, N_false_alarm, and N_miss are the numbers of hits, false alarms, and misses, respectively. This study deals with more than one type of POD and SR, so a brief clarification is given here. First, each ensemble member is evaluated as a deterministic model in section 3b, where the genesis forecasts of a specific member are classified as hits, misses, or false alarms, and the average of the PODs (SRs) of the different members in an EPS is also given. Then, in the calculation of the ensemble POD (section 4a), for each observed genesis event, a hit of the EPS at a given lead time is defined as at least 5 hit forecasts among all the members initialized at that time; otherwise it is considered a miss, and the POD is calculated as in Eq. (1). Since a false alarm cannot be defined in this situation, only POD is discussed in section 4a. In section 4c, we pick out genesis forecasts by grouping the nearby genesis forecasts from all members at each model initialization time to represent the forecasts of an EPS, and then calculate the SRs and PODs of these selected forecasts.
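Written out as code, the two metrics are simple ratios; the counts in the example below are illustrative only and are not taken from the verification statistics.

    def pod(n_hit, n_miss):
        """Probability of detection, Eq. (1)."""
        return n_hit / (n_hit + n_miss)

    def success_ratio(n_hit, n_false_alarm):
        """Success ratio, Eq. (2)."""
        return n_hit / (n_hit + n_false_alarm)

    # Illustrative counts only: 59 hits out of 100 observed events gives POD = 0.59.
    print(pod(59, 41))            # 0.59
    print(success_ratio(30, 70))  # 0.3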

Greater values of POD and SR indicate better model performance (Halperin et al. 2016). For a perfect model, the metrics in Eqs. (1) and (2) would equal unity. In the subsequent analysis, the PODs are given as a function of forecast lead time, and the SRs are given as a function of forecast time. Here and below, “forecast lead time” refers to the model initialization time relative to the observed genesis time, while “forecast time” refers to the time within each model cycle (Fig. 2). Therefore, N_hit and N_miss, related to POD in Eq. (1), refer to numbers of model initializations, while N_hit and N_false_alarm, related to SR in Eq. (2), refer to numbers of genesis forecasts in the model. As a distinction, “forecast lead time” is expressed in days, and “forecast time” is expressed in hours. Initialization times that are missing from the TIGGE dataset are excluded from the calculation of PODs. In the 10 days preceding genesis, the data for UKMO-EPS are missing in 5.9% of all cycles and the data for ECMWF-EPS are missing in 10.5% of all cycles. Therefore, the comparison between UKMO-EPS and ECMWF-EPS in this study is nonhomogeneous because of the missing data. In addition, the vortex tracking algorithms are not exactly the same at the two centers. The differences in PODs and SRs between UKMO-EPS and ECMWF-EPS may also be influenced by these factors. For descriptive convenience, some results are described in a comparative manner in the following sections; however, this study is not intended to compare the performance of the two EPSs but to evaluate each EPS separately.

Fig. 2. The schematic diagram of forecast lead time and forecast time.

3. Genesis forecasting in ensemble members

In this section, an overall evaluation of member forecasting will be given. First, we will give some examples of ensemble genesis forecasts by demonstrating the evolution of the number of ensemble members which make a hit forecast as genesis approaches (section 3a). Then, the forecasts of individual ensemble members will be evaluated like deterministic models (section 3b).

a. Examples of genesis predictability in ensemble forecasts

We take the forecasts for TCs in 2018 as an example to show the predictability of genesis in the EPSs. Because of missing data at some initialization times, only TCs without missing data in the 10 days preceding genesis are shown here. Figures 3a and 4a show the daily number of ensemble members with successful forecasts (hits) for these TCs at the forecast lead times of interest in UKMO-EPS and ECMWF-EPS. There are two initialization times each day; therefore, for a perfect prediction, the number should be twice the ensemble size, namely, 72 for UKMO-EPS and 102 for ECMWF-EPS. However, the number of ensemble members with hits rarely reaches 72 or 102 (Figs. 3a, 4a). In addition, although the ensemble size is smaller, the number on each lead day is relatively greater in the forecasts of UKMO-EPS. For ECMWF-EPS, the number rarely reaches half of the ensemble size in the AL and EP basins.

Fig. 3. (a) The daily number of ensemble members with correct forecasts (hits) for TCs in 2018 in the three basins in UKMO-EPS. (b) As in (a), but for the number of ensemble members with disturbance forecasts. The numbers on the x axis denote the TC number in each basin. The numbers above the bars in (a) give the genesis location of the corresponding TC. The numbers above the bars in (b) give the maximum intensity of the corresponding TC within 2 days of genesis. The daily number of ensemble members is labeled in numerical form. For a perfect prediction, the number should be twice the ensemble size. The percentage of members with correct forecasts (the daily number divided by twice the ensemble size) is given in color shading.

Fig. 4. As in Fig. 3, but for forecasts in ECMWF-EPS.

To check whether corresponding cyclonic vortices exist in the ensemble members without successful forecasts, we calculate the number of ensemble members with disturbance forecasts within ±72 h of the observed genesis time and within a 5° radius of the genesis location (Table 2). The wind threshold for a disturbance should be smaller than that for genesis so as to include more cyclone information, given the slow progress in TC intensity forecasting (DeMaria et al. 2014). Here, a disturbance is defined as a cyclonic vortex with a life span of more than two days and a maximum intensity greater than 14 m s−1 within the abovementioned range of time and distance. The selection of 14 m s−1 mainly refers to the minimum threshold given by Halperin et al. (2016). The number of ensemble members with disturbance forecasts on each lead day for UKMO-EPS and ECMWF-EPS is given in Figs. 3b and 4b. Comparing Fig. 3a (Fig. 4a) with Fig. 3b (Fig. 4b), we find that although some ensemble members do not produce a hit, they do predict a disturbance near the time and location of the observed genesis, especially in the forecasts by ECMWF-EPS in the AL and EP basins. The numbers are generally larger in ECMWF-EPS, consistent with its larger ensemble size. The number of ensemble members with disturbance forecasts in ECMWF-EPS is much greater than the number of ensemble members with hits (Figs. 4a,b), which to some extent reflects the weaker cyclone intensity predicted by ECMWF-EPS.
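The disturbance criterion can be expressed compactly; the sketch below assumes the tracker output for one vortex is available as parallel lists of forecast times (in hours relative to the observed genesis), positions, and maximum winds, which is an illustrative layout, and it uses the same approximate angular separation as the earlier verification sketch.

    import math

    def _sep_deg(lat1, lon1, lat2, lon2):
        # Approximate angular separation in degrees.
        dlat = lat2 - lat1
        dlon = (lon2 - lon1) * math.cos(math.radians(0.5 * (lat1 + lat2)))
        return math.hypot(dlat, dlon)

    def is_disturbance_forecast(rel_times_h, lats, lons, vmax_ms, obs_lat, obs_lon,
                                min_life_h=48, min_vmax=14.0,
                                max_dt_h=72.0, max_deg=5.0):
        """True if a tracked vortex qualifies as a disturbance forecast for an
        observed genesis event: a life span of more than 2 days, and a wind
        above 14 m/s at some point within +/-72 h and 5 deg of the observed
        genesis. rel_times_h are hours relative to the observed genesis time."""
        if rel_times_h[-1] - rel_times_h[0] <= min_life_h:
            return False
        return any(v > min_vmax
                   and abs(t) <= max_dt_h
                   and _sep_deg(la, lo, obs_lat, obs_lon) <= max_deg
                   for t, la, lo, v in zip(rel_times_h, lats, lons, vmax_ms))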

The performance of the two ensemble forecasts differs in different basins and for different TCs, consistent with the case-to-case variability observed by Tsai et al. (2011), Majumdar and Torn (2014), and Komaromi and Majumdar (2015). For most cases, ensemble forecasts can give signs of genesis five days before a TC forms. Nakano et al. (2015) simulated eight TCs that formed in the WP basin in August 2004, and only two TCs were not successfully reproduced. They found that the two TCs were generated at relatively high latitudes (north of 26°N) and were weak during their lifetimes (minimum pressure > 990 hPa). Here, we also show the genesis latitude (Figs. 3a and 4a) and the maximum intensity within 48 h after genesis (Figs. 3b and 4b). It is found that what Nakano et al. (2015) observed cannot be generalized to genesis in the three basins throughout the year. According to the analyses of Liang et al. (2021) in WP and Wang et al. (2018) in AL, and the case studies by Xiang et al. (2015), the predictability of TC genesis is largely affected by the large-scale flow regimes. Therefore, the relationship between predictability and the genesis latitude or maximum intensity is not as clear-cut as what Nakano et al. (2015) observed based on a limited set of monsoon-related genesis events. Since our sample size is relatively small and our main concern is the overall performance of the EPSs, genesis forecasting in different flow regimes is not investigated in this study.

The above analysis is based on the forecasts for 2018 TCs, and the results are similar for the 2019 and 2020 forecasts. Next, the forecasts of individual ensemble members during 2018–20 will be evaluated.

b. An overall evaluation of ensemble members

Figures 5 and 6 show the evaluation results of several ensemble members in the two EPSs, including the control run (EPS00) and a few randomly selected members. The ensemble means in the figures denote the mean of all members.

Fig. 5. (a) The probability of detection (POD) as a function of the forecast lead time for several ensemble members in UKMO-EPS during 2018–20. (b) As in (a), but for the success ratio (SR). The black dots and their nearby numbers denote the mean of all ensemble members; the bottom and top of the black bars are the 10% and 90% percentiles of all the ensemble members, respectively. The colored numbers next to the bars in (b) indicate the sample sizes in the corresponding bars.

Fig. 6. As in Fig. 5, but for ECMWF-EPS during 2018–20.

As for POD, the ensemble members in UKMO-EPS show a good level of agreement; those in ECMWF-EPS also show a good level of agreement except for the control member, which performs a bit worse than the other members. Overall, UKMO-EPS performs better than ECMWF-EPS through the 7-day lead time. The POD at the 3-day (5-day) lead time before genesis is about 0.37 (0.25) for UKMO-EPS, but only 0.25 (0.16) for ECMWF-EPS. This result is consistent with the analysis in section 3a. As for SR, the performance of the control run in both EPSs is better than that of most members. For forecast times after 96 h, the SRs of UKMO-EPS and ECMWF-EPS are very close; the SRs are around 0.2, with those of ECMWF-EPS slightly larger. We also note that the numbers of genesis forecasts in UKMO-EPS are larger than those in ECMWF-EPS for forecast times before 174 h, as indicated by the numbers beside the SR bars in Figs. 5b and 6b. Therefore, although the POD of ECMWF-EPS is lower, partly because of the adopted threshold settings, its SR does not increase much. For the control run, the sample size of genesis forecasts is smaller than that of the other members. We speculate that the perturbations added to the ensemble members result in the generation of more disturbances than in the control run.

4. Genesis forecasting in the ensemble

In section 3b, we found that the ensemble members exhibited similar SRs and PODs. Wang et al. (2018) and Li et al. (2016) took advantage of this agreement among ensemble members and only discussed the ensemble mean when evaluating the genesis forecasting of an ensemble system. However, they did not check whether the TCs and lead times behind the similar PODs are also the same. In Figs. 3a and 4a, the successful forecasts of ensemble members are scattered across different lead times and different TCs; that is, the perturbations added to ensemble members may cause them to behave differently for different genesis events, even though statistically they exhibit similar SRs and PODs. In this section, we will compare the detection of observed genesis in the ensemble with the mean POD of all ensemble members (section 4a), analyze the difference between hits and false alarms (section 4b), and then evaluate the EPS as a whole (section 4c). The performance of the EPSs in different basins will be given in section 4d.

a. The ensemble PODs

To check the forecasts of observed genesis by the EPS, we take the hit forecasts of all ensemble members into account and use “ensemble PODs” to refer to the result. To be more specific, when calculating ensemble PODs, a hit of the EPS at a given lead time is defined as at least 5 hits among all the members initialized at that time point (Table 2). Based on the statistics given in Figs. 3a and 4a, the percentage of the ensemble size with hits is not an effective indicator of the probability of genesis, so we do not take it as the threshold. If other numbers are chosen as thresholds, such as 2 or 10, the ensemble PODs will increase or decrease accordingly, and the PODs are highest when a threshold of 1 is adopted. The selection of 5 mainly refers to the numbers at longer forecast lead times in Figs. 3a and 4a and is intended to illustrate the advantage of the ensemble system over individual ensemble members.
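A sketch of this counting rule follows, with one boolean per member indicating whether that member produced a hit for the event at this initialization time; the input layout is assumed for illustration.

    def ensemble_hit(member_hits, min_hits=5):
        """True if the EPS scores a hit for one observed event at one
        initialization time, i.e., at least `min_hits` members produced
        a hit forecast."""
        return sum(member_hits) >= min_hits

    # Illustrative use: 7 of the 36 UKMO-EPS members hit at this lead time.
    print(ensemble_hit([True] * 7 + [False] * 29))   # True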

Figures 7a and 7b compare the PODs of ensemble forecasts (blue bars) with the mean POD of all ensemble members (green bars), where “ens_F” (orange bars) will be defined in a later subsection. The ensemble PODs are much greater than the mean PODs. For example, for the 5-day lead time in UKMO-EPS (ECMWF-EPS), the ensemble POD is about 0.65 (0.57), while the mean is only about 0.25 (0.16). The larger ensemble PODs indicate that hit forecasts of the individual ensemble members are complementary, and the performance of a specific ensemble member cannot reflect that of the ensemble. The complementarity among ensemble members reflects that the multiple simulations performed by EPS have partially captured the inherent uncertainty of the genesis process. Therefore, although the performance of individual ensemble members is similar, a better decision can be made by taking their forecasts as a whole.

Fig. 7. (a) PODs of the ensemble, ens_F, and the mean of all ensemble members as a function of the forecast lead time in UKMO-EPS. (b) As in (a), but for ECMWF-EPS. The definitions of ensemble PODs and ens_F PODs can be found in the text. The data include forecasts from all three basins.

Figure 8 shows the ensemble PODs of the super ensemble of ECMWF-EPS and UKMO-EPS, along with the ensemble PODs of ECMWF-EPS and UKMO-EPS for comparison. The comparisons are performed with the same samples. Here, a hit of the super ensemble requires at least 5 hits from the super ensemble of two EPSs, which is the same as that for a single EPS. It is shown that the PODs of the super ensemble are the largest at all lead times. To some extent, the forecasts of the two EPSs also complement each other. Therefore, it is a better choice to use the super ensemble of multiple EPSs when developing the genesis forecast scheme based on EPS, which is beyond the scope of this study and will be carried out in the near future.

Fig. 8. Ensemble PODs as a function of the forecast lead time for ECMWF-EPS, UKMO-EPS, and the superensemble of ECMWF-EPS and UKMO-EPS. The comparisons are performed with the same samples. The data include forecasts from all three basins.

b. The difference between hits and false alarms

One of the advantages of ensemble forecasts is that they provide information on genesis uncertainty. However, based on section 3 and previous studies (e.g., Majumdar and Torn 2014; Tsai et al. 2011), the ratio of ensemble members with genesis forecasts is not an effective indicator of the probability of genesis. In this section, we check whether a hit in an ensemble member is accompanied by more genesis forecasts in the EPS and whether a false alarm is accompanied by fewer.

Figures 9a and 9b illustrate the distribution of the number of genesis forecasts near the hits and false alarms. The nearby genesis forecasts are those within ±72 h of the forecast time and within 3° of the forecast location (Table 2). The reference point for nearby genesis forecasts is each genesis forecast in each ensemble member at every model initialization time. The selection of 3° instead of 5° is to pick out genesis forecasts that are closer to the reference point. Besides, convection within 3°, which is often associated with vorticity generation, is a common concern in studies of TC genesis and intensification (Lee et al. 2008; Zawislak 2020; Ruan and Wu 2018). Surprisingly, in Figs. 9a and 9b, the mean numbers of nearby forecasts for hits and false alarms are close to each other in both EPSs, and as forecast time increases, the mean and median numbers do not change much. Except for three forecast intervals (two for UKMO-EPS and one for ECMWF-EPS), the differences in the nearby numbers between hits and false alarms are not significant at the 95% confidence level in Levene’s test. Moreover, although the ensemble size of ECMWF-EPS is larger, the mean number of nearby forecasts at all forecast times is not larger than that of UKMO-EPS; both are around 10.
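The counting and the significance test can be sketched as follows; the forecast tuples and sample counts are made up for illustration, and the comparison uses scipy.stats.levene, the same test referred to above.

    import math
    import numpy as np
    from scipy.stats import levene

    def count_nearby(ref, forecasts, max_dt_h=72.0, max_deg=3.0):
        """Count forecasts within +/-72 h and 3 deg of a reference forecast.
        Each forecast is a (time_h, lat, lon) tuple (illustrative layout)."""
        def sep_deg(lat1, lon1, lat2, lon2):
            dlat = lat2 - lat1
            dlon = (lon2 - lon1) * math.cos(math.radians(0.5 * (lat1 + lat2)))
            return math.hypot(dlat, dlon)
        return sum(1 for t, la, lo in forecasts
                   if abs(t - ref[0]) <= max_dt_h
                   and sep_deg(la, lo, ref[1], ref[2]) <= max_deg)

    # Compare the spread of counts between hits and false alarms (made-up values).
    counts_hit = np.array([12, 9, 15, 11, 14, 10, 13])
    counts_false = np.array([8, 11, 9, 12, 10, 9, 11])
    stat, p = levene(counts_hit, counts_false)
    print(f"Levene statistic = {stat:.2f}, p value = {p:.3f}")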

Fig. 9. (a) Boxplot of the number of genesis forecasts near each genesis forecast at different forecast times in UKMO-EPS. (b) As in (a), but for ECMWF-EPS. The bottom and top of the box are the first and third quartiles, respectively; the percentiles for the upper and lower whiskers are 2.5% and 97.5%, respectively. The diamond mark inside the box denotes the mean value, the line inside the box denotes the median, and the plus marks not included between the whiskers are outliers. The numbers in color denote the sample size. The solid semicircle on the x axis indicates that the difference in number between hits and false alarms at that forecast interval is significant under Levene’s test with a 95% confidence level. The data include forecasts from all three basins.

Section 3a shows that there may be active cyclonic vortices around even if an ensemble member does not produce a hit forecast. Therefore, we also compare the number of nearby disturbance vortices between hits and false alarms (Figs. 10a,b). Here, the definition of a nearby disturbance is the same as that in section 3a, except that it is relative to each genesis forecast and the radius is 3° (Table 2). It is worth mentioning that one ensemble member may have multiple disturbance forecasts at each model initialization time. The differences in the number of nearby disturbances between hits and false alarms are greater than those for genesis forecasts. Except for one forecast interval, the differences are significant at the 95% confidence level in Levene’s test. For the only exception (102–120 h for ECMWF-EPS), the median and mean values are also much greater in the hit category. As forecast time increases, the mean and median decrease for both hits and false alarms. Furthermore, the number of nearby disturbances in ECMWF-EPS is larger than that in UKMO-EPS. As shown in Figs. 9a, 9b, 10a, and 10b, the number of nearby disturbance vortices discriminates between hits and false alarms better than the number of nearby genesis forecasts does. In addition to the time window of ±72 h, we also tested ±36 and ±48 h. The change in time window mainly affects the mean value but does not qualitatively affect the conclusions.

Fig. 10. As in Fig. 9, but for the number of disturbance forecasts accompanying each genesis forecast. The solid semicircle on the x axis indicates that the difference in number between hits and false alarms at that forecast interval is significant under Levene’s test with a 95% confidence level. The data include forecasts from all three basins.

c. The performance of EPS as a whole

Genesis forecasting in ensemble members was evaluated in section 3. However, because of the complementarity of forecasts among ensemble members, those results do not fully reflect the ensemble performance. In the last subsection, we found that a hit forecast has more nearby disturbance forecasts in the ensemble than a false alarm. In this subsection, we calculate the PODs and SRs of the EPS by considering the forecasts of all members, namely, the nearby disturbances.

Since a genesis forecast in an ensemble member is not equivalent to that of the ensemble system, we first need to determine what a genesis forecast is in ensemble systems. Therefore, before the evaluation, we will group the nearby forecasts at each model initialization time to define an ensemble genesis forecast. There are several thresholds involved in this process. Different thresholds may lead to different PODs and SRs. Since this study does not aim to develop an optimal genesis forecast scheme, we will not discuss the optimal thresholds, which also vary from sample to sample.

As preparation, we count the number of disturbance forecasts near each genesis forecast of the ensemble and discard the forecasts whose count falls below specified values. The selection of these thresholds mainly refers to the average number of nearby disturbances around the false alarms, namely, the blue diamonds in Figs. 10a and 10b. When the average number is less than 20 (12) in ECMWF-EPS (UKMO-EPS), it is adopted as the threshold for that forecast interval; otherwise the threshold is set to 20 (12). The remaining genesis forecasts at each model initialization time are then grouped further to pick out forecasts that represent the ensemble. Taking the remaining forecasts at one model initialization time as an example, the detailed steps are as follows:

  1. Sort the forecasts at this model initialization time in descending order according to the number of nearby disturbance forecasts. Here, the sorted forecasts are referred to as FORECASTS_SORT.

  2. Start with the first genesis forecast (FORECAST_A with N nearby disturbance forecasts) in FORECASTS_SORT. Find the forecasts within 72 h and 5° of FORECAST_A in FORECASTS_SORT, and refer to them as GROUP.

  3. Find the forecasts in GROUP with the number of nearby disturbances greater than or equal to (N − 2), and take their position average and time average as the forecast for this group, that is, FORECAST_G. FORECAST_G is regarded as a genesis forecast of the EPS.

  4. Delete the genesis forecasts within 72 h and 5° of FORECAST_G in FORECASTS_SORT.

  5. Repeat steps 2–4 until there are no forecasts left in FORECASTS_SORT.

All the FORECAST_Gs we get in this process are the ensemble genesis forecasts at this model initialization time, which are either more than 72 h apart in time or more than 5° apart in space. The thresholds of 72 h and 5° are the two values to define a hit forecast. Compared with the ensemble PODs in section 4a, the evaluation is carried out from an operational forecasting point of view, as some criteria are adopted to define an ensemble genesis forecast. The results are shown in Figs. 7 and 11, as bars labeled “ens_F.” Here, ens_F is short for “ensemble forecast.”
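A sketch of this grouping procedure is given below; the dictionary layout for each member forecast (time in hours, latitude, longitude, and the count of nearby disturbance forecasts) is assumed for illustration, the angular separation is an approximation, and the leading forecast is dropped explicitly at each pass to guarantee termination.

    import math

    def sep_deg(lat1, lon1, lat2, lon2):
        # Approximate angular separation in degrees.
        dlat = lat2 - lat1
        dlon = (lon2 - lon1) * math.cos(math.radians(0.5 * (lat1 + lat2)))
        return math.hypot(dlat, dlon)

    def group_ensemble_forecasts(forecasts, max_dt_h=72.0, max_deg=5.0, n_tol=2):
        """Group one cycle's member genesis forecasts into ensemble genesis
        forecasts, following steps 1-5 above. Each forecast is a dict with
        keys 'time_h', 'lat', 'lon', and 'n_dist' (nearby-disturbance count)."""
        # Step 1: sort by the number of nearby disturbance forecasts, descending.
        pool = sorted(forecasts, key=lambda f: f['n_dist'], reverse=True)
        ens_forecasts = []
        while pool:
            # Step 2: take the leading forecast and collect its neighbors (GROUP).
            lead = pool[0]
            group = [f for f in pool
                     if abs(f['time_h'] - lead['time_h']) <= max_dt_h
                     and sep_deg(f['lat'], f['lon'], lead['lat'], lead['lon']) <= max_deg]
            # Step 3: average the members whose count is within n_tol of the lead's.
            core = [f for f in group if f['n_dist'] >= lead['n_dist'] - n_tol]
            fg = {k: sum(f[k] for f in core) / len(core)
                  for k in ('time_h', 'lat', 'lon')}
            ens_forecasts.append(fg)            # FORECAST_G for this group
            # Step 4: remove forecasts within 72 h and 5 deg of FORECAST_G.
            pool = [f for f in pool
                    if f is not lead
                    and (abs(f['time_h'] - fg['time_h']) > max_dt_h
                         or sep_deg(f['lat'], f['lon'], fg['lat'], fg['lon']) > max_deg)]
            # Step 5: repeat until the pool is empty.
        return ens_forecasts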

Fig. 11. (a) SRs of ens_F (blue bars) and SRs of the mean of all ensemble members (orange bars) in UKMO-EPS. (b) As in (a), but for ECMWF-EPS. The data include forecasts from all three basins.

The PODs of ens_F are much better than those of the mean of all ensemble members for both UKMO-EPS and ECMWF-EPS, and the values are more than twice those of the mean at most lead times. They are comparable to the ensemble PODs for UKMO-EPS; for ECMWF-EPS they are slightly larger than the ensemble PODs within the 3-day lead time but smaller afterward. We also compare the SRs of ens_F with the mean SRs of all ensemble members (Fig. 11). In the evaluation of ensemble members (Figs. 5b and 6b), the SR decreases with increasing forecast time, whereas the SR of ens_F does not change much with increasing forecast time. The SRs of ens_F are much smaller than the mean SRs for both UKMO-EPS and ECMWF-EPS within the forecast time of 96 h, and are close to the mean SRs afterward. That is, as the PODs of ens_F improve, the false alarm ratio also increases, and the increase is pronounced for forecast times within 96 h. For this forecast interval, we found that it is not possible to effectively improve the SRs while keeping the PODs close to the ensemble PODs merely by modifying the thresholds involved in this subsection. Further studies are still needed to develop an optimal genesis forecast scheme.

d. The performance in different basins

Figure 12a gives the PODs (ens_F) in different basins. In both EPSs, the PODs are lowest in AL at all forecast lead times. Taking the PODs at the 5-day lead time as an example, for UKMO-EPS the values in the WP, EP, and AL basins are 0.68, 0.63, and 0.46, respectively; for ECMWF-EPS, the values are 0.43, 0.46, and 0.23, respectively. For UKMO-EPS, the PODs are much higher in WP within the 3-day lead time, much higher in EP at the 7- and 8-day lead times, and close to each other in WP and EP at the other lead times. For ECMWF-EPS, the PODs are highest in WP within the 4-day lead time and highest in EP afterward.

Fig. 12. (a) The PODs of ens_F in different basins. (b) The SRs of ens_F in different basins.

Figure 12b gives the SRs (ens_F) in different basins. For UKMO-EPS, at the forecast intervals of 30–48 and 174–192 h, the SRs in EP are more than 10% smaller than those in WP and AL; at the forecast interval of 126–144 h, the SR in AL is more than 5% smaller than those in the other two basins; at the other forecast intervals, the SRs of the three basins are relatively close to each other. For ECMWF-EPS, the SRs in EP are the highest except for the forecast interval of 30–48 h, and they are much better than those in WP and AL at the last two forecast intervals; the SR in WP at the forecast interval of 102–120 h is more than 5% smaller than those in the other two basins; at the last two forecast intervals, the SRs in AL are more than 10% smaller than the best SRs.

In short, the performance of the EPSs differs among basins, and genesis in AL is the most difficult to forecast for the EPSs, which is consistent with the assessments by Halperin et al. (2016, 2020) for deterministic models and by Wang et al. (2018) for the ensemble mean.

5. Discussion on the thresholds to define a successful forecast

In the above discussions, a hit was defined for a model TC that forms within 72 h of the observed genesis time and within a 5° radius of the observed track. In previous research on model genesis, 5° is often adopted as the distance threshold (e.g., Liang et al. 2021; Halperin et al. 2020; Wang et al. 2018). There are exceptions, e.g., 10° by Nakano et al. (2015) and 5° × 5° surrounding the genesis location by Tsai et al. (2011). As for time range, there is also no consensus. Within 120 h of observed genesis time is adopted by Wang et al. (2018) and Li et al. (2016), and within 120 h of model initialization time is adopted by Halperin et al. (2016, 2020). More specifically, if a TC forms at 126 h after the initialization, and the model predicts a formation at 114 h, it is a hit under the first criterion and a false alarm under the latter criterion. It is worth noting that Halperin et al. (2016, 2020) only verified genesis forecast out to 5 days. In contrast, Liang et al. (2021) adopted 72 h before genesis time and 48 h after genesis time. To evaluate the impact of the two thresholds, we will investigate the time error and position error of the genesis forecasts. Here, if the forecast genesis time is later than the observed genesis time, the time error is considered to be positive (+), otherwise it is negative (−).

Figures 13a and 13b illustrate the distribution of the two errors for the control runs of UKMO-EPS and ECMWF-EPS, respectively. The blue dots denote model TCs that form within 120 h of the observed genesis times and within a 10° radius of the observed tracks. In the calculation, for a genesis forecast, we first check whether there is an observed genesis event within ±72 h and 3°; if not, we try ±72 h and 5° of the model genesis forecast; if not again, we try ±120 h and 5°; and if still not, we finally try ±120 h and 10°. In this distribution, 57.11% of the cases lie within ±72 h and 5° in the control run of UKMO-EPS, and 59.5% in that of ECMWF-EPS. The remaining points are mainly spread along the position-error axis, far from the observed genesis location. The distribution for the other ensemble members is similar. As shown in Figs. 13a and 13b, the window of ±72 h (within 72 h) captures the genesis forecasts well. If a larger distance threshold is adopted, the values of POD and SR will increase, as some false alarms become hits under less strict criteria.
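The cascade of tolerance windows used above can be written as a short helper; the function name and return convention are illustrative only.

    def smallest_matching_window(dt_h, dpos_deg):
        """Return the first (time, distance) tolerance that contains a forecast's
        absolute time error (h) and position error (deg), following the cascade
        above; None if even (120 h, 10 deg) fails."""
        for max_dt, max_deg in [(72, 3), (72, 5), (120, 5), (120, 10)]:
            if dt_h <= max_dt and dpos_deg <= max_deg:
                return (max_dt, max_deg)
        return None

    print(smallest_matching_window(60, 4.2))    # (72, 5)
    print(smallest_matching_window(100, 8.0))   # (120, 10)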

Fig. 13. (a) Time and position errors for the control forecasts of UKMO-EPS. (b) As in (a), but for that of ECMWF-EPS. Only model TCs that form within 120 h of the observed genesis times and within a 10° radius of the observed tracks are shown here.

In other words, a hit forecast does not mean that the location and time of an observed genesis event are precisely predicted. After all, the current mean TC track error at 120 h is about 350 km. As a result, even if POD values remain unchanged compared with previous years, obtaining them under stricter criteria still indicates an improvement in genesis forecasting. Compared with the verification of TC track or intensity errors, verification of genesis is more complicated and its results are harder to interpret. An undeveloped disturbance in a model may be classified as a miss, but it can still provide helpful information to forecasters, especially when the forecasts of other models are also considered. Therefore, in our study, we include the disturbance information in all ensemble members when evaluating the EPS performance.

6. Conclusions

To examine the predictability of TC genesis in EPSs, genesis forecasts during 2018–20 from two operational global ensemble models (UKMO-EPS and ECMWF-EPS) are evaluated over the WP, EP, and AL basins in this study. It is found that the performance of the ensemble members shows a good level of agreement, and statistically UKMO-EPS outperforms ECMWF-EPS. As expected, POD (probability of detection) decreases with increasing forecast lead time, and SR (success ratio) decreases with increasing forecast time. In addition, the SRs of the control run are greater than those of the other members. Moreover, even if an ensemble member does not produce a hit forecast, it may predict the presence of cyclonic vortices. To some extent, the genesis forecasts of the ensemble members are complementary, as multiple simulations are conducted at each initialization time to deal with model imperfections and the inherent uncertainties in weather systems, which is the advantage of an EPS over a deterministic model. As a result, the PODs can be doubled by taking all the member forecasts into account. Accordingly, the performance of a specific ensemble member cannot reflect that of the ensemble. Furthermore, the ensemble PODs of the super ensemble of UKMO-EPS and ECMWF-EPS are higher than those of the two EPSs individually, so it is a better choice to use the super ensemble of multiple EPSs when developing a genesis forecast scheme based on EPSs.

The ratio of ensemble members with genesis forecasts at each forecast time is found not to be an effective indicator of the probability of genesis, so this study did not use this ratio to represent forecast uncertainty. Results show that, statistically, a hit case has more nearby disturbance forecasts in the ensemble than a false alarm, while the differences in the number of nearby genesis forecasts between hits and false alarms are not significant. Therefore, we grouped the nearby forecasts at each model initialization time to define ensemble genesis forecasts according to the number of disturbance vortices near each genesis forecast, and then verified these ensemble genesis forecasts to represent the performance of the ensemble system. The results demonstrate that the PODs are more than twice the mean of all ensemble members at most lead times; the SRs are smaller than the mean SRs within the forecast time of 96 h and are close to the mean SRs afterward. The POD at the 5-day lead time is about 59% and 38% in UKMO-EPS and ECMWF-EPS, respectively, while the mean of the ensemble members is about 25% and 16%, respectively. In addition, the performance of the EPSs differs among basins, and genesis in AL is the most difficult to forecast; the POD at the 5-day lead time in AL is only 46% and 23% in UKMO-EPS and ECMWF-EPS, respectively. It should be pointed out that the comparison between UKMO-EPS and ECMWF-EPS is not performed on the same samples because of missing data. Overall, the global ensemble models can provide a valuable reference for TC genesis forecasting.

It is worth mentioning that time and position errors are allowed when determining a hit forecast. In this study, the time threshold is ±72 h and distance threshold is within 5°. Different thresholds may be adopted by different studies, which makes it difficult to compare their results with each other. Besides, the vortex tracking algorithms may also be different. Therefore, when interpreting the results of genesis verification, this extra information should also be considered. For genesis forecasts in EPS, an undeveloped disturbance in a member can also provide helpful information to operational forecasters.

Our assessment of the EPSs has considered the vortex information contained in each forecast. However, there are also some limitations in this study. One of them is that the vortex tracking algorithms are not exactly the same at the two centers (Magnusson et al. 2021; Heming 2017), which may contribute to the different performances of UKMO-EPS and ECMWF-EPS in the evaluation. Another limitation is that the skill test is performed on only 3 years of forecasts, because UKMO upgraded its ensemble system in 2017, enlarging the ensemble size from 24 to 36 members. Further studies are still required to analyze what factors lead to the differences between successful and unsuccessful members and what factors contribute to the differences in predictability among TCs.

Acknowledgments.

The authors are very grateful to Dr. Lina Bai and Dr. Guomin Chen from Shanghai Typhoon Institute and the five anonymous reviewers for their helpful comments and feedback on this study. This work was supported in part by the National Key Research and Development Program of China under Grants 2021YFC3000804 and 2017YFC1501601; National Natural Science Foundation of China under Grants 41875067, 41875080, and 41975067; the Research Program from Science and Technology Committee of Shanghai (20ZR1469700, 19dz1200101); the Program of Shanghai Academic/Technology Research Leader (21XD1404500); and the Typhoon Scientific and Technological Innovation Group of Shanghai Meteorological Service.

Data availability statement.

The TIGGE track forecast data are downloaded from the Research Data Archive of UCAR (https://rda.ucar.edu/datasets/ds330.3/), and the best track data are obtained from IBTrACS dataset (https://www.ncdc.noaa.gov/ibtracs/index.php?name=ib-v4-access).

REFERENCES

  • Chan, J. C. L., and R. H. F. Kwok, 1999: Tropical cyclone genesis in a global numerical weather prediction model. Mon. Wea. Rev., 127, 611–624, https://doi.org/10.1175/1520-0493(1999)127<0611:TCGIAG>2.0.CO;2.

  • Cheung, K. K. W., and R. L. Elsberry, 2002: Tropical cyclone formations over the western North Pacific in the Navy Operational Global Atmospheric Prediction System forecasts. Wea. Forecasting, 17, 800–820, https://doi.org/10.1175/1520-0434(2002)017<0800:TCFOTW>2.0.CO;2.

  • DeMaria, M., C. R. Sampson, J. A. Knaff, and K. D. Musgrave, 2014: Is tropical cyclone intensity guidance improving? Bull. Amer. Meteor. Soc., 95, 387–398, https://doi.org/10.1175/BAMS-D-12-00240.1.

  • Elsberry, R. L., M. S. Jordan, and F. Vitart, 2011: Evaluation of the ECMWF 32-day ensemble predictions during 2009 season of western North Pacific tropical cyclone events on intraseasonal timescales. Asia-Pac. J. Atmos. Sci., 47, 305, https://doi.org/10.1007/s13143-011-0017-8.

  • Halperin, D. J., H. E. Fuelberg, R. E. Hart, J. H. Cossuth, P. Sura, and R. J. Pasch, 2013: An evaluation of tropical cyclone genesis forecasts from global numerical models. Wea. Forecasting, 28, 1423–1445, https://doi.org/10.1175/WAF-D-13-00008.1.

  • Halperin, D. J., H. E. Fuelberg, R. E. Hart, and J. H. Cossuth, 2016: Verification of tropical cyclone genesis forecasts from global numerical models: Comparisons between the North Atlantic and eastern North Pacific basins. Wea. Forecasting, 31, 947–955, https://doi.org/10.1175/WAF-D-15-0157.1.

  • Halperin, D. J., R. E. Hart, H. E. Fuelberg, and J. H. Cossuth, 2017: The development and evaluation of a statistical–dynamical tropical cyclone genesis guidance tool. Wea. Forecasting, 32, 27–46, https://doi.org/10.1175/WAF-D-16-0072.1.

  • Halperin, D. J., A. B. Penny, and R. E. Hart, 2020: A comparison of tropical cyclone genesis forecast verification from three Global Forecast System (GFS) operational configurations. Wea. Forecasting, 35, 1801–1815, https://doi.org/10.1175/WAF-D-20-0043.1.

  • Heming, J. T., 2017: Tropical cyclone tracking and verification techniques for Met Office numerical weather prediction models. Meteor. Appl., 24, 1–8, https://doi.org/10.1002/met.1599.

  • Jaiswal, N., C. M. Kishtawal, S. Bhomia, and P. K. Pal, 2016: Multi-model ensemble-based probabilistic prediction of tropical cyclogenesis using TIGGE model forecasts. Meteor. Atmos. Phys., 128, 601–611, https://doi.org/10.1007/s00703-016-0436-2.

  • Knapp, K. R., M. C. Kruk, D. H. Levinson, H. J. Diamond, and C. J. Neumann, 2010: The International Best Track Archive for Climate Stewardship (IBTrACS) unifying tropical cyclone data. Bull. Amer. Meteor. Soc., 91, 363–376, https://doi.org/10.1175/2009BAMS2755.1.

  • Komaromi, W. A., and S. J. Majumdar, 2014: Ensemble-based error and predictability metrics associated with tropical cyclogenesis. Part I: Basinwide perspective. Mon. Wea. Rev., 142, 2879–2898, https://doi.org/10.1175/MWR-D-13-00370.1.

  • Komaromi, W. A., and S. J. Majumdar, 2015: Ensemble-based error and predictability metrics associated with tropical cyclogenesis. Part II: Wave-relative framework. Mon. Wea. Rev., 143, 1665–1686, https://doi.org/10.1175/MWR-D-14-00286.1.

  • Lee, C.-S., K. K. W. Cheung, J. S. N. Hui, and R. L. Elsberry, 2008: Mesoscale features associated with tropical cyclone formations in the western North Pacific. Mon. Wea. Rev., 136, 2006–2022, https://doi.org/10.1175/2007MWR2267.1.

  • Li, W., Z. Wang, and M. S. Peng, 2016: Evaluating tropical cyclone forecasts from the NCEP Global Ensemble Forecasting System (GEFS) reforecast version 2. Wea. Forecasting, 31, 895–916, https://doi.org/10.1175/WAF-D-15-0176.1.

  • Liang, M., J. C. L. Chan, J. Xu, and M. Yamaguchi, 2021: Numerical prediction of tropical cyclogenesis. Part I: Evaluation of model performance. Quart. J. Roy. Meteor. Soc., 147, 1626–1641, https://doi.org/10.1002/qj.3987.

  • Magnusson, L., and Coauthors, 2021: Tropical cyclone activities at ECMWF. ECMWF Tech. Memo. 888, 140 pp., https://doi.org/10.21957/zzxzzygwv.

  • Majumdar, S. J., and R. D. Torn, 2014: Probabilistic verification of global and mesoscale ensemble forecasts of tropical cyclogenesis. Wea. Forecasting, 29, 1181–1198, https://doi.org/10.1175/WAF-D-14-00028.1.

  • MetOffice, 2022: Numerical weather prediction models. Met Office, accessed 14 April 2022, https://www.metoffice.gov.uk/research/approach/modelling-systems/unified-model/weather-forecasting.

  • Nakano, M., M. Sawada, T. Nasuno, and M. Satoh, 2015: Intraseasonal variability and tropical cyclogenesis in the western North Pacific simulated by a global nonhydrostatic atmospheric model. Geophys. Res. Lett., 42, 565–571, https://doi.org/10.1002/2014GL062479.

  • Pedlosky, J., 1987: Geophysical Fluid Dynamics. Springer, 710 pp.

  • Ruan, Z., and Q. Wu, 2018: Precipitation, convective clouds, and their connections with tropical cyclone intensity and intensity change. Geophys. Res. Lett., 45, 1098–1105, https://doi.org/10.1002/2017GL076611.

  • Swinbank, R., and Coauthors, 2016: The TIGGE project and its achievements. Bull. Amer. Meteor. Soc., 97, 49–67, https://doi.org/10.1175/BAMS-D-13-00191.1.

  • Tsai, H.-C., K.-C. Lu, R. L. Elsberry, M.-M. Lu, and C.-H. Sui, 2011: Tropical cyclone–like vortices detection in the NCEP 16-day ensemble system over the western North Pacific in 2008: Application and forecast evaluation. Wea. Forecasting, 26, 77–93, https://doi.org/10.1175/2010WAF2222415.1.

  • Tsai, H.-C., R. L. Elsberry, W.-C. Chin, and T. P. Marchok, 2020: Opportunity for early warnings of Typhoon Lekima from two global ensemble model forecasts of formation with 7-day intensities along medium-range tracks. Atmosphere, 11, 1162, https://doi.org/10.3390/atmos11111162.

  • Walsh, K. J. E., M. Fiorino, C. W. Landsea, and K. L. McInnes, 2007: Objectively determined resolution-dependent threshold criteria for the detection of tropical cyclones in climate models and reanalyses. J. Climate, 20, 2307–2314, https://doi.org/10.1175/JCLI4074.1.

  • Wang, Z., W. Li, M. S. Peng, X. Jiang, R. McTaggart-Cowan, and C. A. Davis, 2018: Predictive skill and predictability of North Atlantic tropical cyclogenesis in different synoptic flow regimes. J. Atmos. Sci., 75, 361–378, https://doi.org/10.1175/JAS-D-17-0094.1.

    • Search Google Scholar
    • Export Citation
  • Xiang, B., and Coauthors, 2015: Beyond weather time-scale prediction for Hurricane Sandy and Super Typhoon Haiyan in a global climate model. Mon. Wea. Rev., 143, 524535, https://doi.org/10.1175/MWR-D-14-00227.1.

    • Search Google Scholar
    • Export Citation
  • Yamaguchi, M., and N. Koide, 2017: Tropical cyclone genesis guidance using the early stage Dvorak analysis and global ensembles. Wea. Forecasting, 32, 21332141, https://doi.org/10.1175/WAF-D-17-0056.1.

    • Search Google Scholar
    • Export Citation
  • Yamaguchi, M., F. Vitart, S. T. K. Lang, L. Magnusson, R. L. Elsberry, G. Elliott, M. Kyouda, and T. Nakazawa, 2015: Global distribution of the skill of tropical cyclone activity forecasts on short- to medium-range time scales. Wea. Forecasting, 30, 16951709, https://doi.org/10.1175/WAF-D-14-00136.1.

    • Search Google Scholar
    • Export Citation
  • Zawislak, J., 2020: Global survey of precipitation properties observed during tropical cyclogenesis and their differences compared to nondeveloping disturbances. Mon. Wea. Rev., 148, 15851606, https://doi.org/10.1175/MWR-D-18-0407.1.

    • Search Google Scholar
    • Export Citation
  • Zhang, X., and H. Yu, 2017: A probabilistic tropical cyclone track forecast scheme based on the selective consensus of ensemble prediction systems. Wea. Forecasting, 32, 21432157, https://doi.org/10.1175/WAF-D-17-0071.1.

    • Search Google Scholar
    • Export Citation
  • Zhang, X., G. Chen, H. Yu, and Z. Zeng, 2015: Verification of ensemble track forecasts of tropical cyclones during 2014. Trop. Cyclone Res. Rev., 4, 7987, https://doi.org/10.6057/2015TCRR02.04.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    TC genesis locations during 2018–20. Borders of the three TC basins (WP, EP, and AL) are indicated with green lines. Geographically, the EP basin includes both the eastern North Pacific and the central North Pacific basins.

  • Fig. 2.

    Schematic diagram of forecast lead time and forecast time.

  • Fig. 3.

    (a) The daily number of ensemble members with hit forecasts for TCs in 2018 in the three basins in UKMO-EPS. (b) As in (a), but for the number of ensemble members with disturbance forecasts. The numbers on the x axis denote the TC number in each basin. The numbers above the bars in (a) give the genesis location of the corresponding TC, and the numbers above the bars in (b) give the maximum intensity of the corresponding TC within 2 days of genesis. Each daily count of members is labeled numerically; for a perfect prediction, the count would equal twice the ensemble size. The percentage of members with correct forecasts (the daily count divided by twice the ensemble size) is shown in color shading.

  • Fig. 4.

    As in Fig. 3, but for forecasts in ECMWF-EPS.

  • Fig. 5.

    (a) The probability of detection (POD) as a function of forecast lead time for the ensemble members in UKMO-EPS during 2018–20. (b) As in (a), but for the success ratio (SR). The black dots and the adjacent numbers denote the mean of all ensemble members; the bottom and top of the black bars mark the 10th and 90th percentiles of the ensemble members, respectively. The colored numbers next to the bars in (b) indicate the sample sizes for the corresponding bars. (A minimal sketch of how POD and SR are computed from contingency counts is given after the figure list.)

  • Fig. 6.

    As in Fig. 5, but for ECMWF-EPS during 2018–20.

  • Fig. 7.

    (a) PODs of the ensemble, ens_F, and the mean of all ensemble members as a function of the forecast lead time in UKMO-EPS. (b) As in (a), but for ECMWF-EPS. The definitions of ensemble PODs and ens_F PODs can be found in the text. The data include forecasts from all three basins.

  • Fig. 8.

    Ensemble PODs as a function of the forecast lead time for ECMWF-EPS, UKMO-EPS, and the superensemble of ECMWF-EPS and UKMO-EPS. The comparisons are performed with the same samples. The data include forecasts from all three basins.

  • Fig. 9.

    (a) Boxplot of the number of genesis forecasts near each genesis forecast at different forecast times in UKMO-EPS. (b) As in (a), but for ECMWF-EPS. The bottom and top of each box are the first and third quartiles, respectively; the lower and upper whiskers mark the 2.5th and 97.5th percentiles, respectively. The diamond inside the box denotes the mean, the line inside the box denotes the median, and the plus marks beyond the whiskers are outliers. The colored numbers denote the sample size. A solid semicircle on the x axis indicates that the difference in number between hits and false alarms at that forecast interval is significant under Levene's test at the 95% confidence level (a sketch of this test is given after the figure list). The data include forecasts from all three basins.

  • Fig. 10.

    As in Fig. 9, but for the number of disturbance forecasts accompanying each genesis forecast. The solid semicircle on the x axis indicates that the difference in number between hits and false alarms at that forecast interval is significant under Levene’s test with a 95% confidence level. The data include forecasts from all three basins.

  • Fig. 11.

    (a) SRs of ens_F (blue bars) and SRs of the mean of all ensemble members (orange bars) in UKMO-EPS. (b) As in (a), but for ECMWF-EPS. The data include forecasts from all three basins.

  • Fig. 12.

    (a) The PODs of ens_F in different basins. (b) The SRs of ens_F in different basins.

  • Fig. 13.

    (a) Time and position errors for the control forecasts of UKMO-EPS. (b) As in (a), but for the control forecasts of ECMWF-EPS. Only model TCs that form within 120 h of the observed genesis times and within a 10° radius of the observed tracks are shown.
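
The captions above (Figs. 5–8, 11, and 12) refer to the probability of detection (POD) and success ratio (SR). For reference only, the sketch below illustrates the standard contingency-table definitions of these two scores in Python; the function names and counts are illustrative assumptions, not code or data from this study.

```python
# Illustrative sketch of the standard categorical verification scores used in the figures.
# The counts below are hypothetical placeholders, not results from the paper.

def pod(hits: int, misses: int) -> float:
    """Probability of detection: fraction of observed genesis events that were forecast."""
    return hits / (hits + misses)

def success_ratio(hits: int, false_alarms: int) -> float:
    """Success ratio: fraction of genesis forecasts that verified (equal to 1 - FAR)."""
    return hits / (hits + false_alarms)

# Hypothetical counts for one set of forecasts at a single lead time.
hits, misses, false_alarms = 59, 41, 70
print(f"POD = {pod(hits, misses):.2f}")                   # 0.59
print(f"SR  = {success_ratio(hits, false_alarms):.2f}")   # 0.46
```

Note that SR is simply one minus the false alarm ratio, so either score can be derived from the other.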
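The captions of Figs. 9 and 10 state that differences between hits and false alarms are assessed with Levene's test (a test for equality of variances between samples) at the 95% confidence level. The snippet below is a minimal illustration of such a two-sample test using SciPy; the arrays are invented placeholders, not the verification samples used in this study.

```python
# Illustrative two-sample Levene test, as named in the captions of Figs. 9 and 10.
# The count arrays are invented placeholders, not data from the paper.
import numpy as np
from scipy import stats

nearby_counts_hits = np.array([12, 15, 9, 20, 18, 11, 14])   # e.g., nearby genesis forecasts for hits
nearby_counts_false = np.array([3, 6, 2, 8, 5, 4, 7])        # e.g., nearby genesis forecasts for false alarms

statistic, p_value = stats.levene(nearby_counts_hits, nearby_counts_false)
print(f"Levene statistic = {statistic:.2f}, p value = {p_value:.3f}")
print("Significant at the 95% confidence level:", p_value < 0.05)
```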
