On Summary Measures of Skill in Rare Event Forecasting Based on Contingency Tables

Charles A. Doswell III NOAA/Environmental Research Laboratories, National Severe Storms Laboratory, Norman, Oklahoma

Robert Davies-Jones NOAA/Environmental Research Laboratories, National Severe Storms Laboratory, Norman, Oklahoma

David L. Keller NOAA/Environmental Research Laboratories, National Severe Storms Laboratory, Norman, Oklahoma


Abstract

The so-called True Skill Statistic (TSS) and the Heidke Skill Score (S), as used in the contingency table approach to forecast verification, are compared. It is shown that the TSS approaches the Probability of Detection (POD) whenever the forecasting is dominated by correct forecasts of non-occurrence, as happens when forecasting rare events like severe local storms. This means that the TSS is vulnerable to “hedging” in rare event forecasting. The S-statistic is shown to be superior to the TSS in this situation, accounting for correct forecasts of null events in a controlled fashion. It turns out that the TSS and S values are related in a subtle way, becoming identical when the expected values (due to chance in a k × k contingency table) remain unchanged when comparing the actual forecast table to that of a hypothetical perfect set of forecasts. Examples of the behavior of the TSS and S values in different situations are provided that support the recommendation that S be used in preference to TSS for rare event forecasting. A geometrical interpretation is also given for certain aspects of the 2 × 2 contingency table, and this is generalized to the k × l case. Using this geometrical interpretation, it is shown to be possible to apply dichotomous verification techniques in polychotomous situations, thus allowing a direct comparison between dichotomous and polychotomous forecasting.
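
The behavior described in the abstract can be illustrated numerically with a minimal sketch. It assumes the standard 2 × 2 definitions, TSS = POD − POFD and S = 2(ad − bc) / [(a + c)(c + d) + (a + b)(b + d)], where a, b, c, d are hits, false alarms, misses, and correct nulls. The function names and all counts are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def pod(hits, misses):
    """Probability of Detection: fraction of observed events that were forecast."""
    return hits / (hits + misses)


def tss(hits, false_alarms, misses, correct_nulls):
    """True Skill Statistic: POD minus the probability of false detection."""
    pofd = false_alarms / (false_alarms + correct_nulls)
    return pod(hits, misses) - pofd


def heidke(hits, false_alarms, misses, correct_nulls):
    """Heidke Skill Score (S) for a 2 x 2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_nulls
    return 2.0 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))


if __name__ == "__main__":
    # Hypothetical counts: same hits, false alarms, and misses,
    # with a growing number of correct forecasts of non-occurrence.
    for nulls in (100, 10_000, 1_000_000):
        print(nulls, round(tss(20, 30, 10, nulls), 4), round(heidke(20, 30, 10, nulls), 4))

    # Hypothetical "hedged" forecaster: many more event forecasts, so more
    # hits but far more false alarms, in the same rare-event setting.
    print("hedged", round(tss(28, 200, 2, 1_000_000), 4),
          round(heidke(28, 200, 2, 1_000_000), 4))
```

With these made-up counts, the TSS tends toward the POD as correct nulls come to dominate the table, and the heavily overforecasting case scores a higher TSS but a lower S than the more conservative one, which is the hedging vulnerability the abstract refers to.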

