Two clarifications to Barnes et al. (2007) are required. First, on p. 1141, the layout of Table 1, the 2 × 2 contingency table, was reversed from standard practice, which places "observed event" as the heading for the columns and "forecast" as the heading for the rows (e.g., section 7.2.1 of Wilks 2006). This oversight is corrected in the revised Table 1 here.
Second, the term "false alarm rate" was mistakenly used in Barnes et al. (2007) instead of the term "false alarm ratio" commonly used in recent authoritative sources on verification (e.g., Nurmi 2003; section 3.2.2 of Jolliffe and Stephenson 2003; section 7.2.1 of Wilks 2006). These sources define the false alarm rate (also known, more aptly, as the probability of false detection) as the number of false alarms divided by the number of times the event did not happen, that is, the fraction of nonevents that were incorrectly forecast: FARate = Z/(W+Z), using the variables in Table 1. In contrast, the false alarm ratio is defined as the number of false alarms divided by the number of forecast events, that is, the fraction of "yes" forecasts for which the event did not occur: FARatio = Z/(X+Z). Barnes et al. (2007) can be made consistent with these definitions (hereafter the currently accepted definitions) by changing the expansion of the acronym "FAR" from false alarm rate to false alarm ratio on p. 1140.
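To make the distinction concrete, both statistics can be computed directly from the Table 1 counts. The following is a minimal sketch in Python; the function names and the counts (chosen to mimic a rare-event verification problem) are illustrative only and are not taken from Barnes et al. (2007).

```python
# Illustrative sketch: the two "false alarm" statistics from Table 1 counts.
# X = hits, Y = misses, Z = false alarms, W = correct rejections.

def false_alarm_rate(z: int, w: int) -> float:
    """Probability of false detection: false alarms over observed nonevents, Z/(W+Z)."""
    return z / (w + z)

def false_alarm_ratio(z: int, x: int) -> float:
    """Fraction of 'yes' forecasts for which the event did not occur, Z/(X+Z)."""
    return z / (x + z)

# Hypothetical counts for a rare event (Y is unused here; listed for completeness).
X, Y, Z, W = 28, 23, 72, 2680

print(f"FARate  = Z/(W+Z) = {false_alarm_rate(Z, W):.3f}")   # ~0.026
print(f"FARatio = Z/(X+Z) = {false_alarm_ratio(Z, X):.3f}")  # 0.720
```

For a rare event, W dominates the denominator of FARate, so the two statistics can differ by more than an order of magnitude, as in this example, which is why conflating the two terms matters.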
Unfortunately, Barnes et al. (2007) was not unique in applying these terms differently from the currently accepted definitions, as the following five examples demonstrate.
Olson (1965) defined the false alarm rate as 1 − “number of correct yes forecasts/number of yes forecasts.”
The Glossary of Meteorology (Glickman 2000, entry for “skill”) defines the false alarm rate as the number of forecasts of rain when rain did not occur divided by the number of forecasts of rain, but does not define false alarm ratio.
Roulston and Smith (2004) said that the false alarm rate is the “number of times the warning is issued but event does not occur” divided by the “number of times the warning is issued.”
Although Brooks (2004) defined FAR as false alarm ratio in his text (p. 837), FAR is listed as false alarm rate in his Table 1. (The probability of false detection is listed separately, and correctly, in that table.)
Although false alarm ratio is the proper terminology, the National Weather Service has used false alarm rate and false alarm ratio interchangeably over the years, depending on the audience being addressed; specifically, false alarm rate has been used for nontechnical audiences who may not have a complete understanding of the fractional relationship (B. MacAloney 2009, personal communication).
Although the confusion in the present state of affairs is quite clear, the origin of these terms in the literature is not. The terminology appears to have evolved over time, with the terms having held different generally accepted meanings at different times (e.g., section 7.2.2 in Wilks 2006). To trace this evolution, articles published in the journals of the American Meteorological Society with the words "false alarm" in the abstract were identified from an online search. The earliest, from 1975 (Georges and Greene 1975), used false alarm rate differently from the currently accepted definition. The first paper to use the term false alarm ratio consistently with the currently accepted definition was published in 1983, and the second in 1985 (McCann 1983; Tsonis and Isaac 1985). Thus, the currently accepted definition of false alarm ratio has been in use in the atmospheric sciences at least since 1983.
To explore how pervasive this confusion is in the more recent literature, articles published in the journals of the American Meteorological Society during 2001–07 with the words "false alarm" in the abstract were identified from the same online search. Thirty-five articles met these criteria, and 26 (74%) of them provided enough information to determine how the authors calculated false alarm rate or false alarm ratio. Of those 26 articles, 9 (35%) used the term false alarm rate inconsistently with its currently accepted definition, and 1 (4%) used the term false alarm ratio inconsistently with its currently accepted definition. Thus, 10 (38%) of these recent peer-reviewed articles failed to follow currently accepted practice.
That this confusion is widespread in the published literature does not excuse Barnes et al. (2007) for failing to follow current practice, and we apologize for any confusion that these two oversights may have caused. Nevertheless, we would like this corrigendum to increase awareness of these issues in the atmospheric sciences community. Our literature review motivates the following three recommendations.
1) Authors should present their 2 × 2 contingency tables using standard practice, which places "observed event" as the heading for the columns and "forecast" as the heading for the rows.
2) Authors should state explicitly in their manuscripts how their verification statistics are calculated, so that readers know which definitions are being used.
3) Authors should consider using the terms probability of false detection, POFD = Z/(W+Z), and probability of false alarm, POFA = Z/(X+Z), which would (a) be consistent with the term "probability of detection" (POD), (b) be less confusing than the terms false alarm rate and false alarm ratio, as well as the ambiguous acronyms F and FAR, and (c) not be burdened with past confusions (see the sketch after this list).
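Because POFD and POFA are simply renamings of the quantities defined above, recommendation 3 changes terminology, not arithmetic. A brief illustrative sketch in Python, reusing the hypothetical counts from the earlier example:

```python
# Recommendation 3 in code: same formulas, unambiguous names.
X, Y, Z, W = 28, 23, 72, 2680  # hypothetical hits, misses, false alarms, correct rejections

pofd = Z / (W + Z)  # probability of false detection (formerly "false alarm rate")
pofa = Z / (X + Z)  # probability of false alarm (formerly "false alarm ratio")
pod  = X / (X + Y)  # probability of detection, for comparison

print(f"POD = {pod:.3f}, POFD = {pofd:.3f}, POFA = {pofa:.3f}")
```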
Acknowledgments
We thank Ian Jolliffe and Bill Gallus for alerting us to our errors, and we thank Harold Brooks (National Severe Storms Laboratory), Brent MacAloney (National Weather Service), Caren Marzban (University of Washington), Pertti Nurmi (Finnish Meteorological Institute), Paul Roebber (University of Wisconsin—Milwaukee), and Daniel Wilks (Cornell University) for their help with this corrigendum. Partial funding for Schultz was provided by Vaisala Oyj.
REFERENCES
Barnes, L. R., E. C. Gruntfest, M. H. Hayden, D. M. Schultz, and C. Benight, 2007: False alarms and close calls: A conceptual model of warning accuracy. Wea. Forecasting, 22, 1140–1147.
Brooks, H. E., 2004: Tornado warning performance in the past and future: A perspective from signal detection theory. Bull. Amer. Meteor. Soc., 85, 837–843.
Georges, T. M., and G. E. Greene, 1975: Infrasound from convective storms. Part IV. Is it useful for storm warning? J. Appl. Meteor., 14, 1303–1316.
Glickman, T. S., Ed., 2000: Glossary of Meteorology. 2nd ed. Amer. Meteor. Soc., 855 pp.
Jolliffe, I. T., and D. B. Stephenson, 2003: Forecast Verification: A Practitioner's Guide in Atmospheric Science. John Wiley and Sons, 240 pp.
McCann, D. W., 1983: The enhanced-V: A satellite observable severe storm signature. Mon. Wea. Rev., 111, 887–894.
Nurmi, P., 2003: Recommendations on the verification of local weather forecasts. ECMWF Tech. Memo. 430, 19 pp. [Available online at http://www.ecmwf.int/publications/library/do/references/list/14.]
Olson, R. H., 1965: On the use of Bayes' theorem in estimating false alarm rates. Mon. Wea. Rev., 93, 557–558.
Roulston, M. S., and L. A. Smith, 2004: The Boy Who Cried Wolf revisited: The impact of false alarm intolerance on cost–loss scenarios. Wea. Forecasting, 19, 391–397.
Tsonis, A., and G. Isaac, 1985: On a new approach for instantaneous rain area delineation in the midlatitudes using GOES data. J. Appl. Meteor., 24, 1208–1218.
Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 627 pp.
TABLE 1. A 2 × 2 contingency table, where X is the number of correct forecasts of events (hits), Y is the number of events that occurred but were not forecast (misses), Z is the number of forecasts of events that did not occur (false alarms), and W is the number of correct forecasts of events that did not occur (correct rejections, correct negatives, or correct "no" forecasts).

                      Observed event
                      Yes        No
Forecast      Yes      X          Z
              No       Y          W