The Response of Performance Metrics for Binary Forecasts to Hedging that Approaches Random Change

Keith F. Brill Hydrometeorological Prediction Center, National Centers for Environmental Prediction, Camp Springs, Maryland

and
Matthew Pyle Environmental Modeling Center, National Centers for Environmental Prediction, Camp Springs, Maryland


Abstract

Critical performance ratio (CPR) expressions for the eight conditional probabilities associated with the 2 × 2 contingency table of outcomes for binary (dichotomous “yes” or “no”) forecasts are derived. Two are shown to be useful in evaluating the effects of hedging as it approaches random change. The CPR quantifies how the probability of detection (POD) must change as frequency bias changes, so that a performance measure (or conditional probability) indicates an improved forecast for a given value of frequency bias. If yes forecasts were to be increased randomly, the probability of additional correct forecasts (hits) is given by the detection failure ratio (DFR). If the DFR for a performance measure is greater than the CPR, the forecast is likely to be improved by the random increase in yes forecasts. Thus, the DFR provides a benchmark for the CPR in the case of frequency bias inflation. If yes forecasts are decreased randomly, the probability of removing a hit is given by the frequency of hits (FOH). If the FOH for a performance measure is less than the CPR, the forecast is likely to be improved by the random decrease in yes forecasts. Therefore, the FOH serves as a benchmark for the CPR if the frequency bias is decreased. The closer the FOH (DFR) is to being less (greater) than or equal to the CPR, the more likely it may be to enhance the performance measure by decreasing (increasing) the frequency bias. It is shown that randomly increasing yes forecasts for a forecast that is itself better than a randomly generated forecast can improve the threat score but is not likely to improve the equitable threat score. The equitable threat score is recommended instead of the threat score whenever possible.
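The interplay of the DFR, the FOH, the threat score, and the equitable threat score can be illustrated numerically. The following is a minimal sketch, not the authors' CPR derivation: it assumes the conventional 2 × 2 cell layout (hits, false alarms, misses, correct negatives), uses the standard textbook definitions of each measure, and models a random increase in yes forecasts by moving an expected number of no forecasts into the yes column in proportion to the DFR. The class `Table`, the function names, and the sample counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """2 x 2 contingency table; field names are this sketch's own labels."""
    hits: float               # forecast yes, observed yes
    false_alarms: float       # forecast yes, observed no
    misses: float             # forecast no, observed yes
    correct_negatives: float  # forecast no, observed no

    @property
    def total(self) -> float:
        return (self.hits + self.false_alarms
                + self.misses + self.correct_negatives)


def frequency_bias(t: Table) -> float:
    """Ratio of yes forecasts to yes observations."""
    return (t.hits + t.false_alarms) / (t.hits + t.misses)


def pod(t: Table) -> float:
    """Probability of detection: P(forecast yes | observed yes)."""
    return t.hits / (t.hits + t.misses)


def foh(t: Table) -> float:
    """Frequency of hits: P(observed yes | forecast yes)."""
    return t.hits / (t.hits + t.false_alarms)


def dfr(t: Table) -> float:
    """Detection failure ratio: P(observed yes | forecast no)."""
    return t.misses / (t.misses + t.correct_negatives)


def threat_score(t: Table) -> float:
    return t.hits / (t.hits + t.false_alarms + t.misses)


def equitable_threat_score(t: Table) -> float:
    # Hits expected from a random forecast with the same marginal totals.
    random_hits = (t.hits + t.false_alarms) * (t.hits + t.misses) / t.total
    return (t.hits - random_hits) / (
        t.hits + t.false_alarms + t.misses - random_hits)


def inflate_randomly(t: Table, k: float) -> Table:
    """Expected table after k randomly chosen no forecasts become yes.

    Assumption of this sketch: each converted forecast is a hit with
    probability DFR and a false alarm otherwise, so only expected
    (fractional) counts are tracked.
    """
    p = dfr(t)
    return Table(
        hits=t.hits + k * p,
        false_alarms=t.false_alarms + k * (1.0 - p),
        misses=t.misses - k * p,
        correct_negatives=t.correct_negatives - k * (1.0 - p),
    )


# Hypothetical underforecast case that still beats a random forecast.
table = Table(hits=10, false_alarms=5, misses=60, correct_negatives=125)
inflated = inflate_randomly(table, k=10)

for name, t in (("original", table), ("inflated", inflated)):
    print(f"{name}: bias={frequency_bias(t):.3f} POD={pod(t):.3f} "
          f"FOH={foh(t):.3f} DFR={dfr(t):.3f} "
          f"TS={threat_score(t):.4f} ETS={equitable_threat_score(t):.4f}")
```

For these made-up counts, the random inflation raises the threat score from about 0.13 to about 0.16 while the equitable threat score falls from about 0.068 to about 0.062, which is the qualitative behavior the abstract describes. Other tables can behave differently, so the sketch is a spot check rather than a proof.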

Corresponding author address: Keith F. Brill, NCEP/HPC, W/NP32, NOAA Science Center, Rm. 410B-2, 5200 Auth Rd., Camp Springs, MD 20746-4304. Email: keith.brill@noaa.gov

