A General Analytic Method for Assessing Sensitivity to Bias of Performance Measures for Dichotomous Forecasts

Keith F. Brill, Hydrometeorological Prediction Center, National Centers for Environmental Prediction, Camp Springs, Maryland

Abstract

Performance measures computed from the 2 × 2 contingency table of outcomes for dichotomous forecasts are sensitive to bias. The method presented here evaluates how the probability of detection (POD) must change as bias changes so that a performance measure improves at a given value of bias. A critical performance ratio (CPR) of the change of POD to the change in bias is derived for a number of performance measures. If a change in POD associated with a bias change satisfies the CPR condition, the performance measure will indicate an improved forecast. If a perfect measure of performance existed, it would always exhibit its optimal value at bias equal to one. Actual measures of performance are susceptible to bias, indicating a better forecast for bias values not equal to one. The CPR is specifically applied to assess the conditions for an improvement toward a more favorable value of several commonly used performance measures as bias increases or decreases through the value one. All performance measures evaluated are found to have quantifiable bias sensitivity. The CPR is applied to analyzing a performance requirement and bias sensitivity in a geometric model.
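The CPR idea can be illustrated with one measure, the threat score (TS). The sketch below is not taken from the paper; it assumes the observed event frequency stays fixed while bias and POD change, so that TS can be written as POD / (B + 1 - POD), where B is the bias. Differentiating that expression suggests TS increases with increasing bias only when the change in POD per unit change in bias exceeds POD / (1 + B). The class and function names (`Table`, `cpr_threat_score`, `ts_improves`) and the sample counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Counts in a 2 x 2 contingency table for a dichotomous forecast."""
    hits: float
    false_alarms: float
    misses: float
    correct_negatives: float


def bias(t: Table) -> float:
    # Frequency bias: number of "yes" forecasts over number of observed events.
    return (t.hits + t.false_alarms) / (t.hits + t.misses)


def pod(t: Table) -> float:
    # Probability of detection: fraction of observed events that were forecast.
    return t.hits / (t.hits + t.misses)


def threat_score(t: Table) -> float:
    # Threat score (critical success index): hits over all non-trivial outcomes.
    return t.hits / (t.hits + t.false_alarms + t.misses)


def ts_from_bias_pod(b: float, p: float) -> float:
    # TS rewritten in terms of bias b and POD p. With O observed events:
    # hits = p*O, false alarms = (b - p)*O, misses = (1 - p)*O.
    return p / (b + 1.0 - p)


def cpr_threat_score(b: float, p: float) -> float:
    # Assumed critical ratio for TS (derived in this sketch, not quoted):
    # the slope dPOD/dB along which TS stays unchanged.
    return p / (1.0 + b)


def ts_improves(b: float, p: float, d_bias: float, d_pod: float) -> bool:
    # Direct check: does TS increase after the given bias and POD changes?
    return ts_from_bias_pod(b + d_bias, p + d_pod) > ts_from_bias_pod(b, p)


# Hypothetical sample counts giving bias = 1 and POD = 0.6.
table = Table(hits=30, false_alarms=20, misses=20, correct_negatives=930)
b0, p0 = bias(table), pod(table)
cpr = cpr_threat_score(b0, p0)

# Raise bias by 0.1 with a POD gain above, then below, cpr * 0.1.
gain_above = ts_improves(b0, p0, 0.1, 0.04)
gain_below = ts_improves(b0, p0, 0.1, 0.02)

# Lower bias by 0.1 with a POD loss smaller than cpr * 0.1.
loss_small = ts_improves(b0, p0, -0.1, -0.02)

print(b0, p0, cpr, gain_above, gain_below, loss_small)
```

In this example the forecast starts unbiased, yet TS rises both when bias increases with a sufficiently large POD gain and when bias decreases with a sufficiently small POD loss. That is the kind of bias sensitivity the abstract describes: a more favorable score at bias values other than one.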

Corresponding author address: Keith F. Brill, NCEP/HPC, W/NP32, NOAA Science Center, Room 410B-2, 5200 Auth Rd., Camp Springs, MD 20746-4304. Email: keith.brill@noaa.gov
