Assessing the Skill of Yes/No Forecasts for Markov Observations

William Briggs, General Internal Medicine, Weill Cornell Medical College, New York, New York

David Ruppert, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, New York

Abstract

Briggs and Ruppert recently introduced a new, easy-to-calculate economic skill/value score for use in yes/no forecast decisions, of which precipitation forecast decisions are an example. The advantage of this new skill/value score is that its sampling distribution is known, which allows one to perform hypothesis tests on collections of forecasts and to say whether a given skill/value score is significant. Here, the climate skill/value score is extended to the case where the predicted series is first-order Markov in nature, of which, again, precipitation occurrence series can be an example. It is shown that, in general, Markov skill/value differs from, and is more demanding than, persistence skill. Persistence skill is defined as improvement over forecasts that state that the next value in a series will equal the present value. It is also shown that any naive forecast based solely on the Markov parameters is always at least as skillful/valuable as a persistence forecast; in general, persistence forecasts should not be used. The distribution for the Markov skill score is presented, and examples of hypothesis testing for precipitation forecasts are given. These skill scores are graphed for a wide range of forecast/user loss functions, a process that makes their interpretation simple.

Corresponding author address: William Briggs, General Internal Medicine, Weill Cornell Medical College, 525 E. 68th, Box 46, New York, NY 10021. Email: wib2004@med.cornell.edu
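
As a rough illustration of the two naive forecasts contrasted in the abstract, the sketch below builds a persistence forecast (next value equals present value) and a naive forecast based only on estimated first-order Markov transition frequencies, then compares their hit rates. Several things here are assumptions and not taken from the paper: the loss is taken to be symmetric (so the decision threshold is 1/2), the transition frequencies are estimated from the same series being scored, the hit rate stands in for the paper's skill/value score, and the wet/dry series and all function names are made up for illustration.

```python
from collections import Counter


def transition_counts(series):
    """Count (present, next) pairs in a 0/1 series."""
    return Counter(zip(series[:-1], series[1:]))


def persistence_forecasts(series):
    """Forecast each next value as equal to the present value."""
    return list(series[:-1])


def markov_naive_forecasts(series):
    """Forecast the more frequent next value given the present value.

    Assumption: symmetric loss, so the threshold on the estimated
    transition probability is 1/2 (ties go to 0).
    """
    counts = transition_counts(series)
    rule = {
        state: 1 if counts[(state, 1)] > counts[(state, 0)] else 0
        for state in (0, 1)
    }
    return [rule[x] for x in series[:-1]]


def hit_rate(forecasts, observed):
    """Fraction of forecasts that match the observed values."""
    return sum(f == o for f, o in zip(forecasts, observed)) / len(observed)


# Hypothetical wet (1) / dry (0) occurrence series, invented for this sketch.
obs = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]
targets = obs[1:]

persistence_rate = hit_rate(persistence_forecasts(obs), targets)
markov_rate = hit_rate(markov_naive_forecasts(obs), targets)
print(f"persistence: {persistence_rate:.3f}  markov naive: {markov_rate:.3f}")
```

Under these assumptions the Markov-based forecast can never score below persistence on the series used to estimate it, because for each present state it picks the majority next value, whereas persistence is tied to repeating the present state. The paper's own result covers general loss functions and is stated in terms of its skill/value score, which this sketch does not reproduce.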

References
  • Agresti, A., 1990: Categorical Data Analysis. Wiley Series in Probability and Mathematical Statistics, Wiley, 558 pp.

  • Briggs, W. M., 2005: A general method of incorporating forecast cost and loss in value scores. Mon. Wea. Rev., 133, 3393–3397.

  • Briggs, W. M., and R. A. Levine, 1998: Comparison of forecasts using the bootstrap. Preprints, 14th Conf. on Probability and Statistics in the Atmospheric Sciences, Phoenix, AZ, Amer. Meteor. Soc., 1–4.

  • Briggs, W. M., and D. Ruppert, 2005: Assessing the skill of yes/no predictions. Biometrics, 61, 799–807.

  • Briggs, W. M., M. Pocernich, and D. Ruppert, 2005: Incorporating misclassification error in skill assessment. Mon. Wea. Rev., 133, 3382–3392.

  • Brooks, H. E., A. Witt, and M. D. Eilts, 1997: Verification of public weather forecasts available via the media. Bull. Amer. Meteor. Soc., 78, 2167–2177.

  • Drosdowsky, W., and H. Zhang, 2003: Verification of spatial fields. Forecast Verification, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 121–136.

  • Livezey, R. E., 2003: Categorical events. Forecast Verification, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 77–96.

  • Mason, I. B., 2003: Binary events. Forecast Verification, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 37–76.

  • McNemar, L., 1947: Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12, 153–157.

  • Meeden, G., 1979: Comparing two probability appraisers. J. Amer. Stat. Assoc., 74, 299–302.

  • Mosteller, F., 1952: Some statistical problems in measuring the subjective response to drugs. Biometrics, 8, 220–226.

  • Mozer, J. B., and W. M. Briggs, 2003: Skill in real-time solar wind shock forecasts. J. Geophys. Res., 108, 1262–1270.

  • Murphy, A. H., 1991: Forecast verification: Its complexity and dimensionality. Mon. Wea. Rev., 119, 1590–1601.

  • Murphy, A. H., 1997: Forecast verification. Economic Value of Weather and Climate Forecasts, R. W. Katz and A. H. Murphy, Eds., Cambridge University Press, 19–74.

  • Murphy, A. H., and A. Ehrendorfer, 1987: On the relationship between the accuracy and value of forecasts in the cost–loss ratio situation. Wea. Forecasting, 2, 243–251.

  • Murphy, A. H., and R. L. Winkler, 1987: A general framework for forecast verification. Mon. Wea. Rev., 115, 1330–1338.

  • Richardson, D. S., 2000: Skill and relative economic value of the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 126, 649–667.

  • Richardson, D. S., 2001: Measures of skill and value of ensemble prediction systems, their interrelationship and the effect of ensemble size. Quart. J. Roy. Meteor. Soc., 127, 2473–2489.

  • Schervish, M. J., 1989: A general method for comparing probability assessors. Ann. Stat., 17, 1856–1879.

  • Self, S. G., and K. Y. Liang, 1987: Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Amer. Stat. Assoc., 82, 605–610.

  • Wilks, D. S., 1991: Representing serial correlation of meteorological events and forecasts in dynamic decision–analytic models. Mon. Wea. Rev., 119, 1640–1662.

  • Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.

  • Wilks, D. S., 2001: A skill score based on economic value for probability forecasts. Meteor. Appl., 8, 209–219.
