Scoring Rules for Forecast Verification

Laboratory for Environmental Monitoring and Modelling (LaMMA), Toscana, Italy

Abstract

The problem of probabilistic forecast verification is approached from a theoretical point of view, starting from three basic desiderata: additivity, exclusive dependence on physical observations (“locality”), and strictly proper behavior. By imposing these requirements and using only elementary mathematics, a unique measure of forecast goodness is shown to exist. This measure is the logarithmic score, based on the relative entropy between the observed occurrence frequencies and the predicted probabilities for the forecast events. Information theory is then used as a guide to choose the scoring-scale offset for obtaining meaningful and fair skill scores. Finally, the Brier score is assessed and, for single-event forecasts, shown to be equivalent to the second-order approximation of the logarithmic score.

Most of the results presented here are far from new or original; nevertheless, their use still meets with some resistance in the weather forecasting community. This paper aims to provide a clear presentation of the main arguments for using the logarithmic score.

Corresponding author address: Riccardo Benedetti, Laboratory for Environmental Monitoring and Modelling (LaMMA), Building D, via Madonna del Piano, Sesto Fiorentino 50019, Italy. Email: benedetti@lamma.rete.toscana.it
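As an illustration of the relationship between the two scores named in the abstract, the following minimal Python sketch compares, for a single binary event, the relative entropy between an observed frequency and a forecast probability with a quadratic (Brier-type) term. It uses one standard Taylor expansion of the relative entropy around a perfectly calibrated forecast, which is not necessarily the derivation followed in the paper. The function names and the sample values of `f` and `p` are assumptions made for the example only.

```python
import math


def relative_entropy(f, p):
    """Relative entropy (natural log) between Bernoulli(f) and Bernoulli(p).

    f: observed occurrence frequency, p: forecast probability (0 < p < 1).
    This is the expected logarithmic score of p minus that of the ideal forecast f.
    """
    total = 0.0
    for fi, pi in ((f, p), (1.0 - f, 1.0 - p)):
        if fi > 0.0:  # terms with zero frequency contribute nothing
            total += fi * math.log(fi / pi)
    return total


def expected_brier(f, p):
    """Expected squared error of forecast p when the event occurs with frequency f."""
    return f * (1.0 - p) ** 2 + (1.0 - f) * p ** 2


def quadratic_approx(f, p):
    """Second-order Taylor term of relative_entropy(f, p) in (p - f), for 0 < f < 1."""
    return (p - f) ** 2 / (2.0 * f * (1.0 - f))


# Illustrative values only (assumed, not taken from the paper).
f = 0.3
for p in (0.30, 0.32, 0.35, 0.40):
    excess_brier = expected_brier(f, p) - expected_brier(f, f)
    print(p, relative_entropy(f, p), quadratic_approx(f, p), excess_brier)
```

In this sketch the excess expected Brier score equals `(p - f) ** 2`, so the quadratic term is that excess rescaled by `1 / (2 * f * (1 - f))`. The two measures agree closely when the forecast is near the observed frequency and diverge as the forecast error grows.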
