I am grateful to Mike Harrison for arousing my interest in categorical forecast verification and to Ian Mason for interesting discussions and articles concerning the introduction of signal detection ideas into forecast verification. I also thank Francisco Doblas-Reyes, Daniel Rousseau, and John Thornes for stimulating discussions concerning the subtleties and the often nontrivial interpretation of forecast trials. For helpful and expert remarks on statistical matters, I am indebted to Alan Agresti, Philippe Besse, Antoine Falguerolles, and Ian Jolliffe. I also wish to thank Philippe Besse for etymological discussions concerning the rather odd word “odds.”
Agresti, A., 1996: An Introduction to Categorical Data Analysis. John Wiley and Sons, 290 pp.
——, and B. A. Coull, 1998: Approximate is better than “exact” for interval estimation of binomial proportions. Amer. Stat., 52, 119–126.
Bishop, Y. M. M., S. E. Fienberg, and P. W. Holland, 1975: Discrete Multivariate Analysis: Theory and Practice. The MIT Press, 557 pp.
Doolittle, M. H., 1885: The verification of predictions. Amer. Meteor. J., 2, 327–329.
——, 1888: Association ratios. Bull. Philos. Soc. Wash., 10, 83–96.
Finley, J. P., 1884: Tornado predictions. Amer. Meteor. J., 1, 85–88.
Fisher, R. A., 1990: Statistical Methods, Experimental Design, and Scientific Inference. Oxford University Press, 155 pp.
Flueck, J. A., 1987: A study of some measures of forecast verification. Preprints, 10th Conf. on Probability and Statistics in Atmospheric Sciences, Edmonton, AB, Canada, Amer. Meteor. Soc., 69–73.
Gandin, L. S., and A. H. Murphy, 1992: Equitable skill scores for categorical forecasts. Mon. Wea. Rev., 120, 361–370.
Gilbert, G. K., 1884: Finley’s tornado predictions. Amer. Meteor. J., 1, 166–172.
Green, D. M., and J. A. Swets, 1966: Signal Detection Theory and Psychophysics. Wiley, 155 pp. (Reprinted by Robert E. Krieger Publishing Co., 1974.)
Hanssen, A. W., and W. J. A. Kuipers, 1965: On the relationship between the frequency of rain and various meteorological parameters. Meded. Verh., 81, 2–15.
Harvey, L. O., Jr., K. R. Hammond, C. M. Lusk, and E. F. Moss, 1992: The application of signal detection theory to weather forecasting behavior. Mon. Wea. Rev., 120, 863–883.
Heidke, P., 1926: Berechnung des Erfolges und der Güte der Windstärkevorhersagen im Sturmwarnungsdienst (Calculation of the success and goodness of strong wind forecasts in the storm warning service). Geogr. Ann. Stockholm, 8, 301–349.
Macmillan, N. A., and C. D. Creelman, 1991: Detection Theory: A User’s Guide. Cambridge University Press, 155 pp.
Mason, I. B., 1980: Decision-theoretic evaluation of probabilistic predictions. Proc. WMO Symp. on Probabilistic and Statistical Methods in Weather Forecasting, Nice, France, WMO, 219–228.
——, 1982: A model for the assessment of weather forecasts. Aust. Meteor. Mag., 30, 291–303.
——, 1989: Dependence of the critical success index on sample climate and threshold probability. Aust. Meteor. Mag., 37, 75–81.
——, 1997: The weather forecast as a statistical decision: An outline of signal detection theory and ROC analysis in assessment of forecast quality. Proc. Forecast and Warning Verification Workshop, Melbourne, Australia, Australian Bureau of Meteorology, 1–10.
Murphy, A. H., 1993: What is a good forecast? An essay on the nature of goodness in weather forecasting. Wea. Forecasting, 8, 281–293.
——, 1996: The Finley affair: A signal event in the history of forecast verification. Wea. Forecasting, 11, 3–20.
——, and H. Daan, 1985: Forecast evaluation. Probability, Statistics, and Decision Making in the Atmospheric Sciences, A. H. Murphy and R. W. Katz, Eds., Westview Press, 379–437.
Pickup, M. N., 1982: A consideration of the effect of 500 mb cyclonicity on the success of some thunderstorm forecasting techniques. Meteor. Mag., 111, 87–97.
Potts, J. M., C. K. Folland, I. T. Jolliffe, and D. Sexton, 1996: Revised “LEPS” scores for assessing climate model simulations and long-range forecasts. J. Climate, 9, 34–53.
Schaefer, J. T., 1990: The critical success index as an indicator of warning skill. Wea. Forecasting, 5, 570–575.
Seaman, R., I. Mason, and F. Woodcock, 1996: Confidence intervals for some performance measures of yes/no forecasts. Aust. Meteor. Mag., 45, 49–53.
Stanski, H. R., L. J. Wilson, and W. R. Burrows, 1989: Survey of common verification methods in meteorology. WMO/TD-No. 358, World Meteorological Organization, Geneva, Switzerland, 114 pp.
Stephenson, D. B., 1997: Correlation of spatial climate/weather maps and the advantages of using the Mahalanobis metric in predictions. Tellus, 49A, 513–527.
Swets, J. A., and R. M. Pickett, 1982: Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, 253 pp.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 465 pp.
Woodcock, F., 1976: The evaluation of yes/no forecasts for scientific and administrative purposes. Mon. Wea. Rev., 104, 1209–1214.