We thank Robin Hogan and two anonymous reviewers for their comments on earlier versions of this paper, and members of the European Centre for Medium-Range Weather Forecasts Technical Advisory Committee Subgroup on Verification Measures for conversations about this work.