I am grateful to three anonymous reviewers, the editor, and Chris Ferro for perceptive, stimulating, and constructive comments that helped to improve this paper. Caio Coelho provided the data used in the examples. Much of the work leading to the paper was supported by a NERC Discipline Bridging Award, NER/T/S/2003/0126.
Benjamini, Y., and Y. Hochberg, 1995: Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Stat. Soc., 57B, 289–300.
Benjamini, Y., and D. Yekutieli, 2001: The control of the false discovery rate in multiple testing under dependency. Ann. Stat., 29, 1165–1188.
Bradley, A. A., T. Hashino, and S. S. Schwartz, 2003: Distributions-oriented verification of probability forecasts for small data samples. Wea. Forecasting, 18, 903–917.
Briggs, W., M. Pocernich, and D. Ruppert, 2005: Incorporating misclassification error in skill assessment. Mon. Wea. Rev., 133, 3382–3392.
Brown, L. D., T. T. Cai, and A. DasGupta, 2002: Confidence intervals for a binomial proportion and asymptotic expansions. Ann. Stat., 30, 160–201.
Déqué, M., 2003: Continuous variables. Forecast Verification: A Practitioner’s Guide in Atmospheric Science, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 97–119.
Efron, B., 2004: Large-scale simultaneous hypothesis testing: The choice of a null hypothesis. J. Amer. Stat. Assoc., 99, 96–104.
Elmore, K. L., M. E. Baldwin, and D. M. Schultz, 2006: Field significance revisited: Spatial bias errors in forecasts as applied to the Eta Model. Mon. Wea. Rev., 134, 519–531.
Epstein, E. S., 1985: Statistical Inference and Prediction in Climatology: A Bayesian Approach. Meteor. Monogr., Amer. Meteor. Soc., No. 42, 199 pp.
Genovese, C., and L. Wasserman, 2004: A stochastic process approach to false discovery rates. Ann. Stat., 32, 1035–1061.
Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155–167.
Hogg, R. V., and E. A. Tanis, 2001: Probability and Statistical Inference. 6th ed. Prentice Hall, 704 pp.
Kane, T. L., and B. G. Brown, 2000: Confidence intervals for some verification measures—A survey of several methods. Preprints, 15th Conf. on Probability and Statistics in the Atmospheric Sciences, Asheville, NC, Amer. Meteor. Soc., 46–49.
Livezey, R. E., and W. Y. Chen, 1983: Statistical field significance and its determination by Monte Carlo techniques. Mon. Wea. Rev., 111, 46–59.
Mason, I. B., 2003: Binary events. Forecast Verification: A Practitioner’s Guide in Atmospheric Science, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 37–76.
Miao, W., and J. L. Gastwirth, 2004: The effect of dependence on confidence intervals for a population proportion. Amer. Stat., 58, 124–130.
Pearson, E. S., and H. O. Hartley, 1970: Biometrika Tables for Statisticians. Vol. 1. 3d ed. Cambridge University Press, 270 pp.
Pepe, M. S., 2003: The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, 302 pp.
Perone Pacifico, M., C. Genovese, I. Verdinelli, and L. Wasserman, 2004: False discovery rate control for random fields. J. Amer. Stat. Assoc., 99, 1002–1014.
Potts, J. M., 1991: Statistical methods for the comparison of spatial patterns in meteorological variables. Ph.D. thesis, University of Kent at Canterbury, Canterbury, United Kingdom, 271 pp.
Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski, 2005: Using Bayesian model averaging to calibrate forecast ensembles. Mon. Wea. Rev., 133, 1155–1174.
Schenker, N., and J. Gentleman, 2001: On judging the significance of differences by examining the overlap between confidence intervals. Amer. Stat., 55, 182–186.
Seaman, R., I. Mason, and F. Woodcock, 1996: Confidence intervals for some performance measures of yes–no forecasts. Aust. Meteor. Mag., 45, 49–53.
Stephenson, D. B., 2000: Use of the “odds ratio” for diagnosing forecast skill. Wea. Forecasting, 15, 221–232.
Storey, J. D., J. E. Taylor, and D. Siegmund, 2004: Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach. J. Roy. Stat. Soc., 66B, 187–205.
Thornes, J. E., and D. B. Stephenson, 2001: How to judge the quality and value of weather forecast products. Meteor. Appl., 8, 307–314.
Ventura, V., C. J. Paciorek, and J. S. Risbey, 2004: Controlling the proportion of falsely rejected hypotheses when conducting multiple tests with climatological data. J. Climate, 17, 4343–4356.
Woodcock, F., 1976: The evaluation of yes/no forecasts for scientific and administrative purposes. Mon. Wea. Rev., 104, 1209–1214.
Yekutieli, D., and Y. Benjamini, 1999: Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J. Stat. Plan. Info., 82, 171–196.