Forecast Comparison Based on Random Walks

  • 1 George Mason University, Fairfax, Virginia, and Center for Ocean–Land–Atmosphere Studies, Calverton, Maryland
  • 2 Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York, and Center of Excellence for Climate Change Research, Department of Meteorology, King Abdulaziz University, Jeddah, Saudi Arabia

Abstract

This paper proposes a procedure based on random walks for testing and visualizing differences in forecast skill. The test is formally equivalent to the sign test and has numerous attractive statistical properties, including being independent of distributional assumptions about the forecast errors and being applicable to a wide class of measures of forecast quality. While the test is best suited for independent outcomes, it provides useful information even when serial correlation exists. The procedure is applied to deterministic ENSO forecasts from the North American Multimodel Ensemble and yields several revealing results, including 1) the Canadian models are the most skillful dynamical models, even when compared to the multimodel mean; 2) a regression model is significantly more skillful than all but one dynamical model (to which it is equally skillful); and 3) in some cases, there are significant differences in skill between ensemble members from the same model, potentially reflecting differences in initialization. The method requires only a few years of data to detect significant differences in the skill of models with known errors/biases, suggesting that the procedure may be useful for model development and monitoring of real-time forecasts.
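
The abstract describes the procedure only at a high level. As a rough illustration, the following Python sketch (an assumption-laden reconstruction, not the authors' code) counts at each verification time which of two forecasts incurs the smaller error, accumulates the resulting ±1 steps into a random walk, and flags a significant skill difference whenever the walk crosses the ±z_{α/2}√n bounds given by the normal approximation to the sign test. The function name and the use of absolute error as the loss are illustrative choices; any of the "wide class of measures of forecast quality" mentioned above could be substituted.

```python
import numpy as np
from scipy.stats import norm

def random_walk_skill_test(err_a, err_b, alpha=0.05):
    """Random-walk (sign test) comparison of two forecasts.

    err_a, err_b : losses (e.g., absolute errors) of forecasts A and B
                   at the same verification times.
    Returns the cumulative walk and the two-sided significance bound
    after each step.
    """
    err_a = np.asarray(err_a, dtype=float)
    err_b = np.asarray(err_b, dtype=float)
    # +1 when A has the smaller loss, -1 when B does; ties are
    # discarded, as is conventional for the sign test.
    steps = np.sign(err_b - err_a)
    steps = steps[steps != 0]
    walk = np.cumsum(steps)
    n = np.arange(1, len(steps) + 1)
    # Under the null of equal skill each step is +/-1 with p = 1/2,
    # so the walk has mean 0 and variance n after n steps; the normal
    # approximation gives the two-sided bound z_{alpha/2} * sqrt(n).
    bound = norm.ppf(1 - alpha / 2) * np.sqrt(n)
    return walk, bound

# Toy example: forecast A is more accurate on average, so the walk
# should drift upward and eventually cross the significance bound.
rng = np.random.default_rng(0)
err_a = np.abs(rng.normal(0.0, 1.0, 200))
err_b = np.abs(rng.normal(0.0, 1.5, 200))
walk, bound = random_walk_skill_test(err_a, err_b)
print(walk[-1], bound[-1])  # A is favored if walk[-1] > bound[-1]
```

Plotting the walk together with ±bound gives the visualization the abstract refers to: an upward drift outside the envelope favors A, a downward drift favors B, and a walk that wanders inside the envelope is consistent with equal skill. Note that the bound sketched here is pointwise for each step, not a simultaneous band over the whole walk.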

Corresponding author address: Timothy DelSole, George Mason University, 4400 University Dr., 112 Research Hall, Mail Stop 2B3, Fairfax, VA 22030. E-mail: delsole@cola.iges.org
