Ensemble Averaging and the Curse of Dimensionality

Danish Meteorological Institute, Copenhagen, Denmark

Abstract

When comparing climate models to observations, it is often observed that the mean over many models has smaller errors than most or all of the individual models. This paper will show that a general consequence of the nonintuitive geometric properties of high-dimensional spaces is that the ensemble mean often outperforms the individual ensemble members. This also explains why the ensemble mean often has an error that is 30% smaller than the median error of the individual ensemble members. The only assumption that needs to be made is that the observations and the models are independently drawn from the same distribution. An important and relevant property of high-dimensional spaces is that independent random vectors are almost always orthogonal. Furthermore, while the lengths of random vectors are large and almost equal, the ensemble mean is special, as it is located near the otherwise vacant center. The theory is first explained by an analysis of Gaussian- and uniformly distributed vectors in high-dimensional spaces. A subset of 17 models from the CMIP5 multimodel ensemble is then used to demonstrate the validity and robustness of the theory in realistic settings.
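
The geometric argument can be illustrated with a small numerical sketch. It assumes the simplest case mentioned above: the observations and every model are independent draws from the same standard Gaussian distribution. The function names, the dimension of 10 000, and the random seed are illustrative choices, not taken from the paper; the ensemble size of 17 merely mirrors the size of the CMIP5 subset.

```python
import numpy as np


def ensemble_error_ratio(n_dims=10_000, n_models=17, seed=0):
    """Ratio of the ensemble-mean error to the median member error.

    Assumption: observations and models are i.i.d. standard Gaussian
    vectors in n_dims dimensions (a toy setting, not real climate data).
    """
    rng = np.random.default_rng(seed)
    obs = rng.standard_normal(n_dims)
    models = rng.standard_normal((n_models, n_dims))

    # Euclidean distance from each ensemble member to the observations.
    member_errors = np.linalg.norm(models - obs, axis=1)
    # Euclidean distance from the ensemble mean to the observations.
    mean_error = np.linalg.norm(models.mean(axis=0) - obs)
    return mean_error / np.median(member_errors)


def max_abs_cosine(n_dims=10_000, n_vectors=17, seed=0):
    """Largest |cosine| between any two distinct random vectors."""
    rng = np.random.default_rng(seed)
    vectors = rng.standard_normal((n_vectors, n_dims))
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

    cosines = vectors @ vectors.T
    # Drop the diagonal, which holds each vector's cosine with itself.
    off_diagonal = cosines[~np.eye(n_vectors, dtype=bool)]
    return np.abs(off_diagonal).max()


if __name__ == "__main__":
    print("mean error / median member error:", ensemble_error_ratio())
    print("largest |cosine| between members:", max_abs_cosine())
```

Under these assumptions each member lies at a distance of roughly sqrt(2D) from the observations, while the mean of K members lies at roughly sqrt(D(1 + 1/K)). The ratio should therefore be close to sqrt((1 + 1/K)/2), about 0.73 for K = 17 and approaching 0.71 for large ensembles, which is the roughly 30% reduction described in the abstract. The cosines between distinct vectors should all be close to zero, reflecting the near-orthogonality of independent random vectors in high dimensions.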

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Bo Christiansen, boc@dmi.dk

A comment/reply has been published regarding this article and can be found at http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-18-0274.1 and http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-18-0416.1
