Calibrated Surface Temperature Forecasts from the Canadian Ensemble Prediction System Using Bayesian Model Averaging

Laurence J. Wilson, Meteorological Research Division, Environment Canada, Dorval, Quebec, Canada

Stephane Beauregard, Canadian Meteorological Centre, Meteorological Service of Canada, Dorval, Quebec, Canada

Adrian E. Raftery, Department of Statistics, University of Washington, Seattle, Washington

Richard Verret, Canadian Meteorological Centre, Meteorological Service of Canada, Dorval, Quebec, Canada

Abstract

Bayesian model averaging (BMA) has recently been proposed as a way of correcting underdispersion in ensemble forecasts. BMA is a standard statistical procedure for combining predictive distributions from different sources. The output of BMA is a probability density function (pdf), which is a weighted average of pdfs centered on the bias-corrected forecasts. The BMA weights reflect the relative contributions of the component models to the predictive skill over a training sample. The variance of the BMA pdf is made up of two components, the between-model variance and the within-model error variance, both estimated from the training sample. This paper describes the results of experiments with BMA to calibrate surface temperature forecasts from the 16-member Canadian ensemble system. Using one year of ensemble forecasts, BMA was applied for different training periods ranging from 25 to 80 days. The method was trained on the most recent forecast period, then applied to the next day’s forecasts as an independent sample. This process was repeated through the year, and forecast quality was evaluated using rank histograms, the continuous ranked probability score, and the continuous ranked probability skill score. An examination of the BMA weights provided a useful comparative evaluation of the component models, both for the ensemble itself and for the ensemble augmented with the unperturbed control forecast and the higher-resolution deterministic forecast. Training periods around 40 days provided a good calibration of the ensemble dispersion. Both full regression and simple bias-correction methods worked well to correct the bias, except that the full regression failed to completely remove seasonal trend biases in spring and fall. Simple correction of the bias was sufficient to produce positive forecast skill out to 10 days with respect to climatology, and BMA improved this skill further. The addition of the control forecast and the full-resolution model forecast to the ensemble produced modest improvement in the forecasts for ranges out to about 7 days. Finally, BMA produced significantly narrower 90% prediction intervals compared to a simple Gaussian bias correction, while achieving similar overall accuracy.
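
The structure of the BMA pdf described in the abstract can be illustrated with a minimal sketch. It assumes Gaussian component pdfs with a single common within-model standard deviation, which the abstract does not state; the function names, the weights, and the forecast values are hypothetical and serve only to show how the mixture pdf and its two variance components fit together.

```python
import math


def bma_pdf(y, forecasts, weights, sigma):
    """Density at y of a weighted average of Gaussian pdfs.

    Each pdf is centered on one bias-corrected forecast. A Gaussian
    kernel with a common sigma is an assumption of this sketch.
    """
    total = sum(weights)
    norm = sigma * math.sqrt(2.0 * math.pi)
    return sum(
        (w / total) * math.exp(-0.5 * ((y - f) / sigma) ** 2) / norm
        for w, f in zip(weights, forecasts)
    )


def bma_mean_and_variance(forecasts, weights, sigma):
    """Return (mean, total variance, between-model part, within-model part)."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalise so the weights sum to 1
    mean = sum(wk * fk for wk, fk in zip(w, forecasts))
    # Between-model variance: weighted spread of the forecasts about the mean.
    between = sum(wk * (fk - mean) ** 2 for wk, fk in zip(w, forecasts))
    # Within-model error variance: the common kernel variance.
    within = sigma ** 2
    return mean, between + within, between, within


# Hypothetical two-member example (temperatures in degrees Celsius).
mean, variance, between, within = bma_mean_and_variance([1.0, 3.0], [0.5, 0.5], 1.0)
density_at_mean = bma_pdf(mean, [1.0, 3.0], [0.5, 0.5], 1.0)
```

In this sketch an underdispersive ensemble has a small between-model term, and the within-model term supplies the missing spread. Estimating the weights and sigma from a training sample (for example with the EM algorithm) is not shown here.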

Corresponding author address: Laurence J. Wilson, Environment Canada, 2121 Transcanada Highway, 5th Floor, Dorval, QC H9P 1J3, Canada. Email: lawrence.wilson@ec.gc.ca

