A Stratified Sampling Approach for Improved Sampling from a Calibrated Ensemble Forecast Distribution

Yiming Hu (College of Hydrology and Water Resources, Hohai University, Nanjing, China; UNESCO-IHE Institute for Water Education, Delft, Netherlands),
Maurice J. Schmeits (Royal Netherlands Meteorological Institute, De Bilt, Netherlands),
Schalk Jan van Andel (UNESCO-IHE Institute for Water Education, Delft, Netherlands),
Jan S. Verkade (Deltares, Delft, Netherlands),
Min Xu (Deltares, Delft, Netherlands),
Dimitri P. Solomatine (UNESCO-IHE Institute for Water Education, Delft, Netherlands; Water Resources Section, Delft University of Technology, Delft, Netherlands), and
Zhongmin Liang (National Cooperative Innovation Centre for Water Safety and Hydro-Science, and State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing, China)

Abstract

Before the Schaake shuffle or empirical copula coupling (ECC) can be used to reconstruct the dependence structure of postprocessed ensemble meteorological forecasts, discrete samples must be drawn from each postprocessed continuous probability density function (pdf); this sampling step is the focus of this paper. In addition to the equidistance quantiles (EQ) and independent random (IR) sampling methods in common use, a stratified sampling (SS) method is proposed. The performance of the three sampling methods is compared using calibrated GFS ensemble precipitation reforecasts over the Xixian basin in China. The ensemble reforecasts are first calibrated using heteroscedastic extended logistic regression (HELR), and the three sampling methods are then used to sample the calibrated pdfs with a varying number of discrete samples. Finally, the effect of the sampling method on the reconstruction of ensemble members with a preserved spatial dependence structure is analyzed by using EQ, IR, and SS within ECC to reconstruct postprocessed ensemble members for four stations in the Xixian basin. There are three main results. 1) The HELR model significantly improves on the raw ensemble forecast; it clearly improves the mean and dispersion of the predictive distribution. 2) Compared to EQ and IR, SS covers the tails of the calibrated pdfs better and yields a better dispersion of the calibrated ensemble forecasts. In terms of probabilistic verification metrics such as the ranked probability skill score (RPSS), SS is slightly better than EQ and clearly better than IR, whereas in terms of the deterministic verification metric, root-mean-square error, EQ is slightly better than SS. 3) ECC-SS, ECC-EQ, and ECC-IR all calibrate the raw ensemble forecast, but ECC-SS shows a better dispersion than ECC-EQ and ECC-IR in this study.
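
As a rough illustration of the three sampling schemes and the ECC reordering step, the following Python sketch may help. It is not the authors' implementation, and the abstract gives no formulas, so several points are assumptions: EQ is taken to use the probabilities i/(m+1), SS is taken to draw one uniform probability inside each of m equal-probability strata, and the calibrated pdf is stood in for by a logistic distribution left-censored at zero with placeholder parameters `mu` and `scale`. All function names are hypothetical.

```python
import math
import random


def eq_probs(m):
    """Equidistant probabilities i/(m+1), i = 1..m (assumed EQ definition)."""
    return [i / (m + 1) for i in range(1, m + 1)]


def ir_probs(m, rng):
    """m independent uniform probabilities (IR sampling)."""
    return [rng.random() for _ in range(m)]


def ss_probs(m, rng):
    """One uniform probability inside each of m equal-probability strata
    [i/m, (i+1)/m), i = 0..m-1 (assumed SS definition)."""
    return [(i + rng.random()) / m for i in range(m)]


def censored_logistic_quantile(p, mu, scale):
    """Quantile of a logistic distribution left-censored at zero.
    mu and scale are placeholder parameters, not fitted HELR values."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logit finite
    return max(0.0, mu + scale * math.log(p / (1.0 - p)))


def sample_members(probs, mu, scale):
    """Map probabilities to sorted discrete samples of the calibrated pdf."""
    return sorted(censored_logistic_quantile(p, mu, scale) for p in probs)


def ecc_reorder(raw_members, calibrated_samples):
    """Give each raw member the calibrated sample that has the same rank,
    so the rank order of the raw ensemble is preserved (ECC step)."""
    ordered = sorted(calibrated_samples)
    # Member indices from smallest to largest raw value; ties keep input order.
    order = sorted(range(len(raw_members)), key=lambda k: raw_members[k])
    reordered = [0.0] * len(raw_members)
    for rank, k in enumerate(order):
        reordered[k] = ordered[rank]
    return reordered


if __name__ == "__main__":
    rng = random.Random(0)
    raw = [2.1, 0.0, 5.4, 1.3, 0.7]  # made-up raw ensemble at one station
    m = len(raw)
    for name, probs in [("EQ", eq_probs(m)),
                        ("IR", ir_probs(m, rng)),
                        ("SS", ss_probs(m, rng))]:
        members = sample_members(probs, mu=1.0, scale=1.5)
        print(name, ecc_reorder(raw, members))
```

Applying `ecc_reorder` station by station with the same raw ensemble ranks is what carries the raw ensemble's spatial rank structure over to the calibrated members. Under these assumptions, SS keeps one sample per stratum as EQ does, but the lowest and highest samples can fall anywhere in the outer strata rather than at fixed probabilities.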

Corresponding author address: Dr. Yiming Hu, College of Hydrology and Water Resources, Hohai University, No. 1 Xikang Road, Nanjing City, Nanjing 210098, China. E-mail: hymkyan@163.com
