Probabilistic Forecasting of Cloud-Base Height and Visibility Using Quantile Regression Forests, Based on NWP and Observation Features

Dirk Wolters, Maurice Schmeits, and Kirien Whan
Royal Netherlands Meteorological Institute (KNMI), Utrecht, Netherlands

Abstract

We have applied quantile regression forests (QRFs) to generate probabilistic forecasts of the weather conditions associated with low-visibility procedures (LVPs) at Schiphol Airport (Amsterdam, the Netherlands). LVPs are determined by combined thresholds of cloud-base height and (runway) visibility. Forecasts of these conditions are critical for airport operations because they inform operational planning and can help minimize meteorologically induced disruptions. Using 5 years of hourly data, we performed a forward feature selection, optimized the QRF hyperparameters for this application, and evaluated the model's performance for different forecast lead times and LVP classes. LVP forecasts were obtained by combining separate models for cloud-base height and (runway) visibility, with a Schaake shuffle applied to restore the dependencies between these two parameters. The verification showed consistently positive Brier skill scores (BSSs) for the three most common LVP classes: marginal, A, and B. Although the skill was not always positive for the more extreme LVP classes, C and D, we argue that forecasters might still derive valuable indications from the forecast system for these conditions. We demonstrate the operational utility of the system with an example, which also illustrates how Shapley additive explanation (SHAP) values support interpretability. Our results underscore the potential of QRF for probabilistic forecasts of meteorological conditions, for aviation and other purposes.
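
To make the combination step in the abstract more concrete, the following is a minimal sketch of a rank-based Schaake shuffle applied to two separately forecast variables, followed by a joint LVP probability and a Brier skill score. It is not the authors' implementation. The function names, the threshold values, the "either threshold is crossed" rule, and the toy numbers are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def schaake_shuffle(samples, template):
    """Reorder each column of `samples` to follow the rank order of `template`.

    samples:  (n_members, n_vars) draws made independently per variable,
              e.g. quantiles from separate cloud-base and visibility models.
    template: (n_members, n_vars) historical observations whose rows carry
              a realistic dependence between the variables.
    """
    samples = np.asarray(samples, dtype=float)
    template = np.asarray(template, dtype=float)
    shuffled = np.empty_like(samples)
    for j in range(samples.shape[1]):
        # Rank (0..n-1) of each template row within column j.
        ranks = np.argsort(np.argsort(template[:, j]))
        # Give each row the sample value that has the same rank.
        shuffled[:, j] = np.sort(samples[:, j])[ranks]
    return shuffled


def lvp_probability(members, ceiling_max, visibility_max):
    """Fraction of members with cloud base OR visibility below its threshold.

    The 'or' rule and both thresholds are hypothetical placeholders.
    """
    members = np.asarray(members, dtype=float)
    hit = (members[:, 0] < ceiling_max) | (members[:, 1] < visibility_max)
    return float(hit.mean())


def brier_skill_score(prob, obs, prob_ref):
    """BSS = 1 - BS / BS_ref, with BS the mean squared probability error."""
    prob, obs, prob_ref = (
        np.asarray(a, dtype=float) for a in (prob, obs, prob_ref)
    )
    bs = np.mean((prob - obs) ** 2)
    bs_ref = np.mean((prob_ref - obs) ** 2)
    return float(1.0 - bs / bs_ref)


# Toy data: columns are (cloud-base height, visibility); values are made up.
samples = np.array([[100.0, 5000.0], [300.0, 1000.0], [200.0, 3000.0]])
# Template in which low cloud base coincides with low visibility.
template = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

shuffled = schaake_shuffle(samples, template)
p_independent = lvp_probability(samples, ceiling_max=150.0, visibility_max=2000.0)
p_shuffled = lvp_probability(shuffled, ceiling_max=150.0, visibility_max=2000.0)
print(shuffled)
print(p_independent, p_shuffled)
```

In this toy case the marginal values of each variable are unchanged by the shuffle, but the joint probability differs between the unshuffled and shuffled members, which is why restoring the dependence matters when an event is defined by combined thresholds.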

© 2025 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Dirk Wolters, dirk.wolters@knmi.nl
