Extended range machine-learning severe weather guidance based on the operational GEFS

Adam J. Clark a NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
c School of Meteorology, U. of Oklahoma, Norman, OK

Kimberly A. Hoogewind a NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
b Cooperative Institute for Severe and High-Impact Weather Research and Operations, U. of Oklahoma, Norman, OK

Aaron J. Hill c School of Meteorology, U. of Oklahoma, Norman, OK

Eric D. Loken a NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
b Cooperative Institute for Severe and High-Impact Weather Research and Operations, U. of Oklahoma, Norman, OK

Michael J. Hosek a NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
b Cooperative Institute for Severe and High-Impact Weather Research and Operations, U. of Oklahoma, Norman, OK
c School of Meteorology, U. of Oklahoma, Norman, OK


Abstract

Hill et al. (2023) demonstrated promising results for 1–8-day severe weather predictions using a random forest (RF) trained with Global Ensemble Forecast System Reforecasts (GEFS/R) and applied to operational GEFS forecasts. However, the skill of the reforecasts may be affected by using fewer members and coarser initial conditions relative to operational GEFS forecasts. Thus, this work builds on Hill et al. by formulating and testing a similar RF model using operational GEFS data for training and testing, instead of reforecasts, to produce 1–15-day severe weather predictions.

Prior to training RFs on operational and reforecast GEFS data, feature engineering experiments were conducted to optimize the RF configuration and assess performance sensitivities. These experiments found: (1) Forecast skill degraded only modestly as the number of members decreased. (2) RFs performed best using simple ensemble means as predictors rather than percentiles or individual members. (3) Thermodynamic predictors had the greatest impact on RF performance. (4) A multi-model RF combining predictors from GEFS and ECMWF’s Integrated Forecast System improved upon single-model RFs. Using the optimal RF configuration, detailed performance characteristics were presented and four case studies were analyzed. Finally, the optimal RF trained on operational GEFS forecasts significantly outperformed an identically configured RF trained on reforecasts. The superior performance of the RF trained on operational data could be related to the operational GEFS forecasts themselves performing better than the reforecasts, and/or to the storm report database better sampling severe weather in more recent years.
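
The predictor choices compared in the feature engineering experiments (ensemble means, percentiles, and reduced member counts) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the predictor names `cape` and `shear`, the percentile levels, and the numerical values are all hypothetical, and the sketch covers only how one row of RF inputs might be built at a single grid point and lead time.

```python
from statistics import fmean, quantiles


def subset_members(members, k):
    """Keep the first k ensemble members (to probe sensitivity to ensemble size)."""
    return members[:k]


def mean_features(members):
    """One feature per predictor: the simple ensemble mean."""
    names = sorted(members[0])
    return {f"{name}_mean": fmean(m[name] for m in members) for name in names}


def percentile_features(members, percentiles=(10, 50, 90)):
    """Several features per predictor: selected ensemble percentiles."""
    features = {}
    for name in sorted(members[0]):
        # quantiles(..., n=100) returns 99 cut points; index p - 1 is the p-th percentile.
        cuts = quantiles([m[name] for m in members], n=100, method="inclusive")
        for p in percentiles:
            features[f"{name}_p{p}"] = cuts[p - 1]
    return features


# Hypothetical member values for one grid point and lead time (not from the paper).
members = [
    {"cape": 1000.0, "shear": 15.0},
    {"cape": 1500.0, "shear": 18.0},
    {"cape": 2000.0, "shear": 21.0},
    {"cape": 2500.0, "shear": 24.0},
]

mean_row = mean_features(members)                        # ensemble-mean predictors
pct_row = percentile_features(members)                   # percentile predictors
small_row = mean_features(subset_members(members, 2))    # fewer members
```

Each returned dictionary corresponds to one candidate input row for an RF. The mean-based row has one value per predictor, whereas the percentile-based row has one value per predictor and percentile level, so it enlarges the feature space accordingly.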

© 2025 American Meteorological Society. This is an Author Accepted Manuscript distributed under the terms of the default AMS reuse license. For information regarding reuse and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Adam J. Clark, adam.clark@noaa.gov
