Evaluation of 0–6-Hour Forecasts from the Experimental Warn-on-Forecast System and the Hybrid Analysis and Forecast System for Real-Time Cases in 2021

Noah T. Carpenter Cooperative Institute for Severe and High-Impact Weather Research and Operations, University of Oklahoma, Norman, Oklahoma
School of Meteorology, University of Oklahoma, Norman, Oklahoma
ICF International Inc., Reston, Virginia

Jidong Gao NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
School of Meteorology, University of Oklahoma, Norman, Oklahoma

Adam Clark NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
School of Meteorology, University of Oklahoma, Norman, Oklahoma

Patrick Burke NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma

Patrick Skinner Cooperative Institute for Severe and High-Impact Weather Research and Operations, University of Oklahoma, Norman, Oklahoma
NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma
School of Meteorology, University of Oklahoma, Norman, Oklahoma

Yunheng Wang Cooperative Institute for Severe and High-Impact Weather Research and Operations, University of Oklahoma, Norman, Oklahoma
NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma

Kent Knopfmeier Cooperative Institute for Severe and High-Impact Weather Research and Operations, University of Oklahoma, Norman, Oklahoma
NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma

Sijie Pan NOAA/OAR/Global Systems Laboratory, Boulder, Colorado

Brian Matilla Cooperative Institute for Severe and High-Impact Weather Research and Operations, University of Oklahoma, Norman, Oklahoma
NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma

Joshua Martin Cooperative Institute for Severe and High-Impact Weather Research and Operations, University of Oklahoma, Norman, Oklahoma
NOAA/OAR/National Severe Storms Laboratory, Norman, Oklahoma


Abstract

This study compares real-time forecasts produced by the Warn-on-Forecast System (WoFS) and a hybrid ensemble and variational data assimilation and prediction system (WoF-Hybrid) for 31 events during 2021. Object-based verification is used to quantify and compare strengths and weaknesses of WoFS ensemble forecasts with 3-km horizontal grid spacing and WoF-Hybrid deterministic forecasts with 1.5-km horizontal grid spacing. The goal of this comparison is to provide evidence as to whether WoF-Hybrid has performance characteristics that complement or improve upon those of WoFS. Results indicate that both systems provide similar accuracy for timing and placement of thunderstorm objects identified using simulated reflectivity. WoF-Hybrid provides more accurate forecasts of updraft helicity tracks. Differences in forecast quality are case dependent; the largest difference in accuracy favoring WoF-Hybrid occurs in eight cases identified as “high-impact” by the quantity of National Weather Service Local Storm Reports, while WoFS performance is favored at short lead times for 10 “moderate-” and 13 “low-impact” events. WoF-Hybrid reflectivity objects are closer in size and location to observed objects. However, a higher thunderstorm overprediction bias is identified in WoF-Hybrid, particularly early in the forecast. Two severe weather events are selected for detailed investigation. In the case of 26 May, both systems had similar skill; however, for 10 December, WoF-Hybrid forecasts significantly outperformed WoFS forecasts. These results show improved performance for WoF-Hybrid over WoFS under certain regimes, which warrants further investigation. Understanding the reasons for these differences will help incorporate higher-resolution modeling into Warn-on-Forecast systems.

Significance Statement

The NOAA Warn-on-Forecast (WoF) project uses advanced data assimilation for rapidly updating numerical weather prediction systems to provide forecasts of individual thunderstorms. Forecasts show promise for enabling greater warning lead time for some storms. The flagship Warn-on-Forecast System (WoFS) is a 36-member analysis and 18-member forecast system at 3-km grid spacing. The project also produced a single-member system that employs variational analysis and produces a deterministic forecast at 1.5-km grid spacing (WoF-Hybrid). This study evaluates and compares the performance of WoFS and WoF-Hybrid for 31 severe weather events that occurred during 2021. Results show that WoF-Hybrid predicts storm rotation particularly well compared to WoFS, and several other strengths and limitations of both systems are identified. This research may help us understand the complementary nature of the two systems and improve our ability to provide more reliable 0–6-h forecasts in the future.

© 2025 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Jidong Gao, Jidong.Gao@noaa.gov

1. Introduction

The NOAA National Severe Storms Laboratory (NSSL) Warn-on-Forecast (WoF) project began in 2009 and aimed to provide forecasters with probabilistic guidance for individual thunderstorms and their hazards at 0–6-h lead times. This, in turn, may aid National Weather Service (NWS) forecasters in increasing warning lead times for hazardous weather events like tornadoes, large hail, straight-line winds, and flash flooding. The WoF project has built a rapidly updating, ensemble data assimilation and forecasting system known as the Warn-on-Forecast System (WoFS) (Heinselman et al. 2024). In a parallel effort, NSSL also developed a deterministic hybrid ensemble variational analysis and forecast system supported by the WoF project (WoF-Hybrid; Gao et al. 2013; Wang et al. 2019; Hu et al. 2021). This hybrid 3DVAR-EnKF data assimilation and forecast system was introduced as another method to forecast individual thunderstorms in the 0–6-h time frame (Gao et al. 2021). Both systems can supplement observation-based severe weather warning guidance within the watch-to-warning time frame by rapidly assimilating radar, satellite, and other observations into a numerical weather prediction system that is tuned for severe weather prediction. Application of such tools may help forecasters extend warning lead times; rapidly updating storm-scale ensemble guidance is needed for 0–6-h lead times at which existing operational systems insufficiently address the uncertainty and rapid evolution associated with individual mesoscale thunderstorm events.

Three major differences between WoFS and WoF-Hybrid are 1) the data assimilation methodology, 2) the horizontal grid spacing (1.5 km for WoF-Hybrid and 3 km for WoFS), and 3) the forecast type: WoF-Hybrid produces a single deterministic forecast, whereas WoFS is an ensemble that emphasizes probabilistic prediction. WoF-Hybrid was tested alongside WoFS in the 2017–21 Spring Forecasting Experiments (SFEs; Gao et al. 2017, 2021). One major drawback of the WoF-Hybrid system was that many spurious convective cells were produced in the first-hour forecasts, which may be attributed to both model errors and excessive moisture introduced by the data assimilation process (Wang et al. 2018). The WoFS ensemble and WoF-Hybrid produced similar results when comparing reflectivity output, but WoF-Hybrid had relatively more accurate updraft helicity (UH) tracks in shape, size, and location (Gao et al. 2022). This suggests that the inclusion of WoF-Hybrid in tandem with the WoFS ensemble may provide useful information (Hu et al. 2021), but this hypothesis has not been systematically tested. This study aims to help assess the strengths and shortcomings of WoF-Hybrid compared to WoFS through an objective verification study on multiple forecast products over a large number of cases.

The remainder of this study is organized as follows: A brief description of WoFS and WoF-Hybrid is given in section 2. The verification methods used in this study and a description of real-time cases are included in section 3. Section 4 details results for reflectivity and UH forecasts evaluated quantitatively and qualitatively. Section 5 provides concluding remarks, limitations, and ideas for future work.

2. Brief description of WoFS and WoF-Hybrid

WoFS is an experimental, real-time, convection-allowing ensemble that aims to provide rapidly updating probabilistic storm-scale guidance along the convective watch-to-warning timeline (Heinselman et al. 2023). WoFS uses the Weather Research and Forecasting (WRF) Model as its convective-scale NWP model and, in recent years, the Gridpoint Statistical Interpolation–based ensemble Kalman filter (GSI-EnKF; Liu et al. 2018) as its data assimilation system. The GSI-EnKF can assimilate surface data from Automated Surface Observing Systems (ASOS), radar radial velocity and reflectivity data from the WSR-88D radar network, and data from the Geostationary Operational Environmental Satellite-R Series (GOES-R) (Yussouf and Knopfmeier 2019; Jones et al. 2018, 2020). The system comprises 36 members with 3-km grid spacing run over a 900 km × 900 km domain where storms are expected to develop on a given day (Yussouf and Knopfmeier 2019; Jones et al. 2020). Each ensemble member has a unique combination of initial conditions, boundary conditions, radiation schemes, and planetary boundary layer physics schemes. Forecasts from 18 members are launched every 30 min (Skinner et al. 2018; Clark et al. 2021a,b). Forecasts are run to 6 h of lead time at the top of each hour and to 3 h at the bottom of each hour. Initial background fields and boundary conditions for WoFS are derived from forecasts generated by a 36-member, hourly cycled High-Resolution Rapid Refresh Data Assimilation System (HRRRDAS; Dowell et al. 2022; James et al. 2022) initialized at 1400 UTC.

Parallel efforts were also made to develop a deterministic analysis and forecast system at NSSL using the WRF Model and a three-dimensional variational data assimilation (3DVAR) scheme (Gao et al. 2013; Stensrud et al. 2013). Flow-dependent covariances from ensemble forecasts can also be incorporated into this 3DVAR system (Gao and Stensrud 2014; Gao et al. 2017). Building on these capabilities, a real-time, weather-adaptive, dual-resolution, hybrid variational ensemble analysis and forecast system, referred to as WoF-Hybrid, has been developed at NSSL (Wang et al. 2019; Pan et al. 2021; Gao et al. 2021, 2022). WoF-Hybrid was also designed to assimilate observations from multiple scales (Gao et al. 2013). Specifically, the sparse sounding and wind profiler data are assimilated in the first pass. Surface data are then assimilated in the second pass, followed by high-density WSR-88D radar data and satellite data. This procedure ensures that observations at different scales are assimilated with appropriate influence radii, similar to a scale-dependent localization scheme implemented in an ensemble Kalman filter. WoF-Hybrid initial background fields and boundary conditions are obtained similarly to WoFS (Clark et al. 2021a). Complementing the ensemble-based WoFS, WoF-Hybrid uses the flow-dependent background error covariances derived from the 3-km WoFS background and provides one deterministic 0–6-h forecast each hour at 1.5-km grid spacing over the WoFS 900 km × 900 km domain.

Both WoFS and WoF-Hybrid typically run from 1700 to 0300 UTC (Wang et al. 2019; Gao et al. 2021; Pan et al. 2021; Clark et al. 2021a). There are several reasons for this operational time frame. First, severe weather events frequently occur during the afternoon and late evening. Second, we test the systems during the Hazardous Weather Testbed (HWT) Spring Forecasting Experiment period, when invited forecasters from the National Weather Service usually evaluate the performance of the systems during the afternoon and late evening hours. Third, we allocate computer resources from 0400 to 1600 UTC to rerun cases if issues arise during real-time experiments. However, nocturnal warning operations could also benefit from these real-time runs, which could be extended to overnight hours in future operations.

As described above, both systems are capable of assimilating high-density WSR-88D radar data every 15 min. However, radar data are preprocessed differently for each system. WoFS uses radar data preprocessed to a 5-km horizontal resolution, whereas WoF-Hybrid uses data preprocessed to the same 1.5-km horizontal model grid resolution. Consequently, WoF-Hybrid assimilates more radar data than WoFS.

3. Description of real-time cases and verification methods

a. Cases description

In this study, we evaluate 31 cases for which both WoFS and WoF-Hybrid forecasts are available from April to December 2021. Warn-on-Forecast runs have traditionally been targeted toward days and regions with a substantial severe weather risk and/or toward other hazards such as flash flooding and landfalling tropical cyclones. Most runs were performed during a 5-week period, usually from late April to early June (excluding weekends). In 2021, Warn-on-Forecast was run 44 times either in real time or to inform retrospective case studies. The 31 cases studied here represent dates for which both WoFS and WoF-Hybrid output were available and were not compromised by missing forecast data or Multi-Radar/Multi-Sensor (MRMS) data.

Forecasts of reflectivity and UH are verified against the NSSL MRMS reflectivity and azimuthal wind shear (AWS) products, respectively (Smith et al. 2016). Table 1 provides a detailed breakdown of each case, including the number of tornado, wind, and hail reports; the highest Storm Prediction Center (SPC) risk category; event severity; and any 6-h forecast periods that are missing from the analysis.

Table 1.

Summary of 2021 WoFS/WoF-Hybrid real-time cases. For each case, the number of tornado, hail, wind, and total reports within the WoFS domain, maximum SPC risk category for the 1630 UTC Day 1 Convective Outlook, event severity per the criteria in this study, and missing initialization times are provided.


We categorize the 31 cases into three groups (high-, mid-, and low-severity/impact events) based on the number of filtered SPC storm reports in the WoF analysis and forecast domains. Here, “filtered” storm reports are the corrected reports that remain after missed and redundant reports have been accounted for (https://www.spc.noaa.gov/faq/index.html#6.8). A date is deemed a high-impact event if it has at least 50 storm reports, a midimpact event if it has more than 10 but fewer than 50 storm reports, and a low-impact event if it has 10 or fewer storm reports. There are a total of 8 high-impact, 10 midimpact, and 13 low-impact events in this study. Table 2 shows how this severity-based partitioning of events relates to the maximum SPC outlook category within the WoFS domains.
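The report-count thresholds above can be sketched as a small function. This is a minimal illustration, not the authors' code: the function name and the example counts are hypothetical, and the mid category is taken to mean more than 10 but fewer than 50 reports, which is the reading implied by the other two bounds.

```python
def classify_event(n_reports: int) -> str:
    """Return the impact category for a case from its filtered storm-report count.

    Thresholds follow the text: >= 50 is high, <= 10 is low.
    Assumption: everything in between (11-49) is mid.
    """
    if n_reports < 0:
        raise ValueError("report count must be non-negative")
    if n_reports >= 50:
        return "high"
    if n_reports > 10:
        return "mid"
    return "low"


# Hypothetical report counts for illustration only (not taken from Table 1).
example_counts = [3, 10, 11, 49, 50, 212]
categories = [classify_event(n) for n in example_counts]
print(categories)
```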

Table 2.

Summary of all 2021 cases with respect to maximum SPC risk category and observed event severity.


b. Object-based verification

The object-based verification method described by Guerra et al. (2022) was employed in this study. This method yields a collection of hits and misses from which contingency table-based statistics, including the critical success index (CSI), probability of detection (POD), success ratio (SR), false alarm ratio (FAR; 1 − SR), and bias, can be assessed. Additionally, statistics of object diagnostic properties, such as average area ratio (AAR) and average minimum distance (AMD), can be produced for matched objects. To determine the AAR, the area covered by each respective forecast object is divided by the area associated with the observed object with which it was matched. Then, the average of all ratios within each time step is taken, yielding the AAR. For the AMD calculation, minimum distance refers to the shortest line that may be drawn between any one point along the boundary of a forecast object and any one point along the boundary of its matched observed object. The AMD is the average of these minimum distances across all matched sets.
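For readers less familiar with contingency table statistics, the scores named above can be sketched from object counts as follows. These are the standard definitions; the class name, function names, and example counts are hypothetical and are not drawn from the study's data.

```python
from dataclasses import dataclass


@dataclass
class ObjectCounts:
    hits: int          # matched forecast/observed object pairs
    misses: int        # observed objects with no matched forecast object
    false_alarms: int  # forecast objects with no matched observed object


def _ratio(numerator: float, denominator: float) -> float:
    # Return NaN rather than raising when a score is undefined.
    return numerator / denominator if denominator else float("nan")


def pod(c: ObjectCounts) -> float:
    return _ratio(c.hits, c.hits + c.misses)


def success_ratio(c: ObjectCounts) -> float:
    return _ratio(c.hits, c.hits + c.false_alarms)


def far(c: ObjectCounts) -> float:
    return 1.0 - success_ratio(c)


def bias(c: ObjectCounts) -> float:
    # Ratio of forecast object count to observed object count.
    return _ratio(c.hits + c.false_alarms, c.hits + c.misses)


def csi(c: ObjectCounts) -> float:
    return _ratio(c.hits, c.hits + c.misses + c.false_alarms)


# Hypothetical counts for illustration only.
counts = ObjectCounts(hits=60, misses=40, false_alarms=20)
scores = {
    "POD": pod(counts),
    "SR": success_ratio(counts),
    "FAR": far(counts),
    "bias": bias(counts),
    "CSI": csi(counts),
}
print(scores)
```

A bias above one indicates more forecast objects than observed objects (overprediction), and a bias below one indicates underprediction.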

The above statistics were calculated for composite reflectivity and 2–5-km UH objects against their corresponding MRMS objects for both WoFS and WoF-Hybrid. These scores were computed at 10-min intervals for every 0–6-h forecast for each day and then averaged at each lead time and aggregated over 31 cases, or over cases per severity category. For WoFS, the scores were calculated for each individual member first and then averaged over the 18 members for each 10-min interval, yielding what is referred to as the WoFS member average (WMA). For the purpose of verification, the scores from WoF-Hybrid can be compared with each individual member as well as WMA.
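The WMA described above is an average of per-member scores, not a score computed from pooled member counts. A minimal sketch of that averaging step follows; the function name and the example values are hypothetical, and the real system uses 18 members and 10-min lead-time bins.

```python
from statistics import fmean


def member_average(scores_by_member):
    """Average a score across members at each lead time.

    scores_by_member: one list per ensemble member, each holding one
    score per lead-time bin (all lists must have the same length).
    """
    n_times = len(scores_by_member[0])
    if any(len(member) != n_times for member in scores_by_member):
        raise ValueError("all members must cover the same lead times")
    return [fmean(member[t] for member in scores_by_member) for t in range(n_times)]


# Hypothetical CSI values: three members, two lead times.
member_csi = [
    [0.2, 0.4],
    [0.4, 0.6],
    [0.6, 0.8],
]
wma = member_average(member_csi)
print(wma)
```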

For both systems, 10-min MRMS composite reflectivity was used to verify model reflectivity. UH, which is defined as the integral of the vertical vorticity multiplied by the updraft velocity (Kain et al. 2008), is not an observable variable. Therefore, it was verified using a proxy: 30-min MRMS AWS swaths calculated every 10 min. In addition, the proxy data were interpolated to 1.5-km resolution for the purpose of the WoF-Hybrid evaluation. The object-based methodology measures the proximity of forecast closed-contour reflectivity and UH swaths of a certain strength to observed swaths (by proxy for UH) of the same or similar strength. If a forecast object is deemed “close” to an observed object (as defined below), the two are considered a “matched pair” or hit. Additionally, misses and false alarms are calculated, producing contingency table–based statistics.
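The verbal definition of UH above can be written compactly as a layer integral. In this sketch, w is the updraft velocity and ζ is the vertical vorticity; the 2–5-km bounds come from the UH product named in this section, and reading them as heights above ground level is an assumption based on common usage rather than something stated here.

```latex
\mathrm{UH} = \int_{z_0}^{z_1} w \, \zeta \, dz ,
\qquad z_0 = 2\ \mathrm{km}, \quad z_1 = 5\ \mathrm{km}
```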

For reflectivity, an observed object is defined as a closed-contour composite reflectivity field with values greater than or equal to 40 dBZ. To diminish the impacts of bias as much as possible, reflectivity objects for WoFS and WoF-Hybrid were bias corrected and defined using a value matching the percentile of 40 dBZ in the MRMS data used. This percentile, 99.30%, corresponds to 46.1 dBZ in WoFS and 47.1 dBZ in WoF-Hybrid. In addition to the requirement that observed objects exceed 40 dBZ, the maximum dBZ value within a closed contour must be greater than or equal to 45 dBZ (52.3 dBZ in WoFS and 53.3 dBZ in WoF-Hybrid), which is used primarily to filter out intense stratiform regions produced by mesoscale convective systems, which have been previously observed in WoFS forecasts (Skinner et al. 2018). In addition to the above filters, a clustering process is used to group objects in proximity into one object. This process occurs when the smallest distance between the boundaries of two objects from the same dataset at the same time step is less than 15 km.
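The percentile-matching step can be illustrated with a short sketch: find the percentile rank of the observed threshold in the observed sample, then read off the forecast value at the same rank. The function name and the sample values are hypothetical, and the study's own implementation may differ in how it handles ties and interpolation.

```python
from bisect import bisect_left


def matched_threshold(observed, forecast, obs_threshold):
    """Forecast value at the same percentile rank as obs_threshold in observed.

    Rank is the fraction of observed values strictly below the threshold.
    Integer arithmetic is used for the index to avoid floating-point
    rounding at exact percentile boundaries.
    """
    obs_sorted = sorted(observed)
    fcst_sorted = sorted(forecast)
    n_below = bisect_left(obs_sorted, obs_threshold)
    index = min(n_below * len(fcst_sorted) // len(obs_sorted), len(fcst_sorted) - 1)
    return fcst_sorted[index]


# Hypothetical reflectivity samples (dBZ); the forecast sample runs high.
obs_dbz = [10, 20, 30, 35, 40, 45, 50, 55, 60, 65]
fcst_dbz = [12, 22, 33, 38, 46, 50, 56, 60, 66, 70]
fcst_threshold = matched_threshold(obs_dbz, fcst_dbz, 40)
print(fcst_threshold)
```

Because the forecast distribution in this toy sample is shifted toward higher values, the matched forecast threshold is higher than the observed one, which mirrors the direction of the adjustment reported for both systems.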

UH and AWS objects are defined in a slightly different manner. While reflectivity objects are found at each individual time step, UH/AWS objects are identified in 30-min swaths to better detect the presence of a persistent mesocyclone. For this study, a closed-contour AWS 30-min swath object is defined as a contour with values greater than 0.004 s−1. This falls within the range of values used in similar research (Skinner et al. 2018; Chen et al. 2022; Miller et al. 2022). The percentile value of this threshold in AWS data corresponds to UH threshold values of 65 m2 s−2 for WoFS and 130 m2 s−2 for WoF-Hybrid. The higher UH threshold for WoF-Hybrid is due to better resolution of vertical vorticity and updraft speeds at 1.5-km grid spacing, which yields higher extrema for each. Objects are clustered similarly to reflectivity objects.

Once the final set of reflectivity and UH/AWS objects has been determined, a model object is considered a match to an observed object at the same time using a total interest score (Davis et al. 2009) identical to that of Guerra et al. (2022). The total interest score is calculated as the average of the minimum displacement ratio and centroid displacement ratio for each object pair using a maximum allowable displacement limit of 40 km. For example, an object pair with a centroid displacement > 40 km and a minimum displacement of 16 km would produce a total interest score of 0.2 and be considered a matched pair. Counts of matched object pairs, misses, and false alarms are sensitive to object matching criteria; however, relative differences when comparing forecast quality between two systems are less sensitive to the object matching algorithm (Skinner et al. 2018).

4. Results and discussion

a. Aggregated forecast performance for reflectivity using object-based verification

One general measure of performance in the object-based framework is whether the total number of forecast objects follows the observed object counts. To evaluate performance in early-hour and late-hour forecasts, the total object counts for both 1-h forecasts and 6-h forecasts aggregated over all 31 cases from WoFS, WoF-Hybrid, and MRMS are shown in Fig. 1.

Fig. 1.

Object count of composite reflectivity for (a),(b) forecasts valid at 1 hour and (c),(d) forecasts valid at 6 hours, binned by forecast initialization time over 31 cases in 2021. (a),(c) The total object count per event; (b),(d) the matched object count per event. Solid blue lines represent WoF-Hybrid, gray lines are for individual WoFS members, and solid black lines are for WMA. In (a) and (c), dashed blue and dashed black lines are for 1.5- and 3-km MRMS AWS objects, respectively.

Citation: Weather and Forecasting 40, 3; 10.1175/WAF-D-24-0143.1

To enable comparisons of WoFS and WoF-Hybrid at their native resolutions, MRMS object counts are included for MRMS data on both the 1.5- and 3-km WoFS grids. The interpolation to the different resolutions has a small effect.

In Fig. 1a, the overall trend of total objects per event is similar between the models and the MRMS data, with object counts low but increasing for the 1-h forecasts from earlier initializations, peaking near 2300–0000 UTC, and then decreasing through the rest of the period. While the general trend is similar, there is a high reflectivity object count bias present in both systems for forecasts from all initialization times, with WoF-Hybrid having a higher object count per event than a majority of WoFS members and WMA through the 2300 UTC initializations. Figure 1b depicts a similar matched object count per event for WoF-Hybrid and the WoFS members and WMA during the first hour of all forecasts from different initialization times, with more matched objects occurring later in the forecast period. Thus, both systems perform reasonably well in terms of predicting similar object counts for composite reflectivity.

Figures 1c and 1d depict the same information as Figs. 1a and 1b but for the 6-h forecast lead time. The 1-h forecasts cover times ranging from 1800 to 0400 UTC while the 6-h forecasts cover times ranging from 2300 to 0900 UTC. The MRMS observed objects are only about half as many in number as the 6-h forecast objects per event for 1700 UTC initializations and are close to the number of the forecast objects for 2200 UTC initializations and later. The number of matched objects from each system is similar over the forecasts initialized at all analysis times, with WoF-Hybrid having slightly more for 1700 UTC initializations over the majority of WoFS members and WMA (Fig. 1d).

Next, performance diagrams display SR, POD, bias, and CSI for 1- and 6-h lead times averaged over all 31 cases at each initialization time (Fig. 2). At the 1-h lead time, WoF-Hybrid POD is notably higher than WMA POD at 1700 and 1800 UTC but is similar to WMA POD for the rest of the period. At the 6-h lead time, however, WoF-Hybrid performs better at nearly all initialization times. For SR, there is a clear dependence of scores on the forecast initialization time for both systems. At early times (1700–2200 UTC), the SRs, or 1 − FAR, are quite similar between the two systems. Comparing Figs. 2a and 2b, forecasts at 1 h are more skillful than those at 6 h. For forecasts initialized from 2300 UTC and later analysis times, however, WoF-Hybrid has notably higher SRs than WMA, especially for the 6-h forecast (Fig. 2b). Given that many of the events sampled are in Great Plains and Midwest environments in which thunderstorms develop just after peak heating in otherwise capped air masses, this could imply that once storms have started, WoF-Hybrid better predicts small-scale intricacies of storm structure, leading to a more accurate propagation of convection indicated by a higher POD/CSI and lower FAR.

Fig. 2.

The performance diagram for (a) forecasts valid at 1 h and (b) forecasts valid at 6 h aggregated over 31 cases in 2021. Each number represents a forecast initialization time for WMA (solid black) or WoF-Hybrid (solid blue).


Examining object-based reflectivity metrics aggregated with respect to the forecast lead time offers a different perspective on the same data (Fig. 3). The error bars represent the 95% confidence intervals for WoF-Hybrid and WMA calculated by aggregating contingency table elements across all forecasts, as recommended by Hamill (1999), followed by a bootstrap resampling over 1000 samples. CSIs aggregated over all 31 cases are higher for WoFS during the first 200 min, are very similar among WoF-Hybrid, all WoFS members, and WMA between 200 and 270 min of lead time, and are slightly higher for WoF-Hybrid after 270 min (Fig. 3a). Between 30 and 240 min of lead time, PODs for all WoFS members and WMA are greater than those for WoF-Hybrid. For forecasts 0–20 min after initialization, however, PODs are higher for WoF-Hybrid in most cases, although FARs are also higher (Figs. 3b,c). This indicates that WoF-Hybrid produced more spurious cells than WoFS early in the forecast, although its high bias in cell production also tends to produce more hits. In general, CSIs for composite reflectivity are statistically different (95% confidence) between WoF-Hybrid and WMA near the end of 6-h forecasts, PODs are statistically different for most forecast times, and FARs are not statistically different.
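The confidence-interval procedure described above (aggregate contingency table elements, then bootstrap) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the resampling unit is assumed to be the individual forecast, the interval is a simple percentile interval, and the function names and counts are hypothetical.

```python
import random


def csi_from_counts(hits, misses, false_alarms):
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


def bootstrap_csi_interval(forecasts, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for CSI.

    forecasts: list of (hits, misses, false_alarms) tuples, one per forecast.
    Each replicate resamples forecasts with replacement, sums the counts,
    and computes CSI from the aggregated table. Assumes every replicate
    contains at least one object so that CSI is defined.
    """
    rng = random.Random(seed)
    n = len(forecasts)
    replicates = []
    for _ in range(n_boot):
        sample = [forecasts[rng.randrange(n)] for _ in range(n)]
        hits = sum(f[0] for f in sample)
        misses = sum(f[1] for f in sample)
        false_alarms = sum(f[2] for f in sample)
        replicates.append(csi_from_counts(hits, misses, false_alarms))
    replicates.sort()
    lower = replicates[int((alpha / 2) * n_boot)]
    upper = replicates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical per-forecast counts for illustration only.
example_forecasts = [(12, 6, 5), (8, 9, 7), (15, 4, 6), (10, 10, 3), (9, 5, 8), (14, 7, 4)]
point_estimate = csi_from_counts(
    sum(f[0] for f in example_forecasts),
    sum(f[1] for f in example_forecasts),
    sum(f[2] for f in example_forecasts),
)
ci_low, ci_high = bootstrap_csi_interval(example_forecasts)
print(point_estimate, ci_low, ci_high)
```

Summing counts before computing the score, rather than averaging per-forecast scores, keeps forecasts with few objects from dominating the aggregate.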

Fig. 3.

Statistical scores including (a),(d) CSI, (b),(e) POD, and (c),(f) FAR for composite reflectivity. (left) Aggregated scores over all 2021 cases binned by forecast lead time. (right) Aggregated scores over three different event severity categories. On the left, gray lines are for individual WoFS members, black lines are for WMA, and blue lines are for WoF-Hybrid. On the right, solid blue (black) lines are for high-end severity cases for WoF-Hybrid (WMA). Dashed blue (black) lines represent midseverity cases for WoF-Hybrid (WMA). Dotted blue (black) lines represent low severity cases for WoF-Hybrid (WMA). Each point represents a 10-min time interval. The error bars represent the 95% confidence intervals in each panel (note: on the left column, error bars are plotted for WMA and WoF-Hybrid; on the right column, error bars are only plotted for high-severity cases for clarity purpose).


Some differences emerge with respect to event severity (Figs. 3d–f). The high-end events have the highest CSIs and PODs as well as the lowest FARs at all forecast times for both systems. This finding should bolster forecaster confidence in using WoFS and WoF-Hybrid guidance during high-end events. It is also interesting to note that WoF-Hybrid high-end event forecasts are more skillful than WMA during most of the 0–6-h forecasts. For mid- and low-end events, particularly the latter, WoFS performs slightly better, especially during the first 270 min of lead time (Fig. 3d). For high-end events, only PODs are statistically different between the two systems for most of the forecast times, while CSIs are statistically different toward the end of the 6-h forecasts.

Figure 4 shows results for AAR and AMD for matched objects binned by lead time. Except for the first 30–40 min, WoF-Hybrid has an AAR closer to one than all individual WoFS members and WMA for the entirety of the 6-h forecast period averaged over all 31 cases. The WoF-Hybrid AAR ranges from about 1.0 to just over 1.2 after the first hour of lead time, while the AARs of the WoFS members range between 1.15 and 1.7. The AMD for WoF-Hybrid is lower than that of the majority of WoFS members and WMA by as much as 1 km for all lead times after 10 min. This result shows that WoF-Hybrid predicts storm location and size better than most of the WoFS members throughout the majority of the 2021 cases.
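As a rough illustration of how size and displacement scores of this kind can be computed for matched object pairs, the sketch below averages the forecast-to-observed area ratio and a displacement distance over all pairs. The formal definitions of AAR and AMD are given earlier in the paper and are not repeated in this section, so the ratio direction (forecast over observed) and the use of centroid distance as the displacement measure are assumptions made here, as are the dictionary keys and function name.

```python
import math

def aar_amd(pairs):
    """Mean area ratio and mean displacement over matched object pairs.

    pairs: list of dicts with hypothetical keys
      "fcst_area", "obs_area": object areas in km^2
      "fcst_xy", "obs_xy":     object centroids (x, y) in km
    Assumptions: ratio is forecast/observed; displacement is the
    centroid-to-centroid distance (a stand-in for the paper's measure).
    """
    ratios = [p["fcst_area"] / p["obs_area"] for p in pairs]
    dists = [math.dist(p["fcst_xy"], p["obs_xy"]) for p in pairs]
    return sum(ratios) / len(ratios), sum(dists) / len(dists)

# Illustrative values only, not taken from the study.
example_pairs = [
    {"fcst_area": 120.0, "obs_area": 100.0, "fcst_xy": (0.0, 0.0), "obs_xy": (3.0, 4.0)},
    {"fcst_area": 90.0, "obs_area": 100.0, "fcst_xy": (1.0, 1.0), "obs_xy": (1.0, 2.0)},
]
aar, amd = aar_amd(example_pairs)
```

Under these assumptions, a ratio above one means that forecast objects are on average larger than their observed matches, which is how the values of 1.15–1.7 for the WoFS members would be read.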

Fig. 4.

(a) AAR and (b) AMD for composite reflectivity binned by forecast lead time for all 2021 cases. Blue lines represent WoF-Hybrid, gray lines represent individual WoFS members, and black lines represent WMA. Each point represents a 10-min time interval.

b. Aggregated UH/AWS forecast performance using object-based verification

Total UH objects and matched objects are also examined for the first 90 min (which is 1 h worth of 30-min UH swaths, since there are no UH swath objects before 30 min) and for the 6-h forecasts from each initialization time. WoF-Hybrid, the WoFS members, and WMA all have a similar number of UH objects throughout, with relatively high biases for forecasts initialized at early hours and low biases for forecasts initialized at late hours (not shown).

Comparing the performance diagrams for the UH forecasts at 90 min (Fig. 5a) and 6 h (Fig. 5b) shows that statistical scores are notably higher for the former, with PODs ranging between 0.25 and 0.55 except for the forecasts from the first two (1700 and 1800 UTC) and final (0300 UTC) initialization times. SRs are in a similar range at the 90-min forecast and are slightly higher for WoF-Hybrid than for WMA for most initialization times. For the 6-h forecasts, the SRs are similar for the 1700 UTC initialization but are higher for WoF-Hybrid than for WMA for forecasts initialized hourly between 2000 and 0300 UTC (Fig. 5b). Overall, CSIs are similar, with larger scores for WoF-Hybrid than for WMA, a difference mainly attributable to the difference in FAR between the systems.
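The quantities on a performance diagram are linked by standard identities: SR = 1 − FAR, CSI = 1/(1/SR + 1/POD − 1), and frequency bias = POD/SR. The short sketch below places one forecast on the diagram from its contingency-table counts; the function name and the counts are illustrative and are not taken from the study.

```python
def performance_point(hits, misses, false_alarms):
    """Return the performance-diagram coordinates for one set of counts."""
    pod = hits / (hits + misses)                # y axis
    sr = hits / (hits + false_alarms)           # x axis, equal to 1 - FAR
    csi = 1.0 / (1.0 / sr + 1.0 / pod - 1.0)    # curved CSI contours
    bias = pod / sr                             # straight lines through the origin
    return {"POD": pod, "SR": sr, "CSI": csi, "bias": bias}

# Illustrative counts only.
point = performance_point(hits=40, misses=60, false_alarms=40)
```

Because CSI depends on both axes, a system with a similar POD but a higher SR (lower FAR) sits to the right on the diagram and therefore on a higher CSI contour, which matches the attribution of the CSI difference to FAR above.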

Fig. 5.

As in Fig. 2, but for UH, and in (a), forecasts valid at 90 min rather than 60 min.

Like reflectivity, the UH/AWS results averaged over all 31 cases can be viewed with respect to lead time, as illustrated in Fig. 6. At all forecast lead times, WoF-Hybrid yields higher CSIs than almost all individual WoFS members as well as the WMA, though the differences do not exceed the 95% confidence interval (Fig. 6a). During the first 2.5 h of lead time, WoF-Hybrid PODs are most often within the range of the WoFS member spread. After the 2.5-h mark, however, WoF-Hybrid PODs are above those of the majority of members and WMA, exceeding WMA by as much as 0.09 (Fig. 6b). WoF-Hybrid also has a lower FAR than WMA at all forecast lead times, by as much as 0.08 (Fig. 6c).

Fig. 6.

As in Fig. 3, but for UH.

The analysis of UH/AWS contingency table-based statistics is refined by partitioning cases by event severity (Figs. 6d–f). Low-end events are omitted because of their very small UH values. Again, the high-end events have the highest CSIs and PODs as well as the lowest FARs at all forecast times for both systems. WoF-Hybrid CSIs for high-end events are consistently above those of WMA, by over 0.05 at some lead times. For the midend events, however, WMA CSIs are better for most of the first 3 h of forecast valid times, and WoF-Hybrid CSIs are better for the remaining 3 h (Fig. 6d), though the differences are not significant at the 95% confidence level. Performance in terms of POD is similar to that for CSI (Fig. 6e). For both systems, the FAR is very high, ranging between 0.56 and 0.95, and increases with lead time (Fig. 6f). High-end events have the lowest FARs at all lead times, with WoF-Hybrid for the high-end events yielding a smaller FAR than the majority of individual WoFS members and WMA. In general, these statistics indicate that high-end events are more skillfully predicted by WoFS and WoF-Hybrid, most likely because these events feature stronger, well-organized storms in environments very supportive of robust rotation.

The AAR and AMD for UH objects are depicted in Fig. 7. Between hour one and hour two, the AAR is between 0.95 and 1.25 for WoF-Hybrid, while it hovers around 1.5 for most of the individual WoFS members and WMA. After the second forecast hour, the AAR of WoF-Hybrid is within the WoFS member spread, and AARs for both systems range from about 1.0 to 3.0. The AMD for matched objects from WoF-Hybrid, individual WoFS members, and WMA generally increases with increasing lead time, starting between 1 and 3 km and ending near 6.5 km. The AMD for both systems is similar throughout, with WoF-Hybrid at most times yielding AMD values near those that only the several best WoFS members achieve (Fig. 7b).

Fig. 7.

As in Fig. 4, but for UH.

Overall, WoF-Hybrid has better scores in CSI and POD at initialization times generally from 1700 to 0100 UTC. For many severe weather events which follow the typical diurnal life cycle, this represents better performance for WoF-Hybrid during the early to midhours of an active event; WoFS catches up in performance by mid- to late evening initialization times (Fig. 5). When binning by the forecast lead time, WoF-Hybrid produces higher CSIs and PODs and lower FARs in high-end events than WMA for UH forecasts valid at 6 h (Fig. 6). This suggests WoF-Hybrid is more accurately predicting convective cells’ rotations throughout the 6-h period.

c. Case studies

Two severe weather outbreaks of interest occurred on 26 May and 10 December 2021. These events had the greatest numbers of NWS tornado and hail reports among the cases sampled here. The outbreak on 26 May 2021 resulted in a total of 200 severe weather reports, including 31 tornadoes and 115 hail events, while 10 December 2021 featured the Quad-State Supercell (QSS) (NWS 2024) and had 264 total reports, with 85 tornadoes and 163 damaging wind reports (see Table 1, Fig. 8). These cases illustrate several findings from the broader 31-case dataset.

Fig. 8.

The analysis and forecast domains and locations of the radar sites for (left) 26 May 2021 and (right) 10 Dec 2021. The red triangles, green squares, and blue circles indicate observed tornadoes, hail, and wind events from NWS storm reports, respectively. The pink box in each domain represents the area that experienced the greatest severe weather impact, displayed in Figs. 10 and 12, respectively. The green star sign inside the box in the right panel is the location of Mayfield, Kentucky.

1) The 26 May severe weather event in the central plains

The 26 May event was primarily characterized by a warm front and an approaching upper-level trough. A storm that initiated early at 1500 UTC strengthened into a long-track supercell that tracked directly along the warm front, producing several tornadoes and large hail over central Kansas through the afternoon. Other supercells initiated in the afternoon in southwest Nebraska and produced a few tornadoes, one of them producing damage rated EF2. Convection later grew upscale into multiple MCSs tracking eastward through Nebraska and northern Kansas.

In this event, WoF-Hybrid displayed a greater overprediction bias than WoFS, which is shown in 20-min forecasts initialized at 2200 UTC (Fig. 9). In western Kansas at that time, there was one small, weak convective cell (Fig. 9a). The WoF-Hybrid 20-min forecast (Fig. 9b) contained a large convective object having a large region over 50 dBZ. WoFS members 4 (Fig. 9c) and 12 (Fig. 9d) contained only two or three small cells with maximum composite reflectivity above 50 dBZ. Similar behavior existed for other WoFS members (not shown). WoF-Hybrid also produced far more small convective cells in Nebraska and South Dakota. While this is only one visualization of the difference between the two systems, WoF-Hybrid exhibited a greater number of spurious, high-intensity reflectivity cores, which is consistent with higher FAR scores during the first 30 min of the forecast period.

Fig. 9.

Composite reflectivity from (a) MRMS and 20-min forecasts from (b) WoF-Hybrid, (c) WoFS member 4, and (d) WoFS member 12.

In the following, we focus on the area and time period that produced the most high-impact hail and tornado damage in this case. Starting at roughly 2230 UTC, two supercells matured and produced multiple tornado and hail reports in central Kansas and near the Kansas/Nebraska border, including an EF2 tornado near Max, Nebraska. Both forecast systems handled the central Kansas supercell fairly well over the course of forecasts initialized from six different analysis times, though location errors are more noticeable for the first two forecasts (lower-right storm in panels in Fig. 10). For the forecast initialized at 1700 UTC, WoF-Hybrid produced intense convection with a slight northwestward bias, while WoFS Best Member had a larger northward bias. For forecasts initialized at 1800 UTC, both systems placed convection in the correct area. With each successive forecast, both systems produced a supercell that was closer in object size and placement to the observed reflectivity object. For the forecast initialized at 2100 UTC, WoF-Hybrid produced multiple reflectivity objects that matched the observed objects (90-min lead time), while WoFS Best Member not only had a forecast object collocated with the observed object but also produced an additional spurious storm farther east. All WoFS members had convection present in this area, but most of them showed an eastward bias, although five members produced accurately placed objects (not shown). It is important to note, however, that all WoFS members and WoF-Hybrid produced object matches for this supercell. The steady improvement in forecast quality with additional data assimilation following convection initiation is consistent with the findings of Guerra et al. (2022).

Fig. 10.

A spatial representation of composite reflectivity near the Kansas/Nebraska border at 2230 UTC 26 May 2021, for six different forecasts initialized at (a) 1700, (b) 1800, (c) 1900, (d) 2000, (e) 2100, and (f) 2200 UTC. The domain represented on this plot is zoomed in to focus on the most impactful convection (see Fig. 8, left). Hatched contours are MRMS data above 40 dBZ from 2230 UTC. Red shaded areas are the reflectivity above 46.1 dBZ (WoFS object threshold) of WoFS Best Member, which is defined by the member with the highest averaged CSI values for the parameter of interest during 0–6-h forecasts among the 18 members. Blue shaded areas are the reflectivity contours above 47.1 dBZ (WoF-Hybrid object threshold) for WoF-Hybrid. CSIs over the entire domain from the initialization time until 2230 UTC of WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner.

Analyzing the cluster of supercells near the Kansas/Nebraska border (upper-left corner of panels in Fig. 10), it becomes apparent that for the first five forecasts from different initializations, WoFS Best Member did not produce large enough reflectivity objects at 2230 UTC. WoF-Hybrid, on the other hand, produced small cells near the observed contour in at least four forecasts, but nothing of similar size to the observed object. In the forecast initialized at 2100 UTC, WoF-Hybrid placed a strong reflectivity object at the far northern edge of the observed object (Fig. 10e). This is a successful forecast for WoF-Hybrid, as there was only weak convection under 40 dBZ in the MRMS data, but the cell quickly grew larger within the next half hour. WoF-Hybrid accurately initialized this small area of weak convection and predicted its development. While WoFS Best Member did not predict the presence of this cell whatsoever in forecasts initialized up through 2100 UTC, it is important to note that six of the 18 WoFS members did have a supercell present in the forecasts initialized at 2100 UTC. In the half-hour forecast initialized at 2200 UTC, both WoFS Best Member and WoF-Hybrid had reflectivity contours present for the supercell at the Kansas/Nebraska border (Fig. 10f). In the previous forecasts initialized at 2000 and 2100 UTC, the supercell was beginning to split in WoF-Hybrid, a split that occurred in the observed data just after 2230 UTC (Fig. 10f). The majority of WoFS members, in contrast, predicted a nonsplitting supercell that propagated eastward (not shown).

Analyzing 1-h swaths of UH from 2300 to 0000 UTC for forecasts from six different initialization times on 26–27 May 2021 provides further insights into mesocyclone development (Fig. 11). Forecasts initialized from the first two analyses for both systems yielded very low CSI scores, all under 0.2. By 2100 UTC, WoF-Hybrid and WoFS Best Member each earned a CSI near 0.3. Finally, at 2300 UTC, the WoF-Hybrid CSI was over 0.1 higher than that of WoFS Best Member. There were many false alarms, with many strong UH objects being predicted in western Kansas, northeastern Colorado, and the Nebraska Panhandle by both systems.

Fig. 11.

A spatial representation of UH from 2300 to 0000 UTC 26–27 May over the WoFS domain of six different forecasts initialized at (a) 1800, (b) 1900, (c) 2000, (d) 2100, (e) 2200, and (f) 2300 UTC. Light red shaded areas are the UH swaths above 65 m2 s−2 from WoFS Best Member. Blue contours are the UH swaths above 130 m2 s−2 for WoF-Hybrid. The red triangles, green squares, and blue circles are tornado, hail, and wind reports from 2300 to 0000 UTC, respectively. CSI scores for 30-min UH swaths over the entire domain for WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner.

Examining the forecast from 1800 UTC more closely, there is an observed supercell over central Kansas that produced many hail and tornado reports over the hour depicted. WoFS Best Member and WoF-Hybrid UH forecasts exhibited great skill in the propagation of the supercell over the course of the hour, with both systems producing UH swaths over almost the entire area of storm reports associated with this object (Fig. 11a). By the 2100 UTC run, both WoFS members and WoF-Hybrid produced UH swaths near the storm reports displayed and did so throughout the remaining two forecasts initialized at 2200 and 2300 UTC, respectively (Figs. 11e,f). This convective object had been forecast for many hours prior to the forecasts initialized from the final three analyses.

While WoFS and WoF-Hybrid produced accurate forecasts of the supercell in central Kansas, the UH forecasts of the tornado and hail reports on the Kansas/Nebraska border tell a different story. None of the UH forecasts depicted swaths at the border in the forecasts initialized at 1800 and 1900 UTC (Figs. 11a,b). All three forecasts for WoFS Best Member and WoF-Hybrid initialized at 2100 UTC depicted strong UH swaths to the north and west of the storm reports (Fig. 11d). The forecasts for WoFS were slightly displaced to the northwest because these forecasts did not produce a supercell split, which occurred around 2230 UTC in the observations. By 2300 UTC, all systems accurately predicted the motion of this storm, correctly creating UH swaths where severe weather was reported (Fig. 11f). Overall, while both systems produced excellent forecasts of the supercell in central Kansas, there were discrepancies in the UH forecasts in relation to the tornado and hail reports on the Kansas/Nebraska border, with WoFS members slightly outperforming WoF-Hybrid by detecting tornado clusters somewhat earlier.

2) The 10 December tornado outbreak–QSS

In the case of the severe weather outbreak on 10 December, the atmospheric conditions were conducive to significant severe weather events, with a strong thermodynamic environment [7°–8°C km−1 midlevel lapse rates and 1000–2000 J kg−1 of mixed-layer convective available potential energy (MLCAPE)] and favorable vertical wind shear profiles [60–80 kt (1 kt ≈ 0.51 m s−1) for the layer between 0 and 6 km]. The SPC had issued a moderate risk area with tornado probabilities up to 15% within 40 km of a point to either side of a line from central Arkansas to southwest Indiana. Hatching of the tornado risk area indicated the potential for significant tornadoes. Convection began to initiate in the warm sector over central Arkansas and southwestern Missouri by 2100 UTC, eventually leading to several long-track supercells and quasi-linear convective systems producing numerous tornadoes, large hail, and damaging winds across several states. The most destructive of these storms was the QSS, which produced an EF4 tornado that was on the ground for about 4 h and led to 60 deaths and over 600 injuries.

Manual inspection of simulated reflectivity and UH swaths quickly reveals stark performance differences between the two systems in this case. WoF-Hybrid consistently depicted a long-lived supercell resembling the QSS over multiple consecutive forecasts initialized beginning at 2300 UTC. These runs represent lead times of up to 4–5 h before the tornado struck Mayfield, Kentucky. In contrast, WoFS members were unable to sustain the storm for more than a few tens of minutes before dissipating, with low probabilities for reflectivity and UH exceedance.

Reflectivity forecasts are compared to the location of the QSS at 0330 UTC, when the associated long-track tornado produced the most structural damage and fatalities, directly striking downtown Mayfield, Kentucky. The storm position can be seen in each panel of Fig. 12 in the MRMS data. In the 2200 UTC runs, neither system predicted reflectivity objects near the QSS at 0330 UTC (Fig. 12a). By the 2300 UTC run, WoF-Hybrid predicted the reflectivity object near Mayfield, Kentucky (Fig. 12b). The 0000 and 0100 UTC runs of WoF-Hybrid produced a reflectivity object in the correct location, while WoFS Best Member still did not sustain this convection much beyond forecast initialization (Figs. 12c,d). By 0200 UTC, WoFS Best Member and WoF-Hybrid had objects overlapping with the observed QSS (Fig. 12e). By this point, almost all the WoFS members also placed convection of various intensities at the same location. Finally, at the 0300 UTC run, both systems produced a similarly sized supercell at the correct location (Fig. 12f).

Fig. 12.

(a)–(f) A spatial representation of composite reflectivity in the mid-Mississippi Valley at 0330 UTC 11 Dec for six different forecasts starting with 2200 UTC 10 Dec 2021 in (a) and ending with 0300 UTC 11 Dec 2021 in (f). The domain represented on this plot is zoomed in to focus on the most impactful convection (see Fig. 8, right). Hatched contours are 40-dBZ composite reflectivity objects from MRMS data at 0330 UTC. Red contours are bias-adjusted 40-dBZ composite reflectivity objects from WoFS Best Member. Blue contours are the bias-adjusted 40-dBZ composite reflectivity objects from WoF-Hybrid. CSIs over the entire domain from the initialization time until 0330 UTC for WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner. Mayfield, Kentucky, is located at 36.74°N, 88.64°W (red star).

The analysis of UH swaths during the peak event severity (from 0200 to 0400 UTC) on 10–11 December provides further insights into how well WoF-Hybrid and WoFS forecast mesocyclone development (Fig. 13). For the forecasts initialized at 2300 UTC, WoF-Hybrid already predicted the long-lived supercell UH track. By the 0000 UTC run, WoF-Hybrid produced two UH tracks, including the QSS-related UH track, which matched the storm reports well. In the 0100 UTC forecasts, WoF-Hybrid showed a swath collocated with all storm reports associated with the QSS, while WoFS Best Member showed a similar swath with a northeastward bias (Fig. 13c). By the 0200 UTC run, both WoF-Hybrid and WoFS Best Member showed comparable depictions of the QSS (Fig. 13d). Overall, WoF-Hybrid depicted the QSS more accurately than WoFS Best Member in the 0–6-h forecasts from three of the four initialization times between 2300 and 0200 UTC. WoF-Hybrid exhibited fewer false alarm UH swaths and provided a more accurate representation of the propagation and location of the QSS several hours before WoFS in this case.

Fig. 13.

A spatial representation of UH from 0200 to 0400 UTC 11 Dec 2021, over the WoFS domain for forecasts initialized at (a) 2300, (b) 0000, (c) 0100, and (d) 0200 UTC. Light red shaded areas are the UH swaths above 65 m2 s−2 of WoFS Best Member. Blue contours are UH swaths above 130 m2 s−2 for WoF-Hybrid. The red triangles, green squares, and blue circles are tornado, hail, and wind reports from 0200 to 0400 UTC 11 Dec 2021, respectively. CSIs of 30-min UH swaths over the entire domain for WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner. Mayfield, Kentucky, is located at 36.74°N, 88.64°W (red star).

The main takeaway for this case was the difference in lead time. WoF-Hybrid sustained convection as early as the 2300 UTC run, providing 4.5 h of lead time prior to the storm reaching Mayfield, Kentucky, and accurately predicted the track of the QSS. The majority of WoFS members did not place any convection near Mayfield, Kentucky, until the 0200 UTC run, providing only 1.5 h of lead time for Mayfield and missing the earlier lifetime of this long-lived tornado. One benefit of producing forecasts every hour is the ability to look back at previous runs to determine whether reflectivity objects are being consistently forecast in the same place at the same valid time. WoF-Hybrid consistently produced strong convection near Mayfield, Kentucky, in five consecutive hourly forecasts, representing lead times of 0.5–4.5 h. If available in operations, this could have helped increase forecaster confidence in strong convection occurring in this area around the predicted time. Given the ongoing reports of tornadoes with this storm and the long, continuous UH swaths in the WoF-Hybrid output, forecasters would have had numerical guidance to support the notion of a long-track tornadic storm that would pass near Mayfield and many of the other areas where damage occurred. Further research is needed to understand how differences in model resolution and data assimilation techniques between WoF-Hybrid and WoFS impact their respective forecast outputs, particularly in the context of supercell size and environmental parameters.
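The lead times quoted above follow directly from the hourly initialization times and the 0330 UTC valid time used for Mayfield in Fig. 12. A minimal sketch of that arithmetic is given below; the variable and function names are illustrative.

```python
from datetime import datetime, timedelta

# Valid time used for Mayfield, Kentucky: 0330 UTC 11 Dec 2021.
VALID = datetime(2021, 12, 11, 3, 30)

def lead_hours(init, valid=VALID):
    """Lead time in hours between a forecast initialization and the valid time."""
    return (valid - init).total_seconds() / 3600.0

# Hourly runs from 2300 UTC 10 Dec through 0300 UTC 11 Dec.
inits = [datetime(2021, 12, 10, 23, 0) + timedelta(hours=k) for k in range(5)]
leads = [lead_hours(t) for t in inits]
```

The five runs give lead times of 4.5, 3.5, 2.5, 1.5, and 0.5 h, so the 2300 UTC run corresponds to the 4.5 h quoted for WoF-Hybrid and the 0200 UTC run to the 1.5 h quoted for the majority of WoFS members.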

5. Summary and conclusions

The comparison between WoFS and WoF-Hybrid, two rapidly updating, high-resolution data assimilation and forecast systems, provides valuable insights into their strengths and weaknesses in forecasting severe weather hazards. Both systems were evaluated using object-based statistical methods against observations, focusing on their performance during high-impact severe weather events in 2021.

Overall, both WoFS and WoF-Hybrid demonstrated reasonably good performance in predicting severe weather hazards, with similar object counts for composite reflectivity and UH, roughly matching those derived from NSSL MRMS products. CSI scores indicated similar performance between the two systems, particularly during high-impact severe weather events. However, there were differences in the details of system performance. Binning the forecast results by initialization time revealed that both systems generally produced better forecasts at later starting times, very likely owing to assimilation of radar and satellite observations of storms once they have initiated. This is consistent with previous findings (Guerra et al. 2022). WoF-Hybrid exhibited slightly better performance than WoFS in forecasts of reflectivity and UH, particularly for early initialization times. One reason for this difference may be attributed to the time required for WoFS to spin up storms compared to the variational-based WoF-Hybrid.

Examining object-based reflectivity and UH binned by lead time, WoF-Hybrid performed better at almost all lead times for UH forecasts. Conversely, WoFS reflectivity forecasts showed slightly better performance in the first 4.5 h. Partitioning events by severity indicated that WoF-Hybrid yielded higher CSIs/PODs and lower FARs for high-end severe weather events compared to WoFS. However, WoFS performed better for midrange and low-end events, especially for the first few hours of forecasts. In terms of spatial representations, WoF-Hybrid generally produced more accurate depictions of thunderstorms than WoFS, with closer matches to observed objects in terms of size and proximity. This may be attributed to the higher-resolution radar data and model grids used in WoF-Hybrid, allowing for better resolution of storm structure and propagation, particularly for supercells.

WoFS and WoF-Hybrid performance was similar in a 26 May 2021 case study; WoF-Hybrid produced more accurate depictions at longer lead times than WoFS Best Member for the QSS in a 10 December 2021 case study. One of the most successful aspects of WoF-Hybrid is that the most impactful reflectivity objects from 26 May and 10 December were predicted 1–4 h earlier than the most skillful WoFS member. On 10 December, WoF-Hybrid produced fewer false alarms in UH during the most impactful event times compared to the most skillful WoFS member.

In conclusion, this research demonstrated that both WoFS and WoF-Hybrid perform similarly well in predicting storm location and UH swaths during high-impact severe weather events. WoF-Hybrid provides a slightly more accurate depiction of thunderstorms than WoFS in terms of statistical scores over 31 cases as well as in two individual case studies. This is no surprise, as the two systems share similar modeling components (e.g., WRF core), but WoF-Hybrid is run at 1.5-km grid spacing while WoFS is at 3.0 km.

Ensemble probabilistic guidance provided by WoFS is very important for NWS forecasters, especially for extreme hazardous weather events because it provides users valuable quantification of forecast uncertainty. On the other hand, WoF-Hybrid is very efficient and can run with a higher resolution than WoFS to resolve more detailed convective-scale features.

WoF-Hybrid yields an AAR closer to one at all initialization times for reflectivity forecasts and after the 2100 UTC initialization for UH forecasts. This implies that WoF-Hybrid matched objects are closer in size to the observed objects than those of WoFS. This is a notable strength of WoF-Hybrid, indicating more accurate location forecasts for higher-resolution runs. Both systems also have their shortcomings. The EnKF-based WoFS, while capable of providing valuable probabilistic information, has the drawback of requiring time to fully spin up storms, as discussed by Guerra et al. (2022). On the other hand, the WoF-Hybrid’s single higher-resolution analysis and forecast provide detailed convection-resolving features, such as internal thunderstorm structures, without requiring spinup time. The higher-resolution forecasts must be initialized directly from higher-resolution analyses, as downscaled lower-resolution analyses typically do not lead to improved short-term severe weather forecasts using WoFS (Miller et al. 2022). WoF-Hybrid utilizes a hybrid ensemble-variational data assimilation scheme, offering the flexibility to integrate flow-dependent information from WoFS into the analysis system and to assimilate higher-density radar data than WoFS without significantly increasing computational costs. However, WoF-Hybrid cannot provide probabilistic information. Forecasters using both WoFS and WoF-Hybrid would have probabilistic information on a 3-km grid from WoFS and deterministic information on a 1.5-km grid from WoF-Hybrid to examine storm structure details. Also, motivated by a desire to feed the strengths of WoF-Hybrid into WoFS, two-way coupling was recently developed and tested with a few real data cases (Gao et al. 2023). The initial finding was that this approach has the potential to improve WoFS forecasts of storm location and intensity.
Operationally, more skillful spatial and temporal representations of storms are very important for increasing severe thunderstorm warning lead times and potentially saving lives. Research and verification with additional cases for coupling of WoFS and WoF-Hybrid are needed to fully assess this approach and its use to improve severe weather warnings and forecasts.

NSSL is working with the National Centers for Environmental Prediction (NCEP) to transition WoFS into operations to maximize its usefulness for both short-term warnings and short-term decision support services. Early in the lifetime of WoFS experimental runs, the research group established a research-to-operations-to-research loop in which NWS forecaster feedback and product suggestions are received both in real time and during after-action reviews. WoFS has been an integral part of NOAA’s Hazardous Weather Testbed Spring Forecasting Experiment since 2017, and several WoFS experiments have also been performed in the testbed to study operational workflows and the efficient communication of WoFS information through NWS products and services. All of these efforts are helping researchers describe the strengths and weaknesses of WoFS and operational best practices for applying WoFS output along the watch-to-warning timeline and are perhaps best summarized by Wilson et al. (2024). Formal training for forecasters is planned as part of a research-to-operations transition plan for WoFS. WoF-Hybrid has not been formally tested since 2022. However, research is continuing, with the aim of reducing spurious cells and further improving the accuracy of WoF-Hybrid forecasts. Ultimately, we envision that WoF-Hybrid can be used to provide complementary information to augment WoFS capabilities.

Acknowledgments.

Funding was provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA-University of Oklahoma Cooperative Agreement NA11OAR4320072, U.S. Department of Commerce. This work was further supported by the NOAA Warn-on-Forecast project. The second author’s research was also partially supported by NSF Grant AGS-2136161. The simulations were conducted on NSSL’s local HPC resources (Buxton and Odin). We would also like to thank the University of Oklahoma, School of Meteorology, for supporting this work.

Data availability statement.

For this work, the Multi-Radar Multi-Sensor (MRMS) and the Stage IV rainfall products and the aggregate forecast statistics for composite reflectivity and APCP are accessible online (https://doi.org/10.5281/zenodo.4495919). The community version 1.3 of the GSI-EnKF data assimilation software can be downloaded at https://dtcenter.org/community-code/gridpoint-statistical-interpolation-gsi/download. The WRF source code, version 3.9, is publicly available at NCAR/UCAR (https://github.com/wrf-model/WRF). The WSR-88D Level-II data (reflectivity factor and radial velocity) used in this research can be accessed at http://www.ncdc.noaa.gov/ by filling in the radar site location and date. The WoFS forecast data used in this study are not currently available in a publicly accessible repository. However, the data and code used to generate the results herein are available from the authors upon request.

REFERENCES

  • Chen, L., C. Liu, Y. Jung, P. Skinner, M. Xue, and R. Kong, 2022: Object-based verification of GSI EnKF and hybrid En3DVar radar data assimilation and convection-allowing forecasts within a warn-on-forecast framework. Wea. Forecasting, 37, 639–658, https://doi.org/10.1175/WAF-D-20-0180.1.

  • Clark, A. J., and Coauthors, 2021a: A real-time, virtual spring forecasting experiment to advance severe weather prediction. Bull. Amer. Meteor. Soc., 102, E814–E816, https://doi.org/10.1175/BAMS-D-20-0268.1.

  • Clark, A. J., and Coauthors, 2021b: Spring forecasting experiment 2021 – Preliminary findings and results. NOAA Hazardous Weather Testbed, accessed 1 October 2022, https://hwt.nssl.noaa.gov/sfe/2021/docs/HWT_SFE_2021_Prelim_Findings_FINAL.pdf.

  • Davis, C. A., B. G. Brown, R. Bullock, and J. Halley-Gotway, 2009: The Method For Object-Based Diagnostic Evaluation (MODE) applied to numerical forecasts from the 2005 NSSL/SPC spring program. Wea. Forecasting, 24, 1252–1267, https://doi.org/10.1175/2009WAF2222241.1.

  • Dowell, D. C., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part I: Motivation and system description. Wea. Forecasting, 37, 1371–1395, https://doi.org/10.1175/WAF-D-21-0151.1.

  • Gao, J., and D. J. Stensrud, 2014: Some observing system simulation experiments with a hybrid 3DEnVAR system for storm-scale radar data assimilation. Mon. Wea. Rev., 142, 3326–3346, https://doi.org/10.1175/MWR-D-14-00025.1.

  • Gao, J., and Coauthors, 2013: A real-time weather-adaptive 3DVAR analysis system for severe weather detections and warnings. Wea. Forecasting, 28, 727–745, https://doi.org/10.1175/WAF-D-12-00093.1.

  • Gao, J., Y. Wang, D. M. Wheatley, K. H. Knopfmeier, T. A. Jones, and G. Creager, 2017: Test of a weather-adaptive hybrid 3DEnVAR and WRF-DART analysis and forecast system during the HWT spring experiments in 2017. 38th Conf. on Radar Meteorology, St. Gallen, Switzerland, Amer. Meteor. Soc., 19B.1, https://ams.confex.com/ams/38RADAR/meetingapp.cgi/Paper/321145.

  • Gao, J., and Coauthors, 2021: Testing of the Warn-on-Forecast (WoF) hybrid data assimilation and forecasting system during the HWT spring experiment in 2020. 11th Conf. on Transition of Research to Operations, Online, Amer. Meteor. Soc., 10.2, https://ams.confex.com/ams/101ANNUAL/meetingapp.cgi/Paper/382602.

  • Gao, J., and Coauthors, 2022: Testing of the warn-on-forecast hybrid data assimilation and forecasting system at 1.5-km resolution during the HWT spring forecasting experiment in 2021. 12th Conf. on Transition of Research to Operations, Online, Amer. Meteor. Soc., 11B.6, https://ams.confex.com/ams/102ANNUAL/meetingapp.cgi/Paper/395399.

  • Gao, J., and Coauthors, 2023: Improving the Warn-on-Forecast System using a hybrid gain data assimilation method: A case study of the 10 December 2021 tornado outbreak. 13th Conf. on Transition of Research to Operations, Denver, CO, Amer. Meteor. Soc., 11B.3, https://ams.confex.com/ams/103ANNUAL/meetingapp.cgi/Session/63173.

  • Guerra, J. E., P. S. Skinner, A. Clark, M. Flora, B. Matilla, K. Knopfmeier, and A. E. Reinhart, 2022: Quantification of NSSL Warn-on-Forecast System accuracy by storm age using object-based verification. Wea. Forecasting, 37, 1973–1983, https://doi.org/10.1175/WAF-D-22-0043.1.

  • Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155–167, https://doi.org/10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2.

  • Heinselman, P. L., and Coauthors, 2024: Warn-on-Forecast System: From vision to reality. Wea. Forecasting, 39, 75–95, https://doi.org/10.1175/WAF-D-23-0147.1.

  • Hu, J., and Coauthors, 2021: Evaluation of an experimental Warn-on-Forecast 3DVAR analysis and forecast system on quasi-real-time short-term forecasts of high-impact weather events. Quart. J. Roy. Meteor. Soc., 147, 4063–4082, https://doi.org/10.1002/qj.4168.

  • James, E. P., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part II: Forecast performance. Wea. Forecasting, 37, 1397–1417, https://doi.org/10.1175/WAF-D-21-0130.1.

  • Jones, T. A., X. Wang, P. Skinner, A. Johnson, and Y. Wang, 2018: Assimilation of GOES-13 imager clear-sky water vapor (6.5 μm) radiances into a warn-on-forecast system. Mon. Wea. Rev., 146, 1077–1107, https://doi.org/10.1175/MWR-D-17-0280.1.

  • Jones, T. A., and Coauthors, 2020: Assimilation of GOES-16 radiances and retrievals into the Warn-on-Forecast System. Mon. Wea. Rev., 148, 1829–1859, https://doi.org/10.1175/MWR-D-19-0379.1.

  • Kain, J. S., and Coauthors, 2008: Some practical considerations regarding horizontal resolution in the first generation of operational convection-allowing NWP. Wea. Forecasting, 23, 931–952, https://doi.org/10.1175/WAF2007106.1.

  • Liu, H., M. Hu, G. Ge, C. Zhou, D. Stark, H. Shao, K. Newman, and J. Whitaker, 2018: Ensemble Kalman filter (EnKF) user’s guide version 1.3. Developmental Testbed Center, 80 pp., https://dtcenter.org/community-code/ensemble-kalman-filter-system-enkf/documentation.

  • Miller, W. J. S., and Coauthors, 2022: Exploring the usefulness of downscaling free forecasts from the Warn-on-Forecast System. Wea. Forecasting, 37, 181–203, https://doi.org/10.1175/WAF-D-21-0079.1.

  • NWS, 2024: NWS Storm Damage Summaries. National Weather Service, accessed 12 July 2024, https://www.weather.gov/crh/dec112021.

  • Pan, S., J. Gao, T. A. Jones, Y. Wang, X. Wang, and J. Li, 2021: The impact of assimilating satellite-derived layered precipitable water, cloud water path, and radar data on short-range thunderstorm forecasts. Mon. Wea. Rev., 149, 1359–1380, https://doi.org/10.1175/MWR-D-20-0040.1.

  • Skinner, P. S., and Coauthors, 2018: Object-based verification of a prototype warn-on-forecast system. Wea. Forecasting, 33, 1225–1250, https://doi.org/10.1175/WAF-D-18-0020.1.

  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 1617–1630, https://doi.org/10.1175/BAMS-D-14-00173.1.

  • Stensrud, D. J., and Coauthors, 2013: Progress and challenges with Warn-on-Forecast. Atmos. Res., 123, 2–16, https://doi.org/10.1016/j.atmosres.2012.04.004.

  • Wang, Y., J. Gao, P. S. Skinner, D. M. Wheatley, J. J. Choate, T. A. Jones, and G. Creager, 2018: Test of a hybrid 3DEnVAR and WRF-DART analysis and forecast system during the HWT spring experiments in 2017. 98th AMS Annual Meeting, Austin, TX, Amer. Meteor. Soc., 168, https://ams.confex.com/ams/98Annual/meetingapp.cgi/Paper/330920.

  • Wang, Y., J. Gao, P. S. Skinner, K. Knopfmeier, T. Jones, G. Creager, P. L. Heinselman, and L. J. Wicker, 2019: Test of a weather-adaptive dual-resolution hybrid warn-on-forecast analysis and forecast system for several severe weather events. Wea. Forecasting, 34, 1807–1827, https://doi.org/10.1175/WAF-D-19-0071.1.

  • Wilson, K. A., and Coauthors, 2024: Collaborative exploration of storm-scale probabilistic guidance for NWS forecast operations. Wea. Forecasting, 39, 387–402, https://doi.org/10.1175/WAF-D-23-0174.1.

  • Yussouf, N., and K. H. Knopfmeier, 2019: Application of the Warn-on-Forecast system for flash-flood-producing heavy convective rainfall events. Quart. J. Roy. Meteor. Soc., 145, 2385–2403, https://doi.org/10.1002/qj.3568.
Save
  • Chen, L., C. Liu, Y. Jung, P. Skinner, M. Xue, and R. Kong, 2022: Object-based verification of GSI EnKF and hybrid En3DVar radar data assimilation and convection-allowing forecasts within a warn-on-forecast framework. Wea. Forecasting, 37, 639658, https://doi.org/10.1175/WAF-D-20-0180.1.

    • Search Google Scholar
    • Export Citation
  • Clark, A. J., and Coauthors, 2021a: A real-time, virtual spring forecasting experiment to advance severe weather prediction. Bull. Amer. Meteor. Soc., 102, E814E816, https://doi.org/10.1175/BAMS-D-20-0268.1.

    • Search Google Scholar
    • Export Citation
  • Clark, A. J., and Coauthors, 2021b: Spring forecasting experiment 2021 – Preliminary findings and results. NOAA Hazardous Weather Testbed, accessed 1 October 2022, https://hwt.nssl.noaa.gov/sfe/2021/docs/HWT_SFE_2021_Prelim_Findings_FINAL.pdf.

  • Davis, C. A., B. G. Brown, R. Bullock, and J. Halley-Gotway, 2009: The Method For Object-Based Diagnostic Evaluation (MODE) applied to numerical forecasts from the 2005 NSSL/SPC spring program. Wea. Forecasting, 24, 12521267, https://doi.org/10.1175/2009WAF2222241.1.

    • Search Google Scholar
    • Export Citation
  • Dowell, D. C., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part I: Motivation and system description. Wea. Forecasting, 37, 13711395, https://doi.org/10.1175/WAF-D-21-0151.1.

    • Search Google Scholar
    • Export Citation
  • Gao, J., and D. J. Stensrud, 2014: Some observing system simulation experiments with a hybrid 3DEnVAR system for storm-scale radar data assimilation. Mon. Wea. Rev., 142, 33263346, https://doi.org/10.1175/MWR-D-14-00025.1.

    • Search Google Scholar
    • Export Citation
  • Gao, J., and Coauthors, 2013: A real-time weather-adaptive 3DVAR analysis system for severe weather detections and warnings. Wea. Forecasting, 28, 727745, https://doi.org/10.1175/WAF-D-12-00093.1.

    • Search Google Scholar
    • Export Citation
  • Gao, J., Y. Wang, D. M. Wheatley, K. H. Knopfmeier, T. A. Jones, and G. Creager, 2017: Test of a weather-adaptive hybrid 3DEnVAR and WRF-DART analysis and forecast system during the HWT spring experiments in 2017. 38th Conf. on Radar Meteorology, St. Gallen, Switzerland, Amer. Meteor. Soc., 19B.1, https://ams.confex.com/ams/38RADAR/meetingapp.cgi/Paper/321145.

  • Gao, J., and Coauthors, 2021: Testing of the Warn-on-Forecast (WoF) hybrid data assimilation and forecasting system during the HWT spring experiment in 2020. 11th Conf. on Transition of Research to Operations, Online, Amer. Meteor. Soc., 10.2, https://ams.confex.com/ams/101ANNUAL/meetingapp.cgi/Paper/382602.

  • Gao, J., and Coauthors, 2022: Testing of the warn-on-forecast hybrid data assimilation and forecasting system at 1.5-km resolution during the HWT spring forecasting experiment in 2021. 12th Conf. on Transition of Research to Operations, Online, Amer. Meteor. Soc., 11B.6, https://ams.confex.com/ams/102ANNUAL/meetingapp.cgi/Paper/395399.

  • Gao, J., and Coauthors, 2023. Improving the Warn-on-Forecast System using a hybrid gain data assimilation method: A case study of the 10 December 2021 tornado outbreak. 13th Conf. on Transition of Research to Operations., Denver, CO, Amer. Meteor. Soc., 11B.3, https://ams.confex.com/ams/103ANNUAL/meetingapp.cgi/Session/63173.

  • Guerra, J. E., P. S. Skinner, A. Clark, M. Flora, B. Matilla, K. Knopfmeier, and A. E. Reinhart, 2022: Quantification of NSSL Warn-on-Forecast System accuracy by storm age using object-based verification. Wea. Forecasting, 37, 19731983, https://doi.org/10.1175/WAF-D-22-0043.1.

    • Search Google Scholar
    • Export Citation
  • Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155167, https://doi.org/10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Heinselman, P. L., and Coauthors, 2024: Warn-on-Forecast System: From vision to reality. Wea. Forecasting, 39, 7595, https://doi.org/10.1175/WAF-D-23-0147.1.

    • Search Google Scholar
    • Export Citation
  • Hu, J., and Coauthors, 2021: Evaluation of an experimental Warn-on-Forecast 3DVAR analysis and forecast system on quasi-real-time short-term forecasts of high-impact weather events. Quart. J. Roy. Meteor. Soc., 147, 40634082, https://doi.org/10.1002/qj.4168.

    • Search Google Scholar
    • Export Citation
  • James, E. P., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part II: Forecast performance. Wea. Forecasting, 37, 13971417, https://doi.org/10.1175/WAF-D-21-0130.1.

    • Search Google Scholar
    • Export Citation
  • Jones, T. A., X. Wang, P. Skinner, A. Johnson, and Y. Wang, 2018: Assimilation of GOES-13 imager clear-sky water vapor (6.5 μm) radiances into a warn-on-forecast system. Mon. Wea. Rev., 146, 10771107, https://doi.org/10.1175/MWR-D-17-0280.1.

    • Search Google Scholar
    • Export Citation
  • Jones, T. A., and Coauthors, 2020: Assimilation of GOES-16 radiances and retrievals into the Warn-on-Forecast System. Mon. Wea. Rev., 148, 18291859, https://doi.org/10.1175/MWR-D-19-0379.1.

    • Search Google Scholar
    • Export Citation
  • Kain, J. S., and Coauthors, 2008: Some practical considerations regarding horizontal resolution in the first generation of operational convection-allowing NWP. Wea. Forecasting, 23, 931952, https://doi.org/10.1175/WAF2007106.1.

    • Search Google Scholar
    • Export Citation
  • Liu, H., M. Hu, G. Ge, C. Zhou, D. Stark, H. Shao, K. Newman, and J. Whitaker, 2018: Ensemble Kalman filter (EnKF) user’s guide version 1.3. Developmental Testbed Center, 80 pp., https://dtcenter.org/community-code/ensemble-kalman-filter-system-enkf/documentation.

  • Miller, W. J. S., and Coauthors, 2022: Exploring the usefulness of downscaling free forecasts from the Warn-on-Forecast System. Wea. Forecasting, 37, 181203, https://doi.org/10.1175/WAF-D-21-0079.1.

    • Search Google Scholar
    • Export Citation
  • NWS, 2024: NWS Storm Damage Summaries. National Weather Service, accessed 12 July 2024, https://www.weather.gov/crh/dec112021.

  • Pan, S., J. Gao, T. A. Jones, Y. Wang, X. Wang, and J. Li, 2021: The impact of assimilating satellite-derived layered precipitable water, cloud water path, and radar data on short-range thunderstorm forecasts. Mon. Wea. Rev., 149, 13591380, https://doi.org/10.1175/MWR-D-20-0040.1.

    • Search Google Scholar
    • Export Citation
  • Skinner, P. S., and Coauthors, 2018: Object-based verification of a prototype warn-on-forecast system. Wea. Forecasting, 33, 12251250, https://doi.org/10.1175/WAF-D-18-0020.1.

    • Search Google Scholar
    • Export Citation
  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 16171630, https://doi.org/10.1175/BAMS-D-14-00173.1.

    • Search Google Scholar
    • Export Citation
  • Stensrud, D. J., and Coauthors, 2013: Progress and challenges with Warn-on-Forecast. Atmos. Res., 123, 216, https://doi.org/10.1016/j.atmosres.2012.04.004.

    • Search Google Scholar
    • Export Citation
  • Wang, Y., J. Gao, P. S. Skinner, D. M. Wheatley, J. J. Choate, T. A. Jones, and G. Creager, 2018: Test of a hybrid 3DEnVAR and WRF-DART analysis and forecast system during the HWT spring experiments in 2017. 98th AMS Annual Meeting, Austin, TX, Amer. Meteor. Soc., 168, https://ams.confex.com/ams/98Annual/meetingapp.cgi/Paper/330920.

  • Wang, Y., J. Gao, P. S. Skinner, K. Knopfmeier, T. Jones, G. Creager, P. L. Heiselman, and L. J. Wicker, 2019: Test of a weather-adaptive dual-resolution hybrid warn-on-forecast analysis and forecast system for several severe weather events. Wea. Forecasting, 34, 18071827, https://doi.org/10.1175/WAF-D-19-0071.1.

    • Search Google Scholar
    • Export Citation
  • Wilson, K. A., and Coauthors, 2024: Collaborative exploration of storm-scale probabilistic guidance for NWS forecast operations. Wea. Forecasting, 39, 387402, https://doi.org/10.1175/WAF-D-23-0174.1.

    • Search Google Scholar
    • Export Citation
  • Yussouf, N., and K. H. Knopfmeier, 2019: Application of the Warn-on-Forecast system for flash-flood-producing heavy convective rainfall events. Quart. J. Roy. Meteor. Soc., 145, 23852403, https://doi.org/10.1002/qj.3568.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    Object count of composite reflectivity for (a),(b) forecasts valid at 1 hour and (c),(d) forecasts valid at 6 hours, binned by forecast initialization time over 31 cases in 2021. (a),(c) The total object count per event; (b),(d) the matched object count per event. Solid blue lines represent WoF-Hybrid, gray lines are for individual WoFS members, and solid black lines are for WMA. In (a) and (c), dashed blue and dashed black lines are for 1.5- and 3-km MRMS AWS objects, respectively.

  • Fig. 2.

    The performance diagram for (a) forecasts valid at 1 h and (b) forecasts valid at 6 h aggregated over 31 cases in 2021. Each number represents a forecast initialization time for WMA (solid black) or WoF-Hybrid (solid blue).

  • Fig. 3.

    Statistical scores including (a),(d) CSI, (b),(e) POD, and (c),(f) FAR for composite reflectivity. (left) Aggregated scores over all 2021 cases binned by forecast lead time. (right) Aggregated scores over three different event severity categories. On the left, gray lines are for individual WoFS members, black lines are for WMA, and blue lines are for WoF-Hybrid. On the right, solid blue (black) lines are for high-end severity cases for WoF-Hybrid (WMA). Dashed blue (black) lines represent midseverity cases for WoF-Hybrid (WMA). Dotted blue (black) lines represent low-severity cases for WoF-Hybrid (WMA). Each point represents a 10-min time interval. The error bars represent the 95% confidence intervals in each panel (note: in the left column, error bars are plotted for WMA and WoF-Hybrid; in the right column, error bars are plotted only for high-severity cases, for clarity).

  • Fig. 4.

    (a) AAR and (b) AMD for composite reflectivity binned by forecast lead time for all 2021 cases. Blue lines represent WoF-Hybrid, gray lines represent individual WoFS members, and black lines represent WMA. Each point represents a 10-min time interval.

  • Fig. 5.

    As in Fig. 2, but for UH, and in (a), forecasts valid at 90 min rather than 60 min.

  • Fig. 6.

    As in Fig. 3, but for UH.

  • Fig. 7.

    As in Fig. 4, but for UH.

  • Fig. 8.

    The analysis and forecast domains and locations of the radar sites for (left) 26 May 2021 and (right) 10 Dec 2021. The red triangles, green squares, and blue circles indicate observed tornadoes, hail, and wind events from NWS storm reports, respectively. The pink box in each domain represents the area that experienced the greatest severe weather impact, displayed in Figs. 10 and 12, respectively. The green star inside the box in the right panel marks the location of Mayfield, Kentucky.

  • Fig. 9.

    Composite reflectivity from (a) MRMS and 20-min forecasts from (b) WoF-Hybrid, (c) WoFS member 4, and (d) WoFS member 12.

  • Fig. 10.

    A spatial representation of composite reflectivity near the Kansas/Nebraska border at 2230 UTC 26 May 2021, for six different forecasts initialized at (a) 1700, (b) 1800, (c) 1900, (d) 2000, (e) 2100, and (f) 2200 UTC. The domain represented on this plot is zoomed in to focus on the most impactful convection (see Fig. 8, left). Hatched contours are MRMS data above 40 dBZ from 2230 UTC. Red shaded areas are the reflectivity above 46.1 dBZ (WoFS object threshold) of WoFS Best Member, defined as the member among the 18 with the highest average CSI for the parameter of interest during the 0–6-h forecasts. Blue shaded areas are the reflectivity contours above 47.1 dBZ (WoF-Hybrid object threshold) for WoF-Hybrid. CSIs over the entire domain from the initialization time until 2230 UTC of WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner.

  • Fig. 11.

    A spatial representation of UH from 2300 to 0000 UTC 26–27 May over the WoFS domain of six different forecasts initialized at (a) 1800, (b) 1900, (c) 2000, (d) 2100, (e) 2200, and (f) 2300 UTC. Light red shaded areas are the UH swaths above 65 m² s⁻² from WoFS Best Member. Blue contours are the UH swaths above 130 m² s⁻² for WoF-Hybrid. The red triangles, green squares, and blue circles are tornado, hail, and wind reports from 2300 to 0000 UTC, respectively. CSI scores for 30-min UH swaths over the entire domain for WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner.

  • Fig. 12.

    (a)–(f) A spatial representation of composite reflectivity in the mid-Mississippi Valley at 0330 UTC 11 Dec for six different forecasts starting with 2200 UTC 10 Dec 2021 in (a) and ending with 0300 UTC 11 Dec 2021 in (f). The domain represented on this plot is zoomed in to focus on the most impactful convection (see Fig. 8, right). Hatched contours are 40-dBZ composite reflectivity objects from MRMS data at 0330 UTC. Red contours are bias-adjusted 40-dBZ composite reflectivity objects from WoFS Best Member. Blue contours are the bias-adjusted 40-dBZ composite reflectivity objects from WoF-Hybrid. CSIs over the entire domain from the initialization time until 0330 UTC for WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner. Mayfield, Kentucky, is located at 36.74°N, 88.64°W (red star).

  • Fig. 13.

    A spatial representation of UH from 0200 to 0400 UTC 11 Dec 2021, over the WoFS domain for forecasts initialized at (a) 2300, (b) 0000, (c) 0100, and (d) 0200 UTC. Light red shaded areas are the UH swaths above 65 m² s⁻² of WoFS Best Member. Blue contours are UH swaths above 130 m² s⁻² for WoF-Hybrid. The red triangles, green squares, and blue circles are tornado, hail, and wind reports from 0200 to 0400 UTC 11 Dec 2021, respectively. CSIs of 30-min UH swaths over the entire domain for WoF-Hybrid and WoFS Best Member are displayed in the upper-right-hand corner. Mayfield, Kentucky, is located at 36.74°N, 88.64°W (red star).
