Biases in the Prediction of Convective Storm Characteristics with a Convection Allowing Ensemble

Joseph A. Grim, Research Applications Laboratory, National Center for Atmospheric Research, Boulder, Colorado

James O. Pinto, Research Applications Laboratory, National Center for Atmospheric Research, Boulder, Colorado

Thomas Blitz, Research Applications Laboratory, National Center for Atmospheric Research, Boulder, Colorado

Kenneth Stone, Research Applications Laboratory, National Center for Atmospheric Research, Boulder, Colorado

David C. Dowell, NOAA/Global Systems Laboratory, Boulder, Colorado


Abstract

The severity, duration, and spatial extent of thunderstorm impacts are related to convective storm mode. This study assesses the skill of the High-Resolution Rapid Refresh Ensemble (HRRR-E) and its deterministic counterpart (HRRRv4) at predicting convective mode and storm macrophysical properties using 35 convective events observed during the 2020 warm season across the eastern United States. Seven cases were selected from each of five subjectively determined convective organization modes: tropical cyclones, mesoscale convective systems (MCSs), quasi-linear convective systems, clusters, and cellular convection. These storm events were assessed using an object-based approach to identify convective storms and determine their individual size. Averaged across all 35 cases, both the HRRR-E and HRRRv4 predicted storm areas were generally larger than observed, with this bias being a function of storm lifetime and convective mode. Both modeling systems also underpredicted the rapid increase in storm counts during the initiation period, particularly for the smaller-scale storm modes. Interestingly, performance of the HRRRv4 differed from that of the HRRR-E, with the HRRRv4 generally having a larger bias in total storm area than the HRRR-E due to HRRRv4 predicting up to 66% more storm objects than the HRRR-E. The HRRR-E accurately predicted the convective mode 65% of the time, with complete misses being very rare (<5% of the time overall). However, an evaluation of rank histograms across all 35 cases revealed that the HRRR-E tended to be underdispersive when predicting storm size for all but the MCS mode.

© 2021 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Joseph A. Grim, grim@ucar.edu


1. Introduction

The severity, duration, and spatial extent of thunderstorm impacts are directly related to convective storm mode. Larger thunderstorm complexes (e.g., mesoscale convective systems, MCSs) tend to spawn more widespread flooding and wind reports, while severe cellular convection and linear storms (e.g., quasi-linear convective systems, QLCSs) are more likely to produce damaging hail and tornadoes (e.g., Gallus et al. 2008; Smith et al. 2012). Because of their small footprint and limited lifetime, the smaller thunderstorms, classified as cellular or pulse storms, tend to have lesser impacts overall, with the exception of supercells. More organized convective systems that develop via upscale growth of individual storm cells into clusters can have broader-ranging effects, including flash flooding, hail, and wind damage. In addition, these clusters of storms tend to move slowly or erratically, or to become quasi-stationary, and are difficult to predict due to their weakly forced nature and chaotic interactions.

Further increases in storm scale usually require mesoscale or synoptic-scale forcing, such as lift generated by a surface-based or elevated frontal boundary or upper-level vorticity advection, that helps foster upscale growth of storm cells into QLCSs or MCSs. Impacts of these larger storm systems can be far reaching and very costly, with significant economic and societal consequences. In fact, one of the cases studied here, the August 2020 Midwest derecho, cut a 1000+-km-long swath from South Dakota to Ohio, killing four people and causing damages totaling $11 billion (U.S. dollars) (Smith 2021; Ribeiro et al. 2021). More typical, less damaging QLCSs or MCSs can be major contributors to flight delays across the United States, with convective storms accounting for over 50% of weather-related delays reported each summer (FAA 2015). Finally, landfalling tropical systems tend to have the greatest impact on society. These systems can influence wide swaths of the United States, having caused over $20 billion in property damage per event over the last 40 years. While the skill at predicting the track (and, to a lesser extent, intensity) of these large convective systems has increased over time (e.g., Elsberry 2014; DeMaria et al. 2014), predicting the wide-reaching impacts of rainbands and associated severe weather of decaying tropical systems has received less study (see Matyas 2010 for a review).

A number of studies have shown that at a grid spacing of 4 km or less, numerical weather prediction (NWP) models can adequately capture much of the mesoscale dynamics and thermodynamic processes required to effectively predict convective storm mode (e.g., Weisman et al. 1997; Done et al. 2004; Weisman et al. 2008; Schwartz et al. 2009). These findings preceded the development of an operational convection-allowing forecast system called the High-Resolution Rapid Refresh (HRRR; Dowell et al. 2021, manuscript submitted to Wea. Forecasting; James et al. 2021, manuscript submitted to Wea. Forecasting; Benjamin et al. 2016), which has been running at the National Centers for Environmental Prediction (NCEP) since 30 September 2014, with the latest and final version (HRRRv4) installed on 2 December 2020. This forecast system has formed the backbone of short-term weather prediction in the United States over the past decade. However, since the development of the first convection-allowing models (CAMs; e.g., Fritsch and Carbone 2004), the prediction of convective storms has been fraught with forecast uncertainty due to inadequate specification of initial conditions (Dabberdt et al. 2005; Duda and Gallus 2013), parameterization sensitivities (e.g., Coniglio et al. 2010, 2013), and limitations of predictability (Lilly 1990).

Over the past 10 years, there has been great progress in the development of CAM ensembles that more fully capture the true forecast uncertainty (e.g., Schwartz et al. 2010; Romine et al. 2014). This work has been facilitated by the National Oceanic and Atmospheric Administration (NOAA) Hazardous Weather Testbed (HWT; e.g., Kain et al. 2003; Gallo et al. 2017) and further enhanced by the Community Leveraged Unified Ensemble (CLUE; Clark et al. 2014). The HWT has enabled the evaluation and intercomparison of several CAM ensembles’ skill at predicting convection using an array of techniques, including subjective evaluation by forecasters (Kain et al. 2006), standard skill scores, surrogate severe verification (e.g., Sobash et al. 2016), neighborhood-based evaluation methods (Schwartz et al. 2010; Skinner et al. 2016), object-based techniques (Gallus 2010; Skinner et al. 2018), and combinations of these verification methods (Johnson and Wang 2013; Potvin et al. 2019; Gallo et al. 2019).

Work to evaluate CAM ensemble spread and reliability is critical for developers and users of ensemble prediction systems. Clark et al. (2014) demonstrated that the predicted number of storm objects can be quite sensitive to the microphysical scheme; thus, varying the microphysics scheme among members can broaden the ensemble spread in predicted storm attributes. Applying an object-based verification technique to predictions from a large regional CAM ensemble associated with the National Severe Storms Laboratory Experimental Warn-on-Forecast System for ensembles, Skinner et al. (2018) found that ensemble skill increased with storm size and intensity.

The next generation of operational CAM ensemble models is currently under development. The Global Systems Laboratory (GSL) has run the HRRR-E (Kalina et al. 2021) experimentally for the last few years, and its successor, the Rapid Refresh Forecast System ensemble (RRFS-E), is currently entering a rapid phase of development. Both systems were recently run as part of the HWT. However, there have been only limited evaluations of the skill of these two systems at predicting convective storms. For example, Potvin et al. (2019) assessed the performance of several CAM ensembles for smaller-scale (250–2500 km2), severe (99.9th percentile) storms. In that study, object-based evaluation was performed using a single representative member from several ensembles, including two early versions of the RRFS (i.e., finite volume cubed, FV3, core implementations). The Potvin et al. (2019) study revealed that the two FV3-based ensemble members had significantly lower probability of detection (POD) of these small severe storms (POD of 0.35–0.40) than that obtained with the HRRR-E member (POD of 0.46), with the HRRR-E also better capturing observed occurrence maxima over the southeastern United States and high plains. These initial comparisons revealed significant differences between the FV3-based systems and the HRRR-E that warrant further exploration.

In this study, we focus on the performance of the HRRR and HRRR-E modeling systems at predicting convective storm properties in order to establish a baseline skill level for comparison with future RRFS/RRFS-E developmental versions. This study is unique in that it combines a subjective classification of storm convective mode with an object-based verification technique to evaluate a CAM ensemble and its deterministic counterpart as a function of convective storm mode. The analyses herein present a statistical assessment of storm attributes similar to the method used by Clark et al. (2014), in which modeled and observed composite evolution of storm properties are compared. This study also builds on the work of Skinner et al. (2018) by assessing CAM ensemble performance across a much wider range of storm object sizes and by assessing ensemble skill at capturing convective mode. Section 2 details the methodology employed in this study. Analyses of the performance of the HRRR-E and HRRRv4 are presented in section 3, followed by an evaluation of probabilistic information provided by HRRR-E in section 4. Finally, section 5 summarizes key findings, provides some reflection on the cause of differences between HRRRv4 and HRRR-E performance, and discusses areas ripe for future work.

2. Methodology

As mentioned above, herein we present a statistical object-based evaluation of the HRRR modeling suite using subjective categorization of storm convective mode. Dowell et al. (2016) describe the initial HRRR-E design, which was updated by Dowell (2020) to describe the configuration used to support the HWT in 2020, while Dowell et al. (2021, manuscript submitted to Wea. Forecasting) explain the latest features of the HRRR modeling systems. Both the HRRR-E and HRRRv4 use 3-km grid spacing with nearly identical configurations. Both modeling systems assimilate a range of conventional observations as well as radar reflectivity. The key distinction between the HRRRv4 and HRRR-E configurations used in 2020 is that the HRRR-E uses a 9-member forecast ensemble that employs initial-condition perturbations from the Global Data Assimilation System (GDAS) ensemble in the Global Forecast System (GFS), random perturbations to the lateral boundary conditions, and stochastic perturbations. In the stochastically perturbed parameterizations (SPP) applied to each HRRR-E member, key parameters are perturbed in the Mellor–Yamada–Nakanishi–Niino (MYNN) planetary boundary layer scheme, the Thompson microphysics scheme, and the Smirnova/Rapid Update Cycle land surface scheme (Jankov et al. 2019; Kalina et al. 2021). In 2020, the HRRR-E configuration also included SPP tendencies (SPPT; Berner et al. 2015). This study only evaluates the 0000 and 1200 UTC cycles from both the deterministic HRRRv4 and the HRRR-E. A real-time feed from the NOAA Oceanic and Atmospheric Research GSL provided the data from both modeling systems.

The observational dataset is the national mosaic of composite reflectivity from the operational Multi-Radar Multi-Sensor (MRMS) system (see Smith et al. 2016). Based on inspection of MRMS composite reflectivity, a set of 35 cases was selected from a 5-month period (27 April–27 September) during the 2020 convective season. Cases were chosen subjectively to fit into one of five categories, based on their predominant convective mode: cellular, cluster, QLCS, MCS, and land-falling tropical cyclones (TCs). Here, we focus on land-falling TCs to limit the impact of radar attenuation on the storm properties derived from the MRMS radar mosaic. Using the time of landfall as the evaluation start time also ensures some consistency across the TC cases, so that all of the systems were entering the dissipation phase. The total number of cases was somewhat limited by the availability of HRRR-E model output, which was generated in an experimental environment (Fig. 1). Since the HRRR-E sometimes had missing forecast valid times for some or all members, this study only includes cases with at least 90% availability (Table 1).

Fig. 1. Availability of the nine HRRR-E members and single HRRRv4 member during the period 0000 UTC 27 Apr 2020–1200 UTC 26 Sep 2020, where the shaded color indicates the number of members available via LDM feed at each forecast generation time (y axis) and lead time (x axis).

Table 1. Case start and end dates, predominant storm organization, and other statistics.

Figure 2 shows the composite reflectivity at the mature stage for all 35 cases, organized by convective mode and date. The cellular convective mode includes thunderstorm events dominated by small-scale (i.e., <50-km diameter) circular storms that occur under weakly forced conditions driven by diurnal heating. The multicellular cluster mode includes cases in which storm cells grew and often merged to form clusters; these cases were generally a mixture of cellular convection and storm clusters, but dominated by multicellular clusters. QLCSs were events in which the storms evolved into long, nearly continuous lines of convection without substantial trailing or leading stratiform regions. The MCS category includes only those convective systems that developed substantial stratiform regions. Finally, TCs were events that were warm core and had winds of at least tropical storm strength (wind speeds greater than 17 m s−1) at, or shortly before, landfall. For each storm convective mode, seven cases were selected, with each case consisting of between 226 and 2626 individual hourly storm objects (Table 1) over the course of the event, providing a robust sample size for statistical comparisons between storm categories.

Fig. 2. Radar depictions from the College of DuPage of the 35 convective cases during the mature stage of each event. The cases are ordered by date and event type.

The verification region used in this study was bounded by the U.S. coastline to the south and east, the U.S.–Canadian and U.S.–Mexican borders, and the 105°W meridian. The western United States was excluded from these analyses because there are substantial areas that lack adequate radar coverage, and storms there are generally smaller in size. Table 1 summarizes the cases included in the study. The start date and time of each evaluation period was defined as 3 h before the beginning of the observed increase in convective activity for a given convective event, with two exceptions: 1) for TCs, the start time was the approximate time of landfall, and 2) the start time was shifted one hour later if the observed event started at 0300 or 1500 UTC. This second criterion prevented the beginning of the evaluation period (3 h before the observed increase in convective activity) from coinciding with the model analysis times at 0000 and 1200 UTC, when there is a strong positive bias in radar coverage. The end date and time for each case was the time when the convective activity reached a minimum, or 24 h after the start date and time, whichever came first. The 24-h limit ensured that each case period fell entirely within a single 36-h HRRR-E forecast (issued at 0000 or 1200 UTC). Using these criteria, the case periods ranged from 12 to 24 h, with all of the TC cases spanning the full 24 h.

Each dataset was mapped onto a common grid, defined on a regular latitude–longitude projection with 0.05° grid spacing, using bilinear interpolation. The influence of each pixel was then expanded by setting the value at each grid point to the maximum value occurring within a 3-gridpoint radius; this was done so that small gaps between convective elements that are part of a larger convective system were bridged and the elements were considered a single storm object, following Davis et al. (2006). The Thunderstorm Identification, Tracking, Analysis and Nowcasting (TITAN; Dixon and Wiener 1993) software was used to identify storms using a technique similar to that described by Pinto et al. (2015). A composite reflectivity threshold of 35 dBZ was used to identify storm objects in both the MRMS and model data. This threshold is comparable to those used previously to identify areas of impactful convective rainfall (e.g., Roberts and Rutledge 2003; Davis et al. 2006). While this threshold is lower than those used in other studies to detect the most intense convective storms (e.g., Skinner et al. 2018; Potvin et al. 2019), the 35-dBZ threshold used here better captures the full range of convective storms (i.e., includes less severe convection) while resulting in a frequency bias of nearly one for both the HRRRv4 and HRRR-E, as compared to MRMS observations, over the entire evaluation period. Figure 3 shows the mean hourly count of composite reflectivity pixels over the continental United States east of 105°W in 5-dBZ-wide bins, at thresholds from 20 to 50 dBZ, for all lead times and all 35 cases in this study. It shows that the HRRRv4 bias was minimized near 35 dBZ, while the mean HRRR-E bias was minimized near 38 dBZ. Therefore, the threshold used here results in minimal bias between the products in the derived storm properties.
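For illustration, the gridding and object-identification steps described above can be sketched as follows. This is a minimal sketch using NumPy and SciPy, not the operational TITAN code; the function name, constants, and synthetic example are assumptions for demonstration only.

```python
# Minimal sketch (not the operational TITAN code) of the storm-object
# identification described above: composite reflectivity on a 0.05-degree
# grid is dilated with a 3-gridpoint maximum filter so that small gaps
# between convective elements merge, then contiguous regions >= 35 dBZ
# are labeled as storm objects.
import numpy as np
from scipy import ndimage

DBZ_THRESHOLD = 35.0   # composite reflectivity threshold for storm objects
EXPAND_RADIUS = 3      # gridpoint radius used to expand each pixel's influence

def identify_storm_objects(refl):
    """Return (labels, n_storms) for a 2D composite-reflectivity grid (dBZ)."""
    # Build a circular footprint of radius EXPAND_RADIUS gridpoints.
    size = 2 * EXPAND_RADIUS + 1
    footprint = np.zeros((size, size), dtype=bool)
    yy, xx = np.ogrid[-EXPAND_RADIUS:EXPAND_RADIUS + 1, -EXPAND_RADIUS:EXPAND_RADIUS + 1]
    footprint[yy**2 + xx**2 <= EXPAND_RADIUS**2] = True

    # Expand each pixel to the maximum value within the footprint.
    expanded = ndimage.maximum_filter(refl, footprint=footprint)

    # Label contiguous regions exceeding the reflectivity threshold.
    labels, n_storms = ndimage.label(expanded >= DBZ_THRESHOLD)
    return labels, n_storms

# Synthetic example: a weak background field with one idealized storm.
refl = np.random.uniform(0.0, 30.0, size=(200, 300))
refl[50:60, 100:140] = 45.0
labels, n_storms = identify_storm_objects(refl)
print("storm objects identified:", n_storms)
```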

Fig. 3. Hourly mean frequency of pixels within 5-dBZ-wide bins for all times of the 35 cases, for MRMS (thick, dashed black), HRRRv4 (thick, dashed navy blue), and HRRR-E (thin rainbow colors) over the continental United States east of 105°W for all forecast lead times.

The area of each storm was determined by summing the area of each of the unexpanded pixels (colored blue in Fig. 4) within each TITAN storm object (red outline in Fig. 4). As evident from Fig. 4, this technique produced far more accurate estimates of individual storm area for the larger storms than the TITAN polygon approach, whose object definition, limited to 72 vertices, is inadequate for defining the shape and area of the larger storms. The number of storms at each time for each case was given by the number of TITAN storm polygons, while the total storm area was the sum of all of the individual storm areas at each time.
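A sketch of this area calculation is given below, assuming the labeled objects from the previous example and a regular 0.05° latitude–longitude grid. The helper name and the simple latitude-dependent pixel-area approximation are illustrative assumptions, not taken from the study's code.

```python
# Sketch of individual storm areas: only pixels exceeding 35 dBZ on the
# unexpanded grid contribute, summed within each labeled object, with each
# 0.05-degree pixel area approximated from its latitude.
import numpy as np
from scipy import ndimage

def storm_areas_km2(refl, labels, lats_2d, dx_deg=0.05):
    """Area (km^2) of each storm object, summing only unexpanded >=35-dBZ pixels."""
    km_per_deg = 111.2
    # Approximate area of each grid cell on the lat-lon grid (shrinks with latitude).
    pixel_area = (dx_deg * km_per_deg) ** 2 * np.cos(np.deg2rad(lats_2d))
    # Count only pixels above the threshold on the original (unexpanded) grid.
    convective_area = np.where(refl >= 35.0, pixel_area, 0.0)
    n_storms = int(labels.max())
    # Sum the contributing pixel areas object by object.
    return ndimage.sum(convective_area, labels, index=np.arange(1, n_storms + 1))
```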

Fig. 4. HRRR-E member 8 grid of composite reflectivity greater than 35 dBZ (blue and green shades) for the 27-h forecast of the 0000 UTC 28 Apr 2020 model generation time. The TITAN-identified objects are outlined in red, while the area of each storm within a TITAN object is indicated by blue, and tiny sections of storms outside of TITAN objects are shaded green.

For each storm event, the evaluation area was created using the entire history of modeled and observed storm objects (e.g., Fig. 5). A polygon was then subjectively drawn around each event that encompassed the entire area where MRMS, the HRRRv4, and all of the HRRR-E members had substantial convection. In this way, the observed (MRMS) and modeled (both HRRR-E and HRRRv4) storm objects are loosely matched for each storm event. Only those storm objects whose center points are within the subjectively drawn polygons are used in the calculation of storm statistics. Figure 6 shows the polygons for all 35 cases. All told, the evaluation regions nearly cover the entire United States east of the Rocky Mountains, with most grid points falling within at least seven polygons. To assess the representativeness of the 35 storm systems selected for this study, the distribution of observed storm sizes obtained from the 35 cases is compared with that obtained for the entire 154-day period, 27 April–27 September 2020 (Fig. 7). The area limits (km2) of the 20 bins used to define the probability density function were varied with size to account for the much more numerous small cells, according to the equation bin_i = 100 × 10^(0.2i), where i is the bin-limit number, ranging from 0 to 20. The 35 cases selected for this study had a notably higher frequency of the larger storms than that found over the entire period of interest. For example, storms with an area of 10^4 km2 occurred 1% of the time in the evaluation set, but only 0.5% of the time overall. This comparison reveals that the overall performance statistics presented below are generally weighted toward larger storm sizes than they would have been had the 35 cases been selected randomly.
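Assuming the bin-limit formula as reconstructed above, the variable-width binning of the storm-size distribution could be computed as in the following minimal sketch (names are illustrative):

```python
# Sketch of the variable-width binning for the storm-size PDF, assuming
# bin edges bin_i = 100 x 10^(0.2 i) for i = 0..20, which span 100 km^2 to
# 10^6 km^2 with logarithmically widening bins.
import numpy as np

bin_edges = 100.0 * 10.0 ** (0.2 * np.arange(21))   # 21 edges -> 20 bins

def storm_size_pdf(storm_areas_km2):
    """Fraction of storm objects falling in each of the 20 size bins."""
    counts, _ = np.histogram(storm_areas_km2, bins=bin_edges)
    total = counts.sum()
    return counts / total if total else counts.astype(float)
```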

Fig. 5. All areas of MRMS (gray) and HRRR-E (rainbow colors, one for each HRRR member) composite reflectivity exceeding 35 dBZ during the duration of the case on 27–28 Aug 2020. The black outline indicates the subjectively defined event area encompassing the case.

Fig. 6. Polygons outlining the maximum extent of each of the 35 cases used in this study. Lines are colored by date from 27 Apr to 27 Sep 2020: blue–green–yellow–orange–red–maroon. Semitransparent red color fill indicates the validation region used for the 10 Aug 2020 derecho MCS case.

Fig. 7. Probability distribution function (PDF) of MRMS storm area, as defined by reflectivity > 35 dBZ over the United States east of 105°W. The black line is the PDF for all 35 cases in this study, while the red line is for all storms between 0000 UTC 27 Apr and 2300 UTC 27 Sep 2020.

3. Results

Using the object-based storm identification technique discussed above, storm properties (total storm area, total storm count, and individual storm area) obtained from MRMS, the HRRR-E, and the HRRRv4 were composited by storm type and compared. Note that there is no explicit requirement for individual storm objects obtained from the model predictions to overlap with the observed storm objects, except that both must occur within the same subjectively drawn event validation polygon and period (Fig. 6) for inclusion in the statistical assessment. In this way, the modeled storm objects loosely match the observed storm objects. Since the cases were conditionally sampled based on the observed composite reflectivity, this study only considers false alarms related to overpredicting storm counts and sizes for observed storm events; it does not include events where the model falsely produced storm systems that never occurred. Also, the area of the validation region (given in Table 1 as a percentage of the eastern United States) varies by case, generally ranging from 5% to 50% of the eastern United States. As described below, normalization of each storm characteristic for each case ensures that every case is weighted equally in the compositing, thus removing the impact of differing validation region sizes.

a. Case study

The verification technique is demonstrated using the derecho event that occurred over the Midwestern United States on 10 August 2020, which was the costliest natural disaster in the United States that year (Ribeiro et al. 2021). During the first six hours of the event, the storm system consisted of clusters of small-scale convective storms that were gradually organizing over southern South Dakota and northeastern Nebraska. These storm-cell clusters eventually organized into a large MCS that exhibited a double bow echo structure at its peak intensity at 2100 UTC (Fig. 8). The validation region encompasses this entire evolution obtained from both models and the observations (see Fig. 6). The predominant convective mode predicted by the nine HRRR-E members ranged from widely scattered small cellular convective storms (e.g., member 8) to an MCS with bow structures similar to those observed. Member 4 was the most accurate ensemble member, showing notable similarities with the observed MCS in terms of timing, convective mode (including a clear bow echo structure), number of storm objects, orientation, and storm area. Across all ensemble members for this case, one predicted primarily cellular convection, another predicted clusters, and seven indicated the possibility of an MCS (not shown). The skill of the HRRR-E at capturing convective mode is explored in more detail in section 4. The deterministic HRRRv4 produced two MCSs, both with some indication of bow-echo structure and both of sizes similar to those observed; however, both MCSs are misplaced, and the orientation of the larger MCS is off by 60° in the clockwise direction. This example demonstrates the importance of using convection-allowing ensemble models to capture uncertainties in the predicted timing, location, and morphology of convective storms.

Fig. 8. Composite reflectivity from MRMS, HRRRv4, and the best- and worst-performing HRRR-E members for the 10 Aug 2020 derecho MCS event. Composite reflectivity is valid 2100 UTC 10 Aug 2020, while HRRRv4 and HRRR-E forecasts are from simulation hour 21, initialized at 0000 UTC and valid at 2100 UTC 10 Aug 2020. The 35-dBZ threshold used for identifying storms is indicated by the sharp blue–yellow transition in the reflectivity scale.

Figure 9 shows time series of modeled and observed storm characteristics obtained for this case. The MRMS storm counts and total storm area nearly doubled between hours 3 and 5 of the verification period. Many of the HRRR-E members capture this initial ramp-up in storm activity to varying degrees, and the ensemble forecast generally brackets the observed evolution of total area and storm counts over the first 10 h of the event. Interestingly, the upscale growth occurs much faster in the HRRRv4, resulting in a much earlier development of a large MCS, and hence its westward displacement and rotated orientation by 2100 UTC (Fig. 8). Observations indicate that a second growth period occurred between 1700 and 2300 UTC. Only member 4 captured the timing and amplitude of this second critical growth period, while the rest of the members either missed this second phase of upscale growth altogether or had delayed timing and less storm growth, resulting in a general underforecast of this high-impact MCS event. This is in contrast to the predicted storm area obtained with the HRRRv4, which initiated upscale growth too quickly but maintained the system long enough to capture the storm area later in the event. Most of the ensemble members and the HRRRv4 captured the onset of dissipation for this event, which began around 0200 UTC 11 August 2020.

Fig. 9. Time series of (a) total storm area from MRMS (thick black), HRRRv4 deterministic (thick blue), and each HRRR-E member (thin rainbow colors) for the 24-h verification period of the derecho MCS event, from the 0000 UTC 10 Aug 2020 generation time. (b) As in (a), but for storm count. (c) As in (a), but for mean storm area. Brown vertical dashed line marks the time shown in Fig. 8.

In terms of storm counts, the HRRR-E values nicely bracket the observed storm counts, with a spread of roughly 50% of the median value throughout the event. At the same time, the HRRRv4 tends to be an outlier compared with the ensemble, with too many storm objects predicted prior to the onset of the second growth period. Mean individual storm area showed large uncertainty in the first few hours (due in part to the small number of storms present in each ensemble member). Five hours into the prediction, after phase 1, the mean individual storm area stabilized at around 400 km2, then grew to over 1000 km2 during phase 2, when the derecho was at its maximum extent. The increasing uncertainty in individual storm size is clear from the evolution of the ensemble values, whose spread peaks during the period of maximum observed storm area. The HRRRv4 is hard to distinguish from the other members with regard to individual storm size, except during its premature peak, when its individual storms grew too large.

Identical analyses were performed for all 34 other cases (not shown). A statistical assessment of the performance of the HRRRv4 and HRRR-E as a function of observed convective mode is given using the compositing technique described below.

b. Composite results

The statistical assessment was performed by compositing each storm property as a function of convective mode and time relative to observed storm initiation. Compositing was performed using a normalization approach, whereby the maximum of the storm characteristic (total area or total count) derived from MRMS data for a given case (spanning all case times) was used to normalize the results for that metric across all event times t. That is, for a given case, the normalized MRMS-derived storm characteristic O'(t) is defined as

O'(t) = O(t) / O_max,

where O(t) is the observed storm characteristic as a function of time for a given case and O_max is the maximum observed value of that characteristic for the case. Thus, O'(t) varies between 0 and 1. Similarly, the normalized model values for each storm characteristic, M'_i(t), are found using

M'_i(t) = M_i(t) / O_max,

where M_i(t) is the storm characteristic value for ensemble member i (or the deterministic model) for a given case and time. Normalized model values M'_i(t) less than one indicate an underforecasting bias, and those above one indicate an overforecasting bias. Composite values were found by averaging all normalized values available for a given event hour:

\overline{O'}(t) = (1/N) \sum_{j=1}^{N} O'_j(t),

\overline{M'_i}(t) = (1/N) \sum_{j=1}^{N} M'_{i,j}(t),

where N is the number of cases. The normalization places equal weight on each case’s contribution to the composite time evolution for each storm-type characteristic. Similar to the approach of Schwartz et al. (2017) in their assessment of resolution impacts on ensemble skill, the performance of the range of HRRR-E members is indicated for each metric using the minimum, lower quartile, median, upper quartile, and maximum values (Figs. 10–12).
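For illustration, the normalization and compositing defined by the equations above could be implemented as in the following sketch; the array shapes and names are assumptions, not the study's code.

```python
# Minimal sketch of the normalization and compositing: each case's time
# series is divided by the observed maximum for that case, and normalized
# values are then averaged across cases at each event hour.
import numpy as np

def composite_storm_metric(obs, model):
    """
    obs   : array (n_cases, n_hours) of an observed storm characteristic,
            with NaN where a case has ended.
    model : array (n_cases, n_members, n_hours) of the same characteristic
            from each ensemble member (or a single deterministic member).
    Returns (obs_composite, model_composite), normalized by each case's
    observed maximum and averaged over cases.
    """
    o_max = np.nanmax(obs, axis=1, keepdims=True)     # O_max per case
    obs_norm = obs / o_max                            # O'(t)
    model_norm = model / o_max[:, None, :]            # M'_i(t)
    obs_composite = np.nanmean(obs_norm, axis=0)      # mean over cases
    model_composite = np.nanmean(model_norm, axis=0)  # mean over cases, per member
    return obs_composite, model_composite
```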
Fig. 10. Normalized storm metrics for all 35 cases for MRMS (black), HRRRv4 deterministic (blue), and HRRR-E members (shades of pink, representing minimum, lower quartile, median, upper quartile, and maximum). (a) Total storm area, (b) storm count, and (c) individual storm area ratio.

Fig. 11. Evolution of smaller-scale storm modes: (left) cellular mode and (right) cluster mode of (top) normalized total storm area, (middle) normalized total storm count, and (bottom) normalized individual storm area ratio from MRMS (black), HRRRv4 deterministic (blue), and HRRR-E members (shades of pink, representing minimum, lower quartile, median, upper quartile, and maximum).

Fig. 12. As in Fig. 11, but for the larger storm modes: (left) QLCS mode, (center) MCS mode, and (right) tropical cyclone mode.

1) Overall assessment

Figure 10 compares the time evolution of modeled and observed total storm areas, total counts, and individual storm areas composited across all 35 cases. On average, the observed peak in total storm area occurs about 6 h after initiation (9 h from the event start, since each evaluation period begins 3 h before initiation). The HRRRv4 and HRRR-E peak storm areas occur one hour later, with the modeled peaks being higher than observed. The peak median value obtained with the HRRR-E is just 5% greater than observed, while that obtained with the HRRRv4 is 35% greater than observed. The storm area predicted by the HRRRv4 is greater than observed throughout the life cycle of convection, while that predicted by the HRRR-E just slightly overshoots the observed peak but remains slightly high during most of the dissipation phase. Both the ensemble and deterministic models capture the rate of storm dissipation, but modeled storm area remains too large due to overshooting the observed peak. The slightly longer growth period, coupled with the early onset of convection initiation, resulted in a significant storm area bias in the HRRRv4 prediction. This bias carried through the dissipation phase of storm evolution, despite the model generally capturing the storm dissipation rate. Overall, the HRRR-E better handles the evolution of storm area; however, due to the abovementioned timing bias, the entire ensemble lies outside of the observed storm area during most of the dissipation phase.

As for the total number of storms, there was a rapid increase in MRMS storm count during the first two hours after each event started (hours 3–5). Convective initiation dominates this period, leading to a rapid increase in storm counts. Both the HRRRv4 and HRRR-E show a more gradual increase in storm counts that peaks a couple of hours later than observed (Fig. 10b), at normalized values that are 21% and 7% smaller than observed, respectively. Thereafter, the HRRRv4 storm count was consistently higher than observed, while the normalized counts obtained with all of the HRRR-E members were consistently lower than observed. This indicates that the HRRR-E does not create enough storms during the storm initiation and growth period, and thus its counts remain too low during the mature and dissipation stages. Conversely, while ultimately developing nearly the correct number of storms (albeit in a delayed fashion), the HRRRv4 underestimated the dissipation of storm cells, with counts remaining too high throughout the dissipation period. Previous studies (e.g., Johnson and Wang 2013; Johnson et al. 2013) indicate a tendency of CAMs using the Advanced Research version of the Weather Research and Forecasting (WRF-ARW) dynamical core to overforecast the number of storm objects. Johnson et al. (2013) found that this overforecasting bias occurred throughout the day (peaking at the time of maximum observed counts) in runs using both 1- and 4-km grid spacing. Notably, the HRRR-E actually underpredicted the storm counts through much of the storm life cycle, from the time of peak occurrence through the storm dissipation phase.

To calculate a normalized mean individual storm area (hereafter referred to as “individual storm area ratio”), the normalized total storm area at each time for each case was divided by the normalized storm count for the corresponding case and time; this was then averaged over all cases to create the individual storm area ratio plot in Fig. 10c. This results in the observed individual storm area ratio being greater than one at times, since the observed normalized storm area is not constrained to be less than the observed normalized storm count. This method for computing the individual storm area ratio avoids extreme values at the beginning and/or end of events when only a few storms tend to result in a highly variable value for individual storm area ratio. An example of this is evident during the initiation phase of the derecho event shown in Fig. 9; the result is very large variability in the mean storm area among the ensemble members. Using the ratio technique results in a less noisy calculation of individual storm area, as seen in Fig. 10c. The individual storm area ratio for HRRRv4 and HRRR-E was consistently 25%–30% higher than that obtained with MRMS, though peaking at about the correct time (roughly 11 h into the event).
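A sketch of this ratio calculation is shown below; the array names are hypothetical, and NaNs are assumed to mark hours beyond a case's duration.

```python
# Sketch of the "individual storm area ratio": the normalized total storm
# area is divided by the normalized storm count for each case and hour,
# and the ratio is then averaged over cases at each event hour.
import numpy as np

def individual_storm_area_ratio(norm_area, norm_count):
    """
    norm_area, norm_count : arrays (n_cases, n_hours) of normalized total
    storm area and normalized storm count (NaN where a case has ended).
    Returns the case-mean ratio at each event hour.
    """
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(norm_count > 0, norm_area / norm_count, np.nan)
    return np.nanmean(ratio, axis=0)
```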

2) Assessment by storm mode

When considering statistics by storm mode, it is important to emphasize that only seven storm events are used to compute the statistics for each mode; however, each storm event consists of 226 to 2626 individual hourly storm objects, resulting in a large enough sample size to statistically evaluate performance as a function of event duration. It should be noted, however, that the cellular and cluster event durations were typically around 15 h or less (Table 1), so statistics beyond this duration should be viewed with caution for these two categories. Figures 11 and 12 present the evolution of the mean normalized total storm area, normalized total storm count, and individual storm area ratio for each storm mode. As discussed below, performance varies notably with convective mode in terms of both timing and amplitude. All of the non-TC modes go through an evolution consisting of an initiation and growth period, a mature period, and then a storm decay period. In contrast, for the TC mode we assess the models’ ability to predict the dissipation of land-falling tropical systems.

Observed storms within the cellular convective mode possess the largest amplitude in variability in both area and counts (Figs. 11a–c). Both the HRRRv4 and HRRR-E indicate the large amplitude in the evolution of storm area, but the initiation and growth periods are ∼2 h too long, resulting in a peak area that overshoots the observed peak area by 44% and 23%, respectively. This issue is largest in HRRRv4, which lies at the high end of the ensemble range of the peak in total storm area. While the dissipation rate is actually very similar to that observed, the overestimated and slightly delayed peak area results in storms persisting too long. Again, this issue is compounded in HRRRv4, with these small cellular storms persisting longer than observed. However, the largest bias in the models’ representation of cellular storm counts is the rate of initiation and the sheer number of storms. Both the HRRRv4 and HRRR-E underestimate the normalized counts by 38% and 48%, respectively. Thus, both model systems predict far too few storm cells that are generally too large.

Multicellular storm clusters often develop from the merger of storm cells via complex interactions of cold pools and surface convergence boundaries (e.g., Weisman and Klemp 1986), making their evolution notably different than that of the cellular mode. The observed initial area growth rates are comparatively slower and the amplitude of the evolution of storm area is much smaller than that found for cellular convection (Figs. 11d–f). Both model configurations capture this slower storm growth rate fairly well, but both also prolong the growth period by an hour. In addition, the HRRRv4 storm area has a high baseline at the start of the case period, resulting in a consistent high bias throughout the life cycle of total storm area. The HRRR-E has a lower baseline and generally follows the observed trends fairly well through most of the cluster life cycle of about 15 h (Table 1). Both HRRRv4 and HRRR-E produce too few storms during the initiation and growth phase, with HRRRv4 finally catching up to the observations around 9 h into the case period and HRRR-E counts falling short throughout the cluster storm life cycle. This general underprediction in storm counts and overprediction of storm area in HRRRv4 results in model storms generally being too large. As in the cellular mode, the dissipation stage is slightly delayed in both model systems, but follows a rate similar to that observed. Thus, storm clusters persist longer than observed.

It is not surprising that such large biases are evident in the prediction of storm cell counts and sizes in these two categories. Generally, both modeling systems predicted too few storms that tended to be too big and had a somewhat delayed evolution. These biases are likely due, in part, to the inability of the models to fully resolve the kinematic processes and thermodynamic structures that influence storm evolution, since the WRF-ARW has a known effective resolution of 5–7Δx (Skamarock 2004), or roughly 15–21 km at 3-km grid spacing. In fact, the observed maximum mean storm size was less than 250 km2 for more than half of the cellular cases. While the maximum size of storms during cluster cases tended to be larger (i.e., >430 km2), this category typically has many storms far below 250 km2, particularly during the initial stages of storm development.

Both modeling systems perform notably better for all three of the larger-scale storm categories. Observations indicate that the QLCS and MCS categories are characterized by a longer storm growth phase than the smaller storm modes, with QLCSs having a larger amplitude in temporal variability of total storm area. Both modeling systems more accurately represent the evolution of total storm area for the QLCS mode (Figs. 12a–c) than the MCS mode (Figs. 12d–f). However, both also tend to be too slow to dissipate storms during the early dissipation phase, resulting in individual storm area ratios remaining too large, particularly in the QLCS category (see hours 10–20).

MCSs are the result of upscale growth via organization of convective elements into a long-lived self-sustaining system that includes a large area of leading or trailing stratiform precipitation. All seven of the MCS cases in this study included a substantial region of trailing stratiform precipitation (Fig. 2). The HRRR-E tends to be a bit slow in ramping up the total storm area, while the HRRRv4 captures the growth rate of MCSs fairly well initially, but is too slow to taper the growth rate between event hours 8–14, resulting in HRRRv4 overestimating the peak total storm area by nearly 50% (Fig. 12d). Part of this overestimation is the tendency for HRRRv4 to generate too many storms during the period of peak storm activity (hours 12–20), as seen in the delayed peak in storm counts at hour 18 (Fig. 12e). In terms of individual storm area ratio, the HRRRv4 better captures the evolution of initial individual storm sizes than the HRRR-E, which tends to have MCSs that are larger than observed initially (hours 0–6, Fig. 12f). Both modeling systems capture the evolution of the individual storm area ratio of the MCSs after about 6 h into the event period.

As mentioned above, the TC category is unique because of the way the verification period is defined for this storm type. The feature that sets this category apart from the others is that the convective system is already mature at landfall, the start time of the sampling period. Despite starting at the mature phase, the observed total storm area still increases by over 50% during the first 12 h of the verification period due to a combination of the storm moving into the verification domain and possibly an expansion of convective precipitation coverage (Fig. 12g). Both the HRRR-E and HRRRv4 tend to start with a slightly larger storm system and show more rapid growth during the first 12 h, with the HRRRv4 having the largest growth rate. As a result, the HRRRv4 prediction of TC total storm area increases by over 110%, while the HRRR-E storm area expands by nearly 60%. Both model systems generally match the observed decay rate, with the total storm area decreasing by ∼50% during the 12–24-h time frame. Some of the bias in HRRRv4-predicted total storm area is due to the generation of too many storms (Fig. 12h). Interestingly, the HRRR-E does a good job with storm counts in the TC convective mode, with the ensemble nicely bracketing the observed storm counts. Despite matching the individual storm area ratio at the beginning of the verification period, both the HRRRv4 and HRRR-E diverge from the observed individual storm area ratio after 4 h, with both modeling systems producing individual storms that are ∼50% larger than observed by 24 h (Fig. 12i).

4. Evaluation of model spread

As discussed above, the 35 cases were selected based on the observed convective mode, with the validation region for each case outlined in Fig. 6. Here we assess model performance at predicting the predominant convective storm mode for each of the non-TC cases at the time of maximum observed storm coverage. The valid time chosen matches the valid time of the images shown in Fig. 2 and listed in Table 1. While the analyses of macroscale storm properties given in section 3 used the most recent available forecast issue time, the analyses of convective mode utilized the two most recent available forecasts in order to increase the sample size, since each case contributes a single sample (for a total of 14 samples per convective mode: 7 cases × 2 forecast generation times). The dominant convective mode was selected based on subjective inspection of the forecast images. In cases for which there was a nearly equal mix of convective modes, the more organized mode was used. The QLCS category includes a mix of broken and solid lines. To allow some leeway in the assessment, the validation window was expanded to include lead times within 2 h of the valid time, and for the MCS and QLCS categories the centroid distance between the modeled and observed storm objects had to be less than 150 km. If the modeled convective system fell outside the window, the forecast was classified as a miss. For the cluster and cellular storm modes, the modeled coverage within the verification region had to be at least 50% of the observed coverage to be considered a hit.
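The hit criteria described above can be summarized in a simplified check like the following sketch. The function and argument names are hypothetical, and the actual classification in the study was performed subjectively by inspecting forecast images.

```python
# Illustrative check of the hit criteria: the predicted mode must match the
# observed mode within a 2-h window; for QLCS/MCS the storm centroid must lie
# within 150 km of the observed one; for cellular/cluster the modeled coverage
# must be at least 50% of the observed coverage.
def is_mode_hit(pred_mode, obs_mode, lead_offset_h,
                centroid_dist_km=None, coverage_ratio=None):
    if pred_mode != obs_mode or abs(lead_offset_h) > 2:
        return False
    if obs_mode in ("QLCS", "MCS"):
        return centroid_dist_km is not None and centroid_dist_km < 150.0
    if obs_mode in ("cluster", "cellular"):
        return coverage_ratio is not None and coverage_ratio >= 0.5
    return False
```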

Using these criteria, each ensemble member valid at the time of peak observed storm coverage was inspected within the verification window, and the predominant storm mode was classified and tabulated. Figure 13 indicates the performance of the ensemble at predicting storm convective mode. For each observed category, the ensemble mean probability of each convective storm mode is given. Averaged across all four storm categories, the ensemble members predicted the correct convective mode roughly 65% of the time, with only the cellular category having a mean probability of less than 60%. Complete misses (i.e., no convection predicted within 2 h of the observed peak storm coverage, or modeled cluster/cellular coverage less than 50% of the observed coverage) were rare for all but the cellular convective mode, for which over 30% of the members produced no storms. Overall, the storm cluster mode was the most commonly predicted convective mode, occurring nearly 30% of the time among all 56 forecast samples. Interestingly, the QLCS and MCS categories were clearly distinct, indicating that the ensemble members tended to have the correct larger-scale forcing that produced the correct storm organization of these larger systems. The distinction between cellular and cluster modes was also quite good, with just a 10% misclassification rate for each category. Based on this relatively small sample size, it appears that the ensemble does an excellent job of both predicting whether or not convection will occur and, if so, correctly identifying the convective mode over 60% of the time.

Fig. 13. Mean probability of storm convective mode as a function of observed dominant convective mode at time of peak storm coverage for QLCS, MCS, cluster, and cellular modes. Ensemble data collected from the two most recent forecasts valid at time of peak convective mode resulted in a sample size of 56 cases. See detailed description of this analysis in the main text.

While getting the convective mode correct is an important first-order aspect of ensemble model skill, further assessment of the macroscale properties (counts and sizes) within each category revealed notable biases that were consistent across all of the ensemble members. These biases are a function of convective mode and the phase of storm evolution (growth, mature, decay), as seen in Figs. 10–12 (middle and bottom panels). For example, the ensemble spread (as given by the minimum and maximum values) in the predicted number of storm objects for the cellular storm category is broader during the initiation phase than the dissipation phase, while for the cluster category the ensemble spread remained relatively constant with time (Fig. 11e). However, for both storm categories, the observations fell outside of the ensemble spread for large portions of time. The worst correspondence is evident in the cellular mode category, within which the observed normalized attribute falls well outside of the ensemble spread most of the time. In contrast, the observed attributes tend to fall within the ensemble spread for the three larger storm categories, aided somewhat by larger ensemble spread (Figs. 12a–f).

The rank histogram metric further assesses the skill of the ensemble in capturing forecast uncertainty by ranking the observed attribute among the 9 ensemble members. A rank of zero indicates the observed attribute is smaller than that found for all ensemble members, while a rank of 9 indicates that the observation was larger than all of the members. The best characterization of forecast uncertainty occurs when the frequency of each rank is identical, at 10% for a 9-member ensemble (i.e., rank uniformity; Wilks 2006). The rank histograms, shown in Fig. 14, are created by accumulating counts across all case hours listed in Table 1 for each convective mode. Among all cases, it is evident that the ensemble underestimates the spread and is thus underdispersive for both total storm area and individual storm area ratio. The shape of the rank histogram for storm count (i.e., skewed toward the higher ranks) also indicates that the ensemble generally predicts too few storms. Likewise, the tendency for the observed individual storm area ratios to rank lowest indicates the models’ tendency to predict storms that are generally too large. Closer inspection of Fig. 14 reveals that these general biases (too few storms that are too big) tend to be exacerbated in the cellular and cluster storm categories. Interestingly, the TC category tends to exhibit the largest bias in terms of total area, with most members being larger than observed (i.e., the observed TC is smaller than all members 40% of the time). In contrast, the HRRR-E performed best at capturing the spread in individual storm area ratio for the MCS convective storm mode, with a nearly uniform distribution at about 10%.
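A minimal sketch of the rank-histogram computation for a 9-member ensemble is given below (array names are hypothetical):

```python
# Sketch of the rank-histogram calculation: for each case hour, the observed
# value is ranked among the nine ensemble members (rank 0 = smaller than all
# members, rank 9 = larger than all), and counts are accumulated over hours.
import numpy as np

def rank_histogram(obs, members):
    """
    obs     : array (n_samples,) of observed values.
    members : array (n_samples, n_members) of ensemble values.
    Returns counts of each rank 0..n_members.
    """
    ranks = np.sum(members < obs[:, None], axis=1)   # members falling below obs
    return np.bincount(ranks, minlength=members.shape[1] + 1)
```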

Fig. 14. Rank histograms for (a) total storm area, (b) storm count, and (c) individual storm area ratio obtained from all 35 cases (gray bars) and for each convective storm mode category listed (colored lines). Model data are from the verification domains shown in Fig. 6 for the valid times listed in Table 1. The sample size across all times included as part of the 35 cases is 658.

5. Summary and conclusions

In this study, we performed a subjectively guided, object-based evaluation of a convection-allowing ensemble model and its deterministic counterpart. The HRRR-E and HRRRv4 were evaluated for a set of 35 convective events over the contiguous United States east of 105°W. Cases were selected in one of five categories based on their primary convective organization: cellular, cluster, QLCS, MCS, and land-falling TC, with case selection limited by the availability of HRRR-E data. The skill of the HRRR-E in predicting convective mode was assessed by performing a subjective analysis of the 28 nontropical cases.

Averaged across all 35 cases, both the HRRR-E and HRRRv4 predicted storm areas were generally larger than observed, with the magnitude of this bias being a function of time into the event. The overforecasting bias was primarily due to the individual storms being too large, with storms predicted by the HRRR-E ensemble mean being 30% larger than observed. The HRRRv4 storm size typically fell within the upper quartile of the HRRR-E. The time-dependent variations in bias can be related to biases in the models’ handling of storm initiation and dissipation. Both model systems failed to capture the rapid increase in storm counts associated with the cellular and cluster modes, signaling an underforecast of storm initiation events. This delay in ramping up storm counts in the HRRRv4 is followed by underestimation of the storm dissipation rate resulting in an overforecast of the storm area during the decay phase. The biases in both model systems tend to be dominated by biases evident in the smaller storm categories (i.e., cellular and cluster convective modes).

The bias in the individual storm area ratio was fairly constant throughout the storm life cycle, while the bias in the total storm area was a function of time of day (and forecast lead time) in both modeling systems. This bias is again exacerbated in the HRRRv4, where the modeled storm area is nearly 55% larger than observed. It is believed that this overprediction of storm sizes may, in part, be related to bias in the model-simulated radar reflectivity, which has been shown to be overestimated in tropical convection due to a tendency of the Thompson microphysics scheme to produce too many large ice particles in deep tropical convection (Stanford et al. 2017). In fact, we found the most consistent and largest overforecasting bias in the land-falling TC convective mode, which is, again, consistent with Stanford et al. (2017).

Both model systems tend to have delayed storm initiation, as manifested in storm counts that substantially underestimate the initiation of new storms. Thus, while the counts are underforecast, the bias in storm sizes compensates, resulting in smaller overestimates of total storm area during the initiation and growth phase of the storm life cycle. Much of this underprediction in storm counts occurs within the cellular and cluster categories. However, due to the compensating overprediction of individual storm size, the total storm area ends up peaking much higher than observed. Application of this object-based evaluation to compare the performance of an ensemble with its deterministic counterpart also revealed notable differences between the two model systems. In general, the HRRRv4 tended to lie within or above the upper quartile of the HRRR-E in predictions of total storm area. While both systems overpredict the size of the smaller storms, the HRRRv4 also tended to predict too many storms, with its counts often falling outside the range of values predicted by the HRRR-E.

Both the HRRR-E and HRRRv4 did remarkably well at predicting convective mode, but the HRRR-E was generally underdispersive (U-shaped rank histograms) in capturing the true range of potential macrophysical storm properties. The subjective assessment of storm mode prediction (excluding TCs) revealed that the HRRR-E predicted the correct convective mode roughly 65% of the time, with the cellular mode being the most challenging to predict and the cluster mode the most often correctly predicted. TCs were always captured by both the HRRRv4 and the HRRR-E, since in every case the TC already existed at model initialization time.

The ability of the HRRR-E to correctly portray the uncertainty, or spread, in predicted storm macrophysical properties was explored using rank histograms (Fig. 14). Averaged across all convective modes, the ensemble predictions of both individual and total storm area had U-shaped distributions, indicative of an underdispersive (i.e., overconfident) ensemble. This is consistent with previous studies that examined rank histograms for precipitation accumulation in WRF-ARW CAMs (e.g., Clark 2019). However, the shape of the distribution varied by convective mode, with the TC category having an overforecasting bias in total area, while in the QLCS category the ensemble had an underforecasting bias (storm area too low). Similar trends are evident in the individual storm area ratio, but the overforecasting bias is most extreme for the cellular convection category, while the HRRR-E best captures the true uncertainty (i.e., a flat rank histogram) in the size of individual storms for the MCS mode. The rank histogram for storm counts indicates a consistent underforecasting bias in the HRRR-E, particularly for the smaller convective modes. Further work is needed to explain the generally underdispersive nature of the HRRR-E storm macroscale characteristics.
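
For reference, a rank histogram is built by counting, for each case, how many ensemble members fall below the verifying observation; a flat histogram indicates appropriate spread, a U shape indicates underdispersion, and a one-sided histogram indicates bias (e.g., Wilks 2006). The sketch below is a generic implementation assuming a nine-member ensemble and synthetic data; it is not the specific verification code used in this study.

    import numpy as np

    def rank_histogram(observations, ensemble_forecasts, seed=0):
        """Rank histogram counts for an ensemble.

        observations: array of shape (n_cases,)
        ensemble_forecasts: array of shape (n_cases, n_members)
        Returns counts of length n_members + 1; ties between the observation
        and member values are broken randomly.
        """
        obs = np.asarray(observations, dtype=float)
        ens = np.asarray(ensemble_forecasts, dtype=float)
        n_members = ens.shape[1]
        counts = np.zeros(n_members + 1, dtype=int)
        rng = np.random.default_rng(seed)
        for o, members in zip(obs, ens):
            n_below = int(np.sum(members < o))
            n_ties = int(np.sum(members == o))
            counts[n_below + rng.integers(0, n_ties + 1)] += 1
        return counts

    # Synthetic example: 658 cases and 9 members, with observations drawn from
    # a wider distribution than the ensemble, producing a U-shaped histogram
    rng = np.random.default_rng(1)
    ens = rng.normal(0.0, 0.5, size=(658, 9))
    obs = rng.normal(0.0, 1.0, size=658)
    print(rank_histogram(obs, ens))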

The HRRRv4 and HRRR-E share the same WRF Model and configuration (dynamics and numerics) and are initialized from the same hourly cycled, convection-allowing ensemble (Dowell et al. 2021, manuscript submitted to Wea. Forecasting). Therefore, the differences in statistics between the two systems (e.g., total storm area and total storm count in Fig. 10) are interesting and worthy of future study. Here, we can only speculate about these differences. At short lead times (<6 h), the HRRR-E members had a higher 35-dBZ reflectivity bias than the HRRRv4 in the eastern United States during the 2020 warm season (not shown). We speculate that the different initialization methods (Dowell et al. 2021, manuscript submitted to Wea. Forecasting), particularly regarding the use of radar-reflectivity data, contribute to these differences at short lead times. At long lead times (6–36 h), the opposite was true in 2020; the HRRR-E members had a lower 35-dBZ reflectivity bias than the HRRRv4 (not shown). One possibility is that the combined effects of all the perturbations in the HRRR-E members (initial- and boundary-condition perturbations, SPP, and SPPT) reduce the number of convective storms. Another relevant factor could be that the HRRRv4 receives its boundary conditions from the Rapid Refresh model, whereas perturbed GFS forecasts provide boundary conditions for the HRRR-E; any forecast differences resulting from differences in boundary conditions would be most significant at the longest lead times. Last, since both the HRRRv4 and HRRR-E were experimental systems during the 2020 warm season, glitches in real-time runs could have affected a small number of cases in this study.
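
The 35-dBZ reflectivity bias referenced above can be interpreted as a frequency (coverage) bias: the ratio of the number of forecast grid points at or above 35 dBZ to the number observed, aggregated as a function of lead time. The sketch below illustrates that aggregation with randomly generated fields; the array names and shapes are illustrative only and do not reflect the HRRR output format or the exact statistic computed for this study.

    import numpy as np

    def coverage_bias_by_lead_time(forecast_refl, observed_refl, threshold_dbz=35.0):
        """Frequency (coverage) bias of reflectivity exceeding a threshold.

        forecast_refl, observed_refl: arrays of shape (n_lead_times, ny, nx)
        Returns an array of length n_lead_times; values > 1 mean the forecast
        covers more grid points at or above the threshold than observed.
        """
        fcst_counts = (forecast_refl >= threshold_dbz).sum(axis=(1, 2))
        obs_counts = (observed_refl >= threshold_dbz).sum(axis=(1, 2))
        return fcst_counts / np.maximum(obs_counts, 1)  # avoid division by zero

    # Illustrative use with random fields for 36 hourly lead times on a small grid
    rng = np.random.default_rng(2)
    fcst = rng.uniform(0.0, 60.0, size=(36, 100, 100))
    obs = rng.uniform(0.0, 60.0, size=(36, 100, 100))
    print(coverage_bias_by_lead_time(fcst, obs)[:6])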

In future work, we plan to test some of these ideas with controlled retrospective reruns of the HRRRv4 and HRRR-E for the 2020 cases, as the results could provide new insight into the HRRR design and into the future implementation of the RRFS and ensembles thereof. In addition, work is needed to determine the causes of the systematic biases identified here: the tendency of the HRRRv4 and HRRR-E to produce too few storms, to overpredict storm size and total area, and to delay storm evolution. Future work could also expand the evaluation of the probabilistic information provided by the HRRR-E. Additional studies should intercompare the HRRR and HRRR-E with the next generation of mesoscale prediction systems (i.e., RRFS and RRFS-E) that are planned for operational implementation within the next 3 years.

Acknowledgments

This research has been conducted in response to requirements of the Federal Aviation Administration (FAA) and has been supplemented by funding provided by the National Science Foundation (NSF) to NCAR. The views expressed are those of the authors and do not necessarily represent the official policy or position of either the FAA or NSF.

Data availability statement.

MRMS composite reflectivity data were available in real time from https://www.nssl.noaa.gov/projects/mrms/. HRRR Ensemble and HRRRv4 data were obtained from NOAA GSL in real time via an LDM feed and are stored at NCAR. Documentation of the HRRR-E can be found in Dowell et al. (2021, manuscript submitted to Wea. Forecasting). Additional HRRRv4 documentation and real-time data access can be found online at https://rapidrefresh.noaa.gov/hrrr/.

REFERENCES

  • Benjamin, S. G., and Coauthors, 2016: A North American hourly assimilation and model forecast cycle: The Rapid Refresh. Mon. Wea. Rev., 144, 1669–1694, https://doi.org/10.1175/MWR-D-15-0242.1.
  • Berner, J., K. R. Fossell, S.-Y. Ha, J. P. Hacker, and C. Snyder, 2015: Increasing the skill of probabilistic forecasts: Understanding performance improvements from model-error representation. Mon. Wea. Rev., 143, 1295–1320, https://doi.org/10.1175/MWR-D-14-00091.1.
  • Clark, A. J., 2019: Comparisons of QPFs derived from single- and multicore convection-allowing ensembles. Wea. Forecasting, 34, 1955–1964, https://doi.org/10.1175/WAF-D-19-0128.1.
  • Clark, A. J., R. G. Bullock, T. L. Jensen, M. Xue, and F. Kong, 2014: Application of object-based time-domain diagnostics for tracking precipitation systems in convection-allowing models. Wea. Forecasting, 29, 517–542, https://doi.org/10.1175/WAF-D-13-00098.1.
  • Coniglio, M. C., J. Y. Hwang, and D. J. Stensrud, 2010: Environmental factors in the upscale growth and longevity of MCSs derived from Rapid Update Cycle analyses. Mon. Wea. Rev., 138, 3514–3539, https://doi.org/10.1175/2010MWR3233.1.
  • Coniglio, M. C., J. Correia Jr., P. T. Marsh, and F. Kong, 2013: Verification of convection-allowing WRF model forecasts of the planetary boundary layer using sounding observations. Wea. Forecasting, 28, 842–862, https://doi.org/10.1175/WAF-D-12-00103.1.
  • Dabberdt, W. F., and Coauthors, 2005: Multifunctional mesoscale observing networks. Bull. Amer. Meteor. Soc., 86, 961–982, https://doi.org/10.1175/BAMS-86-7-961.
  • Davis, C., B. Brown, and R. Bullock, 2006: Object-based verification of precipitation forecasts. Part I: Methods and application to mesoscale rain areas. Mon. Wea. Rev., 134, 1772–1784, https://doi.org/10.1175/MWR3145.1.
  • DeMaria, M., C. R. Sampson, J. A. Knaff, and K. D. Musgrave, 2014: Is tropical cyclone intensity guidance improving? Bull. Amer. Meteor. Soc., 95, 387–398, https://doi.org/10.1175/BAMS-D-12-00240.1.
  • Dixon, M., and G. Wiener, 1993: TITAN: Thunderstorm Identification, Tracking, Analysis, and Nowcasting—A radar-based methodology. J. Atmos. Oceanic Technol., 10, 785–797, https://doi.org/10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2.
  • Done, J., C. A. Davis, and M. L. Weisman, 2004: The next generation of NWP: Explicit forecasts of convection using the Weather Research and Forecast (WRF) model. Atmos. Sci. Lett., 5, 110–117, https://doi.org/10.1002/asl.72.
  • Dowell, D., 2020: HRRR Data-Assimilation System (HRRRDAS) and HRRRE forecasts. NOAA, 8 pp., https://rapidrefresh.noaa.gov/internal/pdfs/2020_Spring_Experiment_HRRRE_Documentation.pdf.
  • Dowell, D., and Coauthors, 2016: Development of a High-Resolution Rapid Refresh Ensemble (HRRRE) for severe weather forecasting. 28th Conf. on Severe Local Storms, Portland, OR, Amer. Meteor. Soc., 8B.2, https://ams.confex.com/ams/28SLS/webprogram/Paper301555.html.
  • Duda, J. D., and W. A. Gallus Jr., 2013: The impact of large-scale forcing on skill of simulated convective initiation and upscale evolution with convection-allowing grid spacings in the WRF. Wea. Forecasting, 28, 994–1018, https://doi.org/10.1175/WAF-D-13-00005.1.
  • Elsberry, R. L., 2014: Advances in research and forecasting tropical cyclones from 1963–2013. Asia-Pac. J. Atmos. Sci., 50, 3–16, https://doi.org/10.1007/s13143-014-0001-1.
  • Federal Aviation Administration, 2015: FAQ: Weather delay. FAA, accessed 25 June 2021, https://www.faa.gov/nextgen/programs/weather/faq/.
  • Fritsch, J. M., and R. E. Carbone, 2004: Improving quantitative precipitation forecasts in the warm season: A USWRP research and development strategy. Bull. Amer. Meteor. Soc., 85, 955–966, https://doi.org/10.1175/BAMS-85-7-955.
  • Gallo, B. T., and Coauthors, 2017: Breaking new ground in severe weather prediction: The 2015 NOAA/Hazardous Weather Testbed Spring Forecasting Experiment. Wea. Forecasting, 32, 1541–1568, https://doi.org/10.1175/WAF-D-16-0178.1.
  • Gallo, B. T., and Coauthors, 2019: Initial development and testing of a convection-allowing model scorecard. Bull. Amer. Meteor. Soc., 100, ES367–ES384, https://doi.org/10.1175/BAMS-D-18-0218.1.
  • Gallus, W. A., Jr., 2010: Application of object-based verification techniques to ensemble precipitation forecasts. Wea. Forecasting, 25, 144–158, https://doi.org/10.1175/2009WAF2222274.1.
  • Gallus, W. A., Jr., N. A. Snook, and E. V. Johnson, 2008: Spring and summer severe weather reports over the Midwest as a function of convective mode: A preliminary study. Wea. Forecasting, 23, 101–113, https://doi.org/10.1175/2007WAF2006120.1.
  • Jankov, I., and Coauthors, 2019: Stochastically perturbed parameterizations in an HRRR-based ensemble. Mon. Wea. Rev., 147, 153–173, https://doi.org/10.1175/MWR-D-18-0092.1.
  • Johnson, A., and X. Wang, 2013: Object-based evaluation of a storm-scale ensemble during the 2009 NOAA Hazardous Weather Testbed Spring Experiment. Mon. Wea. Rev., 141, 1079–1098, https://doi.org/10.1175/MWR-D-12-00140.1.
  • Johnson, A., X. Wang, F. Kong, and M. Xue, 2013: Object-based evaluation of the impact of horizontal grid spacing on convection-allowing forecasts. Mon. Wea. Rev., 141, 3413–3425, https://doi.org/10.1175/MWR-D-13-00027.1.
  • Kain, J. S., and Coauthors, 2003: Collaboration between forecasters and research scientists at the NSSL and SPC: The Spring Program. Bull. Amer. Meteor. Soc., 84, 1797–1806, https://doi.org/10.1175/BAMS-84-12-1797.
  • Kain, J. S., S. J. Weiss, J. J. Levit, M. E. Baldwin, and D. R. Bright, 2006: Examination of convection-allowing configurations of the WRF Model for the prediction of severe convective weather: The SPC/NSSL Spring Program 2004. Wea. Forecasting, 21, 167–181, https://doi.org/10.1175/WAF906.1.
  • Kalina, E. A., and Coauthors, 2021: A progress report on the development of the High-Resolution Rapid Refresh ensemble. Wea. Forecasting, 36, 791–804, https://doi.org/10.1175/WAF-D-20-0098.1.
  • Lilly, D. K., 1990: Numerical prediction of thunderstorms—Has its time come? Quart. J. Roy. Meteor. Soc., 116, 779–798, https://doi.org/10.1002/qj.49711649402.
  • Matyas, C. J., 2010: Associations between the size of hurricane rain fields at landfall and their surrounding environments. Meteor. Atmos. Phys., 106, 135–148, https://doi.org/10.1007/s00703-009-0056-1.
  • Pinto, J. O., J. A. Grim, and M. Steiner, 2015: Assessment of the High-Resolution Rapid Refresh model’s ability to predict mesoscale convective systems using object-based evaluation. Wea. Forecasting, 30, 892–913, https://doi.org/10.1175/WAF-D-14-00118.1.
  • Potvin, C. K., and Coauthors, 2019: Systematic comparison of convection-allowing models during the 2017 NOAA HWT Spring Forecasting Experiment. Wea. Forecasting, 34, 1395–1416, https://doi.org/10.1175/WAF-D-19-0056.1.
  • Ribeiro, B. Z., S. J. Weiss, and L. F. Bosart, 2021: An analysis of the low predictability associated with the high-impact 10 August 2020 derecho over the Midwest United States. Mesoscale Processes Across Scales: Engaging with Communities in the Physical and Social Sciences, New Orleans, LA, Amer. Meteor. Soc., P321, https://ams.confex.com/ams/101ANNUAL/meetingapp.cgi/Paper/380875.
  • Roberts, R. D., and S. Rutledge, 2003: Nowcasting storm initiation and growth using GOES-8 and WSR-88D data. Wea. Forecasting, 18, 562–584, https://doi.org/10.1175/1520-0434(2003)018<0562:NSIAGU>2.0.CO;2.
  • Romine, G. S., and Coauthors, 2014: Representing forecast error in a convection-permitting ensemble system. Mon. Wea. Rev., 142, 4519–4541, https://doi.org/10.1175/MWR-D-14-00100.1.
  • Schwartz, C. S., and Coauthors, 2009: Next-day convection-allowing WRF Model guidance: A second look at 2-km versus 4-km grid spacing. Mon. Wea. Rev., 137, 3351–3372, https://doi.org/10.1175/2009MWR2924.1.
  • Schwartz, C. S., and Coauthors, 2010: Toward improved convection-allowing ensembles: Model physics sensitivities and optimizing probabilistic guidance with small ensemble membership. Wea. Forecasting, 25, 263–280, https://doi.org/10.1175/2009WAF2222267.1.
  • Schwartz, C. S., and Coauthors, 2017: Toward 1-km ensemble forecasts over large domains. Mon. Wea. Rev., 145, 2943–2969, https://doi.org/10.1175/MWR-D-16-0410.1.
  • Skamarock, W. C., 2004: Evaluating mesoscale NWP models using kinetic energy spectra. Mon. Wea. Rev., 132, 3019–3032, https://doi.org/10.1175/MWR2830.1.
  • Skinner, P. S., L. Wicker, D. M. Wheatley, and K. H. Knopfmeier, 2016: Application of two spatial verification methods to ensemble forecasts of low-level rotation. Wea. Forecasting, 31, 713–735, https://doi.org/10.1175/WAF-D-15-0129.1.
  • Skinner, P. S., and Coauthors, 2018: Object-based verification of a prototype Warn-on-Forecast System. Wea. Forecasting, 33, 1225–1250, https://doi.org/10.1175/WAF-D-18-0020.1.
  • Smith, A. B., 2021: U.S. billion-dollar weather and climate disasters (2020). NOAA National Centers for Environmental Information (NCEI), accessed 25 June 2021, https://www.climate.gov/disasters2020.
  • Smith, B. T., R. L. Thompson, J. S. Grams, C. Broyles, and H. E. Brooks, 2012: Convective modes for significant severe thunderstorms in the contiguous United States. Part I: Storm classification and climatology. Wea. Forecasting, 27, 1114–1135, https://doi.org/10.1175/WAF-D-11-00115.1.
  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 1617–1630, https://doi.org/10.1175/BAMS-D-14-00173.1.
  • Sobash, R. A., C. S. Schwartz, G. S. Romine, K. R. Fossell, and M. L. Weisman, 2016: Severe weather prediction using storm surrogates from an ensemble forecasting system. Wea. Forecasting, 31, 255–271, https://doi.org/10.1175/WAF-D-15-0138.1.
  • Stanford, M. W., A. Varble, E. Zipser, J. W. Strapp, D. Leroy, A. Schwarzenboeck, R. Potts, and A. Protat, 2017: A ubiquitous ice size bias in simulations of tropical deep convection. Atmos. Chem. Phys., 17, 9599–9621, https://doi.org/10.5194/acp-17-9599-2017.
  • Weisman, M. L., and J. B. Klemp, 1986: Characteristics of isolated convective storms. Mesoscale Meteorology and Forecasting, P. S. Ray, Ed., Amer. Meteor. Soc., 331–358, https://doi.org/10.1007/978-1-935704-20-1_15.
  • Weisman, M. L., W. C. Skamarock, and J. B. Klemp, 1997: The resolution dependence of explicitly modeled convective systems. Mon. Wea. Rev., 125, 527–548, https://doi.org/10.1175/1520-0493(1997)125<0527:TRDOEM>2.0.CO;2.
  • Weisman, M. L., C. A. Davis, W. Wang, K. W. Manning, and J. B. Klemp, 2008: Experiences with 0–36-h explicit convective forecasts with the WRF-ARW Model. Wea. Forecasting, 23, 407–437, https://doi.org/10.1175/2007WAF2007005.1.
  • Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. International Geophysics Series, Vol. 100, Academic Press, 648 pp.
Fig. 1. Availability of the nine HRRR-E members and single HRRRv4 member during the period 0000 UTC 27 Apr 2020–1200 UTC 26 Sep 2020, where the shaded color indicates the number of members available via LDM feed at each forecast generation time (y axis) and lead time (x axis).

Fig. 2. Radar depictions from the College of DuPage of the 35 convective cases during the mature stage of each event. The cases are ordered by date and event type.

Fig. 3. Hourly mean frequency of pixels within 5-dBZ-wide bins for all times of the 35 cases, for MRMS (thick, dashed black), HRRRv4 (thick, dashed navy blue), and HRRR-E (thin rainbow colors) over the continental United States east of 105°W for all forecast lead times.

Fig. 4. HRRR-E member 8 grid of composite reflectivity greater than 35 dBZ (blue and green shades) for the 27-h forecast of the 0000 UTC 28 Apr 2020 model generation time. The TITAN-identified objects are outlined in red, while the area of each storm within a TITAN object is indicated by blue, and tiny sections of storms outside of TITAN objects are shaded green.

Fig. 5. All areas of MRMS (gray) and HRRR-E (rainbow colors, one for each HRRR member) composite reflectivity exceeding 35 dBZ during the duration of the case on 27–28 Aug 2020. The black outline indicates the subjectively defined event area encompassing the case.

Fig. 6. Polygons outlining the maximum extent of each of the 35 cases used in this study. Lines are colored by date from 27 Apr to 27 Sep 2020: blue–green–yellow–orange–red–maroon. Semitransparent red color fill indicates the validation region used for the 10 Aug 2020 derecho MCS case.

Fig. 7. Probability distribution function (PDF) of MRMS storm area, as defined by reflectivity > 35 dBZ over the United States east of 105°W. The black line is the PDF for all 35 cases in this study, while the red line is for all storms between 0000 UTC 27 Apr and 2300 UTC 27 Sep 2020.

Fig. 8. Composite reflectivity from MRMS, HRRRv4, and the best- and worst-performing HRRR-E members for the 10 Aug 2020 derecho MCS event. Composite reflectivity is valid 2100 UTC 10 Aug 2020, while HRRRv4 and HRRR-E forecasts are from simulation hour 21, initialized at 0000 UTC and valid at 2100 UTC 10 Aug 2020. The 35-dBZ threshold used for identifying storms is indicated by the sharp blue–yellow transition in the reflectivity scale.

Fig. 9. Time series of (a) total storm area from MRMS (thick black), HRRRv4 deterministic (thick blue), and each HRRR-E member (thin rainbow colors) for the 24-h verification period of the derecho MCS event, from the 0000 UTC 10 Aug 2020 generation time. (b) As in (a), but for storm count. (c) As in (a), but for mean storm area. Brown vertical dashed line marks the time shown in Fig. 8.

Fig. 10. Normalized storm metrics for all 35 cases for MRMS (black), HRRRv4 deterministic (blue), and HRRR-E members (shades of pink, representing minimum, lower quartile, median, upper quartile, and maximum). (a) Total storm area, (b) storm count, and (c) individual storm area ratio.

Fig. 11. Evolution of smaller-scale storm modes: (left) cellular mode and (right) cluster mode of (top) normalized total storm area, (middle) normalized total storm count, and (bottom) normalized individual storm area ratio from MRMS (black), HRRRv4 deterministic (blue), and HRRR-E members (shades of pink, representing minimum, lower quartile, median, upper quartile, and maximum).

Fig. 12. As in Fig. 11, but for the larger storm modes: (left) QLCS mode, (center) MCS mode, and (right) tropical cyclone mode.

Fig. 13. Mean probability of storm convective mode as a function of observed dominant convective mode at time of peak storm coverage for QLCS, MCS, cluster, and cellular modes. Ensemble data collected from the two most recent forecasts valid at time of peak convective mode resulted in a sample size of 56 cases. See detailed description of this analysis in the main text.

Fig. 14. Rank histograms for (a) total storm area, (b) storm count, and (c) individual storm area ratio obtained from all 35 cases (gray bars) and for each convective storm mode category listed (colored lines). Model data are from the verification domains shown in Fig. 6 for the valid times listed in Table 1. The sample size across all times included as part of the 35 cases is 658.
