A field research campaign, the Hail Spatial and Temporal Observing Network Effort (HailSTONE), was designed to obtain physical high-resolution hail measurements at the ground associated with convective storms to help address several operational challenges that remain unsatisfied through public storm reports. Field phases occurred over a 5-yr period, yielding hail measurements from 73 severe thunderstorms [hail diameter ≥ 1.00 in. (2.54 cm)]. These data provide unprecedented insight into the hailfall character of each storm and afford a baseline to explore the representativeness of the climatological hail database and hail forecasts in NWS warning products. Based upon the full analysis of HailSTONE observations, hail sizes recorded in Storm Data as well as hail size forecasts in NWS warnings frequently underestimated the maximum diameter hailfall occurring at the surface. NWS hail forecasts were generally conservative in size and at least partially calibrated to incoming hail reports. Storm mode played a notable role in determining the potential range of maximum hail size during the life span of each storm. Supercells overwhelmingly produced the largest hail diameters, with smaller maximum hail sizes observed as convection became progressively less organized. Warning forecasters may employ a storm-mode hail size forecast philosophy, in conjunction with other radar-based hail detection techniques, to better anticipate and forecast hail sizes during convective warning episodes.
1. Introduction
Warning forecasters at the National Weather Service (NWS) are tasked with predicting the maximum-diameter hail size for any thunderstorm expected to produce hail ≥ 1.00 in. (2.54 cm) (NWS 2014). Radar-based methods incorporating both reflectivity and velocity data from single-site radars in conjunction with knowledge of the near-storm environment have served as the primary means to forecast hail size in convective warnings over the past several decades (Donavon and Jungbluth 2007; Blair et al. 2011; WDTB 2016). Storm reports from a variety of sources also play a critical role in the NWS warning decision process by providing “ground truth” verification for ongoing weather events, thereby assisting warning forecasters in calibrating radar signatures to large hail reports (Lindley and Morgan 2004). After the conclusion of a severe weather event, these hail reports are compiled into Storm Data, a publication containing a vast collection of reports that serves as the most comprehensive U.S. severe weather climatological database available (Schaefer and Edwards 1999; Allen and Tippett 2015). Numerous research projects have utilized these hail data to examine seasonal and climatological trends, risk analysis, and other hazard mitigation plans (Doswell et al. 2005; Horgan et al. 2007; Changnon et al. 2009; Cintineo et al. 2012; Smith et al. 2012; Gensini and Mote 2014; Allen and Tippett 2015; Barrett and Henley 2015; Brown et al. 2015). Additionally, conceptual models and heuristics developed by operational forecasters over time were born from these data as a result of the connection between reports received and the available technology for hail detection. Radar-derived hail size, hail probability, and dual-polarization hydrometeor classification algorithms intended to assist the warning forecaster have also been developed, tested, and refined utilizing Storm Data or other crowdsourced reports (Lenning et al. 1998; Witt et al. 1998; Ortega et al. 2009; Park et al. 2009; Elmore et al. 2014; Snook et al. 2016). The value of storm reports is significant both in real-time severe weather episodes and postevent warning verification, as well as for training and climatological studies.
Unfortunately, there remains a high degree of uncertainty that the hail reports obtained during NWS warning verification efforts are representative of the true hailfall of a given storm. Nocturnal severe weather may reduce reporting efficiency because limited visibility hinders the identification of large stones and the majority of the public may be asleep (Ashley et al. 2008). Regardless of the time of day, the number of hail reports may fluctuate based on a storm’s path over rural versus urban areas (Dobur 2005; Cecil 2009). Even with storms over densely populated regions, large hailstones may go unidentified or unreported (Blair and Leighton 2012). The NWS resources dedicated to seeking out ground-truth information may vary from event to event, as may the emphasis that individual NWS offices place on aggressive report collection for verification (Doswell et al. 2005). Human reporting error in the form of exaggeration or underestimation of hail sizes, along with the potential for incorrect locations and times, can introduce further uncertainty in the quality and representativeness of these hail reports (Amburn and Wolf 1997; Baumgardt 2011). The limitations and inconsistencies in severe weather reporting found in Storm Data have been well established in previous studies utilizing this dataset (Lenning et al. 1998; Witt et al. 1998; Marzban and Witt 2001; Brooks 2004; Doswell et al. 2005; Trapp et al. 2006; Wilson et al. 2009; Ortega et al. 2009; Blair and Leighton 2012). For example, Amburn and Wolf (1997) and Blair et al. (2011) revealed that 29% and 24%, respectively, of the hailstorms examined from Storm Data failed to coincide with any notable radar reflectivity at the time of the report. Amburn and Wolf (1997) went on to state that NWS warning verification practices are often insufficient for research applications because of the inconsistent and low-resolution nature of the reports.
These reporting deficiencies are troubling, as an undetermined amount of uncertainty must be accepted to utilize these hail data in most forms of postevent warning verification and training, research and development, and risk assessment. While NWS warning verification is accomplished with a simple binary “hit or miss” of a severe weather report occurring within the temporal and areal boundary of a warning polygon (NWS 2015), verification of the forecasted maximum hail size contained in the NWS convective warning has largely been unexplored. This is unsurprising because of the vast amounts of uncertainty inherent in the Storm Data reports, and without a higher resolution of hail reports available, accurately determining forecast skill is virtually impossible in many situations. Ironically, it may be that the accuracy of the hazard information in NWS warnings, in this case the forecast maximum hail size, is one of the most crucial pieces of information to the end users as it provides them with specific information on potential impacts. The importance of accurately forecasting hail size information can be illustrated by considering that the same NWS “severe thunderstorm warning” is issued for storms producing quarter-sized [1.00 in. (2.54 cm)] and grapefruit-sized [4.00 in. (10.16 cm)] hail, even though the specific impacts are significantly different throughout this spectrum of sizes.
As the NWS continues to emphasize the importance of improving convective forecasts and warnings to build a Weather-Ready Nation (Lindell and Brooks 2013), the need for accurate hazard information will become key in the provision of expert decision support services. The forecaster of the future is expected to excel in nowcasting and warning operations, leveraging proven and emerging technologies to better predict the impacts from hailstorms and other hazards (Craven et al. 2015). Because of the potential for extreme economic and societal impacts from large hail, there is a critical need to better understand the true peak magnitude of hail events, but in order to accomplish this objective, a dense network of hail observations sampling a spectrum of convective storms is required.
A field research campaign, the Hail Spatial and Temporal Observing Network Effort (HailSTONE), was designed and implemented over a 5-yr period to obtain hail observations of unprecedented spatiotemporal resolution associated with convective storms. These data provide tremendous insight into both the hailfall character of storms and NWS forecast skill and help rectify many of the limitations of Storm Data. Our initial research presented herein aims to answer several outstanding questions pertaining to the reliability of Storm Data in its representation of the maximum hail size in a storm, the accuracy of NWS maximum hail size forecasts in convective warnings, and the range of maximum hail sizes with varying storm morphologies.
Section 2 briefly overviews the field project and describes the methods used to collect and interrogate the data. Section 3 compares HailSTONE observations of the maximum-diameter hail size to both Storm Data and NWS warning hail size forecasts, stratified by the entire storm duration and by NWS warnings. Section 4 explores the hail size characteristics found with respect to storm mode and its potential applications to enhance hail size forecasts in NWS warnings. Discussion and concluding remarks follow in section 5.
2. Hail observations and methodology
a. HailSTONE background
The HailSTONE project took place in several phases from 2011 to 2015. A combined 26 volunteers directly participated in the project during the 5-yr period (Fig. 1). The participants were composed of NWS employees, private sector and broadcast meteorologists, undergraduate students in meteorology, geographers, and professional photographers. From five to seven vehicles were used during the primary operating periods each year, although fewer vehicles were occasionally used during single-day efforts to record hail sizes because of the limited availability of personnel outside the annual primary operating periods. One team of vehicles was used to take direct measurements of falling hail. This was accomplished by designing metal overhangs over the vehicle windows to minimize damage from large hailstones. A major advantage of near-instantaneous observations of falling hail is the ability to eliminate potential diameter loss from melting effects of stones lying on the ground for an extended period of time, which is an inherent limitation of poststorm hail observation and collection. A secondary group of vehicles conducted additional hail measurements in the immediate wake of a hail core and focused on either scouring areas of particularly large hail reports from the near-instantaneous observation team, or filling in gaps where more observations were needed. The primary objective of each mobile team was to obtain in situ surface hail measurements and electronically store each observation into a GPS-enabled software package designed for the project. Field participants measured the maximum diameter of the largest hailstone and the approximate mode of hail sizes within close proximity of the vehicle (sample area < 200 m2) during each stop to obtain a representative sample, as well as the circumference and weight of the largest hailstone when applicable. 
When very large hail was ongoing and hailstones were not retrievable because of safety concerns or other hazardous conditions, hail diameters were estimated, and these approximations were verified by the collection team soon after the very large hailfall ended. Each team remained within a coordinated storm-relative framework, with vehicles staggered to ensure a near-constant period of observations within the hail core.
The large majority of hail reports in Storm Data originate from the public or trained storm spotters (Allen and Tippett 2015; Blair and Sanders 2015); thus, reports are traditionally limited to the stationary point of an individual’s residence or business. Simply put, where there are no people, hail data are typically not available. In most cases, those who do report hail do not measure the stone with a ruler, but rather compare it to a commonly sized object, frequently a coin or sports ball (Jewell and Brimelow 2009; Barrett and Henley 2015). Additional uncertainty is introduced by an unknown degree of melting of the stone while on the ground before being observed, and by whether the individual provided the maximum or average hail size on the ground. The Severe Hazards Analysis and Verification Experiment (SHAVE) has provided an attractive option for research applications by collecting a greater number of hail reports than Storm Data (Ortega et al. 2009). SHAVE operated for 10 years across a national domain, mainly during the warm season, and collected a higher-resolution hail dataset than is traditionally available. While these data are an improvement over Storm Data, the hail data still originate from the general public, so many of the same limitations exist.
The collected hail observations from HailSTONE have provided some of the highest-resolution spatiotemporal datasets available to date, especially for near-instantaneous measurements of falling hail, and provide a substantial improvement over other available hail datasets. The field campaign’s fundamental mobility allowed for high-resolution hail data to be obtained along any paved or dirt roadway, independent of human population. An illustration is shown in Fig. 2, where maximum-diameter hail sizes from Storm Data reports are compared with HailSTONE data for a supercell thunderstorm that occurred in rural southwest Oklahoma. During a 2-h period, a total of 11 hail reports ≥ 1.00 in. (2.54 cm) were recorded in Storm Data, three of which were baseball sized [2.75 in. (7.0 cm)] or larger, and several of which originated from a radar research group in the field. For an individual storm over a sparsely populated area, this could be considered a good sampling of reports by Storm Data standards. Unfortunately, with those data alone, it is very difficult to draw conclusions for operational or research applications with respect to the hailfall character of the storm. In contrast, the HailSTONE data provided a much more complete picture of the hailfall (Fig. 2b), with 247 hail reports ≥ 1.00 in. (2.54 cm), of which 105 were baseball sized or larger, along with information regarding the specific hail swath location, distribution of the largest hail sizes, and hail size trends over time. The uncertainty and potential error that traditionally plague other hail datasets are greatly mitigated, with minimal surface melting of the hailstones prior to measurement and a consistent measurement scheme applied throughout the dataset.
b. Hail data
HailSTONE operated on 49 days across 12 states in the Great Plains region of the United States during the field phases of 2011–15. Operations were predominantly confined to the warm season, with approximately 90% of operational days occurring in May and June. A total of 73 thunderstorms producing hail diameters ≥ 1.00 in. (2.54 cm) were sampled during the project, and each of these different storms represents the data examined in this study. Approximately 4900 hailstone measurements were logged during the multiyear campaign, 2286 of which equal or exceed the NWS severe hail criterion of hail diameter ≥ 1.00 in. To keep data operationally relatable, hail size diameter and circumference were rounded to the nearest 0.25 in. (0.64 cm).
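The rounding convention and unit conversion described above can be illustrated with a minimal sketch. The function names and the raw readings below are hypothetical, and the handling of exact ties (rounded up here) is an assumption, as the text does not specify it.

```python
import math


def round_to_quarter_inch(diameter_in: float) -> float:
    """Round a hail diameter (inches) to the nearest 0.25 in.

    Exact ties are rounded up; this tie rule is an assumption.
    """
    return math.floor(diameter_in * 4 + 0.5) / 4


def inches_to_cm(diameter_in: float) -> float:
    """Convert inches to centimeters (1 in. = 2.54 cm by definition)."""
    return diameter_in * 2.54


# Hypothetical raw caliper readings in inches (illustrative only).
raw_readings = [0.60, 0.97, 1.80, 2.70]
rounded = [round_to_quarter_inch(d) for d in raw_readings]

# Keep only measurements meeting the NWS severe hail criterion (>= 1.00 in.).
severe = [d for d in rounded if d >= 1.00]

for d in severe:
    print(f"{d:.2f} in. ({inches_to_cm(d):.2f} cm)")
```

Rounding to quarter-inch steps keeps the measurements on the same scale that appears in NWS warnings and Storm Data, at the cost of discarding finer caliper precision.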
To compare and contrast Storm Data to these high-resolution data, all hail reports in Storm Data (NCDC 2011–15) were also compiled for the 73 severe thunderstorms sampled by HailSTONE. From these storms, a total of 181 Storm Data hail reports were available for comparison with the HailSTONE measurements. Storm Data averaged three hail reports per storm, compared to 66 measurements per storm with the HailSTONE data. Additionally, 32% of the storms sampled by HailSTONE had no Storm Data hail reports available, prohibiting a direct comparison of the two datasets for those cases.
While all tornadoes observed in the field by HailSTONE were immediately reported to NWS offices, only giant-hail cases that posed an imminent threat to large vulnerable populations or extreme hail events [this field project identified a new state record diameter hailstone for Oklahoma; SCEC (2011)] were provided to local weather forecast offices. This sequestering of data was required to establish a research-grade hail dataset independent from Storm Data; therefore, no duplicate reports exist between the two hail datasets.
HailSTONE participants used rulers or calipers to obtain the maximum diameter of a hailstone. Abnormally large singular spikes or protuberances on the surface of a hailstone were not incorporated into the maximum-diameter measurement. Multiple hailstones with a similar or equal maximum diameter were frequently found in each storm; thus, it is unlikely that a large, unrepresentative outlier was recorded as the maximum-size hailstone. The methods of hail measurement found in Storm Data are not documented for each event and are thought to be highly varied and inconsistent with one another, especially as they originate from many different sources. Additionally, while confidence is high that the maximum-diameter size for each storm is more representative with HailSTONE data than with any other hail dataset, it is somewhat improbable that the absolute largest stone was identified in the field for each case. HailSTONE operations were limited by available roads in the proximity of the hailfall, and there remains a nonzero amount of territory that cannot be sampled by any means other than satellite-derived hail swaths within damaged vegetation (Gallo et al. 2012). Therefore, it should be noted that the maximum-diameter hail sizes recorded by HailSTONE may be slightly underestimated, but still serve as a good representation of the largest hailfall within the storm.
c. NWS warnings
The forecasted maximum-diameter hail size contained in the initial issuance of NWS severe thunderstorm warnings (SVRs) and tornado warnings (TORs), along with subsequent follow-up warning statements called severe weather statements (SVSs), was assembled for each of the 73 storms sampled by HailSTONE. NWS forecasters are tasked with warning for tornadoes, wind speeds ≥ 50 kt (25.7 m s−1), and hail diameters ≥ 1.00 in. (2.54 cm) occurring in a thunderstorm (NWS 2014). As a storm changes intensity throughout the duration of a warning, or as reports of severe weather become available to help calibrate a more accurate hail size or wind speed, the forecasted hazards are adjusted through an SVS update (NWS 2014).
A total of 132 NWS warnings and their respective forecasted hail sizes were incorporated into the study: 101 SVRs and 31 TORs (Fig. 3). Thunderstorms sampled during HailSTONE operations that met NWS severe criteria but were not warned were omitted from the study as no maximum forecast hail size was available for verification. Prior to 2015, not all NWS offices required a warning format that provided the maximum hail size in tornado warnings, and in these instances when hail forecasts were not available in the TOR, those warnings were omitted from the forecast verification portion of the study. For comparison, the combined national annual average number of SVRs and TORs issued during 2011–15 was 22,100. While this study examines a small subset of all warnings, it is believed to accurately represent NWS warning hail size forecast performance because of its diverse geographical sampling of warnings across different NWS offices, inclusive of varying storm modes, with similar environments described in section 4a.
d. Storm mode classification
Storm classification is a subjective practice, since severe convection represents a continuous spectrum rather than discrete types placed into descriptive bins (Vasiloff et al. 1986; Hocker and Basara 2008). However, these storm-mode classifications can be useful for an operational forecaster to employ conceptual models to make scientific warning decisions and quickly differentiate storm hazards associated with different types of convection (Moller et al. 1994). To examine the relationship between storm mode and maximum hail size, a radar-based classification scheme was applied to the 73 thunderstorms sampled by HailSTONE. The arrangement of storm modes follows similar methodologies found in previous radar-classification studies (Moller et al. 1994; Thompson et al. 2003; Hocker and Basara 2008; Duda and Gallus 2010; Smith et al. 2012). Three classes of storm mode were used in this study: supercell, marginal supercell, and nonsupercell severe.
The supercell classification required the storm to persist for ≥30 min and contain well-defined reflectivity features commonly associated with supercells, including but not limited to a bounded weak-echo region (BWER), hook echo, persistent inflow notch, and a tight low-level reflectivity gradient. A mesocyclone must have been present through a substantial depth (¼) of the storm, with a maximum rotational velocity Vr ≥ 30 kt (15.4 m s−1) over a distance of ≤10 km, persisting at least tens of minutes (at least three full volume scans; ≥15 min on average) (Moller et al. 1994). Peak velocity values from low-level tornadic circulations and areas where dealiasing errors were suspected were omitted from the study.
Organized storms with identifiable rotation that satisfied a portion of the supercell definition, but contained Vr values shy of the supercell threshold or that persisted for <30 min, were classified as marginal supercells. Marginal supercell storms occasionally featured a resolvable mesocyclone that had brief, strong rotation or shallow and weak rotation, along with transient supercell reflectivity structure. Last, other storms not meeting either of the supercellular categories, but producing hail with diameters ≥ 1.00 in. (2.54 cm) were classified as nonsupercell severe. These storms were characterized by pulse storms, multicell clusters, and linear hybrids of these types, which lack both a persistent, well-organized reflectivity structure and mesocyclone.
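The three-class decision logic described above can be summarized in a short sketch. It is only an illustration: the subjective radar judgments (reflectivity structure, identifiable rotation) are reduced to hypothetical Boolean flags, the mesocyclone depth and ≤10-km distance criteria are assumed to be folded into those flags, and all field and function names are invented for this example.

```python
from dataclasses import dataclass

KT_TO_MS = 1852.0 / 3600.0  # 1 kt = 0.5144 m/s (exact definition of the knot)


@dataclass
class StormRadarSummary:
    """Hypothetical per-storm summary of a manual radar interrogation."""
    duration_min: float               # storm lifetime
    has_supercell_reflectivity: bool  # e.g., BWER, hook echo, inflow notch
    has_identifiable_rotation: bool   # any resolvable mesocyclone
    max_vr_kt: float                  # peak rotational velocity
    rotation_duration_min: float      # persistence of the mesocyclone
    max_hail_in: float                # largest observed hail diameter


def classify_storm_mode(s: StormRadarSummary) -> str:
    """Apply the thresholds quoted in the text (30 min, 30 kt, 15 min)."""
    if s.max_hail_in < 1.00:
        return "not severe"  # outside the scope of the classification
    is_supercell = (
        s.duration_min >= 30
        and s.has_supercell_reflectivity
        and s.has_identifiable_rotation
        and s.max_vr_kt >= 30
        and s.rotation_duration_min >= 15
    )
    if is_supercell:
        return "supercell"
    if s.has_identifiable_rotation:
        return "marginal supercell"
    return "nonsupercell severe"


# Hypothetical example: long-lived storm with strong, persistent rotation.
example = StormRadarSummary(120, True, True, 42.0, 45, 3.00)
print(classify_storm_mode(example), f"(Vr = {example.max_vr_kt * KT_TO_MS:.1f} m/s)")
```

Because storm mode is a continuous spectrum, a rule set of this kind can only approximate the analyst's judgment; borderline storms still require manual review.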
3. Maximum hail size and NWS hail forecasts in warnings
a. Maximum-diameter hail size by entire storm duration
A direct comparison of hail sizes between HailSTONE and Storm Data was possible with 50 of the individual storms sampled by the field campaign, while 23 storms were removed from the analysis as no hail reports were available in Storm Data. A distribution of the largest hail diameter identified throughout the entire storm sampling duration for both HailSTONE and Storm Data is shown in Fig. 4. It was found that for the same 50 storms, the maximum hail sizes identified by HailSTONE were notably larger than the sizes recorded in Storm Data, with a median diameter of 2.25 in. (5.72 cm) compared with 1.75 in. (4.45 cm) published in Storm Data. HailSTONE observed larger hail sizes in 82% of the cases where a comparison with Storm Data was available, while 18% of storms had equal or larger hail diameters in Storm Data. To determine whether or not the difference between the two databases is statistically significant, a null hypothesis Student’s t test (Wilks 1995) was conducted. The difference between the HailSTONE and Storm Data maximum hail sizes was found to be statistically significant at the 99% confidence level. These data suggest that the hail reports received at local NWS offices and later recorded in Storm Data are frequently underestimating the maximum hail size in convective storms.
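As an illustration of the significance test mentioned above, the following sketch computes a paired t statistic for per-storm maximum hail sizes. The text does not state whether a paired or two-sample form was used, so the paired form is an assumption here, and the six value pairs are made up for demonstration rather than drawn from the study.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b.

    Tests the null hypothesis that the mean of (a - b) is zero.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-storm maximum diameters in inches (illustrative only).
hailstone_max = [2.25, 3.00, 1.75, 2.75, 1.50, 4.00]
storm_data_max = [1.75, 2.00, 1.75, 1.75, 1.00, 2.75]

t_stat, dof = paired_t_statistic(hailstone_max, storm_data_max)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
# The statistic is then compared against the critical value of the
# t distribution for the chosen confidence level (from a table or a
# statistics package) to decide whether to reject the null hypothesis.
```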
b. Maximum-diameter hail size by NWS warning
Another operationally relevant way to examine the differences between the two hail databases is to bin hail observations by the NWS warning that was in effect for the storm. This allows a fair comparison on an operationally meaningful time scale that is shorter than the entire storm duration examined in section 3a. The warning-based time scale, usually on the order of 30–60 min, helps identify short-term trends in hail size, captures changes in storm intensity, and better reflects the type of real-time hail information obtained during NWS operations and warning decision-making. This warning-based analysis also allows for explicit verification of NWS maximum hail size forecasts contained in each warning issuance and updated hail forecast in the SVS.
An illustration of the hail size comparison by NWS warning is provided in Fig. 5. In this example, all hail reports originating from HailSTONE and Storm Data are plotted inside an SVR. The maximum hail diameter identified from both hail datasets during the NWS warning, along with the initial and SVS maximum hail size forecasts included in the warning, were extracted and served as the hail sizes utilized in the following warning-based time-scale analysis.
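The extraction step described above can be sketched as a simple filter over reports. The data structure, field names, and example values are hypothetical; in particular, the point-in-polygon test is assumed to have been done beforehand and stored as a flag.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class HailReport:
    """Hypothetical hail report record."""
    time: datetime
    diameter_in: float
    inside_polygon: bool  # assumed precomputed against the warning polygon


def max_hail_during_warning(reports, issued: datetime,
                            expires: datetime) -> Optional[float]:
    """Largest diameter reported inside the polygon while the warning is valid.

    Returns None when no report falls within the warning.
    """
    sizes = [
        r.diameter_in
        for r in reports
        if r.inside_polygon and issued <= r.time <= expires
    ]
    return max(sizes) if sizes else None


# Hypothetical warning valid 2100-2145 UTC and four illustrative reports.
issued = datetime(2013, 5, 20, 21, 0)
expires = datetime(2013, 5, 20, 21, 45)
reports = [
    HailReport(datetime(2013, 5, 20, 20, 55), 3.00, True),   # before issuance
    HailReport(datetime(2013, 5, 20, 21, 10), 1.75, True),
    HailReport(datetime(2013, 5, 20, 21, 30), 2.50, True),
    HailReport(datetime(2013, 5, 20, 21, 35), 4.00, False),  # outside polygon
]
print(max_hail_during_warning(reports, issued, expires))
```

Applying the same function separately to HailSTONE measurements and Storm Data reports yields the pair of maxima compared for each warning.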
1) Maximum-diameter hail size: HailSTONE versus Storm Data
The maximum hail diameter identified by HailSTONE was found to be consistently larger than that in Storm Data reports when comparing hail data occurring during NWS warnings. The scatterplot in Fig. 6 shows the maximum hail size observed by HailSTONE during an NWS warning as a function of the maximum size recorded in Storm Data. Linear regression was calculated to explore the relationship between the two hail datasets over the 85 warning cases available for comparison. The analysis revealed a coefficient of determination of 0.56, which suggests a moderate positive linear relationship between maximum hail sizes identified by HailSTONE and reports recorded in Storm Data. While we would not expect this relationship to be perfectly linear because of the numerous variables involved in severe weather reporting, it indicates that typical real-time hail reports (Storm Data) will frequently underestimate the actual size of hail falling in a storm. Because the slope of the regression line is greater than 1, the hail size underestimation becomes larger as diameter increases. The red line plotted in Fig. 6 represents the line along which these data would be expected to cluster if HailSTONE data and Storm Data were both capturing the true hailfall of the convection sampled in this study. The bulk of the data lie above this red line, providing strong visual evidence that Storm Data consistently underrepresented the hailfall of the convection targeted in this study. However, there is also evidence that hail reports in Storm Data are not sampling a dataset that is totally independent of HailSTONE observations. The fact that a moderate positive linear correlation exists between the data suggests that Storm Data hail reports can offer some insight into the hailfall character of a storm, if corrections are made for size underestimation.
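A minimal sketch of the regression described above follows, using ordinary least squares with the Storm Data maximum as the predictor and the HailSTONE maximum as the response. The five value pairs are hypothetical and do not reproduce the study's 85 cases or its coefficient of determination of 0.56; the function name is likewise an assumption.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical per-warning maxima in inches (illustrative only).
storm_data_max = [1.00, 1.25, 1.75, 2.00, 2.75]
hailstone_max = [1.50, 1.50, 2.50, 3.00, 4.00]

slope, intercept, r2 = linear_fit(storm_data_max, hailstone_max)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r2:.2f}")
# A slope above 1 means the gap between the two datasets widens with size;
# points above the 1:1 line are cases where HailSTONE found larger hail.
```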
2) Maximum-diameter hail size: NWS hail size forecasts versus ground-truth observations
Hail size forecasts in NWS warnings were further separated into three hail size bins: 0.75–1.00 in. (1.91–2.54 cm), 1.25–1.75 in. (3.18–4.45 cm), and 2.00–3.00 in. (5.08–7.62 cm). These bins were subjectively chosen to best represent a spectrum of the potential meteorological impacts from hail, spanning from marginally severe hail (MSH; 0.75 ≤ hail ≤ 1.00 in.), to general severe hail (GSH; 1.25–1.75 in.), to significant severe hail (SSH; hail ≥ 2.00 in.). The scheme follows similar classifications from previous studies (Edwards and Thompson 1998; Hales 1988; Doswell et al. 2005; Gallus et al. 2008; Bunkers et al. 2010), and also tends to mirror maximum hail size characteristics inherent to differing storm modes, as discussed later in section 4.
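The three forecast size bins can be expressed as a small lookup function. This is a sketch under two assumptions: sizes are already on the 0.25-in. scale used throughout the study, so values between bins do not occur, and the SSH bin is treated as open ended (≥ 2.00 in.), following the category definition rather than the 2.00–3.00-in. range quoted for the bin. The function name is hypothetical.

```python
from typing import Optional


def hail_forecast_category(size_in: float) -> Optional[str]:
    """Map a forecast hail size (inches) to MSH, GSH, or SSH.

    Returns None for sizes that fall outside every bin.
    """
    if 0.75 <= size_in <= 1.00:
        return "MSH"  # marginally severe hail
    if 1.25 <= size_in <= 1.75:
        return "GSH"  # general severe hail
    if size_in >= 2.00:
        return "SSH"  # significant severe hail
    return None


for size in (1.00, 1.75, 2.75):
    print(size, hail_forecast_category(size))
```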
Figure 7 shows the distribution of the maximum-diameter hail identified by both HailSTONE and Storm Data during each NWS warning, relative to the initial hail size forecast contained in the warnings and organized into the three forecast size bins. Very little overlap exists between the observed HailSTONE sizes and the NWS forecast hail sizes in each of the forecast hail size bins, revealing that hail size forecasts within the initial issuance of NWS warnings consistently underforecast the maximum hail size occurring throughout the warning duration. A few exceptions were noted when earlier reports of large hail were used as the source to forecast the hail size in a downstream NWS warning, but the storms had weakened following the reception of the reports, resulting in a forecast overestimate.
While HailSTONE hail sizes were almost always larger than NWS hail size forecasts, it is interesting to note the changing relationship of Storm Data reports to initial NWS hail size forecasts for MSH compared to SSH. Nearly all the maximum-sized hail reports in Storm Data equal or exceed the NWS hail size forecast for the MSH category; however, as the NWS forecast hail size increases into the SSH category, the opposite is found. With these forecasted significant-size hail-producing storms, the majority of Storm Data reports fall below the forecasted value. These differences in the report-size distribution with respect to increasing NWS forecast hail sizes may be attributed to several potential factors, including forecaster confidence at the onset of the warning issuance or changes in storm intensity during the warning. For instance, an NWS warning forecaster may choose an MSH size for a developing storm that requires an initial warning issuance, since uncertainty in the storm’s evolution and severity may preclude the selection of a larger forecast hail size. In contrast, a forecaster issuing a new downstream warning for a storm that has previously produced reports of significant hail may be more inclined to forecast a maximum hail size similar to that of the reports. Therefore, the initial NWS warning forecast hail size may not always accurately reflect the anticipated hail size throughout the warning, especially in situations of rapid changes in storm intensity, which can partially explain the larger hail sizes in Storm Data associated with MSH forecast events.
In consideration of these forecast uncertainties associated with selecting an initial hail size in a warning, it is crucial to examine the updated NWS maximum hail size forecasts contained in the SVS warning product. Figure 8 is similar to Fig. 7, but showing instead the largest forecast hail size listed in any updated SVS product during the warning. Presumably, the updated SVS hail forecasts should provide a more accurate representation of the expected maximum hail size in a storm as the forecaster has the added benefit of adjusting the size based upon changes in storm intensity from remotely sensed observations and/or real-time hail reports.
Consistent with the analysis of the initial warning forecast hail size, hail observations from HailSTONE were notably larger than the updated NWS hail size forecasts in the SVS warning products. In fact, there is very little interquartile overlap of HailSTONE sizes and the associated Storm Data hail reports for all three forecast hail size bins. This finding illustrates that incoming storm reports into NWS offices, and the conceptual models used to forecast the maximum hail size, consistently failed to accurately portray the true hailfall character.
Changes to the forecast hail size from the initial warning to the follow-up SVS occurred in 38% (50) of the cases, 40 of which increased the size and 10 of which decreased it. Some NWS forecast improvement was noted with these SVS warning updates; this was especially true where initial warnings forecasting MSH were increased to larger sizes, likely driven by the receipt of storm reports or radar signatures that indicated a strengthening storm.
One interesting finding is that updated NWS hail size forecasts in SVS warning products appear to be strongly calibrated to incoming hail reports. This is illustrated in Fig. 8, where the forecast maximum hail size in the warnings tends to mirror the maximum hail sizes recorded in Storm Data during the warning period, and the interquartile hail sizes in Storm Data largely fall within each NWS forecast hail size bin examined. This is an intuitive relationship between updated NWS forecast hail sizes and Storm Data hail sizes, considering these reports aid in the warning decision process in real time. Furthermore, Storm Data reports have historically been the only consistent national database available to verify hail size and, in the absence of high-resolution reports, have potentially shaped forecasters’ perception over time. There is a natural tendency to place a nontrivial amount of weight on incoming hail reports during warning operations, since some degree of uncertainty nearly always exists as a result of environmental and radar sampling limitations. The emphasis on incorporating storm reports into warning operations is not necessarily a bad practice; on the contrary, these reports serve as a critical tool to gauge hail size potential in convective storms (WDTB 2013a,b). However, caution is warranted against overreliance on real-time hail reports, which, in light of HailSTONE observations, will often lead to underestimation of both the maximum hail size occurring and the hail size forecast by warning forecasters.
4. Maximum hail size by storm mode
a. Overview and environment
The 73 thunderstorms sampled by HailSTONE were classified into three categories—supercell, marginal supercell, and nonsupercell severe—based upon the methodology described in section 2d. A geographical breakdown of each storm and its respective storm-mode classification is shown in Fig. 9. Distribution of each storm type was relatively even throughout much of the Great Plains, although the largest concentration of supercell storms occurred in Kansas, Oklahoma, and Texas. This diverse sampling provided a wealth of environments in varying geographical areas.
Figure 10 shows the maximum hail diameter identified in each storm during HailSTONE operations, binned by storm mode. Supercells overwhelmingly produced the largest hail sizes during their lifetime compared with the other two storm classifications, with a median maximum hail size of nearly 3.00 in. (7.62 cm). The propensity for supercells to produce the largest hail is clear in this study, with no interquartile overlap between supercells and the other two storm modes, and minimal overlap with the other modes even in the quartile representing the smallest hail sizes found in supercells. This supports previous work that suggested a strong bias exists for supercell thunderstorms to frequently produce the largest hail compared with other convective morphologies (Nelson 1987; Miller et al. 1988; Johns and Doswell 1992; Duda and Gallus 2010; Smith et al. 2012; Dawson et al. 2014). Nonsupercellular severe convection produced the smallest-diameter hailstones of the three storm classifications with a median maximum diameter of 1.38 in. (3.5 cm) and 94% of storms yielding a maximum hail size ≤ 1.75 in. (4.45 cm). Marginal supercellular storms had a median maximum hail size of 1.75 in. (4.45 cm), and while the hail sizes were generally larger than the nonsupercell severe storms, significant interquartile overlap was noted between the two groups.
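The interquartile comparisons described above can be illustrated with a minimal sketch. It assumes an inclusive quartile method, which the study does not specify, and the function names and sample values are hypothetical placeholders, not HailSTONE measurements.

```python
from statistics import median, quantiles


def interquartile_range(sizes_in):
    """Return (Q1, Q3) of a list of maximum hail diameters in inches.

    Assumption: the 'inclusive' quartile method; the study does not
    state which method was used for its box-and-whisker plots.
    """
    q1, _, q3 = quantiles(sizes_in, n=4, method="inclusive")
    return q1, q3


def iqr_overlap(sizes_a, sizes_b):
    """True if the interquartile ranges of the two samples intersect."""
    lo_a, hi_a = interquartile_range(sizes_a)
    lo_b, hi_b = interquartile_range(sizes_b)
    return lo_a <= hi_b and lo_b <= hi_a


# Placeholder values for illustration only; NOT HailSTONE data.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [5.0, 6.0, 7.0, 8.0, 9.0]
group_c = [2.0, 3.0, 4.0, 5.0, 6.0]

print(median(group_a), interquartile_range(group_a))
print(iqr_overlap(group_a, group_b), iqr_overlap(group_a, group_c))
```

Applied to per-storm maximum diameters grouped by storm mode, a check of this kind would return no overlap between the supercell group and either of the other two groups, and overlap between the marginal supercell and nonsupercell severe groups.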
While a deeper investigation into the local environment’s role in supporting large hail production is beyond the scope of this research and is slated for future work, a basic overview of the environmental conditions present during HailSTONE is useful operational context, especially to ensure that the storms observed by HailSTONE occurred in environmental conditions typically supportive of hailstorms (List 1985; Rogers and Yau 1989). Figure 11 shows the distribution of most unstable convective available potential energy (MUCAPE), the 0–6-km bulk wind difference (BWD), and the supercell composite parameter (SCP) for each convective mode examined (Thompson et al. 2003). The majority of storms, regardless of mode, occurred in environments with 35–50 kt (18.0–25.7 m s−1) of deep layer shear. Instability generally increased with increasing storm organization, and the highest overall MUCAPE values were associated with supercells. HailSTONE operations generally occurred in environments with MUCAPE from 2000 to at least 3000 J kg−1, with supercell environments exceeding 3000 J kg−1 in approximately half of the cases. Increasing values of SCP reflected well the environments where supercells did occur, and values exceeding 7.0 were frequently associated with supercell storms. The range of instability and shear present during HailSTONE operations is hardly unique for warm-season severe weather events in the Great Plains, as these conditions are generally supportive of storm organization and maintenance (Rasmussen and Blanchard 1998; Thompson et al. 2003; Bunkers et al. 2006b). It is important to note that the results presented herein may differ in cold-season or tropical environments, where weaker instability, poor lapse rates, or low-topped convection may be present.
Supercell thunderstorms represent the most severe and organized deep moist convection and, likewise, can produce the largest hail of any storm mode (Moller et al. 1994; Bunkers et al. 2006a; Dawson et al. 2014). SSH produced by supercells has the potential to cause notable socioeconomic impacts, especially over urban footprints, and deserves closer investigation in an attempt to improve recognition and prediction of these hail sizes in an operational warning environment. Operational observations and climatological studies using Storm Data have suggested the majority of significant hail [diameter ≥ 2.00 in. (5.08 cm)] reports are attributed to supercells (Rasmussen and Blanchard 1998; Thompson et al. 2003; Doswell et al. 2005; Duda and Gallus 2010; Blair et al. 2011; Grams et al. 2012; Smith et al. 2012). Specifically, earlier studies examining the relationship between severe weather reports and storm mode showed hail reports ≥ 2.00 in. (5.08 cm) in diameter were produced by supercells in approximately 90%–96% of the cases (Thompson et al. 2003; Duda and Gallus 2010; Smith et al. 2012).
The observations collected by HailSTONE tend to support the findings of these earlier studies. Of all storms that produced hail diameters ≥ 2.00 in. (5.08 cm), approximately 80% of these were supercells. This is a slightly lower ratio of hail reports ≥ 2.00 in. (5.08 cm) produced by supercell storms when compared to earlier research and can potentially be explained by the much lower spatial resolution of hail databases used in previous investigations. Also, earlier work did not always contain a “marginal supercell” category, and some of these storms may have been included in their supercell count. With a higher density of observations, HailSTONE was able to identify slightly larger hailstones associated with marginal supercell storms that may have gone undocumented in the past. Additionally, when the supercell and marginal supercell classifications are combined together, 98% of all storms that produced hail diameter ≥ 2.00 in. (5.08 cm) exhibited some supercellular characteristics.
HailSTONE observations revealed that only 10% of supercells failed to produce a maximum hail diameter ≥ 2.00 in. (5.08 cm) during the course of the storm and that every supercell storm generated a maximum hail size of at least 1.50 in. (3.81 cm). These results suggest that the vast majority of supercells meeting the criteria in section 2d produce SSH. Therefore, the presence of a supercell should lend high confidence that hail ≥ 2.00 in. (5.08 cm) is likely to occur during its lifetime. Conversely, reports of SSH received in an operational setting can likely be attributed to storms bearing supercellular characteristics.
The percentage of supercell thunderstorms that fail to produce hail diameter ≥ 2.00 in. during their lifetime has remained an unknown quantity for decades, as a result of insufficient spatial and temporal resolution of observations to provide confidence that maximum hail size was sampled in a storm. Previously, for supercells without a Storm Data report of hail ≥ 2.00 in., it could only be speculated whether the storm or possibly the environment was truly unsupportive of SSH, or whether the lack of significant hail reports was merely a function of rural areas with limited reports. The past lack of SSH reports associated with supercells may be one contributing factor to reduced forecaster confidence and the omission of larger hail sizes in warnings and follow-up statements for well-organized, long-lived convection.
c. Using storm mode for hail size forecasts
The ranges of maximum hail sizes associated with each convective mode fully illustrate the influence that storm type and organization have on the largest-diameter hail produced during the life span of each storm. Based on these ground-truth findings, the relationship between storm mode and maximum hail size should continue to be leveraged by operational meteorologists to better anticipate and forecast hail sizes during convective warning episodes.
Three primary philosophies can be drawn from the storm-mode classification and the maximum hail diameters of the 73 storms:
Supercells with well-organized reflectivity structure and a deep, persistent midlevel mesocyclone [Vr ≥ 30 kt (15.4 m s−1)] are likely to produce hail ≥ 2.00 in. (5.08 cm) during their lifetime.
Nonsupercellular convection, including pulse storms, multicell clusters, and linear hybrids of these types, which lacks both a persistent, well-organized reflectivity structure and a mesocyclone, is unlikely to produce hail > 1.75 in. (4.45 cm).
Marginal supercells with weak and/or shallow midlevel rotation and short-lived organized reflectivity structure present the largest challenge in determining maximum hail size, but typically produce hail sizes ranging in diameter from 1.25 to 2.00 in. (3.18 cm ≤ diameter ≤ 5.08 cm).
These are valuable operational benchmarks for hail size forecasting as these thresholds provide a simplified, scientific approach that should increase forecaster confidence to appropriately anticipate a range of maximum hail sizes based on the storm mode present, given the environments sampled in HailSTONE.
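As a minimal sketch, the three benchmarks above can be encoded as a simple lookup. The class, function, and dictionary names are hypothetical and are not part of any NWS software; storm-mode classification itself (section 2d) is assumed to have been done already, and the ranges apply only to environments similar to those sampled in HailSTONE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HailSizeRange:
    """Benchmark range of maximum hail diameter, in inches.

    None means the benchmark states no bound on that side.
    """
    lower_in: Optional[float]
    upper_in: Optional[float]


# Values transcribed from the three statements above.
BENCHMARKS = {
    "supercell": HailSizeRange(2.00, None),           # likely >= 2.00 in.
    "marginal supercell": HailSizeRange(1.25, 2.00),  # typically 1.25-2.00 in.
    "nonsupercell severe": HailSizeRange(None, 1.75), # unlikely > 1.75 in.
}


def expected_max_hail(storm_mode: str) -> HailSizeRange:
    """Look up the benchmark range for an already-classified storm mode."""
    key = storm_mode.strip().lower()
    if key not in BENCHMARKS:
        raise ValueError(f"unknown storm mode: {storm_mode!r}")
    return BENCHMARKS[key]


print(expected_max_hail("Supercell"))
```

A lookup like this only supplies a starting range; as noted above, it is meant to be used in conjunction with radar-based hail detection techniques rather than in place of them.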
To illustrate the utility of this guidance, hail reports from HailSTONE and Storm Data for supercell storms are compared with NWS hail size forecasts and are binned by NWS warning to best mimic an operational time scale (Fig. 12). The initial hail size forecasts in NWS warnings and subsequent follow-up statements predicted a maximum-diameter hail size ≥ 2.00 in. (5.08 cm) for supercell storms in approximately 35% and 50% of the warnings, respectively. Storm Data reports, which are frequently used as guidance to help calibrate radar signatures, also consistently underestimated the actual size of hail falling, and, in fact, no reports were available to the forecaster during 42% of the warnings. In contrast, HailSTONE observations from supercell storms identified ongoing hail ≥ 2.00 in. (5.08 cm) during 83% of the NWS warnings issued for those storms. Using a storm-mode-based forecast philosophy, a substantial improvement in hail size prediction may be realized if an initial baseline of 2.00 in. (5.08 cm) is applied to supercell storms meeting the criteria and occurring in environments similar to those described in this paper. While supercells will frequently produce hail much larger than 2.00 in. (5.08 cm), setting this lower-end size threshold for supercells would improve the forecast and identification of significant hail and would likewise drive preparation for and response to potentially destructive and even deadly hail impacts.
5. Summary and discussion
During the 2011–15 field phases of HailSTONE, hail data were collected for 73 thunderstorms across the Great Plains. The mobility of the field campaign produced a relatively novel database of high spatiotemporal-resolution hail observations as a result of the ability to capture near-instantaneous measurements of falling hail across multiple locations within a storm. These dense observations, independent of human residences and obtained in a consistent manner, helped mitigate many of the limitations commonly found in Storm Data.
Hail reports received at local NWS offices, ultimately recorded in Storm Data, frequently underestimated the maximum-diameter hail or were simply not available. Over 30% of sampled severe storms had no corresponding Storm Data reports, which provides an initial quantitative answer to the question raised by Cecil (2009) regarding the unknown percentage of severe hailstorms that go undocumented. When Storm Data reports were present for comparison, HailSTONE documented larger hail sizes in 82% of the storms. The percentage grew to 90% when including storms with no Storm Data reports. The collected sample over the 5-yr period and our operational and field experiences suggest that real-time hail reports coming into an NWS office will frequently underestimate the actual size of hail falling in a storm. This is especially magnified for supercell thunderstorms, with increasing hail diameters yielding a larger underestimation. Furthermore, the reliability of Storm Data to provide an accurate hail climatology representing the maximum-sized hail occurring across the United States is shown to be questionable. These findings, along with other compounding factors related to the hail database described in Allen and Tippett (2015), imply that caution should be used when interpreting geographical trends of large hail, especially hail sizes that exceed 2.00 in. (5.08 cm).
Hail size forecasts in NWS warnings consistently underforecast the maximum hail size occurring in a storm throughout the warning duration. The findings also suggest that NWS hail size forecasts are at least partially calibrated to incoming hail reports. Because those reports are shown herein to underestimate the hailfall character, the reactive practice of adjusting forecast hail sizes upon receipt of a report naturally leads to an underestimation of hail size. While real-time reports of hail remain a critical component of the NWS warning decision process, an overreliance on incoming ground-truth reports, especially when conceptual models or radar-based data support larger hail sizes, will probably lead to underforecasting of the maximum hail size in a storm.
Our initial research suggests that some improvement to the maximum hail size forecasts in NWS warnings could be obtained in many instances by utilizing a storm-mode hail size forecast philosophy, in conjunction with other radar-based hail detection techniques (Donavon and Jungbluth 2007; Blair et al. 2011). Each storm mode classified in the study—supercell, marginal supercell, and nonsupercell severe—showed a general relationship to the range of the maximum hail size to be expected. Supercells overwhelmingly produced the largest-diameter hail sizes of any of the storm types, with approximately 90% of supercells producing hail ≥ 2.00 in. (5.08 cm) during their lifetime, and all generating hail ≥ 1.50 in. (3.81 cm). The median maximum-diameter hail size of the supercell storms sampled was nearly 3.00 in. (7.62 cm), compared with median values of 1.75 in. (4.45 cm) and 1.38 in. (3.5 cm) for marginal supercells and nonsupercell severe storms, respectively. These data strongly suggest that a warning forecaster should consider an initial hail size of at least 2.00 in. (5.08 cm) for storms that meet the supercell criteria and occur in environments such as those described in this study, regardless of whether incoming hail reports are smaller in size. Marginal supercells were the most challenging when differentiating maximum hail size between convective modes, with minor overlap of supercellular hail sizes and a broad diameter overlap with nonsupercell severe storms. Still, forecasters may expect a common hail size range of 1.25 in. ≤ diameter ≤ 2.00 in. (3.18 cm ≤ diameter ≤ 5.08 cm) with marginal supercellular structures. Smaller maximum hail sizes were noted with the least organized convection, with nonsupercell severe storms rarely producing hail > 1.75 in. (4.45 cm).
Warning forecasters are encouraged to leverage these findings as initial guidance to best anticipate the potential spectrum of maximum hail sizes by storm mode, helping ensure proactive, scientifically based hail size forecasts that are less dependent on low-density storm reports.
This research provides one of the most robust examinations of the hailfall character of convective storms and subsequent hail size forecasts and should help increase forecaster confidence when considering larger maximum hail sizes in short-fused warnings, especially in the absence of storm reports. Ultimately, the ability of forecasters to provide accurate maximum-expected hail sizes through warnings will provide critical information and advanced notice to support decision-making for public safety and economic interests at risk, especially during destructive hail events associated with supercell storms.
The authors gratefully acknowledge all HailSTONE participants who volunteered a significant amount of time and personal funds, and worked together seamlessly in challenging conditions during long hours. Special thanks to the additional HailSTONE participants during the 2011–15 field campaign: Chris Dobbs, Lani Leighton, Bart Comstock, Maegan Rachelle, Paul Stofer, Josh Dean, Dan Shaw, Gordon May, Tony Laubach, and Stefanie Calvert. We also thank Amos Magliocco and three anonymous reviewers for their insightful and helpful review of the manuscript, and Andy Dean (SPC) for providing environmental data. The views and opinions expressed in this paper are those of the authors and do not necessarily represent an official position, policy, or decision of the National Weather Service.