Thunderstorms that produce surface hail accumulations, sometimes as large as 60 cm in depth, have significantly affected the residents of the Front Range and High Plains of Colorado and Wyoming by creating hazardous road conditions and endangering lives and property. To date, surface hail accumulation is not part of a routine forecasting or monitoring system. Extensive coordinated hail accumulation reports and operational products designed to identify deep hail accumulating storms in real time are lacking. Kalina et al. used dual-polarization WSR-88D radar observations to calculate hail depth and hail accumulations but never validated the algorithm. This study shows how 20 quality-controlled hail depth reports from the hail depth database built by the Colorado Hail Accumulation from Thunderstorms (CHAT) project are being used to validate the Kalina et al. radar-based hail accumulation algorithm for operational application. The validated algorithm shows increased correlations between radar-derived and reported accumulations for hail depth reports not included in the validation. Furthermore, increases in computational efficiency have allowed the improved algorithm to be used operationally. With an improved hail accumulation algorithm, thunderstorms that produce hail accumulations are more frequently detected than previously reported.
Thunderstorms producing surface hail accumulations, sometimes as large as 60 cm in depth, have frequently occurred along the Front Range and High Plains of Colorado and Wyoming. These hail accumulations, which often consist of hailstones ranging from 1 to 3 cm in diameter, can significantly affect road safety, damage crops and other vegetation, and endanger lives and property. Figure 1 shows the relative frequency of hail accumulation reports with >1 cm of hail accumulations between 2013 and 2017 in eastern Colorado and eastern Wyoming. In 2016, the number of hail accumulation reports exceeded the number of reported EF0–EF2 tornadoes and the totals are even comparable to flash flood events.
Currently, there are no official coordinated hail accumulation reporting or operational products to track, nowcast, or forecast these events, which limits the decision support information available to emergency responders, transportation departments, and the general public. There is a clear need to provide basic guidance on hail accumulation in thunderstorms, and as a result the National Weather Service (NWS) Forecast Office in Boulder, Colorado, has partnered with the University of Colorado to explore ways to deliver operational hail accumulation information. In this study, we show that validated hail accumulation information can be generated using data from the operational dual-polarization NWS Weather Surveillance Radar-1988 Doppler (WSR-88D) network to derive hail accumulation maps. To do this, we asked the general public and trained storm spotters to report hail depth in Colorado and Wyoming. This study addresses the following question: How can coordinated hail depth reports be utilized to validate a radar-based operational method that derives hail accumulations?
Hail accumulations from thunderstorms can pose a serious safety threat. The climatology of hail frequency shows that the areas that experience the most frequent hail days in the United States are located in the Great Plains as well as the Colorado–Wyoming Front Range urban corridor (Cintineo et al. 2012; Changnon 1977). In particular, the Front Range urban corridor, with an estimated population of 4.8 million people (U.S. Census Bureau 2016), has seen increased vulnerability to thunderstorms with accumulating hail due to the steady increase of the population (11.53% between 2010 and 2016). One way the population can be affected by hail accumulations is through dangerous driving conditions (Fig. 2a). Hail accumulations from thunderstorms usually occur between May and September when snowplows have been retired for the summer season, and motor vehicles are least likely to be equipped with snow tires. These hazardous conditions can lead to vehicles sliding off the road and, in some cases, vehicles rolling over, requiring an emergency response (Durta 2016). Large quantities of hail can also clog drainages resulting in flash flooding that strands drivers, as occurred during a supercell thunderstorm on 7 May 2015 in Colorado Springs, Colorado (KKTV 2015).
Besides public safety concerns, hail accumulation can also impact the local economy. The Colorado Department of Transportation has estimated that, for instance, closing U.S. Interstate 70, a major highway connecting the eastern and western United States, costs $880,000 an hour due to delayed shipments and costs associated with emergency services necessary to reopen roads (B. Wilson 2017, personal communication). Denver International Airport (DIA), the fifth busiest airport nationwide, with an estimated 61.38 million passengers in 2017 (https://en.wikipedia.org/wiki/List_of_busiest_airports_by_passenger_traffic), was forced to close for over an hour during the afternoon of 26 May 2016 when thunderstorms produced hail accumulations necessitating snowplows to clear runways (Fig. 2b; Sylte 2016). This 1-h closure resulted in an estimated economic loss for the region of over $3 million (ICF 2014).
Despite the impacts of hail accumulations on the economy and public safety, there exists a large knowledge gap concerning when and why thunderstorms accumulate significant amounts of hail on the ground. So far, experimental and numerical studies on hail have focused on understanding the growth of large hailstones (e.g., Browning 1964; Browning and Foote 1976; Heymsfield 1983; Nelson 1983; Miller et al. 1990; Conway and Zrnić 1993; Knight and Knight 2001; Grant and van den Heever 2014; Dennis and Kumjian 2017). As a result, hail reporting has been steered toward maximum hail size rather than hail accumulations. Only a few recent studies have investigated hail accumulating storms and the possible contributing factors for large accumulations (Knight et al. 2008; Schlatter et al. 2008; Schlatter and Doesken 2010; Kalina et al. 2016; Ward et al. 2018; Friedrich et al. 2019). Kalina et al. (2016) provide significant contributions to the field of radar and lightning characteristics of thunderstorms with deep hail accumulations, and proposed ways of estimating hail accumulations. Their results show that deep hail accumulating thunderstorms exhibit a combination of elevated levels of ice production in the cloud, slow storm motion, and low melting rates.
Currently, no operational method is available that specifically computes or estimates hail accumulations. Kalina et al. (2016) were the first to use dual-polarization radar data to derive hail accumulation maps from radar data from the WSR-88D network. An example of a hail accumulation map for 26 May 2016 based on the Kalina et al. (2016) study is shown in Fig. 3. As discussed in the previous paragraph, this thunderstorm passed over DIA and forced the airport to close for an hour (Fig. 2b). On the same day, areas south of Denver, including sections of Interstate 25, received reports of accumulations of up to 5 cm, affecting traffic significantly.
While Kalina et al. (2016) suggested the use of radar observations to derive hail accumulation maps, they were unable to validate the radar-based accumulation maps due to the limited number of hail depth observations. Furthermore, their methods of producing hail accumulation maps were not suitable for an operational application. This study will address the limitations of Kalina et al. (2016) and will show (i) how the lack of observations is addressed through encouraging the general public to submit hail depth information, (ii) how these reports are assessed for their quality, and (iii) how the reports are used to validate hail accumulation maps for operational application.
This study is part of the Colorado Hail Accumulation from Thunderstorms (CHAT) project initiated in the spring of 2016 (Friedrich et al. 2019). CHAT is a joint collaboration between the NWS Forecast Office in Boulder and the University of Colorado with the overarching goals of (i) building a database of reported hail depths, hail size distribution, and hail swath extent through reports from a community-based citizen science project; (ii) studying typical characteristics of thunderstorms that produce significant hail accumulations; (iii) developing techniques to identify thunderstorms with deep hail accumulations using operational weather radar and lightning networks; and (iv) developing techniques to forecast the potential of hail accumulations on the ground and including information about the timing and location of significant hailfall. This paper focuses on two of the fundamental goals of the CHAT project, namely, building a database of quality-controlled hail accumulation observations along the Front Range urban corridor (section 2) and validating operational radar-based hail accumulation maps (section 3).
2. Quality controlled hail accumulation observations
a. Hail accumulation database
The hail accumulation database for the Colorado–Wyoming Front Range urban corridor consists of two types of hail depth reports: (i) archived reports from news outlets, social media, and observing networks such as the Community Collaborative Rain, Hail and Snow Network (CoCoRaHS; Cifelli et al. 2005) and the National Oceanic and Atmospheric Administration’s (NOAA) Storm Event Database (also referred to as Storm Data) and (ii) reports from a community-based citizen science project, which started in the spring of 2016. Through this citizen science approach, we have encouraged the public to report hail depth, hail size distribution, hailfall duration, and areal hail swath extent through social media such as Twitter, Facebook, and Instagram. We did this primarily by posting advertisements to social media and community tack boards, in addition to presenting to classrooms and spotter trainings. These advertisements included a website link (http://clouds.colorado.edu/deephail) that provided instructions on how to report hail depth, hail size distribution, and swath extent. The hail accumulation database only includes reports made since 2012 to coincide with dual-polarization capability available across eastern Colorado and Wyoming. We collected 124 hail reports from 96 thunderstorms that occurred during the convective seasons of 2012–18 along the Colorado–Wyoming Front Range and in eastern Colorado (Fig. 4). Out of the 124 hail depth observations, 47 reports were obtained from social media, 49 from trained storm spotters submitted through CoCoRaHS or NOAA’s Storm Data, and 28 from news outlets. For a more in-depth explanation and additional results of the CHAT project, see Friedrich et al. (2019).
b. Quality control procedure for hail reports
All reports go through a report verification and quality assessment process that is discussed in this section and is schematically illustrated in Fig. 5 (blue and orange boxes). Although we initially asked for hail depth, hail size distribution, hailfall duration, and areal hail swath extent, the majority of the reports only contained information on hail depth after a thunderstorm passed over the reported location. Reports gathered from Storm Data and CoCoRaHS always provided the information on the location of the report, while social media reports often provided insufficient location data. In a first step, we excluded reports that did not have any information on time or location of the hail event or information on accumulated depth (either reported depth or images). As a second step, we dismissed reports if observed or modeled radiosonde data were not available that day, or operational radar data were unavailable during the time of the hailfall (i.e., the beginning and ending of radar-based hail observations) for the specific location. In a third step, we ensured that the operational radar data were consistent with the time and location of the hail depth report. Many reports had ambiguous time information. For instance, reports were often posted on social media after the storm had passed without information on the exact end time of the hailfall. It was unclear if the time of the posting coincided with the end of hailfall. In those cases, we determined the time of hailfall based on when the Hydrometeor Classification Algorithm (HCA) developed by the National Severe Storms Laboratory (NSSL) (Park et al. 2009) identified hail at the lowest elevation scan over the report location. If no storm was detected by radar at the reported location within a 2-h window of the reported time, the report was dismissed.
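The first three verification steps described above can be sketched as a simple filter. This is a minimal illustration rather than the project's actual code: the `HailReport` container and all of its field names are hypothetical, and the 2-h window is assumed here to mean within 2 h on either side of the reported time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class HailReport:
    """Hypothetical container for one hail depth report (field names are assumptions)."""
    time: Optional[datetime]             # reported or posting time (UTC)
    lat: Optional[float]
    lon: Optional[float]
    depth_cm: Optional[float]            # reported depth, if given
    has_image: bool                      # a photo from which depth might be derived
    sounding_available: bool             # observed or modeled sounding exists that day
    radar_available: bool                # radar data cover the hailfall period
    radar_hail_time: Optional[datetime]  # time the HCA flags hail over the location


def passes_verification(report: HailReport,
                        window: timedelta = timedelta(hours=2)) -> bool:
    """Apply steps 1-3: completeness, data availability, radar consistency."""
    # Step 1: require time, location, and some depth information (value or image).
    if report.time is None or report.lat is None or report.lon is None:
        return False
    if report.depth_cm is None and not report.has_image:
        return False
    # Step 2: require sounding and radar data for the event.
    if not (report.sounding_available and report.radar_available):
        return False
    # Step 3: radar must show hail over the location near the reported time
    # (assumed symmetric window around the reported time).
    if report.radar_hail_time is None:
        return False
    return abs(report.radar_hail_time - report.time) <= window
```

Reports that pass this filter would still need the depth quality assessment described next before being counted as high quality.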
Once we verified the report location and time, we started to assess the quality of the reported hail depth, which varied widely. For instance, some reports contained pictures that showed the accumulations together with a ruler or measuring device (Fig. 6a), which we considered high quality depth reports. Other reports contained no information on precise depth or depth that would match the associated pictures, but we were able to derive the depth from the picture (Figs. 6b,c). This was done by using the known height of objects within the picture. For example, the height of the grass at Coors Field (Fig. 6b) was used to determine hail depth because it is strictly maintained to 2.2 cm in height (Powell 2017). When possible, we traveled to the reported location to verify the height of permanent objects (e.g., gutters, signs, or brick walls) contained within the provided picture. If we were unable to discern the depth, we also dismissed the report in step 5 of the verification and quality assessment procedure (Fig. 5). Photographic evidence was also used to distinguish between cases where hail accumulations were deep enough to obscure the ground (Fig. 7a) and cases consisting of scattered hailstones where the ground was still readily discernable (Fig. 7b). This information will be important in identifying unique characteristics of storms with significant hail accumulations compared to thunderstorms with no or trace hail accumulations (null cases), which is one of the CHAT objectives but will not be further discussed in this paper.
After assessing the quality of all reports, we divided the reports into those of high quality containing an exact location, time of occurrence, and depth that was measured or discernable from pictures, and reports of lower quality where location, time, and depth could not be retrieved unambiguously (step 5; Fig. 5). Out of the 124 reports from 96 storms collected during 2012–18, we only identified 20 reports from 12 storms as high quality, which are indicated as cyan plus signs in Fig. 4. The 20 reports came mainly from storm spotters (12 reports), some were made by the authors (3 reports), and the remainder were submitted through social media (5 reports). Note that out of these 20 reports, 6 reports were defined as null cases. Although only 20 high quality reports are used to validate the radar-based hail accumulation map in section 3, all 124 reports are nevertheless useful for studying the characteristics of thunderstorms that do and do not accumulate hail on the ground, which is not discussed in this paper.
To validate the quality of the radar-based hail accumulation procedure discussed in section 3, we decided to assign error bars to the high quality reports based on their origin as part of step 6. Storm spotter reports from Storm Data and CoCoRaHS are often accompanied by descriptions of their observations, and the spotters are likely to be trained in proper measurement techniques. Conversely, estimates of hail accumulation from social media pictures were often made by untrained weather enthusiasts, which suggests the error on the depth measurement would likely be larger compared to trained and experienced spotter reports. However, hail depth reporting is not yet standardized, so no concrete knowledge exists on how each group performs the measurements differently. To assign an error on hail depth measurements, and, therefore, produce statistically robust results, we decided to base an assumption of hail depth measurement error on a field campaign performed by Blair et al. (2017). They found hail size reports by storm spotters were often underestimated by 2 cm. In a similar manner, we chose to use an error of ±2 cm on all hail depth reports. This error assignment also helps mitigate errors in depth reports that result from incorrect locations reported in Storm Data, which were found to be in error by 1.6 km on average (Ortega 2018).
3. Hail accumulation algorithm
The ultimate goal of this study is to develop and evaluate improved radar-based hail accumulation maps for operational applications and process studies. Kalina et al. (2016) were the first to derive surface hail accumulations using dual-polarization radar observations together with the National Center for Atmospheric Research (NCAR) HCA (Vivekanandan et al. 1999). As a first guess and to keep their algorithm efficient and timely, they assumed a static fall velocity of 15 m s−1, which is appropriate for a hailstone with a diameter of 2 cm (Pruppacher and Klett 1997). A single hailstone size applied everywhere, as assumed by Kalina et al. (2016), is likely unrepresentative of the entire storm environment. Our study cases showed variations of radar-derived in-cloud hailstone sizes ranging from 2 to 12 cm. As a result, for instance when applying a diameter–fall velocity relationship provided by Heymsfield and Wright (2014; Table 1), the fall velocity varies roughly between 8.7 m s−1 for a 2-cm hailstone and 39.3 m s−1 for a 12-cm hailstone, or by about 350%. By contrast, a static fall velocity of 15 m s−1 disregards hailstone size and does not vary. In addition, Kalina et al. (2016) were never able to validate their radar-based hail accumulations.
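The roughly 350% spread in fall velocity quoted above follows directly from the two end-member values, taking the 2-cm hailstone as the reference:

```latex
\frac{V_t(12\,\mathrm{cm}) - V_t(2\,\mathrm{cm})}{V_t(2\,\mathrm{cm})}
  = \frac{39.3 - 8.7}{8.7}
  \approx 3.5 \quad (\text{about } 350\%)
```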
To address the uncertainty in hail depth related to using a static fall velocity of 15 m s−1 and the lack of evaluation in the Kalina et al. (2016) accumulation algorithm, we introduced both a dynamic diameter fall velocity Vt that depends on the hailstone diameter D, and a statistical best coefficient ϵ, which results in the minimum difference between the observed and radar-based hail accumulations. The statistical best coefficient will be discussed later in this section. Hail accumulations at the surface (hAcc) are now being calculated as
where, following Kalina et al. (2016), the packing density η is assumed to be 0.64, which is the closest possible random packing of monodisperse spheres (Scott and Kilgour 1969), and ice density ρh is assumed to be 0.9 g cm−3. The variable Δt represents the change in time between successive radar scans (most commonly 4.5–5 min). The ice water content for hail (IWCh; g m−3) is derived from the equivalent radar reflectivity Ze following Heymsfield and Miller (1988):
In a similar manner as performed by Kalina et al. (2016), IWCh is calculated by using Ze at the lowest height level classified by the National Severe Storms Laboratory HCA (Park et al. 2009) as hail with rain. Hail with rain is the NSSL HCA category, which is operationally available as an NWS WSR-88D level III product. Following Kalina et al. (2016), we further constrain the IWCh calculation by only using reflectivity data below the height of the 0°C isotherm. Using reflectivity data below the 0°C isotherm provides two benefits: (i) hailstones in the cloud above the 0°C isotherm are not double counted in case they do not reach the surface by the next radar scan and (ii) we avoid including data from the hail growth zone, ensuring the ice water content observed is representative of the hail reaching the surface. The 0°C isotherm was derived from the operational sounding that was the closest in space and time to the report location and representative of the air mass that produced the thunderstorm. For this study, we used dual-polarization information from the Front Range WSR-88Ds at Cheyenne, Wyoming (KCYS); Denver (KFTG); and Pueblo, Colorado (KPUX), and the High Plains radars at North Platte, Nebraska (KLNX); Goodland, Kansas (KGLD); and Dodge City, Kansas (KDDC), as well as the operational soundings closest to the respective radar for all cases (Fig. 4).
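As a minimal sketch of the per-scan computation described above, the snippet below converts reflectivity to hail ice water content and then to a depth increment. Several pieces are assumptions rather than values confirmed by the text: the power-law coefficients `a` and `b` (values commonly attributed to Heymsfield and Miller 1988), the placement of ϵ as a multiplicative factor, and the function names. The packing density and ice density are those stated above.

```python
import numpy as np

ETA = 0.64               # packing density (from the text)
RHO_ICE_G_M3 = 0.9e6     # ice density: 0.9 g cm^-3 expressed in g m^-3


def iwc_from_dbz(dbz, a=4.4e-5, b=0.71):
    """Hail ice water content (g m^-3) from reflectivity in dBZ.

    Uses a power law IWC = a * Ze**b with Ze in mm^6 m^-3.
    The default a and b are ASSUMED; check them against the cited source.
    """
    ze = 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)  # dBZ -> linear units
    return a * ze ** b


def depth_increment_cm(iwc_g_m3, vt_m_s, dt_s, eps=1.0):
    """Hail depth (cm) deposited during one scan interval.

    Mass flux (IWC * Vt) times dt gives g m^-2; dividing by the bulk density
    of the packed hail layer (ETA * RHO_ICE) gives depth in metres.
    Applying eps as a multiplier is an ASSUMPTION about its role.
    """
    depth_m = (eps * np.asarray(iwc_g_m3, dtype=float)
               * np.asarray(vt_m_s, dtype=float) * dt_s
               / (ETA * RHO_ICE_G_M3))
    return 100.0 * depth_m
```

In practice the reflectivity passed to `iwc_from_dbz` would be restricted to gates classified as hail with rain and located below the 0°C isotherm, as described above; that masking is omitted here for brevity.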
The next variable that needs to be discussed in Eq. (1) is hailfall velocity Vt, which is a function of hailstone diameter. One of the primary ways of estimating hail diameter operationally in real time is to use a radar-based algorithm such as the maximum estimated size of hail (MESH; Witt et al. 1998). To accurately estimate hail size, MESH uses weighting functions in reflectivity together with temperature-based height thresholds to account for transition zones between rain and hail. The transition zone is defined as reflectivity values between 40 and 50 dBZ within the layer from 0° to −20°C. Despite observations of hail being present below the 0°C isotherm, Witt et al. (1998) chose a layer between the 0° and −20°C isotherms so as to only include the optimal hail growth zones (Browning 1977; Nelson 1983), where the largest hailstone is most likely to be found.
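The weighting scheme described above can be sketched for a single vertical column as follows. The reflectivity thresholds (40 and 50 dBZ) and the 0° to −20°C layer come from the text; the numerical coefficients in `mesh_mm` are the values commonly quoted from Witt et al. (1998) and should be treated as assumptions to be checked against that paper. Function and argument names are illustrative.

```python
import numpy as np


def reflectivity_weight(dbz, z_lower=40.0, z_upper=50.0):
    """0 at or below 40 dBZ, 1 at or above 50 dBZ, linear in between."""
    z = np.asarray(dbz, dtype=float)
    return np.clip((z - z_lower) / (z_upper - z_lower), 0.0, 1.0)


def temperature_weight(height_m, h0_m, hm20_m):
    """0 at or below the 0 C height, 1 at or above the -20 C height, linear in between."""
    h = np.asarray(height_m, dtype=float)
    return np.clip((h - h0_m) / (hm20_m - h0_m), 0.0, 1.0)


def mesh_mm(dbz_profile, heights_m, h0_m, hm20_m):
    """Maximum estimated size of hail (mm) for one column.

    Coefficients (5e-6, 0.084, 0.1, 2.54) are ASSUMED from the commonly
    cited form of the algorithm.
    """
    dbz = np.asarray(dbz_profile, dtype=float)
    h = np.asarray(heights_m, dtype=float)
    # Hail kinetic energy flux, weighted by reflectivity
    e_dot = 5.0e-6 * 10.0 ** (0.084 * dbz) * reflectivity_weight(dbz)
    integrand = temperature_weight(h, h0_m, hm20_m) * e_dot
    # Trapezoidal integration over height gives the severe hail index
    shi = 0.1 * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(h))
    return 2.54 * np.sqrt(shi)
```

Because the temperature weight is zero below the 0°C height, reflectivity in the melting layer does not contribute, which is consistent with the design choice discussed above.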
Witt et al. (1998) found sufficient evidence that MESH would lie in the largest 25% of reported hailstone sizes. Furthermore, in a field campaign to validate MESH against surface observations, Brimelow and Taylor (2017) found that MESH approximated the maximum hailstone size found on the ground to within 2 cm. These results might suggest that MESH is a good approximation to use for the dynamic diameter–fall velocity relationships. However, an extensive study performed by Ortega (2018) indicates MESH can perform poorly in predicting hail size measured at the ground. We acknowledge MESH may be a limiting factor in producing the most accurate hail accumulation depths. Alternative methods of calculating hail size include hail differential reflectivity HDR (Aydin et al. 1986; Depue et al. 2007) and the Hail Size Discrimination Algorithm (HSDA; Ortega et al. 2016). Unfortunately, neither is compatible with an operational hail accumulation algorithm. The performance of HDR as presented by Depue et al. (2007) required extensive quality control on the data, which would increase the processing time of the hail accumulation algorithm. The HSDA differentiates hail into three size categories: nonsevere, severe, and giant hail. Therefore, it does not provide the continuous function needed to compute hail size in the hail accumulation algorithm. Thus, we are limited to using MESH for this study.
So far, we have accounted for variables that can be derived in near–real time but still need to address which dynamic diameter–fall velocity relationship is best suited for conditions along the Colorado–Wyoming Front Range and the High Plains. Given the vast number of these relationships available in the literature, we opted to only include the relationships listed in Table 1, all of which were derived experimentally in High Plains regions similar to that of the Colorado–Wyoming Front Range urban corridor. We identify the best-suited diameter–fall velocity relationship by finding the minimum difference between radar-derived and reported high quality hail depth for each relationship, which is expressed by the statistical best coefficient ϵ in Eq. (1). Note that ϵ was also included in Eq. (1) because it helps address the unknown variations in packing density and ice density. While packing density has been shown to vary by only 6.4% between closely and loosely packed spheres (Scott and Kilgour 1969), previous studies have shown that Colorado hailstones can have densities that vary by about 111%, from 0.44 to 0.93 g cm−3 (Knight and Heymsfield 1983; El-Magd et al. 2000). To our knowledge, there is no near-real-time method for quantitatively estimating either variable from radar.
To ensure ϵ is statistically robust, we recognized that ϵ needed to be derived from and tested with different sets of observations. We performed the derivation and test in four steps (green boxes in Fig. 5, denoting report analysis). First, we divided the reported depths into two subsets: the first with two-thirds of the observations and the second with the remaining one-third of the observations (S2/3 and S1/3; box 6 in Fig. 5). We chose subsets with roughly equal variance to ensure each subset represented the whole dataset. Second, we derived hail accumulations for each reported depth using Eq. (1) given ϵ = 1 (box 7 in Fig. 5). Setting ϵ = 1 isolates how incorporating a dynamic fall velocity alone would modify the results of the hail accumulation algorithm provided by Kalina et al. (2016). In the third step, we derived the slope of best-fit lines between reported and radar-derived accumulations for each subset (box 8 in Fig. 5). In the fourth step, we assessed whether the slope of the best-fit line from S1/3 lay within two standard deviations of that from S2/3. When this was found to be true, we defined ϵ to be equivalent to the slope of S2/3 (box 10 in Fig. 5). If the slope of S1/3 lay outside two standard deviations of S2/3, ϵ was considered not robust, and the fall velocity relationship associated with that ϵ was excluded from this study.
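The four-step derivation above can be sketched with an ordinary least-squares fit. Several choices here are assumptions made for illustration: reported depth is regressed on radar-derived depth (with an intercept), "two standard deviations" is interpreted as twice the standard error of the S2/3 slope, and the subset indices are supplied by the caller rather than chosen for equal variance. The function names are hypothetical.

```python
import numpy as np


def slope_with_stderr(derived_cm, reported_cm):
    """Best-fit slope of reported vs. radar-derived depth and its standard error."""
    coeffs, cov = np.polyfit(derived_cm, reported_cm, 1, cov=True)
    return float(coeffs[0]), float(np.sqrt(cov[0, 0]))


def derive_epsilon(derived_cm, reported_cm, idx_two_thirds, idx_one_third,
                   n_sigma=2.0):
    """Return the S2/3 slope as epsilon if the S1/3 slope agrees, else None.

    derived_cm must have been computed with epsilon = 1.
    """
    derived = np.asarray(derived_cm, dtype=float)
    reported = np.asarray(reported_cm, dtype=float)
    slope_23, stderr_23 = slope_with_stderr(derived[idx_two_thirds],
                                            reported[idx_two_thirds])
    slope_13 = float(np.polyfit(derived[idx_one_third],
                                reported[idx_one_third], 1)[0])
    # Robustness check: the held-out slope must fall within n_sigma of S2/3
    if abs(slope_13 - slope_23) <= n_sigma * stderr_23:
        return slope_23
    return None
```

A return value of `None` corresponds to the case in which the fall velocity relationship would be excluded from further consideration.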
4. Determining the best fall velocity relationship and ϵ
In this section, we investigate which diameter–fall velocity relationship from Table 1 and its corresponding ϵ results in radar-based accumulations that are most representative of the set of 20 high quality hail depth reports (section 2b). To do this, we calculated the radar-based hail accumulations for the locations of the 20 high quality reports for each diameter–fall velocity relationship and compared them to the reported accumulations. Next, we derived ϵ as described in the previous section. We found all dynamic fall velocity relationships in Table 1 produced robust values of ϵ. The statistical best coefficient ϵ ranged between 0.8 and 1.6 (Fig. 8). Furthermore, all of the dynamic fall velocity relationships listed in Table 1 produced correlation coefficients of 0.69–0.87 between radar-derived and reported accumulations (Fig. 8). We introduced a second metric that shows the slope of the best-fit line between the observed and radar-derived hail accumulations for each diameter–fall velocity relationship after the derived ϵ is applied. We note that a slope equal to one indicates that the differences between the reported and derived accumulations are at their minimum. When comparing best-fit line slopes, our results show that applying a static fall velocity of 15 m s−1 produces the slope farthest from one, of 0.69. When applying a dynamic diameter fall velocity, the slopes increase, ranging from 0.78 to 0.84. The largest correlation coefficient of 0.87 and the largest best-fit line slope, of 0.84, occurred when implementing the relationship for rimed particles from Heymsfield and Wright (2014), which we refer to as HW14Ri hereafter. Note that Heymsfield and Wright (2014) also provide a relationship for graupel and hail as shown in Table 1 and Fig. 8. Conversely, both the smallest correlation coefficient and smallest slope, 0.69 and 0.69, respectively, are produced when employing the static fall velocity of 15 m s−1 used by Kalina et al. (2016). 
In particular, when selecting HW14Ri to use with Eq. (1) and comparing the results to Kalina et al. (2016), the correlation coefficient between the reported and calculated hail depths increased from 0.69 to 0.87, and the best-fit line slope increased from 0.69 to 0.84. These results suggest the fall velocity relationship as provided by HW14Ri (ϵ = 0.81) in general yields derived accumulations closest to the reported accumulations for our set of hail depth reports.
Having found that the best available diameter–fall velocity relationship for the 20 high quality reports is provided by HW14Ri, we can examine in more detail how well HW14Ri and its corresponding ϵ perform across the full range of reported accumulation depths (Fig. 9). When comparing radar-derived and reported accumulations using HW14Ri, three out of nine reports with reported accumulations > 3 cm lie between the ratios of 0.66 and 1.5 of reported and derived accumulations (Fig. 9, green diamonds). Furthermore, eight out of nine reports greater than 3 cm lie between the ratios of 0.5 and 2 (Fig. 9, green and orange diamonds). Only one report with accumulations > 3 cm deviated beyond a ratio of 2 (Fig. 9, red diamond). Conversely, 9 out of 11 reports with <3-cm accumulations lie outside the ratios of reported and derived accumulations of 1.5 and 0.66, while 8 out of 11 lie outside the ratios of 0.5 and 2. Possible reasons for derived accumulations that lie outside ratios of 0.5 and 2 are further discussed in section 6.
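The ratio bands used in this comparison can be expressed as a small classifier. This is an illustrative sketch: the ratio is assumed to be reported depth divided by radar-derived depth, the band labels are hypothetical, and a derived depth of zero is treated as undefined.

```python
def ratio_band(reported_cm: float, derived_cm: float) -> str:
    """Classify one report by the ratio of reported to radar-derived depth."""
    if derived_cm <= 0.0:
        return "undefined"           # no radar-derived accumulation to compare
    ratio = reported_cm / derived_cm
    if 0.66 <= ratio <= 1.5:
        return "within 0.66-1.5"     # closest agreement
    if 0.5 <= ratio <= 2.0:
        return "within 0.5-2"        # agreement to within a factor of 2
    return "beyond 0.5-2"            # disagreement by more than a factor of 2
```

For example, a reported depth of 5 cm against a derived depth of 4 cm gives a ratio of 1.25 and falls in the closest band.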
We can further show how incorporating a dynamic fall velocity and ϵ improves Eq. (1) with an example of how radar-derived hail accumulations—both using the static and dynamic fall velocity—compare to hail reports across a thunderstorm that occurred around Denver on 28 June 2016. On that day, hail depths were reported across the storm. This storm was first identified by radar as an unorganized thunderstorm near 2330 UTC in Boulder, 50 km northwest of Denver. By 2350 UTC, the southeast-moving storm showed signs of organization, as evidenced by weak updraft rotation as it passed over Arvada and then Denver. The storm then dissipated shortly after 0100 UTC on 29 June 2016. The deepest hail accumulations were reported near Arvada (5 ± 2 cm), moderate hail accumulations were reported in Denver (3–5 ± 2 cm), and only scattered stones were reported in the Boulder area. When comparing these reports to the radar-based hail accumulation maps using a static fall velocity of 15 m s−1 and Eq. (1) with HW14Ri and MESH, the reported depths are more consistent with the map derived using the relationship provided by HW14Ri (Figs. 10a,b). This improvement over a static fall velocity is not surprising given that a wide range of hailstone sizes occur within and across different types of thunderstorms, resulting in a wide range of possible fall velocities (Fig. 11).
5. Operational setup
Besides improving and validating the hail accumulation algorithm, we also tested if Eq. (1) can be utilized operationally in near–real time using dual-polarization WSR-88D level II data, WSR-88D level III products, and the operational sounding data. While the algorithm for deriving hail accumulation developed by Kalina et al. (2016) may have been timely enough to be used in real time, the authors never performed such a test, thus necessitating the test performed here. In addition to radar-derived hail accumulation, we also incorporated into the algorithm the computation of storm speed from a WSR-88D level III product and a proxy of hail presence in the cloud, also referred to as vertically integrated ice (VII). VII and storm speed were included because (i) large values of VII and slowly moving storms correlate with large hail accumulations and (ii) increases in VII often precede increases in accumulation rates (Friedrich et al. 2019). Both conclusions suggest that showing hail accumulation and VII on the same display may be of the most use to forecasters. VII can be calculated using radar reflectivity as shown in Carey and Rutledge (2000), Gauthier et al. (2006), and Mosier et al. (2011), and is currently available to NWS forecasters as a Multi-Radar Multi-Sensor product (Smith et al. 2016). Storm speed was derived using the WSR-88D level III product called Storm Tracking Information by taking the quotient of the distance between storm centroids and the time between successive radar scans.
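The storm speed calculation described above, distance between successive centroids divided by the time between scans, can be sketched as follows. The great-circle (haversine) distance and the function names are assumptions made for illustration; the text does not specify how the distance is measured.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance (km) between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1.0 - a))


def storm_speed_m_s(prev_centroid, curr_centroid, dt_s):
    """Storm speed (m s^-1) from two (lat, lon) centroids dt_s seconds apart."""
    dist_km = great_circle_km(prev_centroid[0], prev_centroid[1],
                              curr_centroid[0], curr_centroid[1])
    return 1000.0 * dist_km / dt_s
```

With scans roughly 5 min apart, a centroid displacement of 3 km corresponds to a storm speed of about 10 m s−1.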
To test the operational feasibility of using Eq. (1), as well as including the aforementioned products, an operational algorithm was written to run in real time and become available to operational forecasters 2–3 min after the WSR-88D level II and level III products become available. The algorithm, summarized in Fig. 12, accomplishes the following tasks:
Retrieve the sounding observation most recent in time and closest in space to identify the height of the 0°C isotherm, along with the most recent WSR-88D level II reflectivity and the level III products (hydrometeor type, storm tracking information). Real-time hail size reports from CoCoRaHS and the Meteorological Phenomena Identification Near the Ground app (mPING; https://mping.ou.edu/) are also retrieved (step 1). The purpose of obtaining real-time hail size reports is discussed in section 6.
Use level II reflectivity to compute VII and use level III storm tracking information to compute the speed of each storm in the radar volume (step 2a).
Compute IWCh [Eq. (2)] below the 0°C isotherm using level II reflectivity and level III hydrometeor classification to identify areas classified as hail or hail–rain mixture (step 2b).
Use WSR-88D level II reflectivity to calculate MESH necessary to compute hailstone fall velocity (step 2c).
Compute hail depth for the current time step and sum up hail accumulations across all previous time steps (steps 3 and 4).
Produce a hail accumulation map showing accumulated hail depth overlaid with current values of VII, storm speed, and real-time hail size reports (step 5).
Transfer the hail accumulation map to a website for public viewing (http://clouds.colorado.edu/Real-timeHailMaps; step 6). This is also currently how NWS forecasters view and incorporate the data into operations.
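Steps 3 and 4 of the list above can be sketched as follows. Because Eq. (1) is not reproduced in this section, the per-scan depth below assumes a simple mass-flux form (hail ice water content times fall velocity times scan interval, divided by the product of a packing fraction and ice density). This form, the function names, and the numerical values in the example are assumptions for illustration only, not the operational implementation.

```python
def step_depth_cm(iwc_g_m3, fall_velocity_ms, dt_s, packing_eps, ice_density_kg_m3):
    """Hail depth (cm) deposited during one scan interval.

    Assumed form: depth = IWC * v * dt / (eps * rho_ice).
    """
    mass_flux = (iwc_g_m3 / 1000.0) * fall_velocity_ms  # kg m^-2 s^-1
    depth_m = mass_flux * dt_s / (packing_eps * ice_density_kg_m3)
    return depth_m * 100.0


def accumulate(total_cm, scan_cm):
    """Add one scan's depth grid to the running total.

    Grids are lists of rows; None marks a gate with no hail or
    hail-rain classification below the 0 deg C isotherm.
    """
    return [
        [tot + (new if new is not None else 0.0) for tot, new in zip(trow, nrow)]
        for trow, nrow in zip(total_cm, scan_cm)
    ]


# Example with placeholder values (eps = 0.64 and 900 kg m^-3 are illustrative).
one_scan = step_depth_cm(1.0, 15.0, 300.0, 0.64, 900.0)
running = accumulate([[0.0, 1.0]], [[one_scan, None]])
```

Keeping the per-scan depth separate from the running sum means the fall velocity can change from scan to scan, as a dynamic fall velocity requires, without touching the accumulation step.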
This algorithm was written using the Python module Py-ART (Helmus and Collis 2016) and takes about 1.5 min to run on any state-of-the-art computer.² An example of an improved operational hail accumulation map is shown in Fig. 13 for a thunderstorm on 8 May 2017 that passed over Denver. This hailstorm was one of the costliest in U.S. history with estimated losses totaling $2.2 billion (NOAA/NCEI 2018).
6. Operational guidance
Given an improved hail accumulation equation, the next step was to test the functionality of the equation [Eq. (1) and HW14Ri, sections 3 and 4] in an operational setting. To test this, a hail accumulation algorithm incorporating Eq. (1) was applied to data collected between May and October of 2017 from six radars along the Colorado–Wyoming Front Range and the western High Plains (Fig. 4).
With a viable operational method for deriving hail accumulations, forecasters need to know the effective range over which each radar can detect hail accumulation in order to integrate the data into operations. Figure 14 shows this range for each of the six radars for the 153 days the algorithm was operated in 2017. Since the algorithm uses only data below the 0°C isotherm, the range of detection is determined by the distance between the radar and the point where the center of the lowest radar beam intersects the height of the 0°C isotherm, which varies throughout the convective season. In late summer when the 0°C isotherm height is high, the algorithm’s detection range is the farthest reaching. Conversely, when 0°C isotherm heights are low, the detection range leaves substantial gaps in coverage between radars. For our study, the maximum range of hail accumulation detection for each radar varied from 65 km in spring and early fall (with a typical 0°C isotherm height of 3 km MSL; green circles in Fig. 14) to 170 km in summer (with a typical 0°C isotherm height of 5.5 km MSL; red circles in Fig. 14).
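The dependence of detection range on the 0°C isotherm height can be illustrated with the standard 4/3 effective Earth radius beam-propagation model. The sketch below assumes that model, a lowest elevation angle of 0.5°, and a hypothetical radar altitude; the study does not state which beam-height model or radar parameters were used, so the numbers it produces need not reproduce the 65–170 km ranges reported above.

```python
import math

# 4/3 effective Earth radius (km), a common assumption for standard refraction.
EFFECTIVE_EARTH_RADIUS_KM = (4.0 / 3.0) * 6371.0


def beam_height_km(slant_range_km, elev_deg, radar_alt_km):
    """Height (km MSL) of the beam center at a given slant range."""
    a = EFFECTIVE_EARTH_RADIUS_KM
    s = math.sin(math.radians(elev_deg))
    return (math.sqrt(slant_range_km ** 2 + a ** 2 + 2.0 * slant_range_km * a * s)
            - a + radar_alt_km)


def detection_range_km(freezing_level_km, elev_deg, radar_alt_km):
    """Slant range (km) at which the beam center reaches the 0 deg C isotherm.

    Obtained by inverting beam_height_km; returns 0 if the isotherm lies at
    or below the radar.
    """
    h = freezing_level_km - radar_alt_km
    if h <= 0:
        return 0.0
    a = EFFECTIVE_EARTH_RADIUS_KM
    s = math.sin(math.radians(elev_deg))
    return -a * s + math.sqrt((a * s) ** 2 + 2.0 * a * h + h ** 2)


# Hypothetical radar at 1.7 km MSL scanning at 0.5 degrees elevation.
spring_range = detection_range_km(3.0, 0.5, 1.7)
summer_range = detection_range_km(5.5, 0.5, 1.7)
```

The function returns slant range, which at these low elevation angles differs only slightly from ground range. The qualitative behavior matches the text: a higher 0°C isotherm gives a longer detection range.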
Another result that may be of interest to the NWS is the frequency of days with hail accumulations and the maximum depth of hail accumulation for each radar, both obtained from the radar-derived hail accumulation for the 2017 convective season (Figs. 15a,b). To obtain these, we recorded each day when at least one storm in the respective radar volume produced accumulations greater than 1 cm. Shallow accumulations between 1 and 3 cm of hail were observed most frequently by the KCYS, KFTG, and KPUX radars at 24, 17, and 18 days, respectively, equating to about once a week. Combined, the Front Range radars observed these shallow accumulations to have occurred on roughly 31% of the days the algorithm was operated during 2017. The High Plains radars observed shallow accumulations to have occurred on roughly 23% of the days. When considering accumulations between 3 and 7 cm, the Front Range radars observed a combined total of 83 days. This is comparable to the frequency of days for the High Plains radars, which observed a combined 77 days in the same depth range. Furthermore, during the 2017 convective season, the Front Range radars more frequently observed hail accumulations of less than 7 cm than did the High Plains radars (126 vs 111 days), while the High Plains radars were more likely to observe accumulations larger than 7 cm (121 vs 75 days; Fig. 15c). This is not surprising in that we might expect deeper hail accumulations to originate from organized supercell thunderstorms that may be more common to the High Plains, as opposed to less organized thunderstorms, which are common to the Front Range. We, however, leave testing this hypothesis to future studies. Finally, we can compare the frequency of radar-derived hail accumulations to reported occurrences. Previously, we showed that hail accumulation reports greater than 1 cm in 2017 cataloged by the CHAT project occurred eight times (Fig. 1). While it is likely the CHAT project was not able to obtain all hail accumulation reports in 2017, the result still pales in comparison to the 201 days on which accumulations greater than 1 cm were observed by radars along the Front Range.
The accuracy of the operational radar-based algorithm is another important metric. Namely, how do the derived accumulations compare to the hail accumulation reports? When comparing the results obtained by using the 20 high quality reports, we can identify that the radar-based algorithm shows significant errors for accumulations below 3 cm in depth, in that all reports less than 3 cm are either over- or underestimated (Fig. 9). The reason for increased differences between reported and radar-derived depths at accumulations of less than 3 cm may be a result of in situ measurement error, radar-derived depth error, or a combination of both. Because in situ hail depth measurement has not yet been standardized, hail depth reports of <3 cm might be related to the maximum hail size rather than the depth. However, radar-derived accumulations, including those larger than 3 cm, may also be related to errors in the chosen HCA and hailstone fall velocities. While the HCA used in this study was shown to match theoretical predictions of an observed supercell storm, Park et al. (2009) showed their scheme’s limitations included not having been verified using in situ data, as well as vulnerabilities to several factors such as attenuation, nonuniform beam filling, and partial beam blockage, all of which can compromise the quality of the radar data. Errors in hailstone fall velocities may be present due to variations in radar-derived maximum hailstone diameters when using MESH (Witt et al. 1998; Brimelow and Taylor 2017; Ortega 2018). The variations in MESH may be an unavoidable source of error until an improved operational hail size estimation method is developed. Errors in hailstone fall velocity may also be mitigated by using a median hailstone size derived from hailstone size distributions. However, deriving these distributions from radar is not currently viable operationally. 
Despite the observed error in derived accumulations for the storms in this study, we found no evidence that the radar-derived accumulations less than 3 cm were associated with reported hail accumulations that led to travel disruptions or clogged storm drains. This suggests that when the hail accumulation algorithm indicates accumulations of less than 3 cm, societal impacts are not likely to result from accumulating hail. Conversely, for large radar-derived hail accumulations, the NWS and future scientific studies may want to assess the benefits of creating criteria to include hail accumulations in their severe storm warnings. Our results suggest the minimum threshold should be 3 cm for the regions included in this study, but more research is needed to ascertain at what depths hail accumulations begin to produce damage that merits such warnings.
While evidence so far has shown that the accuracy of radar-derived hail accumulations is high enough that the data are potentially useful to operational meteorologists, we note that the effects of hailstones melting between the cloud and ground are not explicitly included in Eq. (1). The quantification of the melting rate may be operationally impossible since melting depends on the atmospheric temperature and relative humidity, as well as hailstone temperature, fall velocity, density, and size (Rasmussen and Heymsfield 1987a,b; Pruppacher and Klett 1997; Ryzhkov et al. 2013). Since there exists no known operational method to identify many of these variables, we are unable to incorporate an explicit melting correction into Eq. (1). To mitigate the effects of melting, it is reasonable to conjecture that the effects may become less concerning by restricting the range from the radar at which the hail accumulations are derived. We have, however, not tested this hypothesis given our dataset of high quality hail depth reports becomes increasingly smaller with reduced radar range. When a larger set of high-quality hail depth reports becomes available, this hypothesis can be effectively tested in the future. Despite this, it is possible to provide operational guidance when derived MESH values and real-time-reported hailstone sizes differ the most.
Given the lack of a method to quantitatively measure hailstone melting to show when Eq. (1) is most reliable, we use the operationally available values of MESH at the hail depth report location at the radar-derived time of the report (section 2b) and the corresponding hail size report obtained from either Storm Data, CoCoRaHS, mPING, or social media. We found Eq. (1) performs well when the ratio between MESH values and reported sizes at the ground is less than five. The largest outliers between reported accumulations and radar-derived accumulations also show the largest differences between MESH and maximum reported size (Fig. 16, red diamonds). We note reports of hail size larger than the corresponding measured MESH value are likely due to either measurement error or the reporter not identifying the largest hailstone. This information can be added to hail accumulation maps as a data-quality flag to rapidly assist forecasters in identifying when to place high confidence in the radar-derived hail accumulations. We recognize that the prescribed quality control is dependent on both the hail size derived in cloud and the hail size reported on the ground, where the latter cannot itself be quality controlled in real time.
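The data-quality flag described above could be attached to each report location along the following lines. This is a minimal sketch: the function name, the flag labels, and the handling of missing reports are assumptions, and only the threshold of five on the MESH-to-reported-size ratio comes from the text.

```python
def mesh_quality_flag(mesh_mm, reported_size_mm, max_ratio=5.0):
    """Confidence flag for a radar-derived accumulation at a report location.

    mesh_mm          : MESH at the report location and time (mm)
    reported_size_mm : hail size reported at the ground (mm), or None
    max_ratio        : ratio below which Eq. (1) was found to perform well
    """
    if reported_size_mm is None or reported_size_mm <= 0:
        return "no_report"  # nothing to compare against in real time
    ratio = mesh_mm / reported_size_mm
    return "high_confidence" if ratio < max_ratio else "low_confidence"


# Example: MESH of 40 mm against a 20-mm report gives a ratio of 2.
flag = mesh_quality_flag(40.0, 20.0)
```

Because the ground report cannot itself be quality controlled in real time, a flag of this kind is best read as guidance rather than as a verified error estimate.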
The question remains, how does a MESH value 5 times larger than the reported hailstone size affect derived hail accumulations? To answer this question, we found a percent difference of about 113% in fall velocities between an arbitrary-sized hailstone and a hailstone of 20% its size when using the diameter–fall velocity relationship provided by HW14Ri. This percent difference holds true for all diameters. A 113% difference in fall velocity is directly translated into derived accumulations, suggesting that when MESH values are 5 times larger than reported hailstone sizes, the derived depths may be roughly twice as large as the reported hail depths. For our study, the largest outlier greater than 3 cm in reported depth (red diamond in Figs. 9 and 16) would match the reported hail depths within a ratio of 1.5 had the associated hailstone fall velocities been better estimated by MESH.
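The statement that the percent difference holds for all diameters follows directly if the diameter–fall velocity relationship is a power law, v = a D^b, because the velocity ratio between a hailstone and one of 20% its size is then 5^b regardless of D. The sketch below checks this numerically. It assumes a power-law form, assumes that "percent difference" means the increase relative to the smaller hailstone's velocity (which is consistent with derived depths being roughly twice as large), and uses hypothetical coefficient names; the HW14Ri coefficients themselves are not given in this section and are not reproduced here.

```python
import math


def power_law_velocity(d_cm, a, b):
    """Fall velocity for a generic power law v = a * D**b (assumed form)."""
    return a * d_cm ** b


def percent_increase(v_large, v_small):
    """Percent by which v_large exceeds v_small (assumed definition)."""
    return 100.0 * (v_large - v_small) / v_small


def implied_exponent(pct, size_ratio):
    """Exponent b for which a size ratio yields the given percent increase."""
    return math.log(1.0 + pct / 100.0) / math.log(size_ratio)


# Exponent implied by a 113% increase for a fivefold size ratio,
# under the assumptions stated above.
b = implied_exponent(113.0, 5.0)

# The percent increase is the same for any diameter and any prefactor a.
pct_small = percent_increase(power_law_velocity(1.0, 10.0, b),
                             power_law_velocity(0.2, 10.0, b))
pct_large = percent_increase(power_law_velocity(4.0, 10.0, b),
                             power_law_velocity(0.8, 10.0, b))
```

Under these assumptions the fivefold size ratio corresponds to a velocity ratio of about 2.13, which carries through to the derived depth because depth scales linearly with fall velocity.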
In this paper, we have demonstrated how hail depth reports collected by the CHAT project are assessed for their quality, and then used to validate hail accumulation maps for operational application. While about 124 hail depth reports have been collected since 2012, we used 20 high quality hail depth reports to validate an operational hail accumulation algorithm. Results from this research can be summarized as follows:
Reports of average hail depth remain fundamental for studying hail accumulations in thunderstorms and also for verifying radar-based estimates of hail accumulation. The ongoing efforts of the CHAT project have effectively produced a hail accumulation database that is used to validate the hail accumulation algorithm (section 2a).
Our method of quality controlling hail depth reports is effective in isolating high quality hail depth reports from more ambiguous, less reliable, reports (Fig. 5).
Adding the dynamic diameter–fall velocity relationship for rimed particles provided by Heymsfield and Wright (2014) to the hail accumulation equation given by Kalina et al. (2016) improves the correlation coefficient between reported and radar-derived hail accumulations from 0.69 to 0.87 and the slope of a best-fit line between reported and radar-derived accumulations from 0.69 to 0.84 (Fig. 8).
With the method described herein, hail accumulations can be calculated within 65–170 km of operational dual-polarization radars, depending on the melting-level height: the maximum range of the hail accumulation maps around a radar varies between 65 km in spring and early fall and 170 km in summer. We are currently calculating operational hail accumulations over the Colorado–Wyoming Front Range and the western plains (http://clouds.colorado.edu/Real-timeHailMaps).
An operational hail accumulation algorithm using Eq. (1) that includes other hail accumulation nowcasting products such as vertically integrated ice and storm speed is now available for implementation in research studies and operational forecasts (Fig. 12).
Based on radar-based hail accumulation maps for 2017 in the Colorado–Wyoming and western Kansas–Nebraska regions, hail accumulating storms producing less than 7 cm occur most frequently (~80% of the days observed) near the Colorado–Wyoming Front Range, while the western Kansas–Nebraska plains observed the most radar-derived hail accumulations exceeding 7 cm in depth (Fig. 15).
By using operational MESH and real-time hail size reports from mPING and CoCoRaHS, the reliability of the operational hail accumulation algorithm can be assessed in real time (Fig. 16). Preliminary analysis showed that when the ratio between the MESH-derived and reported hail size is less than 5, derived hail accumulations greater than 3 cm, computed using the diameter–fall velocity relationship for rimed particles provided by Heymsfield and Wright (2014), match reported accumulations to within a ratio of 1.5.
The results from this study were only possible due to the hail depth database developed by the CHAT project. However, it is important to recognize that of 124 available hail depth reports, only 20 were classified as being of high enough quality for use in determining the best dynamic fall velocity relationship to apply to Eq. (1). This suggests future work should encourage more frequent high quality hail depth reports as well as reports of hail size distribution and hail swath extent. A complete description of the best ways to report all the previously mentioned hail properties can be found on the web (http://clouds.colorado.edu/HowToReportHailDepth).
The impacts of improved radar-derived hail accumulations are twofold. First, scientific studies are now able to incorporate Eq. (1) into an algorithm to compare the hail accumulating properties of storms with fewer errors, and, second, forecasters are now able to better identify areas of accumulating hail in real time. However, much is left to be learned about the impacts of hail accumulating storms, including their specific impacts on property and human safety, as well as whether warning criteria for hail accumulating storms are necessary. Future studies of hail accumulating storms, including those designed to develop forecasting techniques for deep hail accumulating storms, will benefit the most from more frequent and higher quality hail reporting.
We thank all volunteers who provided hail depth reports, as well as the employees of the National Weather Service Forecast Office in Boulder for providing support in educating storm spotters in hail depth reporting and tracking hail depth reports via social media. We also thank Brielle Kissack of the University of Colorado for searching archived social media and news articles for hail accumulation reports, and Dr. Elizabeth Weatherhead for her guidance in statistical methods. We thank the University of Colorado for providing the Summit supercomputer for reliable operational testing of the hail accumulation algorithm (Anderson et al. 2017). Finally, we thank three anonymous reviewers who provided useful comments in refining this manuscript. Some of the material is based upon work supported by the National Science Foundation Division of Atmospheric and Geospace Sciences through Award 1661583.
² We tested the algorithm on a personal computer with a 64-bit operating system, an Intel Core i5-4210U CPU at 1.70 GHz, and 8 GB of RAM.