Localized Aviation MOS Program (LAMP) convection and lightning probability and “potential” guidance forecasts for the conterminous United States, developed by the Meteorological Development Laboratory (MDL), have been produced operationally and made available to aviation and other users through the National Digital Guidance Database (NDGD) since April 2014. In response to user requests for improved skill and resolution of these forecasts, MDL has recently made extensive upgrades, and a switch to the new LAMP guidance was made in January 2018. Upgrades include improved spatial and temporal resolution of the predictands, which were enabled by first time LAMP use of finescale radar reflectivity products from the Multi-Radar Multi-Sensor (MRMS) system, total lightning observations from a ground-based lightning sensing system, and finescale model output from the High Resolution Rapid Refresh (HRRR) model. This article describes how these new data inputs are applied in the LAMP model to obtain improved skill and sharpness of the convection and total lightning probability forecasts. Strengths and limitations in LAMP performance are shown through verification statistics and example verification maps for a selected intense convective storm case.
During 2012–17, flight delays and cancellations in the United States cost airlines and passengers over $20 billion (U.S. dollars) annually (FAA 2018). About 65% of the air travel disruptions were due to adverse weather, which includes low ceilings, low visibility, air turbulence, wind shear, and lightning. Statistics compiled by the National Weather Service (NWS 2019) reveal lightning strikes have caused a U.S. annual average of 29 human deaths during 2008–18 and many more serious injuries (see Curran et al. 2000 for casualty statistics during 1959–94). Lightning also results in heavy property damage, as the Insurance Information Institute (Insurance Information Institute 2019) reports that insurance payouts from lightning losses to residential properties averaged almost $900 million annually during 2007–16, and additional losses are incurred by commercial properties. As improved thunderstorm forecasts could mitigate these public safety and economic impacts, this study addresses a continuing need for skillful thunderstorm forecast guidance for the user community.
The history of automated gridded thunderstorm guidance forecasts dates back to the early 1970s, when the Techniques Development Laboratory [now Meteorological Development Laboratory (MDL)] of the National Weather Service (NWS) used a statistical model with a linear regression equations framework to estimate very short-range (2–6 h) thunderstorm probabilities (Charba 1977) on an 80-km grid for the eastern United States. About the same time, Reap and Foster (1979) used a similar statistical model together with a model output statistics approach (MOS; Glahn and Lowry 1972) to extend the predictive range of such gridded probabilities to 12–36 h. While both of these early studies used Weather Surveillance Radar (WSR-57) reflectivity data to define the thunderstorm predictand (Reap and Foster 1979, Fig. 2), predictors for the very short-range probabilities consisted of a mix of observational parameters derived from conventional weather observations and manually digitized WSR-57 radar data, large-scale numerical/dynamical weather prediction (NWP) model forecasts, and thunderstorm climatology, whereas the longer-range MOS forecasts used only the latter two predictor types. Over the course of four decades since these early studies, many similar automated, gridded thunderstorm probability guidance products involving a variety of statistical methods have followed, most of which feature large-scale NWP model predictors and forecast ranges of 12 h and longer (Reap 1991; Hughes 1999; Bright et al. 2005; Burrows et al. 2005; Bright and Grams 2009; Bothwell 2009; Bothwell and Buckley 2009; Shafer and Gilbert 2008; Shafer and Rudack 2015). Fewer statistically based thunderstorm guidance products that focus on the short range (i.e., ~24 h and less) have been developed and implemented (Charba and Liang 2005a; Dupree et al. 2009; Charba and Samplatsky 2009; Charba et al. 2011), as they involve increased model complexity because of the need to incorporate current observations-based predictors together with NWP model-based predictors.
The Localized Aviation MOS Program (LAMP) was conceived at MDL during the early 1980s (Glahn 1980; Glahn and Unger 1986) to produce hourly updates of MOS guidance forecasts in the 1–25-h range. Over the years, principal users of LAMP guidance forecasts have been local NWS Weather Forecast Offices (WFOs) to support local public and aviation weather forecasts and the NWS Aviation Weather Center to support its national aviation forecast guidance responsibility. Quite recently, LAMP guidance products are also being considered as a short-range input to the NWS National Blend of Models (Tew et al. 2016; Craven et al. 2018).
As LAMP guidance gained prominence during the 1990s, the short-range gridded thunderstorm probability guidance mentioned above (Charba 1977) was incorporated into it, as thunderstorms are relevant to aviation operations and planning, and LAMP supports a combination of small-scale, observations-based, and large-scale NWP-based predictors such as used in this older thunderstorm model. Additionally, LAMP applies simple Lagrangian advection models to observations-based predictor grids to extrapolate their predictive value to longer forecast projections. Thus, LAMP functions to fill the information gap between current weather observations and large-scale MOS guidance forecasts (Ghirardelli and Glahn 2010). Recently, this “base” LAMP model was upgraded to incorporate hourly issued, finescale dynamical predictors from the High Resolution Rapid Refresh model (HRRR; Benjamin et al. 2016). Glahn et al. (2017) discuss how this “LAMP-HRRR meld” results in improved LAMP cloud ceiling and visibility forecast guidance.
At the time development work on the initial LAMP thunderstorm product (Charba and Liang 2005a) commenced in the early 2000s, cloud-to-ground (CG) lightning flash data (see section 2b for the definition of lightning flashes) from the National Lightning Detection Network (NLDN; Cummins et al. 1998) were already being used at MDL for MOS “thunderstorm” probability prediction (Hughes 1999, 2004). Thus, this early LAMP thunderstorm guidance also switched from use of radar reflectivity to CG flash data for the predictand definition. The product consisted of probabilities of one or more CG flashes in 20-km square grid boxes during 2-h periods in the 1–25-h range, and it was later upgraded by Charba and Samplatsky (2009). Then, Charba et al. (2011) introduced a companion LAMP predictand called “convection,” where convection is defined as either U.S. Weather Surveillance Radar-1988 Doppler (WSR-88D; Crum and Alberty 1993) composite reflectivity (CREF) ≥ 40 dBZ or ≥1 NLDN CG lightning flashes. While feedback from users of LAMP convection and CG lightning forecast guidance has been mostly favorable, some aviation users indicated a need for improved skill, sharpness, and resolution of the probabilities. To address this need, MDL recently upgraded this guidance through incorporation of advanced, finescale radar and lightning observational data together with HRRR model output, which is the subject of this article.
It is relevant to mention here that over the past 20–30 years major scientific advances have been made in understanding physical and kinematic processes associated with lightning production in thunderstorms. These advances have been enabled largely by the advent of four-dimensional radar reflectivity mapping of individual thunderstorm cells by WSR-88D radars (Crum and Alberty 1993; Crum et al. 1993) and lightning mapping by ground- and space-based lightning locating systems (LLSs; Nag et al. 2015). Both laboratory experiments (e.g., Takahashi 1978) and diagnostic studies of the internal structure of thunderstorms using radar and lightning data (e.g., Carey and Rutledge 2000; Lang and Rutledge 2002) indicate cloud electrification results from hydrometeor (graupel, hail, and supercooled water droplets) collisions in the mixed-phase region of intense convective cloud updrafts. Consistent with this cloud electrification mechanism, reflectivities of around 40 dBZ within intense thunderstorm updrafts from about the −10° to −20°C (environmental) height level (~6–12 km) have been found statistically well correlated with CG flashes (Vincent et al. 2004; Yang and King 2010; and Mosier et al. 2011). Also, these findings have been essentially confirmed through thunderstorm numerical simulations with explicit modeling of thunderstorm dynamics, kinematics, and cloud microphysics and electrification (Mansell et al. 2005; Fierro et al. 2006; Kuhlman et al. 2006, Barthe and Pinty 2007; Calhoun et al. 2014). Since the upgraded LAMP convection and lightning model incorporates cutting-edge radar and lightning data as well as output from the “convection-allowing” HRRR model, the degree to which these scientific advances are incorporated in the present LAMP model (or will be incorporated in a future LAMP upgrade) is noted at various points in the body of this article.
2. LAMP predictand and predictor upgrades
LAMP convection and lightning guidance upgrades were enabled through incorporation of 1) finescale observational data consisting of Multi-Radar Multi-Sensor (MRMS) reflectivity products (Smith et al. 2016; Zhang et al. 2016) and total lightning measurements [TL, comprised of CG and intracloud (IC) flashes] from the Earth Networks Total Lightning Network (ENTLN; Liu and Heckman 2012) and 2) mesoscale output from the HRRR model (Benjamin et al. 2016). MRMS and ENTLN observational data are used to upgrade the LAMP convection and lightning predictands, and these data along with HRRR forecasts are also used to specify new predictors. As radar and lightning data are used for both the LAMP predictands and predictors, they have critical roles. Thus, it is relevant to discuss the quality and characteristics of these observational data inputs for both the present (upgraded) and previous LAMP convection and lightning products.
a. Radar data
At the time the previous LAMP CG lightning (Charba and Liang 2005a; Charba and Samplatsky 2009) and convection (Charba et al. 2011) guidance products were being developed, the deployment of WSR-88D radars within the Next Generation Weather Radar (NEXRAD) program (Crum and Alberty 1993; Crum et al. 1993) had been completed, but real time availability of NEXRAD data was limited to the locally produced, coarse-resolution Radar Coded Message (RCM) product (OFCM 2017). RCM data were comprised of six broad categories of peak CREF in 10-km square grid boxes only twice each hour for the coverage area of each WSR-88D radar. Though Kitzmiller et al. (2002) developed and applied effective automated CREF screening procedures to remove various types of nonprecipitation echoes in CONUS mosaics of the local RCM grids, Charba and Liang (2005b) found these grids had enough residual “false” echoes to warrant their development and application of a simple, supplemental false precipitation echo screening procedure prior to application in the LAMP CG lightning and convection probability products (referenced above).
Less than a decade later, the National Oceanic and Atmospheric Administration (NOAA) National Severe Storms Laboratory (NSSL) developed the MRMS system to produce nationally mosaicked grids of multiple, locally produced NEXRAD radar reflectivity-based products (OFCM 2017). These grids were produced every 5 min beginning in early 2011 and every 2 min since August 2013 with 0.01° × 0.01° pixel (roughly 1 km × 1 km) spatial resolution for the full range of values of a number of reflectivity-based products, including CREF, vertically integrated liquid (VIL; Greene and Clark 1972), and the peak height of ≥30 dBZ reflectivities. The availability of these MRMS products allows replacement of coarse resolution RCM (CREF) data in LAMP with fine resolution MRMS CREF and VIL products, which constitutes a major LAMP upgrade.
Note, however, that LAMP use of only two MRMS products (CREF and VIL) is quite limited compared to MRMS products available to field forecasters today (Smith et al. 2016; OFCM 2017; https://www.nssl.noaa.gov/projects/mrms/operational/tables.php). Current MRMS products also include reflectivity at various heights above the freezing level and vertically integrated ice, which have been shown to be related physically (e.g., Carey and Rutledge 1996; 2000) and statistically (Vincent et al. 2004; Cecil et al. 2005; Yang and King 2010; and Mosier et al. 2011) to lightning. Unfortunately, these products could not be used in the current LAMP model since they did not become available in the NSSL MRMS archive until October 2016 [H. Reeves (NSSL), 2019 email communication], which is after the present January 2012–May 2016 LAMP developmental sample period. Thus, inclusion of these more recent MRMS products is a high priority in a planned near-term upgrade of the current LAMP convection and lightning guidance products.
The quality of MRMS products has significantly improved over the years since NSSL initiated their experimental production and archival in 2011. In particular, the presence of nonprecipitation echoes has been steadily declining due to development and application of automated nonprecipitation echo screening procedures (Smith et al. 2016; Zhang et al. 2016). Further, advancements in false echo screening have been strong in recent years through incorporation of dual polarimetric radar measurements (Tang et al. 2014; Krause 2016). On the other hand, since the LAMP convection and lightning model developmental sample extends back to January 2012, it predates recent MRMS data quality advancements, and Charba et al. (2017) found sporadic contamination from nonprecipitation echoes, especially in the early years. Also, weak radar coverage over western portions of the CONUS WSR-88D network resulting from sparse radar siting and rugged mountainous terrain presents an ongoing coverage problem there (Maddox et al. 2002). These concerns prompted Charba et al. (2017) to undertake development of an automated MDL supplemental quality control (QC) process for the CREF and VIL products used in LAMP. The data quality investigation and development of an elaborate supplemental QC process involved use of 5-km grids of maximum CREF, VIL, and echo top height at 15-min intervals, where the maximum was specified as the highest MRMS pixel value in a 5-km grid box. Charba et al. (2017) document the data quality investigation, the QC process, and its contribution to LAMP convection forecast performance. Here, it suffices to note that the QC’d 5-km maximum CREF and VIL grids comprised the radar database from which the convection predictand and predictors were specified.
b. Lightning data
CG flash data dating back to 1994 have been used for development of previous LAMP models, and these data were used operationally in those models from 2004 to 2018. Over this period the CG data source has been the NLDN (Cummins et al. 1998). From its inception in 1989 (see Orville 2008 for the history of NLDN) until 2013, NLDN detected mostly CG return strokes (see Krider et al. 1980 for a discussion of contrasting electromagnetic impulses emitted by CG return strokes versus IC lightning). NLDN groups CG strokes into flashes to mimic “lightning strikes” seen visually, as all strokes within 10 km and 1 s of each other are grouped into a CG flash (Cummins et al. 1998). Over the 1994–2018 period, the NLDN CG flash detection efficiency (DE; the detected percentage of actual CG flashes) and positioning accuracy over the CONUS gradually improved from ≥70% to ≥90% and from ≤4 to ≤1 km, respectively (Cummins et al. 1998; Mallick et al. 2014), due to several NLDN system upgrades. Then, in mid-2013 all NLDN network sensors were replaced with an upgraded version that is sensitive both to the very low frequencies and high peak currents that characterize CG strokes and to the higher frequencies and lower peak currents that characterize IC “pulses.” From an investigation of NLDN-detected IC pulses from rocket-triggered lightning in Florida during 2004–13, Mallick et al. (2014) reported an NLDN IC pulse DE of ≤32%. Using the upgraded NLDN sensor measurements during 2014 together with a newly formulated IC pulse grouping algorithm, Murphy and Nag (2015) obtained NLDN IC flash DE values in the 50%–60% range using lightning mapping array (LMA; Thomas et al. 2004) ground truth data in Oklahoma and Colorado. With such improved IC flash DE together with continuing high DE for CG flashes, the upgraded NLDN may be viewed as a TL LLS.
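The stroke-to-flash grouping described above can be illustrated with a minimal sketch. It is not the NLDN algorithm itself: the function names are hypothetical, and the rule that each stroke is compared against the first stroke of an existing flash is an assumption made here for simplicity. Only the 10-km and 1-s criteria come from the text.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def group_strokes(strokes, max_km=10.0, max_s=1.0):
    """Group (time_s, lat, lon) strokes into flashes.

    Assumption (not from the source): a stroke joins an existing flash when it
    lies within max_km and max_s of that flash's FIRST stroke; otherwise it
    starts a new flash.
    """
    flashes = []
    for t, lat, lon in sorted(strokes):
        for flash in flashes:
            t0, lat0, lon0 = flash[0]
            if t - t0 <= max_s and haversine_km(lat0, lon0, lat, lon) <= max_km:
                flash.append((t, lat, lon))
                break
        else:
            # No existing flash is close enough in space and time.
            flashes.append([(t, lat, lon)])
    return flashes

strokes = [
    (0.0, 35.00, -97.0),   # first stroke
    (0.3, 35.02, -97.0),   # about 2 km away, 0.3 s later: same flash
    (0.5, 36.00, -97.0),   # about 111 km away: new flash
    (2.0, 35.00, -97.0),   # same place, but 2 s later: new flash
]
flashes = group_strokes(strokes)
```

The `max_km` and `max_s` arguments are exposed so that other space and time criteria can be tried without changing the grouping logic.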
Meanwhile, Earth Networks, Inc. began providing TL flash data CONUS-wide in 2009 with their own ENTLN LLS. Note that ENTLN detects CG strokes and IC pulses as with the upgraded NLDN, though it defines CG and IC flashes with a smaller (0.7 s) time criterion than for NLDN [the ENTLN spatial criterion is 10 km (Liu et al. 2014), which is the same as for NLDN CG flashes]. ENTLN is distinguished by its use of “broadband” (1 Hz–12 MHz) sensors, which are sensitive to both CG strokes and IC pulses (Liu et al. 2014). In 2012, ENTLN was selected as the exclusive contract provider of TL data to the U.S. government for both research and operational applications. As we were planning a major upgrade of the LAMP convection and lightning models (the subject of this article), the potential use of ENTLN TL data was investigated by examining the quality of a January 2012–December 2015 sample using NLDN CG flash data as a benchmark of quality for the CG component of ENTLN TL flash data (Charba et al. 2015). This study showed that within the CONUS borders ENTLN CG flash counts were substantially higher than NLDN CG flash counts prior to an ENTLN waveform processing upgrade in June 2013, whereas these counts were close to one another afterward. Many spot checks conducted by the authors since then indicate the latter result remains true today.
Regarding ENTLN TL (i.e., CG + IC) flash detection performance, several studies have shown marked improvement over the years. With an early 3-yr ENTLN sample (2011–13), Rudlosky (2015) found ENTLN detected about 72% of TL flashes observed optically by the Tropical Rainfall Measuring Mission (TRMM) Lightning Imaging Sensor (LIS) on board an orbiting satellite over its southern CONUS field of view, with yearly improvements from 62% in 2011 to 80% in 2013. The Charba et al. (2015) study referenced above showed ENTLN TL counts increased steadily CONUS-wide over the 2012–15 sample examined, which resulted from increased IC flash reporting with the June 2013 ENTLN upgrade (noted above) and another IC reporting increase with a subsequent June 2014 ENTLN upgrade. The apparent improvement in ENTLN CG and IC detection over this period is consistent with high DE for IC and CG flashes (>95%) for a 2014–15 sample for Florida reported by Zhu et al. (2017) using natural and rocket-triggered lightning ground truth data. The Zhu et al. study also showed quite high ENTLN CG and IC flash classification accuracy (>90%) following an ENTLN flash classification upgrade in August 2015, which contrasts with much weaker ENTLN flash classification performance in Florida reported by Mallick et al. (2015) with an earlier (2009–12) sample.
Though improvements in ENTLN CG and IC detection over time have practical benefits, an evolving input dataset is potentially problematic in the statistically based LAMP model. Note that at least three ENTLN upgrades (June 2013, June 2014, and August 2015) fall within the January 2012–May 2016 LAMP model developmental (“training”) sample, which resulted in a decrease in CG counts and increases in both IC and TL counts. In this LAMP application, a potential adverse impact of this CG-to-IC reporting “swing” was alleviated by not distinguishing between CG and IC flashes in the predictand and predictor applications (sections 2c and 2d). Meanwhile, the reporting growth in ENTLN TL flashes over time was not addressed in the LAMP model development, and thus it could cause an underforecasting tendency in the early portion of the sample and overforecasting tendency afterward, including in real-time application of LAMP. While this potential prediction error has not been investigated, it will be addressed via an upcoming LAMP model upgrade through use of a more recent ENTLN sample with improved lightning reporting stability. Further, improved ENTLN IC-versus-CG reporting supports using IC flashes and CG flashes as separate LAMP convection and TL predictors, which could be beneficial considering large regional variations in the IC/CG ratio over the CONUS reported by Boccippio et al. (2001) and Medici et al. (2017).
In the more distant future, we also anticipate incorporating satellite-based TL data into LAMP from the recently launched Geostationary (GOES-16/17) Lightning Mapper (GLM) satellite system (Goodman et al. 2013). GLM provides continuous spatial and temporal coverage of TL, which makes these data well suited for merging with ground-based TL data. Also, complementary characteristics of ENTLN and GLM TL data offer the potential of synergistic merging of these data in LAMP. In particular, while both ENTLN and GLM TL flash data have high DE (GLM DE is >80% when averaged over day and night periods), ENTLN distinguishes between CG and IC flashes while GLM does not. On the other hand, ENTLN TL DE is likely not uniform (geographically) across the CONUS (which especially applies to IC flashes) since ENTLN network sensors are not evenly distributed over the CONUS, whereas geographical uniformity of detection is believed to be an inherent strength of the GLM TL data (Goodman et al. 2013). Still, early GLM TL data show significant systematic error (e.g., resulting from sun glint, “solar intrusions,” and “parallax” in locations of wide GLM viewing angles from nadir), as shown by Rudlosky et al. (2019). Before GLM data are applied in LAMP, these data quality concerns should be addressed.
c. Convection and lightning predictands
The upgraded LAMP convection predictand is defined as the occurrence/nonoccurrence of MRMS CREF ≥ 40 dBZ and/or one or more ENTLN TL flashes in a 20-km square grid box during a 1-h period. The CREF component is specified from the QC’d 5-km CREF grids (section 2a) as the maximum CREF pixel value in the predictand grid box over four 15-min times within the 1-h valid period, which means the CREF peak is used in the convection definition. In essence, this CREF assignment procedure acts to coarsen the MRMS spatial scale from the native 1-km pixel resolution to 20 km and the MRMS time scale from 5 (or 2) to 60 min. This scale coarsening is a form of smoothing, which should improve the match (correlation) between the predictand and meteorological predictors in the “neighborhood” of a grid point (Schwartz and Sobash 2017). Also, neighboring 20-km predictand grid boxes overlap by 10 km (as their centers have 10-km spacing), which results in a 10-km grid resolution while avoiding a reduction in event occurrences that would arise with nonoverlapping grid boxes. Since convection and lightning events are rare, especially in northern latitudes during the cool season, such boosting of predictand occurrences results in (statistically) more robust samples and improved stability of convection and lightning regression equations.
With assignment of the CREF maximum to the 20-km grid box, the binary (1/0 value) CREF component of the convection predictand is obtained by applying the ≥40-dBZ reflectivity criterion. The corresponding TL component is treated similarly through application of the TL criterion (i.e., the occurrence of one or more TL flashes). Finally, the convection predictand is obtained by combining the CREF and TL components.
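The predictand specification in this subsection can be sketched as follows. This is an illustrative reading of the text, not the operational LAMP code: the function name, the flat Cartesian (x, y) coordinates in km, and the row-and-column grid layout are assumptions. The 5-km input cells, four 15-min times, 20-km boxes, 10-km center spacing, and 40-dBZ criterion are taken from the description above. The second element of each result corresponds to the TL component alone.

```python
def convection_predictand(cref_grids, flashes_km, box_km=20.0, spacing_km=10.0,
                          cell_km=5.0, dbz_threshold=40.0):
    """Return {(cx_km, cy_km): (convection, lightning)} binary values.

    cref_grids: list of 2D lists [row (y)][column (x)] of CREF in dBZ on 5-km
                cells, one grid per 15-min time in the 1-h valid period.
    flashes_km: list of (x_km, y_km) total lightning flashes in the same hour.
    Coordinates are hypothetical flat-plane km measured from the grid origin.
    """
    ny, nx = len(cref_grids[0]), len(cref_grids[0][0])
    half = box_km / 2.0
    # Peak CREF in each 5-km cell over all supplied times.
    hourly_max = [[max(g[j][i] for g in cref_grids) for i in range(nx)]
                  for j in range(ny)]
    result = {}
    cy = half
    while cy + half <= ny * cell_km:
        cx = half
        while cx + half <= nx * cell_km:
            # Peak CREF over all cells whose centers fall inside the box.
            box_max = max(
                hourly_max[j][i]
                for j in range(ny) for i in range(nx)
                if abs((i + 0.5) * cell_km - cx) < half
                and abs((j + 0.5) * cell_km - cy) < half
            )
            cref_hit = box_max >= dbz_threshold
            tl_hit = any(abs(x - cx) <= half and abs(y - cy) <= half
                         for x, y in flashes_km)
            result[(cx, cy)] = (int(cref_hit or tl_hit), int(tl_hit))
            cx += spacing_km  # centers 10 km apart, so 20-km boxes overlap
        cy += spacing_km
    return result

# 30 km x 30 km toy domain: one 45-dBZ cell in the third 15-min grid,
# and one flash near the opposite corner.
quiet = [[0.0] * 6 for _ in range(6)]
echo = [[0.0] * 6 for _ in range(6)]
echo[0][0] = 45.0
out = convection_predictand([quiet, quiet, echo, quiet], [(27.0, 27.0)])
```

In this toy case the box centered at (10, 10) km is a convection event without lightning, while the box centered at (20, 20) km is both a lightning and a convection event.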
The TL predictand is defined identically to the TL component of the convection predictand discussed above. Though this implies the TL predictand is strongly correlated with the convection predictand, the two predictands are substantially distinct, as the convection relative frequency is roughly double the TL relative frequency (section 4a). This implies ≥40 dBZ radar echoes often occur without lightning, and convection predictand occurrences are mostly based on ≥40 dBZ radar echoes.
All upgrades to the convection and lightning predictands are summarized in Table 1, the most basic being reduction in valid period from 2 to 1 h and reduction in spacing of gridbox centers from 20 to 10 km. Note that the switch from the legacy RCM reflectivity to MRMS CREF should contribute to a more robust convection predictand, as the spatial, temporal, and nominal resolutions of the latter far exceed those of the former (see OFCM 2017 for a description of the RCM product). Further, the MDL supplemental QC applied to the MRMS CREF (Charba et al. 2017) is more comprehensive and rigorous than that applicable to RCM reflectivity data (Charba and Liang 2005b).
The replacement of CG flashes with TL flashes has benefits in LAMP for both the convection and lightning predictands. For convection, the incorporation of IC flashes expands the convection definition, as many studies have shown most lightning flashes detected by ground networks are IC (e.g., Boccippio et al. 2001; Charba et al. 2015; Medici et al. 2017); the lightning event definition expansion also applies to the LAMP lightning predictand. Note also that the relative frequencies of reported IC flashes and CG flashes can vary greatly between thunderstorm events. For example, Carey and Rutledge (1998) found extreme IC:CG ratios (in the 20–70 range) in a severe thunderstorm case study that contained hail and tornadoes in the High Plains, whereas Boccippio et al. (2001) and Medici et al. (2017) reported mean IC:CG ratios of 3–4 CONUS-wide and factor-of-10 regional variations over the CONUS using multiyear ground- and satellite-based lightning data historical samples. This implies (LAMP) CG-based lightning forecast guidance could be misleading for users in cases of thunderstorm events dominated by IC flashes. Also, the inclusion of IC flashes in the LAMP predictand definition can increase the lead time for CG flashes, as MacGorman et al. (2011) reported that IC flashes often preceded CG flashes by 5–10 min in north Texas and Oklahoma based on ground observing networks during May–August 2005.
d. Convection and lightning potential predictors
As noted in section 1, a basic function of the LAMP model is to issue hourly updates of MOS forecasts using current weather observations (Ghirardelli and Glahn 2010). Here, current MRMS and ENTLN data are used to specify observations-based (Obs) predictors, and hourly refreshed forecasts from the HRRR provide an additional predictor input. Obs, HRRR, and MOS potential predictors (Table 2) are paired with LAMP convection and TL predictands by evaluating these predictors at predictand grid box centers.
Persistence of current weather observations together with their simple extrapolation can be a powerful predictive tool for the next several hours in automated short-range weather prediction (Ghirardelli and Glahn 2010; Glahn et al. 2017). This arises because the effectiveness of short-range predictors based on large-scale NWP model output is hindered by model latency and dynamical adjustments (“spinup”) within 4–6 h following initialization. Persistence parameters may be beneficial even when potential predictors from the output of finescale, rapidly updating models such as the HRRR (Benjamin et al. 2016) are used, since such models may not mirror current observations well after model initialization (Glahn et al. 2017).
Current radar and lightning observations have long been used to fill the short-range predictive void left by NWP output for automated short-range thunderstorm, lightning, and convection prediction (Charba 1977; Charba and Liang 2005a; Dupree et al. 2009; Charba and Samplatsky 2009; Charba et al. 2011). In the present study, gridded MRMS CREF (and VIL) and ENTLN TL observations, underlying all but the last variable in the top section of Table 2, function as persistence predictors for the LAMP convection and TL predictands. Here, “persistence” means the most recent (current) observation is applied as a constant (fixed) predictor across multiple LAMP forecast projections (lead times), where the regression coefficient (in an associated regression equation) normally reduces its predictive impact (weighting) with increasing lead times. In Table 2 the valid times of the current MRMS reflectivity product and TL count (persistence) predictors are indicated. In addition, these persistence predictors are also displaced (advected) forward in time and space to enhance their predictive value with increasing LAMP predictand valid times. Henceforth, the terminology “initial” is used to denote the current time (persistence) predictors and “advected” denotes the displaced (persistence) predictors. Thus, the complete list of potential Obs predictors in Table 2 consists of the “initial” and “advected” forms of the CREF, VIL, and TL count variables.
An initial CREF (or VIL) grid is specified as the instantaneous MRMS CREF (VIL) pixel maximum value in a 10-km grid box centered on the 20-km predictand box. Note that using the pixel maximum in a 10-km grid box for the predictor specification is akin to using the pixel maximum in a 20-km grid box in the case of the convection predictand, but the length scale in the former is 10 km rather than 20 km in the latter (section 2c; see Schwartz and Sobash 2017 for a review of alternative neighborhood approaches for rendering fine-grid data to coarser mesh grid points). This analogy also applies for the TL predictors and predictand, where the TL flash count constitutes the core parameter. To maximize the correlation between the initial grids and the predictands, the most current MRMS and ENTLN TL observations (valid 14 and 15 min after the top of the hour, respectively) for a given hourly LAMP cycle run are used (Table 2). In addition, the CREF- and TL-based predictor time scales are effectively lengthened by using the CREF parameter 30 min earlier and by aggregating the TL counts over 30- and 60-min periods.
The advection of an initial grid consists of performing simple Lagrangian displacement using a weighted mean of predicted 850- and 500-hPa winds from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS; Kanamitsu et al. 1991); an early version of the advection model is described in Glahn and Unger (1986). The 850- and 500-hPa wind weights were predetermined heuristically (by a team of MDL developers that included the lead author) by visually comparing experimentally advected grids against verifying grids over many cases.
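A minimal sketch of this kind of displacement is given below. It is a simplification under stated assumptions rather than the LAMP advection model: the equal 850- and 500-hPa weights are placeholders (the actual heuristically determined weights are not given here), the steering wind is taken as uniform over the grid, and a backward trajectory with nearest-cell lookup stands in for whatever interpolation the operational model uses. Function names are hypothetical.

```python
def steering_wind(u850, v850, u500, v500, w850=0.5, w500=0.5):
    """Weighted mean of 850- and 500-hPa wind components (m/s).

    The 0.5/0.5 weights are placeholders, not the values used in LAMP.
    """
    return (w850 * u850 + w500 * u500, w850 * v850 + w500 * v500)

def advect(grid, u_ms, v_ms, hours, dx_km=10.0, fill=0.0):
    """Displace a 2D grid [row (y)][column (x)] with a uniform wind.

    Each target cell takes the value found upstream at (x - u*t, y - v*t),
    rounded to the nearest cell; cells whose origin is off-grid get `fill`.
    """
    ny, nx = len(grid), len(grid[0])
    shift_x = u_ms * 3.6 * hours / dx_km  # m/s -> km/h -> grid cells
    shift_y = v_ms * 3.6 * hours / dx_km
    out = [[fill] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            src_i = round(i - shift_x)
            src_j = round(j - shift_y)
            if 0 <= src_i < nx and 0 <= src_j < ny:
                out[j][i] = grid[src_j][src_i]
    return out

u, v = steering_wind(u850=4.0, v850=0.0, u500=8.0, v500=0.0)
initial = [[50.0, 0.0, 0.0, 0.0, 0.0]]
advected = advect(initial, u, v, hours=1)
```

With a 6 m/s westerly steering wind, the echo moves about 21.6 km in 1 h, which is two cells on a 10-km grid after rounding.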
Several postprocessing operations are applied to the initial and advected Obs variables to obtain the final set of potential Obs predictors. These consist of the application of truncation bounds, the formulation of binary predictor variables, and the application of conventional grid smoothing. Truncation is applied to prevent potentially troublesome “outlier” values, binary predictors help account for nonlinear relationships in linear regression equations, and smoothing is applied to filter random, finescale variability (noise) and thus improve the predictor–predictand correlations. More information on the variable truncation and smoothing procedures is provided in the appendix.
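The three operations can be sketched generically as below. The bounds, the binary threshold, and the 3 × 3 box-average smoother are illustrative assumptions only; the procedures actually used are those described in the appendix, and the function names are hypothetical.

```python
def truncate(grid, lower, upper):
    """Clip every value to [lower, upper] to suppress outliers."""
    return [[min(max(v, lower), upper) for v in row] for row in grid]

def binary(grid, threshold):
    """1.0 where the value meets the threshold, else 0.0."""
    return [[1.0 if v >= threshold else 0.0 for v in row] for row in grid]

def smooth(grid, passes=1):
    """3 x 3 box average; edge cells average over the neighbors that exist.

    This smoother is a stand-in, not the one documented in the appendix.
    """
    ny, nx = len(grid), len(grid[0])
    for _ in range(passes):
        new = [[0.0] * nx for _ in range(ny)]
        for j in range(ny):
            for i in range(nx):
                vals = [grid[jj][ii]
                        for jj in range(max(0, j - 1), min(ny, j + 2))
                        for ii in range(max(0, i - 1), min(nx, i + 2))]
                new[j][i] = sum(vals) / len(vals)
        grid = new
    return grid

spike = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = smooth(spike)
```

A binary predictor built from a continuous variable lets a linear regression equation respond to a threshold exceedance, which a single linear term in that variable cannot represent.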
Finally, the last variable in the top section of Table 2, predictand monthly relative frequency (PMRF), deserves elaboration, as it can have a significant predictive benefit when combined with a MOS probability (as shown in the bottom part of Table 2), especially for the longer-range LAMP forecast projections (i.e., projections beyond the 17-h maximum of HRRR model forecasts). The PMRF is computed separately for each 1-h valid period, 10-km grid point, and month to resolve geographical and diurnal variability in the convection and TL predictands over the course of the year. Since LAMP convection occurrences are rare, especially during cool-season months, PMRFs based on the January 2012–March 2016 developmental sample were quite choppy. Thus, extensive smoothing across adjacent grid points, valid periods, and months (see the appendix) was applied to obtain spatially and temporally coherent PMRFs. In the case of TL, which is rarer than convection, the sample was extended backward to 1997 by using NLDN CG data prior to 2012. This choice was supported through preliminary tests that showed broad spatial and temporal distributions of CG-based and TL-based relative frequencies are reasonably similar to one another. Since the space–time distributions of these lightning relative frequencies are more relevant in a predictor-use context than their absolute magnitudes, the choice of computing the relative frequencies with the CG-TL composite sample is reasonable. The benefit of the extended sample is that lightning relative frequencies are relatively coherent, and thus the follow-on smoothing of these grids was lighter than for convection.
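As a rough illustration of how such relative frequencies might be tabulated, the sketch below counts predictand occurrences per month, valid hour, and grid point, and then applies a simple circular 3-month running mean. The record layout, the function names, and the choice of smoother are assumptions for illustration; the smoothing actually applied is described in the appendix.

```python
from collections import defaultdict

def monthly_relative_frequency(events):
    """Relative frequency of the predictand per (month, valid_hour, point).

    events: iterable of (month, valid_hour, point_id, occurred) records,
            one per case in the developmental sample (hypothetical layout).
    """
    hits = defaultdict(int)
    cases = defaultdict(int)
    for month, hour, point, occurred in events:
        key = (month, hour, point)
        cases[key] += 1
        hits[key] += 1 if occurred else 0
    return {key: hits[key] / cases[key] for key in cases}

def smooth_over_months(freq_by_month):
    """Circular 3-month running mean (December wraps around to January)."""
    n = len(freq_by_month)
    return [(freq_by_month[(m - 1) % n] + freq_by_month[m]
             + freq_by_month[(m + 1) % n]) / 3.0 for m in range(n)]

sample = [(7, 21, "A", True), (7, 21, "A", False),
          (7, 21, "A", False), (7, 21, "A", True)]
pmrf = monthly_relative_frequency(sample)
smoothed = smooth_over_months([0.0, 0.0, 3.0] + [0.0] * 9)
```

Smoothing across months spreads an isolated monthly peak over its neighbors, which is one way to reduce the choppiness that arises when events are rare.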
HRRR potential predictors (middle section of Table 2) are obtained from archived output from the 3-km HRRR model (experimental version, see footnote 3; Benjamin et al. 2016). Key features of the HRRR include an hourly run cycle, nonhydrostatic dynamics with explicit modeling of convection, and assimilation of radar reflectivity and lightning data. Potential predictors include convective instability measures, surface moisture flux divergence, and precipitation parameters, which are commonly used for thunderstorm and lightning prediction (e.g., Bright et al. 2005; Burrows et al. 2005; Shafer and Fuelberg 2008). Also included are HRRR-simulated CREF and VIL parameters (which mirror Obs versions in Table 2) and the lightning threat 3 index. The latter is a measure of the TL threat defined by McCaul et al. (2009), which consists of a weighted combination of the model-simulated vertically integrated ice and the vertical graupel flux. Note that all of the HRRR parameters in Table 2 are from direct model output, except surface moisture divergence, which is computed from HRRR 10-m wind and 2-m specific humidity forecasts. Also, because of HRRR latency (due to model run time), the most current HRRR cycle that can be used in an hourly LAMP cycle hh is hh − 1. Finally, to enhance cycle-to-cycle consistency in the HRRR predictor input into LAMP, all HRRR potential predictors in Table 2 are also used from the previous HRRR cycle (hh − 2).
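The cycle bookkeeping described above (latest usable HRRR cycle hh − 1, plus the previous cycle hh − 2) can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch only shows the time arithmetic, including the rollover across 0000 UTC.

```python
from datetime import datetime, timedelta


def hrrr_cycles_for(lamp_cycle):
    """Return the (latest, previous) HRRR cycle times offered to a LAMP cycle.

    Mirrors the hh - 1 and hh - 2 rule stated in the text; `lamp_cycle`
    is the LAMP cycle time as a datetime (UTC assumed).
    """
    latest = lamp_cycle - timedelta(hours=1)
    previous = lamp_cycle - timedelta(hours=2)
    return latest, previous


# Example: the 0000 UTC 7 March 2017 LAMP cycle draws on the
# 2300 and 2200 UTC 6 March HRRR cycles.
latest, previous = hrrr_cycles_for(datetime(2017, 3, 7, 0))
```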
The specification of HRRR predictors on the LAMP 10-km grid from the native 3-km HRRR grid is similar to that for the MRMS variables in that all HRRR grid point values within a 10-km grid box are used in the assignment. One slight difference is that a mean of those values (rather than the maximum) is used, which constitutes an alternative spatial smoothing procedure in the neighborhood of the 10-km grid point (Schwartz and Sobash 2017). Also, the same postprocessing procedures applied to the MRMS and TL variables [Table 2; section 2d(1)] were applied to the HRRR variables (see the appendix).
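A minimal sketch of the fine-to-coarse grid assignment follows, contrasting the maximum (used for the MRMS variables) with the mean (used for the HRRR variables). It assumes, purely for illustration, that the coarse grid boxes are an integer multiple of the fine grid spacing; the actual 3-km and 10-km grids do not nest this simply, and the function name is hypothetical.

```python
from statistics import mean


def block_reduce(field, factor, reducer):
    """Collapse each factor x factor block of `field` to one value.

    `field` is a list of rows on the fine grid; `reducer` is a function such
    as `max` or `statistics.mean` applied to all fine-grid values in a block.
    Partial blocks at the grid edge use whatever values are available.
    """
    rows, cols = len(field), len(field[0])
    coarse = []
    for i in range(0, rows, factor):
        coarse_row = []
        for j in range(0, cols, factor):
            box = [
                field[r][c]
                for r in range(i, min(i + factor, rows))
                for c in range(j, min(j + factor, cols))
            ]
            coarse_row.append(reducer(box))
        coarse.append(coarse_row)
    return coarse


fine = [
    [0, 0, 10, 0],
    [0, 40, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 20],
]
box_max = block_reduce(fine, 2, max)    # MRMS-style assignment
box_mean = block_reduce(fine, 2, mean)  # HRRR-style assignment
```

The mean damps isolated peaks that the maximum preserves, which is why it acts as a form of spatial smoothing.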
Finally, it is worth noting that, as for the MRMS reflectivity products in the top section of Table 2, additional potentially useful HRRR predictor parameters for LAMP convection and TL have become available since the May 2016 ending of the LAMP developmental sample. These HRRR parameters include simulated reflectivity at the −10°C level, echo top height, vertically integrated graupel, and maximum updraft velocity in a vertical column, which have been linked to charge separation in the upper levels of intense convective cloud updrafts in previous studies (Carey and Rutledge 2000; Lang and Rutledge 2002; Vincent et al. 2004; Cecil et al. 2005; Yang and King 2010; Mosier et al. 2011). These new HRRR parameters will be included as potential predictors in a pending LAMP upgrade.
As noted in section 1, MOS forecasts produced four times per day are updated hourly in LAMP over the 1–25-h forecast range. Note that a practical benefit of the MOS-updating strategy in LAMP is that the output from multiple NWP models can be used without being overwhelmed with an excessive number of potential NWP model predictors. This advantage is used by developing separate MOS convection and TL probabilities based on each of two NCEP models: the GFS and the North American Mesoscale Model (NAM; Rogers et al. 2005), and then applying them as separate potential LAMP predictors (bottom section of Table 2). In addition, assorted product variables defined from these MOS probabilities (Table 2) are designed to capture dynamic, nonlinear interactions between the two component predictors and the convection or TL predictand (Bocchieri and Maglaras 1983; Reap and Foster 1979). Further, a meteorologically static variable, such as the predictand monthly relative frequency or terrain elevation (Table 2), is “dynamically activated” when multiplied with the MOS probability (a dynamic predictor; Charba and Samplatsky 2011b).
3. Development of LAMP convection and TL probability regression equations
The regression estimate of event probability (REEP) method (Miller 1964; Wilks 2006; Glahn et al. 2017) is used to produce the LAMP convection and TL probabilities. A forward screening technique (Glahn and Lowry 1972; Wilks 2006) is used to select predictors for the regression equations, which maximizes the reduction of predictand variance [RV, which is synonymous with explained predictand variance (Wilks 2006)]. The screening is applied to two forms of the candidate predictor variables: continuous value and grid binary (Jensenius 1992), as “continuous” predictors produce smooth distributions of forecast probabilities and experience reveals the combination of continuous and grid binary predictors yields enhanced forecast performance with fewer predictors than if only one of these is used.
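To make the screening idea concrete, the following Python sketch performs forward selection for a linear regression on a binary (0/1) predictand, which is the setting of REEP. It is not the MDL software: it uses successive orthogonalization of the candidates as one convenient way to obtain the incremental RV at each step, and it returns only the selection order and incremental RVs, not the regression coefficients. The function name and the default cutoffs (20 terms, 0.001 incremental RV) are parameters of the sketch.

```python
def forward_screen(candidates, event, max_terms=20, min_gain=0.001):
    """Forward screening on a binary predictand.

    candidates: list of predictor columns (each a list of floats).
    event: list of 0/1 predictand values.
    Returns (selected column indices, incremental RV for each selection).
    """
    n = len(event)
    mean_y = sum(event) / n
    resid = [y - mean_y for y in event]      # centered predictand
    total = sum(r * r for r in resid)        # total sum of squares
    if total == 0:
        return [], []                        # no variance to explain
    cols = {}
    for j, col in enumerate(candidates):     # center every candidate
        m = sum(col) / n
        cols[j] = [x - m for x in col]
    selected, gains = [], []
    while cols and len(selected) < max_terms:
        best_j, best_gain = None, 0.0
        for j, col in cols.items():
            norm = sum(x * x for x in col)
            if norm < 1e-12:                 # nothing left after orthogonalization
                continue
            proj = sum(x * r for x, r in zip(col, resid))
            gain = proj * proj / norm / total
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None or best_gain < min_gain:
            break
        chosen = cols.pop(best_j)
        norm = sum(x * x for x in chosen)
        beta = sum(x * r for x, r in zip(chosen, resid)) / norm
        resid = [r - beta * x for r, x in zip(resid, chosen)]
        for j in list(cols):                 # remove the chosen direction
            b = sum(a * c for a, c in zip(cols[j], chosen)) / norm
            cols[j] = [a - b * c for a, c in zip(cols[j], chosen)]
        selected.append(best_j)
        gains.append(best_gain)
    return selected, gains
```

Because each remaining candidate is orthogonalized against the predictors already chosen, the gain computed at each step is the additional fraction of predictand variance explained, so the gains sum to the cumulative RV.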
a. Equation stratification by LAMP cycle, forecast range, and season
The convection and TL regression equations are stratified by LAMP hourly cycle, forecast projection, season, and geographical region. These stratifications are supported by a developmental sample that includes all days from 1 January 2012 to 31 May 2016 (~4.4 years). The LAMP cycle stratification consists of developing separate regression equations for each of the 24 hourly cycles, which allows diurnal diversity in predictors. The risk of cycle-to-cycle inconsistency in the forecast probabilities is countered by offering HRRR predictors not only from the latest hourly HRRR cycle but also from the previous HRRR cycle [Table 3; section 2d(2)]. A similar strategy is also applied for MOS predictors, as MOS predictors are offered not only from the most recent 6-hourly MOS cycle but also from the previous MOS cycle.
Regression equations are also stratified by LAMP forecast projection, such that a fixed, unique set of predictors is applied to each of four forecast projection ranges (Table 3). The adoption of this previously unused strategy was based on findings from probability skill sensitivity tests, wherein one or more of the three predictor types in Table 2 were excluded from test equations. These tests revealed (not surprisingly) that initial and advected Obs predictors contribute most to forecast probability skill at the shortest forecast projections, whereas they contribute little or no skill beyond about 6 h (section 4b). Thus, while all candidate predictor types (Obs, HRRR, and MOS) are screened for projections in the 1–12-h range, Obs predictors are excluded for longer projections, and HRRR predictors are not used for projections beyond 17 h (the maximum LAMP forecast projection for which HRRR model forecasts were available in the developmental dataset). This predictor segregation strategy results in improved equation development efficiency, as progressively fewer candidate predictors are screened with increasing forecast-range interval (Table 3) with little or no loss in forecast skill. Potential probability inconsistencies across the forecast-range interfaces are addressed by overlapping the ranges by one or two hours and averaging the ensuing “overlapping probabilities”.
The LAMP seasonal stratification consists of developing separate regression equations for each of three calendar periods. The “cool,” “spring,” and “summer” seasons are defined as 16 October–15 March, 16 March–30 June, and 1 July–15 October, respectively. This seasonal stratification was originally used for MOS thunderstorm prediction by Reap and Foster (1979) to account for seasonal diversity in dominant thunderstorm forcing mechanisms: during the cool season, convection is typically associated with strong synoptic-scale cyclones and weak convective instability; during spring, the associated cyclonic systems are generally weaker and convective instability is stronger; and during summer, the cyclonic systems are usually quite weak and convective instability is strong.
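The season boundaries given above translate directly into a date lookup. The following Python sketch (the function name is hypothetical) assigns a calendar date to the cool, spring, or summer season using exactly those boundaries.

```python
from datetime import date


def lamp_season(day):
    """Map a calendar date to the LAMP season defined in the text.

    cool:   16 October - 15 March
    spring: 16 March  - 30 June
    summer:  1 July   - 15 October
    """
    key = (day.month, day.day)
    if (3, 16) <= key <= (6, 30):
        return "spring"
    if (7, 1) <= key <= (10, 15):
        return "summer"
    return "cool"
```

For example, `lamp_season(date(2017, 3, 6))` returns `"cool"`, so a storm on 6 March falls under the cool-season equations.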
b. Geographical regionalization
With seasonal stratification of the LAMP developmental sample and an overall sample comprising 4.4 years, the number of days falling within each of the three seasons is in the 400–600 range. Since convection and TL occurrences are rare (i.e., climatic relative frequencies are of the order of one occurrence per 100 cases), these seasonal samples are not large enough to yield statistically stable regression equations (Wilks 2006) for individual grid points. The opposite extreme is to combine the predictor–predictand data over all grid points such that a single “generalized operator” regression equation (GOE; Bocchieri and Glahn 1972, p. 870) applies to all forecast points. While attractive features of GOE include prediction model simplicity and more stable regression equations, the method does not generally accommodate geographical diversity of predictors (footnote 4), which could hamper forecast probability performance, and computation of GOE regression equations can be challenging with a very large number of developmental grid points. A compromise strategy involves separate regression equations for geographical regions [i.e., a regionalized approach (REG)], which requires longer developmental samples for regression equation stability than with GOE. An even more formidable challenge with REG is nonmeteorological discontinuities in the probabilities across region boundaries. For LAMP CG lightning probabilities on a 20-km grid, Charba and Samplatsky (2009) found that treating regional discontinuities with conventional grid smoothing was only marginally effective, and the discontinuity problem is more severe with finer grids.
Charba and Samplatsky (2011a) developed a novel method to effectively address the regional discontinuity problem, which they applied for high-resolution MOS probabilistic quantitative precipitation forecasts (PQPFs) on a 4-km CONUS grid (Charba and Samplatsky 2011b). The method features specifying geographical regions with slight overlap of neighboring regions and weighted averaging of multiple regional forecasts in the overlap areas. Note that while the number of overlapping regions and their areal complexity was high in this PQPF application, it was also shown PQPF skill was only slightly less with a smaller number of simple regions owing to use of a few dominant predictors that contain the predictand relative frequency and terrain elevation at each grid point. With this strategy, the regional overlap technique was also successfully applied for the old 2-h LAMP convection and CG lightning guidance with a relatively small number of regions across the CONUS (Charba et al. 2011; Fig. 4).
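The idea of weighted averaging in overlap areas can be illustrated with a short Python sketch. The weighting function is not specified in this section, so the linear ramp used here (weight growing from 0 at a region's edge to 1 at the inner side of the overlap zone) is an assumption for illustration only, as are the function names.

```python
def ramp_weight(distance_inside, overlap_width):
    """Assumed linear weight: 0 at the region edge, 1 once past the overlap zone."""
    return min(max(distance_inside / overlap_width, 0.0), 1.0)


def blend_regional(forecasts):
    """Weighted average of regional probabilities at one grid point.

    forecasts: list of (probability, weight) pairs, one per region
    whose domain covers the grid point.
    """
    total_weight = sum(w for _, w in forecasts)
    if total_weight <= 0:
        raise ValueError("at least one region must have positive weight")
    return sum(p * w for p, w in forecasts) / total_weight


# A grid point 1 cell inside region A and 3 cells inside region B,
# with an assumed overlap zone 4 cells wide.
w_a = ramp_weight(1, 4)
w_b = ramp_weight(3, 4)
blended = blend_regional([(0.40, w_a), (0.20, w_b)])
```

With weights that vary smoothly across the overlap zone, the blended probability moves gradually from one regional equation's value to the other's, rather than jumping at a boundary.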
In the present upgrade of the old LAMP convection and lightning guidance, we use the overlapping regions shown in Fig. 1, which are fewer in number and larger in size than in Charba et al. (2011) since the developmental sample here is smaller than that used before. Factors considered in their delineation include the gradual northward and westward climatological progression of convection from the southeastern United States [southeast region (SE)] during the cool season to all of the United States during spring (reflected by the south–north demarcation of regions over the eastern United States), broad convection coverage throughout the CONUS during summer, excluding the far western United States [accounted for by the Pacific Coast region (PC)], and substantially reduced radar data quality and coverage throughout the western United States [accounted for by the added Rocky Mountain region (RM)]. A consideration that forced specification of relatively large PC and RM regions is adequate developmental samples, as samples for these regions were depleted by missing data due to the MRMS supplemental QC.
c. Highly ranked convection and TL probability predictors
With the forward-selection screening technique (Glahn and Lowry 1972; Wilks 2006), the first selected predictor yields the maximum predictand RV, and subsequently selected predictors yield peak incremental RVs. Thus, predictors selected earliest contribute most to the cumulative RV and to forecast probability. For the LAMP convection and TL equations, the selection cutoff consisted of either an arbitrary maximum of 20 predictors or fewer predictors where the incremental RV with the next candidate was below 0.001.
To assess the relative importance of individual predictors over many regression equations, a simple ranking scheme was devised, which involved assignment of selection-order values to predictors. Since an equation has 20 or fewer predictors, the order value for the first selected predictor was 20, and the value for each successive predictor was reduced by one such that these values are 19, 18, etc. Since predictors vary as a function of LAMP cycle, season, and geographical region, the ordering values were summed over all (504; 24 cycles × 3 seasons × 7 regions) convection or TL equations. Then, predictors were ranked according to the ordering-value sums (the higher the sum, the higher the ranking). Note that this ranking procedure was applied to the 1–12-h (shortest) forecast range, as all candidate predictors in Table 2 were screened only for this range (Table 3).
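The ranking scheme just described amounts to a simple tally, sketched below in Python. The predictor names in the example are placeholders, not entries from Table 2 or Table 4.

```python
from collections import defaultdict


def rank_predictors(equations, max_terms=20):
    """Rank predictors by summed selection-order values.

    equations: list of equations, each given as the list of predictor
    names in the order they were selected (first selected first).
    The first selected predictor scores max_terms, the next max_terms - 1,
    and so on; scores are summed over all equations.
    """
    totals = defaultdict(int)
    for selection_order in equations:
        for position, name in enumerate(selection_order):
            totals[name] += max_terms - position
    # Highest sum first; ties broken alphabetically for a stable result.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


example = [
    ["A", "B", "C"],
    ["B", "A"],
    ["B", "C"],
]
ranking = rank_predictors(example)
```

In the paper's setting the outer list would hold the 504 equations (24 cycles × 3 seasons × 7 regions) for one predictand.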
Based on this ranking scheme, the eight highest ranked convection and TL predictors are listed in Table 4. Note that the lists for both convection and TL exhibit approximately equal proportions of Obs, HRRR, and MOS predictors (Table 2), which suggests probabilities for these predictands should be broadly similar to one another. Conversely, the predictor lists also exhibit marked distinctions. Specifically, Obs predictors in the convection list consist exclusively of MRMS CREF and VIL parameters, which implies the dominance of the CREF component over the TL component in the convection predictand (section 2c). Meanwhile, Obs predictors in the TL list consist exclusively of TL count parameters, reflecting their inherent correlation with the similarly defined TL predictand. Regarding HRRR predictors (Table 4), note that 1) they are highly ranked in both lists, and 2) lightning threat 3 index is top ranked in the TL list and is absent in the convection list. These findings indicate HRRR predictors have strong impacts for both convection and TL, and these impacts are rather unique to each (shown in section 6). It is worth noting that when new MRMS and/or HRRR parameters (i.e., 30–40-dBZ reflectivity at or above the −10°C level, vertically integrated ice, echo top height, updraft strength) found to be linked to lightning in the literature (Carey and Rutledge 2000; Lang and Rutledge 2002; Vincent et al. 2004; Cecil et al. 2005; Yang and King 2010; Mosier et al. 2011) are incorporated in a pending upgrade of the LAMP convection and TL models, it will be interesting to investigate their potential impact in LAMP convection and TL predictor rankings.
4. Performance scoring of convection and lightning probability
a. Skill, reliability, and sharpness for CONUS, geographical regions, and seasons
The measure of skill used for the convection and lightning probabilities is the “half-Brier score” (Brier 1950) improvement on climatology [henceforth called Brier skill score (BSS); Wilks 2006], where convection and TL monthly relative frequencies discussed in section 2d(1) serve as the climatology reference. The BSS is a standard measure of skill of estimated probabilities and reliability diagrams show their calibration and sharpness (Wilks 2006).
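For readers who wish to reproduce the skill measure, a minimal Python sketch is given below. It uses the standard definitions: the half-Brier score is the mean squared difference between forecast probability and the 0/1 observation, and the BSS is the fractional improvement of that score over the climatological reference. The function names are hypothetical, and the result is a fraction (multiply by 100 for percent).

```python
def half_brier(probs, obs):
    """Mean squared error of probability forecasts against 0/1 observations."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(obs)


def brier_skill_score(probs, clim, obs):
    """Fractional improvement of `probs` over the reference `clim`.

    probs: forecast probabilities (0-1), one per case.
    clim:  reference probabilities for the same cases, e.g. the monthly
           relative frequency valid at each grid point and hour.
    obs:   1 where the event occurred, else 0.
    """
    reference = half_brier(clim, obs)
    if reference == 0:
        raise ValueError("reference score is zero; skill is undefined")
    return 1.0 - half_brier(probs, obs) / reference


obs = [1, 0, 0, 0]
clim = [0.25, 0.25, 0.25, 0.25]
bss = brier_skill_score([0.5, 0.0, 0.0, 0.0], clim, obs)
```

A forecast identical to the reference scores 0, and a perfect forecast scores 1.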
BSS versus forecast projection curves for convection and TL probabilities are shown in Fig. 2, based on a verification sample consisting of 246 evenly spaced days from 6 May 2014 to 31 May 2016 and the 1800 and 0600 UTC LAMP cycles. Note that since these LAMP cycles are “diurnally diverse” [i.e., convection is (climatologically) in a stage of rapid diurnal growth at the former cycle and in a mature stage at the latter cycle], BSS values for these two cycles combined may provide a fair representation of skill across all 24 hourly LAMP cycles. The BSS curves for both convection and TL (Fig. 2) feature relatively high skill at the 1-h forecast projection and a sharp skill falloff to 4 h. Thereafter, the skill profiles abruptly level off with projection out to 25 h, except for a modest skill falloff in the 16–18-h range (which is attributed to the loss of HRRR predictors beyond 17 h; Table 3). Meanwhile, the convection versus TL skill curves also exhibit notable distinctions: TL shows higher skill for the 1-h projection, but the subsequent skill falloff is sharper and deeper. This suggests initial and advected observational (persistence) predictors at the first forecast projection are stronger for TL, whereas for longer projections HRRR and MOS predictors are apparently weaker for TL than for convection. (The sensitivity of skill to the individual Obs, HRRR, and MOS predictor types, discussed in section 4b, addresses this further.)
Strong regional contrasts in convection and TL BSS appear across the CONUS. For example, Fig. 3 shows a comparison of convection and TL BSS curves for the CP and RM regions (Fig. 1) for a verification sample comprised of a May–September subset of the sample used for Fig. 2. Note that BSS values for both convection and TL are much higher for the CP region than for the RM region. (For convection the contrast in skill for the western versus the eastern United States is also shown in Fig. 4). The relatively low skill in the (mountainous) RM region needs to be investigated to better understand the underlying cause, though the well-known poor quality of radar data across the western United States (Maddox et al. 2002; Charba et al. 2017) may be a factor and even the quality of TL data in the western United States could be adversely impacted by relatively sparse ENTLN sensor density there.
A comparative test of convection probability skill with regionalized (REG) versus nonregionalized (GOE) regression equations (section 3b) is shown in Fig. 4. In this figure, for simplicity the verification data are aggregated over the five regions east of the Rocky Mountains (“East”; Fig. 1) and similarly for the RM and PC regions comprising the western United States (“West”). The figure shows a small skill boost of REG over GOE for the East and a more substantial boost in the West. This finding of somewhat improved skill with regionalization via a 4-yr or larger developmental dataset is consistent with findings from similar tests reported in Charba et al. (2011) and Charba and Samplatsky (2011b).
CONUS BSS-versus-projection curves for convection and TL stratified by LAMP cool, spring, and summer seasons exhibit notable seasonal trends (Fig. 5). At the shortest forecast projections seasonal skill is highest during the spring for both convection and TL. At longer projections the highest seasonal skill for convection switches to the cool season. For most projections skill is lowest during summer for both convection and TL, which is attributed to relatively small space and time scales with which convection and TL occur during this time of the year.
Reliability diagrams for the convection and TL probabilities for short- (1–5 h) and medium-range (11–15 h) LAMP forecast projections are shown in Fig. 6. (The verification data are combined over five forecast projections to avoid short-sample aberrations in the plots.) The diagrams depict good reliability for both convection and TL (which is especially true for the short range), as the plotted points are close to the perfect reliability line over the entire 0%–100% probability span. For the medium range, moderate overforecasting of upper probabilities is present for the cool season, though the associated samples are relatively small.
The sharpness of forecast probabilities reflects the degree to which the probabilities deviate from sample means, where strong deviations indicate good sharpness (Wilks 2006). In Fig. 6 sample mean seasonal convection and TL probabilities together with corresponding relative frequencies of verifying observations are depicted by small arrows plotted near the lower-left corner of each reliability diagram. Note that these sample means are quite small (i.e., ~5% for convection and less for TL during the spring and summer seasons and for the cool season both are just slightly above 0%). Contrastingly, the inset probability histograms show that corresponding convection and TL probabilities are distributed over the entire 0%–100% probability range for the short projections, and this also applies to the midrange projections for convection. Thus, the probabilities are not biased toward climatology [i.e., sample mean relative frequencies (RFs)]; instead, they exhibit strong deviations from sample mean RFs (noted above), which signifies quite good probability sharpness, especially for convection.
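The quantities plotted in a reliability diagram and its inset histogram can be computed with a few lines of Python, as sketched below. The choice of ten equal-width probability bins and the function name are assumptions of the sketch, not details taken from Fig. 6.

```python
def reliability_table(probs, obs, n_bins=10):
    """Bin forecasts by probability and summarize each occupied bin.

    Returns a list of (mean forecast probability, observed relative
    frequency, case count) tuples, ordered from the lowest bin upward.
    Points near the diagonal (first value close to second) indicate good
    reliability; the counts form the sharpness histogram.
    """
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for p, o in zip(probs, obs):
        k = min(int(p * n_bins), n_bins - 1)   # p == 1.0 goes in the top bin
        bins[k][0] += p
        bins[k][1] += o
        bins[k][2] += 1
    return [(s / n, hits / n, n) for s, hits, n in bins if n]
```

A forecast set that hugged climatology would place nearly all cases in the lowest bin, whereas sharp forecasts populate bins across the full 0–1 range.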
b. Probability skill sensitivity to Obs, HRRR, and MOS predictors
Here, we investigate how the three predictor types used in LAMP [i.e., Obs, HRRR, and MOS (section 2d; Table 2)] relate to convection or TL probability skill with LAMP forecast projection. The investigation was conducted through development and testing of experimental LAMP models that use an individual predictor type and/or selected combinations (Table 5 specifies predictor types used in each model) to see how convection and lightning probability skill with each relate to skill with the “All” model (uses all three predictor types). For brevity, these models were limited to a maximum forecast projection of 16 h.
BSS versus projection curves for the experimental convection and TL probabilities (Fig. 7) reveal strong skill distinctions among various predictor types relative to “All” predictor types (which shows highest skill over all LAMP forecast projections). Note that Obs+advobs skill is almost as high as for All for the shortest two projections and Obs skill is close behind at the first (1-h) projection. Note that Obs+advobs skill curves together with All-advobs curves indicate that Obs predictors contribute to All skill throughout the 1–6-h range. Also, comparing Obs+advobs and HRRR skill, we see clearly superior Obs+advobs skill in the 1–3-h range and then higher HRRR skill at four hours and beyond. Last, comparing HRRR and MOS skill, we find similar skill levels over all projections, with a slight HRRR advantage at the shortest projections and a slight MOS advantage near 16 h. More significantly, combining these two predictor types (HRRR+MOS) yields a clear skill boost, which implies strong synergy between these NWP model predictor types.
It is worth commenting on LAMP convection and lightning forecast performance in relation to major advances that have been made over the past 15 years or so in cloud-scale NWP modeling of intense convection and lightning. Many of these NWP studies involve applications of various versions of the Weather Research and Forecasting (WRF) community regional model (Skamarock et al. 2008) over regional (continental) domains, wherein convective cloud and hydrometeor microphysics are explicitly simulated using “convection-allowing” (1–4 km) grid meshes. A basic challenge facing these applications is “spinup,” where convection absent in the initial state (based on large-scale analyses of basic atmospheric variables) develops over the following 6 h or so but with error in the precise timing and placement of convective cells (Clark et al. 2010; Kain et al. 2010). To address this problem, a number of works have applied simple, computationally efficient methods to assimilate high-resolution observations of nonmodel variables (such as radar reflectivity and lightning measurements) in the initialization with varying levels of success (e.g., Fierro et al. 2012; Fierro et al. 2013; Marchand and Fuelberg 2014; Fierro et al. 2015; Lynn et al. 2015). Further, the (operational) HRRR model (Benjamin et al. 2016) applied in LAMP constitutes an advanced version of WRF, and though both radar reflectivity and lightning flashes are similarly incorporated in its initialization, the LAMP probability skill sensitivity tests (discussed above) imply these observational data are not well represented in HRRR short-range forecasts, which LAMP effectively compensated for by using these observational data directly as convection and lightning predictors. Meanwhile, more complex, computationally expensive methods also face observations–assimilation obstacles, as Marchand and Fuelberg (2014) note in their brief review.
Finally, cloud electrification and lightning have been explicitly simulated in several studies involving WRF models. Using a simplified, computationally efficient electrification scheme in a WRF model, Fierro et al. (2013) obtained reasonable electrification simulations in three diverse cases, whereas more intricate schemes employed in Mansell et al. (2002), Mansell et al. (2005), Kuhlman et al. (2006), Barthe and Pinty (2007), and Calhoun et al. (2014) are probably years away from real-time application due largely to high computational cost. Even then, LAMP-type models may continue to have a useful predictive role because of the inherent difficulty for these NWP models to closely reflect current observations and to precisely forecast the coverage, timing, location, and intensity of convective storm cells (e.g., Clark et al. 2010).
5. Derivation and performance of convection and TL “potential”
In the previous section we saw that sharpness and skill of the convection and TL probabilities decrease strongly with increasing forecast projection, which is manifested as a reduction in peak probabilities with projection, particularly in the first 6 h. Also, probability sharpness and skill exhibit marked seasonal and geographical variations. Such variability in probability characteristics may pose a challenge for users, especially those who express forecast uncertainty in nonprobabilistic ways.
A traditional approach for addressing this problem at MDL has been to complement the probability product with a derived (categorical) yes/no forecast rendition. The latter is derived by applying predetermined threshold probabilities that vary with forecast projection, season, geographical region, etc., where the (occurrence) forecast is yes when the probability equals or exceeds the threshold value and no otherwise. A threshold probability is derived objectively through an iterative scheme where the threat score [TS; same as the critical success index (Schaefer 1990)] for the categorical forecasts is maximized within a prescribed, narrow bias range slightly above 1.0 (footnote 5). Such categorical forecasts have a near-perfect, constant bias despite probabilities that vary strongly with forecast range, season, time of the day, etc.
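The threshold derivation can be illustrated with the Python sketch below. It replaces the iterative scheme mentioned above with a plain scan over candidate thresholds in steps of 0.01, and the bias range of 1.0–1.1 is an assumed placeholder rather than the prescribed range. TS and bias follow their standard definitions: TS = hits / (hits + misses + false alarms), and bias = yes forecasts / observed events.

```python
def contingency(probs, obs, threshold):
    """Count hits, misses, and false alarms for yes = (prob >= threshold)."""
    hits = sum(1 for p, o in zip(probs, obs) if p >= threshold and o)
    misses = sum(1 for p, o in zip(probs, obs) if p < threshold and o)
    false_alarms = sum(1 for p, o in zip(probs, obs) if p >= threshold and not o)
    return hits, misses, false_alarms


def threat_and_bias(hits, misses, false_alarms):
    """Threat score (critical success index) and frequency bias."""
    denom = hits + misses + false_alarms
    ts = hits / denom if denom else 0.0
    events = hits + misses
    bias = (hits + false_alarms) / events if events else float("inf")
    return ts, bias


def derive_threshold(probs, obs, bias_range=(1.0, 1.1), steps=100):
    """Return (threshold, TS, bias) with the highest TS inside bias_range.

    Scans thresholds 1/steps, 2/steps, ..., 1.0. Returns None when no
    candidate threshold yields a bias inside the range.
    """
    best = None
    for k in range(1, steps + 1):
        threshold = k / steps
        ts, bias = threat_and_bias(*contingency(probs, obs, threshold))
        if bias_range[0] <= bias <= bias_range[1]:
            if best is None or ts > best[1]:
                best = (threshold, ts, bias)
    return best
```

In practice one such threshold would be derived for each stratum (forecast projection, season, region), which is what holds the bias nearly constant while the probabilities themselves vary.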
On the other hand, the yes/no rendition of the probabilities results in a huge loss of certainty information contained in the probabilities. To address this problem, Charba et al. (2011) formulated an extension to the two-category deterministic product such that four categories of event threat (risk) (no, low, medium, and high) are defined, which they coined “potential.” The derivation of potential is analogous to that for the yes/no categorization mentioned above, except that three probability thresholds are used rather than one. The bias ranges for these probability thresholds (Table 6) were prescribed to produce rational distributions of the four categories, with the caveat that the bias range for the medium probability threshold (i.e., for medium and high potential combined) is identical to that for the yes/no case. This ensures that combining medium and high (no and low) potential reproduces the traditional yes (no) forecasts.
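The mapping from a probability to one of the four potential categories can be sketched in Python as below. The three threshold values in the example are invented placeholders (the operational thresholds vary by projection, season, and region), and the function names are hypothetical. A probability equal to a threshold is assigned to the higher category, matching the "equals or exceeds" rule of the yes/no product.

```python
from bisect import bisect_right

LEVELS = ("no", "low", "medium", "high")


def potential(prob, thresholds):
    """Return the potential category for one probability.

    thresholds: ascending (low, medium, high) threshold probabilities.
    The category index is the number of thresholds that prob meets or exceeds.
    """
    return LEVELS[bisect_right(thresholds, prob)]


def is_yes(category):
    """Collapse potential to the traditional yes/no forecast."""
    return category in ("medium", "high")


example_thresholds = (0.05, 0.20, 0.50)   # placeholder values only
category = potential(0.20, example_thresholds)
```

Because the yes/no split coincides with the medium threshold, `is_yes(potential(p, t))` agrees with `p >= t[1]` for any probability `p`.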
An example convection potential map together with the corresponding probability map is shown in Fig. 8. Note that potential nicely portrays the four convection threat levels while corresponding probabilities vary greatly across the CONUS. Figure 9 shows convection and TL TS and bias versus projection curves, where yes (no) forecasts consist of medium+high (no+low) potential and the verification sample is the same as for the corresponding BSS curves in Fig. 2. Note that the respective BSS and TS curves in Figs. 2 and 9 parallel one another for both convection and TL as they each show strong reductions in forecast skill and accuracy with projection following high values at early projections. Contrastingly, the bias curves in Fig. 9 exhibit near-constant bias across the entire forecast projection range, with values at or slightly above 1.0, which reflects the design of potential. Recall that this quasi-constant bias attribute also applies to low and high potential (Table 6). The constant bias feature of potential should serve to make this derived product a useful complement to the corresponding probabilities.
6. Subjective examination of probability and potential forecast performance: A case study
In this section, insights into performance strengths and weaknesses of the LAMP convection and lightning guidance are discussed through example forecast and verification maps for a massive, explosive convection outbreak over the central United States during the late afternoon and night of 6–7 March 2017. For this storm outbreak, Fig. 10 shows a 24-h map of tornado, large hail, and damaging wind reports from the NCEP Storm Prediction Center, most of which are from a 7-h period spanning from 2300 UTC 6 March to 0600 UTC 7 March.
a. Example of old 2-h probability versus new 1-h probability maps
Since the upgraded 1-h convection and TL products were developed to replace the corresponding 2-h products, improved forecast performance of the former over the latter is warranted. Unfortunately, an objective forecast performance comparison is not feasible since the predicted and observed events are unique to each of these products. On the other hand, subjective comparisons of various properties of forecast quality of the 2- versus 1-h products should be meaningful, particularly to field users. Figure 11 shows maps of 2-h convection and CG lightning probabilities for 8–10-h forecast projections from 1800 UTC 6 March 2017 together with corresponding maps of 1-h convection and TL probabilities for 9–10-h forecast projections. These maps show strong improvement in both spatial focus and sharpness of the 1-h probabilities over the 2-h probabilities. Improvements in spatial focus seen here are typical, as they reflect benefits resulting from the grid mesh reduction from 20 to 10 km, the valid period reduction from two hours to one hour, and finescale HRRR predictors. For short forecast projections, MRMS and TL observational predictors are largely responsible for similar improvements in detail and focus.
b. Example verification maps of 1-h probability and potential
Here, the performance of the 1-h convection and TL probability and potential guidance for the 6–7 March 2017 case is shown via verification maps. The first step in the verification procedure consists of rendering convection (TL) potential as yes/no forecasts, where yes (no) at a 10-km grid point corresponds to medium or high (no or low) potential (section 5). These yes/no forecasts are then matched with observed convection (TL) in 10-km grid boxes centered on the grid points. A forecast hit is scored where a yes forecast coincides with a convection (TL) occurrence (event), a miss is scored for a no forecast and the event occurred, and a false alarm arises with a yes forecast and the event did not occur. Finally, the verification map is depicted by superimposing hits (green color shading) and misses (black color) on the associated potential map [false alarms are shown via the “native” colors for medium (medium orange) and high (dark orange) potential]. Examples of such verification maps are shown in Fig. 12 for convection (right panels; left panels are the corresponding probability maps) [see also Fig. 14 (convection) and Fig. 15 (TL) below].
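The grid-point labeling behind such a verification map, together with the map-wide TS and bias, can be sketched in Python as follows. The label strings and function names are choices made for this sketch; the scoring rules are those stated above.

```python
from collections import Counter

YES_LEVELS = {"medium", "high"}


def verify_grid(potential_grid, observed_grid):
    """Label each grid point as hit, miss, false_alarm, or correct_no.

    potential_grid: rows of category strings ("no", "low", "medium", "high").
    observed_grid:  rows of 0/1 flags for event occurrence in each grid box.
    """
    labels = []
    for pot_row, obs_row in zip(potential_grid, observed_grid):
        row = []
        for category, occurred in zip(pot_row, obs_row):
            forecast_yes = category in YES_LEVELS
            if forecast_yes and occurred:
                row.append("hit")
            elif occurred:
                row.append("miss")
            elif forecast_yes:
                row.append("false_alarm")
            else:
                row.append("correct_no")
        labels.append(row)
    return labels


def map_scores(labels):
    """Threat score and bias for one verification map (None if undefined)."""
    counts = Counter(label for row in labels for label in row)
    hits, misses, fas = counts["hit"], counts["miss"], counts["false_alarm"]
    ts = hits / (hits + misses + fas) if hits + misses + fas else None
    bias = (hits + fas) / (hits + misses) if hits + misses else None
    return ts, bias


pot = [["no", "low", "medium"], ["high", "high", "no"]]
obs = [[0, 1, 1], [1, 0, 0]]
labels = verify_grid(pot, obs)
ts, bias = map_scores(labels)
```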
Figure 12 depicts rapid growth of convection (green and black areas in the verification maps) over the central United States spanning a 12-h period after 1800 UTC 6 March. Specifically, during 1800–1900 UTC, which corresponds to 0–1-h LAMP forecast projections (Fig. 12a), LAMP did not forecast convection from Kansas to Minnesota and none occurred, so the forecast is correct. By 2300–0000 UTC 6–7 March (5–6-h LAMP forecast projections from 1800 UTC), convection rapidly developed within a narrow band stretching across these states, most of which LAMP did not predict [shown in the right panel of Fig. 12b as the elongated black band (misses) that changes to green (hits) only along the northern flank]. Note that the corresponding probability map (left panel of Fig. 12b) shows just small scattered areas where peak probabilities were low (i.e., mostly below 50%). BSS, TS, and bias scores computed for this map time (11.1%, 0.15, and 0.4, respectively) are much weaker than those for the long samples in Figs. 2 and 9 (20.1%, 0.26, and 1.1, respectively). This convection underforecast arose from error within all three (Obs, HRRR, and MOS) predictor types. In particular, Fig. 13a shows a line of quite weak MRMS CREF “initial” echoes that is located to the east of the convection band in Fig. 12b; Fig. 13b shows that HRRR underpredicted the CREF intensity and coverage in this convection band; and Fig. 13c shows the GFS MOS convection probabilities were very low (under 15%) in the area. Contrastingly, by the 0500–0600 UTC 7 March valid time corresponding to 11–12-h forecast projections from the 1800 UTC LAMP cycle (Fig. 12c), the LAMP convection probabilities and potential increased explosively in a narrow line from eastern Oklahoma to southern Wisconsin, and this forecast verified very well, as the LAMP BSS, TS, and bias scores improve to 38.1%, 0.39, and 1.1, respectively. Figures 13d–f suggest the HRRR CREF (Fig. 13e) was largely responsible for the remarkable surge in peak convection probabilities to 100% seen in Fig. 12c, though other HRRR predictors not shown in Fig. 13 probably contributed also.
In contrast to the 1800 UTC cycle, LAMP 0000 UTC 7 March convection forecast performance for projections of 6 h and less is much improved (Fig. 14). For this cycle, a narrow, solid line of 100% convection probability was present for 0–1-h forecast projections over the northern plains, which the corresponding verification map shows to be a very good forecast (Fig. 14a), and yet the BSS and TS scores (73.3% and 0.60) are not greatly higher than for the long samples in Figs. 2 and 9 (52.7% and 0.54). Similarly, for 5–6-h forecast projections essentially all observed convection was correctly forecast (Fig. 14b), though there is notable overforecasting (bias = 1.4). These examples of outstanding short-range convection forecasts stem from an ideal scenario for the 0000 UTC LAMP model run: 1) an extensive line of strong convection at the 0000 UTC cycle time was already present, and this line persisted as it moved downstream over the subsequent 6-h period spanned in Fig. 14, and 2) the 2300 UTC 6 March HRRR run also correctly predicted this line. (Diagnostic predictor maps are not shown.) Thus, the extrapolated observational predictors and HRRR predictors in the LAMP model synergistically complemented each other to produce strong LAMP convection probability and potential performance.
As for TL forecast performance for the 6–7 March 2017 case, findings reveal general mirroring of convection performance. For example, LAMP TL performance spanning 6 h after 0000 UTC 7 March is shown in Fig. 15. Note that LAMP performance is especially strong for the 0–1-h period where the BSS (TS) is 73.3% (0.73), which is much higher than for the long sample average [59.4% (0.57)]. For the corresponding 5–6-h forecast (Fig. 15b), the TL BSS (TS) falls to 32.1% (0.30), which is still well above the long sample average [13.6% (0.18)].
Figure 16 shows maps for four key TL predictors for the 5–6-h forecast projection in Fig. 15b. Note that the HRRR lightning threat index (Fig. 16b) and the NAM MOS TL probability (Fig. 16c) match well the 5–6-h forecast in Fig. 15b, whereas the advected 30-min lightning count has little impact. Note also that the narrow strip of misses along the rear side of the convection band from Missouri to Illinois in Fig. 15b is linked to slight phase error in the HRRR lightning threat index (Fig. 16b), and false alarms from eastern Oklahoma to southern Missouri in Fig. 15b coincide with a mispositioned GFS TL probability maximum there (Fig. 16d).
7. Summary, findings, and future plans
Extensive upgrades have recently been made to previously operational LAMP 2-h convection and CG lightning probability and potential guidance forecasts for the CONUS. Spatial and temporal resolutions of the predictands were enhanced through reductions of the grid mesh from 20 to 10 km and predictand valid period from two hours to one hour, which were enabled through first-time LAMP application of finescale MRMS reflectivity products and HRRR model output. In addition, the replacement of CG lightning data in the old LAMP model with TL data provides more comprehensive definitions of the convection and lightning predictands as well as better lightning predictors (section 2).
Objective scoring of the upgraded convection and TL probabilities discussed in section 4 shows that probability skill and sharpness are quite high in the hour immediately following model cycle time, after which skill falls rapidly out to four hours and then levels off through 17 h. Skill sensitivity tests also discussed in this section reveal that the high skill and sharpness for the shortest few forecast projections is attributed mostly to extrapolation of finescale MRMS and TL observations, and the prevention of a skill fall-off in the 4–17-h range largely reflects the benefit of finescale HRRR model predictors. In the 17–25-h range, large-scale MOS predictors are the sole predictive source, which accounts for relatively low skill there. Section 5 describes how convection (and TL) potential is derived from, and aids use of, corresponding probabilities with widely varying skill and sharpness as a function of LAMP forecast projection, season, and geographical region.
Section 6 addresses LAMP forecast performance through examination of forecast-verification maps for a selected case of explosive convection development over the central United States. Example maps show the upgraded convection and TL probabilities have much better spatial focus and probability sharpness than those from the old, coarser-resolution LAMP model. Also, for forecast projections of 6 h and less these maps show LAMP performance was better where convection was already ongoing at LAMP model cycle time than for an earlier cycle where convection had not yet developed, which reflects the strong predictive role of extrapolated MRMS and TL observational predictors. Further, the example forecast maps showed outstanding LAMP forecast performance for scenarios where these observational predictors and HRRR-forecast predictors synergistically complement one another.
The upgraded LAMP convection and total lightning forecast guidance products were operationally implemented in January 2018 following real-time experimental testing that commenced in August 2016. Feedback from users of the guidance has been positive (see Table 7). Work is presently underway to develop LAMP convection and lightning products for Alaska similar to those for the CONUS. MDL also has near-term plans to upgrade the CONUS guidance by incorporating more current historical TL, MRMS, and HRRR data (discussed in section 2), expanding the geographical coverage into southern Canada, the Gulf of Mexico, and Caribbean Sea areas, and extending the forecast range to 38 h. In the longer term, we expect new GOES-16/17 TL data will also be incorporated into LAMP (also discussed in section 2).
Archived MRMS data and HRRR model output were provided by NOAA’s National Severe Storms Laboratory and Earth Systems Research Laboratory/Global Systems Division, respectively, and archived total lightning data were furnished by Earth Networks, Inc. Dr. Bob Glahn of MDL performed a careful review of an earlier version of the manuscript, which resulted in improved text clarity. Three anonymous peer reviewers provided very thorough, thoughtful, and constructive comments, which led to major improvements to the manuscript. This paper is the responsibility of the authors and does not necessarily represent the views of the NWS or any other governmental agency.
Lower and/or upper bound truncation is applied to Obs and HRRR variables in Table 2, as these variables can have outlier values [see footnote 2 in section 2d(1)]. Outliers are extremely low or high values, which may be due to observation, data transmission, or prediction error. Truncation bounds for a given potential predictor variable were determined heuristically by examining its CONUS-wide minimum, maximum, mean, and standard deviation over the full developmental sample and also inspecting maps for cases containing extreme values. Ultimately, the assignment of truncation bounds for each variable was a matter of human judgment. Where an outlier value is found, it is changed to the appropriate truncation value.
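As a minimal sketch of the truncation step, assuming the predictor field is held in a NumPy array: values outside the bounds are replaced by the nearest bound. The function name `truncate` and the bounds used in the usage note are hypothetical; the paper's actual bounds were set by human judgment for each variable.

```python
import numpy as np

def truncate(field, lower=None, upper=None):
    """Replace outliers with the appropriate truncation bound.

    lower / upper: optional bounds; pass None to leave that side unbounded.
    """
    field = np.asarray(field, dtype=float)
    if lower is None and upper is None:
        return field.copy()  # nothing to truncate
    # np.clip sets values below `lower` to `lower` and above `upper` to `upper`.
    return np.clip(field, lower, upper)
```

For example, `truncate(cref, lower=0.0, upper=75.0)` would cap a reflectivity field at illustrative (not source-specified) bounds of 0 and 75 dBZ.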
b. Binary variables
New binary predictor variables were created for each of the Obs and HRRR predictors in Table 2. A binary predictor variable takes on a value of 1 when the base continuous predictor variable equals or exceeds a predetermined threshold value and 0 otherwise. Three to four threshold values are generally specified for each variable, meaning that 3–4 additional candidate predictors are created for each continuous variable. The determination of threshold values is a trial-and-error process involving test regression runs and examining the resulting predictor means, standard deviations, correlation coefficients, and reductions of predictand variance. The goal is to determine predictor thresholds that maximize the reduction of predictand variance without overfitting the developmental sample (Wilks 2006).
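The construction of binary predictors from a continuous predictor can be illustrated as follows. This is a sketch only: the function name `binary_predictors` and the example thresholds are assumptions, and the trial-and-error selection of thresholds described above is not reproduced.

```python
import numpy as np

def binary_predictors(field, thresholds):
    """Return one 0/1 field per threshold.

    Each field is 1 where the continuous predictor equals or exceeds
    the threshold, and 0 otherwise.
    """
    field = np.asarray(field, dtype=float)
    return {t: (field >= t).astype(float) for t in thresholds}
```

Calling `binary_predictors(cref, (30, 40, 50))` on a reflectivity field would yield three candidate predictors, one per (hypothetical) dBZ threshold.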
Conventional grid smoothing is applied to the Obs and HRRR potential predictors to enhance their predictive effectiveness, as experience at MDL shows that grid smoothing generally results in improved forecast performance scores. Further, smoothing of grid binary predictor fields produces smooth transitions across 0–1 interfaces, which improves spatial coherency in ensuing forecasts. In this study the grid smoothing operator is a nine-point weighted average, where the weights (in one grid direction) are usually 0.50 at the central grid point and 0.25 at each adjacent point. The smoothing amount is customized for each Obs and HRRR variable, largely by choosing the number of smoothing passes required to remove finescale detail judged to be noise or not predictable. The number of passes ranged from one to three, though the smoothing was weakened on the final pass for some variables through adjustment of the weights.
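A nine-point smoother with the “1–2–1” weights described above can be sketched as the outer product of the one-dimensional weights (0.25, 0.50, 0.25), applied for a chosen number of passes. This is an illustration under stated assumptions, not the MDL implementation: the treatment of grid edges (edge values are replicated here), the way the final pass is weakened (a larger central weight), and the function names are all assumptions.

```python
import numpy as np

def smooth_once(field, center=0.5):
    """One pass of a nine-point weighted average on a 2D array.

    center: 1D weight at the central point; each adjacent point gets
    (1 - center) / 2, so center=0.5 gives the 0.25-0.50-0.25 weights.
    """
    side = (1.0 - center) / 2.0
    w = np.array([side, center, side])
    kernel = np.outer(w, w)                  # 3x3 weights summing to 1
    padded = np.pad(field, 1, mode="edge")   # assumed boundary treatment
    ny, nx = field.shape
    out = np.zeros((ny, nx), dtype=float)
    for dj in range(3):
        for di in range(3):
            out += kernel[dj, di] * padded[dj:dj + ny, di:di + nx]
    return out

def smooth(field, passes=1, final_center=0.5):
    """Apply several passes; a final_center above 0.5 weakens the last pass."""
    field = np.asarray(field, dtype=float)
    for p in range(passes):
        center = final_center if p == passes - 1 else 0.5
        field = smooth_once(field, center)
    return field
```

Applied to a binary 0/1 field, one pass turns a sharp 0–1 edge into a graded transition (0.25 and 0.75 on either side of a straight edge), which is the smoothing of 0–1 interfaces described above.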
In the case of the convection and TL predictand monthly relative frequencies (MRFs; Table 2), special smoothing was necessary to obtain fields with coherency across adjacent 1-h valid periods, adjacent months, as well as spatially across the CONUS. Thus, three sequential smoothing operations were applied, first across adjacent hours, then across adjacent months, and last the grid smoothing discussed above. Note that hour-to-hour (month-to-month) smoothing is applied across three hours (months) for a given pass. Also, the typical “1–2–1” smoothing weights (noted above) are weakened, especially for the hour-to-hour smoothing to enhance retention of diurnal variability, and the smoothing was similarly limited in the month-to-month smoothing though two passes were needed for the very choppy convection MRFs [see section 2d(1)]. Finally, for the conventional grid smoothing, five passes were needed to obtain desired spatial coherency for convection, while only three passes were needed for TL.
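The hour-to-hour and month-to-month smoothing can be sketched as a three-point weighted average along one axis of an array of relative frequencies. The sketch below assumes the hour and month axes wrap around (hour 23 is adjacent to hour 0, December to January), which the text does not state; the function name and the weakened central weight of 0.8 in the usage note are likewise illustrative assumptions.

```python
import numpy as np

def smooth_cyclic(values, axis, center=0.5, passes=1):
    """Three-point weighted average along a wrap-around axis.

    center: weight at the central hour (month); each neighbor gets
    (1 - center) / 2. A center above 0.5 weakens the 1-2-1 smoothing.
    """
    side = (1.0 - center) / 2.0
    out = np.asarray(values, dtype=float)
    for _ in range(passes):
        # np.roll shifts the axis cyclically, giving the previous and next neighbors.
        out = center * out + side * (
            np.roll(out, 1, axis=axis) + np.roll(out, -1, axis=axis)
        )
    return out
```

For an array `mrf` with dimensions (month, hour, y, x), `smooth_cyclic(mrf, axis=1, center=0.8)` followed by `smooth_cyclic(..., axis=0, center=0.8, passes=2)` would mimic the order of operations described above, with spatial grid smoothing applied last.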
It bears mentioning that NSSL has developed an MRMS radar quality index (RQI; Zhang et al. 2012), which incorporates radar beam height, terrain obstruction, and the atmospheric freezing level. While NSSL has produced the RQI experimentally since 2012 (Zhang et al. 2014), an archive was not available to investigate its potential application in LAMP. Since the RQI was implemented in a recent upgrade of the operational MRMS system (Zhang 2018, private communication), its potential use in a future LAMP upgrade will be investigated.
A linear regression equation containing predictor outlier values may cause (problematic) predictand estimates (forecasts) outside the 0–1 range of the predictand observations.
The HRRR model was operationally implemented on NOAA's Weather and Climate Operational Supercomputing System in September 2014, and upgrades have been implemented since then. The “experimental HRRR” contains model upgrades destined for operational implementation.
Still, GOE can incorporate geographic specificity where specialized, point-specific topo-climatic predictors that embody localized predictand relative frequency and topography (Charba and Samplatsky 2011b) are used.
Bias is defined as the number of forecast occurrences divided by the number observed, and thus unbiased forecasts have a 1.0 value. For most predictands, the TS is maximized when the bias is slightly above 1.0 (slight overforecasting).
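A short sketch of these two scores, computed from the hit, miss, and false alarm counts of the verification procedure. The bias follows the definition above; the threat score uses its standard definition, hits divided by the sum of hits, misses, and false alarms, which is not restated in this section. The function name is hypothetical.

```python
def bias_and_threat_score(hits, misses, false_alarms):
    """Return (bias, TS) from categorical verification counts."""
    forecast_yes = hits + false_alarms   # number of forecast occurrences
    observed_yes = hits + misses         # number of observed occurrences
    bias = forecast_yes / observed_yes if observed_yes else float("nan")
    total = hits + misses + false_alarms
    ts = hits / total if total else float("nan")
    return bias, ts
```

With 30 hits, 20 misses, and 25 false alarms, for instance, there are 55 forecast and 50 observed occurrences, giving a bias of 1.1 (slight overforecasting) and a TS of 30/75 = 0.4.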