This study evaluates a method for estimating the cloud-to-ground (CG) lightning flash rate from microwave remote sensing data. Defense Meteorological Satellite Program satellites have been in operation since 1987 and include global-viewing microwave sensors that capture thunderstorms as brightness temperature depressions. The National Lightning Detection Network (NLDN) has monitored CG lightning in the United States since 1997. This study investigates the relationship between CG lightning and microwave brightness temperature fields for the contiguous United States from April to September for the years 2005–12. The findings suggest that an exponential function, empirically fit to the NLDN and SSM/I data, provides lightning count measurements that agree to within 60%–70% with NLDN lightning, but with substantial misses and false alarms in the predictions. The discrepancies seem to be attributable to regional differences in thunderstorm characteristics that require a detailed study at smaller spatial scales to truly resolve, but snow at higher elevations also produces some anomalous microwave temperature depressions similar to those of thunderstorms. The results for the contiguous United States in this study are a step toward potentially using SSM/I data to estimate CG lightning around the world, although the sensitivity of the results to regional differences related to meteorological regimes would need further study.
Lightning is a phenomenon that connects weather, climate, and society. There are many known impacts and hazards associated with lightning that influence humans, ecosystems, and atmospheric chemistry. Lightning-induced wildfires have transformative effects on local ecosystems (Flannigan et al. 2013; Giglio et al. 2006; Littell et al. 2009; Stocks et al. 2002; Westerling et al. 2006), and the extremely high temperature associated with lightning also produces nitrogen-containing gases that are important in air quality and other issues related to atmospheric chemistry (Price et al. 1997). More fundamentally, an average of 51 lightning-related fatalities per year in the United States for the last 30 years—more deaths per year than are caused by hurricanes over that same period—make lightning a serious weather-related threat (NOAA 2014). Monitoring and understanding the patterns of lightning are the first steps to understanding how this phenomenon plays a role in human and natural environments.
Storm total lightning comprises both intracloud (IC) and cloud-to-ground (CG) lightning. The IC-to-CG ratio varies (Medici et al. 2015) as a function of latitude (Prentice and Mackerras 1977), elevation (Boccippio et al. 2001), and meteorological factors (Carey and Rutledge 2003; MacGorman et al. 1989; Schultz et al. 2011; Williams et al. 1999). On average, IC is often two to five times more common than CG lightning (Boccippio et al. 2001). Large-scale observations of total lightning were captured from 1998 to 2015 by the Lightning Imaging Sensor (LIS) on the Tropical Rainfall Measuring Mission (TRMM) satellite, which observed latitudes between 36°N and 36°S (Cecil et al. 2015). Prior to LIS, global total lightning observations were recorded from 1995 to 2000 by the Optical Transient Detector (OTD) on the Microlab-1 satellite (Christian et al. 2003). A data product that combines LIS and OTD data shows that the mean global lightning flash rate ranges from 35 flashes s−1 in February to 60 flashes s−1 in August (Cecil et al. 2014). While there is over a decade of TRMM LIS data, TRMM is limited to the tropics and subtropics, and while OTD had global coverage, it was only active between 1995 and 2000 (Cecil et al. 2014; Christian et al. 2003). The high latitudes in both hemispheres have had no satellite observations since 2000.
Lightning in a thunderstorm develops as a result of charge exchange from collisions that occur between ice crystals, supercooled water, and graupel in the mixed-phase region of the cloud, and gravitational sorting of these differently charged particles within the cloud. The charging mechanism and the subsequent lightning flash rate associated with any storm are related to the product of the flux of upward- and downward-moving ice mass within the storm (Blyth et al. 2001; Deierling et al. 2008). Since precipitation-sized ice scatters microwave radiation, thunderstorms produce a brightness temperature depression in the upwelling microwave radiation from the surface that is monitored by satellite (Cecil et al. 2005; Liu et al. 2011; Mohr et al. 1996; Toracinta et al. 2002). A thunderstorm therefore appears colder than its surroundings at microwave frequencies.
Satellites from the Defense Meteorological Satellite Program (DMSP) have provided global coverage from 1987 to the present (Spencer et al. 1989), and, although DMSP satellites do not have a lightning sensor, they are equipped with a microwave imager called the Special Sensor Microwave Imager (SSM/I; Spencer et al. 1989) capable of capturing cold microwave brightness temperatures (or depressions in the microwave temperature) associated with thunderstorms (e.g., Mohr et al. 1996). Research using TRMM LIS and the TRMM Microwave Imager (TMI) has shown that total lightning is related to microwave brightness temperatures (Cecil et al. 2002; Liu et al. 2011; Zipser et al. 2006).
This study explores the potential of SSM/I data to serve as a way to estimate CG lightning counts. Since DMSP satellites do not have a lightning sensor, ground-based CG lightning data from the U.S. National Lightning Detection Network (NLDN; Cummins and Murphy 2009; Orville 2008) are used to evaluate whether the SSM/I can be used to detect lightning. The methods and data processing choices are described in section 2, the results of the analysis and comparisons of estimated lightning with observed lightning are presented in section 3, and the sensitivity of the results to data processing choices is discussed in section 4. This study suggests that SSM/I data could be used to create a global CG lightning dataset.
SSM/I sensors have flown aboard the DMSP constellation of satellites since 1987 and have observed multiple frequencies in the microwave spectrum. Data from the different SSM/I sensors are intercalibrated and available from the Fundamental Climate Data Record (FCDR) dataset (http://rain.atmos.colostate.edu/FCDR/; Berg et al. 2013; Sapiano et al. 2013). SSM/I FCDR data are distributed as data granules (overpass scenes) from the DMSP satellite. The SSM/I frequencies used in this study are the vertically and horizontally polarized microwave channels at 37 and 85 GHz, which have spatial resolutions of 37 km × 28 km and 15 km × 13 km, respectively (Sapiano et al. 2013).
The ice in thunderstorms that is a precursor to lightning (Blyth et al. 2001; Deierling et al. 2008) also effectively scatters radiation at microwave frequencies (Mohr et al. 1996). To investigate the relationship of lightning and microwave brightness temperatures that are characteristic of ice, the vertical and horizontal polarization channels at 37 and 85 GHz are linearly combined into “polarization-corrected temperatures” (PCTs; e.g., Cecil et al. 2002). PCTs are used to minimize the effects of surfaces underlying thunderstorms on the microwave frequencies, while preserving the scattering effects of ice (Cecil et al. 2002; Liu et al. 2011; Mohr et al. 1996; Spencer et al. 1989). PCTs are calculated using
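$$\mathrm{PCT}_{85} = 1.818\,T_{85V} - 0.818\,T_{85H}$$

and

$$\mathrm{PCT}_{37} = 2.20\,T_{37V} - 1.20\,T_{37H},$$

where $T_{85V}$ and $T_{37V}$ are the vertically polarized microwave brightness temperatures at 85 and 37 GHz, $T_{85H}$ and $T_{37H}$ are the horizontally polarized brightness temperatures at 85 and 37 GHz, and PCT85 and PCT37 are the polarization-corrected temperatures at 85 and 37 GHz (Cecil et al. 2002; Liu et al. 2011; Mohr et al. 1996; Spencer et al. 1989).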
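As an illustration, the PCT calculation can be sketched as follows. The coefficients are an assumption here: they are the values commonly quoted from Spencer et al. (1989) for 85 GHz and Cecil et al. (2002) for 37 GHz, and the function and argument names are hypothetical.

```python
# Sketch of the polarization-corrected temperature (PCT) calculation.
# ASSUMPTION: coefficients are the commonly quoted literature values
# (1.818/0.818 at 85 GHz; 2.20/1.20 at 37 GHz), not taken from this text.

def pct85(t85v, t85h):
    """PCT at 85 GHz (K) from vertically (t85v) and horizontally (t85h)
    polarized brightness temperatures (K)."""
    return 1.818 * t85v - 0.818 * t85h


def pct37(t37v, t37h):
    """PCT at 37 GHz (K) from vertically (t37v) and horizontally (t37h)
    polarized brightness temperatures (K)."""
    return 2.20 * t37v - 1.20 * t37h
```

Because the two coefficients in each equation differ by one, an unpolarized scene (equal vertical and horizontal brightness temperatures) returns the brightness temperature itself, whereas a strongly polarized surface such as open water is shifted toward warmer PCT values.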
TRMM LIS and the TRMM Microwave Imager, which is based on the design of the SSM/I (Kummerow et al. 1998), have already been used to show how storm features, microwave brightness temperature depressions, and lightning are related (Cecil et al. 2002; Liu et al. 2011; Zipser et al. 2006), but TRMM is limited to viewing the tropics and subtropics. Strengths of DMSP satellites are that SSM/I sensors collect microwave data at high latitudes well beyond TRMM’s range of coverage, they have been observing the planet since 1987, and the data from the different sensors are intercalibrated (Berg et al. 2013; Sapiano et al. 2013). DMSP satellites, however, do not have a dedicated lightning sensor, so coincident ground-based lightning data from NLDN are used in this study.
Lightning flash data from NLDN are collected using ~200 ground sensors throughout the contiguous United States (Biagi et al. 2007; Cummins and Murphy 2009; Nag et al. 2014; Orville et al. 2011). There is also a Canadian Lightning Detection Network and an Alaska Lightning Detection Network (Orville et al. 2011), but those data were not as readily available. NLDN sensors detect the electromagnetic signals created by lightning, and using multiple sensors, the location of each lightning flash can be triangulated and flash rates can be calculated (Orville 2008; Rudlosky and Fuelberg 2010). NLDN data have been evaluated to assess the flash and stroke detection efficiency, accuracy in the peak current associated with strokes, accuracy in the location of the strike, and accuracy of the classification of the type of lightning detected with respect to CG or IC lightning (Biagi et al. 2007; Cummins and Murphy 2009; Fleenor et al. 2009; Nag et al. 2014; Nag et al. 2011; Orville et al. 2011). NLDN CG flash data are used in this study to understand whether SSM/I data can be used to derive a map of CG lightning flashes. The limitations of the NLDN that are most relevant here are those that are related to the detection efficiency and the classification accuracy.
The spatial domain in this study is the contiguous United States, which roughly corresponds to the observational range of the NLDN (Orville et al. 2011). The years used are 2005–12. The starting year is after network upgrades in 2002–03 (Biagi et al. 2007; Cummins and Murphy 2009) and during a year with relatively little qualitative deviation from the 2001–09 annual climatology (Orville et al. 2011).
SSM/I sensors in orbit from 2005 to 2012 include the DMSP F13 (1995–2009), F14 (1997–2008), and F15 (2000–present) satellites. The DMSP F16 (2006–present), F17 (2008–present), and F18 (2010–present) satellites were also in orbit with similar microwave sensors [Special Sensor Microwave Imager and Sounder (SSM/IS)], but SSM/IS sensors observe 91 GHz rather than 85 GHz (Kunkee et al. 2008; Yan and Weng 2008) so are not used. Data from the SSM/I sensors are intercalibrated (Berg et al. 2013; Sapiano et al. 2013) so F13–F15 should be comparable.
With F13, F14, and F15, there are three more years of overpasses from F15 than F13, and about one more year of overpasses from F13 than F14. In terms of diurnal sampling, the three sensors most often pass over the contiguous United States from 1200 to 1500 UTC (about 0600 to 0900 local time) and 2230 to 0200 UTC (about 1630 to 2000 local time), while from 0500 to 0800 and 1800 to 1930 UTC, there are no samples. These sample times are consistent with local observing times of sensors described on the FCDR SSM/I website (http://rain.atmos.colostate.edu/FCDR/SSM/I.html) and, generally, should capture the late afternoon to evening peak in storm activity over the United States. Within each year of the study period, the 6-month period between April and September is used to take advantage of a higher fraction of annual lightning across the contiguous United States (Cecil et al. 2014) and minimize the contamination of snowpack in the microwave data, as discussed below in this section.
NLDN and SSM/I data are aggregated onto a ¼° rectangular grid over the contiguous United States to be able to make direct comparisons between the disparate data types. NLDN lightning flashes are totaled for each grid box during the time window corresponding to the overpass time of the SSM/I sensor over the contiguous United States (approximately 7 min). SSM/I data are gridded such that the minimum PCT inside the grid box is used to represent the overall grid box.
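The gridding step can be sketched as below. This is a minimal illustration only: the tuple layouts for flashes and SSM/I pixels, the use of seconds for time, and all function names are assumptions, not the processing code used in this study.

```python
import math
from collections import defaultdict

GRID_DEG = 0.25  # quarter-degree grid spacing


def grid_index(lat, lon, res=GRID_DEG):
    """Return the integer (row, column) index of the grid box containing a point."""
    return (math.floor(lat / res), math.floor(lon / res))


def grid_flash_counts(flashes, t_start, t_end):
    """Count flashes per grid box within the overpass time window.

    flashes: iterable of (time_s, lat, lon) tuples (hypothetical layout).
    """
    counts = defaultdict(int)
    for t, lat, lon in flashes:
        if t_start <= t <= t_end:
            counts[grid_index(lat, lon)] += 1
    return dict(counts)


def grid_min_pct(pixels):
    """Keep the minimum PCT of all sensor pixels falling in each grid box.

    pixels: iterable of (lat, lon, pct_kelvin) tuples (hypothetical layout).
    """
    mins = {}
    for lat, lon, pct in pixels:
        key = grid_index(lat, lon)
        if key not in mins or pct < mins[key]:
            mins[key] = pct
    return mins
```

Taking the minimum rather than the mean PCT keeps the strongest ice-scattering signal in a grid box from being diluted by warmer surrounding pixels.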
NLDN data are largely intended to document CG lightning, but the NLDN sensors are also sensitive to IC lightning (Biagi et al. 2007; Fleenor et al. 2009; Nag et al. 2014). In general, while NLDN has a >90% detection efficiency for CG lightning (Orville et al. 2011), the IC detection efficiency is <30% (Cummins and Murphy 2009), which is why this study is limited to CG lightning. Fundamentally, NLDN sensors have a peak current detection threshold of about 5 kA (Cummins and Murphy 2009), but classifying a detected lightning flash outside of the nominal NLDN detection threshold as CG or IC is more challenging since there is no single peak current threshold to discriminate among the different types of lightning (Villarini and Smith 2013).
NLDN flash data from March 2008 onward include a discriminator flag to classify a detected flash as CG or IC lightning. The distribution of CG and IC according to the NLDN (Villarini and Smith 2013) suggests that using a threshold value of peak current to classify lightning would produce different results than what is reported by the NLDN. However, Biagi et al. (2007) evaluated the NLDN classification and found that lightning with a peak current greater than +20 kA is likely CG regardless of whether the NLDN classified the lightning as CG or IC, and similarly lightning with peak current less than −10 kA is likely CG. NLDN classification of CG or IC lightning has been discussed extensively (Cummins and Murphy 2009; Orville 2008; Orville et al. 2011; Rudlosky and Fuelberg 2010). For consistency, this study excludes all lightning detections with peak currents between −10 and +20 kA, in order to eliminate IC lightning. The remaining lightning flashes are considered CG and form the basis of the gridded NLDN dataset used here.
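The peak-current screening described above amounts to a simple test on each detection. The sketch below assumes that detections exactly at the −10 or +20 kA bounds are excluded, which the text does not state explicitly; the function name is hypothetical.

```python
def is_cloud_to_ground(peak_current_ka, negative_bound=-10.0, positive_bound=20.0):
    """Return True if a detection is retained as CG lightning.

    Detections with peak current between negative_bound and positive_bound (kA)
    are dropped as possible IC lightning. ASSUMPTION: the bounds themselves
    are treated as excluded.
    """
    return peak_current_ka < negative_bound or peak_current_ka > positive_bound


def filter_cg(peak_currents_ka):
    """Keep only the detections that pass the peak-current screening."""
    return [pc for pc in peak_currents_ka if is_cloud_to_ground(pc)]
```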
The data from 2005 to 2012 from DMSP F13, F14, and F15 satellites included over 13 000 overpasses capturing part of the contiguous United States. For this study, those ~13 000 overpass scenes are gridded, and a gridded NLDN dataset for the ~7-min overpass time is produced as well. Based on past research relating total lightning from TRMM LIS and TMI data (Liu et al. 2011), the expectation from the NLDN and the SSM/I data is that CG lightning counts are inversely related to minimum PCT85 and PCT37.
While PCTs minimize the polarization effect of water surfaces on microwave radiation, snow creates brightness temperature depressions in the PCT dataset (Grody 1991) that are not caused by thunderstorms, and therefore this snow signal needs to be filtered from the data to understand how the SSM/I and NLDN data are related. While this study focuses on the warm season months from April to September, there is still significant snow cover in parts of the contiguous United States. To remove snowpack from the gridded SSM/I dataset, a filter described by Grody (1991) was tested, but quality controlling individual overpass scenes (particularly in April) of the gridded data revealed evidence that a significant number of grid boxes with snow remained even after the snow filter was applied. This ineffectiveness may be a result of the snow filter having been developed for dry snow discrimination rather than the wet/melting snow conditions that are more common during the spring transition months.
The sheer number of overpass scenes in this study precluded individual quality control, so to efficiently filter out a greater number of grid boxes with snow contamination, a CG lightning probability lookup table was built from the gridded SSM/I and NLDN data. The lookup table was compiled by considering whether any CG lightning flash was recorded within a grid box with a specific minimum PCT85 and PCT37, similar to an approach used by Liu et al. (2011). The probability of CG lightning for a given minimum PCT85 and PCT37 (Fig. 1) is calculated as the number of times that CG lightning is observed at a particular minimum PCT85 and PCT37 divided by the number of times that a particular minimum PCT85 and PCT37 is observed. Figure 1b shows the number of data points contributing to the PCT-dependent calculation. The probability of CG lightning generally increases for lower minimum PCT85 and PCT37, but there are some threshold values evident as well. There is a high probability of CG lightning when minimum PCT85 < 165 K and when minimum PCT37 < 280 K, and beyond this threshold, the probability decreases toward zero with the exception of the diagonal extending toward the top right of the figure. This is qualitatively consistent with both past analysis of LIS and TMI (Liu et al. 2011), and with the expectation that small ice particles will scatter 85-GHz radiation more effectively than lower-frequency radiation (Cecil et al. 2002), which in turn creates deeper brightness temperature depressions at higher frequencies.
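A minimal sketch of how such a lookup table could be compiled is given below. The 5-K bin width, the input layout, and the function name are assumptions made for illustration; the text does not specify the binning used for Fig. 1.

```python
from collections import defaultdict


def build_probability_table(samples, bin_width=5.0):
    """Compile a CG lightning probability lookup table.

    samples: iterable of (min_pct85, min_pct37, cg_flash_count) per grid box.
    Returns (probability, sample_count) dictionaries keyed by
    (pct85_bin, pct37_bin). ASSUMPTION: 5-K bins.
    """
    total = defaultdict(int)
    with_cg = defaultdict(int)
    for p85, p37, n_flashes in samples:
        key = (int(p85 // bin_width), int(p37 // bin_width))
        total[key] += 1
        if n_flashes > 0:
            with_cg[key] += 1
    # Probability = occurrences with any CG lightning / all occurrences.
    probability = {key: with_cg[key] / total[key] for key in total}
    return probability, dict(total)
```

Returning the sample counts alongside the probabilities mirrors the pairing of the probability field with the number of contributing data points in Fig. 1.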
Figure 1 is used to filter SSM/I data to minimize the impact of snow cover in the minimum PCT85 and PCT37 data. For this study, all grid boxes where the minimum PCT85 and PCT37 values together result in a probability of CG lightning that is less than 20% (gray contour in Fig. 1) are excluded. This qualitatively matches past classification analyses of microwave data (Cecil 2015; Zipser et al. 2006) that show the region corresponding to less than 20% probability is about the same as the region that can be more formally attributed to snow. The effectiveness of the probability filter as a way to filter out snow is discussed in section 4.
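Applying the filter then reduces to a table lookup for each grid box. In the sketch below, the table is a dictionary keyed by (PCT85 bin, PCT37 bin); the 5-K bin width and the choice to reject bins that never occurred in the training data are assumptions.

```python
def passes_probability_filter(min_pct85, min_pct37, table,
                              bin_width=5.0, min_probability=0.20):
    """Return True if a grid box is retained by the CG probability filter.

    table: dict mapping (pct85_bin, pct37_bin) -> probability of CG lightning.
    ASSUMPTION: bins absent from the table are treated as zero probability.
    """
    key = (int(min_pct85 // bin_width), int(min_pct37 // bin_width))
    return table.get(key, 0.0) >= min_probability
```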
The ~13 000 SSM/I overpasses are processed with the probability filter to eliminate the majority of nonstorm (snowpack) features in the scenes. One SSM/I overpass of the contiguous United States is presented in Fig. 2 and this also shows the CG probability, as well as the corresponding gridded NLDN data. The probability filter preserves microwave brightness temperature features within the SSM/I overpass that are spatially coincident with NLDN CG lightning flashes.
After the probability filter is applied to every SSM/I overpass scene, the remaining data are compiled and evaluated against gridded NLDN data to derive an empirical relationship between CG lightning and both minimum PCT85 and PCT37. The CG lightning data are binned into evenly spaced brightness temperature bins, and an empirical function is used to model the binned data (Figs. 3 and 4) with at least N = 100 occurrences, roughly corresponding to 10% counting error. Binning the data prevents the oversampled data [panels (c),(d),(g), and (h) in Figs. 3 and 4] in higher brightness temperatures—grid boxes with small or no thunderstorms—from dominating the empirical fit.
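The binning step, including the N = 100 threshold and its associated counting error of roughly 1/√N = 10%, can be sketched as follows. The 10-K bin width and the function name are assumptions for illustration.

```python
import math
from collections import defaultdict


def bin_means(pcts, flash_counts, bin_width=10.0, min_count=100):
    """Average flash counts in evenly spaced brightness temperature bins.

    Returns a list of (bin_center, mean_flashes, relative_counting_error)
    for bins holding at least min_count samples. ASSUMPTION: 10-K bins.
    """
    groups = defaultdict(list)
    for pct, n in zip(pcts, flash_counts):
        groups[math.floor(pct / bin_width)].append(n)

    binned = []
    for b in sorted(groups):
        values = groups[b]
        if len(values) >= min_count:
            center = (b + 0.5) * bin_width
            mean = sum(values) / len(values)
            # Counting (Poisson) error scales as 1/sqrt(N): 10% at N = 100.
            binned.append((center, mean, 1.0 / math.sqrt(len(values))))
    return binned
```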
To model the relationship in Figs. 3 and 4, an exponential function with two fit coefficients is used. Other models were tested, but with similar or poorer results, and the exponential function offers potential for stronger thunderstorms to translate to much more lightning. Figure 4 shows that the fits are less robust, partly because of the smaller range of observed minimum PCT37 values, and the model also overestimates lightning at minimum PCT37 > 250 K. Overestimates for moderate-to-high minimum PCT85 and PCT37 values can dramatically impact results since these data occur frequently, and the models are applied to each grid box. The results from the model are assessed below in a number of different ways to understand the strengths and weaknesses in terms of simulating CG lightning using SSM/I data.
The exponential model is more robust for minimum PCT85 (Fig. 3) than for minimum PCT37 (Fig. 4) in terms of the ability of the model to capture the behavior of the data, and the fit coefficients in the different models are only weakly dependent on whether data from a particular SSM/I sensor are used [panels (b)–(h) in Figs. 3 and 4]. The differences likely result from a combination of statistical noise, the sampling from 2005 to 2012 being biased to F15 relative to F13 and F14, and the impact of different and changing viewing times of the contiguous United States on the types of storms typically captured. The similarity between the models for specific sensors is expected given the explicit intercalibration of SSM/I data (Berg et al. 2013; Sapiano et al. 2013), and implies that all SSM/I data can be considered together. Thus, the model that is derived using all SSM/I sensor data (Fig. 3a) is used to estimate CG lightning. The noninteger output from the function in Fig. 3a is rounded down to an integer value, to be comparable with integer NLDN lightning data.
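A two-coefficient exponential fit and the rounding step can be sketched as below. Two assumptions should be noted: the functional form y = a·exp(b·x) is one plausible reading of "an exponential function with two fit coefficients," and the fit is done by linear least squares on ln(y), which is a simple stand-in and not necessarily the fitting procedure used for Figs. 3 and 4. The coefficients in the usage example are hypothetical.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y) = ln(a) + b * x.

    x: bin-center brightness temperatures (K); y: binned mean flash counts,
    which must all be positive. Returns (a, b).
    """
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_log_y) for xi, li in zip(x, log_y))
    b = sxy / sxx
    a = math.exp(mean_log_y - b * mean_x)
    return a, b


def estimate_flashes(min_pct85, a, b):
    """Estimate the CG flash count for a grid box, rounded down to an integer."""
    return math.floor(a * math.exp(b * min_pct85))


# Usage with hypothetical coefficients (not the values from Fig. 3a):
flashes = estimate_flashes(100.0, a=500.0, b=-0.02)
```

A negative b expresses the expected inverse relationship: colder minimum PCT85 values yield rapidly increasing flash estimates, while warm grid boxes round down to zero.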
Figure 5 shows the results of applying the model in Fig. 3a to SSM/I data for two overpasses. Figures 5a and 5b show NLDN CG lightning that occurred within the overpass swath, while Figs. 5c and 5d show CG lightning derived from SSM/I. The analysis is restricted to the contiguous United States. Figure 2 shows the underlying datasets used to derive Figs. 5b,d,f.
Direct spatial correlation between observed and estimated lightning in a particular grid box is generally weakened by slight collocation differences between minimum PCT85 features and actual CG lightning, and this is evident in the difference plots (SSM/I minus NLDN lightning) in Figs. 5e,f. This collocation problem likely emerges from the spatial displacement of the largest ice volumes aloft from the region of most intense CG activity near the surface, or, in other words, the high vertical shear environment typical in thunderstorm formation. The spatial correlation is also impacted by SSM/I-derived lightning appearing where the NLDN reported zero lightning (i.e., a false alarm).
In the eastern U.S. case (Fig. 5e), the root-mean-square (RMS) difference between derived and observed is 4.0 flashes with a mean number of flashes being 2.1 for SSM/I and 5.9 for NLDN. The median number of flashes for SSM/I is 1.3, while the median for NLDN is 3.0. These calculations exclude grid boxes with both zero NLDN and zero SSM/I lightning. The total number of NLDN flashes (within the overpass swath) is 480 and the corresponding value for SSM/I is 465. The total and the differences between the mean and the median together indicate that the distribution of flashes within a grid box is skewed toward a larger number per grid box for NLDN, but that the SSM/I method ends up producing more grid boxes with fewer flashes for a similar total. This is evident from the large number of dark red grid boxes in Fig. 5e showing that there are many small positive differences, often in locations with zero NLDN flashes (North Carolina, Indiana, and Kentucky in Fig. 5a). These add up to a significant contribution to the total lightning in any given orbital overpass. Although not shown, using minimum PCT37 produces an even larger total of 1192 for this overpass.
In the Great Plains case (Fig. 5f), again for grid boxes with lightning from either or both NLDN and SSM/I, the RMS difference is 13 flashes, and the total NLDN CG lightning within the overpass is 1405 (Fig. 5b). Total SSM/I lightning is 1536 (Fig. 5d), and the overestimate results from putting more lightning in Colorado and northern Wisconsin than NLDN reported during the overpass time window, as well as simply overestimating lightning in and near the squall line. The mean SSM/I lightning is 7.2 flashes per grid box, while the median is 1.5. For NLDN, the mean is 14.1 with a median of 5.5. Again, NLDN reports a higher density of lightning, or lightning that is concentrated in fewer grid boxes than what is derived from SSM/I. The mean difference is 4.3 flashes per grid box, with a median of 1.2. The mean difference in this example is greater than in the eastern U.S. example, but the median difference is nearly the same, indicating that for 50% of the grid boxes with lightning, the difference between the observed and derived lightning is less than ~1.2 flashes.
The question of skill in predicting lightning using microwave remote sensing can be evaluated using contingency tables (Roebber 2009; Wilks 2011) where results are presented in a 2 × 2 table in terms of whether there was a prediction (SSM/I) and an observation (NLDN) of lightning in a particular grid box. There are four possibilities for any prediction in a contingency table: hit (SSM/I > 0 and NLDN > 0), false alarm (SSM/I > 0, NLDN = 0), miss (SSM/I = 0, NLDN > 0), or correct negative (SSM/I = 0, NLDN = 0). To evaluate a contingency table, multiple skill scores have been developed (Roebber 2009; Wilks 2011). Because lightning is infrequent within any given SSM/I overpass scene (e.g., Figs. 2 and 5), the number of correct negatives is enormous. Of the nearly 44 million grid-box comparisons (from the ~13 000 overpasses), 98.9% are correct negatives. The most useful skill scores for this study are those that are independent of the number of correct negatives, so the probability of detection (POD), false-alarm ratio (FAR), bias B, and the critical success index (CSI) are presented.
Using the full dataset, Table 1 quantifies the contingency table entries and skill scores that are independent of the correct negatives, along with the relevant calculations (Wilks 2011). POD is about 47% and FAR is about 58%, showing that grid-box hits are rarer than grid-box false alarms. Similarly, CSI is low because the false alarms and misses are each greater than the number of hits. The bias shows that SSM/I microwave data predict lightning about 10% more often than it occurs, which itself indicates that the numbers of false alarms and misses are similar.
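The contingency-table bookkeeping and the four skill scores follow standard definitions (e.g., Wilks 2011) and can be sketched as below. The counts in the usage example are hypothetical, chosen only so that POD and FAR resemble the values quoted above; they are not the entries of Table 1.

```python
def contingency_counts(pairs):
    """Tally hits, false alarms, misses, and correct negatives.

    pairs: iterable of (ssmi_flashes, nldn_flashes) per grid box.
    """
    counts = {"hit": 0, "false_alarm": 0, "miss": 0, "correct_negative": 0}
    for ssmi, nldn in pairs:
        if ssmi > 0 and nldn > 0:
            counts["hit"] += 1
        elif ssmi > 0:
            counts["false_alarm"] += 1
        elif nldn > 0:
            counts["miss"] += 1
        else:
            counts["correct_negative"] += 1
    return counts


def skill_scores(hits, false_alarms, misses):
    """Skill scores that do not depend on the number of correct negatives."""
    return {
        "POD": hits / (hits + misses),                    # probability of detection
        "FAR": false_alarms / (hits + false_alarms),      # false-alarm ratio
        "bias": (hits + false_alarms) / (hits + misses),  # predicted / observed
        "CSI": hits / (hits + false_alarms + misses),     # critical success index
    }


# Hypothetical counts for illustration only:
scores = skill_scores(hits=47, false_alarms=65, misses=53)
```

With these illustrative counts, POD is 0.47 and FAR is about 0.58, which gives a bias slightly above one and a CSI below 0.3.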
The skill scores are a useful way to evaluate whether predictions of lightning from SSM/I data can be verified, but accuracy in the amount of lightning is also a critical element. Two ways to present the accuracy for the full dataset are by calculating grid-box to grid-box comparisons of SSM/I and NLDN lightning, and by calculating the sum of all CG lightning flashes within an SSM/I overpass to compare with the equivalent region of interest for NLDN. The latter will be referred to as summed lightning. The grid-box comparisons are the finest spatial scale that could be resolved, while the summed lightning comparison is intended to help us better understand how false alarms and misses are related to small collocation differences.
For the grid-box to grid-box comparisons, the distribution is heavily skewed toward low lightning counts for both NLDN and SSM/I. The median (and 5th–95th percentile range) of lightning counts in grid boxes with hits is 4 (1–37) for NLDN and 3 (1–18) for SSM/I. Thus, SSM/I underpredicts the total lightning in grid boxes with the highest NLDN totals. Also important are the magnitude and range for grid boxes with false alarms and misses. For SSM/I in grid boxes with false alarms, the median is 2 (1–6). For NLDN in grid boxes with misses, the median is 2 (1–12). In other words, the misses and false alarms are for grid boxes with lightning counts much lower than the highest values in grid boxes with hits. False alarms and misses are often located near each other within the same scene, as one could infer from Fig. 5. In statistics for the grid-box comparisons, these would be failures at particular grid boxes, but realistically SSM/I is giving the right idea in those situations, without precisely identifying the lightning locations. Overall, the RMS difference for grid boxes with hits is 16 flashes, with SSM/I biased low at high values of NLDN and high at low values of NLDN. The RMS difference increases as a function of NLDN lightning counts, but the RMS difference calculated as a percent of the mean (ratio of the RMS difference to the mean value of NLDN and SSM/I lightning counts or, henceforth, RMS percent) for the grid-box to grid-box comparison is 127% with little variation as NLDN lightning counts increase.
To minimize the effects of misses and false alarms on comparisons of NLDN and SSM/I lightning, the summed lightning comparison from each of the ~13 000 overpass scenes is shown in Fig. 6. In Fig. 6a, each gray point represents the summed lightning within an overpass (lightning per SSM/I orbit) from NLDN in grid boxes with hits and misses and from SSM/I in grid boxes with hits and false alarms. The large circles in Fig. 6 are the average values of the data in evenly spaced bins, calculated to unweight the comparison from the large oversampling of low summed lightning counts. The thin gray line in Fig. 6 shows the one-to-one comparison, while the black line shows the best-fit line to the binned data that have N > 10 (large white circles) data points. The best-fit equations to the large white circles and to gray data points are shown in Fig. 6, as is the equation for the scatterplot of individual grid-box comparisons. Figure 6b shows the RMS percent as a function of the NLDN summed lightning.
From Fig. 6, the slopes of the best-fit lines to the NLDN and SSM/I summed lightning (0.56, 0.56) suggest much better agreement than for grid-box comparisons (slope of 0.18), and this shows how the spatial collocation problem (e.g., Figs. 5e,f, Table 1) weakens the results. The positive y intercept in all the best-fit lines indicates that SSM/I lightning is biased high for low amounts of lightning, while the slopes less than one indicate that SSM/I is biased low for higher amounts of lightning, which is consistent with the grid-box analysis above.
Figure 6b provides a way to estimate the uncertainty for summed lightning that attempts to minimize the effects of the collocation problem and speaks to the potential of SSM/I in estimating lightning accumulated within a region. For any scene with about 300–2000 flashes, the RMS percent is about 60%–70%. As the summed lightning decreases below 300, the RMS percent increases to over 100%. In the first bin in Fig. 6b, for example, the mean NLDN summed lightning is 44 flashes, the mean SSM/I summed lightning is 55 flashes, and the RMS difference is 77 flashes. This suggests that the lowest threshold of detection of the summed lightning in an SSM/I overpass is about 77 flashes, but thunderstorms (e.g., Fig. 5) can produce far more flashes even within the time frame of an SSM/I overpass. The RMS percent difference in that bin is 77 divided by the mean of 44 and 55 to get the 156% shown in Fig. 6b. Furthermore, the RMS percent in the summed lightning comparison is about one-half of the RMS percent in the grid-box comparison, suggesting that slight mislocations of SSM/I relative to NLDN (i.e., misses and false alarms) roughly double the percent uncertainty.
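The RMS percent calculation for the first bin can be reproduced with a short sketch. The function names are hypothetical; the numbers in the usage example are the bin values quoted above.

```python
import math


def rms_difference(ssmi_values, nldn_values):
    """Root-mean-square difference between paired SSM/I and NLDN values."""
    squared = [(s - n) ** 2 for s, n in zip(ssmi_values, nldn_values)]
    return math.sqrt(sum(squared) / len(squared))


def rms_percent(rms_diff, mean_nldn, mean_ssmi):
    """RMS difference as a percent of the mean of the NLDN and SSM/I means."""
    return 100.0 * rms_diff / ((mean_nldn + mean_ssmi) / 2.0)


# First bin in Fig. 6b: RMS difference 77, mean NLDN 44, mean SSM/I 55.
first_bin = rms_percent(77.0, 44.0, 55.0)  # 77 / 49.5, about 156%
```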
The results suggest that SSM/I-summed lightning for an entire overpass compares better to NLDN-summed lightning than the comparisons for individual grid boxes because of spatial collocation issues evident in the relatively high numbers of misses and false alarms (Table 1). This is evident in the improvement seen in the slope of the best-fit lines in Fig. 6a when comparing lightning in individual grid boxes with summed lightning. A spatiotemporal climatology of large-scale lightning in the contiguous United States may be the most immediate application of this study, with uncertainties in summed lightning closer to 60%–70% (Fig. 6b) as opposed to over 100% for specific grid-box predictions (section 3).
Summed lightning results across the contiguous United States during the entire sample period from April to September for the years 2005–12 are shown in Fig. 7 for grid boxes with hits, misses, and false alarms. Figures 7a and 7b compare summed lightning from NLDN and SSM/I for grid boxes with hits. The correlation coefficient of the two maps (henceforth, spatial correlation) is 0.78, which is statistically significant (p value is less than 0.05) and suggests that when lightning is predicted using SSM/I microwave brightness temperatures, it is spatially well matched with NLDN lightning. Figures 7c and 7d show summed lightning from NLDN in grid boxes with misses and from SSM/I in grid boxes with false alarms, respectively. Figure 7c shows that the highest summed lightning counts that are missed by the SSM/I method are in the southwest and southeast United States, as well as parts of the Ohio River valley. Overall, SSM/I-summed lightning where there are hits (Fig. 7b) is generally greater than the summed lightning where there are misses or false alarms, but clearly NLDN-summed lightning where there are hits (Fig. 7a) indicates consistent underestimates of lightning from SSM/I.
Figure 8 shows NLDN-summed lightning for grid boxes with hits and misses (Fig. 8a) and SSM/I-summed lightning for grid boxes with hits and false alarms (Fig. 8b), as well as SSM/I-summed lightning minus NLDN-summed lightning (Fig. 8c) and the ratio of the difference in Fig. 8c to the total in Fig. 8a (Fig. 8d). Figure 8a is the total of Figs. 7a and 7c, and Fig. 8b is the total of Figs. 7b and 7d. While the values are only for SSM/I overpass times, a qualitative comparison with a multiyear-average NLDN lightning dataset from Orville et al. (2011) reveals similar prominent features such as lightning maxima in Florida and along the northern Gulf Coast, with dwindling lightning from the Mississippi River valley into the Great Plains and much less lightning in the mountain and coastal western regions.
Figure 8c shows how SSM/I is biased high in the northern plains and throughout the mountain and coastal West, but biased low throughout the Southeast and Southwest, and along the Ohio River valley. There is a strong transition from high to low biases on either side of a diagonal from southwest Kansas to southern Wisconsin. Figure 8d shows the ratio of the difference to the NLDN-summed lightning to give an indication of how significant the difference actually is: values less than one mean the difference is less than the magnitude of the total observed lightning. The small differences in the mountain west are large relative to the number of flashes actually recorded by the NLDN. The best agreement in terms of the ratio comparison is in a band from northern Texas to Iowa and southern Wisconsin.
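The ratio described for Fig. 8d can be illustrated for a single grid box. The sketch below assumes a signed ratio, (SSM/I minus NLDN) divided by NLDN, so that negative values indicate underestimates; whether the figure uses signed or absolute values is an assumption here, and the function name and example counts are hypothetical.

```python
def difference_ratio(ssmi_sum, nldn_sum):
    """Signed difference (SSM/I minus NLDN) relative to the NLDN total.

    Returns None where NLDN recorded no lightning, since the ratio is undefined.
    """
    if nldn_sum == 0:
        return None
    return (ssmi_sum - nldn_sum) / nldn_sum


# Hypothetical grid boxes: a 50% underestimate and a small but relatively large overestimate.
print(difference_ratio(500.0, 1000.0))  # -0.5
print(difference_ratio(30.0, 10.0))     # 2.0
```

The second case mirrors the situation described for the mountain west, where a difference of only a few flashes is large relative to the NLDN total.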
SSM/I data underestimate the lightning by about 50% or less in the eastern and the southwest United States (Arizona and New Mexico) based on Fig. 8d. Overestimates scattered through the West (western Wyoming) are due to either residual snowpack that the probability filter did not capture or high-elevation terrain. Overestimates in the northern plains are more challenging to explain, but this area has been documented to have higher IC-to-CG and positive-CG to negative-CG ratios (e.g., Boccippio et al. 2001; Medici et al. 2015), suggesting a regional dependence of the thunderstorm morphology (e.g., Carey and Rutledge 2003; MacGorman et al. 1989; Schultz et al. 2011; Williams et al. 1999) that affects the methods in this study. There is no particular physical reason to expect the SSM/I-based approach to be better suited to CG lightning than to total lightning, so it is not surprising that we are overestimating the CG lightning in a region known for high IC-to-CG ratios.
Figures 7 and 8 show maps using a 20% probability filter (i.e., Fig. 1) on SSM/I data at a spatial resolution of 0.25° with a specific CG and IC lightning threshold applied to the NLDN data, so there is good reason to test the sensitivity of the results to these choices in the analysis. The sensitivity is evaluated within the framework of changes in contingency table skill scores and accuracy metrics in the discussion below.
The probability filter is intended to prescreen the SSM/I data and minimize contamination by other features within an SSM/I overpass that result in brightness temperature depressions, such as snowpack. As the strength of the probability filter decreases, the bias increases from 1.1 (Table 1) to 3–4 for probability thresholds of 1%–5%, and the FAR also increases to about 80%. POD increases as well, but not as quickly as FAR and the bias. Correspondingly, as the strength of the probability filter increases, the bias, FAR, and POD all decrease. The 20% probability filter used in this study balances FAR and POD at about 50% with a bias of close to one, which is quantitatively similar to skill scores associated with snow and tornado forecasts (Roebber 2009).
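For reference, the contingency table scores discussed above can be sketched as follows. The sketch assumes the conventional definitions (POD as hits over observed events, FAR as the false alarm ratio, and bias as predicted over observed events); the function name and the counts are hypothetical and are not the values in Table 1.

```python
def skill_scores(hits, misses, false_alarms):
    """Conventional contingency table scores for grid-box lightning predictions."""
    observed = hits + misses          # grid boxes where NLDN recorded lightning
    predicted = hits + false_alarms   # grid boxes where SSM/I predicted lightning
    if observed == 0 or predicted == 0:
        raise ValueError("scores are undefined without observed and predicted events")
    return {
        "POD": hits / observed,            # probability of detection
        "FAR": false_alarms / predicted,   # false alarm ratio
        "bias": predicted / observed,      # frequency bias
    }


# Hypothetical counts giving the balanced case described in the text.
print(skill_scores(hits=50, misses=50, false_alarms=50))
# Hypothetical counts for a weaker filter: more hits, but many more false alarms.
print(skill_scores(hits=80, misses=20, false_alarms=320))
```

Under these definitions the bias equals POD divided by (1 − FAR), which is why a POD and FAR both near 50% correspond to a bias close to one.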
Another way of evaluating the effectiveness of the choice of probability filter is to apply more aggressive filtering by excluding months with the highest potential for snow in the contiguous United States (April and May) and/or filtering out high-elevation terrain from the dataset to avoid lingering mountain snow. These additional filters applied to the probability-filtered SSM/I dataset make less of a difference in terms of contingency table skill scores and accuracy metrics (such as RMS) than increasing the strength of the probability filter itself. For example, for a given probability filter, the bias, POD, FAR, and RMS difference remain essentially unchanged across elevation filters that exclude all grid boxes above thresholds ranging from 1000 m to 1500–2000 m. Similarly, excluding April and May does not significantly improve the results. There are some specific exceptions with strong false alarms in the mountainous terrain of western Wyoming, western Colorado, and the Sierra Nevada in east-central California (Fig. 7d), but these problems are secondary to the more significant under- and overestimates in the nonmountainous terrain east of the Rockies (Figs. 8c,d).
In addition to sensitivity to the probability filter, the results are sensitive to the spatial resolution used for gridding, and the overall results become worse as the grid spacing coarsens from 0.25° to 2.0°. Furthermore, the collocation issue, in which microwave brightness temperature depressions are not always spatially coaligned with regions of intense CG lightning, is not resolved with coarser analysis resolution. Spatial and temporal smoothing of the estimated lightning, similar to what is done with OTD and TRMM LIS lightning maps (Cecil et al. 2014), may be a way to improve spatial correlation, but high resolution is needed to capture the spatial location of lightning relative to the thunderstorm. In terms of correlation and RMS, the methods degrade rapidly for grid spacings coarser than 0.5°. Coarser resolution tends to assign minimum brightness temperature features associated with strong thunderstorms to an inordinately large grid box and results in excessive lightning flashes.
The CG and IC discrimination peak current is an imperfect threshold (Fig. 1) as well. The thresholds chosen for the analysis here (−10 kA to +20 kA are excluded, as possible IC lightning) are consistent with results from field verification studies (Biagi et al. 2007; Fleenor et al. 2009). Arbitrarily testing different thresholds, however, would ignore the physically based evidence from those studies related to the classification of CG and IC lightning by the NLDN. Certainly, the thresholds used in this study could be changed if new evidence emerges from additional field work, but for this study, there is no reason to circumvent the available findings in the literature.
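The peak-current screening described above can be sketched as a simple filter. The sketch assumes that flashes with peak currents from −10 kA to +20 kA inclusive are treated as possible IC lightning and dropped; the treatment of the exact boundary values, the function name, and the sample currents are assumptions made for illustration.

```python
def is_cloud_to_ground(peak_current_ka, negative_limit=-10.0, positive_limit=20.0):
    """Return True if a flash falls outside the excluded weak-current window."""
    # Flashes inside [negative_limit, positive_limit] are treated as possible IC.
    return peak_current_ka < negative_limit or peak_current_ka > positive_limit


# Hypothetical peak currents in kA.
peak_currents = [-25.0, -5.0, 15.0, 30.0]
cg_flashes = [current for current in peak_currents if is_cloud_to_ground(current)]
print(cg_flashes)  # [-25.0, 30.0]
```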
Considering the sensitivities together with the results in Figs. 6–8 suggests that the probability filter alone is relatively effective at minimizing snow or other nonthunderstorm microwave artifacts, but that the model itself may have regional dependencies that require a deeper investigation. Uncertainties in the methods at the grid-box scale are susceptible to spatial collocation issues and range up to about 127%, but when the summed lightning within an SSM/I overpass is considered, the uncertainties decrease to about 60%–70%. This suggests that summed lightning within thunderstorm complexes could be captured once the regional dependencies are investigated.
A model (Fig. 3) derived from SSM/I brightness temperature and NLDN CG lightning is used to estimate CG lightning from SSM/I minimum PCT85 data (Figs. 6–8). From Fig. 3 we can see that F13, F14, and F15 have very similar models, which suggests that the SSM/I intercalibration efforts (Berg et al. 2013; Sapiano et al. 2013) are indeed providing homogeneous data across generations of SSM/I sensors, and lends confidence that data from prior SSM/I sensors (F8, F10, and F11) and current SSM/IS sensors (F16, F17, F18, and F19) could also be used to understand, at least to first order, a longer history of CG lightning data since 1987.
The methods are derived for the weather regime of the contiguous United States because of the range of the NLDN, and during warm season months to minimize the impact of snow, but SSM/I sensors are global in their viewing and include vast regions of the world that have not been extensively characterized even by TRMM LIS (Cecil et al. 2014). For example, the northern latitudes of Earth are experiencing marked changes to fire regimes that are at least in part influenced by lightning flashes as ignition sources (Flannigan et al. 2013; Stocks et al. 2002). However, while new lightning sensors, such as ones on the International Space Station, GOES-R, GOES-S, and Meteosat Third Generation, will provide views of the high latitudes, TRMM LIS did not map lightning at these latitudes, and the global lightning climatology has relied on the brief OTD dataset (Cecil et al. 2014). CG lightning derived from SSM/I would certainly offer a much broader spatiotemporal view of lightning during a time of changes in weather that are in part due to the changing climate. Larger-scale composites of lightning within a scene could be valuable for use in global fire models (e.g., Li et al. 2013; Pechony and Shindell 2009) and chemical transport models (e.g., Levy et al. 1996; Price et al. 1997).
The uncertainty in SSM/I-derived summed lightning within an SSM/I overpass is about 60%–70% (Fig. 6), while the uncertainty in individual grid-box comparisons is about a factor of 2 greater as a result of slight spatial collocation problems (i.e., false alarms and misses). The uncertainty of about 60%–70% is consistent with the comparison of summed lightning during the study period of April–September 2005–12 (Fig. 8). There is, however, significant regional variability in the uncertainty (Figs. 7 and 8), which suggests that the methods would need careful refinement before being applied at spatial scales beyond the United States. Refinements for future research include regional dependencies of the function used to model CG lightning (Fig. 3) and a better way to filter snowpack. A more complex empirical model of how lightning and minimum PCT85 data are related may better capture the lightning in scenes with large lightning totals as well as the more common small lightning totals that drive the comparisons. This study shows that there is good potential for estimating lightning from microwave remote sensing, but that a reduction in uncertainty is necessary before the methods could be applied outside of the United States.
The authors thank the staff at the Precipitation Research Group at Colorado State University for maintaining and managing the SSM/I data distribution. BIM and TW were partially supported by the North Carolina Space Grant Consortium’s New Investigator Program. BIM was also partially supported by a UNC Charlotte faculty research grant. DJC was supported through the Lightning Imaging Sensor (LIS) team via the NASA Tropical Rainfall Measuring Mission.