The Development and Initial Capabilities of ThunderCast, a Deep Learning Model for Thunderstorm Nowcasting in the United States

Stephanie M. Ortland (a), Michael J. Pavolonis (b), and John L. Cintineo (c)

(a) University of Wisconsin–Madison, Madison, Wisconsin
(b) Advanced Satellite Products Branch, NOAA/NESDIS/Center for Satellite Applications and Research, Madison, Wisconsin
(c) Cooperative Institute for Meteorological Satellite Studies, University of Wisconsin–Madison, Madison, Wisconsin
Open access

Abstract

This paper presents the Thunderstorm Nowcasting Tool (ThunderCast), a 24-h, year-round model for predicting the location of convection that is likely to initiate or remain a thunderstorm in the next 0–60 min in the continental United States, adapted from existing deep learning convection applications. ThunderCast utilizes a U-Net convolutional neural network for semantic segmentation trained on 320 km × 320 km data patches with four inputs and one target dataset. The inputs are satellite bands from the Geostationary Operational Environmental Satellite-16 (GOES-16) Advanced Baseline Imager (ABI) in the visible, shortwave infrared, and longwave infrared spectra, and the target is Multi-Radar Multi-Sensor (MRMS) radar reflectivity at the −10°C isotherm in the atmosphere. On a pixel-by-pixel basis, ThunderCast has high accuracy, recall, and specificity but is subject to false-positive predictions resulting in low precision. However, the number of false positives decreases when buffering the target values with a 15 km × 15 km centered window, indicating ThunderCast’s predictions are useful within a buffered area. To demonstrate the initial prediction capabilities of ThunderCast, three case studies are presented: a mesoscale convective vortex, sea-breeze convection, and monsoonal convection in the southwestern United States. The case studies illustrate that the ThunderCast model effectively nowcasts the location of newly initiated and ongoing active convection, within the next 60 min, under a variety of geographical and meteorological conditions.

Significance Statement

In this research, a machine learning model is developed for short-term (0–60 min) forecasting of thunderstorms in the continental United States using geostationary satellite imagery as inputs for predicting active convection based on radar thresholds. Pending additional testing, the model may be able to provide decision-support services for thunderstorm forecasting. The case studies presented here indicate the model is able to nowcast convective initiation with 5–35 min of lead time in areas without radar coverage and anticipate future locations of storms without additional environmental context.

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Stephanie M. Ortland, ortland2@wisc.edu

1. Introduction

a. Convection, radar, and satellites

Field studies in the twentieth century documented the relationship between convection and precipitation radar echoes, forming the basis for radar-based definitions of convective storms (synonymously referred to as thunderstorms in this paper). An extensive field campaign called the National Hail Research Experiment aimed to increase the understanding of the microphysics and cloud dynamics of severe convective storms by analyzing thunderstorms located in northeastern Colorado, southeastern Wyoming, and/or southwestern Nebraska (Morgan and Squires 1982; Fankhauser and Wade 1982). As a part of this campaign, Dye et al. (1982) and Dye and Martner (1982) recorded the environmental conditions, reflectivity structure, and microphysical characteristics of a thunderstorm on 25 July 1976 with moderate to heavy precipitation and light hail. The microphysical observations during the organizing stage of the storm corresponded to reflectivity values between approximately 30 and 40 dBZ. Although these measurements were taken at a temperature of −5°C, Dye and Martner (1982) also assert −10°C or colder cloud temperatures are necessary for adequate concentrations of nuclei for significant precipitation.

With this case study for reference, radar-based thresholds for convective initiation of thunderstorms and reflectivity measurements used for thunderstorm tracking tend to range between 30 and 40 dBZ at various temperature levels including −10°C. During a field campaign designed to determine the importance of radar-observed boundary layer convergence lines in initiating convective storms over the Colorado plains, Wilson and Schreiber (1986) defined a storm as a reflectivity echo of ≥30 dBZ. This same reflectivity was used as a starting threshold for the Storm Cell Identification and Tracking (SCIT) algorithm (Johnson et al. 1998). Alternatively, Roberts and Rutledge (2003) used a radar reflectivity threshold of 35 dBZ to distinguish between weakly precipitating storms (<35 dBZ) and vigorous convective storms (>35 dBZ). The 35-dBZ threshold was also used in the verification process for the Auto-Nowcast System detailed in Mueller et al. (2003) and studies citing a “legacy-based” radar definition of convective storms or initiation of convective storms (Mecikalski and Bedka 2006; Mecikalski et al. 2010a,b, 2015; Walker et al. 2012). Although 40 dBZ, the highest end of the reflectivity range observed in Dye and Martner (1982), was used as a proxy for thunderstorm initiation in Sieglaff et al. (2011), it has not been used often in radar-based definitions of thunderstorms, because it correlates with lightning, a postconvective initiation event (Zipser and Lutz 1994; Gremillion and Orville 1999; Elsenheimer and Gravelle 2019).

Based on the above discussion, a threshold of either 30 or 35 dBZ would be appropriate to reference as a radar signature of a developing thunderstorm. To maximize the amount of lead time obtained with the Thunderstorm Nowcasting Tool (ThunderCast), the research presented herein defines convective initiation leading to thunderstorm formation as the first occurrence of a radar reflectivity echo of ≥30 dBZ at the −10°C isotherm level in the atmosphere. This definition is consistent with the 25 July 1976 case study in the National Hail Research Experiment and will be used in the methodology presented in section 2.

As the radar-based definition of thunderstorm initiation suggests, radar is a powerful tool for observing and diagnosing convection associated with thunderstorms. However, the network of radar instruments in the contiguous United States (CONUS), although extensive, does not provide full coverage of the land area. Coverage gaps are especially prevalent in less populated regions in the western United States, where beam blocking from mountainous terrain is commonplace, and the nearest radars may be out of range. Additionally, radar does not provide information on storm development prior to the first echoes detected from precipitation. Satellites can fill this gap in observations and provide greater forecast lead time. The Geostationary Operational Environmental Satellite (GOES) program provides satellite imagery with the Advanced Baseline Imager (ABI), summarized in Schmit et al. (2017), with high resolution for the entire CONUS, including areas without radar coverage.

ABI spectral bands with 0.64-μm (channel 2), 1.6-μm (channel 5), 10.3-μm (channel 13), and 12.3-μm (channel 15) wavelengths and/or combinations of these bands are sensitive to pertinent features in thunderstorms, such as cloud type, overshooting tops, cloud particle size, cloud-top glaciation, and cloud-top height (Pavolonis et al. 2005; Elsenheimer and Gravelle 2019). These four ABI bands are commonly referred to as the red, snow/ice, clean longwave window, and dirty longwave window bands, respectively (Schmit et al. 2017). With data from the ABI and earlier geostationary imagers, various geostationary satellite–based convection nowcasting tools have been developed. As an example, Roberts and Rutledge (2003), using GOES-8 (0.62- and 10.7-μm wavelengths) and Weather Surveillance Radar-1988 Doppler (WSR-88D) data, found the rates of cooling of cloud-top brightness temperatures were important for discriminating between weakly precipitating storms and vigorous convective storms. Their work increased the lead time and accuracy of convective storm forecasts produced by the Auto-Nowcast System originally developed in Mueller et al. (2003).

Both radar and satellite data contribute valuable information for detecting thunderstorms throughout their various life stages. Because of this, a radar-based definition of thunderstorm initiation was used in studies such as Mecikalski and Bedka (2006), Mecikalski et al. (2010a,b), and Walker et al. (2012) to understand satellite cloud-top signatures for developing convective storms. The underlying purpose of these studies was to enhance convective storm forecasting. Further improvements to convection nowcasting models came soon after with the adoption of machine learning techniques (Mecikalski et al. 2015; Lagerquist et al. 2021; Bradshaw 2021).

b. Nowcasting with deep learning

Artificial intelligence is a broad term encompassing computer programs designed to automate intellectual tasks usually performed by humans. Machine learning refers to particular cases of artificial intelligence where, when presented with many examples of inputs and desired outputs for a given task, a machine learns the rules to map input parameters to the desired output (Chollet 2018; Stevens et al. 2020). The appropriate structure of a machine learning model varies depending on the task. Models requiring many successive layers (where data transformations occur) for learning are called deep learning models. The number of layers is referred to as the depth of the model; hence, many layers categorize the model as deep (Chollet 2018).

Deep learning has been increasing in popularity in atmospheric sciences and has been used, with success, for tropical cyclone intensity estimates (Wimmers et al. 2019; Griffin et al. 2022), synoptic-scale front prediction (Lagerquist et al. 2019), short-term tornado detection (Lagerquist et al. 2020), satellite-driven convective intensity (Cintineo et al. 2020), nowcasting radar echoes (Cuomo and Chandrasekar 2021; Ravuri et al. 2021), and lightning predictions (Zhou et al. 2020, 2022; Cintineo et al. 2022). Recently, Lagerquist et al. (2021) applied deep learning to nowcast convection in Taiwan with Himawari-8 satellite data using U-Net convolutional neural network architectures, originally designed for the classification of biomedical imagery in Ronneberger et al. (2015). U-Nets have become popular for semantic segmentation tasks in atmospheric sciences, where pixel-by-pixel predictions are valued. However, in order to apply the results of Lagerquist et al. (2021) to the United States, adjustments are necessary to account for differing meteorological conditions and satellite coverage.

With this in mind, there are indications that deep learning models, such as the vanilla U-Net in Lagerquist et al. (2021), can be applied to the CONUS with adjustments. Lee et al. (2021) identified present-time convection using GOES-16 data (at wavelengths of 0.65 and 11.2 μm) and an echo-classification algorithm [detailed in Zhang et al. (2016)], indicating machine learning models can learn the physical properties of clouds from GOES-16 high-spatial- and high-temporal-resolution data. This was also demonstrated in Bradshaw (2021), where a U-Net was implemented for a small portion of the CONUS to predict daytime convection for the next hour. This present paper describes a U-Net designed to nowcast thunderstorms in the CONUS, using a 60-min prediction window, based on spatial and spectral patterns in GOES-16 ABI imagery. The model, referred to as ThunderCast, was trained using NEXRAD data as the target dataset. If successfully demonstrated in a testbed environment, ThunderCast could be used to support operational applications by providing lead time to thunderstorm initiation prior to radar thunderstorm signatures in all CONUS regions, including those without radar coverage. ThunderCast may also be used to gain insight into the relationship between radar and satellite products and physical processes.

2. Method

a. Model structure

To train a deep learning model, a set of inputs and a corresponding target (also referred to as the “truth” or desired values) are presented to a deep learning architecture through many iterations to optimize the resemblance of the model output to the given target. This process is depicted in Fig. 1 and described in detail in Chollet (2018). The model architecture includes successive layers to transform the data into meaningful representations, which are characterized by a set of weights to determine the relative contribution of each layer to the final output. Weight updating occurs after a subset of the data, called a batch, is processed. A cycle through all the batches (the entire training dataset) forms an epoch. The model trains for many epochs until the difference between the model output and the target, called the loss and measured by the loss or cost function, is minimized (Chollet 2018; Stevens et al. 2020). Once trained, the model goes through validation to tune selected model hyperparameters, such as the batch size and the learning rate. After training and validation are complete, a set of inputs can be passed through the model to obtain a set of predictions without using the target. As shown in Fig. 1, cross entropy for a binary model [as defined in Cintineo et al. (2022)] and Adam (Kingma and Ba 2017) were chosen as this application’s loss function and optimizer, respectively. The final model hyperparameters are displayed in Table 1 and in the code repository provided in the data availability statement at the end of this paper. The hyperparameters were chosen based on computing constraints [e.g., number of graphical processing units (GPUs)], trial and error (e.g., early stopping patience), and the scientific understanding of the given task as discussed throughout this paper (e.g., number of inputs and class weights).
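A class-weighted binary cross entropy of the kind referred to above can be sketched as follows. The exact formulation used for ThunderCast is the one defined in Cintineo et al. (2022) and in the code repository; the function name, the weight arguments `w_pos` and `w_neg`, and the clipping constant `eps` are illustrative assumptions, not the authors' implementation.

```python
import math

def weighted_binary_cross_entropy(targets, probs, w_pos=1.0, w_neg=1.0, eps=1e-7):
    """Mean class-weighted binary cross entropy over a flat list of pixels.

    targets: 0/1 labels; probs: predicted probabilities in [0, 1].
    w_pos and w_neg are hypothetical class weights for the two classes.
    """
    total = 0.0
    for y, p in zip(targets, probs):
        # Clip to avoid log(0) for saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / len(targets)
```

Raising `w_pos` relative to `w_neg` penalizes missed thunderstorm pixels more heavily, which is one common way to counter a strong class imbalance.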

Fig. 1.

Diagram of model training for ThunderCast. The cycle is repeated until the model loss is minimized.

Citation: Artificial Intelligence for the Earth Systems 2, 4; 10.1175/AIES-D-23-0044.1

Table 1.

Select hyperparameters configured for the training process.


ThunderCast’s layers are arranged in a U-Net convolutional neural network, as shown in Fig. 2. The initial layers (Fig. 2 upper left) are the inputs to the model (see section 2b), with subsequent layers representing data filters, which encode or decode features of the input data (Chollet 2018). The number of layers, or depth of the U-Net, is consistent with the U-Nets described in Cintineo et al. (2022) and Ronneberger et al. (2015) and was not varied as a part of the hyperparameter tuning process. Three types of data transformations are used in the U-Net: two-dimensional convolutions, max pooling, and upsampling. Convolutions use the scalar product of the model weights (kernel) and the inputs to extract translationally invariant spatial hierarchies of features (Chollet 2018; Stevens et al. 2020). During most of the convolutions, the data are split into 3 × 3 overlapping windows with a stride of one and padding with zeros to ensure the output has the same dimension as the inputs. These are accompanied by a rectified linear unit (ReLU) activation function, allowing for nonlinearity (Maas et al. 2013). However, the last convolution uses a 1 × 1 window and a sigmoid activation function to obtain one gridded segmentation map of probability scores with values between zero and one at 1-km resolution (320 km × 320 km, matching the target dataset’s resolution) as output. The convolutions are important for isolating and learning local data patterns, and max pooling allows for learning at multiple spatial scales. Max pooling aggressively downsamples the data by taking the maximum of the series with 2 × 2 windows and a stride of two (downsamples by a factor of two). The size of the layers is halved during this procedure, so after max pooling, a window will view data from a larger area. Thus, initially, the U-Net learns small-scale data patterns, and it learns large-scale patterns after max-pooling transformations. At the bottom of the “U” in the U-Net, shown in Fig. 2, the image has a low resolution. Upsampling acts to return the image to a high resolution in order to obtain a pixel-by-pixel result. In particular, upsampling is performed with a combination of a two-dimensional transposed convolution and a concatenation (skip connection). Concatenations retain overall prediction details while helping to converge on a loss value during training (Ronneberger et al. 2015).
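The two encoder operations described above, a zero-padded convolution with a stride of one followed by ReLU and a 2 × 2 max pooling with a stride of two, can be illustrated on a single two-dimensional field. This is a minimal NumPy sketch with hypothetical function names; a trained U-Net would apply many learned kernels per layer through a deep learning library rather than explicit loops.

```python
import numpy as np

def conv2d_same_relu(image, kernel):
    """Stride-one convolution with zero padding, followed by ReLU.

    As in deep learning libraries, the kernel is applied without flipping
    (a scalar product of the kernel with each overlapping window).
    """
    image = np.asarray(image, dtype=float)
    kh, kw = kernel.shape
    # Zero padding keeps the output the same size as the input.
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def max_pool_2x2(image):
    """2 x 2 max pooling with a stride of two (rows and columns must be even)."""
    image = np.asarray(image)
    rows, cols = image.shape
    return image.reshape(rows // 2, 2, cols // 2, 2).max(axis=(1, 3))
```

Each pooling step halves both grid dimensions, so a 3 × 3 window applied after pooling covers twice the distance on the ground that it covered before.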

Fig. 2.

Depiction of ThunderCast’s U-Net convolutional neural network model architecture. The figure is adapted for this application from Ronneberger et al. (2015).


b. Data

GOES-16 (also currently designated GOES-East) ABI spectral bands with wavelengths of 0.64 μm (channel-2 reflectance at 0.5-km resolution), 1.6 μm (channel-5 reflectance at 1-km resolution), 10.3 μm (channel-13 brightness temperature at 2-km resolution), and 12.3 μm (channel-15 brightness temperature at 2-km resolution) compose the model input for ThunderCast. These ABI spectral bands were selected because they consist of multispectral imagery that is commonly utilized by forecasters for diagnosing trends and patterns in cumuliform clouds, including cloud-top glaciation, cloud-top temperature, and morphology (Elsenheimer and Gravelle 2019). All four channels are included in the predictors regardless of solar illumination. However, in the absence of sunlight, the visible and shortwave infrared channels are near zero and are not expected to contribute to predictions. As shown in Fig. 2, the channels with resolutions of 1 or 2 km are upsampled to 0.5-km resolution at the beginning of the U-Net architecture to preserve the fine resolution of the visible band. Additionally, each input is normalized by subtracting the mean and dividing by the standard deviation of each respective spectral band in the training dataset.
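The normalization step can be sketched for one spectral band as follows. The function names are hypothetical, and the statistics are assumed to be computed once over all training samples of that band and then reused unchanged for validation, testing, and inference.

```python
import numpy as np

def band_statistics(training_samples):
    """Mean and standard deviation of one spectral band over the training set."""
    data = np.asarray(training_samples, dtype=float)
    return data.mean(), data.std()

def normalize_band(band, train_mean, train_std):
    """Subtract the training mean and divide by the training standard deviation."""
    return (np.asarray(band, dtype=float) - train_mean) / train_std
```

Because each band is scaled by its own statistics, reflectances (channels 2 and 5) and brightness temperatures (channels 13 and 15) reach the network with comparable magnitudes.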

To convey prediction uncertainty, the desired output of the model is interpreted as a grid of 60-min thunderstorm probabilities. To train the model to produce an optimal prediction of this nature, the target represents whether a thunderstorm has occurred between the input time and an hour afterward and is obtained with a grid of maximum Multi-Radar Multi-Sensor [MRMS; described further in Zhang et al. (2016) and Smith et al. (2016)] radar reflectivity at the −10°C isotherm in the atmosphere. The grid is binarized such that any pixel with maximum reflectivity greater than or equal to 30 dBZ during the hour after the input time is considered positive for thunderstorm occurrence. To maximize the lead time obtained with the model, 30 dBZ is selected as the radar threshold because it corresponds to the earliest thunderstorm radar signature, as discussed in section 1a.
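The construction of the binary target can be sketched as a maximum over time followed by a threshold. The sketch assumes the MRMS reflectivity at the −10°C isotherm for the hour after the input time is available as a stack of grids with shape (time, rows, columns), and it uses the ≥30 dBZ definition from section 1a; the function and argument names are hypothetical.

```python
import numpy as np

def thunderstorm_target(reflectivity_stack, threshold_dbz=30.0):
    """Binary target from a (time, rows, cols) stack of reflectivity grids.

    A pixel is positive when its maximum reflectivity during the hour
    reaches the threshold at any time step.
    """
    stack = np.asarray(reflectivity_stack, dtype=float)
    hourly_max = stack.max(axis=0)
    return (hourly_max >= threshold_dbz).astype(np.uint8)
```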

Preprocessing is necessary to address data, computing, and machine learning limitations. Because large data files require more computing memory than is typically available on current GPU servers for model training, it is not feasible to supply data from full-domain, high-resolution GOES-16 scans to the model. To account for this sort of limitation, Liu et al. (2018) implemented a patch-wise sampling technique, and a similar approach is used here. The GOES-16 scans are automatically split into small 640 × 640 pixel patches at 0.5-km resolution (simply referred to as patches or data patches in this paper). To avoid a loss of contextual information from splitting apart storms along patch borders, patches overlap adjacent patches by 32 pixels (0.5-km resolution) on each of their sides, as shown in Fig. 3. Additional errors in training or unreasonable model outputs caused by data artifacts are avoided by rejecting patches containing at least one of the following: any “not a number” (NaN) values, MRMS data with more than 1% of pixels with no coverage (designated by −999 values within the MRMS radar reflectivity at the −10°C isotherm dataset), or any other fill values.
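The patch layout and rejection rules can be sketched as follows. The sketch assumes a fixed stride of 640 − 32 = 608 pixels along each axis and drops any trailing partial patch; both choices, along with the function names, are illustrative assumptions. Checks for other fill values are omitted because they are not specified here.

```python
import numpy as np

def patch_origins(scan_size, patch_size=640, overlap=32):
    """Start indices of patches along one axis of a scan (0.5-km pixels)."""
    stride = patch_size - overlap
    return list(range(0, scan_size - patch_size + 1, stride))

def keep_patch(inputs, mrms, no_coverage=-999.0, max_missing=0.01):
    """Return False for patches with NaNs or too many no-coverage MRMS pixels."""
    inputs = np.asarray(inputs, dtype=float)
    mrms = np.asarray(mrms, dtype=float)
    if np.isnan(inputs).any() or np.isnan(mrms).any():
        return False
    # Reject when more than 1% of MRMS pixels carry the no-coverage flag.
    return bool(np.mean(mrms == no_coverage) <= max_missing)
```

For example, an axis of 1856 pixels yields patch origins at 0, 608, and 1216, with each neighboring pair sharing a 32-pixel strip.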

Fig. 3.

Example of a corner patch (black) with overlapping adjacent patches (blue and red outlines). The number of pixels is labeled at 0.5-km resolution (black), and the patches are not drawn to scale.


At any one time in the CONUS, the number of negative pixels (without thunderstorm activity) far exceeds the number of positive pixels (with thunderstorm activity). A sample is considered a positive patch when more than 1% of pixels are positive. Even positive patches generally have a class imbalance, as there are many negative pixels in a given scene. If this imbalance goes unchecked, a model can be highly accurate by never making a positive prediction. To mitigate this, a majority of negative patches are not used. Within a randomized list of negative patches, every 100th case is selected for use in the model to ensure the model has some exposure to a variety of midlatitude weather phenomena. The average percentage of positive pixels in the patches is used to set the class weights prior to training (shown in Table 1).
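The negative-patch subsampling and a possible class-weight calculation can be sketched as follows. Selecting every 100th entry of a shuffled list follows the text; the inverse-frequency weighting is only one common convention and is an assumption here, since the values actually used are those listed in Table 1. All names are hypothetical.

```python
import random

def subsample_negatives(negative_patches, step=100, seed=0):
    """Shuffle the negative patches and keep every `step`-th one."""
    shuffled = list(negative_patches)
    random.Random(seed).shuffle(shuffled)
    return shuffled[::step]

def inverse_frequency_weights(positive_fraction):
    """Assumed weighting: each class weighted by 1 / (2 * class frequency)."""
    return {
        0: 1.0 / (2.0 * (1.0 - positive_fraction)),
        1: 1.0 / (2.0 * positive_fraction),
    }
```

With this convention, a balanced dataset gives both classes a weight of 1, and a rarer positive class receives a proportionally larger weight.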

Developing a deep learning model can be broken into three stages, each with its own separate dataset: training, validation, and testing. The training dataset is used to learn the weights necessary for an optimized model (as discussed in section 2a); the validation dataset is used for evaluating a model’s skill during training for hyperparameter tuning; and the testing dataset is used to determine the model’s after-training statistics, such as accuracy, precision, and recall. To ensure the samples between datasets are separated temporally, the training, validation, and testing datasets were taken from the years 2019, 2020, and 2021, respectively. The total number of data patches in each dataset is provided in Table 2. Within the datasets, patches are collected from all months in a year, as shown in Fig. 4. Additionally, to ensure ThunderCast has exposure to many types of terrain and environments, patches can originate from all climate regions in the CONUS as well as from areas with radar coverage near the coasts, referred to as outside CONUS (OCONUS). The CONUS climate regions are shown in Fig. 5, where a patch is considered a part of the region where the center pixel is located. The numbers of patches in each climate region for training, validation, and testing are shown in Fig. 6. Variations in the number of patches per month and per climate region can be attributed to the temporal and spatial variability of thunderstorms in the United States. The monthly and regional distributions (Figs. 4 and 6) indicate that all three datasets (training, validation, and testing) well represent the thunderstorm climatology for 2019–21.

Table 2.

The number of 640 × 640 pixel (0.5-km resolution) data patches in the training, validation, and testing datasets. The years used for each dataset are indicated in parentheses.

Fig. 4.

Temporal distribution of data patches composing the training, validation, and testing datasets.


Fig. 5.

Map of the CONUS with state colors corresponding to the state’s respective climate region in accordance with Karl and Koss (1984).


Fig. 6.

Spatial distribution of data patches composing the training, validation, and testing datasets sorted by climate regions in the CONUS and OCONUS. The number of patches is indicated by the height of the bars as well as the values above each bar (“e+0X” indicates “× 10^X”).


During initial model validation, predictions were biased toward west-to-east storm motion, which is the predominant atmospheric steering pattern in the CONUS. For example, for a known eastward-moving storm, predictions were elongated eastward of existing radar reflectivity at −10°C, suggesting in the next hour there could be storm development toward the east of the existing reflectivity, or there could be eastward motion of the existing storm system. The eastward prediction elongation continued to occur when the storm was flipped over the vertical axis (becoming a westward-moving storm in a time series of patches). To address this bias, data augmentation is utilized. When batches of data patches are imported for training, each batch has a 25% chance the patches will be flipped over the vertical axis, followed by a 25% chance of being flipped over the patches’ horizontal axis. This results in some patches being flipped horizontally, vertically, or both horizontally and vertically during model training.
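The flip augmentation can be sketched as two independent random draws per batch. The sketch assumes every array in the batch (each input band and the target) has shape (batch, rows, columns) and that the same flips are applied to all of them so inputs and target stay aligned; the function name and the use of a NumPy random generator are illustrative choices.

```python
import numpy as np

def random_flip(arrays, rng, p_flip=0.25):
    """Flip a batch left-right and/or up-down, each with probability p_flip.

    arrays: list of arrays shaped (batch, rows, cols); rng: np.random.Generator.
    """
    if rng.random() < p_flip:
        # Flip over the vertical axis (reverses the column order).
        arrays = [np.flip(a, axis=2) for a in arrays]
    if rng.random() < p_flip:
        # Flip over the horizontal axis (reverses the row order).
        arrays = [np.flip(a, axis=1) for a in arrays]
    return arrays
```

With two independent 25% draws, about 56% of batches are left unchanged, about 19% receive each single flip, and about 6% receive both.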

c. Evaluation techniques

Traditional machine learning and forecasting statistics are used to evaluate model performance in section 3a. Although ThunderCast’s outputs are probabilistic predictions for the next hour, the predictions are binarized for statistical evaluation. Predictions greater than or equal to a probability threshold (e.g., 50% probability) are treated as a “yes” thunderstorm (positive) case, and all other predictions are treated as a “no” thunderstorm (negative) case. A prediction can then be placed in one of four categories: true positive (TP), true negative (TN), false positive (FP), or false negative (FN). True positives or negatives occur when both the prediction and observation are positive or negative, respectively; false positives occur when the prediction is positive while the observation is negative; and false negatives occur when the prediction is negative while the observation is positive. Since ThunderCast makes pixel-by-pixel predictions, the total pixels in each category are counted for the entire training, validation, or testing set [represented as sums in Eqs. (1)–(6)] and used to compute accuracy, precision, recall, specificity, false alarm ratio, critical success index, and frequency bias [represented in Eqs. (1)–(7)]. With potential values in the range of [0, 1], 1 is an ideal value for accuracy, precision, recall, specificity, and critical success index and 0 is ideal for the false alarm ratio. Frequency bias ranges over [0, ∞) and is ideally close to 1:
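$$\mathrm{accuracy} = \frac{\sum \mathrm{TP} + \sum \mathrm{TN}}{\sum \mathrm{TP} + \sum \mathrm{FP} + \sum \mathrm{FN} + \sum \mathrm{TN}}, \tag{1}$$
$$\mathrm{precision} = \frac{\sum \mathrm{TP}}{\sum \mathrm{TP} + \sum \mathrm{FP}}, \tag{2}$$
$$\mathrm{recall} = \text{probability of detection (POD)} = \frac{\sum \mathrm{TP}}{\sum \mathrm{TP} + \sum \mathrm{FN}}, \tag{3}$$
$$\mathrm{specificity} = \frac{\sum \mathrm{TN}}{\sum \mathrm{TN} + \sum \mathrm{FP}}, \tag{4}$$
$$\text{false alarm ratio (FAR)} = \frac{\sum \mathrm{FP}}{\sum \mathrm{TP} + \sum \mathrm{FP}}, \tag{5}$$
$$\text{critical success index (CSI)} = \frac{\sum \mathrm{TP}}{\sum \mathrm{TP} + \sum \mathrm{FN} + \sum \mathrm{FP}}, \quad \text{and} \tag{6}$$
$$\text{frequency bias} = \frac{\mathrm{POD}}{1 - \mathrm{FAR}}. \tag{7}$$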
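The seven metrics follow directly from the four pixel totals, as in this short sketch. The function name and the example counts in the usage note are illustrative only and are not results from this study.

```python
def verification_metrics(tp, fp, fn, tn):
    """Compute Eqs. (1)-(7) from total pixel counts in each category."""
    pod = tp / (tp + fn)
    far = fp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": pod,
        "specificity": tn / (tn + fp),
        "far": far,
        "csi": tp / (tp + fn + fp),
        # Equivalent to (tp + fp) / (tp + fn): forecast "yes" over observed "yes".
        "frequency_bias": pod / (1.0 - far),
    }
```

For hypothetical totals of TP = 20, FP = 60, FN = 5, and TN = 915, accuracy is 0.935 and recall is 0.8, yet precision is only 0.25 and the frequency bias is 3.2. A dataset dominated by negative pixels can therefore combine high accuracy with low precision.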
For visual interpretation of the ThunderCast model’s performance, an attributes diagram [developed in Hsu and Murphy (1986)] and a performance diagram [developed in Roebber (2009)] are provided in section 3. Forecasts are considered reliable or calibrated when the relative frequency of occurrence of events is equal to the probability forecast for those events. Thus, plotting the conditional event frequency with respect to the forecast probability gives a reliability curve. The attributes diagram displays the reliability curve, with reference lines for skill levels (Hsu and Murphy 1986). The performance diagram combines the recall or probability of detection [Eq. (3)], the success ratio [the false alarm ratio in Eq. (5) subtracted from 1], and the critical success index [Eq. (6)] into one diagram. A perfect performance diagram case results when the area under a curve with data points representing many probability thresholds is 1.
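A reliability curve of the kind plotted in an attributes diagram can be computed by binning the forecast probabilities and taking the observed event frequency in each bin. The sketch below assumes ten equal-width bins and hypothetical function and argument names; the binning used for the figures in section 3 may differ.

```python
import numpy as np

def reliability_curve(probs, observed, n_bins=10):
    """Mean forecast probability and observed frequency per probability bin.

    Bins with no forecasts are returned as NaN.
    """
    probs = np.asarray(probs, dtype=float).ravel()
    observed = np.asarray(observed, dtype=float).ravel()
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin index in 0..n_bins-1.
    index = np.digitize(probs, edges[1:-1])
    mean_prob = np.full(n_bins, np.nan)
    frequency = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = index == b
        if in_bin.any():
            mean_prob[b] = probs[in_bin].mean()
            frequency[b] = observed[in_bin].mean()
    return mean_prob, frequency
```

A perfectly reliable forecast has `frequency` equal to `mean_prob` in every populated bin, which plots along the diagonal of the diagram.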
Another statistical metric called the fraction skill score (FSS) is used to determine how ThunderCast’s forecast skill varies with spatial scale (Roberts and Lean 2008). For every grid point in a binarized grid of predictions (of size $N_x$ by $N_y$ in the Cartesian grid), the fraction of positive grid points (the average of the binary values) within a centered square (window) of length $n$ is computed. The result forms a density field $P(n)_{i,j}$, where $i$ and $j$ are indices for the $xy$ Cartesian grid. A similar grid, $T(n)_{i,j}$, is also calculated for the target. This process can be described as taking the average pooling of the binarized predictions and the target. The density fields $P(n)_{i,j}$ and $T(n)_{i,j}$ are then used to calculate the mean-square error (MSE) for the target and prediction densities [$\mathrm{MSE}(n)$; Eq. (8)] and for a reference case [$\mathrm{MSE}(n)_{\mathrm{ref}}$; Eq. (9)]. These quantities form the basis for the fraction skill score shown in Eq. (10). The definitions are
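$$\mathrm{MSE}(n) = \frac{1}{N_x N_y} \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} \left[ T(n)_{i,j} - P(n)_{i,j} \right]^2, \tag{8}$$
$$\mathrm{MSE}(n)_{\mathrm{ref}} = \frac{1}{N_x N_y} \left[ \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} T(n)_{i,j}^2 + \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} P(n)_{i,j}^2 \right], \quad \text{and} \tag{9}$$
$$\text{fraction skill score (FSS)} = 1 - \frac{\mathrm{MSE}(n)}{\mathrm{MSE}(n)_{\mathrm{ref}}}. \tag{10}$$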
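Equations (8)–(10) can be evaluated for a single pair of binary grids as in the following sketch. It assumes an odd window length, treats points beyond the grid edge as zeros, and computes the window averages with a summed-area table; these choices and the function names are assumptions for illustration, not necessarily those used for Fig. 7.

```python
import numpy as np

def window_fraction(binary, n):
    """Fraction of positive points in an n x n window centered on each point."""
    assert n % 2 == 1, "window length is assumed to be odd"
    field = np.asarray(binary, dtype=float)
    rows, cols = field.shape
    padded = np.pad(field, n // 2)  # zeros outside the grid
    # Summed-area table with a leading row and column of zeros.
    sat = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    window_sum = (sat[n:n + rows, n:n + cols] - sat[:rows, n:n + cols]
                  - sat[n:n + rows, :cols] + sat[:rows, :cols])
    return window_sum / (n * n)

def fraction_skill_score(prediction, target, n):
    """FSS from Eqs. (8)-(10) for binary prediction and target grids."""
    p = window_fraction(prediction, n)
    t = window_fraction(target, n)
    mse = np.mean((t - p) ** 2)
    mse_ref = np.mean(t ** 2) + np.mean(p ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else float("nan")
```

As a small worked case, a single predicted pixel displaced by one grid point from a single observed pixel scores 0 at n = 1 but 2/3 at n = 3, which illustrates how a larger window rewards near misses.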
In addition to statistical evaluations, ThunderCast is applied to three distinct thunderstorm growth environments in sections 3b–d: a mesoscale convective vortex, sea breezes, and a southwestern monsoon. Each case study includes visual representations of the satellite spectral bands (inputs), radar reflectivity at the −10°C isotherm (target), and ThunderCast predictions. Forecast lead times from ThunderCast’s predictions to the occurrence of 30 dBZ at −10°C are presented.

3. Results

a. Model performance

Evaluating the model on the testing dataset, which it has not seen before, provides context on how the model is performing. The pixel-to-pixel statistics shown in Table 3 indicate the model is performing well for accuracy, recall, and specificity but has a low value for precision. Based on the equations for these metrics [Eqs. (1)–(4)], ThunderCast tends to have a high number of false positives, resulting in low precision. However, a pixel-to-pixel evaluation does not take into account any slight offsets between the predictions and the target. The fraction skill score [Eq. (10)] provides insight into whether predictions are skillful within a spatial range. Fraction skill scores for probability thresholds ranging from 10% to 90% for varied spatial window sizes are shown in Fig. 7. In Fig. 7, the fraction skill score slightly increases for all probability thresholds as the window size increases, although minimal improvements occur after the windows reach a length (n) between 13 and 17 pixels (equivalent to 13–17 km, since the spatial resolution is 1 km). This indicates the skill of the model improves when evaluated within a spatial range of the predictions.

Table 3.

Model performance for the full, daytime-only, and nighttime-only testing datasets, where probabilities greater than or equal to 20% are considered to be positive for thunderstorm activity. Additional columns include values computed using an alternate test set with the same data patches but the target values have been adjusted such that all pixels within a 15 km × 15 km window centered on a pixel containing a maximum reflectivity of 30 dBZ or greater at −10°C for the next hour are also considered positive for thunderstorm occurrence (called a buffered dataset here).

Fig. 7.

Fraction skill score diagram based on the method presented in Roberts and Lean (2008). The fraction skill scores [Eq. (10)] are calculated for probability thresholds ranging from 10% to 90% for various window lengths. The window lengths [n in Eq. (10)] are given in kilometers but can be referred to as pixels or grid spaces since the spatial resolution is 1 km. The probability thresholds (%) are colored according to the legend in the upper-right-hand corner of the diagram. The lower dashed gray line represents the fraction of observed points exceeding 30 dBZ at −10°C over the domain and is called the random fraction skill score. The upper dashed gray line is the uniform fraction skill score and marks halfway between the random and perfect skill scores.

Citation: Artificial Intelligence for the Earth Systems 2, 4; 10.1175/AIES-D-23-0044.1

If the scope of the target values resulting in true positives is broadened, such that any positive target value within a 15 km × 15 km window centered on the location of a prediction would result in a true-positive value, then the precision [Eq. (2)] improves, as shown in Table 3. The broadened target in the testing dataset is referred to as the buffered target or buffered dataset in this paper. Although other values could be chosen for the buffer window, the fraction skill score improves minimally with windows greater than 13–17 km, and 15 km was found to represent the model well in a trial-and-error attributes diagram analysis. The increased precision achieved with the buffered dataset indicates ThunderCast’s predictions may be useful within a buffered area surrounding a prediction. However, there is a tradeoff in doing this. In Table 3, the improvements in precision are concurrent with decreases in recall. The recall equation [Eq. (3)] is the same as that for precision, except recall has the sum of false negatives in the denominator instead of false positives. Thus, a decrease in recall with the buffered dataset indicates there will be more false negatives than with the original nonbuffered testing dataset.
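The precision and recall tradeoff introduced by buffering can be illustrated with a small sketch. It follows the buffered-dataset description in the Table 3 caption (every pixel within a 15 × 15 pixel window centered on a positive target pixel is also marked positive), but the function names `buffer_target` and `precision_recall`, the zero padding at the domain edge, and the toy fields are assumptions made for this example.

```python
import numpy as np

def buffer_target(target, n=15):
    """Mark all pixels within an n x n window (n odd) of a positive pixel."""
    target = np.asarray(target, dtype=bool)
    pad = n // 2
    padded = np.pad(target, pad, mode="constant", constant_values=False)
    buffered = np.zeros_like(target)
    ny, nx = target.shape
    # OR together every shifted copy of the field covered by the window.
    for dy in range(n):
        for dx in range(n):
            buffered |= padded[dy:dy + ny, dx:dx + nx]
    return buffered

def precision_recall(pred, target):
    """Pixel-by-pixel precision and recall for two boolean fields."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.sum(pred & target)
    fp = np.sum(pred & ~target)
    fn = np.sum(~pred & target)
    return tp / (tp + fp), tp / (tp + fn)

# Toy fields: one positive target pixel and a 5 x 5 predicted area that
# contains it but is centered two pixels to the east.
target = np.zeros((40, 40), dtype=bool)
target[20, 20] = True
pred = np.zeros((40, 40), dtype=bool)
pred[18:23, 20:25] = True

raw = precision_recall(pred, target)
buffered = precision_recall(pred, buffer_target(target, n=15))
```

In this toy case the precision rises from 0.04 to 1.0 once the target is buffered, while the recall falls from 1.0 to about 0.11 because the buffered target contains many positive pixels that the prediction does not cover.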

Using the buffered target in the testing dataset, an attributes diagram and a performance diagram are provided in Figs. 8 and 9, respectively. In Fig. 8, the 1:1 dashed gray line represents an ideal model. For example, in an ideal model, predictions of a 40% probability of thunderstorms in the next hour would result in a thunderstorm (true positive) 40% of the time. The attributes diagram in Fig. 8 indicates predictions between approximately 20% and 35% result in a thunderstorm more often than anticipated, while forecast probabilities greater than approximately 38% result in a thunderstorm less often than in the ideal model case. However, almost all forecast probabilities demonstrate skill because they are above the diagonal no-skill line shown. In Fig. 9, the critical success index [Eq. (6)], a measure often used for forecast performance evaluation, falls between 0.4 and 0.5 for all thunderstorm probabilities, with the highest critical success indices occurring between 20% and 40%. Probabilities less than 50% are slightly overforecast (frequency bias is greater than 1), and probabilities greater than 50% are underforecast (frequency bias is less than 1). The 20% probability value, although overforecast, has one of the highest values for the critical success index and is skillful according to the attributes diagram, so probabilities greater than or equal to 20% are considered positive for thunderstorm activity in the statistical calculations in Eqs. (1)–(4), as presented in Table 3 and Fig. 10.
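Each point on a performance diagram can be computed from the confusion-matrix counts at a single probability threshold. The sketch below uses the standard definitions of probability of detection, success ratio, critical success index, and frequency bias, which are assumed here to match Eqs. (3), (5), and (6) of this paper; the function name `performance_point` and the sample values are hypothetical.

```python
import numpy as np

def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator > 0 else float("nan")

def performance_point(probs, target, threshold):
    """POD, success ratio, CSI, and frequency bias at one threshold."""
    pred = np.asarray(probs, dtype=float) >= threshold
    obs = np.asarray(target).astype(bool)
    tp = int(np.sum(pred & obs))    # hits
    fp = int(np.sum(pred & ~obs))   # false alarms
    fn = int(np.sum(~pred & obs))   # misses
    pod = _ratio(tp, tp + fn)             # probability of detection (recall)
    success_ratio = _ratio(tp, tp + fp)   # 1 - false alarm ratio
    csi = _ratio(tp, tp + fp + fn)        # critical success index
    bias = _ratio(tp + fp, tp + fn)       # > 1 overforecast, < 1 underforecast
    return pod, success_ratio, csi, bias

# Illustrative values only: five forecasts and their observed outcomes.
probs = [0.9, 0.8, 0.3, 0.1, 0.6]
target = [1, 1, 1, 0, 0]
pod, success_ratio, csi, bias = performance_point(probs, target, 0.5)
```

Repeating the call for thresholds from 10% to 90% would give the sequence of points that forms the curve on the diagram. A frequency bias greater than 1 means events are predicted more often than they are observed.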

Fig. 8.

Attributes diagram for the ThunderCast model constructed with the method presented in Hsu and Murphy (1986). The blue line represents the conditional event frequency (true positives per total positive predictions) for given forecast probabilities for the testing dataset. The dashed gray lines are reference lines for determining model resolution. The vertical and horizontal dashed gray lines are the no-resolution lines equal to the testing dataset’s overall relative frequency of thunderstorm occurrence. The 1:1 dashed gray line (upper diagonal line) represents a perfect reliability or forecast calibration. The lower dashed diagonal gray line is the no-skill line, where anything below this line is considered to have no skill (Brier skill score of 0).

Fig. 9.

Performance diagram based on Roebber (2009) for the ThunderCast model on the testing dataset. The y axis displays the probability of detection [Eq. (3)], and the success ratio [i.e., 1 − FAR, or 1 − Eq. (5)] is on the x axis. The background colors denote the critical success index [Eq. (6)], and the dashed contours represent the frequency bias. Each black data point is labeled with the corresponding probability threshold (%) used for calculating the indicated quantities.

Fig. 10.

Accuracy, precision, recall, specificity, and critical success index [Eqs. (1)–(4) and (6)] for ThunderCast’s predictions on the testing dataset, with target values buffered by a 15 km × 15 km centered window. Probabilities greater than or equal to 20% are considered to be positive for thunderstorm activity in the calculations for each statistic. The testing dataset is sorted by region, and each data point represents the corresponding statistic value for only the testing data patches in the month given. For spatial reference, the climate regions are distinguishable by their colors, matching Fig. 5.

For further statistical evaluation, the testing dataset is broken into subsets to analyze performance across the diurnal (day and night) and monthly cycles and spatially across the CONUS and nearby OCONUS regions. In Table 3, the pixel-by-pixel accuracy, precision, recall, and specificity for the day and night subsets are consistent with the full testing dataset results. With the target buffer, the differences in model precision between the day and night subsets are more pronounced, indicating the model may be more precise within a 15 km × 15 km centered window during the day than during the night. Figure 10 shows the spatial and monthly variability of the accuracy, precision, recall, specificity, and critical success index. Some regions (e.g., the northwest) do not have data points for each month in the testing dataset (2021), because data patches were not present in those regions during the respective months. Across CONUS and OCONUS regions, the model performs similarly during the main convective months (midlatitude summer), with higher variability during the winter months. Although statistical insight is valuable, it is also important to determine the cases in which ThunderCast performs well and those in which it does not. A detailed evaluation of this is left for future work, but a sampling of case studies is presented in the remainder of this section to demonstrate ThunderCast’s initial prediction capabilities.

b. Case study 1: Mesoscale convective vortex

Mesoscale convective systems are organized cloud structures containing convective (cumulonimbus) and stratiform (nimbostratus) clouds, with a mesoscale cirriform cloud shield in the topmost cloud layers. In the stratiform region of a mesoscale convective system, a cyclonic vortex can form as a result of pre-existing cyclonic absolute vorticity and heating gradients. This vortex, often referred to as a mesoscale convective vortex, can trigger new convection within a mesoscale convective system (Houze 2014). On 25 August 2021, a mesoscale convective vortex over eastern Iowa resulted in thunderstorm development in western Illinois. Figure 11 shows a daytime cloud-phase distinction false-color red–green–blue (RGB) image [following the methods presented in Elsenheimer and Gravelle (2019)] of one such storm. The image depicts the beginning of the mature stage of the thunderstorm. The cold, bubbling, overshooting top is visible in the bright orange/red part of the cumulonimbus cloud, as is the beginning of the thunderstorm’s anvil. The underlying green clouds are glaciated cumulus. An animation provided in the online supplemental material for this paper shows the initiation and development of the thunderstorm through the mature stage. A separate text file in the online supplemental material contains the captions for this animation and the animations for the other two cases discussed below.

Fig. 11.

Daytime cloud-phase distinction false-color RGB image at 2041 UTC 25 Aug 2021 centered at 41.44° latitude and −90.56° longitude. Following Elsenheimer and Gravelle (2019), red is ABI 10.3-μm (channel 13) brightness temperatures, green is ABI 0.64-μm (channel 2) reflectance, and blue is ABI 1.6-μm (channel 5) reflectance. The image is 160 km × 160 km, and the light-tan line shows state borders.

Figure 12 shows ThunderCast’s predictions for the storm’s initiation along with ABI 0.64-μm reflectance and the associated radar at −10°C. ThunderCast’s first predictions of 40% or greater occur at 1831 UTC, when low-level cumulus clouds are present. The first signature of convective initiation from radar occurs at 1856 UTC, when the clouds have become glaciated, 25 min after ThunderCast’s initial predictions. To provide further context for this event, Fig. 13 displays the storm’s later development between 1931 and 2046 UTC with processed data from the GOES-16 Geostationary Lightning Mapper (GLM) (Rudlosky et al. 2019; Goodman et al. 2013; Bruning et al. 2019). According to Fig. 13, the storm becomes electrically active at 1946 UTC (50 min after 30 dBZ at −10°C) prior to anvil formation and has moved northeast of its convective initiation location in Fig. 12. ThunderCast’s predictions in Fig. 12 at 1856 UTC are elongated toward the northeast of the active convection (30 dBZ at −10°C) into nonclouded areas. Thus, ThunderCast’s next-hour thunderstorm predictions demonstrate ThunderCast may be anticipating something about storm motion through learned data patterns.
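The lead times quoted for this case follow directly from the timestamps. As a small worked example, the sketch below computes them from the UTC times given above; the helper name `minutes_between` is hypothetical, and both times are assumed to fall on the same calendar day.

```python
from datetime import datetime

def minutes_between(start_hhmm, end_hhmm, date="2021-08-25"):
    """Minutes from one UTC time to a later one on the same day."""
    fmt = "%Y-%m-%d %H%M"
    start = datetime.strptime(f"{date} {start_hhmm}", fmt)
    end = datetime.strptime(f"{date} {end_hhmm}", fmt)
    return (end - start).total_seconds() / 60.0

first_prediction = "1831"  # first ThunderCast probabilities of 40% or greater
first_30dbz = "1856"       # first 30 dBZ at -10 degC from MRMS
first_lightning = "1946"   # storm becomes electrically active (GLM)

lead_to_30dbz = minutes_between(first_prediction, first_30dbz)
radar_to_lightning = minutes_between(first_30dbz, first_lightning)
print(lead_to_30dbz, radar_to_lightning)
```

The two values are 25 min and 50 min, matching the lead times stated in the text.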

Fig. 12.

Paneled image time series between 1831 and 1856 UTC 25 Aug 2021 centered at 41.41° latitude and −90.85° longitude. Each panel contains a background of ABI 0.64-μm reflectance layered with radar reflectivity at −10°C and ThunderCast’s thunderstorm probabilities displayed as contours. The thunderstorm probabilities are valid for up to 1 h from the time listed above each panel. The radar reflectivity color bar is adapted from Helmus and Collis (2016), and state borders are indicated by the plum-colored line for spatial context. Each image is 112 km × 112 km.

Fig. 13.

Paneled image time series between 1931 and 2046 UTC 25 Aug 2021 centered at 41.41° latitude and −90.85° longitude. Each panel contains ABI 0.64-μm reflectance, GOES GLM flash extent density in flashes per 5 min (Bruning et al. 2019), and plum-colored state borders. Each image is 112 km × 112 km.

c. Case study 2: Sea-breeze convection

Differential heating between the land and the sea is a forcing mechanism for new convection. High-pressure (cooler) sea air is driven toward the low-pressure (warmer) air over the land, forcing the warmer air to rise. In satellite imagery, this sort of convection can be identified along coasts with landward-moving surface winds, cumulus development along the land–sea boundary, and activity in the afternoon when the temperature difference between the land and the sea is greatest (Scofield and Purdom 1986). Oftentimes, sea-breeze convection can develop rapidly, making it difficult for forecasters to issue timely advisories and warnings. For this case study, the 45th Weather Squadron, U.S. Space Force, provided 159 cases between July and August 2022 for which they were unable to achieve the desired 30 min of lead time prior to observed thunderstorm hazards for regions of interest in and around Cape Canaveral, Florida. Of these cases, 88 (55%) were associated with sea-breeze activity, one of which is shown in Fig. 14.

Fig. 14.

Daytime cloud-phase distinction false-color RGB image at 1916 UTC 31 Aug 2022 centered at 28.52° latitude and −80.65° longitude. Following Elsenheimer and Gravelle (2019), red is ABI 10.3-μm (channel 13) brightness temperatures, green is ABI 0.64-μm (channel 2) reflectance, and blue is ABI 1.6-μm (channel 5) reflectance. The image is 80 km × 80 km, and the light-tan line shows state borders.

In the sea-breeze example shown in Fig. 14, scattered low-level cumulus clouds line the coastline, with an area of towering cumulus containing ice particles, signified by the green and orange colors present in the RGB image, near the center of the figure. An RGB animation of the life cycle of this short-lived storm is provided in the online supplemental material. As shown in Fig. 15, ThunderCast is able to identify which cumulus, among the cumulus field, are likely to initiate a thunderstorm. ThunderCast’s predictions (20% probabilities or greater) first appear at 1906:17 UTC, 5 min prior to cloud-top glaciation and MRMS radar reflectivities greater than or equal to 30 dBZ at −10°C. The 45th Weather Squadron, U.S. Space Force, issued a lightning hazard watch at 1923 UTC and recorded observed lightning at 1926 UTC. Thus, ThunderCast’s 30-dBZ predictions occur 20 min before the storm becomes electrically active in this case.

Fig. 15.

Paneled image time series between 1901 and 1926 UTC 31 Aug 2022 centered at 28.52° latitude and −80.65° longitude. Each panel contains a background of ABI 0.64-μm reflectance layered with radar reflectivity at −10°C and ThunderCast’s thunderstorm probabilities displayed as contours. The thunderstorm probabilities are valid for up to 1 h from the time listed above each panel. The radar reflectivity color bar is adapted from Helmus and Collis (2016), and coastal borders are indicated by the plum-colored line for spatial context. Each image is 80 km × 80 km.

To assess lead time for sea-breeze cases, one randomly selected sample during each of the 22 sea-breeze days highlighted by the 45th Weather Squadron is collected for statistical analysis. Only one sample was selected each day to avoid overrepresentation of a single storm due to the close proximity of the sampled regions near Cape Canaveral (i.e., one storm may span multiple regions and have multiple data points). On average, ThunderCast’s first instance of a 20% probability prediction occurred 18.4 ± 5.8 min (95% confidence interval) prior to the first instance of 30 dBZ at −10°C within a 15 km × 15 km window centered on the prediction; 30 dBZ at −10°C occurred, on average, 46.6 ± 16.3 min (95% confidence interval) before observed lightning. Although more work is needed to test the model in varying thunderstorm environments, this demonstrates the potential ThunderCast may have for providing forecasters advance notice of clouds with the potential to become a thunderstorm in the next hour, perhaps aiding in decision support.
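A mean lead time with a 95% confidence interval of the kind reported above can be computed as in the following sketch. The method used by the authors is not stated in this section, so the normal approximation, the function name `mean_with_ci`, and the sample lead times are all assumptions made for illustration; the values are not data from the study.

```python
import math
import statistics

def mean_with_ci(samples, confidence=0.95):
    """Sample mean and half-width of a normal-approximation interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean, z * sem

# Hypothetical lead times in minutes, one per sampled day.
lead_times = [5, 10, 15, 15, 20, 20, 25, 30, 35, 10]
mean, half_width = mean_with_ci(lead_times)
print(f"{mean:.1f} +/- {half_width:.1f} min")
```

For a sample as small as the 22 days used here, a Student t quantile would give a somewhat wider interval than the normal approximation in this sketch.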

d. Case study 3: Southwestern monsoonal convection

On 27 August 2022, scattered nonsevere thunderstorms were present in the western United States across Utah, Arizona, New Mexico, and Colorado. Figure 16 shows some of these storms in northeastern Arizona and northwestern New Mexico, the area of interest for this case study. Many cumulus at various development stages are visible in Fig. 16, including some thick, high-level cumulus with ice particles (yellow/orange colors) exhibiting overshooting tops with anvils as well as some lower-level water-based cumulus and towering glaciated cumulus. An animation following the development of these clouds is provided in the online supplemental material. Figure 17 shows ThunderCast’s predictions as these cumulus cloud clusters develop.

Fig. 16.

Daytime cloud-phase distinction false-color RGB image at 1926 UTC 27 Aug 2022 centered at 36.13° latitude and −109.56° longitude. Following Elsenheimer and Gravelle (2019), red is ABI 10.3-μm (channel 13) brightness temperatures, green is ABI 0.64-μm (channel 2) reflectance, and blue is ABI 1.6-μm (channel 5) reflectance. The image is 160 km × 160 km, and the light-tan line shows state borders.

Fig. 17.

Sequence of images depicting the evolution of ThunderCast probabilities for weak thunderstorms in Arizona and New Mexico between 1726 and 1926 UTC 27 Aug 2022. Centered at 36.13° latitude and −109.56° longitude, all images contain a black-and-white background of ABI 0.64-μm reflectance, the available radar reflectivity at −10°C, and contours of ThunderCast probabilities. The probabilities are valid for up to 1 h from the time listed above each panel, and the reflectances are not corrected for parallax. The radar reflectivity color bar is adapted from Helmus and Collis (2016). Each image is 160 km × 160 km, and state borders are indicated by the plum-colored lines.

ThunderCast’s prediction lead time varies depending on the storm cluster of interest. However, the initial predictions tend to appear when the clouds are low-level cumulus, and 30 dBZ at −10°C coincides with glaciation and colder cloud tops. In the 1726 UTC panel in Fig. 17, 20% and 40% probability contours are present in the bottom left. This area reaches 30 dBZ at −10°C at 1801 UTC, achieving 35 min of lead time. Lightning is observed from the GLM at 1831 UTC, 30 min after 30 dBZ at −10°C. In the 1811 UTC panel of Fig. 17, an area containing a 60% probability contour is present in the upper left. This area’s first predictions of 20% or greater appear at 1801 UTC, 30 dBZ at −10°C is recorded at 1926 UTC, and lightning is observed from the GLM at 2001 UTC. Thus, ThunderCast achieves 25 min of lead time to 30 dBZ at −10°C, which occurs 35 min before observed lightning. A lead time is not available for the storm in the upper right, spanning the Arizona and New Mexico border, because this area does not have MRMS radar coverage. However, ThunderCast’s ability to make predictions in a no-coverage radar zone could be useful for forecasting thunderstorms in similar regions across the CONUS.

4. Summary and future work

The Thunderstorm Nowcasting Tool, or ThunderCast, was developed for predicting thunderstorm occurrence for both developing and existing storms in the next 0–60 min in the conterminous United States. ThunderCast is a deep learning model built with a U-Net convolutional neural network for semantic segmentation. The model was trained with four predictors: 0.64-μm reflectance (channel 2; red band), 1.6-μm reflectance (channel 5; snow/ice band), 10.3-μm brightness temperature (channel 13; clean longwave window band), and 12.3-μm brightness temperature (channel 15; dirty longwave window band) from the GOES-16 Advanced Baseline Imager. All four channels are included in the predictors regardless of solar illumination; however, in the absence of sunlight, the visible and shortwave infrared channels are not expected to contribute to predictions due to near-zero values. The target dataset was the maximum Multi-Radar Multi-Sensor radar reflectivity at −10°C in the next hour, where anything greater than or equal to 30 dBZ was considered positive for thunderstorm occurrence. To address computational limitations and potential biases in the datasets, the model was trained on 320 km × 320 km data patches, where patches could be flipped vertically or horizontally during data augmentation, and the class weights heavily favored the positive class because negative pixels are far more common than positive pixels.

On a pixel-by-pixel basis, ThunderCast performs well statistically in terms of accuracy, recall, and specificity, but it has a low value for precision resulting from a large number of false-positive predictions. The precision improves when evaluating the model within a buffered target region, where any occurrence of 30 dBZ at −10°C within a 15 km × 15 km window centered on the prediction is considered positive for thunderstorm activity. Considering the buffered target, ThunderCast was found to make skillful predictions with critical success indices ranging between 0.4 and 0.5 for all thunderstorm probabilities, with the highest values occurring between 20% and 40%. These statistics are consistent across CONUS climate regions, with some increases in variability during the Northern Hemisphere’s winter months. Similarly, ThunderCast performs fairly consistently during the day and the night but tends to have higher precision during the day.

ThunderCast was applied to three case studies to demonstrate the model’s initial prediction capabilities. Each case study represented convection from varying convective environments across the CONUS, including convection associated with a mesoscale convective vortex, sea breezes, and a southwestern monsoon. At least 5 min and up to 35 min of prediction lead time to 30 dBZ at −10°C was achieved, and the 30 dBZ at −10°C radar threshold occurred 30–50 min prior to observed lightning. In addition to lead time, ThunderCast demonstrated it is able to make predictions in no-coverage radar zones and anticipate storm motion from a single timestamp of four GOES-16 ABI satellite bands without additional ambient environmental context. Although more work is needed to test the model in varying thunderstorm environments, these initial case studies demonstrated ThunderCast is capable of providing forecasters advance notice on the location of thunderstorms in the next hour in the CONUS.

To further determine the extent to which ThunderCast is able to provide decision-support services to the National Weather Service for thunderstorm forecasting, ThunderCast will need to be assessed in real time, perhaps in a hazardous weather testbed. In doing so, areas for improvement within the model can be identified. Additionally, further analysis is needed to determine if ThunderCast’s identification of areas with the potential for convective initiation can be used as a proxy for thunderstorm hazards like lightning. In theory, all thunderstorms will have lightning, because it is needed to produce the characteristic thunder for which the storms are named. Within the case studies presented, 30 dBZ at −10°C occurred prior to lightning observations, indicating it may be able to provide advance lead time to thunderstorm hazard products. However, ThunderCast will need to be evaluated to determine if there are instances in which 30-dBZ radar reflectivity at −10°C is observed but lightning is not produced or if 30-dBZ radar reflectivity at −10°C is not observed when lightning is produced. Following future work, ThunderCast could be a valuable tool for nowcasting thunderstorms in the continental United States and surrounding areas within the range of the GOES platforms.

Acknowledgments.

Funding for this work was provided by NOAA Grant NA20NES4320003. The scientific results and conclusions, as well as any views or opinions expressed herein, are those of the author(s) and do not necessarily reflect those of NOAA or the Department of Commerce. The authors thank Brian Cizek, Jenny Stewart, and the 45th Weather Squadron, U.S. Space Force, for providing data for thunderstorm events in and near Cape Canaveral, Florida, as well as the Satellite Data Services group at the University of Wisconsin–Madison for maintaining the GOES-R ABI and GLM data. A thank you is also extended to David Hoese, whose suggestions for code restructuring helped to speed up data collection and formatting. Furthermore, this research was developed with resources and computing assistance provided by the Space Science and Engineering Center (SSEC) at the University of Wisconsin–Madison, and the GPU resources were supported by the SSEC 2022 Internal Grant program. Last, the authors thank three anonymous reviewers for their constructive feedback on this paper prior to publication.

Data availability statement.

The GOES-16 data used in this research can be freely obtained from NOAA’s Comprehensive Large Array Data Stewardship System (CLASS; https://www.class.noaa.gov/). Recent MRMS data are available online through the NSSL (https://mrms.nssl.noaa.gov/) or upon request to the authors for archived data. Python code developed by the authors for data formatting, model training, validation, and testing, as well as creation of the graphs for this paper, is provided online (https://github.com/sortland33/ThunderCast). The trained ThunderCast model can be obtained by contacting the authors.

REFERENCES

  • Bradshaw, S. M., 2021: A deep learning model for nowcasting midlatitude convective storms. M.S. thesis, Dept. of Atmospheric and Oceanic Sciences, University of Wisconsin–Madison, 59 pp.

  • Bruning, E. C., and Coauthors, 2019: Meteorological imagery for the Geostationary Lightning Mapper. J. Geophys. Res. Atmos., 124, 14 285–14 309, https://doi.org/10.1029/2019JD030874.

  • Chollet, F., 2018: Deep Learning with Python. 1st ed. Manning Publications Co., 373 pp.

  • Cintineo, J. L., M. J. Pavolonis, J. M. Sieglaff, A. Wimmers, J. Brunner, and W. Bellon, 2020: A deep-learning model for automated detection of intense midlatitude convection using geostationary satellite images. Wea. Forecasting, 35, 2567–2588, https://doi.org/10.1175/WAF-D-20-0028.1.

  • Cintineo, J. L., M. J. Pavolonis, and J. M. Sieglaff, 2022: ProbSevere LightningCast: A deep-learning model for satellite-based lightning nowcasting. Wea. Forecasting, 37, 1239–1257, https://doi.org/10.1175/WAF-D-22-0019.1.

  • Cuomo, J., and V. Chandrasekar, 2021: Use of deep learning for weather radar nowcasting. J. Atmos. Oceanic Technol., 38, 1641–1656, https://doi.org/10.1175/JTECH-D-21-0012.1.

  • Dye, J. E., and B. E. Martner, 1982: The 25 July 1976 case study: Microphysical observations. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 2, The National Hail Research Experiment, Colorado Associated University Press, 211–228.

  • Dye, J. E., L. J. Miller, B. E. Martner, and Z. Levin, 1982: The 25 July 1976 case study: Environmental conditions, reflectivity structure, and evolution. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 2, The National Hail Research Experiment, Colorado Associated University Press, 197–209.

  • Elsenheimer, C. B., and C. M. Gravelle, 2019: Introducing lightning threat messaging using the GOES-16 day cloud phase distinction RGB composite. Wea. Forecasting, 34, 1587–1600, https://doi.org/10.1175/WAF-D-19-0049.1.

  • Fankhauser, J. C., and C. Wade, 1982: The environment of the storms. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 1, The National Hail Research Experiment, Colorado Associated University Press, 5–33.

  • Goodman, S. J., and Coauthors, 2013: The GOES-R Geostationary Lightning Mapper (GLM). Atmos. Res., 125–126, 34–49, https://doi.org/10.1016/j.atmosres.2013.01.006.

  • Gremillion, M. S., and R. E. Orville, 1999: Thunderstorm characteristics of cloud-to-ground lightning at the Kennedy Space Center, Florida: A study of lightning initiation signatures as indicated by the WSR-88D. Wea. Forecasting, 14, 640–649, https://doi.org/10.1175/1520-0434(1999)014<0640:TCOCTG>2.0.CO;2.

  • Griffin, S. M., A. Wimmers, and C. S. Velden, 2022: Predicting rapid intensification in North Atlantic and eastern North Pacific tropical cyclones using a convolutional neural network. Wea. Forecasting, 37, 1333–1355, https://doi.org/10.1175/WAF-D-21-0194.1.

  • Helmus, J. J., and S. M. Collis, 2016: The Python ARM Radar Toolkit (Py-ART), a library for working with weather radar data in the python programming language. J. Open Res. Software, 4, e25, https://doi.org/10.5334/jors.119.

  • Houze, R. A., Jr., 2014: Cloud Dynamics. 2nd ed. Academic Press, 496 pp.

  • Hsu, W.-R., and A. H. Murphy, 1986: The attributes diagram a geometrical framework for assessing the quality of probability forecasts. Int. J. Forecasting, 2, 285–293, https://doi.org/10.1016/0169-2070(86)90048-8.

  • Johnson, J. T., P. L. MacKeen, A. Witt, E. D. W. Mitchell, G. J. Stumpf, M. D. Eilts, and K. W. Thomas, 1998: The storm cell identification and tracking algorithm: An enhanced WSR-88D algorithm. Wea. Forecasting, 13, 263–276, https://doi.org/10.1175/1520-0434(1998)013<0263:TSCIAT>2.0.CO;2.

  • Karl, T. R., and W. J. Koss, 1984: Regional and national monthly, seasonal, and annual temperature weighted by area, 1895–1983. National Climatic Data Center Tech. Rep. Historical Climatology Series 4-3, 38 pp.

  • Kingma, D. P., and J. Ba, 2017: Adam: A method for stochastic optimization. arXiv, 1412.6980v9, https://doi.org/10.48550/ARXIV.1412.6980.

  • Lagerquist, R., A. McGovern, and D. J. Gagne II, 2019: Deep learning for spatially explicit prediction of synoptic-scale fronts. Wea. Forecasting, 34, 1137–1160, https://doi.org/10.1175/WAF-D-18-0183.1.

  • Lagerquist, R., A. McGovern, C. R. Homeyer, D. J. Gagne II, and T. Smith, 2020: Deep learning on three-dimensional multiscale data for next-hour tornado prediction. Mon. Wea. Rev., 148, 2837–2861, https://doi.org/10.1175/MWR-D-19-0372.1.

  • Lagerquist, R., J. Q. Stewart, I. Ebert-Uphoff, and C. Kumler, 2021: Using deep learning to nowcast the spatial coverage of convection from Himawari-8 satellite data. Mon. Wea. Rev., 149, 3897–3921, https://doi.org/10.1175/MWR-D-21-0096.1.

  • Lee, Y., C. D. Kummerow, and I. Ebert-Uphoff, 2021: Applying machine learning methods to detect convection using Geostationary Operational Environmental Satellite-16 (GOES-16) Advanced Baseline Imager (ABI) data. Atmos. Meas. Tech., 14, 2699–2716, https://doi.org/10.5194/amt-14-2699-2021.

  • Liu, Y., Q. Ren, J. Geng, M. Ding, and J. Li, 2018: Efficient patch-wise semantic segmentation for large-scale remote sensing images. Sensors, 18, 3232, https://doi.org/10.3390/s18103232.

  • Maas, A. L., A. Y. Hannun, and A. Y. Ng, 2013: Rectifier nonlinearities improve neural network acoustic models. Proceedings of the 30th International Conference on Machine Learning, Vol. 28, S. Dasgupta and D. McAllester, Eds., PMLR, 3, https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf.

  • Mecikalski, J. R., and K. M. Bedka, 2006: Forecasting convective initiation by monitoring the evolution of moving cumulus in daytime GOES imagery. Mon. Wea. Rev., 134, 49–78, https://doi.org/10.1175/MWR3062.1.

  • Mecikalski, J. R., W. M. MacKenzie Jr., M. Koenig, and S. Muller, 2010a: Cloud-top properties of growing cumulus prior to convective initiation as measured by Meteosat second generation. Part I: Infrared fields. J. Appl. Meteor. Climatol., 49, 521–534, https://doi.org/10.1175/2009JAMC2344.1.

  • Mecikalski, J. R., W. M. MacKenzie Jr., M. König, and S. Muller, 2010b: Cloud-top properties of growing cumulus prior to convective initiation as measured by Meteosat second generation. Part II: Use of visible reflectance. J. Appl. Meteor. Climatol., 49, 2544–2558, https://doi.org/10.1175/2010JAMC2480.1.

  • Mecikalski, J. R., J. K. Williams, C. P. Jewett, D. Ahijevych, A. LeRoy, and J. R. Walker, 2015: Probabilistic 0–1-h convective initiation nowcasts that combine geostationary satellite observations and numerical weather prediction model data. J. Appl. Meteor. Climatol., 54, 1039–1059, https://doi.org/10.1175/JAMC-D-14-0129.1.

  • Morgan, G. M., and P. Squires, 1982: Introduction. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 1, The National Hail Research Experiment, Colorado Associated University Press, 1–4.

  • Mueller, C., T. Saxen, R. Roberts, J. Wilson, T. Betancourt, S. Dettling, N. Oien, and J. Yee, 2003: NCAR Auto-Nowcast System. Wea. Forecasting, 18, 545–561, https://doi.org/10.1175/1520-0434(2003)018<0545:NAS>2.0.CO;2.

  • Pavolonis, M. J., A. K. Heidinger, and T. Uttal, 2005: Daytime global cloud typing from AVHRR and VIIRS: Algorithm description, validation, and comparisons. J. Appl. Meteor., 44, 804–826, https://doi.org/10.1175/JAM2236.1.

  • Ravuri, S., and Coauthors, 2021: Skillful precipitation nowcasting using deep generative models of radar. Nature, 597, 672–677, https://doi.org/10.1038/s41586-021-03854-z.

  • Roberts, N. M., and H. W. Lean, 2008: Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Wea. Rev., 136, 78–97, https://doi.org/10.1175/2007MWR2123.1.

  • Roberts, R. D., and S. Rutledge, 2003: Nowcasting storm initiation and growth using GOES-8 and WSR-88D data. Wea. Forecasting, 18, 562–584, https://doi.org/10.1175/1520-0434(2003)018<0562:NSIAGU>2.0.CO;2.

  • Roebber, P. J., 2009: Visualizing multiple measures of forecast quality. Wea. Forecasting, 24, 601–608, https://doi.org/10.1175/2008WAF2222159.1.

  • Ronneberger, O., P. Fischer, and T. Brox, 2015: U-Net: Convolutional networks for biomedical image segmentation. arXiv, 1505.04597v1, https://doi.org/10.48550/arXiv.1505.04597.

  • Rudlosky, S. D., S. J. Goodman, K. S. Virts, and E. C. Bruning, 2019: Initial Geostationary Lightning Mapper observations. Geophys. Res. Lett., 46, 1097–1104, https://doi.org/10.1029/2018GL081052.

  • Schmit, T. J., P. Griffith, M. M. Gunshor, J. M. Daniels, S. J. Goodman, and W. J. Lebair, 2017: A closer look at the ABI on the GOES-R series. Bull. Amer. Meteor. Soc., 98, 681–698, https://doi.org/10.1175/BAMS-D-15-00230.1.

  • Scofield, R. A., and J. F. W. Purdom, 1986: The use of satellite data for mesoscale analyses and forecasting applications. Mesoscale Meteorology and Forecasting, P. S. Ray, Ed., Amer. Meteor. Soc., 118–150.

  • Sieglaff, J. M., L. M. Cronce, W. F. Feltz, K. M. Bedka, M. J. Pavolonis, and A. K. Heidinger, 2011: Nowcasting convective storm initiation using satellite-based box-averaged cloud-top cooling and cloud-type trends. J. Appl. Meteor. Climatol., 50, 110–126, https://doi.org/10.1175/2010JAMC2496.1.

  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 1617–1630, https://doi.org/10.1175/BAMS-D-14-00173.1.

  • Stevens, E., L. Antiga, and T. Viehmann, 2020: Deep Learning with PyTorch. Manning Publications Co., 520 pp.

  • Walker, J. R., W. M. MacKenzie Jr., J. R. Mecikalski, and C. P. Jewett, 2012: An enhanced geostationary satellite–based convective initiation algorithm for 0–2-h nowcasting with object tracking. J. Appl. Meteor. Climatol., 51, 1931–1949, https://doi.org/10.1175/JAMC-D-11-0246.1.

  • Wilson, J. W., and W. E. Schreiber, 1986: Initiation of convective storms at radar-observed boundary-layer convergence lines. Mon. Wea. Rev., 114, 2516–2536, https://doi.org/10.1175/1520-0493(1986)114<2516:IOCSAR>2.0.CO;2.

  • Wimmers, A., C. Velden, and J. H. Cossuth, 2019: Using deep learning to estimate tropical cyclone intensity from satellite passive microwave imagery. Mon. Wea. Rev., 147, 2261–2282, https://doi.org/10.1175/MWR-D-18-0391.1.

  • Zhang, J., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) quantitative precipitation estimation: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 621–638, https://doi.org/10.1175/BAMS-D-14-00174.1.

  • Zhou, K., Y. Zheng, W. Dong, and T. Wang, 2020: A deep learning network for cloud-to-ground lightning nowcasting with multisource data. J. Atmos. Oceanic Technol., 37, 927–942, https://doi.org/10.1175/JTECH-D-19-0146.1.

  • Zhou, X., Y.-a. Geng, H. Yu, Q. Li, L. Xu, W. Yao, D. Zheng, and Y. Zhang, 2022: LightNet+: A dual-source lightning forecasting network with bi-direction spatiotemporal transformation. Appl. Intell., 52, 11 147–11 159, https://doi.org/10.1007/s10489-021-03089-5.

  • Zipser, E. J., and K. R. Lutz, 1994: The vertical profile of radar reflectivity of convective cells: A strong indicator of storm intensity and lightning probability? Mon. Wea. Rev., 122, 1751–1759, https://doi.org/10.1175/1520-0493(1994)122<1751:TVPORR>2.0.CO;2.


  • Bradshaw, S. M., 2021: A deep learning model for nowcasting midlatitude convective storms. M.S. thesis, Dept. of Atmospheric and Oceanic Sciences, University of Wisconsin–Madison, 59 pp.

  • Bruning, E. C., and Coauthors, 2019: Meteorological imagery for the Geostationary Lightning Mapper. J. Geophys. Res. Atmos., 124, 14 28514 309, https://doi.org/10.1029/2019JD030874.

    • Search Google Scholar
    • Export Citation
  • Chollet, F., 2018: Deep Learning with Python. 1st ed. Manning Publications Co., 373 pp.

  • Cintineo, J. L., M. J. Pavolonis, J. M. Sieglaff, A. Wimmers, J. Brunner, and W. Bellon, 2020: A deep-learning model for automated detection of intense midlatitude convection using geostationary satellite images. Wea. Forecasting, 35, 25672588, https://doi.org/10.1175/WAF-D-20-0028.1.

    • Search Google Scholar
    • Export Citation
  • Cintineo, J. L., M. J. Pavolonis, and J. M. Sieglaff, 2022: ProbSevere lightningCast: A deep-learning model for satellite-based lightning nowcasting. Wea. Forecasting, 37, 12391257, https://doi.org/10.1175/WAF-D-22-0019.1.

    • Search Google Scholar
    • Export Citation
  • Cuomo, J., and V. Chandrasekar, 2021: Use of deep learning for weather radar nowcasting. J. Atmos. Oceanic Technol., 38, 16411656, https://doi.org/10.1175/JTECH-D-21-0012.1.

    • Search Google Scholar
    • Export Citation
  • Dye, J. E., and B. E. Martner, 1982: The 25 July 1976 case study: Microphyscial observations. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 2, The National Hail Research Experiment, Colorado Associated University Press, 211–228.

  • Dye, J. E., L. J. Miller, B. E. Martner, and Z. Levin, 1982: The 25 July 1976 case study: Environmental conditions, reflectivity structure, and evolution. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 2, The National Hail Research Experiment, Colorado Associated University Press, 197–209.

  • Elsenheimer, C. B., and C. M. Gravelle, 2019: Introducing lightning threat messaging using the GOES-16 day cloud phase distinction RGB composite. Wea. Forecasting, 34, 15871600, https://doi.org/10.1175/WAF-D-19-0049.1.

    • Search Google Scholar
    • Export Citation
  • Fankhauser, J. C., and C. Wade, 1982: The environment of the storms. Hailstorms of the Central High Plains, C. A. Knight and P. Squires, Eds., Vol. 1, The National Hail Research Experiment, Colorado Associated University Press, 5–33.

  • Goodman, S. J., and Coauthors, 2013: The GOES-R Geostationary Lightning Mapper (GLM). Atmos. Res., 125–126, 3449, https://doi.org/10.1016/j.atmosres.2013.01.006.

    • Search Google Scholar
    • Export Citation
  • Gremillion, M. S., and R. E. Orville, 1999: Thunderstorm characteristics of cloud-to-ground lightning at the Kennedy Space Center, Florida: A study of lightning initiation signatures as indicated by the WSR-88D. Wea. Forecasting, 14, 640649, https://doi.org/10.1175/1520-0434(1999)014<0640:TCOCTG>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Griffin, S. M., A. Wimmers, and C. S. Velden, 2022: Predicting rapid intensification in North Atlantic and eastern North Pacific tropical cyclones using a convolutional neural network. Wea. Forecasting, 37, 13331355, https://doi.org/10.1175/WAF-D-21-0194.1.

    • Search Google Scholar
    • Export Citation
  • Helmus, J. J., and S. M. Collis, 2016: The Python ARM Radar Toolkit (Py-ART), a library for working with weather radar data in the python programming language. J. Open Res. Software, 4, e25, https://doi.org/10.5334/jors.119.

    • Search Google Scholar
    • Export Citation
  • Houze, R. A., Jr., 2014: Cloud Dynamics. 2nd ed. Academic Press, 496 pp.

  • Hsu, W.-R., and A. H. Murphy, 1986: The attributes diagram a geometrical framework for assessing the quality of probability forecasts. Int. J. Forecasting, 2, 285293, https://doi.org/10.1016/0169-2070(86)90048-8.

    • Search Google Scholar
    • Export Citation
  • Johnson, J. T., P. L. MacKeen, A. Witt, E. D. W. Mitchell, G. J. Stumpf, M. D. Eilts, and K. W. Thomas, 1998: The storm cell identification and tracking algorithm: An enhanced WSR-88D algorithm. Wea. Forecasting, 13, 263276, https://doi.org/10.1175/1520-0434(1998)013<0263:TSCIAT>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Karl, T. R., and W. J. Koss, 1984: Regional and national monthly, seasonal, and annual temperature weighted by area, 1895–1983. National Climatic Data Center Tech. Rep. Historical Climatology Series 4-3, 38 pp.

  • Kingma, D. P., and J. Ba, 2017: Adam: A method for stochastic optimization. arXiv, 1412.6980v9, https://doi.org/10.48550/ARXIV.1412.6980.

  • Lagerquist, R., A. McGovern, and D. J. Gagne II, 2019: Deep learning for spatially explicit prediction of synoptic-scale fronts. Wea. Forecasting, 34, 11371160, https://doi.org/10.1175/WAF-D-18-0183.1.

    • Search Google Scholar
    • Export Citation
  • Lagerquist, R., A. McGovern, C. R. Homeyer, D. J. Gagne II, and T. Smith, 2020: Deep learning on three-dimensional multiscale data for next-hour tornado prediction. Mon. Wea. Rev., 148, 28372861, https://doi.org/10.1175/MWR-D-19-0372.1.

    • Search Google Scholar
    • Export Citation
  • Lagerquist, R., J. Q. Stewart, I. Ebert-Uphoff, and C. Kumler, 2021: Using deep learning to nowcast the spatial coverage of convection from Himawari-8 satellite data. Mon. Wea. Rev., 149, 38973921, https://doi.org/10.1175/MWR-D-21-0096.1.

    • Search Google Scholar
    • Export Citation
  • Lee, Y., C. D. Kummerow, and I. Ebert-Uphoff, 2021: Applying machine learning methods to detect convection using Geostationary Operational Environmental Satellite-16 (GOES-16) Advanced Baseline Imager (ABI) data. Atmos. Meas. Tech., 14, 26992716, https://doi.org/10.5194/amt-14-2699-2021.

    • Search Google Scholar
    • Export Citation
  • Liu, Y., Q. Ren, J. Geng, M. Ding, and J. Li, 2018: Efficient patch-wise semantic segmentation for large-scale remote sensing images. Sensors, 18, 3232, https://doi.org/10.3390/s18103232.

    • Search Google Scholar
    • Export Citation
  • Maas, A. L., A. Y. Hannun, and A. Y. Ng, 2013: Rectifier nonlinearities improve neural network acoustic models. Proceedings of the 30th International Conference on Machine Learning, Vol. 28, S. Dasgupta and D. McAllester, Eds., PMLR, 3, https://ai.stanford.edu/∼amaas/papers/relu_hybrid_icml2013_final.pdf.

  • Mecikalski, J. R., and K. M. Bedka, 2006: Forecasting convective initiation by monitoring the evolution of moving cumulus in daytime GOES imagery. Mon. Wea. Rev., 134, 4978, https://doi.org/10.1175/MWR3062.1.

    • Search Google Scholar
    • Export Citation
  • Mecikalski, J. R., W. M. MacKenzie Jr., M. Koenig, and S. Muller, 2010a: Cloud-top properties of growing cumulus prior to convective initiation as measured by Meteosat second generation. Part I: Infrared fields. J. Appl. Meteor. Climatol., 49, 521534, https://doi.org/10.1175/2009JAMC2344.1.

    • Search Google Scholar
    • Export Citation
  • Mecikalski, J. R., W. M. MacKenzie Jr., M. König, and S. Muller, 2010b: Cloud-top properties of growing cumulus prior to convective initiation as measured by Meteosat second generation. Part II: Use of visible reflectance. J. Appl. Meteor. Climatol., 49, 25442558, https://doi.org/10.1175/2010JAMC2480.1.

    • Search Google Scholar
    • Export Citation
  • Mecikalski, J. R., J. K. Williams, C. P. Jewett, D. Ahijevych, A. LeRoy, and J. R. Walker, 2015: Probabilistic 0–1-h convective initiation nowcasts that combine geostationary satellite observations and numerical weather prediction model data. J. Appl. Meteor. Climatol., 54, 10391059, https://doi.org/10.1175/JAMC-D-14-0129.1.

    • Search Google Scholar
    • Export Citation
  • Morgan, G. M., and P. Squires, 1982: Introduction. Hailstorms of the Central High Plains, C. A. Knight, and P. Squires, Eds., Vol. 1, The National Hail Research Experiment, Colorado Associated University Press, 1–4.

  • Mueller, C., T. Saxen, R. Roberts, J. Wilson, T. Betancourt, S. Dettling, N. Oien, and J. Yee, 2003: NCAR Auto-Nowcast System. Wea. Forecasting, 18, 545561, https://doi.org/10.1175/1520-0434(2003)018<0545:NAS>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Pavolonis, M. J., A. K. Heidinger, and T. Uttal, 2005: Daytime global cloud typing from AVHRR and VIIRS: Algorithm description, validation, and comparisons. J. Appl. Meteor., 44, 804826, https://doi.org/10.1175/JAM2236.1.

    • Search Google Scholar
    • Export Citation
  • Ravuri, S., and Coauthors, 2021: Skillful precipitation nowcasting using deep generative models of radar. Nature, 597, 672677, https://doi.org/10.1038/s41586-021-03854-z.

    • Search Google Scholar
    • Export Citation
  • Roberts, N. M., and H. W. Lean, 2008: Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Wea. Rev., 136, 7897, https://doi.org/10.1175/2007MWR2123.1.

    • Search Google Scholar
    • Export Citation
  • Roberts, R. D., and S. Rutledge, 2003: Nowcasting storm initiation and growth using GOES-8 and WSR-88D data. Wea. Forecasting, 18, 562584, https://doi.org/10.1175/1520-0434(2003)018<0562:NSIAGU>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Roebber, P. J., 2009: Visualizing multiple measures of forecast quality. Wea. Forecasting, 24, 601608, https://doi.org/10.1175/2008WAF2222159.1.

    • Search Google Scholar
    • Export Citation
  • Ronneberger, O., P. Fischer, and T. Brox, 2015: U-Net: Convolutional networks for biomedical image segmentation. arXiv, 1505.04597v1, https://doi.org/10.48550/arXiv.1505.04597.

  • Rudlosky, S. D., S. J. Goodman, K. S. Virts, and E. C. Bruning, 2019: Initial Geostationary Lightning Mapper observations. Geophys. Res. Lett., 46, 10971104, https://doi.org/10.1029/2018GL081052.

    • Search Google Scholar
    • Export Citation
  • Schmit, T. J., P. Griffith, M. M. Gunshor, J. M. Daniels, S. J. Goodman, and W. J. Lebair, 2017: A closer look at the ABI on the GOES-R series. Bull. Amer. Meteor. Soc., 98, 681698, https://doi.org/10.1175/BAMS-D-15-00230.1.

    • Search Google Scholar
    • Export Citation
  • Scofield, R. A., and J. F. W. Purdom, 1986: The use of satellite data for mesoscale analyses and forecasting applications. Mesoscale Meteorology and Forecasting, P. S. Ray, Ed., Amer. Meteor. Soc., 118–150.

  • Sieglaff, J. M., L. M. Cronce, W. F. Feltz, K. M. Bedka, M. J. Pavolonis, and A. K. Heidinger, 2011: Nowcasting convective storm initiation using satellite-based box-averaged cloud-top cooling and cloud-type trends. J. Appl. Meteor. Climatol., 50, 110126, https://doi.org/10.1175/2010JAMC2496.1.

    • Search Google Scholar
    • Export Citation
  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 16171630, https://doi.org/10.1175/BAMS-D-14-00173.1.

    • Search Google Scholar
    • Export Citation
  • Stevens, E., L. Antiga, and T. Viehmann, 2020: Deep Learning with PyTorch. Manning Publications Co., 520 pp.

  • Walker, J. R., W. M. MacKenzie Jr., J. R. Mecikalski, and C. P. Jewett, 2012: An enhanced geostationary satellite–based convective initiation algorithm for 0–2-h nowcasting with object tracking. J. Appl. Meteor. Climatol., 51, 19311949, https://doi.org/10.1175/JAMC-D-11-0246.1.

    • Search Google Scholar
    • Export Citation
  • Wilson, J. W., and W. E. Schreiber, 1986: Initiation of convective storms at radar-observed boundary-layer convergence lines. Mon. Wea. Rev., 114, 25162536, https://doi.org/10.1175/1520-0493(1986)114<2516:IOCSAR>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Wimmers, A., C. Velden, and J. H. Cossuth, 2019: Using deep learning to estimate tropical cyclone intensity from satellite passive microwave imagery. Mon. Wea. Rev., 147, 22612282, https://doi.org/10.1175/MWR-D-18-0391.1.

    • Search Google Scholar
    • Export Citation
  • Zhang, J., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) quantitative precipitation estimation: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 621638, https://doi.org/10.1175/BAMS-D-14-00174.1.

    • Search Google Scholar
    • Export Citation
  • Zhou, K., Y. Zheng, W. Dong, and T. Wang, 2020: A deep learning network for cloud-to-ground lightning nowcasting with multisource data. J. Atmos. Oceanic Technol., 37, 927942, https://doi.org/10.1175/JTECH-D-19-0146.1.

    • Search Google Scholar
    • Export Citation
  • Zhou, X., Y.-a. Geng, H. Yu, Q. Li, L. Xu, W. Yao, D. Zheng, and Y. Zhang, 2022: LightNet+: A dual-source lightning forecasting network with bi-direction spatiotemporal transformation. Appl. Intell., 52, 11 14711 159, https://doi.org/10.1007/s10489-021-03089-5.

    • Search Google Scholar
    • Export Citation
  • Zipser, E. J., and K. R. Lutz, 1994: The vertical profile of radar reflectivity of convective cells: A strong indicator of storm intensity and lightning probability? Mon. Wea. Rev., 122, 17511759, https://doi.org/10.1175/1520-0493(1994)122<1751:TVPORR>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Fig. 1.

    Diagram of model training for ThunderCast. The cycle is repeated until the model loss is minimized.

  • Fig. 2.

    Depiction of ThunderCast’s U-Net convolutional neural network model architecture. The figure is adapted for this application from Ronneberger et al. (2015).

  • Fig. 3.

    Example of a corner patch (black) with overlapping adjacent patches (blue and red outlines). The number of pixels is labeled at 0.5-km resolution (black), and the patches are not drawn to scale.

  • Fig. 4.

    Temporal distribution of data patches composing the training, validation, and testing datasets.

  • Fig. 5.

    Map of the CONUS with state colors corresponding to the state’s respective climate region in accordance with Karl and Koss (1984).

  • Fig. 6.

    Spatial distribution of data patches composing the training, validation, and testing datasets sorted by climate regions in the CONUS and OCONUS. The number of patches is indicated by the height of the bars as well as the values above each bar (“e+0X” indicates “× 10^X”).

  • Fig. 7.

    Fraction skill score diagram based on the method presented in Roberts and Lean (2008). The fraction skill scores [Eq. (10)] are calculated for probability thresholds ranging from 10% to 90% for various window lengths. The window lengths [n in Eq. (10)] are given in kilometers but can be referred to as pixels or grid spaces since the spatial resolution is 1 km. The probability thresholds (%) are colored according to the legend in the upper-right-hand corner of the diagram. The lower dashed gray line represents the fraction of observed points exceeding 30 dBZ at −10°C over the domain and is called the random fraction skill score. The upper dashed gray line is the uniform fraction skill score and marks halfway between the random and perfect skill scores.

  • Fig. 8.

    Attributes diagram for the ThunderCast model constructed with the method presented in Hsu and Murphy (1986). The blue line represents the conditional event frequency (true positives per total positive predictions) for given forecast probabilities for the testing dataset. The dashed gray lines are reference lines for determining model resolution. The vertical and horizontal dashed gray lines are the no-resolution lines, both equal to the testing dataset’s overall relative frequency of thunderstorm occurrence. The 1:1 dashed gray line (upper diagonal line) represents perfect reliability, that is, perfect forecast calibration. The lower dashed diagonal gray line is the no-skill line; anything below this line is considered to have no skill (Brier skill score of 0).

  • Fig. 9.

    Performance diagram based on Roebber (2009) for the ThunderCast model on the testing dataset. The y axis displays the probability of detection [Eq. (3)], and the success ratio [i.e., 1 − FAR, or 1 − Eq. (5)] is on the x axis. The background colors denote the critical success index [Eq. (6)], and the dashed contours represent the frequency bias. Each black data point is labeled with the corresponding probability threshold (%) used for calculating the indicated quantities.

  • Fig. 10.

    Accuracy, precision, recall, specificity, and critical success index [Eqs. (1)–(4) and (6)] for ThunderCast’s predictions on the testing dataset, with target values buffered by a 15 km × 15 km centered window. Probabilities greater than or equal to 20% are considered to be positive for thunderstorm activity in the calculations for each statistic. The testing dataset is sorted by region, and each data point represents the corresponding statistic value for only the testing data patches in the month given. For spatial reference, the climate regions are distinguishable by their colors, matching Fig. 5.

  • Fig. 11.

    Daytime cloud-phase distinction false-color RGB image at 2041 UTC 25 Aug 2021 centered at 41.44° latitude and −90.56° longitude. Following Elsenheimer and Gravelle (2019), red is ABI 10.3-μm (channel 13) brightness temperatures, green is ABI 0.64-μm (channel 2) reflectance, and blue is ABI 1.6-μm (channel 5) reflectance. The image is 160 km × 160 km, and the light-tan line shows state borders.

  • Fig. 12.

    Paneled image time series between 1831 and 1856 UTC 25 Aug 2021 centered at 41.41° latitude and −90.85° longitude. Each panel contains a background of ABI 0.64-μm reflectance layered with radar reflectivity at −10°C and ThunderCast’s thunderstorm probabilities displayed as contours. The thunderstorm probabilities are valid for up to 1 h from the time listed above each panel. The radar reflectivity color bar is adapted from Helmus and Collis (2016), and state borders are indicated by the plum-colored line for spatial context. Each image is 112 km × 112 km.

  • Fig. 13.

    Paneled image time series between 1931 and 2046 UTC 25 Aug 2021 centered at 41.41° latitude and −90.85° longitude. Each panel contains ABI 0.64-μm reflectance, GOES GLM flash extent density in flashes per 5 min (Bruning et al. 2019), and plum-colored state borders. Each image is 112 km × 112 km.

  • Fig. 14.

    Daytime cloud-phase distinction false-color RGB image at 1916 UTC 31 Aug 2022 centered at 28.52° latitude and −80.65° longitude. Following Elsenheimer and Gravelle (2019), red is ABI 10.3-μm (channel 13) brightness temperatures, green is ABI 0.64-μm (channel 2) reflectance, and blue is ABI 1.6-μm (channel 5) reflectance. The image is 80 km × 80 km, and the light-tan line shows state borders.

  • Fig. 15.

    Paneled image time series between 1901 and 1926 UTC 31 Aug 2022 centered at 29.52° latitude and −80.65° longitude. Each panel contains a background of ABI 0.64-μm reflectance layered with radar reflectivity at −10°C and ThunderCast’s thunderstorm probabilities displayed as contours. The thunderstorm probabilities are valid for up to 1 h from the time listed above each panel. The radar reflectivity color bar is adapted from Helmus and Collis (2016), and coastal borders are indicated by the plum-colored line for spatial context. Each image is 80 km × 80 km.

  • Fig. 16.

    Daytime cloud-phase distinction false-color RGB image at 1926 UTC 27 Aug 2022 centered at 36.13° latitude and −109.56° longitude. Following Elsenheimer and Gravelle (2019), red is ABI 10.3-μm (channel 13) brightness temperatures, green is ABI 0.64-μm (channel 2) reflectance, and blue is ABI 1.6-μm (channel 5) reflectance. The image is 160 km × 160 km, and the light-tan line shows state borders.

  • Fig. 17.

    Sequence of images depicting the evolution of ThunderCast probabilities for weak thunderstorms in Arizona and New Mexico between 1726 and 1926 UTC 27 Aug 2022. Centered at 36.13° latitude and −109.56° longitude, all images contain a black-and-white background of ABI 0.64-μm reflectance, the available radar reflectivity at −10°C, and contours of ThunderCast probabilities. The probabilities are valid for up to 1 h from the time listed above each panel, and the reflectances are not corrected for parallax. The radar reflectivity color bar is adapted from Helmus and Collis (2016). Each image is 160 km × 160 km, and state borders are indicated by the plum-colored lines.
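
As a rough illustration of the fraction skill score shown in Fig. 7, the following Python sketch computes the score for one probability threshold and one window length, using the general form given by Roberts and Lean (2008). It is not the authors' code: the function names, the treatment of pixels outside the grid as zero, the restriction to odd window lengths, and the toy 5 × 5 grids are all assumptions made here for illustration, and the paper's Eq. (10) should be consulted for the exact definition used.

```python
def neighborhood_fractions(binary, n):
    """Fraction of 1-valued pixels in the n x n window centered on each pixel.

    Assumption of this sketch: n is odd, and pixels outside the grid count as 0.
    """
    rows, cols = len(binary), len(binary[0])
    half = n // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary[ii][jj]
            out[i][j] = total / (n * n)
    return out


def fraction_skill_score(prob, observed, threshold, n):
    """FSS for one probability threshold and one odd window length n (pixels)."""
    # Binarize the forecast probabilities at the chosen threshold.
    forecast = [[1 if p >= threshold else 0 for p in row] for row in prob]
    f_frac = neighborhood_fractions(forecast, n)
    o_frac = neighborhood_fractions(observed, n)
    pairs = [(f, o) for f_row, o_row in zip(f_frac, o_frac)
             for f, o in zip(f_row, o_row)]
    # Mean squared error between forecast and observed fractions.
    mse = sum((f - o) ** 2 for f, o in pairs) / len(pairs)
    # Reference MSE: the largest value obtainable with no overlap at all.
    mse_ref = sum(f * f + o * o for f, o in pairs) / len(pairs)
    return 1.0 - mse / mse_ref if mse_ref > 0 else float("nan")


# Toy example (hypothetical data): one forecast pixel displaced by one
# grid space from the single observed pixel.
prob = [[0.0] * 5 for _ in range(5)]
prob[2][2] = 0.9
observed = [[0] * 5 for _ in range(5)]
observed[2][3] = 1

for window in (1, 3):
    print(window, fraction_skill_score(prob, observed, 0.5, window))
```

In this toy case the score is 0 for a 1-pixel window, because the forecast and observed pixels do not coincide, and rises to 2/3 for a 3-pixel window, which mirrors the way the curves in Fig. 7 increase with window length.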