Validating GOES Radar Estimation via Machine Learning to Inform NWP (GREMLIN) Product over CONUS

Yoonjin Lee, Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, Colorado (https://orcid.org/0000-0002-2092-3078)

and

Kyle Hilburn, Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, Colorado

Open access

Abstract

Geostationary Operational Environmental Satellites (GOES) Radar Estimation via Machine Learning to Inform NWP (GREMLIN) is a machine learning model that outputs composite reflectivity using GOES-R Series Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) input data. GREMLIN is useful for observing severe weather and initializing convection for short-term forecasts, especially over regions without ground-based radars. This study expands the evaluation of GREMLIN’s accuracy against the Multi-Radar Multi-Sensor (MRMS) System to the entire contiguous United States (CONUS) for the entire annual cycle. Regional and temporal variation of validation metrics are examined over CONUS by season, day of year, and time of day. Since GREMLIN was trained with data in spring and summer, root-mean-square difference (RMSD) and bias are lowest in the order of summer, spring, fall, and winter. In summer, diurnal patterns of RMSD follow those of precipitation occurrence. Winter has the highest RMSD because of cold surfaces mistaken as precipitating clouds, but some of these errors can be removed by applying the ABI clear-sky mask product and correcting biases using a lookup table. In GREMLIN, strong echoes are closely related to the existence of lightning and corresponding low brightness temperatures, which result in different error distributions over different regions of CONUS. This leads to negative biases in cold seasons over Washington State, lower 30-dBZ critical success index caused by high misses over the Northeast, and higher false alarms over Florida that are due to higher frequency of lightning.

© 2024 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Yoonjin Lee, yoonjin.lee@colostate.edu

1. Introduction

Artificial intelligence (AI) techniques have recently gained much attention in the atmospheric sciences, and use of AI in the field has increased significantly. Meteorological satellite data are well suited for applying AI techniques as the amount of data is abundant enough to train AI models, and weather systems are intrinsically nonlinear. As Abbe (1901) and Bjerknes (1904) started an era of numerical weather prediction using the laws of physics (Bauer et al. 2015), AI has opened up a new data-driven era in atmospheric science with the help of increased computing power. An increasing number of observations, particularly from the latest generation of geostationary satellites, also have played an important role in the increase of AI applications. AI has enabled us to resolve nonlinear relationships within meteorological data, extract spatiotemporal patterns of weather phenomena from meteorological images, and improve prediction accuracy in ways that were not possible before. Various machine learning applications in the atmospheric science field include expediting model parameterizations (Lagerquist et al. 2021; Gettelman et al. 2021), downscaling of a model or an observation (Im et al. 2016; Li et al. 2019), improving satellite retrieval accuracy (Pfreundschuh et al. 2022; Liu et al. 2022), detecting weather events from meteorological images (Chapman et al. 2019; Lagerquist et al. 2020; Lee et al. 2021), and creating synthetic satellite datasets (Park et al. 2023; Hilburn et al. 2021).

For AI applications to be trusted by users for decision-making, it is also critical that AI model results are thoroughly validated and that the reasoning behind AI model predictions is explainable. The most common way to make AI models more transparent is to use an explainable AI (XAI) technique. XAI tools highlight the important input features that contributed to the predictions of an AI model. Increased use of AI has been accompanied by increased use of XAI methods for this reason (Arrieta et al. 2020), especially in geoscience fields where physical meaning matters. Ebert-Uphoff and Hilburn (2020) provide guidelines for obtaining more insight into AI model results for meteorological applications. They suggest that model accuracy be thoroughly analyzed by dividing input samples into several physically meaningful groups. Subsetting strategies include grouping by biggest successes and failures, sorting by the true value, grouping by time or space parameters, or grouping by meteorological properties. Such interpretation results can help explore the model’s prediction strategies and how the model behaves in different climate regimes.

Geostationary Operational Environmental Satellites (GOES) Radar Estimation via Machine Learning to Inform NWP (GREMLIN) is a convolutional neural network (CNN)-based synthetic radar reflectivity emulator developed by Hilburn et al. (2021, hereinafter H21). It uses three brightness temperature predictors from the GOES-16 Advanced Baseline Imager (ABI) and one lightning predictor from the GOES Geostationary Lightning Mapper (GLM) and produces composite radar reflectivity as output. It was trained on severe weather samples from spring and summer of 2019, and the statistical evaluation conducted in H21 showed good accuracy. GREMLIN not only provides valuable synthetic radar information everywhere over the contiguous United States (CONUS), but it has also been shown to be effective in initializing convection in a short-term forecast model. Ground-based radar reflectivity profiles are used to initialize convection in the High-Resolution Rapid Refresh (HRRR) model, which is the National Oceanic and Atmospheric Administration’s (NOAA) operational regional model (Weygandt et al. 2022). Composite reflectivity information from GREMLIN can be combined with a lookup table developed by Lee et al. (2022) to generate vertical profiles of radar reflectivity. Such vertical profiles obtained from GREMLIN have been shown to be as effective as ground-based radar profiles for convective initialization (Back et al. 2021). Moreover, GREMLIN has further potential to be used globally, not just over CONUS, because geostationary data similar to GOES-16 are available over the globe. However, extension to a global application requires that its behavior in different climate regions be further explored.

To understand how GREMLIN makes predictions, Ebert-Uphoff and Hilburn (2020) used the layer-wise relevance propagation (LRP) XAI method. Three strategies were revealed using the LRP method: 1) lightning is associated with strong radar echoes, 2) stronger echoes tend to be associated with colder brightness temperatures, and 3) strong radar echoes tend to be located near strong brightness temperature gradients. Building on these results, another version of GREMLIN, interpretable GREMLIN, was developed by Hilburn (2023d) to make GREMLIN more transparent. Interpretable GREMLIN uses the same inputs but applies prescribed kernels (identity, gradient, and Laplacian) and constructs an image pyramid to convert the convolutional neural network into a linear regression (with nonlinear terms). Despite having 24 times fewer parameters, interpretable GREMLIN has accuracy comparable to that of the original GREMLIN. By exposing the effective input space of the convolutional neural network, interpretable GREMLIN revealed five additional strategies: 1) stronger echoes are associated with certain gradient orientations, 2) stronger echoes tend to occur when the larger-scale gradients are weaker, 3) shortwave brightness temperatures help distinguish cloud edges under thin cirrus, 4) small differences between the water vapor and longwave infrared channels are associated with strong echoes in deep convective clouds, and 5) there is an enhanced likelihood of strong echoes when cold brightness temperatures are associated with higher lightning flash rates.

Since GREMLIN’s prediction strategies have been extensively studied, this study focuses on analyzing GREMLIN’s spatial, seasonal, and diurnal variability and validating its accuracy in different precipitation regimes. As many previous studies show, there are regional and seasonal variations in the errors of various satellite-based precipitation products due to different precipitation processes (warm vs cold) and different frequencies of precipitation types (convective vs stratiform) (AghaKouchak et al. 2012; Tian et al. 2009). Many studies use geostationary satellite data to extract radar reflectivity information like GREMLIN [Sun et al. 2021; Yang et al. 2023 using Fengyun-4A (FY-4A)] or similar radar-derived quantities such as vertically integrated liquid (VIL; Veillette et al. 2018) or precipitation rate [Hayatbini et al. 2019; Kuligowski et al. 2016; self-calibrating multivariate precipitation retrieval (SCaMPR); Kim and Hong 2023], but the quantitative evaluation of model performance is often conducted over the same domain used for training. Such studies tend to limit the spatial and temporal domain of the training dataset, either because consistent observation data are lacking over a large domain or to make the model focus on learning precipitation patterns in a certain season or region, and the evaluation is done using similar data for the same reasons. This leaves questions about how well a particular model can generalize and whether the model can be used in a region that was not included in the training, especially when that region has a very different climate.

In this study, quantitative evaluation is conducted for GREMLIN over the entire CONUS for all seasons to find clues to several questions: “If the weather in a certain region is similar throughout the year, can the same model be used throughout the year?”; “How are the input data for GREMLIN distributed over CONUS, and how do different input distributions affect the model performance?”; and finally, “What additional input variables are needed to develop a model that can also be generalized to cold-season data?” Evaluating GREMLIN’s performance in various climate regimes can help design the next generation of GREMLIN by providing insights into input variable selection and data selection, as well as model construction (model selection, hyperparameter tuning, and loss function customization). Through the error analysis, additional satellite information or other variables that need to be included to capture regime dependencies can be identified. By identifying the cases that are most difficult for the current model, the model can be modified to overcome such difficult cases and eventually improve overall accuracy. Therefore, as a step toward developing a generalized model over CONUS and eventually the globe, this study seeks to find out why GREMLIN performs better or worse in certain regions or certain seasons, identify issues that arise when the model is applied to different regimes, and describe the factors in the input space that affect GREMLIN’s performance.

Even though CONUS is a small part of the entire globe, it is still large enough to include multiple precipitation regimes, and it has a dense network of well-calibrated ground-based radars to conduct consistent evaluations. Since GREMLIN’s development, three years of data from 2020 to 2022 have been accumulated, which are long enough to examine GREMLIN’s average accuracy for different seasons and times of day. Root-mean-square difference (RMSD), mean bias error (MBE), and categorical verification metrics are calculated for each season and visualized with maps. Annual and diurnal variations of errors are also evaluated. For summertime cases, input samples in regions with the biggest successes and failures are compared to analyze whether there is a significant difference in input data that leads to success or failure, and uncertainties are evaluated according to different input data categories to identify characteristics of satellite data that produce high errors. Several strategies to overcome uncertainties in wintertime cases are presented in this study.

2. Data

GREMLIN CONUS3 datasets for 2020, 2021, and 2022 (Hilburn 2023a,b,c) contain the inputs and outputs of GREMLIN as well as Multi-Radar Multi-Sensor System (MRMS) composite reflectivity. These datasets were created to examine the error distribution of the GREMLIN product. Each dataset includes the GOES inputs [ABI brightness temperatures from channels 7, 9, and 13 (3.9, 6.9, and 10.3 μm, respectively) and GLM lightning group extent density], the corresponding GREMLIN predictions, and MRMS composite reflectivity. The time difference between GREMLIN data collected at 1, 16, 31, and 46 min and MRMS data collected at 0, 16, 30, and 46 min is assumed to be negligible. All datasets have been matched in time and resampled to the 3-km CONUS HRRR grid, which consists of 1799 pixels in longitude and 1059 pixels in latitude.

a. MRMS

MRMS was developed at the National Severe Storms Laboratory and the University of Oklahoma to provide observation data for severe weather and aviation by integrating multiple observations (Smith et al. 2016). The MRMS dataset integrates ground-based radar data over the conterminous United States and southern Canada with numerical weather prediction model data, satellite data, and lightning and rain gauge observations. It provides useful precipitation-related information to the community (Zhang et al. 2016). The data are provided every 2 min at a spatial resolution of 1 km with 33 vertical levels. This study uses composite radar reflectivity from the MRMS dataset, which represents the maximum reflectivity value of each vertical column.

Although the data are quality controlled and provided at every grid point over the domain, there are locations where radar data are missing due to beam blockage, especially over the West. To mitigate this, MRMS provides a “radar quality index” (RQI), which quantifies the quality of the radar data. Radar data quality changes throughout the year, and thus the average RQI over the 3-yr period is used to create a mask that eliminates locations with bad-quality radar data. The average RQI, shown in Fig. 1, is calculated using data from the first, 15th, and last day of each month every 3 h, and only the data with average RQI greater than 0 (colored regions in Fig. 1) are used throughout the study. An RQI threshold of 0 is used rather than a higher value because otherwise large parts of western CONUS would be excluded from the analysis due to low RQI. The analysis is interpreted bearing in mind that the errors in some parts of western CONUS could be attributed to low RQI. Data over Canada are removed because average RQI is low, and 61 cases of spurious echoes (listed in the readme file of the CONUS3 datasets) are also removed.
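As a concrete illustration of this screening step, the following is a minimal Python/NumPy sketch under assumed array layouts; the variable names and shapes are hypothetical and are not taken from the CONUS3 processing code.

```python
import numpy as np

# Minimal sketch of the RQI-based screen. `rqi_samples` is assumed to have shape
# (n_times, ny, nx) and to hold the RQI fields sampled on the first, 15th, and
# last day of each month every 3 h over the 3-yr period.
def build_rqi_mask(rqi_samples: np.ndarray) -> np.ndarray:
    """Return True where the multiyear mean RQI is greater than 0."""
    mean_rqi = np.nanmean(rqi_samples, axis=0)
    return mean_rqi > 0.0

# Pixels where the mask is False are excluded from all statistics in this study.
```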

Fig. 1. Average RQI over CONUS.

b. GREMLIN

GREMLIN is a neural network–based model developed by H21 that uses GOES ABI data to estimate composite reflectivity. It uses a U-Net-based architecture without skip connections and a weighted loss function to address the well-known class imbalance issue that arises when dealing with radar reflectivity (Shi et al. 2017). Its inputs are ABI channel 7, 9, and 13 (3.9, 6.9, and 10.3 μm, respectively) brightness temperatures and GLM group extent density accumulated over 15-min intervals, and it is trained against quality-controlled composite reflectivity from the MRMS dataset. The input variables were selected based on scientific knowledge: channel 7 is useful for low cloud detection; channel 13 is closely related to cloud-top height and thereby useful for inferring updraft intensity of convective clouds; and channel 9, one of the water vapor channels, is a good indicator of height relative to the tropopause given the channel-13 brightness temperature. Lightning is a good indicator of strong convective activity. GREMLIN was trained with spring and summer data (April–June 2019) to focus on warm-season convection, and the training dataset includes only locations east of 105°W, which are generally within good radar coverage.
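To illustrate the kind of weighting involved in such a loss, the snippet below sketches a reflectivity-weighted mean-square-error loss in Python/NumPy; the linear weighting function and its coefficient are placeholders chosen for illustration and are not the exact formulation used to train GREMLIN (see H21 for details).

```python
import numpy as np

def weighted_mse(y_true: np.ndarray, y_pred: np.ndarray, b: float = 0.1) -> float:
    """Reflectivity-weighted MSE: heavier penalty where the observed echo is strong.

    The weighting (1 + b * reflectivity) is an illustrative placeholder, not the
    weighting used operationally in GREMLIN.
    """
    weights = 1.0 + b * np.maximum(y_true, 0.0)   # grow the weight with true reflectivity
    return float(np.mean(weights * (y_true - y_pred) ** 2))
```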

3. Results and discussion

Validation statistics for the version-1 GREMLIN in H21 showed that MRMS and GREMLIN have a good correlation (R² = 0.74) and low error (RMSD = 5.53 dBZ) for its testing dataset. These statistics in H21 were derived using data over the same regions as in the training dataset but over a different time period (July 2019). This study further extends the analysis to the entire CONUS region for three years from 2020 to 2022, including regions and seasons that GREMLIN did not see during training, and it evaluates spatial and temporal (seasonal and diurnal) variations in GREMLIN’s performance. Composite reflectivity data from GREMLIN and MRMS are collected every 15 min for 3 yr to yield 100 178 samples. Error distributions over CONUS for the three years are examined spatially and temporally using RMSD, MBE, and categorical verification metrics in the following sections.

a. RMSD and MBE over CONUS

1) Spatial analysis by season

RMSD and MBE are respectively calculated as
RMSD = \sqrt{\frac{\sum_{i=1}^{N} (R_{\mathrm{pred},i} - R_{\mathrm{true},i})^{2}}{N}}  and
MBE = \frac{\sum_{i=1}^{N} (R_{\mathrm{pred},i} - R_{\mathrm{true},i})}{N},
where N is the total number of pixels, R_true is the true composite reflectivity from MRMS, and R_pred is the predicted composite reflectivity from GREMLIN. With this convention, positive MBE indicates overestimation by GREMLIN.

Figure 2 shows maps of RMSD over the 3-yr period for each season, along with the coefficient of determination R² in the lower-right corner of each plot. It provides a big-picture view of GREMLIN’s accuracy by season. Each season is defined as a 3-month period: spring (March–May), summer (June–August), fall (September–November), and winter (December–February). Summer has the highest R² and the lowest RMSD, as expected, because it is the season on which GREMLIN was trained and is mostly warm. In summer, relatively high RMSD values are observed in two regions, indicated by the blue-outlined (Florida) and red-outlined (Washington State) boxes. High RMSD in the blue-outlined-box region is probably due to more frequent convective activity producing high reflectivity values, leading to high errors over the region. Although not shown in this study, the RMSD distribution follows the distribution of 5-dBZ reflectivity occurrence. On the other hand, the line of high RMSD in the red-outlined box of Fig. 2b does not appear to be realistic: a line of reflectivity echoes seems to be constantly present in MRMS, especially on clear days.
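For concreteness, a minimal Python/NumPy sketch of these two statistics is shown below; the array names and the NaN-masking convention are assumptions for illustration rather than the processing code used in this study.

```python
import numpy as np

def rmsd_and_mbe(r_true: np.ndarray, r_pred: np.ndarray):
    """RMSD and MBE (dBZ) over all valid pixels.

    `r_true` (MRMS) and `r_pred` (GREMLIN) are assumed to be arrays of the same
    shape, with NaN at pixels removed by the RQI screen.
    """
    diff = r_pred - r_true                     # positive values = GREMLIN overestimates
    rmsd = float(np.sqrt(np.nanmean(diff ** 2)))
    mbe = float(np.nanmean(diff))
    return rmsd, mbe

# Example seasonal use (hypothetical arrays): stack all 15-min samples for one
# season into (n_times, ny, nx) arrays and call rmsd_and_mbe(mrms_jja, gremlin_jja).
```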
Fig. 2. RMSD maps over CONUS for (a) spring, (b) summer, (c) fall, and (d) winter. “RMSD total” is calculated over CONUS, and “RMSD only east” is calculated only using data in regions east of 105°W longitude. The winter RMSD map in (d) uses a different scale in the color bar because the magnitude of RMSD in winter is much larger than that in other seasons.

In spring and fall, RMSD increases with latitude. In the lower-latitude regions (<40°N), most RMSD values are lower than 5 dBZ, and the total RMSDs of 4.65 dBZ (spring) and 4.42 dBZ (fall) are similar to the value in H21 (5.53 dBZ). One notable feature observed in the spring and fall maps is that the high RMSD in the white-outlined box in Fig. 2 follows the pattern of RQI (Fig. 1). The white-outlined-box region contains the Rocky Mountains, and thus snow can still be present in the high terrain during spring and fall, which can lead to high RMSD as in cold seasons. In addition, echoes in cold seasons tend to be shallow and can be missed by the lowest tilt of the radar beam. Therefore, both RMSD and R² values become higher when the West (west of 105°W), which includes the white-outlined-box region, is excluded. There is another interesting spatial feature in the spring map (Fig. 2a). Over Texas, most regions have RMSD lower than 4 dBZ (light-green colors), but there are a few spots with RMSD greater than 5 dBZ (orange color). This is due to spurious echoes that occur repeatedly, mostly on clear-sky days. Summer has the same issue, although it is not very clear from Fig. 2b. This issue is addressed again in a later section with a case study.

The winter map has a different scale on the color bar because the errors are much larger than in the other seasons. High RMSD and low R² in winter arise mainly because cold surfaces can be mistaken for precipitating clouds and because GREMLIN has not been trained on synoptic-scale winter precipitation systems, which have different spatial features than the warm-season severe storms on which GREMLIN was trained. During winter, when it is coldest and most of the precipitation is snow, errors are the highest, in places exceeding an RMSD of 10 dBZ. GREMLIN tends to overestimate weak echoes according to a two-dimensional histogram of GREMLIN and MRMS composite reflectivity (not shown); this is due to cold brightness temperatures caused by cold surfaces, especially with snow on the ground, leading to false echoes. Since winter has different precipitation forcing mechanisms and different brightness temperature distributions from those GREMLIN was trained on, it makes most sense to retrain the model with wintertime data, but that is beyond the scope of this investigation. However, some of these errors can be removed rather simply using one of the ABI level-2 products. Since most errors come from cold surfaces, the ABI level-2 clear-sky mask product is used to suppress false echoes wherever the mask indicates clear sky. Figure 3 shows RMSD difference maps after applying the clear-sky mask. Summertime is nearly unchanged, as shown by the minimum and maximum differences indicated in the lower-left corner of Fig. 3, but the other seasons show a decrease in RMSD (green colors). Note that orange colors in Fig. 3 indicate an increase in RMSD, but the magnitude is so small that it is almost negligible. The northern part of the United States, where cold brightness temperatures are most frequently observed, shows significant decreases in RMSD in Fig. 3d, especially during winter. The remaining errors that are not removed by the clear-sky mask come from different brightness temperature distributions in winter or from clouds over cold surfaces. Ancillary information from numerical weather prediction models, such as surface temperature, might be a useful additional input when expanding GREMLIN for global use in the future. In this study, since the cold surface clearly is an important error source, part of which can be fixed easily, the data masked with the clear-sky mask product are used for the analysis hereinafter.
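A minimal sketch of this masking step is given below; it assumes the clear-sky mask has already been regridded to the 3-km grid and that a flag value of 0 denotes confidently clear sky, which should be checked against the ACM product documentation.

```python
import numpy as np

def suppress_clear_sky_echoes(gremlin_refl: np.ndarray, acm: np.ndarray,
                              clear_flag: int = 0) -> np.ndarray:
    """Set GREMLIN reflectivity to no echo (0 dBZ) wherever the ACM indicates clear sky.

    `gremlin_refl` and `acm` are assumed to be co-located fields on the 3-km grid;
    the clear-sky flag value is an assumption for illustration.
    """
    corrected = gremlin_refl.copy()
    corrected[acm == clear_flag] = 0.0
    return corrected
```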

Fig. 3. Difference in RMSD between before and after modifying reflectivity based on the ABI level-2 clear-sky mask product for (a) spring, (b) summer, (c) fall, and (d) winter. Minimum and maximum differences are shown in the lower-left corner of each plot.

The MBE maps in Fig. 4, in which positive bias (in red) and negative bias (in blue) indicate overestimation and underestimation by GREMLIN, respectively, also show the cold-surface issue in winter. It is again noted that the winter map uses a different color bar because of the high positive biases caused by the cold surface. Summer has the lowest bias, as expected, and spring and fall have very similar biases. As regions with low RQI tend to show high RMSD in spring and fall, such regions show high bias as well. In general, the biases are positive, consistent with Fig. 2, but negative biases are observed over Texas, which seem to be caused by the very frequent false echoes over those regions mentioned earlier. Negative biases observed over Washington State in spring, fall, and winter seem to be attributable to maritime precipitation systems in cold seasons. Over Washington State in cold seasons, the tropopause height is lower due to cold air, and accordingly brightness temperature values are overall higher, especially for channel 9. In addition, less lightning makes it even harder to predict high echoes, and the synoptic-scale systems over the region often produce severe winter precipitation, which is not what GREMLIN was trained for. Such negative biases over the Pacific Northwest in winter are consistent with the results from Tian et al. (2009) using other satellite-based precipitation products. Negative biases also appear around the locations of individual ground-based radars, most likely due to ground clutter, and this is most evident in summer (Fig. 4b).

Fig. 4. MBE maps (dBZ) for each season. Red color represents positive bias, which means that GREMLIN overestimates, and blue color represents negative bias, which means that GREMLIN underestimates. Note that the color bar for winter in (d) is different from the others.

Although winter is the season with the highest error, GREMLIN is still capable of providing useful information in regions without ground-based radar. One case study, at 1900 UTC 1 December 2022, is shown in Fig. 5. Even though GREMLIN tends to overestimate reflectivity overall, the reflectivity echoes observed only in GREMLIN over Idaho (the red-outlined-box region of Fig. 5b) were not wrong. Precipitation in these regions was likely missed by MRMS because of low RQI (as shown in Fig. 1), but snowfall in the region [highest accumulated snowfall depth up to 12.5 in. (31.8 cm)] was confirmed by data from the Community Collaborative Rain, Hail and Snow Network (CoCoRaHS), in which volunteers report weather conditions. This shows that GREMLIN might be useful for filling observation gaps in radar data.

Fig. 5. (a) MRMS composite reflectivity and (b) GREMLIN composite reflectivity at 1900 UTC 1 Dec 2022.

Overall, summer is the season with the lowest RMSD and low bias, but the abnormally high negative biases observed over Texas and Washington even during summer are not physically reasonable. This suggests that although ground-based radars provide the best available observations, with high spatial and temporal resolutions, and can be treated as the truth during training, they are still not perfect. The data contain inevitable errors and thus should be treated with caution when used for training or validation. Spring and fall show similar distributions of RMSD and bias, and their values are low enough relative to those shown in H21 for the product to be used reliably during these seasons. Winter shows high RMSD and high positive bias in the North, but the South, which is less affected by cold surfaces, shows decent accuracy as well.

2) Temporal analysis of RMSD by season and by time of day

Figure 6 shows a time series of RMSD over the entire CONUS throughout the year. The daily mean RMSD in Fig. 6 is calculated by averaging across all of the 15-min data within a day. As shown in Fig. 2, winter, especially December, has the highest RMSD, whereas during April–September the RMSD values are similar to the value derived in H21 (5.53 dBZ).

Fig. 6. RMSD against day of year using daily bins.

Further temporal analysis is conducted using time of day to examine diurnal variations in RMSD. Figure 7a shows RMSD at different times of day for each season. Spring, fall, and winter have a similar diurnal cycle, with RMSD peaking around 1300 UTC, whereas summer has its maximum RMSD around 2100 UTC. Although summer seems to have the least diurnal variation in RMSD, different diurnal patterns emerge when RMSD is calculated separately in different longitude regions. Figure 7b shows the summertime diurnal cycle of RMSD in three longitude regions: WEST (longitude greater than 103°W; includes the Rockies), MID (longitude less than or equal to 103°W and greater than 90°W; east of the Rockies and west of the Appalachian Mountains), and EAST (longitude less than or equal to 90°W; includes the Appalachian Mountains). RMSD is calculated only using data north of 35°N, since high RMSD in the South tends to dominate and obscures the longitudinal pattern. The three regions show maximum RMSD at different times of day: WEST around 2300 UTC, MID around 0900 UTC, and EAST around 2000 UTC. These differences can be explained by the diurnal pattern of summertime precipitation over CONUS presented in Carbone and Tuttle (2008), which is attributed to a continental-scale diurnal cycle that propagates eastward as well as regional diurnal variations.
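The longitude-band breakdown can be sketched as follows in Python/NumPy, with assumed array names and shapes; this is an illustration rather than the processing code used for Fig. 7b.

```python
import numpy as np

# Assumed inputs: `sq_err` with shape (n_times, ny, nx) holding squared differences
# (GREMLIN - MRMS)**2 with NaN at masked pixels, `utc_hours` with the UTC hour of
# each sample, and 2-D `lon`/`lat` fields (degrees east/north) on the same grid.
def diurnal_rmsd_by_band(sq_err, utc_hours, lon, lat):
    bands = {
        "WEST": lon < -103.0,                     # west of 103W (includes the Rockies)
        "MID": (lon >= -103.0) & (lon < -90.0),   # between 103W and 90W
        "EAST": lon >= -90.0,                     # east of 90W
    }
    north_of_35 = lat > 35.0                      # only data north of 35N
    rmsd = {name: np.zeros(24) for name in bands}
    for hour in range(24):
        in_hour = utc_hours == hour
        for name, in_band in bands.items():
            sel = sq_err[in_hour][:, in_band & north_of_35]
            rmsd[name][hour] = np.sqrt(np.nanmean(sel))
    return rmsd
```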

Fig. 7. Diurnal time series of (a) RMSD for each season and (b) RMSD in summer, dividing regions by longitude: WEST (longitude > 103°W), MID (90°W < longitude ≤ 103°W), and EAST (longitude ≤ 90°W).

Figure 8 gives a more detailed view of the RMSD distribution over CONUS at different times of day. These RMSD maps are plotted at 0200, 0700, 1300, and 2100 UTC for a direct comparison with the maps of diurnal radar echo frequency in Fig. 2 of Carbone and Tuttle (2008), which showed climatological diurnal variations of radar echoes over CONUS. At 2100 UTC, convection is most active over the Rockies and the Southeast, and the convection maximum moves from east of the Rockies at 0200 UTC to the central United States at 0700 UTC. At 1300 UTC, convection is most active offshore because of land-breeze convection, and some continuing convective activity is observed in the central United States. The RMSD distribution in Fig. 8 follows this pattern because most of the rainfall is associated with convective precipitation that produces significant amounts of rain, which leads to high errors simply because of the magnitude of the echoes. These results agree well with the climatological analysis presented in Carbone and Tuttle (2008).

Fig. 8. RMSD in summer over CONUS at (a) 0200 UTC, (b) 0700 UTC, (c) 1300 UTC, and (d) 2100 UTC.

Another thing to note from Fig. 8 is that the line of high RMSD observed during summer (white-outlined box in Fig. 8b) appears throughout the day. On the other hand, the high RMSD in the red-outlined box of Fig. 8b is only observed at 0700 UTC. This region corresponds to the aforementioned high-negative-bias region over Texas (Fig. 4b). Although not shown here, such high RMSD over Texas is mainly observed from 0300 to 1000 UTC. Figure 9 provides a case example that clearly shows that these are false echoes. A map of MRMS reflectivity containing false echoes is presented in Fig. 9 along with a map of channel-13 brightness temperature to show that the sky was obviously clear in the red-outlined box. Reports from CoCoRaHS also indicate that it did not rain over the region (not shown).

Fig. 9. Maps of (a) MRMS composite reflectivity and (b) GOES channel-13 brightness temperature at 0300 UTC 26 Jul 2021.

b. Categorical verification metrics

Categorical verification metrics are useful to evaluate a model’s accuracy at different thresholds. Table 3 of H21 provides categorical performance statistics for 10 reflectivity thresholds (5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 dBZ) in terms of probability of detection (POD), false alarm ratio (FAR), critical success index (CSI), and frequency bias index (FBI), which are respectively defined as
POD = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}},
FAR = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}},
CSI = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}, and
FBI = \frac{\mathrm{TP} + \mathrm{FP}}{\mathrm{TP} + \mathrm{FN}},
where TP, TN, FP, and FN refer to true positive, true negative, false positive, and false negative, respectively.
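A minimal Python/NumPy sketch of these scores for one threshold is given below; the array names and the NaN handling are assumptions for illustration rather than the authors' code.

```python
import numpy as np

def categorical_scores(r_true, r_pred, threshold_dbz):
    """POD, FAR, CSI, and FBI for one reflectivity threshold (dBZ).

    `r_true` (MRMS) and `r_pred` (GREMLIN) are assumed to be arrays of the same
    shape; NaN marks masked pixels and is ignored.
    """
    valid = np.isfinite(r_true) & np.isfinite(r_pred)
    obs = r_true[valid] >= threshold_dbz
    fcst = r_pred[valid] >= threshold_dbz
    tp = np.count_nonzero(fcst & obs)     # hits
    fp = np.count_nonzero(fcst & ~obs)    # false alarms
    fn = np.count_nonzero(~fcst & obs)    # misses
    pod = tp / (tp + fn)
    far = fp / (tp + fp)
    csi = tp / (tp + fp + fn)
    fbi = (tp + fp) / (tp + fn)
    return pod, far, csi, fbi

# Example over the 10 thresholds in Fig. 10 (hypothetical mrms/gremlin arrays):
# scores = {t: categorical_scores(mrms, gremlin, t) for t in range(5, 55, 5)}
```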

The same verification metrics are calculated using data over the entire CONUS and plotted in Fig. 10 by season. The values from H21 are also plotted in red for comparison. Since the values from H21 were calculated using summertime data, the values in summer (orange) are the closest, but the statistics over the entire CONUS are worse in general because the domain includes regions with brightness temperature distributions that were not seen during training, as well as regions with low RQI, thereby reducing the total accuracy. Spring and fall have similar CSI at lower thresholds, whereas spring has CSI similar to summer at higher thresholds. Spring, fall, and winter show high FBI at lower thresholds, which means more false alarms than misses due to cold surfaces, and low FBI at higher thresholds, meaning that GREMLIN underestimates high reflectivity, probably because relatively warm brightness temperatures in cold seasons produce low predicted echoes.

Fig. 10. (a) CSI, (b) POD, (c) FAR, and (d) FBI for thresholds of 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 dBZ. Values for spring, summer, fall, and winter are plotted in green, orange, blue, and purple, respectively, and values from H21 are also plotted in red for comparison.

To investigate regional variations in these statistics, maps of CSI over CONUS are plotted for each season. Figure 11 shows maps of CSI using the 5-dBZ threshold. CSI for 5 dBZ is shown here because 5 dBZ is a good indicator of meteorologically significant echoes (Illingworth 1988). Several features agree with the previous analyses using RMSD and MBE.

  • Winter has the lowest CSI for 5 dBZ, especially over the North where the cold surface is observed the most.

  • Regions with low RQI (the white-outlined-box region in Fig. 2) show low CSI, which leads to low total CSI in spring and fall (0.43 and 0.40, respectively; Fig. 10).

  • Low CSI (∼0.3) is observed in the regions over Texas where high RMSD (Fig. 2a) and high negative bias (Fig. 4b) are observed during spring and summer because of false echoes, as in the case study (Fig. 9).

Fig. 11. CSI for the 5-dBZ threshold in (a) spring, (b) summer, (c) fall, and (d) winter.

There are also new features that appear on the CSI map. In summer, Florida and the Great Plains have high CSI for 5 dBZ, even though data over Florida were not included in training. This is because, in summer, Florida has brightness temperature distributions similar to those used to train the model. The high CSI in Florida confirms that the high RMSD in summer shown in Fig. 2b or Fig. 8d is mostly due to high echoes. On the other hand, the West Coast, which showed low RMSD in summer (Fig. 2b), shows the lowest CSI in summer, probably because of the low occurrence of precipitating cases. During spring, fall, and winter, the Southeast still achieves good CSI for 5 dBZ because warm precipitation occurs more frequently over the region.

Maps of CSI using the 30-dBZ threshold are presented in Fig. 12. The 30-dBZ threshold can be an indicator of convection, as the HRRR model uses a 28-dBZ threshold for initiating convection. Since reflectivity of 30 dBZ is not very common in most seasons, and low occurrence can affect the statistical results, only the grid points with more than 100 occurrences are colored in Fig. 12. CSI is very low in winter, even in the South, partly because the occurrence of reflectivity over 30 dBZ is itself low. Unlike the other metrics, for which summer always has much better statistics, CSI for 30 dBZ is very similar in spring and summer. While CSI is high in most of the Midwest and East in summer, in spring high CSI (between 0.4 and 0.5) appears over wider regions of the South. This is probably because many supercells occur in the South in spring, and they are mostly discrete or isolated convection (Smith et al. 2012), which makes it easier to estimate reflectivity from ABI.
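A gridded version of the CSI calculation with the occurrence filter can be sketched as follows; the occurrence criterion used here (counting MRMS exceedances) is an assumption and may differ in detail from the paper's exact criterion.

```python
import numpy as np

# `mrms` and `gremlin` are assumed to be stacks of shape (n_times, ny, nx).
def csi_map(mrms, gremlin, threshold_dbz=30.0, min_occurrence=100):
    obs = mrms >= threshold_dbz
    fcst = gremlin >= threshold_dbz
    tp = np.sum(fcst & obs, axis=0).astype(float)     # hits per grid point
    fp = np.sum(fcst & ~obs, axis=0).astype(float)    # false alarms per grid point
    fn = np.sum(~fcst & obs, axis=0).astype(float)    # misses per grid point
    with np.errstate(invalid="ignore", divide="ignore"):
        csi = tp / (tp + fp + fn)
    csi[np.sum(obs, axis=0) <= min_occurrence] = np.nan   # mask rarely observed points
    return csi
```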

Fig. 12. CSI for the 30-dBZ threshold in (a) spring, (b) summer, (c) fall, and (d) winter. Only the grid points that had more than 100 occurrences are plotted. Nebraska in the green-outlined box shows the highest CSI, and the Northeast (Vermont, Maine, and New Hampshire) in the blue-outlined box shows the lowest CSI.

Most of the East has good CSI in summer except for the Northeast (Vermont, Maine, and New Hampshire; the blue-outlined-box region in Fig. 12b). Low CSI in the region is due more to low POD (Fig. 13a) than to high FAR (Fig. 13b), as the low FBI in the region suggests (not shown). To explore what contributed to the low POD, distributions of brightness temperature and frequency of lightning are compared in two regions: the Northeast (the blue-outlined-box region) with low CSI and Nebraska (the green-outlined-box region) with high CSI. Figure 14 shows the distributions in the two regions for hit and miss cases. Note that for GLM (Fig. 14d), the presence of lightning is plotted rather than the actual value, because the existence of lightning is more important than its magnitude (Hilburn 2023d) for inferring reflectivity. The Northeast has much lower lightning frequency in both hit and miss cases, which is consistent with previous studies (Zajac and Rutledge 2001; Marchand et al. 2019). For hit cases, the channel-7 brightness temperature distribution differs between the two regions (peaking around 250 K in the Northeast and around 210 K in Nebraska) because of the preference for nocturnal precipitation in Nebraska. Meanwhile, brightness temperatures at channels 9 and 13 are generally higher in the Northeast. These patterns continue in miss cases, but the difference between hit and miss cases in the Northeast is that most brightness temperatures for channels 9 and 13 are much higher in miss cases (Figs. 14b,c), especially at channel 9. This is probably due to the low tropopause height in the region, leading to warmer brightness temperatures in the water vapor channel. Aside from different distributions of input variables, different precipitation systems over the region can be another factor contributing to low CSI. According to Agel et al. (2018), most of the extreme precipitation in this region comes from synoptic-scale precipitation systems or tropical cyclones, which GREMLIN was not trained on. As shown in Tian et al. (2009), other satellite-based precipitation products that use passive microwave sensor data also struggle to capture precipitation over this region and underestimate it. Therefore, different precipitation systems, relatively high brightness temperatures (especially at channel 9), and the low frequency of lightning might have contributed to low POD and thereby to low CSI.

Fig. 13. (a) POD and (b) FAR for the 30-dBZ threshold in summer. Only the grid points that had more than 100 occurrences are plotted.

Fig. 14. Distributions of brightness temperature for channels (a) 7, (b) 9, and (c) 13 and (d) presence of lightning in the Northeast (blue) and Nebraska (green), as defined by the outlined boxes in Fig. 12b. Solid and dashed lines are for hit and miss cases, respectively.

On the other hand, Florida is a region with high FAR in summer (Fig. 13b). Although CSI for 5 dBZ is high over Florida, as shown in Fig. 11b, CSI for 30 dBZ is relatively low in comparison with the Great Plains, where CSI for 5 dBZ is similarly high. Brightness temperature distributions are very similar between Nebraska (the green-outlined box in Fig. 12b) and Florida except for channel 7, because convection occurs at different times of day in the two regions (Fig. 8). However, there is more lightning activity over Florida, which is an indicator of strong echoes: 4% of cases with reflectivity lower than 30 dBZ are accompanied by lightning in Florida, whereas only 0.9% of such cases have lightning in Nebraska. Among the 4% in Florida that had lightning, 20% were false alarms, which account for 77% of the total false alarms. The presence of lightning seems to be the most important factor for CSI at the 30-dBZ threshold.

The Northeast and Florida, where CSIs for the 30-dBZ threshold are relatively lower than in other regions, happen to be two regions that were not included in training. The relatively low CSI over the two regions may be due to epistemic uncertainty in the form of regime dependence, whereby those locations have different relationships between the inputs (satellite data) and the outputs (composite reflectivity). Florida tends to have more lightning, which leads to more false alarms in regions adjacent to the convective cores, while the Northeast tends to have less lightning and flat cloud tops that are hard to distinguish from anvil clouds even when radar reflectivity is high. Epistemic uncertainty arising from such regime dependence can be reduced if environmental variables that help infer lightning frequency, such as moisture content, or geographical variables that locate the regime are provided. Therefore, such variables should be considered as additional inputs in future studies.

c. Uncertainty relationships to inputs for summertime data

There are two types of uncertainty: aleatoric and epistemic. Aleatoric uncertainty arises from inherent randomness within the dataset and is irreducible, while epistemic uncertainty can be reduced if more knowledge is provided (Matthies 2007). As seen in the previous sections, the main uncertainty during the cold season is considered to be epistemic, because it can be reduced to a certain degree if additional information such as surface temperature is given. On the other hand, in summer, when the cold-surface issue is not present, uncertainty can also be aleatoric. To examine uncertainties in summer according to satellite data characteristics, RMSDs are calculated for different input data categories and shown in Table 1. These input data categories follow those defined in Goldenstern and Kummerow (2023), which were used to evaluate regional biases of the GOES-16 Precipitation Estimator using Convolutional Neural Networks (GPE-CNN) model, which uses the same architecture as GREMLIN. They divided satellite data into 12 categories according to the presence of lightning and the mean and standard deviation of channel-13 brightness temperature within a 96 km × 96 km box, and each category contains different information content. While it is rather easy to interpret that the presence of lightning and low mean brightness temperature are related to convective precipitation, leading to high radar reflectivity, interpretation of the standard deviation requires some thought. The standard deviation can be regarded as a measure of the texture of a satellite image, and values greater than 10 are considered to indicate the presence of high gradients along the cloud edge. The dominant type of uncertainty for each class can be inferred from how much information content the class has. The uncertainty of classes with high information content is most likely aleatoric, while most of the uncertainty in classes with low information content is probably epistemic and can be reduced with additional information.
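The categorization can be sketched as follows in Python/NumPy; the mean brightness temperature bin edges and the mapping of the resulting combinations onto class numbers 1–12 come from Table 3 of Goldenstern and Kummerow (2023) and are not reproduced here, so the edges below are placeholders.

```python
import numpy as np

# Placeholder bin edges for the mean channel-13 Tb; not the published values.
COLD_EDGE_K = 235.0
WARM_EDGE_K = 265.0

def describe_box(tb13_box: np.ndarray, glm_box: np.ndarray):
    """Characterize one 96 km x 96 km box (32 x 32 pixels on the 3-km grid)."""
    has_lightning = bool(np.nansum(glm_box) > 0)      # presence matters, not magnitude
    mean_tb = float(np.nanmean(tb13_box))
    std_tb = float(np.nanstd(tb13_box))
    if mean_tb < COLD_EDGE_K:
        tb_bin = "cold"
    elif mean_tb < WARM_EDGE_K:
        tb_bin = "ambiguous"
    else:
        tb_bin = "warm"
    high_texture = std_tb > 10.0    # std > 10 K taken to indicate a cloud-edge gradient
    return has_lightning, tb_bin, high_texture   # 2 x 3 x 2 = 12 combinations
```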

Table 1. Summertime RMSD calculated for 12 different input data categories defined according to Table 3 in Goldenstern and Kummerow (2023). “STD” refers to standard deviation; “Tb” refers to brightness temperature; “Precip type” refers to precipitation type; “Conv” and “Stra” are convective and stratiform, respectively; and “Mean ref” refers to mean composite reflectivity.

The mean reflectivity of class 1 is smaller than that of class 4 because class 1 includes a clear-sky region next to the cloud edge. However, the maximum reflectivity within the domain tends to be higher for class 1 than class 4 and for class 2 than class 5 (not shown) because the convective core is often near the cloud edge, where the gradient in the satellite image is high. Since RMSD tends to be higher when the actual value is higher, RMSD is divided by mean reflectivity to make relative errors easier to compare. Overall, the existence of lightning leads to small relative RMSD (classes 1–6). It is interesting to note that having more texture leads to lower RMSD (classes 2 and 8 as compared with classes 5 and 11); this could be either because high standard deviation indicates the presence of a cloud edge, thereby including more clear-sky region within the domain and reducing the error, or because a high gradient near the edge is one of the main characteristics GREMLIN focuses on to predict reflectivity (H21). Setting aside classes 6, 9, and 12, which have the highest relative RMSDs because their mean reflectivity is close to 0, classes 5 and 11 have the highest relative RMSDs, which are greater than 2. They are the hardest to predict because they are the groups with the least information content: ambiguous brightness temperatures (neither cold enough to predict high reflectivity nor warm enough to predict low reflectivity with high confidence) and little texture (small standard deviation). Class 11 has higher RMSD than class 5 because it does not even have lightning from which to infer high reflectivity, and thus class 11 can be regarded as the class with the highest amount of epistemic uncertainty. However, the frequencies of these two classes are very low relative to the other classes, and thus they are not the main sources of uncertainty in summer. Among classes 3, 7, and 8, which have high RMSD (>5 dBZ) as well as high relative error (>1), classes 3 and 8 occur more frequently, and they are the two classes that are dominant over Florida and the Northeast, respectively. The fact that these two classes are dominant over Florida and the Northeast, which were not included in the training, also suggests that their uncertainties are likely epistemic. To reduce such uncertainties, additional information such as location or environmental variables from which to infer lightning frequency can be considered in future training.

d. Bias correction of wintertime data

Although not shown, wintertime data mostly belong to classes 6–12 because there is less lightning. The percentage of class 11, which has the lowest information content and thus the highest epistemic uncertainty, is two orders of magnitude larger than in summer over the northern part of CONUS, because of cold surfaces that remain after applying the clear-sky mask and large synoptic systems that occupy wide regions with little cloud-top texture. Therefore, whether additional information can reduce such uncertainties is explored in this section to help design future retraining of GREMLIN.

To reduce false alarms and make the current version of GREMLIN more accurate in winter, two methods are tested using surface temperature from the HRRR model output: 1) dynamical scaling of channel-13 brightness temperature using surface temperature and 2) bias correction using a lookup table of mean biases that depends on the surface temperature and the channel-13 brightness temperature. Dynamical scaling is done by setting the maximum brightness temperature to the surface temperature and scaling accordingly. A cold brightness temperature then behaves like a warm temperature after normalization, and thus a cold surface in winter no longer looks like a precipitating cloud. The lookup table for the second method consists of mean biases for each surface temperature and channel-13 brightness temperature bin. The lookup table in Fig. 15a shows that mean biases are high when the surface temperature is similar to the channel-13 brightness temperature (close to the one-to-one line in red), which is exactly the case in which the cold-surface issue occurs. Note that the mean biases shown in Fig. 15a are small for surface temperatures warmer than 280 K, while they rapidly increase as the surface temperature falls below the freezing temperature (∼273 K).
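Both corrections can be sketched as follows in Python/NumPy; the normalization lower bound, the bin edges, and the clipping at 0 dBZ are placeholders chosen for illustration rather than the values used in the study.

```python
import numpy as np

TB_MIN_K = 180.0                              # placeholder lower bound for Tb13 scaling (K)
TB13_BINS = np.arange(180.0, 321.0, 5.0)      # placeholder channel-13 Tb bin edges (K)
TSFC_BINS = np.arange(230.0, 321.0, 5.0)      # placeholder HRRR surface temperature bin edges (K)

def dynamical_scaling(tb13, t_sfc):
    """Method 1: rescale Tb13 so that the local surface temperature maps to 1."""
    scaled = (tb13 - TB_MIN_K) / np.maximum(t_sfc - TB_MIN_K, 1e-6)
    return np.clip(scaled, 0.0, 1.0)

def build_bias_lut(gremlin, mrms, tb13, t_sfc):
    """Method 2 (training step): mean bias (GREMLIN - MRMS) per (t_sfc, Tb13) bin."""
    i = np.digitize(t_sfc.ravel(), TSFC_BINS)
    j = np.digitize(tb13.ravel(), TB13_BINS)
    bias = (gremlin - mrms).ravel()
    sums = np.zeros((TSFC_BINS.size + 1, TB13_BINS.size + 1))
    counts = np.zeros_like(sums)
    np.add.at(sums, (i, j), np.nan_to_num(bias))
    np.add.at(counts, (i, j), np.isfinite(bias).astype(float))
    return sums / np.maximum(counts, 1.0)

def apply_bias_lut(gremlin, tb13, t_sfc, lut):
    """Method 2 (application): subtract the binned mean bias; clip at 0 dBZ."""
    i = np.digitize(t_sfc, TSFC_BINS)
    j = np.digitize(tb13, TB13_BINS)
    return np.maximum(gremlin - lut[i, j], 0.0)
```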

Fig. 15. (a) A lookup table of mean biases according to channel-13 brightness temperature and surface temperature. The one-to-one line is plotted in red for an easier comparison. (b) Two-dimensional histogram plots of MRMS and GREMLIN composite radar reflectivity in winter after correcting with the lookup table in (a).

To compare the accuracy after the corrections, results for the same case study as in Fig. 5 after applying the two methods are presented in Fig. 16. Dynamical scaling (Fig. 16a) eliminates too much echo over the red-outlined-box region, while the lookup table (Fig. 16b) only reduces some of the echoes. However, high echoes (>35 dBZ, in yellow) over the Gulf of Mexico are also reduced after using the lookup table. Note that GREMLIN was trained using fixed scaling, so retraining the model using dynamically scaled inputs might improve results.

Fig. 16. GREMLIN results (a) after dynamical scaling and (b) after correcting with the lookup table in Fig. 15a for the same case study as in Fig. 5.

Since the second method is more effective in reducing false alarms, error distributions are calculated using the second method and compared with the current version of GREMLIN. The correlation is much improved, as shown in Fig. 15b, when compared with the R² in Fig. 2d. Figure 17 shows maps of RMSD and MBE after the correction using the lookup table. After the correction, overall RMSD and MBE values are significantly diminished, and the values lie within ranges similar to those of other seasons. However, the correction reduces echoes in general, and thus negative biases appear in regions where precipitation occurs frequently in winter, such as the West Coast or the South. These results suggest that using surface temperature helps reduce false echoes in winter, but care is needed in how it is used when training with wintertime data in the future.

Fig. 17. (a) RMSD and (b) MBE in winter after correcting with the lookup table shown in Fig. 15a.

4. Summary and conclusions

GREMLIN was developed to provide synthetic radar reflectivity information for observing severe weather and for convective initialization. GREMLIN has the potential to be applied to other currently operational geostationary satellites that carry similar sensors. Although the current version of GREMLIN uses CONUS sector data from GOES-16 and produces maps over CONUS, it can be expanded to the whole satellite field of view using full-disk sector data from GOES-16. Furthermore, once it is trained with the full-disk sector, it can be applied to full-disk sector data from other geostationary satellites to provide quasi-global maps of radar reflectivity. To support development of such extended applications, this study validates the current GREMLIN product against the MRMS product (Smith et al. 2016), which provides the most reliable radar data over CONUS.

GREMLIN is applied to 3 yr of data from 2020 to 2022, and its spatial and temporal error distributions are evaluated by comparing maps of RMSD, bias, CSI, POD, FAR, and FBI over CONUS. During winter, there is a positive bias because cold surfaces are frequently mistaken for precipitating clouds. After modifying reflectivity values based on the ABI level-2 clear-sky mask product, RMSD is significantly reduced. Accuracy in spring and fall is still affected by cold surfaces across northern CONUS, but the overall RMSD values are similar to the value from H21. RMSD plotted against day of year is lowest from April through September, and RMSD plotted against time of day shows that the summertime RMSD distribution has a diurnal pattern related to rainfall occurrence in different longitude regions. In addition, RMSD tends to increase with rainfall occurrence in summer. Bias maps show a negative bias over the West Coast, especially in cold seasons. In addition, false echoes in MRMS are observed over Texas and Washington in summer; the time series of RMSD confirms that the false echoes over Texas happen at certain times of day, whereas the false echoes over Washington can happen at any time of day.

While RMSD is a useful metric for evaluating overall model accuracy, categorical verification metrics can provide different views depending on the threshold used. A threshold of 5 dBZ can be used to identify meteorologically significant echoes. As in the RMSD results, summer shows the best CSI. Meanwhile, CSI using the 30-dBZ threshold, which is a good indicator of convection, shows slightly different results. Summer still has the highest CSI, but it is very similar to that of spring. In addition, springtime CSI in the Southeast is the highest. In summer, CSI is overall good in the East, but the Northeast, including Vermont, New Hampshire, and Maine, shows low CSI. This seems to be due to the low frequency of lightning and relatively warm brightness temperatures, which lead to underprediction of strong echoes. On the other hand, the relatively low CSI in Florida seems to be due to the high frequency of lightning, which leads to high FAR.

Based on the spatial and temporal analysis of GREMLIN products, it is shown that GREMLIN has overall good accuracy in spring, summer, and fall. In winter, it can still be useful in areas without ground-based radars, but it needs some improvements. Two simple methods for reducing false alarms using surface temperature were examined. After subtracting mean biases based on channel-13 brightness temperature and surface temperature, RMSD and biases are significantly reduced, but the correction tends to underestimate high echoes. These results suggest that, for the current version of GREMLIN, most of the uncertainty in winter is epistemic, which can be reduced if more information such as surface temperature is given, whereas the uncertainty in summer can be both epistemic and aleatoric, depending on the regime and the relationships between inputs and outputs. Therefore, for a global application, proper input variables need to be chosen so that the model can learn to distinguish different seasons or regimes, and it is also important to evaluate the results for each season and regime to confirm that the model generalizes well.

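As one way to make the lookup-table correction concrete, the following sketch bins the GREMLIN-minus-MRMS bias by channel-13 brightness temperature and surface temperature and subtracts the binned mean bias from new retrievals. It is an assumed implementation for illustration only; the bin edges, array layouts, and handling of empty bins are not taken from the paper.

```python
# Minimal sketch (assumed implementation): a lookup table (LUT) of mean bias binned by
# channel-13 brightness temperature (tb13) and surface temperature (tsfc), applied as a
# correction to GREMLIN reflectivity. Inputs are assumed to be flattened, collocated
# 1-D arrays of samples; `tb_bins` and `ts_bins` are user-chosen bin edges in kelvin.
import numpy as np

def build_bias_lut(gremlin, mrms, tb13, tsfc, tb_bins, ts_bins):
    """Mean GREMLIN-minus-MRMS bias (dBZ) per (tb13, tsfc) bin; NaN where no samples."""
    bias = gremlin - mrms
    i = np.digitize(tb13, tb_bins) - 1
    j = np.digitize(tsfc, ts_bins) - 1
    lut = np.full((len(tb_bins) - 1, len(ts_bins) - 1), np.nan)
    for a in range(lut.shape[0]):
        for b in range(lut.shape[1]):
            sel = (i == a) & (j == b)
            if sel.any():
                lut[a, b] = bias[sel].mean()
    return lut

def apply_bias_lut(gremlin, tb13, tsfc, lut, tb_bins, ts_bins):
    """Subtract the binned mean bias; corrected reflectivity is floored at 0 dBZ."""
    i = np.clip(np.digitize(tb13, tb_bins) - 1, 0, lut.shape[0] - 1)
    j = np.clip(np.digitize(tsfc, ts_bins) - 1, 0, lut.shape[1] - 1)
    correction = np.nan_to_num(lut[i, j])      # no correction where the LUT is empty
    return np.maximum(gremlin - correction, 0.0)
```
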
Acknowledgments.

This work is supported by the NOAA GOES-R Program under Grant NA19OAR4320073.

Data availability statement.

NOAA GOES-R Series Advanced Baseline Imager (ABI) level-2 clear-sky mask data were obtained from NOAA National Centers for Environmental Information (https://doi.org/10.7289/V5SF2TGP; accessed 16 March 2023). Composite radar reflectivity from MRMS and GREMLIN used in this study, as well as the input data for GREMLIN, can be retrieved from Hilburn (2023a,b,c). MRMS RQI data were obtained from the Iowa Environmental Mesonet of Iowa State University (MRMS Archiving; https://mtarchive.geol.iastate.edu/). The NOAA High-Resolution Rapid Refresh (HRRR) model was accessed on 31 December 2022 (ftp.ncep.noaa.gov).

REFERENCES

  • Abbe, C., 1901: The physical basis of long-range weather forecasts. Mon. Wea. Rev., 29, 551–561, https://doi.org/10.1175/1520-0493(1901)29[551c:TPBOLW]2.0.CO;2.
  • Agel, L., M. Barlow, S. B. Feldstein, and W. J. Gutowski Jr., 2018: Identification of large-scale meteorological patterns associated with extreme precipitation in the US northeast. Climate Dyn., 50, 1819–1839, https://doi.org/10.1007/s00382-017-3724-8.
  • AghaKouchak, A., A. Mehran, H. Norouzi, and A. Behrangi, 2012: Systematic and random error components in satellite precipitation data sets. Geophys. Res. Lett., 39, L09406, https://doi.org/10.1029/2012GL051592.
  • Arrieta, A. B., and Coauthors, 2020: Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion, 58, 82–115, https://doi.org/10.1016/j.inffus.2019.12.012.
  • Back, A., and Coauthors, 2021: Convection-indicating GOES-R products assimilated in the experimental UFS rapid refresh system. 2021 Fall Meeting, New Orleans, LA, Amer. Geophys. Union, Abstracts A22B-02, https://ui.adsabs.harvard.edu/abs/2021AGUFM.A22B..02B/abstract.
  • Bauer, P., A. Thorpe, and G. Brunet, 2015: The quiet revolution of numerical weather prediction. Nature, 525, 47–55, https://doi.org/10.1038/nature14956.
  • Bjerknes, V., 1904: Das problem der wettervorhersage, betrachtet vom standpunkte der mechanik und der physik. Meteor. Z., 21, 1–7.
  • Carbone, R. E., and J. D. Tuttle, 2008: Rainfall occurrence in the U.S. warm season: The diurnal cycle. J. Climate, 21, 4132–4146, https://doi.org/10.1175/2008JCLI2275.1.
  • Chapman, W. E., A. C. Subramanian, L. Delle Monache, S. P. Xie, and F. M. Ralph, 2019: Improving atmospheric river forecasts with machine learning. Geophys. Res. Lett., 46, 10 627–10 635, https://doi.org/10.1029/2019GL083662.
  • Ebert-Uphoff, I., and K. Hilburn, 2020: Evaluation, tuning, and interpretation of neural networks for working with images in meteorological applications. Bull. Amer. Meteor. Soc., 101, E2149–E2170, https://doi.org/10.1175/BAMS-D-20-0097.1.
  • Gettelman, A., D. J. Gagne, C.-C. Chen, M. W. Christensen, Z. J. Lebo, H. Morrison, and G. Gantos, 2021: Machine learning the warm rain process. J. Adv. Model. Earth Syst., 13, e2020MS002268, https://doi.org/10.1029/2020MS002268.
  • Goldenstern, E., and C. Kummerow, 2023: Predicting region-dependent biases in a GOES-16 machine learning precipitation retrieval. J. Appl. Meteor. Climatol., 62, 873–885, https://doi.org/10.1175/JAMC-D-22-0089.1.
  • Hayatbini, N., and Coauthors, 2019: Conditional Generative Adversarial Networks (CGANS) for near real-time precipitation estimation from multispectral GOES-16 satellite imageries—PERSIANN-cGAN. Remote Sens., 11, 2193, https://doi.org/10.3390/rs11192193.
  • Hilburn, K., 2023a: GREMLIN CONUS3 dataset for 2020. Dryad, accessed 15 May 2023, https://doi.org/10.5061/dryad.h9w0vt4nq.
  • Hilburn, K., 2023b: GREMLIN CONUS3 dataset for 2021. Dryad, accessed 15 May 2023, https://doi.org/10.5061/dryad.zs7h44jf2.
  • Hilburn, K., 2023c: GREMLIN CONUS3 dataset for 2022. Dryad, accessed 15 May 2023, https://doi.org/10.5061/dryad.2jm63xstt.
  • Hilburn, K., 2023d: Understanding spatial context in convolutional neural networks using explainable methods: Application to interpretable GREMLIN. Artif. Intell. Earth Syst., 2, 220093, https://doi.org/10.1175/AIES-D-22-0093.1.
  • Hilburn, K., I. Ebert-Uphoff, and S. D. Miller, 2021: Development and interpretation of a neural-network-based synthetic radar reflectivity estimator using GOES-R satellite observations. J. Appl. Meteor. Climatol., 60, 3–21, https://doi.org/10.1175/JAMC-D-20-0084.1.
  • Illingworth, A. J., 1988: The formation of rain in convective clouds. Nature, 336, 754–756, https://doi.org/10.1038/336754a0.
  • Im, J., S. Park, J. Rhee, J. Baik, and M. Choi, 2016: Downscaling of AMSR-E soil moisture with MODIS products using machine learning approaches. Environ. Earth Sci., 75, 1120, https://doi.org/10.1007/s12665-016-5917-6.
  • Kim, Y., and S. Hong, 2023: Hypothetical ground radar-like rain rate generation of geostationary weather satellite using data-to-data translation. IEEE Trans. Geosci. Remote Sens., 61, 4103414, https://doi.org/10.1109/TGRS.2023.3267840.
  • Kuligowski, R. J., Y. Li, Y. Hao, and Y. Zhang, 2016: Improvements to the GOES-R rainfall rate algorithm. J. Hydrometeor., 17, 1693–1704, https://doi.org/10.1175/JHM-D-15-0186.1.
  • Lagerquist, R., A. McGovern, C. R. Homeyer, D. J. Gagne II, and T. Smith, 2020: Deep learning on three-dimensional multiscale data for next-hour tornado prediction. Mon. Wea. Rev., 148, 2837–2861, https://doi.org/10.1175/MWR-D-19-0372.1.
  • Lagerquist, R., D. Turner, I. Ebert-Uphoff, J. Stewart, and V. Hagerty, 2021: Using deep learning to emulate and accelerate a radiative transfer model. J. Atmos. Oceanic Technol., 38, 1673–1696, https://doi.org/10.1175/JTECH-D-21-0007.1.
  • Lee, Y., C. D. Kummerow, and I. Ebert-Uphoff, 2021: Applying machine learning methods to detect convection using Geostationary Operational Environmental Satellite-16 (GOES-16) Advanced Baseline Imager (ABI) data. Atmos. Meas. Tech., 14, 2699–2716, https://doi.org/10.5194/amt-14-2699-2021.
  • Lee, Y., C. D. Kummerow, and M. Zupanski, 2022: Latent heating profiles from GOES-16 and its impacts on precipitation forecasts. Atmos. Meas. Tech., 15, 7119–7136, https://doi.org/10.5194/amt-15-7119-2022.
  • Li, W., L. Ni, Z.-L. Li, S.-B. Duan, and H. Wu, 2019: Evaluation of machine learning algorithms in spatial downscaling of MODIS land surface temperature. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 12, 2299–2307, https://doi.org/10.1109/JSTARS.2019.2896923.
  • Liu, S., C. Grassotti, Q. Liu, Y. Zhou, and Y.-K. Lee, 2022: Improvement of MiRS sea surface temperature retrievals using a machine learning approach. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 15, 1857–1868, https://doi.org/10.1109/JSTARS.2022.3151002.
  • Marchand, M., K. Hilburn, and S. D. Miller, 2019: Geostationary lightning mapper and Earth Networks lightning detection over the contiguous United States and dependence on flash characteristics. J. Geophys. Res. Atmos., 124, 11 552–11 567, https://doi.org/10.1029/2019JD031039.
  • Matthies, H. G., 2007: Quantifying uncertainty: Modern computational representation of probability and applications. Extreme Man-Made and Natural Hazards in Dynamics of Structures, Springer, 105–135.
  • Park, J.-E., Y.-J. Choi, J. Jeong, and S. Hong, 2023: Hypothetical cirrus band generation for advanced Himawari imager sensor using data-to-data translation with advanced meteorological imager observations. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 16, 356–368, https://doi.org/10.1109/JSTARS.2022.3224911.
  • Pfreundschuh, S., P. J. Brown, C. D. Kummerow, P. Eriksson, and T. Norrestad, 2022: GPROF-NN: A neural-network-based implementation of the Goddard profiling algorithm. Atmos. Meas. Tech., 15, 5033–5060, https://doi.org/10.5194/amt-15-5033-2022.
  • Shi, X., Z. Gao, L. Lausen, H. Wang, D.-Y. Yeung, W.-k. Wong, and W.-c. Woo, 2017: Deep learning for precipitation nowcasting: A benchmark and a new model. NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Curran Associates Inc., 5622–5632, https://dl.acm.org/doi/10.5555/3295222.3295313.
  • Smith, B. T., R. L. Thompson, J. S. Grams, C. Broyles, and H. E. Brooks, 2012: Convective modes for significant severe thunderstorms in the contiguous United States. Part I: Storm classification and climatology. Wea. Forecasting, 27, 1114–1135, https://doi.org/10.1175/WAF-D-11-00115.1.
  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 1617–1630, https://doi.org/10.1175/BAMS-D-14-00173.1.
  • Sun, F., B. Li, M. Min, and D. Qin, 2021: Deep learning-based radar composite reflectivity factor estimations from Fengyun-4A geostationary satellite observations. Remote Sens., 13, 2229, https://doi.org/10.3390/rs13112229.
  • Tian, Y., and Coauthors, 2009: Component analysis of errors in satellite-based precipitation estimates. J. Geophys. Res., 114, D24101, https://doi.org/10.1029/2009JD011949.
  • Veillette, M. S., E. P. Hassey, C. J. Mattioli, H. Iskenderian, and P. M. Lamey, 2018: Creating synthetic radar imagery using convolutional neural networks. J. Atmos. Oceanic Technol., 35, 2323–2338, https://doi.org/10.1175/JTECH-D-18-0010.1.
  • Weygandt, S. S., S. G. Benjamin, M. Hu, C. R. Alexander, T. G. Smirnova, and E. P. James, 2022: Radar reflectivity–based model initialization using specified latent heating (Radar-LHI) within a diabatic digital filter or pre-forecast integration. Wea. Forecasting, 37, 1419–1434, https://doi.org/10.1175/WAF-D-21-0142.1.
  • Yang, L., Q. Zhao, Y. Xue, F. Sun, J. Li, X. Zhen, and T. Lu, 2023: Radar composite reflectivity reconstruction based on FY-4A using deep learning. Sensors, 23, 81, https://doi.org/10.3390/s23010081.
  • Zajac, B. A., and S. A. Rutledge, 2001: Cloud-to-ground lightning activity in the contiguous United States from 1995 to 1999. Mon. Wea. Rev., 129, 999–1019, https://doi.org/10.1175/1520-0493(2001)129<0999:CTGLAI>2.0.CO;2.
  • Zhang, J., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) quantitative precipitation estimation: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 621–638, https://doi.org/10.1175/BAMS-D-14-00174.1.
  • Fig. 1. Average RQI over CONUS.
  • Fig. 2. RMSD maps over CONUS for (a) spring, (b) summer, (c) fall, and (d) winter. “RMSD total” is calculated over CONUS, and “RMSD only east” is calculated only using data in regions east of 105°W longitude. The winter RMSD map (d) uses a different scale in the color bar because the magnitude of RMSD in winter is much larger than that in other seasons.
  • Fig. 3. Difference in RMSD between before and after modifying reflectivity based on the ABI level-2 clear-sky mask product for (a) spring, (b) summer, (c) fall, and (d) winter. Minimum and maximum differences are shown in the lower-left corner of each plot.
  • Fig. 4. MBE maps (dBZ) for each season. Red color represents positive bias, which means that GREMLIN overestimates, and blue color represents negative bias, which means that GREMLIN underestimates. Note that the color bar for winter in (d) is different from the others.
  • Fig. 5. (a) MRMS composite reflectivity and (b) GREMLIN composite reflectivity at 1900 UTC 1 Dec 2022.
  • Fig. 6. RMSD against day of year using daily bins.
  • Fig. 7. Diurnal time series of (a) RMSD for each season and (b) RMSD in summer, dividing regions by longitude: WEST (longitude > 103°W), MID (90°W < longitude ≤ 103°W), and EAST (longitude ≤ 90°W).
  • Fig. 8. RMSD in summer over CONUS at (a) 0200 UTC, (b) 0700 UTC, (c) 1300 UTC, and (d) 2100 UTC.
  • Fig. 9. Maps of (a) MRMS composite reflectivity and (b) GOES channel-13 brightness temperature at 0300 UTC 26 Jul 2021.
  • Fig. 10. (a) CSI, (b) POD, (c) FAR, and (d) FBI for 5-, 10-, 15-, 20-, 25-, 30-, 35-, 40-, 45-, and 50-dBZ thresholds. Values for spring, summer, fall, and winter are plotted in green, orange, blue, and purple, respectively, and values from H21 are also plotted in red for comparison.
  • Fig. 11. CSI for the 5-dBZ threshold in (a) spring, (b) summer, (c) fall, and (d) winter.
  • Fig. 12. CSI for the 30-dBZ threshold in (a) spring, (b) summer, (c) fall, and (d) winter. Only the grid points that had more than 100 occurrences are plotted. Nebraska in the green-outlined box shows the highest CSI, and the Northeast (Vermont, Maine, and New Hampshire) in the blue-outlined box shows the lowest CSI.
  • Fig. 13. (a) POD and (b) FAR for the 30-dBZ threshold in summer. Only the grid points that had more than 100 occurrences are plotted.
  • Fig. 14. Distributions of brightness temperature for channels (a) 7, (b) 9, and (c) 13 and (d) presence of lightning in the Northeast (blue) and Nebraska (green), as defined by the outlined boxes in Fig. 12b. Solid and dashed lines are for hit and miss cases, respectively.
  • Fig. 15. (a) A lookup table of mean biases according to channel-13 brightness temperature and surface temperature. The one-to-one line is plotted in red for an easier comparison. (b) Two-dimensional histogram plots of MRMS and GREMLIN composite radar reflectivity in winter after correcting with the lookup table in (a).
  • Fig. 16. GREMLIN results (a) after dynamical scaling and (b) after correcting with the lookup table in Fig. 15a for the same case study as in Fig. 5.
  • Fig. 17. (a) RMSD and (b) MBE in winter after correcting with the lookup table shown in Fig. 15a.
