Abstract
The diversity in lightning parameterizations for numerical weather and climate models causes considerable uncertainty in lightning prediction. In this study, we take a data-driven approach to the lightning parameterization problem by combining machine learning (ML) techniques with the rich lightning observations from the World Wide Lightning Location Network. Three ML algorithms are trained over the contiguous United States (CONUS) to predict lightning stroke density in a 1° grid box based on atmospheric variables in the same box (local) or over the entire CONUS (nonlocal). The performance of the ML-based lightning schemes is examined and compared with that of a simple, conventional lightning parameterization scheme of Romps et al. We find that all ML-based lightning schemes outperform the conventional scheme in the regions and seasons with climatologically higher lightning stroke density. To the west of the Rocky Mountains, the nonlocal ML lightning scheme achieves the best overall performance, with lightning stroke density predictions 70% more accurate than those of the conventional scheme. Our results suggest that ML-based approaches have the potential to improve the representation of lightning and other types of extreme weather events in weather and climate models.
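The local-versus-nonlocal distinction above can be illustrated with a minimal sketch. All data, grid sizes, and the ridge-regression stand-in below are hypothetical (the paper uses other ML algorithms); the point is only that a predictor given the full gridded field can exploit a remote signal that a single-box predictor cannot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a coarse 5 x 8 grid of 1-degree boxes standing in for
# CONUS, with one atmospheric predictor per box per sample.
n_samples, ny, nx = 500, 5, 8
fields = rng.normal(size=(n_samples, ny, nx))

# Synthetic "truth": stroke density in the target box depends on the local
# value plus a contribution from a remote box (a nonlocal signal).
target_box, remote_box = (2, 3), (0, 7)
y = (1.5 * fields[:, target_box[0], target_box[1]]
     + 0.8 * fields[:, remote_box[0], remote_box[1]]
     + 0.1 * rng.normal(size=n_samples))

def ridge_fit_predict(X_train, y_train, X_test, lam=1e-2):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y."""
    XtX = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    w = np.linalg.solve(XtX, X_train.T @ y_train)
    return X_test @ w

split = 400
# Local scheme: only the target box's predictor.
X_local = fields[:, target_box[0], target_box[1]][:, None]
# Nonlocal scheme: the full flattened field.
X_nonlocal = fields.reshape(n_samples, -1)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

err_local = rmse(ridge_fit_predict(X_local[:split], y[:split], X_local[split:]),
                 y[split:])
err_nonlocal = rmse(ridge_fit_predict(X_nonlocal[:split], y[:split], X_nonlocal[split:]),
                    y[split:])
```

On this toy data the nonlocal features recover the remote contribution and yield a lower test error, mirroring the qualitative result reported for the nonlocal scheme.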
Abstract
Forecast evaluation metrics have been discovered and rediscovered in a variety of contexts, leading to confusion. We look at measures from the 2 × 2 contingency table and the history of their development, and illustrate how different fields working on similar problems have arrived at different approaches to, and perspectives on, the same mathematical concepts. For example, probability of detection (POD) is a quantity in meteorology that was earlier called prefigurance in that field, while the same quantity is named recall in information science and machine learning, and sensitivity or true positive rate in the medical literature. Many of the scores that combine three elements of the 2 × 2 table can be seen as coming either from a perspective of Venn diagrams or from the Pythagorean means, possibly weighted, of two ratios of performance measures. Although there are algebraic relationships between the two perspectives, the approaches taken by authors led them in different directions, making it unlikely that they would discover scores that naturally arose from the other approach. We close by discussing the importance of understanding the implicit or explicit values expressed by the choice of scores. In addition, we make some simple recommendations about the appropriate nomenclature to use when publishing interdisciplinary work.
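The cross-field nomenclature and the two perspectives discussed above can be made concrete with a small sketch. The counts below are made up for illustration; the functions compute standard contingency-table quantities under their several names.

```python
def contingency_scores(hits, misses, false_alarms):
    """Scores from the 2 x 2 contingency table (correct negatives unused here)."""
    # POD in meteorology = prefigurance = recall (ML) = sensitivity (medicine).
    pod = hits / (hits + misses)
    # Success ratio in meteorology = precision in ML.
    sr = hits / (hits + false_alarms)
    # Pythagorean-mean perspective: the harmonic mean of POD and SR
    # is the F1 score of machine learning.
    f1 = 2 * pod * sr / (pod + sr)
    # Venn-diagram perspective: critical success index (threat score),
    # hits over the union of forecast and observed events.
    csi = hits / (hits + misses + false_alarms)
    return {"pod": pod, "sr": sr, "f1": f1, "csi": csi}

scores = contingency_scores(hits=40, misses=10, false_alarms=20)
```

One algebraic relationship between the two perspectives is visible directly: CSI equals 1/(1/POD + 1/SR − 1), so the Venn-diagram score can be rewritten in terms of the same two ratios that the harmonic mean combines.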
Abstract
In this study, we developed an emulator of the Community Multiscale Air Quality (CMAQ) model by employing a one-dimensional (1D) convolutional neural network (CNN) algorithm to predict hourly surface nitrogen dioxide (NO2) concentrations over the most densely populated urban regions in Texas. The inputs for the emulator were the same as those for the CMAQ model, which include emission, meteorological, and land-use/land-cover data. We trained the model over June, July, and August (JJA) of 2011 and 2014 and then tested it on JJA of 2017, achieving an index of agreement (IOA) of 0.95 and a correlation of 0.90. We also employed temporal threefold cross validation to evaluate the model’s performance, ensuring the robustness and generalizability of the results. To gain deeper insights and understand the factors influencing the model’s surface NO2 predictions, we conducted a Shapley additive explanations analysis. The results revealed that solar radiation reaching the surface, planetary boundary layer height, and NOx (NO + NO2) emissions are key variables driving the model’s predictions. These findings highlight the emulator’s ability to capture the individual impact of each variable on the model’s NO2 predictions. Furthermore, our emulator outperformed the CMAQ model in terms of computational efficiency, being more than 900 times as fast in predicting NO2 concentrations, enabling the rapid assessment of various pollution management scenarios. This work offers a valuable resource for air pollution mitigation efforts, not just in Texas; with appropriate regional data training, its utility could be extended to other regions and pollutants as well.
Significance Statement
This work develops an emulator of the Community Multiscale Air Quality model, using a one-dimensional convolutional neural network to predict hourly surface NO2 concentrations across densely populated regions in Texas. Our emulator is capable of providing rapid and highly accurate NO2 estimates, enabling it to model diverse scenarios and facilitating informed decision-making to improve public health outcomes. Notably, this model outperforms traditional methods in computational efficiency, making it a robust, time-efficient tool for air pollution mitigation efforts. The findings suggest that key variables like solar radiation, planetary boundary layer height, and NOx (NO + NO2) emissions significantly influence the model’s NO2 predictions. By adding appropriate training data, this work can be extended to other regions and other pollutants such as O3, PM2.5, and PM10, offering a powerful tool for pollution mitigation and public health improvement efforts worldwide.
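The Shapley additive explanations analysis referenced above attributes a prediction to each input feature. As a toy illustration (the study applies the SHAP library to its CNN; here we enumerate coalitions exactly for a small hypothetical linear model, with inputs standing in for radiation, boundary layer height, and NOx emissions):

```python
import itertools
import math
import numpy as np

def shapley_values(model, x, baseline):
    """Exact Shapley attributions by enumerating all feature coalitions."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for subset in itertools.combinations(others, k):
                # Standard Shapley coalition weight |S|!(n-|S|-1)!/n!.
                weight = (math.factorial(k) * math.factorial(n - k - 1)
                          / math.factorial(n))
                with_i = baseline.copy()
                without_i = baseline.copy()
                for j in subset:
                    with_i[j] = x[j]
                    without_i[j] = x[j]
                with_i[i] = x[i]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

# Hypothetical linear "emulator" with three stand-in predictors.
model = lambda v: 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2]
x = np.array([1.0, 2.0, 3.0])
base = np.zeros(3)
phi = shapley_values(model, x, base)
```

The efficiency property holds by construction: the attributions sum to the difference between the model output at `x` and at the baseline, which is what lets SHAP decompose an individual NO2 prediction into per-variable contributions.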
Abstract
Forecasting the intensity of a tropical cyclone (TC) remains challenging, particularly when it undergoes rapid changes in intensity. This study aims to develop a convolutional neural network (CNN) for 24-h forecasts of TC intensity changes and their rapid intensifications over the western Pacific. The CNN model, the DeepTC, is trained using a unique loss function, an amplitude focal loss, to better capture large intensity changes, such as those during rapid intensification (RI) events. We showed that the DeepTC outperforms operational forecasts, with a lower mean absolute error (8.9%–10.2%) and a higher coefficient of determination (31.7%–35%). In addition, the DeepTC exhibits substantially better skill at capturing RI events than operational forecasts. To understand the superior performance of the DeepTC in RI forecasts, we conduct an occlusion sensitivity analysis to quantify the relative importance of each predictor. Results revealed that scalar quantities such as latitude, previous intensity change, initial intensity, and vertical wind shear play critical roles in successful RI prediction. Additionally, the DeepTC utilizes the three-dimensional distribution of relative humidity to distinguish RI cases from non-RI cases, with higher dry–moist moisture gradients in the mid-to-low troposphere and steeper radial moisture gradients in the upper troposphere during RI events. These relationships between the identified key variables and intensity change were successfully simulated by the DeepTC, implying that the relationships are physically reasonable. Our study demonstrates that the DeepTC can be a powerful tool for improving RI understanding and enhancing the reliability of TC intensity forecasts.
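The amplitude focal loss is the methodological novelty named above. Its exact form is not given in the abstract, so the sketch below is only one plausible realization of the stated goal: up-weighting the error on large-amplitude targets so that rare rapid-intensification cases dominate the gradient signal less weakly than under plain mean squared error.

```python
import numpy as np

def amplitude_focal_loss(y_true, y_pred, gamma=2.0):
    """Hedged sketch: squared error weighted by a power of target amplitude.

    gamma = 0 reduces to a uniformly weighted (doubled) MSE; larger gamma
    emphasizes samples with large intensity changes, e.g. RI events.
    """
    weight = 1.0 + np.abs(y_true) ** gamma
    return float(np.mean(weight * (y_true - y_pred) ** 2))

# Hypothetical normalized 24-h intensity changes; the last is an RI-like case.
y_true = np.array([0.5, 1.0, 4.0])
flat = amplitude_focal_loss(y_true, np.zeros(3), gamma=0.0)
focal = amplitude_focal_loss(y_true, np.zeros(3), gamma=2.0)
```

With the focusing exponent active, the RI-like sample contributes most of the loss, which is the intended behavior; the true DeepTC loss may differ in functional form.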
Abstract
Tropical cyclones (TCs) are important phenomena, and understanding their behavior requires being able to detect their presence in simulations. Detection algorithms vary; here we compare a novel deep learning–based detection algorithm (TCDetect) with a state-of-the-art tracking system (TRACK) and an observational dataset (IBTrACS) to provide context for potential use in climate simulations. Previous work has shown that TCDetect has good recall, particularly for hurricane-strength events. The primary question addressed here is to what extent the structure of the systems plays a part in detection. To compare with observations of TCs, it is necessary to apply detection techniques to reanalysis. For this purpose, we use ERA-Interim, and a key part of the comparison is the recognition that ERA-Interim itself does not fully reflect the observations. Despite that limitation, both TCDetect and TRACK applied to ERA-Interim mostly agree with each other. Also, when considering only hurricane-strength TCs, TCDetect and TRACK correspond well to the TC observations from IBTrACS. Like TRACK, TCDetect has good recall for strong systems; however, it finds a significant number of false positives associated with weaker TCs (i.e., events detected as having hurricane strength but are weaker in reality) and extratropical storms. Because TCDetect was not trained to locate TCs, a post hoc method to perform comparisons was used. Although this method was not always successful, some success in matching tracks and events in physical space was also achieved. The analysis of matches suggested that the best results were found in the Northern Hemisphere and that in most regions the detections followed the same patterns in time no matter which detection method was used.
Abstract
Superresolution is the general task of artificially increasing the spatial resolution of an image. The recent surge in machine learning (ML) research has yielded many promising ML-based approaches for performing single-image superresolution, including applications to satellite remote sensing. We develop a convolutional neural network (CNN) to superresolve the 1- and 2-km bands on the GOES-R series Advanced Baseline Imager (ABI) to a common high resolution of 0.5 km. Access to 0.5-km imagery from ABI band 2 enables the CNN to realistically sharpen lower-resolution bands without significant blurring. We first train the CNN on a proxy task that requires only ABI imagery: degrading the resolution of ABI bands and training the CNN to restore the original imagery. Comparisons at reduced resolution and at full resolution with Landsat-8/Landsat-9 observations illustrate that the CNN produces images with realistic high-frequency detail that is not present in a bicubic interpolation baseline. Estimating all ABI bands at 0.5-km resolution allows for more easily combining information across bands without reconciling differences in spatial resolution. However, more analysis is needed to determine impacts on derived products or multispectral imagery that use superresolved bands. This approach is extensible to other remote sensing instruments that have bands with different spatial resolutions and requires only a small amount of data and knowledge of each channel’s modulation transfer function.
Significance Statement
Satellite remote sensing instruments often have bands with different spatial resolutions. This work shows that we can artificially increase the resolution of some lower-resolution bands by taking advantage of the texture of higher-resolution bands on the GOES-16 ABI instrument using a convolutional neural network. This may help reconcile differences in spatial resolution when combining information across bands, but future analysis is needed to precisely determine impacts on derived products that might use superresolved bands.
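The proxy task described above (degrade, then learn to restore) can be sketched in a few lines. This toy version uses a simple 2 × 2 block average as the degradation; the paper additionally incorporates each channel's modulation transfer function, which is omitted here, and the array below is random stand-in imagery.

```python
import numpy as np

def degrade(band, factor=2):
    """Reduce resolution by block-averaging factor x factor pixel groups."""
    h, w = band.shape
    assert h % factor == 0 and w % factor == 0
    return band.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def make_training_pair(band_hi):
    """Proxy-task pair: the input is the degraded band, the target is the
    original imagery the network must restore."""
    return degrade(band_hi), band_hi

rng = np.random.default_rng(0)
band = rng.random((8, 8)).astype(np.float64)  # hypothetical high-res tile
x_lo, y_hi = make_training_pair(band)
```

Because the degradation is known and applied to real imagery, unlimited (input, target) pairs can be generated without any external high-resolution reference, which is what lets the CNN be trained on ABI data alone.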
Abstract
Despite the sophistication of global climate models (GCMs), their coarse spatial resolution limits their ability to resolve important aspects of climate variability and change at the local scale. Both dynamical and empirical methods are used for enhancing the resolution of climate projections through downscaling, each with distinct advantages and challenges. Dynamical downscaling is physics based but comes with a large computational cost, posing a barrier for downscaling an ensemble of GCMs large enough for reliable uncertainty quantification of climate risks. In contrast, empirical downscaling, which encompasses statistical and machine learning techniques, provides a computationally efficient alternative to downscaling GCMs. Empirical downscaling algorithms can be developed to emulate the behavior of dynamical models directly, or through frameworks such as perfect prognosis in which relationships are established between large-scale atmospheric conditions and local weather variables using observational data. However, the ability of empirical downscaling algorithms to apply their learned relationships out of distribution into future climates remains uncertain, as is their ability to represent certain types of extreme events. This review covers the growing potential of machine learning methods to address these challenges, offering a thorough exploration of the current applications and training strategies that can circumvent certain issues. Additionally, we propose an evaluation framework for machine learning algorithms specific to the problem of climate downscaling as needed to improve transparency and foster trust in climate projections.
Significance Statement
This review offers a significant contribution to our understanding of how machine learning can offer a transformative change in climate downscaling. It serves as a guide to navigate recent advances in machine learning and how these advances can be better aligned toward inherent challenges in climate downscaling. In this review, we provide an overview of these recent advances with a critical discussion of their advantages and limitations. We also discuss opportunities to refine existing machine learning methods alongside new approaches for the generation of large ensembles of high-resolution climate projections.
Abstract
The abundance of gaps in satellite image time series often complicates the application of deep learning models such as convolutional neural networks for spatiotemporal modeling. Based on previous work in computer vision on image inpainting, this paper shows how three-dimensional spatiotemporal partial convolutions can be used as layers in neural networks to fill gaps in satellite image time series. To evaluate the approach, we apply a U-Net-like model to incomplete image time series of quasi-global carbon monoxide observations from the Sentinel-5 Precursor (Sentinel-5P) satellite. Prediction errors were comparable to those of two considered statistical approaches, while computation times for predictions were up to three orders of magnitude faster, making the approach practical for processing large amounts of satellite data. Partial convolutions can be added as layers to other types of neural networks, making them relatively easy to integrate with existing deep learning models. However, the approach does not provide prediction uncertainties, and further research is needed to understand and improve model transferability. The implementation of spatiotemporal partial convolutions and the U-Net-like model is available as open-source software.
Significance Statement
Gaps in satellite-based measurements of atmospheric variables can make the application of complex analysis methods such as deep learning approaches difficult. The purpose of this study is to present and evaluate a purely data-driven method to fill incomplete satellite image time series. The application on atmospheric carbon monoxide data suggests that the method can achieve prediction errors comparable to other approaches with much lower computation times. Results highlight that the method is promising for larger datasets but also that care must be taken to avoid extrapolation. Future studies may integrate the approach into more complex deep learning models for understanding spatiotemporal dynamics from incomplete data.
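The core operation underlying the approach above is the partial convolution: convolve over observed pixels only, renormalize by the local mask coverage, and update the mask. A minimal 2D version (the paper generalizes this to three spatiotemporal dimensions; real implementations use optimized deep learning layers rather than explicit loops) might look like this:

```python
import numpy as np

def partial_conv2d(x, mask, kernel):
    """Naive 2D partial convolution: mask-aware, renormalized, mask-updating."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x * mask, ((ph, ph), (pw, pw)))   # zero out missing pixels
    mp = np.pad(mask, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    new_mask = np.zeros_like(mask, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            cover = mp[i:i + kh, j:j + kw].sum()
            if cover > 0:
                # Renormalize by (kernel size / number of observed pixels)
                # so partially observed windows are not biased low.
                out[i, j] = ((kernel * xp[i:i + kh, j:j + kw]).sum()
                             * (kh * kw / cover))
                new_mask[i, j] = 1.0  # pixel is now filled
    return out, new_mask

x = np.ones((5, 5))
mask = np.ones((5, 5))
mask[2, 2] = 0.0                      # one missing observation
kernel = np.full((3, 3), 1.0 / 9.0)   # mean filter
y, m2 = partial_conv2d(x, mask, kernel)
```

On a constant field the renormalization reproduces the constant exactly at the gap, and the updated mask marks the gap as filled; stacking such layers shrinks larger gaps progressively, which is how the U-Net-like model fills whole missing regions.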
Abstract
The quantification of storm updrafts remains unavailable for operational forecasting despite their inherent importance to convection and its associated severe weather hazards. Updraft proxies, like overshooting top area from satellite images, have been linked to severe weather hazards but only relate to a limited portion of the total storm updraft. This study investigates whether a machine learning model, namely a U-Net, can skillfully retrieve maximum vertical velocity and its areal extent from three-dimensional gridded radar reflectivity alone. The machine learning model is trained using simulated radar reflectivity and vertical velocity from the National Severe Storms Laboratory’s convection-permitting Warn-on-Forecast System (WoFS). A parametric regression technique using the sinh–arcsinh–normal distribution is adapted to run with U-Nets, allowing for both deterministic and probabilistic predictions of maximum vertical velocity. The best models after hyperparameter search provided less than 50% root mean squared error, a coefficient of determination greater than 0.65, and an intersection over union (IoU) of more than 0.45 on the independent test set composed of WoFS data. Beyond the WoFS analysis, a case study was conducted using real radar data and corresponding dual-Doppler analyses of vertical velocity within a supercell. The U-Net consistently underestimates the dual-Doppler updraft speed estimates by 50%. Meanwhile, the area of the 5 and 10 m s−1 updraft cores shows an IoU of 0.25. While the above statistics are not exceptional, the machine learning model enables quick distillation of 3D radar data that is related to the maximum vertical velocity, which could be useful in assessing a storm’s severe potential.
Significance Statement
All convective storm hazards (tornadoes, hail, heavy rain, straight-line winds) can be related to a storm’s updraft. Yet, forecasters have no direct measurement of updraft speed or area on which to base their warning decisions. This paper addresses the lack of observational data by providing a machine learning solution that skillfully estimates the maximum updraft speed within storms from the 3D radar reflectivity structure alone. After further vetting of the machine learning solutions on additional real-world examples, the estimated storm updrafts will hopefully provide forecasters with an added tool to help diagnose a storm’s hazard potential more accurately.
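The abstract scores the areal extent of updraft cores with intersection over union (IoU), i.e., it thresholds the predicted and reference vertical-velocity fields (e.g., at 5 or 10 m s−1) and compares the resulting masks. A minimal sketch of that metric (the function name and threshold default are illustrative, not from the paper):

```python
import numpy as np

def updraft_iou(w_pred, w_true, threshold=5.0):
    """IoU of updraft cores: grid points where vertical velocity
    meets or exceeds `threshold` (m/s) in each field."""
    pred = w_pred >= threshold
    true = w_true >= threshold
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return np.nan  # no core in either field; IoU undefined
    return np.logical_and(pred, true).sum() / union
```

An IoU of 0.25, as reported for the dual-Doppler case study, means the overlapping core area is one quarter of the combined predicted-plus-observed core area.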
Abstract
We developed an advanced postprocessing model for precipitation forecasting using a microgenetic algorithm (MGA). The algorithm determines the optimal combination of three general circulation models: the Korean Integrated Model, the Unified Model, and the Integrated Forecast System model. The MGA calculates optimal weights for the individual models based on a fitness function that combines several accuracy indices, including the critical success index (CSI), probability of detection (POD), and frequency bias index. Our optimized multimodel yielded up to 13% and 10% improvement in CSI and POD performance, respectively, compared to each individual model. Notably, when applied to an operational definition that considers precipitation thresholds from three models and averages the precipitation amount from the satisfactory models, our optimized multimodel outperformed the current operational model used by the Korea Meteorological Administration by up to 1.0% and 6.8% in terms of CSI and false alarm ratio performance, respectively. This study highlights the effectiveness of a weighted combination of global models in enhancing the forecasting accuracy for regional precipitation. By utilizing the MGA for the fine-tuning of model weights, we achieved superior precipitation prediction compared to that of individual models and existing standard postprocessing operations. This approach can significantly improve the accuracy of precipitation forecasts.
Significance Statement
We developed an optimized multimodel for predicting precipitation occurrence using advanced techniques. By integrating various weather models with their optimized weights, our approach outperforms the method of using an arithmetic average of all models. This study underscores the potential to enhance regional precipitation forecasts, thereby facilitating more precise weather predictions for the public.
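The core idea above is to blend the three models' precipitation-occurrence guidance with learned weights and score the blend with categorical metrics such as CSI. A minimal sketch, with an exhaustive grid search over the weights standing in for the paper's microgenetic optimizer (the function names, the 0.5 occurrence threshold, and the search step are assumptions for illustration):

```python
import numpy as np

def csi(forecast, observed):
    """Critical success index = hits / (hits + misses + false alarms)."""
    hits = np.sum(forecast & observed)
    misses = np.sum(~forecast & observed)
    false_alarms = np.sum(forecast & ~observed)
    denom = hits + misses + false_alarms
    return hits / denom if denom else np.nan

def best_weights(models, observed, threshold=0.5, step=0.1):
    """Search weights (summing to 1) over three models' occurrence
    probabilities; a brute-force stand-in for the MGA."""
    best, best_score = None, -1.0
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    for w1 in grid:
        for w2 in grid:
            w3 = 1.0 - w1 - w2
            if w3 < -1e-9:
                continue  # weights must be nonnegative
            blend = w1 * models[0] + w2 * models[1] + max(w3, 0.0) * models[2]
            score = csi(blend >= threshold, observed)
            if score > best_score:
                best_score, best = score, (w1, w2, max(w3, 0.0))
    return best, best_score
```

A genetic algorithm replaces this exhaustive search with mutation and selection over candidate weight vectors, which scales far better once the fitness function mixes several indices, as in the study.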