Abstract
Superresolution is the general task of artificially increasing the spatial resolution of an image. The recent surge in machine learning (ML) research has yielded many promising ML-based approaches for performing single-image superresolution, including applications to satellite remote sensing. We develop a convolutional neural network (CNN) to superresolve the 1- and 2-km bands on the GOES-R series Advanced Baseline Imager (ABI) to a common high resolution of 0.5 km. Access to 0.5-km imagery from ABI band 2 enables the CNN to realistically sharpen lower-resolution bands without significant blurring. We first train the CNN on a proxy task that requires only ABI imagery: degrading the resolution of ABI bands and training the CNN to restore the original imagery. Comparisons at reduced resolution and at full resolution with Landsat-8/Landsat-9 observations illustrate that the CNN produces images with realistic high-frequency detail that is not present in a bicubic interpolation baseline. Estimating all ABI bands at 0.5-km resolution makes it easier to combine information across bands without reconciling differences in spatial resolution. However, more analysis is needed to determine impacts on derived products or multispectral imagery that use superresolved bands. This approach is extensible to other remote sensing instruments that have bands with different spatial resolutions and requires only a small amount of data and knowledge of each channel’s modulation transfer function.
Significance Statement
Satellite remote sensing instruments often have bands with different spatial resolutions. This work shows that we can artificially increase the resolution of some lower-resolution bands by taking advantage of the texture of higher-resolution bands on the GOES-16 ABI instrument using a convolutional neural network. This may help reconcile differences in spatial resolution when combining information across bands, but future analysis is needed to precisely determine impacts on derived products that might use superresolved bands.
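The proxy task described in the abstract above (degrade a band, then train a network to restore it) can be illustrated with a minimal sketch of how one training pair might be built. This is not the authors' pipeline: the abstract indicates that the degradation uses each channel's modulation transfer function, whereas the sketch below substitutes simple block averaging as a stand-in. The function names `block_average` and `make_training_pair` are hypothetical.

```python
def block_average(image, factor):
    """Degrade a 2D image (list of rows) by averaging factor x factor blocks.

    Assumption: block averaging stands in for the sensor-specific
    degradation; a real setup would model the modulation transfer function.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def make_training_pair(image, factor):
    """Return (degraded input, original target) for the proxy task."""
    return block_average(image, factor), image


# Toy 4 x 4 "band"; the values are illustrative, not real ABI data.
original = [
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 6.0, 6.0],
    [4.0, 4.0, 8.0, 8.0],
]
coarse, target = make_training_pair(original, 2)
print(coarse)  # [[2.0, 6.0], [3.0, 7.0]]
```

A network trained on many such pairs learns the mapping from the coarse grid back to the fine grid, which is then applied to the bands at their native resolution.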
Abstract
Despite the sophistication of global climate models (GCMs), their coarse spatial resolution limits their ability to resolve important aspects of climate variability and change at the local scale. Both dynamical and empirical methods are used for enhancing the resolution of climate projections through downscaling, each with distinct advantages and challenges. Dynamical downscaling is physics based but comes with a large computational cost, posing a barrier for downscaling an ensemble of GCMs large enough for reliable uncertainty quantification of climate risks. In contrast, empirical downscaling, which encompasses statistical and machine learning techniques, provides a computationally efficient alternative to downscaling GCMs. Empirical downscaling algorithms can be developed to emulate the behavior of dynamical models directly, or through frameworks such as perfect prognosis in which relationships are established between large-scale atmospheric conditions and local weather variables using observational data. However, the ability of empirical downscaling algorithms to apply their learned relationships out of distribution into future climates remains uncertain, as is their ability to represent certain types of extreme events. This review covers the growing potential of machine learning methods to address these challenges, offering a thorough exploration of the current applications and training strategies that can circumvent certain issues. Additionally, we propose an evaluation framework for machine learning algorithms specific to the problem of climate downscaling as needed to improve transparency and foster trust in climate projections.
Significance Statement
This review offers a significant contribution to our understanding of how machine learning can offer a transformative change in climate downscaling. It serves as a guide to navigate recent advances in machine learning and how these advances can be better aligned toward inherent challenges in climate downscaling. In this review, we provide an overview of these recent advances with a critical discussion of their advantages and limitations. We also discuss opportunities to refine existing machine learning methods alongside new approaches for the generation of large ensembles of high-resolution climate projections.
Abstract
The abundance of gaps in satellite image time series often complicates the application of deep learning models such as convolutional neural networks for spatiotemporal modeling. Based on previous work in computer vision on image inpainting, this paper shows how three-dimensional spatiotemporal partial convolutions can be used as layers in neural networks to fill gaps in satellite image time series. To evaluate the approach, we apply a U-Net-like model on incomplete image time series of quasi-global carbon monoxide observations from the Sentinel-5 Precursor (Sentinel-5P) satellite. Prediction errors were comparable to two considered statistical approaches while computation times for predictions were up to three orders of magnitude faster, making the approach applicable to process large amounts of satellite data. Partial convolutions can be added as layers to other types of neural networks, making it relatively easy to integrate with existing deep learning models. However, the approach does not provide prediction uncertainties and further research is needed to understand and improve model transferability. The implementation of spatiotemporal partial convolutions and the U-Net-like model is available as open-source software.
Significance Statement
Gaps in satellite-based measurements of atmospheric variables can make the application of complex analysis methods such as deep learning approaches difficult. The purpose of this study is to present and evaluate a purely data-driven method to fill incomplete satellite image time series. The application on atmospheric carbon monoxide data suggests that the method can achieve prediction errors comparable to other approaches with much lower computation times. Results highlight that the method is promising for larger datasets but also that care must be taken to avoid extrapolation. Future studies may integrate the approach into more complex deep learning models for understanding spatiotemporal dynamics from incomplete data.
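To make the idea of a partial convolution concrete, the sketch below applies the standard partial-convolution rule from the image-inpainting literature to a one-dimensional series with gaps. It is a simplification and not the paper's implementation: the paper uses three-dimensional spatiotemporal windows inside a U-Net-like model, whereas this example uses a single fixed 1D kernel. The function name `partial_conv1d` and the toy data are hypothetical.

```python
def partial_conv1d(values, mask, weights, bias=0.0):
    """Partial convolution over a 1D series with 'valid' windows.

    mask[i] is 1 where values[i] is observed and 0 where it is a gap.
    Each output uses only observed inputs and is rescaled by
    (window size / number of observed inputs); windows with no observed
    input produce 0 and remain masked.
    """
    k = len(weights)
    out, new_mask = [], []
    for start in range(len(values) - k + 1):
        window_v = values[start:start + k]
        window_m = mask[start:start + k]
        n_valid = sum(window_m)
        if n_valid == 0:
            out.append(0.0)
            new_mask.append(0)
            continue
        weighted = sum(w * v * m
                       for w, v, m in zip(weights, window_v, window_m))
        out.append(weighted * (k / n_valid) + bias)
        new_mask.append(1)
    return out, new_mask


# Toy series: zeros at masked positions are placeholders, not observations.
series = [2.0, 0.0, 4.0, 0.0, 0.0, 0.0, 6.0]
observed = [1, 0, 1, 0, 0, 0, 1]
mean_kernel = [1 / 3, 1 / 3, 1 / 3]

filled, updated_mask = partial_conv1d(series, observed, mean_kernel)
print(updated_mask)  # [1, 1, 1, 0, 1]
```

With a mean kernel, each output is the average of the observed values in its window. The updated mask shows how gaps shrink after a layer, which is why stacking such layers can fill larger gaps.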
Abstract
The quantification of storm updrafts remains unavailable for operational forecasting despite their inherent importance to convection and its associated severe weather hazards. Updraft proxies, like overshooting top area from satellite images, have been linked to severe weather hazards but only relate to a limited portion of the total storm updraft. This study investigates whether a machine learning model, namely, a U-Net, can skillfully retrieve maximum vertical velocity and its areal extent from three-dimensional gridded radar reflectivity alone. The machine learning model is trained using simulated radar reflectivity and vertical velocity from the National Severe Storms Laboratory’s convection-permitting Warn-on-Forecast System (WoFS). A parametric regression technique using the sinh–arcsinh–normal distribution is adapted to run with U-Nets, allowing for both deterministic and probabilistic predictions of maximum vertical velocity. The best models after hyperparameter search provided less than 50% root mean squared error, a coefficient of determination greater than 0.65, and an intersection over union (IoU) of more than 0.45 on the independent test set composed of WoFS data. Beyond the WoFS analysis, a case study was conducted using real radar data and corresponding dual-Doppler analyses of vertical velocity within a supercell. The U-Net consistently underestimates the dual-Doppler updraft speed estimates by 50%. Meanwhile, the area of the 5 and 10 m s−1 updraft cores shows an IoU of 0.25. While the above statistics are not exceptional, the machine learning model enables quick distillation of 3D radar data that is related to the maximum vertical velocity, which could be useful in assessing a storm’s severe potential.
Significance Statement
All convective storm hazards (tornadoes, hail, heavy rain, straight-line winds) can be related to a storm’s updraft. Yet, there is no direct measurement of updraft speed or area available to forecasters for making warning decisions. This paper addresses the lack of observational data by providing a machine learning solution that skillfully estimates the maximum updraft speed within storms from only the radar reflectivity 3D structure. After further vetting the machine learning solutions on additional real-world examples, the estimated storm updrafts will hopefully provide forecasters with an added tool to help diagnose a storm’s hazard potential more accurately.
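The intersection over union (IoU) quoted for the updraft cores can be illustrated with a short sketch. The abstract does not state how the cores are delineated, so the `>=` threshold test, the function name `updraft_iou`, and the toy grids below are assumptions for illustration only.

```python
def updraft_iou(predicted, reference, threshold):
    """IoU of the areas where vertical velocity meets or exceeds a threshold.

    predicted and reference are 2D grids (lists of rows) of vertical
    velocity in m/s on the same grid. Assumption: a grid point belongs
    to the updraft core when its value is >= threshold.
    """
    pred_core = {(i, j) for i, row in enumerate(predicted)
                 for j, w in enumerate(row) if w >= threshold}
    ref_core = {(i, j) for i, row in enumerate(reference)
                for j, w in enumerate(row) if w >= threshold}
    union = pred_core | ref_core
    if not union:
        return 0.0  # convention chosen here: no core in either field
    return len(pred_core & ref_core) / len(union)


# Toy 3 x 3 fields; the values are illustrative, not from the paper.
predicted = [[2, 6, 12], [1, 7, 11], [0, 3, 4]]
reference = [[1, 4, 11], [6, 8, 12], [0, 5, 2]]

print(updraft_iou(predicted, reference, 5))   # 0.5
print(updraft_iou(predicted, reference, 10))  # 1.0
```

An IoU of 1 means the predicted and reference cores coincide exactly, and 0 means they do not overlap at all.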
Abstract
We developed an advanced postprocessing model for precipitation forecasting using a microgenetic algorithm (MGA). The algorithm determines the optimal combination of three general circulation models: the Korean Integrated Model, the Unified Model, and the Integrated Forecast System model. The MGA calculates optimal weights for the individual models based on a fitness function that combines several accuracy indices, including the critical success index (CSI), probability of detection (POD), and frequency bias index. Our optimized multimodel yielded up to 13% and 10% improvement in CSI and POD performance, respectively, compared to each individual model. Notably, when applied to an operational definition that considers precipitation thresholds from three models and averages the precipitation amount from the satisfactory models, our optimized multimodel outperformed the current operational model used by the Korea Meteorological Administration by up to 1.0% and 6.8% in terms of CSI and false alarm ratio performance, respectively. This study highlights the effectiveness of a weighted combination of global models to enhance the forecasting accuracy for regional precipitation. By utilizing the MGA for the fine-tuning of model weights, we achieved superior precipitation prediction compared to that of individual models and existing standard postprocessing operations. This approach can significantly improve the accuracy of precipitation forecasts.
Significance Statement
We developed an optimized multimodel for predicting precipitation occurrence using advanced techniques. By integrating various weather models with their optimized weights, our approach outperforms the method of using an arithmetic average of all models. This study underscores the potential to enhance regional precipitation forecasts, thereby facilitating more precise weather predictions for the public.
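The verification indices named in the abstract have standard contingency-table definitions, sketched below together with a weighted blend of several model forecasts. This is only an illustration: the weights are made-up values rather than MGA output, the paper's fitness function is not reproduced, and the function names, the `>=` threshold test, and the toy data are all hypothetical.

```python
def contingency(forecast, observed, threshold):
    """Count hits, false alarms, and misses for a precipitation threshold."""
    hits = false_alarms = misses = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif f_yes:
            false_alarms += 1
        elif o_yes:
            misses += 1
    return hits, false_alarms, misses


def scores(hits, false_alarms, misses):
    """Standard categorical scores; assumes all denominators are nonzero."""
    return {
        "CSI": hits / (hits + false_alarms + misses),
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "FBI": (hits + false_alarms) / (hits + misses),
    }


def weighted_blend(model_forecasts, weights):
    """Weighted average of several model forecasts, point by point."""
    total = sum(weights)
    return [sum(w * f for w, f in zip(weights, values)) / total
            for values in zip(*model_forecasts)]


# Toy precipitation amounts (mm) at six points; not data from the study.
observed = [0.0, 2.0, 5.0, 0.0, 1.5, 0.0]
model_a = [0.0, 3.0, 4.0, 2.0, 0.0, 0.0]
model_b = [1.0, 1.0, 6.0, 0.0, 2.0, 0.0]
model_c = [0.0, 2.0, 2.0, 0.0, 4.0, 3.0]
weights = [0.5, 0.25, 0.25]  # hypothetical weights, not optimized

blend = weighted_blend([model_a, model_b, model_c], weights)
table = contingency(blend, observed, threshold=1.0)
print(table)                 # (3, 1, 0)
print(scores(*table)["CSI"])  # 0.75
```

An optimizer such as a genetic algorithm would search over `weights` to maximize a fitness value built from scores like these.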
Abstract
Virtually all aspects of our societal functioning, from food security to energy supply to healthcare, depend on the dynamics of environmental factors. Nevertheless, the social dimensions of weather and climate are noticeably less explored by the artificial intelligence community. By harnessing the strength of geometric deep learning (GDL), we aim to investigate the pressing societal question of the potential disproportional impacts of air quality on COVID-19 clinical severity. To quantify air pollution levels, here we use aerosol optical depth (AOD) records that measure the reduction of sunlight due to atmospheric haze, dust, and smoke. We also introduce unique and not yet broadly available NASA satellite records (NASAdat) on AOD, temperature, and relative humidity and discuss the utility of these new data for biosurveillance and climate justice applications, with a specific focus on COVID-19 in the states of Texas and Pennsylvania. The results indicate, in general, that poorer air quality tends to be associated with higher rates of clinical severity and, in the case of Texas, that this phenomenon particularly stands out in counties characterized by higher socioeconomic vulnerability. This, in turn, raises a concern of environmental injustice in these socioeconomically disadvantaged communities. Furthermore, given that one of NASA’s recent long-term commitments is to address such inequitable burdens of environmental harm by expanding the use of Earth science data such as NASAdat, this project is one of the first steps toward developing a new platform integrating NASA’s satellite observations with deep learning (DL) tools for social good.
Significance Statement
By leveraging the strengths of modern deep learning models, particularly graph neural networks, to describe complex spatiotemporal dependencies and by introducing new NASA satellite records, this study aims to investigate the problem of potential environmental injustice associated with COVID-19 clinical severity and caused by disproportional impacts of poor air quality on socioeconomically disadvantaged populations.
Abstract
In many regions of the world, tornadoes travel through forested areas with low population densities, making downed trees the only observable damage indicator. Current methods in the EF scale for analyzing tree damage may not reflect the true intensity of some tornadoes. However, new methods have been developed that use the number of trees downed or treefall directions from high-resolution aerial imagery to provide an estimate of maximum wind speed. Treefall Identification and Direction Analysis (TrIDA) maps are used to identify areas of treefall damage and treefall directions along the damage path. Currently, TrIDA maps are generated manually, but this is labor-intensive, often taking several days or weeks. To solve this, this paper describes a machine learning– and image-processing-based model that automatically extracts fallen trees from large-scale aerial imagery, assesses their fall directions, and produces an area-averaged treefall vector map with minimal initial human interaction. The automated model achieves a median tree direction difference of 13.3° when compared to the manual tree directions from the Alonsa, Manitoba, tornado, demonstrating the viability of the automated model compared to manual assessment. Overall, the automated production of treefall vector maps from large-scale aerial imagery significantly speeds up and reduces the labor required to create a Treefall Identification and Direction Analysis map from a matter of days or weeks to a matter of hours.
Significance Statement
The automation of treefall detection and direction is significant to the analyses of tornado paths and intensities. Previously, it would have taken a researcher multiple days to weeks to manually count and assess the directions of fallen trees in large-scale aerial photography of tornado damage. Through automation, analysis takes a matter of hours, with minimal initial human interaction. Tornado researchers around the world will be able to use this automated process to help analyze tornadoes and assess their enhanced Fujita scale ratings.
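A metric such as the median tree direction difference has to account for angles wrapping around at 360°. The sketch below shows one way this could be computed. It assumes fall directions are expressed as compass bearings in degrees and that automated and manual trees are already matched one to one; the abstract specifies neither, and the function names and sample bearings are hypothetical.

```python
from statistics import median


def direction_difference(a_deg, b_deg):
    """Smallest absolute angle between two bearings, in the range 0 to 180."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def median_direction_difference(automated, manual):
    """Median angular difference over matched pairs of fall directions."""
    return median(direction_difference(a, m)
                  for a, m in zip(automated, manual))


# Toy bearings in degrees; not data from the Alonsa, Manitoba, case.
automated = [350.0, 10.0, 95.0, 180.0, 270.0]
manual = [5.0, 355.0, 90.0, 200.0, 262.0]

print(median_direction_difference(automated, manual))  # 15.0
```

Without the wraparound step, the first pair (350° versus 5°) would count as a 345° error rather than 15°.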
Abstract
Airborne Doppler radar provides detailed and targeted observations of winds and precipitation in weather systems over remote or difficult-to-access regions that can help to improve scientific understanding and weather forecasts. Quality control (QC) is necessary to remove nonweather echoes from raw radar data for subsequent analysis. The complex decision-making ability of the machine learning random-forest technique is employed to create a generalized QC method for airborne radar data in convective weather systems. A manually QCed dataset containing data from the Electra Doppler Radar (ELDORA) in mature and developing tropical cyclones, a tornadic supercell, and a bow echo was used to train the model. Successful classification of ∼96% and ∼93% of weather and nonweather radar gates, respectively, in withheld testing data indicates the generalizability of the method. Dual-Doppler analysis from the genesis phase of Hurricane Ophelia (2005) using data not previously seen by the model produced a wind field comparable to that from manual QC. The framework demonstrates a proof of concept that can be applied to newer airborne Doppler radars.
Significance Statement
Airborne Doppler radar is an invaluable tool for making detailed measurements of wind and precipitation in weather systems over remote or difficult-to-access regions, such as hurricanes over the ocean. Using the collected radar data depends strongly on quality control (QC) procedures to classify weather and nonweather radar echoes and to then remove the latter before subsequent analysis or assimilation into numerical weather prediction models. Prior QC techniques require interactive editing and subjective classification by trained researchers and can demand considerable time for even small amounts of data. We present a new machine learning algorithm that is trained on past QC efforts from radar experts, resulting in an accurate, fast technique that requires far less user input and can greatly reduce the time required for QC. The new technique uses a random forest, a machine learning model composed of decision trees, to classify weather and nonweather radar echoes. Continued efforts to build on this technique could benefit future weather forecasts by quickly and accurately quality-controlling data from other airborne radars for research or operational meteorology.
Abstract
Explainable artificial intelligence (XAI) methods are becoming popular tools for scientific discovery in the Earth and atmospheric sciences. While these techniques have the potential to revolutionize the scientific process, there are known limitations in their applicability that are frequently ignored. These limitations include the fact that XAI methods explain the behavior of the AI model rather than the behavior of the training dataset, and the need for caution when these methods are applied to datasets with correlated and dependent features. Here, we explore the potential cost of ignoring these limitations with a simple case study from the atmospheric chemistry literature: learning the reaction rate of a bimolecular reaction. We demonstrate that dependent and highly correlated input features can lead to spurious process-level explanations. We posit that the current generation of XAI techniques should largely be used only for understanding system-level behavior, and we recommend caution when using XAI methods for process-level scientific discovery in the Earth and atmospheric sciences.
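The concern about dependent features can be illustrated with a toy calculation. The sketch below is not the case study from the paper: it assumes the standard bimolecular rate law, rate = k[A][B], and a hypothetical training set in which [B] is always a fixed multiple of [A]. The constants `K` and `C` and both model functions are made up for illustration. Under these assumptions, a model that ignores [B] entirely reproduces the training data exactly, so an explanation of that model would report that [B] does not matter.

```python
K = 2.0  # hypothetical rate constant
C = 3.0  # hypothetical fixed ratio: [B] = C * [A] in every training sample


def rate_true(a, b):
    """Bimolecular rate law: depends on both concentrations."""
    return K * a * b


def rate_shortcut(a, b):
    """Alternative model that ignores b and relies on the dependence b = C * a."""
    return K * C * a * a


# Training samples in which the two inputs are perfectly dependent.
train = [(a, C * a) for a in (0.5, 1.0, 1.5, 2.0)]

# Largest disagreement between the two models on the training data.
max_gap = max(abs(rate_true(a, b) - rate_shortcut(a, b)) for a, b in train)


def sensitivity_to_b(model, a, b, frac=0.1):
    """Relative change in the model output when b alone is increased by frac."""
    base = model(a, b)
    return (model(a, b * (1.0 + frac)) - base) / base


a0, b0 = train[1]
s_true = sensitivity_to_b(rate_true, a0, b0)
s_shortcut = sensitivity_to_b(rate_shortcut, a0, b0)

print("max training disagreement:", max_gap)
print("sensitivity to [B], rate law:", s_true)
print("sensitivity to [B], shortcut model:", s_shortcut)
```

Both functions fit the training data equally well, yet they assign very different importance to [B]. The training data alone cannot distinguish them, which is why an explanation describes the fitted model and not necessarily the underlying process.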