Abstract
This paper illustrates the lessons learned as we applied the U-Net3+ deep learning model to the task of building an operational model for predicting wildfire occurrence for the contiguous United States (CONUS) in the 1–10-day range. Through the lens of model performance, we explore the reasons for performance improvements made possible by the model. Lessons include the importance of labeling, the impact of information loss in input variables, and the role of operational considerations in the modeling process. This work offers lessons learned for other interdisciplinary researchers working at the intersection of deep learning and fire occurrence prediction with an eye toward operationalization.
Abstract
Weather radar data are critical for nowcasting and an integral component of numerical weather prediction models. While weather radar data provide valuable information at high resolution, their ground-based nature limits their availability, which impedes large-scale applications. In contrast, meteorological satellites cover larger domains but with coarser resolution. However, with the rapid advancements in data-driven methodologies and modern sensors aboard geostationary satellites, new opportunities are emerging to bridge the gap between ground- and space-based observations, ultimately leading to more skillful weather prediction with high accuracy. Here, we present a transformer-based model for nowcasting ground-based radar image sequences using satellite data up to 2-h lead time. Trained on a dataset reflecting severe weather conditions, the model predicts radar fields occurring under different weather phenomena and shows robustness against rapidly growing/decaying fields and complex field structures. Model interpretation reveals that the infrared channel centered at 10.3 μm (C13) contains skillful information for all weather conditions, while lightning data have the highest relative feature importance in severe weather conditions, particularly in shorter lead times. The model can support precipitation nowcasting across large domains without an explicit need for radar towers, enhance numerical weather prediction and hydrological models, and provide radar proxy for data-scarce regions. Moreover, the open-source framework facilitates progress toward operational data-driven nowcasting.
Significance Statement
Ground-based weather radar data are essential for nowcasting, but data availability limitations hamper usage of radar data across large domains. We present a machine learning model, rooted in transformer architecture, that performs nowcasting of radar data using high-resolution geostationary satellite retrievals, for lead times of up to 2 h. Our model captures the spatiotemporal dynamics of radar fields from satellite data and offers accurate forecasts. Analysis indicates that the infrared channel centered at 10.3 μm provides useful information for nowcasting radar fields under various weather conditions. However, lightning activity exhibits the highest forecasting skill for severe weather at short lead times. Our findings show the potential of transformer-based models for nowcasting severe weather.
Abstract
Virtually all aspects of our societal functioning, from food security to energy supply to healthcare, depend on the dynamics of environmental factors. Nevertheless, the social dimensions of weather and climate are noticeably less explored by the artificial intelligence community. By harnessing the strength of geometric deep learning (GDL), we aim to investigate a pressing societal question: the potentially disproportionate impacts of air quality on COVID-19 clinical severity. To quantify air pollution levels, we use aerosol optical depth (AOD) records that measure the reduction of sunlight due to atmospheric haze, dust, and smoke. We also introduce unique and not yet broadly available NASA satellite records (NASAdat) on AOD, temperature, and relative humidity and discuss the utility of these new data for biosurveillance and climate justice applications, with a specific focus on COVID-19 in the states of Texas and Pennsylvania. The results indicate, in general, that poorer air quality tends to be associated with higher rates of clinical severity and, in the case of Texas, that this phenomenon particularly stands out in counties characterized by higher socioeconomic vulnerability. This, in turn, raises a concern of environmental injustice in these socioeconomically disadvantaged communities. Furthermore, given that one of NASA’s recent long-term commitments is to address such inequitable burden of environmental harm by expanding the use of Earth science data such as NASAdat, this project is one of the first steps toward developing a new platform integrating NASA’s satellite observations with deep learning (DL) tools for social good.
Significance Statement
By leveraging the strengths of modern deep learning models, particularly graph neural networks, to describe complex spatiotemporal dependencies and by introducing new NASA satellite records, this study aims to investigate the problem of potential environmental injustice associated with COVID-19 clinical severity and caused by disproportionate impacts of poor air quality on socioeconomically disadvantaged populations.
Abstract
Explainable artificial intelligence (XAI) methods are becoming popular tools for scientific discovery in the Earth and atmospheric sciences. While these techniques have the potential to revolutionize the scientific process, there are known limitations in their applicability that are frequently ignored. These limitations include that XAI methods explain the behavior of the AI model and not the behavior of the training dataset, and that caution should be used when these methods are applied to datasets with correlated and dependent features. Here, we explore the potential cost associated with ignoring these limitations with a simple case study from the atmospheric chemistry literature – learning the reaction rate of a bimolecular reaction. We demonstrate that dependent and highly correlated input features can lead to spurious process-level explanations. We posit that the current generation of XAI techniques should largely only be used for understanding system-level behavior and recommend caution when using XAI methods for process-level scientific discovery in the Earth and atmospheric sciences.
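The point about dependent features can be illustrated with a small sketch. It is not the authors' experiment: the rate constant `k`, the sampling range, the choice of perfectly dependent concentrations (`B = A`), and the helper `sensitivity_to_B` are all assumptions made for illustration. Two models fit the same bimolecular-rate data equally well, yet a simple sensitivity probe gives contradictory answers about whether reactant B matters.

```python
import numpy as np

k = 2.0  # hypothetical rate constant
rng = np.random.default_rng(0)

# Training data for rate = k * [A] * [B], with perfectly dependent inputs.
A = rng.uniform(0.1, 1.0, size=1000)
B = A.copy()  # assumption: B always equals A in the training set
rate = k * A * B

def model_true(a, b):
    """Model that has learned the real process: rate = k * a * b."""
    return k * a * b

def model_alt(a, b):
    """Model that ignores b entirely but still matches the training data."""
    return k * a ** 2

def sensitivity_to_B(model):
    """Mean output change when B is scaled by 1.5 while A is held fixed."""
    return float(np.mean(np.abs(model(A, B * 1.5) - model(A, B))))

# Both models reproduce the training data.
fits_true = np.allclose(model_true(A, B), rate)
fits_alt = np.allclose(model_alt(A, B), rate)

# The probe explains each model, not the underlying chemistry.
sens_true = sensitivity_to_B(model_true)
sens_alt = sensitivity_to_B(model_alt)
print(fits_true, fits_alt, sens_true > 0.0, sens_alt == 0.0)
```

Because the data cannot distinguish the two models, an explanation extracted from either one describes that model's behavior and need not reflect the process that generated the data.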
Abstract
Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic—primarily relying on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a reevaluation of the top-performing candidate models postretraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data—a few thousand samples—in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speedup. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).
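The two-step procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: ridge regression stands in for the neural network emulator, the single hyperparameter is the ridge penalty, and the function names, `subset_frac`, `top_k`, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression weights for penalty alpha."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def val_mse(w, X_val, y_val):
    """Mean squared error of weights w on the validation set."""
    return float(np.mean((X_val @ w - y_val) ** 2))

def two_step_hpo(X, y, X_val, y_val, candidates,
                 subset_frac=0.01, top_k=3, seed=0):
    rng = np.random.default_rng(seed)
    n_sub = max(1, int(subset_frac * len(X)))
    idx = rng.choice(len(X), size=n_sub, replace=False)

    # Step 1: screen every candidate cheaply on a small random subset.
    def screen_score(alpha):
        return val_mse(fit_ridge(X[idx], y[idx], alpha), X_val, y_val)
    shortlist = sorted(candidates, key=screen_score)[:top_k]

    # Step 2: retrain only the shortlisted candidates on the full data.
    final = {a: val_mse(fit_ridge(X, y, a), X_val, y_val) for a in shortlist}
    best = min(final, key=final.get)
    return best, shortlist, final

# Synthetic regression problem (assumption, for demonstration only).
data_rng = np.random.default_rng(42)
w_true = data_rng.normal(size=10)
X = data_rng.normal(size=(5000, 10))
y = X @ w_true + 0.1 * data_rng.normal(size=5000)
X_val = data_rng.normal(size=(1000, 10))
y_val = X_val @ w_true + 0.1 * data_rng.normal(size=1000)

candidates = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0, 1e4]
best, shortlist, final = two_step_hpo(X, y, X_val, y_val, candidates)
print(best, shortlist)
```

Only `top_k` candidates are ever trained on the full dataset, which is where the savings come from. The achievable speedup depends on the model, the subset size, and the search algorithm wrapped around the screening step.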
Abstract
Methods of explainable artificial intelligence (XAI) are used in geoscientific applications to gain insights into the decision-making strategy of neural networks (NNs), highlighting which features in the input contribute the most to a NN prediction. Here, we discuss our “lesson learned” that the task of attributing a prediction to the input does not have a single solution. Instead, the attribution results depend greatly on the considered baseline that the XAI method utilizes—a fact that has been overlooked in the geoscientific literature. The baseline is a reference point to which the prediction is compared so that the prediction can be understood. This baseline can be chosen by the user or is set by construction in the method’s algorithm—often without the user being aware of that choice. We highlight that different baselines can lead to different insights for different science questions and, thus, should be chosen accordingly. To illustrate the impact of the baseline, we use a large ensemble of historical and future climate simulations forced with the shared socioeconomic pathway 3-7.0 (SSP3-7.0) scenario and train a fully connected NN to predict the ensemble- and global-mean temperature (i.e., the forced global warming signal) given an annual temperature map from an individual ensemble member. We then use various XAI methods and different baselines to attribute the network predictions to the input. We show that attributions differ substantially when considering different baselines, because they correspond to answering different science questions. We conclude by discussing important implications and considerations about the use of baselines in XAI research.
Significance Statement
In recent years, methods of explainable artificial intelligence (XAI) have been widely used in geoscientific applications, because they can be used to attribute the predictions of neural networks (NNs) to the input and interpret them physically. Here, we highlight that the attributions, and their physical interpretation, depend greatly on the choice of the baseline, a fact that has been overlooked in the geoscientific literature. We illustrate this dependence for a specific climate task, in which a NN is trained to predict the ensemble- and global-mean temperature (i.e., the forced global warming signal) given an annual temperature map from an individual ensemble member. We show that attributions differ substantially when considering different baselines, because they correspond to answering different science questions.
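The baseline dependence can be seen even in a toy setting. The sketch below assumes a linear model rather than the fully connected NN used in the study; for a linear model, integrated gradients reduces exactly to `w * (x - baseline)`. The weights, the input, and both baselines are hypothetical values chosen for illustration.

```python
import numpy as np

def linear_model(x, w, b):
    """Toy stand-in for a trained network: f(x) = w . x + b."""
    return float(w @ x + b)

def attribute(x, baseline, w):
    """Integrated-gradients attribution for a linear model."""
    return w * (x - baseline)

w = np.array([0.5, -1.0, 2.0])  # hypothetical weights
b = 0.1
x = np.array([1.0, 2.0, 3.0])   # hypothetical input

zero_baseline = np.zeros_like(x)            # "compared with an all-zero input"
mean_baseline = np.array([1.0, 1.0, 1.0])   # hypothetical "typical" input

attr_zero = attribute(x, zero_baseline, w)
attr_mean = attribute(x, mean_baseline, w)

# Completeness: attributions sum to f(x) - f(baseline) in each case.
gap_zero = linear_model(x, w, b) - linear_model(zero_baseline, w, b)
gap_mean = linear_model(x, w, b) - linear_model(mean_baseline, w, b)
print(attr_zero, attr_mean, gap_zero, gap_mean)
```

The first feature receives a nonzero attribution under the zero baseline and exactly zero under the second baseline, because the input matches that baseline in that feature. Each result answers a different question: what the prediction owes to the input relative to zero, versus relative to a typical state.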