Abstract
In many regions of the world, tornadoes travel through forested areas with low population densities, making downed trees the only observable damage indicator. Current methods in the EF scale for analyzing tree damage may not reflect the true intensity of some tornadoes. However, new methods have been developed that use the number of trees downed or treefall directions from high-resolution aerial imagery to provide an estimate of maximum wind speed. Treefall Identification and Direction Analysis (TrIDA) maps are used to identify areas of treefall damage and treefall directions along the damage path. Currently, TrIDA maps are generated manually, but this is labor-intensive, often taking several days or weeks. To address this, this paper describes a machine learning– and image-processing-based model that automatically extracts fallen trees from large-scale aerial imagery, assesses their fall directions, and produces an area-averaged treefall vector map with minimal initial human interaction. The automated model achieves a median tree direction difference of 13.3° when compared to the manual tree directions from the Alonsa, Manitoba, tornado, demonstrating the viability of the automated model compared to manual assessment. Overall, automating the production of treefall vector maps from large-scale aerial imagery reduces the time and labor required to create a TrIDA map from days or weeks to a matter of hours.
Significance Statement
The automation of treefall detection and direction analysis is significant to the analysis of tornado paths and intensities. Previously, it would have taken a researcher multiple days to weeks to manually count and assess the directions of fallen trees in large-scale aerial photography of tornado damage. Through automation, analysis takes a matter of hours, with minimal initial human interaction. Tornado researchers around the world will be able to use this automated process to help analyze tornadoes and assess their enhanced Fujita (EF) scale ratings.
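The "median tree direction difference" quoted in the abstract above compares automated and manual fall directions tree by tree. The abstract does not define the metric, so the following is only a minimal sketch under two assumptions: directions are full compass headings in degrees (0 to 360), and the difference is the smallest angle between two headings. The function names and the sample headings are hypothetical.

```python
import statistics


def direction_difference(a_deg, b_deg):
    """Smallest absolute angle between two headings, in the range [0, 180] degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    # Going the other way around the circle may be shorter (e.g. 350 vs 10).
    return min(diff, 360.0 - diff)


def median_direction_difference(auto_deg, manual_deg):
    """Median of the per-tree angular differences between two matched lists."""
    if len(auto_deg) != len(manual_deg):
        raise ValueError("both lists must describe the same trees")
    return statistics.median(
        direction_difference(a, m) for a, m in zip(auto_deg, manual_deg)
    )


# Hypothetical headings for three matched trees (not data from the paper).
auto = [350.0, 100.0, 0.0]
manual = [10.0, 95.0, 90.0]
print(median_direction_difference(auto, manual))  # 20.0
```

Wrapping the difference around the circle matters: a naive subtraction would score headings of 350° and 10° as 340° apart rather than 20°.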
Abstract
Airborne Doppler radar provides detailed and targeted observations of winds and precipitation in weather systems over remote or difficult-to-access regions that can help to improve scientific understanding and weather forecasts. Quality control (QC) is necessary to remove nonweather echoes from raw radar data for subsequent analysis. The complex decision-making ability of the machine learning random-forest technique is employed to create a generalized QC method for airborne radar data in convective weather systems. The model was trained on a manually QCed dataset containing data from the Electra Doppler Radar (ELDORA) in mature and developing tropical cyclones, a tornadic supercell, and a bow echo. Successful classification of ∼96% and ∼93% of weather and nonweather radar gates, respectively, in withheld testing data indicates the generalizability of the method. Dual-Doppler analysis from the genesis phase of Hurricane Ophelia (2005) using data not previously seen by the model produced a comparable wind field to that from manual QC. The framework demonstrates a proof of concept that can be applied to newer airborne Doppler radars.
Significance Statement
Airborne Doppler radar is an invaluable tool for making detailed measurements of wind and precipitation in weather systems over remote or difficult-to-access regions, such as hurricanes over the ocean. Using the collected radar data depends strongly on quality control (QC) procedures to classify weather and nonweather radar echoes and to then remove the latter before subsequent analysis or assimilation into numerical weather prediction models. Prior QC techniques require interactive editing and subjective classification by trained researchers and can demand considerable time for even small amounts of data. We present a new machine learning algorithm that is trained on past QC efforts from radar experts, resulting in an accurate, fast technique that requires far less user input and can greatly reduce the time needed for QC. The new technique uses a random forest, a machine learning model composed of decision trees, to classify weather and nonweather radar echoes. Continued efforts to build on this technique could benefit future weather forecasts by quickly and accurately quality-controlling data from other airborne radars for research or operational meteorology.
Abstract
Explainable artificial intelligence (XAI) methods are becoming popular tools for scientific discovery in the Earth and atmospheric sciences. While these techniques have the potential to revolutionize the scientific process, there are known limitations in their applicability that are frequently ignored. These limitations include that XAI methods explain the behavior of the AI model and not the behavior of the training dataset, and that caution should be used when these methods are applied to datasets with correlated and dependent features. Here, we explore the potential cost associated with ignoring these limitations with a simple case study from the atmospheric chemistry literature – learning the reaction rate of a bimolecular reaction. We demonstrate that dependent and highly correlated input features can lead to spurious process-level explanations. We posit that the current generation of XAI techniques should largely only be used for understanding system-level behavior and recommend caution when using XAI methods for process-level scientific discovery in the Earth and atmospheric sciences.
Abstract
Statistical postprocessing is used to translate ensembles of raw numerical weather forecasts into reliable probabilistic forecast distributions. In this study, we examine the use of permutation-invariant neural networks for this task. In contrast to previous approaches, which often operate on ensemble summary statistics and dismiss details of the ensemble distribution, we propose networks that treat forecast ensembles as a set of unordered member forecasts and learn link functions that are by design invariant to permutations of the member ordering. We evaluate the quality of the obtained forecast distributions in terms of calibration and sharpness and compare the models against classical and neural network–based benchmark methods. In case studies addressing the postprocessing of surface temperature and wind gust forecasts, we demonstrate state-of-the-art prediction quality. To deepen the understanding of the learned inference process, we further propose a permutation-based importance analysis for ensemble-valued predictors, which highlights specific aspects of the ensemble forecast that are considered important by the trained postprocessing models. Our results suggest that most of the relevant information is contained in a few ensemble-internal degrees of freedom, which may impact the design of future ensemble forecasting and postprocessing systems.
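The permutation invariance described in the abstract above can be illustrated with a small pooling example. This is a sketch of the general idea only, not the authors' networks: each member is passed through the same encoder and the encodings are averaged, so the result cannot depend on member order. The fixed hand-written encoder, the Gaussian output, and all function names are assumptions made for illustration; in a trained network the encoder and the output mapping would be learned.

```python
import math


def encode_member(x):
    """Shared per-member encoder (hypothetical fixed features, not learned weights)."""
    return [x, x * x, math.tanh(x)]


def pool(ensemble):
    """Average the encodings over members; averaging ignores member order."""
    encoded = [encode_member(x) for x in ensemble]
    n = len(encoded)
    # math.fsum gives a correctly rounded sum, so the result does not
    # depend on the order in which members are added.
    return [math.fsum(column) / n for column in zip(*encoded)]


def predict(ensemble):
    """Map pooled features to the (mean, std) of a Gaussian forecast distribution."""
    first_moment, second_moment, _ = pool(ensemble)
    variance = max(second_moment - first_moment * first_moment, 0.0)
    return first_moment, math.sqrt(variance)


# Hypothetical ensemble of temperature forecasts.
members = [1.5, 2.0, 0.5, 3.0]
same_members_reordered = [3.0, 0.5, 2.0, 1.5]
print(predict(members) == predict(same_members_reordered))  # True
```

Because the pooling step is a mean, the same function also accepts ensembles of different sizes without any change.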
Abstract
Statistical postprocessing of global ensemble weather forecasts is revisited by leveraging recent developments in machine learning. Verification of past forecasts is exploited to learn systematic deficiencies of numerical weather predictions in order to boost postprocessed forecast performance. Here, we introduce postprocessing of ensembles with transformers (PoET), a postprocessing approach based on hierarchical transformers. PoET has two major characteristics: 1) the postprocessing is applied directly to the ensemble members rather than to a predictive distribution or a functional of it, and 2) the method is ensemble-size agnostic in the sense that the number of ensemble members in training and inference mode can differ. The PoET output is a set of calibrated members that has the same size as the original ensemble but with improved reliability. Performance assessments show that PoET can bring up to 20% improvement in skill globally for 2-m temperature and 2% for precipitation forecasts and outperforms the simpler statistical member-by-member method, used here as a competitive benchmark. PoET is also applied to the ENS-10 benchmark dataset for ensemble postprocessing and provides better results when compared to other deep learning solutions that are evaluated for most parameters. Furthermore, because each ensemble member is calibrated separately, downstream applications should directly benefit from the improvement made on the ensemble forecast with postprocessing.
Abstract
Low-level marine clouds play a pivotal role in Earth’s weather and climate through their interactions with radiation, heat and moisture transport, and the hydrological cycle. These interactions depend on a range of dynamical and microphysical processes that result in a broad diversity of cloud types and spatial structures, and a comprehensive understanding of cloud morphology is critical for continued improvement of our atmospheric modeling and prediction capabilities moving forward. Deep learning has recently accelerated our ability to study clouds using satellite remote sensing, and machine learning classifiers have enabled detailed studies of cloud morphology. A major limitation of deep learning approaches to this problem, however, is the large number of hand-labeled samples that are required for training. This work applies a recently developed self-supervised learning scheme to train a deep convolutional neural network (CNN) to map marine cloud imagery to vector embeddings that capture information about mesoscale cloud morphology and can be used for satellite image classification. The model is evaluated against existing cloud classification datasets and several use cases are demonstrated, including training cloud classifiers with very few labeled samples, interrogation of the CNN’s learned internal feature representations, cross-instrument application, and resilience against sensor calibration drift and changing scene brightness. The self-supervised approach learns meaningful internal representations of cloud structures and achieves comparable classification accuracy to supervised deep learning methods without the expense of creating large hand-annotated training datasets.
Significance Statement
Marine clouds heavily influence Earth’s weather and climate, and improved understanding of marine clouds is required to improve our atmospheric modeling capabilities and physical understanding of the atmosphere. Recently, deep learning has emerged as a powerful research tool that can be used to identify and study specific marine cloud types in the vast number of images collected by Earth-observing satellites. While powerful, these approaches require hand-labeling of training data, which is prohibitively time intensive. This study evaluates a recently developed self-supervised deep learning method that does not require human-labeled training data for processing images of clouds. We show that the trained algorithm performs competitively with algorithms trained on hand-labeled data for image classification tasks. We also discuss potential downstream uses and demonstrate some exciting features of the approach including application to multiple satellite instruments, resilience against changing image brightness, and its learned internal representations of cloud types. The self-supervised technique removes one of the major hurdles for applying deep learning to very large atmospheric datasets.
Abstract
Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic, relying primarily on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a reevaluation of the top-performing candidate models after retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data (a few thousand samples) in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speedup. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).
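The two-step procedure in the abstract above (screen every candidate on a small subset, then retrain only the best few on the full training set) can be sketched as follows. This is an illustrative toy, not the authors' workflow: the model is a one-dimensional k-nearest-neighbour regressor, the only hyperparameter is k, and the data, subset size, and number of finalists are arbitrary assumptions. It does not attempt to reproduce the 0.0025% subset or the 135× speedup reported in the abstract.

```python
import math
import random


def knn_mse(k, train, validate):
    """Toy 'train and score' step: validation MSE of a 1-D k-NN regressor."""
    total = 0.0
    for x, y in validate:
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        pred = sum(p[1] for p in nearest) / len(nearest)
        total += (pred - y) ** 2
    return total / len(validate)


def two_step_hpo(candidates, train, validate, score, subset_size, top_k, seed=0):
    """Return (score, hyperparameter) of the best finalist; lower score is better."""
    rng = random.Random(seed)
    subset = rng.sample(train, subset_size)
    # Step 1: cheap screening of every candidate on the small subset.
    screened = sorted(candidates, key=lambda hp: score(hp, subset, validate))
    finalists = screened[:top_k]
    # Step 2: retrain only the finalists on the full training set.
    rescored = [(score(hp, train, validate), hp) for hp in finalists]
    return min(rescored, key=lambda pair: pair[0])


def make_data(n, rng):
    """Hypothetical noisy samples of sin(x) on [0, 6]."""
    points = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        points.append((x, math.sin(x) + rng.gauss(0.0, 0.2)))
    return points


data_rng = random.Random(42)
train = make_data(400, data_rng)
validate = make_data(50, data_rng)
candidates = [1, 2, 5, 10, 20, 50]
best_score, best_k = two_step_hpo(
    candidates, train, validate, knn_mse, subset_size=40, top_k=3
)
print(best_k, best_score)
```

The saving comes from step 2: only `top_k` configurations ever see the full training set, while the remaining candidates are discarded after the cheap screening pass. The `score` argument can be any search-algorithm-agnostic train-and-evaluate function with the same signature.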
Abstract
Machine learning models have been employed to perform either physics-free data-driven or hybrid dynamical downscaling of climate data. Most of these implementations operate over relatively small downscaling factors because of the challenge of recovering fine-scale information from coarse data. This limits their usefulness for downscaling many global climate model outputs, often available at between ∼50- and 100-km resolution, to scales of interest such as cloud-resolving or urban scales. This study systematically examines the capability of a type of superresolving convolutional neural network (SR-CNN) to downscale surface wind speed data over land from different coarse resolutions (25-, 48-, and 100-km resolution) to 3 km. For each downscaling factor, we consider three convolutional neural network (CNN) configurations that generate superresolved predictions of fine-scale wind speed, which take between one and three input fields: coarse wind speed, fine-scale topography, and diurnal cycle. In addition to fine-scale wind speeds, probability density function parameters are generated from which sample wind speeds can be drawn, accounting for the intrinsic stochasticity of wind speed. To assess generalization to new data, CNN models are tested on regions with different topography and climate that are unseen during training. The evaluation of superresolved predictions focuses on subgrid-scale variability and the recovery of extremes. Models with coarse wind and fine topography as inputs exhibit the best performance when compared with other model configurations operating across the same downscaling factor. Our diurnal cycle encoding results in lower out-of-sample generalizability when compared with other input configurations.
Abstract
With increasing interest in explaining machine learning (ML) models, this paper synthesizes many topics related to ML explainability. We distinguish explainability from interpretability, local from global explainability, and feature importance from feature relevance. We demonstrate and visualize different explanation methods, describe how to interpret them, and provide a complete Python package (scikit-explain) to allow future researchers and model developers to explore these explainability methods. The explainability methods include Shapley additive explanations (SHAP), Shapley additive global explanation (SAGE), and accumulated local effects (ALE). Our focus is primarily on Shapley-based techniques, which serve as a unifying framework for various existing methods to enhance model explainability. For example, SHAP unifies methods like local interpretable model-agnostic explanations (LIME) and tree interpreter for local explainability, while SAGE unifies the different variations of permutation importance for global explainability. We provide a short tutorial for explaining ML models using three disparate datasets: a convection-allowing model dataset for severe weather prediction, a nowcasting dataset for subfreezing road surface prediction, and satellite-based data for lightning prediction. In addition, we showcase the adverse effects that correlated features can have on the explainability of a model. Finally, we demonstrate the notion of evaluating model impacts of feature groups instead of individual features. Evaluating the feature groups mitigates the impacts of feature correlations and can provide a more holistic understanding of the model. All code, models, and data used in this study are freely available to accelerate the adoption of machine learning explainability in the atmospheric and other environmental sciences.
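Permutation importance, one of the global explainability methods named in the abstract above, can be sketched in a few lines. The version below is a generic single-feature variant written from scratch for illustration; it does not use or imitate the scikit-explain API, and the model, data, and function names are hypothetical. A feature's importance is taken as the average increase in mean squared error when that feature's column is shuffled.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions over the rows of X."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical model that uses feature 0 and ignores feature 1."""
    return 3.0 * row[0]


data_rng = random.Random(1)
X = [[data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)] for _ in range(50)]
y = [toy_model(row) for row in X]
print(permutation_importance(toy_model, X, y))
```

In this toy case the two features are independent, so the scores are easy to read: shuffling the ignored feature changes nothing. When features are correlated, shuffling one column creates unrealistic rows, which is one reason the abstract recommends evaluating groups of features together.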
Abstract
Recently, the U.S. Department of Energy (DOE), Office of Science, Biological and Environmental Research (BER), and Advanced Scientific Computing Research (ASCR) programs organized and held the Artificial Intelligence for Earth System Predictability (AI4ESP) workshop series. From this workshop, a critical conclusion that the DOE BER and ASCR community came to is the requirement to develop a new paradigm for Earth system predictability focused on enabling artificial intelligence (AI) across the field, laboratory, modeling, and analysis activities, called model experimentation (ModEx). BER’s ModEx is an iterative approach that enables process models to generate hypotheses. The developed hypotheses inform field and laboratory efforts to collect measurement and observation data, which are subsequently used to parameterize, drive, and test model (e.g., process based) predictions. A total of 17 technical sessions were held in this AI4ESP workshop series. This paper discusses the topic of the AI Architectures and Codesign session and associated outcomes. The AI Architectures and Codesign session included two invited talks, two plenary discussion panels, and three breakout rooms that covered specific topics, including 1) DOE high-performance computing (HPC) systems, 2) cloud HPC systems, and 3) edge computing and Internet of Things (IoT). We also provide forward-looking ideas and perspectives on potential research in this codesign area that can be achieved by synergies with the other 16 session topics. These ideas include topics such as 1) reimagining codesign, 2) data acquisition to distribution, 3) heterogeneous HPC solutions for integration of AI/ML and other data analytics like uncertainty quantification with Earth system modeling and simulation, and 4) AI-enabled sensor integration into Earth system measurements and observations. Such perspectives are a distinguishing aspect of this paper.
Significance Statement
This study aims to provide perspectives on AI architectures and codesign approaches for Earth system predictability. Such visionary perspectives are essential because AI-enabled model-data integration has shown promise in improving predictions associated with climate change, perturbations, and extreme events. Our forward-looking ideas guide what is next in codesign to enhance Earth system models, observations, and theory using state-of-the-art and futuristic computational infrastructure.