Browse: Artificial Intelligence for the Earth Systems
Items 61–70 of 168
Zied Ben Bouallègue, Jonathan A. Weyn, Mariana C. A. Clare, Jesper Dramsch, Peter Dueben, and Matthew Chantry

Abstract

Statistical postprocessing of global ensemble weather forecasts is revisited by leveraging recent developments in machine learning. Verification of past forecasts is exploited to learn systematic deficiencies of numerical weather predictions in order to boost postprocessed forecast performance. Here, we introduce postprocessing of ensembles with transformers (PoET), a postprocessing approach based on hierarchical transformers. PoET has two major characteristics: 1) the postprocessing is applied directly to the ensemble members rather than to a predictive distribution or a functional of it, and 2) the method is ensemble-size agnostic in the sense that the number of ensemble members in training and inference mode can differ. The PoET output is a set of calibrated members that has the same size as the original ensemble but with improved reliability. Performance assessments show that PoET can bring up to 20% improvement in skill globally for 2-m temperature and 2% for precipitation forecasts and outperforms the simpler statistical member-by-member method, used here as a competitive benchmark. PoET is also applied to the ENS-10 benchmark dataset for ensemble postprocessing and, for most parameters, provides better results than the other deep learning solutions evaluated there. Furthermore, because each ensemble member is calibrated separately, downstream applications should directly benefit from the improvement made on the ensemble forecast with postprocessing.
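The member-by-member benchmark mentioned in the abstract can be illustrated as a per-member shift-and-rescale around the ensemble mean, which, like PoET, returns a calibrated ensemble of the same size. This is a minimal sketch, not the paper's implementation; the coefficients `a` and `b` are hypothetical stand-ins for parameters that would be fit on past forecast/observation pairs:

```python
import numpy as np

def mbm_calibrate(members, a, b):
    """Member-by-member calibration sketch: shift the ensemble mean by `a`
    and rescale the spread around it by `b`, applied identically to every
    member so the output is again an ensemble of the original size."""
    mean = members.mean()
    return (mean + a) + b * (members - mean)

raw = np.array([2.0, 3.0, 4.0, 5.0])        # toy 4-member forecast
cal = mbm_calibrate(raw, a=-0.5, b=0.8)     # remove warm bias, shrink spread
# cal has the same ensemble size, a shifted mean, and 0.8x the spread
```

The point of the sketch is the property the abstract emphasizes: calibration acts on members, not on a fitted distribution, so any downstream application that consumes members works unchanged.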

Open access
Andrew Geiss, Matthew W. Christensen, Adam C. Varble, Tianle Yuan, and Hua Song

Abstract

Low-level marine clouds play a pivotal role in Earth’s weather and climate through their interactions with radiation, heat and moisture transport, and the hydrological cycle. These interactions depend on a range of dynamical and microphysical processes that result in a broad diversity of cloud types and spatial structures, and a comprehensive understanding of cloud morphology is critical for continued improvement of our atmospheric modeling and prediction capabilities moving forward. Deep learning has recently accelerated our ability to study clouds using satellite remote sensing, and machine learning classifiers have enabled detailed studies of cloud morphology. A major limitation of deep learning approaches to this problem, however, is the large number of hand-labeled samples that are required for training. This work applies a recently developed self-supervised learning scheme to train a deep convolutional neural network (CNN) to map marine cloud imagery to vector embeddings that capture information about mesoscale cloud morphology and can be used for satellite image classification. The model is evaluated against existing cloud classification datasets and several use cases are demonstrated, including training cloud classifiers with very few labeled samples, interrogation of the CNN’s learned internal feature representations, cross-instrument application, and resilience against sensor calibration drift and changing scene brightness. The self-supervised approach learns meaningful internal representations of cloud structures and achieves comparable classification accuracy to supervised deep learning methods without the expense of creating large hand-annotated training datasets.
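One use case above, training cloud classifiers from very few labeled samples, works because the frozen self-supervised embeddings already separate cloud types. A toy nearest-centroid sketch with synthetic stand-in embeddings (not the paper's data or network) illustrates the idea:

```python
import numpy as np

# Synthetic stand-ins for CNN embeddings: in the paper's workflow these would
# be vectors produced by the self-supervised network for each satellite scene.
rng = np.random.default_rng(0)
emb_a = rng.normal(loc=[0, 0], scale=0.1, size=(20, 2))  # one cloud-type cluster
emb_b = rng.normal(loc=[1, 1], scale=0.1, size=(20, 2))  # another cluster

# "Train" a classifier from only 3 labeled samples per class
centroid_a = emb_a[:3].mean(axis=0)
centroid_b = emb_b[:3].mean(axis=0)

def classify(e):
    """Assign the class whose few-shot centroid is closest in embedding space."""
    return 0 if np.linalg.norm(e - centroid_a) <= np.linalg.norm(e - centroid_b) else 1

preds = [classify(e) for e in np.vstack([emb_a[3:], emb_b[3:]])]
truth = [0] * 17 + [1] * 17
accuracy = np.mean(np.array(preds) == np.array(truth))
```

If the embedding space clusters by morphology, even a handful of labels per class suffices, which is why the self-supervised pretraining removes the hand-labeling bottleneck.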

Significance Statement

Marine clouds heavily influence Earth’s weather and climate, and improved understanding of marine clouds is required to improve our atmospheric modeling capabilities and physical understanding of the atmosphere. Recently, deep learning has emerged as a powerful research tool that can be used to identify and study specific marine cloud types in the vast number of images collected by Earth-observing satellites. While powerful, these approaches require hand-labeling of training data, which is prohibitively time intensive. This study evaluates a recently developed self-supervised deep learning method that does not require human-labeled training data for processing images of clouds. We show that the trained algorithm performs competitively with algorithms trained on hand-labeled data for image classification tasks. We also discuss potential downstream uses and demonstrate some exciting features of the approach including application to multiple satellite instruments, resilience against changing image brightness, and its learned internal representations of cloud types. The self-supervised technique removes one of the major hurdles for applying deep learning to very large atmospheric datasets.

Open access
Sungduk Yu, Po-Lun Ma, Balwinder Singh, Sam Silva, and Mike Pritchard

Abstract

Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic—primarily relying on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a reevaluation of the top-performing candidate models postretraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data—a few thousand samples—in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speedup. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).
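The two-step procedure described above can be sketched in a few lines. The evaluation callables here are hypothetical placeholders for "train on a small subset and score" and "retrain on the full dataset and score"; the toy scoring function is purely illustrative:

```python
def two_step_hpo(configs, eval_on_subset, eval_on_full, top_k=3):
    """Two-step HPO sketch: cheaply screen all candidate configurations on a
    small subset of the training data, then retrain and re-evaluate only the
    top-k survivors on the full dataset (higher score = better)."""
    # Step 1: preliminary evaluation on a small data subset
    screened = sorted(configs, key=eval_on_subset, reverse=True)[:top_k]
    # Step 2: full retraining of only the surviving candidates
    return max(screened, key=eval_on_full)

# Toy example: the "score" peaks at lr = 1e-3, and the subset score is
# assumed to be a good proxy for the full-data score.
configs = [{"lr": lr} for lr in (1e-4, 1e-3, 1e-2, 1e-1)]
subset_score = lambda c: -abs(c["lr"] - 1e-3)  # noisy proxy in practice
full_score = lambda c: -abs(c["lr"] - 1e-3)
best = two_step_hpo(configs, subset_score, full_score, top_k=2)
```

The efficiency gain comes from step 1 replacing most full-data training runs with subset runs; the abstract's reported ~135x speedup reflects how small that subset can be in a data-rich regime.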

Open access
Daniel Getter, Julie Bessac, Johann Rudi, and Yan Feng

Abstract

Machine learning models have been employed to perform either physics-free data-driven or hybrid dynamical downscaling of climate data. Most of these implementations operate over relatively small downscaling factors because of the challenge of recovering fine-scale information from coarse data. This limits their ability to bridge the gap between many global climate model outputs, often available at ∼50- to 100-km resolution, and scales of interest such as cloud-resolving or urban scales. This study systematically examines the capability of a type of superresolving convolutional neural network (SR-CNN) to downscale surface wind speed data over land from different coarse resolutions (25-, 48-, and 100-km resolution) to 3 km. For each downscaling factor, we consider three convolutional neural network (CNN) configurations that generate superresolved predictions of fine-scale wind speed, taking between one and three input fields: coarse wind speed, fine-scale topography, and diurnal cycle. In addition to fine-scale wind speeds, probability density function parameters are generated, from which sample wind speeds can be drawn, accounting for the intrinsic stochasticity of wind speed. For assessing generalization to new data, CNN models are tested on regions with different topography and climate that are unseen during training. The evaluation of superresolved predictions focuses on subgrid-scale variability and the recovery of extremes. Models with coarse wind and fine topography as inputs exhibit the best performance when compared with other model configurations operating at the same downscaling factor. Our diurnal cycle encoding results in lower out-of-sample generalizability when compared with other input configurations.
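The stochastic-output idea above, predicting PDF parameters from which wind speeds can be sampled, can be sketched as follows. The listing does not specify the distribution family, so a normal clipped at zero is an illustrative assumption, and the parameter arrays are toy values standing in for the network's per-pixel outputs:

```python
import numpy as np

def sample_wind(mu, sigma, n_samples, seed=0):
    """Draw stochastic fine-scale wind-speed samples from per-pixel PDF
    parameters (normal clipped at zero here; the paper's actual distribution
    family is an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(mu, sigma, size=(n_samples,) + np.shape(mu))
    return np.clip(draws, 0.0, None)  # wind speed is non-negative

mu = np.array([[3.0, 5.0], [4.0, 2.0]])     # predicted mean speed (m/s)
sigma = np.array([[0.5, 1.0], [0.8, 0.3]])  # predicted spread (m/s)
samples = sample_wind(mu, sigma, n_samples=1000)
```

Sampling rather than predicting a single deterministic field is what lets the method represent subgrid-scale variability and extremes, the two evaluation targets named in the abstract.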

Open access
Montgomery L. Flora, Corey K. Potvin, Amy McGovern, and Shawn Handler

Abstract

With increasing interest in explaining machine learning (ML) models, this paper synthesizes many topics related to ML explainability. We distinguish explainability from interpretability, local from global explainability, and feature importance versus feature relevance. We demonstrate and visualize different explanation methods, how to interpret them, and provide a complete Python package (scikit-explain) to allow future researchers and model developers to explore these explainability methods. The explainability methods include Shapley additive explanations (SHAP), Shapley additive global explanation (SAGE), and accumulated local effects (ALE). Our focus is primarily on Shapley-based techniques, which serve as a unifying framework for various existing methods to enhance model explainability. For example, SHAP unifies methods like local interpretable model-agnostic explanations (LIME) and tree interpreter for local explainability, while SAGE unifies the different variations of permutation importance for global explainability. We provide a short tutorial for explaining ML models using three disparate datasets: a convection-allowing model dataset for severe weather prediction, a nowcasting dataset for subfreezing road surface prediction, and satellite-based data for lightning prediction. In addition, we showcase the adverse effects that correlated features can have on the explainability of a model. Finally, we demonstrate the notion of evaluating model impacts of feature groups instead of individual features. Evaluating the feature groups mitigates the impacts of feature correlations and can provide a more holistic understanding of the model. All code, models, and data used in this study are freely available to accelerate the adoption of machine learning explainability in the atmospheric and other environmental sciences.
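The feature-group idea at the end of the abstract can be illustrated with a generic grouped permutation importance, sketched here in plain NumPy rather than via the scikit-explain API (whose exact interface this listing does not show); the `model` object and toy data are hypothetical:

```python
import numpy as np

def grouped_permutation_importance(model, X, y, groups, seed=0):
    """Permute whole feature *groups* jointly and measure the increase in MSE.
    Shuffling correlated features together avoids the misleading attributions
    that single-feature permutation gives when features co-vary. `model` is
    any object with a .predict(X) method."""
    rng = np.random.default_rng(seed)
    base = np.mean((model.predict(X) - y) ** 2)
    importances = {}
    for name, cols in groups.items():
        Xp = X.copy()
        Xp[:, cols] = Xp[rng.permutation(len(X))][:, cols]  # shuffle group jointly
        importances[name] = np.mean((model.predict(Xp) - y) ** 2) - base
    return importances

# Toy linear "model": predictions depend on columns 0 and 1, not column 2
class ToyModel:
    def predict(self, X):
        return X[:, 0] + X[:, 1]

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X[:, 0] + X[:, 1]
imp = grouped_permutation_importance(ToyModel(), X, y,
                                     {"used": [0, 1], "unused": [2]})
```

Permuting the used group degrades skill while the unused group leaves predictions untouched, which is the kind of holistic group-level attribution the abstract argues for.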

Open access
Maruti K. Mudunuru, James Ang, Mahantesh Halappanavar, Simon D. Hammond, Maya B. Gokhale, James C. Hoe, Tushar Krishna, Sarat Sreepathi, Matthew R. Norman, Ivy B. Peng, and Philip W. Jones

Abstract

Recently, the U.S. Department of Energy (DOE), Office of Science, Biological and Environmental Research (BER), and Advanced Scientific Computing Research (ASCR) programs organized and held the Artificial Intelligence for Earth System Predictability (AI4ESP) workshop series. From this workshop, a critical conclusion that the DOE BER and ASCR community came to is the requirement to develop a new paradigm for Earth system predictability focused on enabling artificial intelligence (AI) across the field, laboratory, modeling, and analysis activities, called model experimentation (ModEx). BER’s ModEx is an iterative approach that enables process models to generate hypotheses. The developed hypotheses inform field and laboratory efforts to collect measurement and observation data, which are subsequently used to parameterize, drive, and test model (e.g., process based) predictions. A total of 17 technical sessions were held in this AI4ESP workshop series. This paper discusses the topic of the AI Architectures and Codesign session and associated outcomes. The AI Architectures and Codesign session included two invited talks, two plenary discussion panels, and three breakout rooms that covered specific topics, including 1) DOE high-performance computing (HPC) systems, 2) cloud HPC systems, and 3) edge computing and Internet of Things (IoT). We also provide forward-looking ideas and perspectives on potential research in this codesign area that can be achieved by synergies with the other 16 session topics. These ideas include topics such as 1) reimagining codesign, 2) data acquisition to distribution, 3) heterogeneous HPC solutions for integration of AI/ML and other data analytics like uncertainty quantification with Earth system modeling and simulation, and 4) AI-enabled sensor integration into Earth system measurements and observations. Such perspectives are a distinguishing aspect of this paper.

Significance Statement

This study aims to provide perspectives on AI architectures and codesign approaches for Earth system predictability. Such visionary perspectives are essential because AI-enabled model-data integration has shown promise in improving predictions associated with climate change, perturbations, and extreme events. Our forward-looking ideas guide what is next in codesign to enhance Earth system models, observations, and theory using state-of-the-art and futuristic computational infrastructure.

Open access
Daniel Galea, Hsi-Yen Ma, Wen-Ying Wu, and Daigo Kobayashi

Abstract

The identification of atmospheric rivers (ARs) is crucial for weather and climate predictions as they are often associated with severe storm systems and extreme precipitation, which can cause large impacts on society. This study presents a deep learning model, termed ARDetect, for image segmentation of ARs using ERA5 data from 1960 to 2020 with labels obtained from the TempestExtremes tracking algorithm. ARDetect is a convolutional neural network (CNN)-based U-Net model, with its structure having been optimized using automatic hyperparameter tuning. Inputs to ARDetect were selected to be the integrated water vapor transport (IVT) and total column water (TCW) fields, as well as the AR mask from TempestExtremes from the previous time step to the one being considered. ARDetect achieved a mean intersection-over-union (mIoU) rate of 89.04% for ARs, indicating its high accuracy in identifying these weather patterns and superior performance to most deep learning–based models for AR detection. In addition, ARDetect runs faster than the TempestExtremes method (seconds versus minutes) for the same period. This provides a significant benefit for online AR detection, especially for high-resolution global models. An ensemble of 10 models, each trained on the same dataset but with different starting weights, was used to further improve on the performance of ARDetect, demonstrating the importance of model diversity in improving performance. ARDetect provides an effective and fast deep learning–based model for researchers and weather forecasters to better detect and understand ARs, which have significant impacts on weather-related events such as floods and droughts.
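The mIoU score reported above is built from a simple per-image quantity; a minimal sketch of intersection-over-union for binary AR masks (toy masks, not the paper's data):

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over union for binary segmentation masks, the per-image
    quantity averaged into the mIoU score."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return inter / union if union > 0 else 1.0  # empty-vs-empty counts as perfect

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)  # predicted AR pixels
true = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)  # labeled AR pixels
score = iou(pred, true)  # intersection = 2 pixels, union = 4 pixels
```

Averaging this over the AR class across the test set gives the 89.04% figure quoted in the abstract.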

Open access
Sebastian Scher and Gabriele Messori

Abstract

Recently, there has been a surge of research on data-driven weather forecasting systems, especially applications based on convolutional neural networks (CNNs). These are usually trained on atmospheric data represented on regular latitude–longitude grids, neglecting the curvature of Earth. We assess the benefit of replacing the standard convolution operations with an adapted convolution operation that takes into account the geometry of the underlying data (SphereNet convolution), specifically near the poles. Additionally, we assess the effect of including the information that the two hemispheres of Earth have “flipped” properties—for example, cyclones circulating in opposite directions—into the structure of the network. Both approaches are examples of physics-informed machine learning. The methods are tested on the WeatherBench dataset, at a resolution of ∼1.4°, which is higher than many previous studies on CNNs for weather forecasting. For most lead times up to day +10 for 500-hPa geopotential and 850-hPa temperature, we find that using SphereNet convolution or including hemisphere-specific information individually leads to improvement in forecast skill. Combining the two methods typically gives the highest forecast skill. Our version of SphereNet is implemented flexibly and scales well to high-resolution datasets but is still significantly more expensive than a standard convolution operation. Finally, we analyze cases with high forecast error. These occur mainly in winter and are relatively consistent across different training realizations of the networks, pointing to flow-dependent atmospheric predictability.
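The "flipped hemispheres" symmetry above can be made concrete with a small sketch: mirroring a southern-hemisphere patch in latitude, and negating sign-sensitive fields to account for the opposite cyclone rotation. How the paper wires this symmetry into the network structure is not detailed in this listing, so this only illustrates the symmetry itself:

```python
import numpy as np

def hemisphere_flip(field, sign=1):
    """Map a southern-hemisphere patch into northern-hemisphere orientation by
    mirroring along the latitude axis; fields whose sign depends on hemisphere
    (e.g., relative vorticity) are negated via sign=-1."""
    return sign * field[::-1, :]  # axis 0 = latitude, axis 1 = longitude

vorticity_sh = np.array([[1.0, -2.0],
                         [3.0,  4.0]])           # toy SH vorticity patch
vorticity_as_nh = hemisphere_flip(vorticity_sh, sign=-1)
```

Exploiting this symmetry effectively doubles the training data seen by shared weights, one way such physics-informed structure can improve forecast skill.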

Open access
Donifan Barahona, Katherine H. Breen, Heike Kalesse-Los, and Johannes Röttenbacher

Abstract

Atmospheric models with typical resolution in the tenths of kilometers cannot resolve the dynamics of air parcel ascent, which varies on scales ranging from tens to hundreds of meters. Small-scale wind fluctuations are thus characterized by a subgrid distribution of vertical wind velocity W with standard deviation σW . The parameterization of σW is fundamental to the representation of aerosol–cloud interactions, yet it is poorly constrained. Using a novel deep learning technique, this work develops a new parameterization for σW merging data from global storm-resolving model simulations, high-frequency retrievals of W, and climate reanalysis products. The parameterization reproduces the observed statistics of σW and leverages learned physical relations from the model simulations to guide extrapolation beyond the observed domain. Incorporating observational data during the training phase was found to be critical for its performance. The parameterization can be applied online within large-scale atmospheric models, or offline using output from weather forecasting and reanalysis products.
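The target quantity σW above is simply the subgrid standard deviation of vertical velocity; a minimal sketch of computing it from a high-frequency W series (the window length standing in for one grid-scale interval is an illustrative choice):

```python
import numpy as np

def sigma_w(w_highfreq, window):
    """Standard deviation of vertical velocity W over non-overlapping windows,
    i.e., the subgrid sigma_W that the parameterization predicts. `w_highfreq`
    is a 1D series of high-frequency W values; `window` is the number of
    samples spanning one model grid-scale interval."""
    n = len(w_highfreq) // window
    chunks = np.reshape(w_highfreq[: n * window], (n, window))
    return chunks.std(axis=1)

w = np.array([0.0, 0.2, -0.2, 0.0,    # quiescent interval
              1.0, -1.0, 1.0, -1.0])  # turbulent interval
sig = sigma_w(w, window=4)
```

Values like these, derived from high-frequency retrievals of W, are what the deep learning parameterization is trained to reproduce from coarse model state variables.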

Significance Statement

Vertical air motion plays a crucial role in several atmospheric processes, such as cloud droplet and ice crystal formation. However, it often occurs at scales smaller than those resolved by standard atmospheric models, leading to uncertainties in climate predictions. To address this, we present a novel deep learning approach that synthesizes data from various sources, providing a representation of small-scale vertical wind velocity suitable for integration into atmospheric models. Our method demonstrates high accuracy when compared to observation-based retrievals, offering potential to mitigate uncertainties and enhance climate forecasting.

Open access
Katherine Haynes, Jason Stock, Jack Dostalek, Charles Anderson, and Imme Ebert-Uphoff

Abstract

Vertical profiles of temperature and dewpoint are useful in predicting deep convection that leads to severe weather, which threatens property and lives. Currently, forecasters rely on observations from radiosonde launches and numerical weather prediction (NWP) models. Radiosonde observations are, however, temporally and spatially sparse, and NWP models contain inherent errors that influence short-term predictions of high impact events. This work explores using machine learning (ML) to postprocess NWP model forecasts, combining them with satellite data to improve vertical profiles of temperature and dewpoint. We focus on different ML architectures, loss functions, and input features to optimize predictions. Because we are predicting vertical profiles at 256 levels in the atmosphere, this work provides a unique perspective on using ML for 1D tasks. Compared to baseline profiles from the Rapid Refresh (RAP), ML predictions offer the largest improvement for dewpoint, particularly in the middle and upper atmosphere. Temperature improvements are modest, but CAPE values are improved by up to 40%. Feature importance analyses indicate that the ML models are primarily improving incoming RAP biases. While additional model and satellite data offer some improvement to the predictions, architecture choice is more important than feature selection in fine-tuning the results. Our proposed deep residual U-Net performs the best by leveraging spatial context from the input RAP profiles; however, the results are remarkably robust across model architecture. Further, uncertainty estimates for every level are well calibrated and can provide useful information to forecasters.

Open access