Search Results

You are looking at 1 - 10 of 11 items for Author or Editor: Imme Ebert-Uphoff
Imme Ebert-Uphoff and Kyle Hilburn

Capsule

This article discusses strategies for the development of neural networks (aka deep learning) for meteorological applications. Topics include evaluation, tuning, and interpretation of neural networks for working with meteorological images.

Full access
Imme Ebert-Uphoff and Kyle Hilburn

Abstract

The method of neural networks (aka deep learning) has opened up many new opportunities to utilize remotely sensed images in meteorology. Common applications include image classification, e.g., to determine whether an image contains a tropical cyclone, and image-to-image translation, e.g., to emulate radar imagery for satellites that only have passive channels. However, there are yet many open questions regarding the use of neural networks for working with meteorological images, such as best practices for evaluation, tuning, and interpretation. This article highlights several strategies and practical considerations for neural network development that have not yet received much attention in the meteorological community, such as the concept of receptive fields, underutilized meteorological performance measures, and methods for neural network interpretation, such as synthetic experiments and layer-wise relevance propagation. We also consider the process of neural network interpretation as a whole, recognizing it as an iterative meteorologist-driven discovery process that builds on experimental design and hypothesis generation and testing. Finally, while most work on neural network interpretation in meteorology has so far focused on networks for image classification tasks, we expand the focus to also include networks for image-to-image translation.
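One strategy the abstract highlights is the concept of receptive fields. As a minimal sketch (our own illustration; the function name and recurrence bookkeeping are not from the article), the receptive field of a stack of convolutional layers can be computed from each layer's kernel size and stride:

```python
# Sketch: receptive-field size of stacked convolutional layers.
# Illustrates the concept named in the abstract; this helper is our own.
def receptive_field(layers):
    """layers: list of (kernel_size, stride) tuples, input to output."""
    r, j = 1, 1  # receptive field and jump (cumulative stride) at the input
    for k, s in layers:
        r += (k - 1) * j  # each layer widens the field by (k-1) input steps
        j *= s            # strides compound the spacing between outputs
    return r

# Three 3x3 convolutions with stride 1 see a 7x7 patch of the input.
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # → 7
```

The takeaway for meteorological images: a network can only use spatial context that fits inside this window, so the architecture must be deep (or strided) enough for the phenomena of interest.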

Full access
Imme Ebert-Uphoff and Yi Deng

Abstract

Causal discovery seeks to recover cause–effect relationships from statistical data using graphical models. One goal of this paper is to provide an accessible introduction to causal discovery methods for climate scientists, with a focus on constraint-based structure learning. Second, in a detailed case study constraint-based structure learning is applied to derive hypotheses of causal relationships between four prominent modes of atmospheric low-frequency variability in boreal winter including the Western Pacific Oscillation (WPO), Eastern Pacific Oscillation (EPO), Pacific–North America (PNA) pattern, and North Atlantic Oscillation (NAO). The results are shown in the form of static and temporal independence graphs also known as Bayesian Networks. It is found that WPO and EPO are nearly indistinguishable from the cause–effect perspective as strong simultaneous coupling is identified between the two. In addition, changes in the state of EPO (NAO) may cause changes in the state of NAO (PNA) approximately 18 (3–6) days later. These results are not only consistent with previous findings on dynamical processes connecting different low-frequency modes (e.g., interaction between synoptic and low-frequency eddies) but also provide the basis for formulating new hypotheses regarding the time scale and temporal sequencing of dynamical processes responsible for these connections. Last, the authors propose to use structure learning for climate networks, which are currently based primarily on correlation analysis. While correlation-based climate networks focus on similarity between nodes, independence graphs would provide an alternative viewpoint by focusing on information flow in the network.
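Constraint-based structure learning is built from repeated conditional-independence tests between variables. A minimal numpy sketch of one common primitive, a partial-correlation test computed from regression residuals (our own illustration with synthetic data, not the paper's code):

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after regressing out the conditioning set z.

    x, y: 1-D arrays; z: 2-D array (samples x conditioning variables).
    A near-zero result is evidence for conditional independence, the
    query that constraint-based structure learning issues repeatedly
    while pruning edges from the graph.
    """
    Z = np.column_stack([np.ones(len(x)), z])
    # Residuals of x and y after least-squares projection onto Z
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)
z = rng.normal(size=(2000, 1))
x = z[:, 0] + 0.1 * rng.normal(size=2000)  # x and y share the driver z,
y = z[:, 0] + 0.1 * rng.normal(size=2000)  # so they are independent given z
print(abs(np.corrcoef(x, y)[0, 1]))   # large: marginal correlation
print(abs(partial_corr(x, y, z)))     # near zero: no direct x-y edge
```

This is why the resulting independence graphs differ from correlation-based climate networks: an edge survives only if no conditioning set explains the association away.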

Full access
Ryan Lagerquist and Imme Ebert-Uphoff

Abstract

In the last decade, much work in atmospheric science has focused on spatial verification (SV) methods for gridded prediction, which overcome serious disadvantages of pixelwise verification. However, neural networks (NN) in atmospheric science are almost always trained to optimize pixelwise loss functions, even when ultimately assessed with SV methods. This establishes a disconnect between model verification during vs. after training. To address this issue, we develop spatially enhanced loss functions (SELF) and demonstrate their use for a real-world problem: predicting the occurrence of thunderstorms (henceforth, “convection”) with NNs. In each SELF we use either a neighborhood filter, which highlights convection at scales larger than a threshold, or a spectral filter (employing Fourier or wavelet decomposition), which is more flexible and highlights convection at scales between two thresholds. We use these filters to spatially enhance common verification scores, such as the Brier score. We train each NN with a different SELF and compare their performance at many scales of convection, from discrete storm cells to tropical cyclones. Among our many findings are that (a) for a low (high) risk threshold, the ideal SELF focuses on small (large) scales; (b) models trained with a pixelwise loss function perform surprisingly well; (c) however, models trained with a spectral filter produce much better-calibrated probabilities than a pixelwise model. We provide a general guide to using SELFs, including technical challenges and the final Python code, as well as demonstrating their use for the convection problem. To our knowledge this is the most in-depth guide to SELFs in the geosciences.
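To make the idea concrete, here is a minimal sketch of a spatially enhanced Brier score, using a smoothing neighborhood filter from scipy (the filter choice and all names here are our own illustration, not the paper's released code):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def neighborhood_brier(target, prob, size=3):
    """Brier score on neighborhood event fractions (a SELF-style loss).

    Smoothing both fields over a size x size window rewards forecasts
    that place convection near, if not exactly on, the observed cells,
    which is the point of a neighborhood filter. Sketch only; not the
    paper's implementation.
    """
    t = uniform_filter(target.astype(float), size=size, mode="constant")
    p = uniform_filter(prob, size=size, mode="constant")
    return float(np.mean((p - t) ** 2))

# A forecast displaced by one grid point: penalized heavily by the
# pixelwise Brier score, far less by the neighborhood version.
obs = np.zeros((8, 8)); obs[4, 4] = 1.0
fcst = np.zeros((8, 8)); fcst[4, 5] = 1.0
pixelwise = float(np.mean((fcst - obs) ** 2))
print(neighborhood_brier(obs, fcst) < pixelwise)  # → True
```

Because the filter is differentiable, a loss like this can be dropped directly into NN training, closing the during-vs.-after-training disconnect the abstract describes.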

Free access
Kyle A. Hilburn, Imme Ebert-Uphoff, and Steven D. Miller

Abstract

The objective of this research is to develop techniques for assimilating GOES-R series observations in precipitating scenes for the purpose of improving short-term convective-scale forecasts of high-impact weather hazards. Whereas one approach is radiance assimilation, the information content of GOES-R radiances from its Advanced Baseline Imager saturates in precipitating scenes, and radiance assimilation does not make use of lightning observations from the GOES Lightning Mapper. Here, a convolutional neural network (CNN) is developed to transform GOES-R radiances and lightning into synthetic radar reflectivity fields to make use of existing radar assimilation techniques. We find that the ability of CNNs to utilize spatial context is essential for this application and offers breakthrough improvement in skill compared to traditional pixel-by-pixel based approaches. To understand the improved performance, we use a novel analysis method that combines several techniques, each providing different insights into the network’s reasoning. Channel-withholding experiments and spatial information–withholding experiments are used to show that the CNN achieves skill at high reflectivity values from the information content in radiance gradients and the presence of lightning. The attribution method, layerwise relevance propagation, demonstrates that the CNN uses radiance and lightning information synergistically, where lightning helps the CNN focus on which neighboring locations are most important. Synthetic inputs are used to quantify the sensitivity to radiance gradients, showing that sharper gradients produce a stronger response in predicted reflectivity. Lightning observations are found to be uniquely valuable for their ability to pinpoint locations of strong radar echoes.
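The attribution method named above, layerwise relevance propagation, redistributes the network's output backward layer by layer so that total relevance is conserved. A toy numpy sketch of the basic rule for a single bias-free dense layer (our own illustration, not the study's CNN):

```python
import numpy as np

def lrp_dense(x, W, relevance_out, eps=1e-9):
    """Basic LRP rule for a dense layer z = W @ x (bias-free toy).

    Each input unit receives relevance in proportion to its
    contribution x_i * W[j, i] to every output activation z_j.
    The eps term only guards against division by zero.
    """
    z = W @ x                      # forward pass
    s = relevance_out / (z + eps)  # relevance per unit of activation
    return x * (W.T @ s)           # redistribute onto the inputs

x = np.array([1.0, 2.0, 0.5])
W = np.array([[0.5, -1.0, 2.0],
              [1.0,  0.5, 0.0]])
R_out = W @ x                      # seed relevance with the output itself
R_in = lrp_dense(x, W, R_out)
# Conservation: total relevance is preserved through the layer.
print(np.isclose(R_in.sum(), R_out.sum()))  # → True
```

Chaining this rule through every layer of a real network yields the input-pixel relevance maps used in the paper's analysis, e.g., to show where lightning focuses the CNN's attention.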

Open access
Savini M. Samarasinghe, Yi Deng, and Imme Ebert-Uphoff

Abstract

This paper reports preliminary yet encouraging findings on the use of causal discovery methods to understand the interaction between atmospheric planetary- and synoptic-scale disturbances in the Northern Hemisphere. Specifically, constraint-based structure learning of probabilistic graphical models is applied to the spherical harmonics decomposition of the daily 500-hPa geopotential height field in boreal winter for the period 1948–2015. Active causal pathways among different spherical harmonics components are identified and documented in the form of a temporal probabilistic graphical model. Since, by definition, the structure learning algorithm used here only robustly identifies linear causal effects, we report only causal pathways between two groups of disturbances with sufficiently large differences in temporal and/or spatial scales, that is, planetary-scale (mainly zonal wavenumbers 1–3) and synoptic-scale disturbances (mainly zonal wavenumbers 6–8). Daily reconstruction of geopotential heights using only interacting scales suggests that the modulation of synoptic-scale disturbances by planetary-scale disturbances is best characterized by the flow of information from a zonal wavenumber-1 disturbance to a synoptic-scale circumglobal wave train whose amplitude peaks at the North Pacific and North Atlantic storm-track region. The feedback of synoptic-scale to planetary-scale disturbances manifests itself as a zonal wavenumber-2 structure driven by synoptic-eddy momentum fluxes. This wavenumber-2 structure locally enhances the East Asian trough and western Europe ridge of the wavenumber-1 planetary-scale disturbance that actively modulates the activity of synoptic-scale disturbances. The winter-mean amplitudes of the actively interacting disturbances are characterized by pronounced fluctuations across interannual to decadal time scales.

Free access
Ryan Lagerquist, Jebb Q. Stewart, Imme Ebert-Uphoff, and Christina Kumler

Abstract

Predicting the timing and location of thunderstorms (“convection”) allows for preventive actions that can save both lives and property. We have applied U-nets, a deep-learning-based type of neural network, to forecast convection on a grid at lead times up to 120 min. The goal is to make skillful forecasts with only present and past satellite data as predictors. Specifically, predictors are multispectral brightness-temperature images from the Himawari-8 satellite, while targets (ground truth) are provided by weather radars in Taiwan. U-nets are becoming popular in atmospheric science due to their advantages for gridded prediction. Furthermore, we use three novel approaches to advance U-nets in atmospheric science. First, we compare three architectures—vanilla, temporal, and U-net++—and find that vanilla U-nets are best for this task. Second, we train U-nets with the fractions skill score, which is spatially aware, as the loss function. Third, because we do not have adequate ground truth over the full Himawari-8 domain, we train the U-nets with small radar-centered patches, then apply trained U-nets to the full domain. Also, we find that the best predictions are given by U-nets trained with satellite data from multiple lag times, not only the present. We evaluate U-nets in detail—by time of day, month, and geographic location—and compare them to persistence models. The U-nets outperform persistence at lead times ≥ 60 min, and at all lead times the U-nets provide a more realistic climatology than persistence. Our code is available publicly.
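The fractions skill score used as the loss function above is a standard spatial-verification measure; a minimal numpy sketch (our own code, purely to illustrate the definition):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fractions_skill_score(obs, fcst, size=3):
    """Fractions skill score (FSS) over a size x size neighborhood.

    Both binary fields are converted to neighborhood event fractions;
    FSS = 1 - MSE(fractions) / reference MSE, so it ranges from 0
    (no skill) to 1 (perfect) and, being built from smooth averages,
    is differentiable enough to serve as a training loss.
    """
    po = uniform_filter(obs.astype(float), size=size, mode="constant")
    pf = uniform_filter(fcst.astype(float), size=size, mode="constant")
    mse = np.mean((pf - po) ** 2)
    ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return float(1.0 - mse / ref) if ref > 0 else 1.0

obs = np.zeros((10, 10)); obs[5, 5] = 1
fcst = np.zeros((10, 10)); fcst[5, 6] = 1   # near miss by one grid point
print(fractions_skill_score(obs, obs) == 1.0)  # perfect forecast → True
print(fractions_skill_score(obs, fcst) > 0.5)  # near miss keeps skill → True
```

A pixelwise score would treat the near miss as a total failure; the FSS credits it, which is the "spatially aware" property the abstract refers to.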

Restricted access
Antonios Mamalakis, Elizabeth A. Barnes, and Imme Ebert-Uphoff

Abstract

Convolutional neural networks (CNNs) have recently attracted great attention in geoscience because of their ability to capture nonlinear system behavior and extract predictive spatiotemporal patterns. Given their black-box nature, however, and the importance of prediction explainability, methods of explainable artificial intelligence (XAI) are gaining popularity as a means to explain the CNN decision-making strategy. Here, we establish an intercomparison of some of the most popular XAI methods and investigate their fidelity in explaining CNN decisions for geoscientific applications. Our goal is to raise awareness of the theoretical limitations of these methods and to gain insight into the relative strengths and weaknesses to help guide best practices. The considered XAI methods are first applied to an idealized attribution benchmark, in which the ground truth of explanation of the network is known a priori, to help objectively assess their performance. Second, we apply XAI to a climate-related prediction setting, namely, to explain a CNN that is trained to predict the number of atmospheric rivers in daily snapshots of climate simulations. Our results highlight several important issues of XAI methods (e.g., gradient shattering, inability to distinguish the sign of attribution, and ignorance to zero input) that have previously been overlooked in our field and, if not considered cautiously, may lead to a distorted picture of the CNN decision-making strategy. We envision that our analysis will motivate further investigation into XAI fidelity and will help toward a cautious implementation of XAI in geoscience, which can lead to further exploitation of CNNs and deep learning for prediction problems.

Free access
Ryan Lagerquist, David Turner, Imme Ebert-Uphoff, Jebb Stewart, and Venita Hagerty

Abstract

This paper describes the development of U-net++ models, a type of neural network that performs deep learning, to emulate the shortwave Rapid Radiative Transfer Model (RRTM). The goal is to emulate the RRTM accurately in a small fraction of the computing time, creating a U-net++ that could be used as a parameterization in numerical weather prediction (NWP). Target variables are surface downwelling flux, top-of-atmosphere upwelling flux (FupTOA), net flux, and a profile of radiative-heating rates. We have devised several ways to make the U-net++ models knowledge-guided, recently identified as a key priority in machine learning (ML) applications to the geosciences. We conduct two experiments to find the best U-net++ configurations. In experiment 1, we train on nontropical sites and test on tropical sites, to assess extreme spatial generalization. In experiment 2, we train on sites from all regions and test on different sites from all regions, with the goal of creating the best possible model for use in NWP. The selected model from experiment 1 shows impressive skill on the tropical testing sites, except for four notable deficiencies: large bias and error for heating rate in the upper stratosphere, unreliable FupTOA for profiles with single-layer liquid cloud, large heating-rate bias in the midtroposphere for profiles with multilayer liquid cloud, and negative bias at low zenith angles for all flux components and tropospheric heating rates. The selected model from experiment 2 corrects all but the first deficiency, and both models run ~10^4 times faster than the RRTM. Our code is available publicly.

Full access
John M. Haynes, Yoo-Jeong Noh, Steven D. Miller, Katherine D. Haynes, Imme Ebert-Uphoff, and Andrew Heidinger

Abstract

The detection of multilayer clouds in the atmosphere can be particularly challenging from passive visible and infrared imaging radiometers since cloud boundary information is limited primarily to the topmost cloud layer. Yet detection of low clouds in the atmosphere is important for a number of applications, including aviation nowcasting and general weather forecasting. In this work, we develop pixel-based machine learning methods of detecting low clouds, with a focus on improving detection in multilayer cloud situations and specific attention given to improving the Cloud Cover Layers (CCL) product, which assigns cloudiness in a scene into vertical bins. The random forest (RF) and neural network (NN) implementations use inputs from a variety of sources, including GOES Advanced Baseline Imager (ABI) visible radiances, infrared brightness temperatures, auxiliary information about the underlying surface, and relative humidity (which holds some utility as a cloud proxy). Training and independent validation enlist near-global, actively sensed cloud boundaries from the radar and lidar systems on board the CloudSat and CALIPSO satellites. We find that the RF and NN models have similar performances. The probability of detection (PoD) of low cloud increases from 0.685 to 0.815 when using the RF technique instead of the CCL methodology, while the false alarm ratio decreases. The improved PoD of low cloud is particularly notable for scenes that appear to be cirrus from an ABI perspective, increasing from 0.183 to 0.686. Various extensions of the model are discussed, including a nighttime-only algorithm and expansion to other satellite sensors.
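The probability of detection and false-alarm ratio quoted above come from a standard 2x2 contingency table of a yes/no detection; a quick sketch of the definitions (our own code, with hypothetical counts chosen only to reproduce a PoD near the reported 0.815):

```python
def pod_far(hits, misses, false_alarms):
    """PoD and FAR from a 2x2 contingency table (sketch only).

    PoD = hits / (hits + misses): fraction of observed low-cloud
    cases the algorithm caught.
    FAR = false_alarms / (hits + false_alarms): fraction of
    'low cloud' calls that were wrong.
    """
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    return pod, far

# Hypothetical counts, not from the paper: 1000 observed low-cloud
# cases of which 815 are detected, plus 100 false alarms.
pod, far = pod_far(hits=815, misses=185, false_alarms=100)
print(round(pod, 3), round(far, 3))  # → 0.815 0.109
```

An improvement like the one reported, higher PoD with a lower FAR, means the RF catches more low cloud without simply flagging low cloud more often.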

Significance Statement

Using satellites to detect the heights of clouds in the atmosphere is important for a variety of weather applications, including aviation weather forecasting. However, detecting low clouds can be challenging if there are other clouds above them. To address this, we have developed machine learning–based models that can be used with passive satellite instruments. These models use satellite observations at visible and infrared wavelengths, an estimate of relative humidity in the atmosphere, and geographic and surface-type information to predict whether low clouds are present. Our results show that these models have significant skill at predicting low clouds, even in the presence of higher cloud layers.

Restricted access