Browse

Showing items 11–20 of 143 in Artificial Intelligence for the Earth Systems.
Wei-Yi Cheng, Daehyun Kim, Scott Henderson, Yoo-Geun Ham, Jeong-Hwan Kim, and Robert H. Holzworth

Abstract

The diversity of lightning parameterizations in numerical weather and climate models causes considerable uncertainty in lightning prediction. In this study, we take a data-driven approach to the lightning parameterization problem by combining machine learning (ML) techniques with the rich lightning observations from the World Wide Lightning Location Network. Three ML algorithms are trained over the contiguous United States (CONUS) to predict lightning stroke density in a 1° box from atmospheric variables in the same grid box (local) or over the entire CONUS (nonlocal). The performance of the ML-based lightning schemes is examined and compared with that of a simple, conventional lightning parameterization scheme of Romps et al. We find that all ML-based lightning schemes outperform the conventional scheme in the regions and seasons with climatologically higher lightning stroke density. To the west of the Rocky Mountains, the nonlocal ML lightning scheme achieves the best overall performance, with lightning stroke density predictions that are 70% more accurate than those of the conventional scheme. Our results suggest that ML-based approaches have the potential to improve the representation of lightning and other types of extreme weather events in weather and climate models.
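
As a rough illustration of the contrast drawn in this abstract, the sketch below pairs a CAPE × precipitation proportionality of the kind used in Romps et al.-style schemes with a generic "local" ML regressor trained on per-grid-box predictors. It is a minimal, hypothetical example, not the authors' code; the proportionality constant, predictor set, and placeholder data are assumptions.

```python
# Hedged sketch (not the authors' code): a CAPE x precipitation lightning
# proxy versus a simple "local" ML regressor on per-grid-box predictors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def conventional_proxy(cape, precip_rate, scale=1.0):
    """Lightning density proportional to CAPE x precipitation (placeholder constant)."""
    return scale * cape * precip_rate

# Placeholder predictors per 1-degree box, e.g., CAPE, precipitation, shear, humidity.
rng = np.random.default_rng(0)
X = rng.random((1000, 6))
# Placeholder "observed" stroke density (synthetic; stands in for WWLLN data).
y = conventional_proxy(X[:, 0], X[:, 1]) + 0.1 * rng.random(1000)

ml_scheme = RandomForestRegressor(n_estimators=200, random_state=0)
ml_scheme.fit(X, y)
pred_ml = ml_scheme.predict(X)
pred_conventional = conventional_proxy(X[:, 0], X[:, 1])
```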

Open access
Harold E. Brooks, Montgomery L. Flora, and Michael E. Baldwin

Abstract

Forecast evaluation metrics have been discovered and rediscovered in a variety of contexts, leading to confusion. We look at measures from the 2 × 2 contingency table and the history of their development and illustrate how different fields working on similar problems have arrived at different approaches to, and perspectives on, the same mathematical concepts. For example, probability of detection (POD) is a quantity in meteorology that was also called prefigurance in that field, while the same quantity is named recall in information science and machine learning, and sensitivity and true positive rate in the medical literature. Many of the scores that combine three elements of the 2 × 2 table can be seen either as coming from a perspective of Venn diagrams or as Pythagorean means, possibly weighted, of two ratios of performance measures. Although there are algebraic relationships between the two perspectives, the approaches taken by authors led them in different directions, making it unlikely that they would discover scores that naturally arose from the other approach. We close by discussing the importance of understanding the implicit or explicit values expressed by the choice of scores. In addition, we make some simple recommendations about the appropriate nomenclature to use when publishing interdisciplinary work.
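
A minimal illustration (not taken from the paper) of how the same 2 × 2 contingency-table quantity appears under different names, and how a combined score arises as a Pythagorean (harmonic) mean of two ratios; the sample counts are arbitrary.

```python
# Hedged example: standard 2x2 contingency-table scores and their synonyms.
def contingency_scores(hits, false_alarms, misses):
    pod = hits / (hits + misses)                  # POD / prefigurance / recall / sensitivity
    sr = hits / (hits + false_alarms)             # success ratio / precision
    csi = hits / (hits + misses + false_alarms)   # critical success index / threat score
    f1 = 2 * pod * sr / (pod + sr)                # harmonic mean of POD and success ratio
    return {"POD": pod, "SR": sr, "CSI": csi, "F1": f1}

print(contingency_scores(hits=82, false_alarms=38, misses=23))
```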

Open access
Bethany L. Earnest, Amy McGovern, Christopher Karstens, and Israel Jirak

Abstract

The purpose of this research is to build an operational model for predicting wildfire occurrence over the contiguous United States (CONUS) in the 1-to-10-day range using the UNet3+ machine-learning model. This paper illustrates the range of model performance resulting from choices made in the modeling process, such as how labels are defined for the model and how input variables are codified for it. By combining the capabilities of the UNet3+ model with a neighborhood loss function, the fractions skill score (FSS), we can quantify model success using predictions made both in and around the location of the original fire-occurrence label. The model is trained on weather, weather-derived fuel, and topography observational inputs and on labels representing fire occurrence. Observational weather, weather-derived fuel, and topography data are sourced from the gridMET data set, a daily, CONUS-wide, high-spatial-resolution data set of surface meteorological variables. Fire-occurrence labels are sourced from the U.S. Department of Agriculture’s Fire Program Analysis Fire-Occurrence Database (FPA-FOD), which contains spatial wildfire occurrence data for CONUS, combining data from the reporting systems of federal, state, and local organizations. By exploring the many aspects of the modeling process with the added context of model performance, this work builds understanding of the use of deep learning to predict fire occurrence in CONUS.
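
A minimal sketch of a fractions-skill-score-style neighborhood loss of the kind described above, assuming PyTorch tensors and a fixed square neighborhood; the window size and implementation details are assumptions, not the authors' code.

```python
# Hedged sketch of an FSS-style neighborhood loss (assumed PyTorch setup).
import torch
import torch.nn.functional as F

def fss_loss(pred, target, window=5, eps=1e-8):
    """Return 1 - FSS, where fractions are neighborhood means of the two fields.

    pred, target: (batch, 1, H, W) tensors with values in [0, 1].
    """
    pad = window // 2
    f_pred = F.avg_pool2d(pred, window, stride=1, padding=pad)    # forecast fractions
    f_obs = F.avg_pool2d(target, window, stride=1, padding=pad)   # observed fractions
    mse = torch.mean((f_pred - f_obs) ** 2)
    ref = torch.mean(f_pred ** 2) + torch.mean(f_obs ** 2)
    return mse / (ref + eps)   # equals 1 - FSS
```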

Open access
Bethany L. Earnest, Amy McGovern, Christopher Karstens, and Israel Jirak

Abstract

This paper presents the lessons learned as we applied the UNet3+ deep learning model to the task of building an operational model for predicting wildfire occurrence over the contiguous United States (CONUS) in the 1-to-10-day range. Through the lens of model performance, we explore the reasons for the performance improvements made possible by the model. Lessons include the importance of labeling, the impact of information loss in input variables, and the role of operational considerations in the modeling process. This work offers lessons for other interdisciplinary researchers working at the intersection of deep learning and fire-occurrence prediction with an eye toward operationalization.

Open access
Mahsa Payami, Yunsoo Choi, Ahmed Khan Salman, Seyedali Mousavinezhad, Jincheol Park, and Arman Pouyaei

Abstract

In this study, we developed an emulator of the Community Multiscale Air Quality (CMAQ) model by employing a one-dimensional (1D) convolutional neural network (CNN) algorithm to predict hourly surface nitrogen dioxide (NO2) concentrations over the most densely populated urban regions in Texas. The inputs to the emulator were the same as those for the CMAQ model, which include emission, meteorological, and land-use/land-cover data. We trained the model over June, July, and August (JJA) of 2011 and 2014 and then tested it on JJA of 2017, achieving an index of agreement (IOA) of 0.95 and a correlation of 0.90. We also employed temporal threefold cross validation to evaluate the model’s performance, ensuring the robustness and generalizability of the results. To gain deeper insight into the factors influencing the model’s surface NO2 predictions, we conducted a Shapley additive explanations analysis. The results revealed that solar radiation reaching the surface, planetary boundary layer height, and NOx (NO + NO2) emissions are the key variables driving the model’s predictions. These findings highlight the emulator’s ability to capture the individual impact of each variable on the model’s NO2 predictions. Furthermore, our emulator outperformed the CMAQ model in terms of computational efficiency, being more than 900 times as fast in predicting NO2 concentrations and enabling the rapid assessment of various pollution management scenarios. This work offers a valuable resource for air pollution mitigation efforts, not just in Texas; with appropriate regional training data, its utility could be extended to other regions and pollutants as well.
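
The skill metric quoted above, the index of agreement, can be computed directly; below is a small NumPy example using Willmott's formulation with placeholder arrays (an illustration, not the study's evaluation code).

```python
# Hedged example: Willmott's index of agreement (IOA) with placeholder data.
import numpy as np

def index_of_agreement(pred, obs):
    obs_mean = np.mean(obs)
    num = np.sum((pred - obs) ** 2)
    den = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return 1.0 - num / den

pred = np.array([21.0, 18.5, 25.2, 30.1])   # hypothetical predicted NO2 values
obs = np.array([20.0, 19.0, 24.0, 31.0])    # hypothetical observed NO2 values
print(round(index_of_agreement(pred, obs), 3))
```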

Significance Statement

This work develops an emulator of the Community Multiscale Air Quality model, using a one-dimensional convolutional neural network to predict hourly surface NO2 concentrations across densely populated regions in Texas. Our emulator is capable of providing rapid and highly accurate NO2 estimates, enabling it to model diverse scenarios and facilitating informed decision-making to improve public health outcomes. Notably, this model outperforms traditional methods in computational efficiency, making it a robust, time-efficient tool for air pollution mitigation efforts. The findings suggest that key variables like solar radiation, planetary boundary layer height, and NOx (NO + NO2) emissions significantly influence the model’s NO2 predictions. By adding appropriate training data, this work can be extended to other regions and other pollutants such as O3, PM2.5, and PM10, offering a powerful tool for pollution mitigation and public health improvement efforts worldwide.

Open access
Jeong-Hwan Kim, Yoo-Geun Ham, Daehyun Kim, Tim Li, and Chen Ma

Abstract

Forecasting the intensity of a tropical cyclone (TC) remains challenging, particularly when the TC undergoes rapid changes in intensity. This study develops a convolutional neural network (CNN) for 24-h forecasts of TC intensity changes and their rapid intensifications over the western Pacific. The CNN model, DeepTC, is trained using a unique loss function, an amplitude focal loss, to better capture large intensity changes such as those during rapid intensification (RI) events. We show that DeepTC outperforms operational forecasts, with a lower mean absolute error (by 8.9%–10.2%) and a higher coefficient of determination (by 31.7%–35%). In addition, DeepTC exhibits substantially better skill at capturing RI events than operational forecasts. To understand the superior performance of DeepTC in RI forecasts, we conduct an occlusion sensitivity analysis to quantify the relative importance of each predictor. The results reveal that scalar quantities such as latitude, previous intensity change, initial intensity, and vertical wind shear play critical roles in successful RI prediction. Additionally, DeepTC utilizes the three-dimensional distribution of relative humidity to distinguish RI cases from non-RI cases, with stronger dry–moist gradients in the mid-to-low troposphere and steeper radial moisture gradients in the upper troposphere during RI events. These relationships between the identified key variables and intensity change are successfully simulated by DeepTC, implying that they are physically reasonable. Our study demonstrates that DeepTC can be a powerful tool for improving RI understanding and enhancing the reliability of TC intensity forecasts.
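
Occlusion sensitivity analysis, mentioned above, masks parts of the input and measures how much the prediction changes. The sketch below shows the generic technique with an arbitrary patch size and fill value; it is not the DeepTC implementation, and `model` stands for any callable returning an intensity-change prediction.

```python
# Hedged sketch of a generic occlusion sensitivity analysis (assumed setup).
import numpy as np

def occlusion_sensitivity(model, x, patch=8, fill=0.0):
    """Return a map of |prediction change| when square patches of x are occluded.

    x: 2D input field (H, W); larger values mark more influential regions.
    """
    base = model(x)
    h, w = x.shape
    sens = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = x.copy()
            occluded[i:i + patch, j:j + patch] = fill   # mask one patch
            sens[i // patch, j // patch] = abs(model(occluded) - base)
    return sens
```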

Open access
Bryan Shaddy, Deep Ray, Angel Farguell, Valentina Calaza, Jan Mandel, James Haley, Kyle Hilburn, Derek V. Mallia, Adam Kochanski, and Assad Oberai

Abstract

Increases in wildfire activity and the resulting impacts have prompted the development of high-resolution wildfire behavior models for forecasting fire spread. Recent progress in using satellites to detect fire locations further provides the opportunity to use such measurements to improve fire spread forecasts from numerical models through data assimilation. This work develops a physics-informed approach for inferring the history of a wildfire from satellite measurements, providing the information needed to initialize coupled atmosphere–wildfire models from a measured wildfire state. The fire arrival time, the time at which the fire reaches a given spatial location, acts as a succinct representation of the history of a wildfire. In this work, a conditional Wasserstein generative adversarial network (cWGAN), trained with WRF-SFIRE simulations, is used to infer the fire arrival time from satellite active fire data. The cWGAN produces samples of likely fire arrival times from the conditional distribution of arrival times given satellite active fire detections; these samples are further used to assess the uncertainty of the predictions. The cWGAN is tested on four California wildfires occurring between 2020 and 2022, and predictions of fire extent are compared against high-resolution airborne infrared measurements. Further, the predicted ignition times are compared with reported ignition times. An average Sørensen’s coefficient of 0.81 for the fire perimeters and an average ignition time difference of 32 minutes suggest that the method is highly accurate.
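
Sørensen's coefficient (the Dice coefficient) used above to score fire perimeters can be computed as follows; the masks are illustrative placeholders, not the authors' evaluation code.

```python
# Hedged example: Sorensen (Dice) coefficient between two burned-area masks.
import numpy as np

def sorensen_coefficient(pred_mask, ref_mask):
    pred = pred_mask.astype(bool)
    ref = ref_mask.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    return 2.0 * intersection / (pred.sum() + ref.sum())

pred_mask = np.array([[1, 1, 0], [0, 1, 0]])   # hypothetical predicted extent
ref_mask = np.array([[1, 1, 0], [0, 0, 1]])    # hypothetical reference extent
print(round(sorensen_coefficient(pred_mask, ref_mask), 2))
```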

Open access
Guangming Zheng, Stephanie Schollaert Uz, Pierre St-Laurent, Marjorie A. M. Friedrichs, Amita Mehta, and Paul M. DiGiacomo

Abstract

Seasonal hypoxia is a recurring threat to ecosystems and fisheries in the Chesapeake Bay. Hypoxia forecasting based on coupled hydrodynamic and biogeochemical models has proven useful for many stakeholders: these models excel at accounting for the effects of physical forcing on oxygen supply but may fall short in replicating the more complex biogeochemical processes that govern oxygen consumption. Satellite-derived reflectances could be used to indicate the presence of surface organic matter over the Bay. However, teasing apart the contributions of atmospheric and aquatic constituents to the signal received by the satellite is not straightforward, so it is difficult to derive surface concentrations of organic matter from satellite data in a robust fashion. A potential solution to this complexity is to use deep learning to build end-to-end applications that do not require precise accounting of the satellite signal from the atmosphere or water, phytoplankton blooms, or sediment plumes. By training a deep neural network with data from a vast suite of variables that could potentially affect oxygen in the water column, improvement of short-term (daily) hypoxia forecasts may be possible. Here we predict oxygen concentrations using inputs that account for both physical and biogeochemical factors. The physical inputs include wind velocity reanalysis information together with 3D outputs from an estuarine hydrodynamic model, including current velocity, water temperature, and salinity. Satellite-derived spectral reflectance data are used as a surrogate for the biogeochemical factors. These input fields are time series of weekly statistics calculated from daily information, starting 8 weeks before each oxygen observation was collected. To accommodate this input data structure, we adopted a model architecture of long short-term memory networks with 8 time steps; at each time step, a set of convolutional neural networks is used to extract information from the inputs. Ablation and cross-validation tests suggest that, among all input features, the strongest predictor is the 3D temperature field, with which the new model can outperform the state of the art by ∼20% in terms of median absolute error. Our approach represents a novel application of deep learning to a complex water management challenge.
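
A hedged sketch of the architecture described above, in which a small CNN encodes each of the 8 weekly input fields and an LSTM integrates the sequence before a final oxygen-concentration regression head; layer sizes, channel counts, and the head are assumptions rather than the authors' configuration.

```python
# Hedged architecture sketch (assumed configuration, not the authors' code).
import torch
import torch.nn as nn

class CNNLSTMOxygen(nn.Module):
    def __init__(self, in_channels=8, hidden=64):
        super().__init__()
        # Per-time-step CNN encoder for one weekly stack of input fields.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):               # x: (batch, 8 weeks, channels, H, W)
        b, t = x.shape[:2]
        feats = self.encoder(x.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])    # oxygen prediction from the last step
```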

Open access
Daniel Galea, Kevin Hodges, and Bryan N. Lawrence

Abstract

Tropical cyclones (TCs) are important phenomena, and understanding their behavior requires being able to detect their presence in simulations. Detection algorithms vary; here we compare a novel deep learning–based detection algorithm (TCDetect) with a state-of-the-art tracking system (TRACK) and an observational dataset (IBTrACS) to provide context for potential use in climate simulations. Previous work has shown that TCDetect has good recall, particularly for hurricane-strength events. The primary question addressed here is to what extent the structure of the systems plays a part in detection. To compare with observations of TCs, it is necessary to apply detection techniques to reanalysis; for this purpose we use ERA-Interim, and a key part of the comparison is the recognition that ERA-Interim itself does not fully reflect the observations. Despite that limitation, TCDetect and TRACK applied to ERA-Interim mostly agree with each other. Also, when considering only hurricane-strength TCs, TCDetect and TRACK correspond well to the TC observations from IBTrACS. Like TRACK, TCDetect has good recall for strong systems; however, it finds a significant number of false positives associated with weaker TCs (i.e., events detected as having hurricane strength but that are weaker in reality) and with extratropical storms. Because TCDetect was not trained to locate TCs, a post hoc method was used to perform comparisons. Although this method was not always successful, some success in matching tracks and events in physical space was achieved. The analysis of matches suggests that the best results were found in the Northern Hemisphere and that, in most regions, the detections followed the same patterns in time regardless of which detection method was used.

Open access
Charles H. White, Imme Ebert-Uphoff, John M. Haynes, and Yoo-Jeong Noh

Abstract

Superresolution is the general task of artificially increasing the spatial resolution of an image. The recent surge in machine learning (ML) research has yielded many promising ML-based approaches to single-image superresolution, including applications to satellite remote sensing. We develop a convolutional neural network (CNN) to superresolve the 1- and 2-km bands of the GOES-R series Advanced Baseline Imager (ABI) to a common high resolution of 0.5 km. Access to 0.5-km imagery from ABI band 2 enables the CNN to realistically sharpen the lower-resolution bands without significant blurring. We first train the CNN on a proxy task that requires only ABI imagery: the resolution of ABI bands is degraded, and the CNN is trained to restore the original imagery. Comparisons at reduced resolution and at full resolution with Landsat-8/Landsat-9 observations illustrate that the CNN produces images with realistic high-frequency detail that is not present in a bicubic interpolation baseline. Estimating all ABI bands at 0.5-km resolution makes it easier to combine information across bands without reconciling differences in spatial resolution. However, more analysis is needed to determine the impacts on derived products or multispectral imagery that use superresolved bands. This approach is extensible to other remote sensing instruments that have bands with different spatial resolutions and requires only a small amount of data and knowledge of each channel’s modulation transfer function.
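
The degrade-and-restore proxy task and the bicubic baseline mentioned above can be sketched as follows; the downscaling factor and the use of PyTorch's interpolation are assumptions, not the authors' pipeline.

```python
# Hedged sketch of the degrade-and-restore proxy task and a bicubic baseline.
import torch
import torch.nn.functional as F

def make_proxy_pair(band, factor=2):
    """Downsample a (batch, 1, H, W) band; the original serves as the training target."""
    low_res = F.interpolate(band, scale_factor=1.0 / factor, mode="bicubic",
                            align_corners=False)
    return low_res, band

def bicubic_baseline(low_res, factor=2):
    """Reference superresolution: plain bicubic upsampling back to full size."""
    return F.interpolate(low_res, scale_factor=factor, mode="bicubic",
                         align_corners=False)
```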

Significance Statement

Satellite remote sensing instruments often have bands with different spatial resolutions. This work shows that we can artificially increase the resolution of some lower-resolution bands by taking advantage of the texture of higher-resolution bands on the GOES-16 ABI instrument using a convolutional neural network. This may help reconcile differences in spatial resolution when combining information across bands, but future analysis is needed to precisely determine impacts on derived products that might use superresolved bands.

Open access