Abstract
Regional climate models (RCMs) are essential tools for simulating and studying regional climate variability and change. However, their high computational cost limits the production of comprehensive ensembles of regional climate projections covering multiple scenarios and driving Global Climate Models (GCMs) across regions. RCM emulators based on deep learning models have recently been introduced as a cost-effective and promising alternative that requires only short RCM simulations to train the models. Therefore, evaluating their transferability to different periods, scenarios, and GCMs becomes a pivotal and complex task in which the inherent biases of both GCMs and RCMs play a significant role. Here we focus on this problem by considering the two different emulation approaches introduced in the literature as perfect and imperfect, which we refer to here as Perfect Prognosis (PP) and Model Output Statistics (MOS), respectively, following the well-established downscaling terminology. In addition to standard evaluation techniques, we expand the analysis with methods from the field of eXplainable Artificial Intelligence (XAI) to assess the physical consistency of the empirical links learnt by the models. We find that both approaches are able to emulate certain climatological properties of RCMs for different periods and scenarios (soft transferability), but the consistency of the emulation functions differs between approaches. Whereas PP learns robust and physically meaningful patterns, MOS results are GCM-dependent and lack physical consistency in some cases. Both approaches face problems when transferring the emulation function to other GCMs (hard transferability), owing to GCM-dependent biases. This limits their applicability for building RCM ensembles. We conclude by giving prospects for future applications.
Abstract
Precipitation values produced by climate models are biased due to the parameterization of physical processes and limited spatial resolution. Current bias-correction approaches usually focus on correcting lower-order statistics (mean and standard deviation), which makes it difficult to capture precipitation extremes. However, accurate modeling of extremes is critical for policymaking to mitigate and adapt to the effects of climate change. We develop a deep learning framework that leverages information from key dynamical variables impacting precipitation to also match higher-order statistics (skewness and kurtosis) for the entire precipitation distribution, including extremes. The framework consists of a two-part architecture: a U-Net convolutional network to capture the spatiotemporal distribution of precipitation and a fully connected network to capture the distribution of higher-order statistics. The joint network, termed UFNet, can simultaneously improve the spatial structure of the modeled precipitation and capture the distribution of extreme precipitation values. Using climate model simulation data and observations that are climatologically similar but not strictly paired, the UFNet identifies and corrects the climate model biases, significantly improving the estimation of daily precipitation as measured by a broad range of spatiotemporal statistics. In particular, UFNet significantly improves the underestimation of extreme precipitation values seen with current bias-correction methods. Our approach constitutes a generalized framework for correcting other climate model variables, improving the accuracy of climate model predictions while using a simpler and more stable training process.
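For illustration only, and not the authors' implementation, the sketch below shows one way such a distribution-aware objective could be set up: a small encoder-decoder network (a stand-in for the U-Net branch) trained with a per-pixel term plus penalties on the mismatch of the first four moments (mean, standard deviation, skewness, kurtosis). All names, shapes, channel counts, and loss weights are hypothetical placeholders.

```python
# Hypothetical sketch of a higher-order-statistics loss; not UFNet itself.
import torch
import torch.nn as nn

def moment_loss(pred, target, eps=1e-6):
    """Penalize mismatch in mean, std, skewness, and kurtosis of the two fields."""
    x, y = pred.flatten(), target.flatten()
    mx, my = x.mean(), y.mean()
    sx, sy = x.std() + eps, y.std() + eps
    skew = ((((x - mx) / sx) ** 3).mean() - (((y - my) / sy) ** 3).mean()) ** 2
    kurt = ((((x - mx) / sx) ** 4).mean() - (((y - my) / sy) ** 4).mean()) ** 2
    return (mx - my) ** 2 + (sx - sy) ** 2 + skew + kurt

class TinyEncoderDecoder(nn.Module):
    """Minimal convolutional encoder-decoder as a stand-in for the U-Net branch."""
    def __init__(self, in_ch=4, out_ch=1):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.Conv2d(16, out_ch, 3, padding=1))
    def forward(self, x):
        return self.dec(self.enc(x))

model = TinyEncoderDecoder()
dyn_vars = torch.randn(8, 4, 64, 64)    # placeholder dynamical predictor fields
obs_precip = torch.rand(8, 1, 64, 64)   # placeholder precipitation target
pred = model(dyn_vars)
loss = nn.functional.l1_loss(pred, obs_precip) + 0.1 * moment_loss(pred, obs_precip)
loss.backward()
```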
Abstract
Long-term environmental monitoring is critical for managing the soil and groundwater at contaminated sites. Recent improvements in state-of-the-art sensor technology, communication networks, and artificial intelligence have created opportunities to modernize this monitoring activity for automated, fast, robust, and predictive monitoring. Such modernization requires that sensor locations be optimized to capture the spatiotemporal dynamics of all monitoring variables while remaining cost-effective. Legacy monitoring datasets from the target area are important for performing this optimization. In this study, we have developed a machine-learning approach to optimize sensor locations for soil and groundwater monitoring based on ensemble supervised learning and majority voting. For spatial optimization, Gaussian Process Regression (GPR) is used for spatial interpolation, while majority voting is applied to accommodate the multivariate temporal dimension. Results show that the approach significantly outperforms random selection of sensor locations for predictive spatiotemporal interpolation. While the method has been applied to a four-dimensional dataset (two-dimensional space, time, and multiple contaminants), we anticipate that it can be generalized to higher-dimensional datasets for optimizing environmental monitoring sensor locations.
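As a rough, hypothetical sketch of this kind of workflow (not the study's code), the example below greedily selects locations that minimize GPR interpolation error for each contaminant separately and then combines the per-contaminant selections by majority voting. The contaminant names, coordinates, and measurements are synthetic placeholders.

```python
# Illustrative sketch: greedy sensor-location selection with GPR per variable,
# combined by majority voting across variables. Synthetic placeholder data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(60, 2))                  # candidate well locations
# Placeholder "legacy" measurements for three contaminants at each location
fields = {c: np.sin(coords[:, 0] / (i + 1)) + 0.1 * rng.normal(size=60)
          for i, c in enumerate(["tritium", "nitrate", "chromium"])}

def greedy_selection(values, n_sensors=10):
    """Pick locations that most reduce GPR interpolation error at the remaining sites."""
    selected = [int(np.argmax(np.abs(values - values.mean())))]
    while len(selected) < n_sensors:
        best, best_err = None, np.inf
        for cand in range(len(values)):
            if cand in selected:
                continue
            idx = selected + [cand]
            gpr = GaussianProcessRegressor(kernel=RBF(length_scale=2.0))
            gpr.fit(coords[idx], values[idx])
            rest = [i for i in range(len(values)) if i not in idx]
            err = np.mean((gpr.predict(coords[rest]) - values[rest]) ** 2)
            if err < best_err:
                best, best_err = cand, err
        selected.append(best)
    return selected

# Majority voting: keep the locations chosen for the most contaminants
votes = np.zeros(len(coords), dtype=int)
for vals in fields.values():
    for i in greedy_selection(vals):
        votes[i] += 1
final_locations = np.argsort(votes)[::-1][:10]
print("Selected sensor indices:", sorted(final_locations.tolist()))
```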
Abstract
Seasonal hypoxia is a recurring threat to ecosystems and fisheries in the Chesapeake Bay. Hypoxia forecasting based on coupled hydrodynamic and biogeochemical models has proven useful for many stakeholders, as these models excel in accounting for the effects of physical forcing on oxygen supply, but may fall short in replicating the more complex biogeochemical processes that govern oxygen consumption. Satellite-derived reflectances could be used to indicate the presence of surface organic matter over the Bay. However, teasing apart the contribution of atmospheric and aquatic constituents from the signal received by the satellite is not straightforward. As a result, it is difficult to derive surface concentrations of organic matter from satellite data in a robust fashion. A potential solution to this complexity is to use deep learning to build end-to-end applications that do not require precise accounting of the satellite signal from the atmosphere or water, phytoplankton blooms, or sediment plumes. By training a deep neural network with data from a vast suite of variables that could potentially affect oxygen in the water column, improvement of short-term (daily) hypoxia forecast may be possible. Here, we predict oxygen concentrations using inputs that account for both physical and biogeochemical factors. The physical inputs include wind velocity reanalysis information, together with 3D outputs from an estuarine hydrodynamic model, including current velocity, water temperature, and salinity. Satellite-derived spectral reflectance data are used as a surrogate for the biogeochemical factors. These input fields are time series of weekly statistics calculated from daily information, starting 8 weeks before each oxygen observation was collected. To accommodate this input data structure, we adopted a model architecture of long short-term memory networks with eight time steps. At each time step, a set of convolutional neural networks are used to extract information from the inputs. Ablation and cross-validation tests suggest that among all input features, the strongest predictor is the 3D temperature field, with which the new model can outperform the state-of-the-art by ∼20% in terms of median absolute error. Our approach represents a novel application of deep learning to address a complex water management challenge.
Significance Statement
This study presents a novel approach that combines deep learning and hydrodynamic model outputs to improve the accuracy of hypoxia forecasts in the Chesapeake Bay. By training a deep neural network with both physical and biogeochemical information as input features, the model accurately predicts oxygen concentration at any depth in the water column 1 day in advance. This approach has the potential to benefit stakeholders and inform adaptation measures during the recurring threat of hypoxia in the Chesapeake Bay. The success of this study suggests the potential for similar applications of deep learning to address complex water management challenges. Further research could investigate the application of this approach to different forecast lead times and other regions and ecosystem types.
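As a minimal, hypothetical sketch of the general design described in the abstract above (a convolutional feature extractor applied at each of eight weekly time steps, feeding an LSTM that predicts oxygen concentration), the example below is illustrative only: channel counts, grid sizes, and layer widths are placeholders, not the study's configuration.

```python
# Hedged sketch: per-time-step CNN features feeding an eight-step LSTM regressor.
import torch
import torch.nn as nn

class StepEncoder(nn.Module):
    """Extract a feature vector from one week of gridded inputs."""
    def __init__(self, in_ch, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU())
    def forward(self, x):
        return self.net(x)

class OxygenForecaster(nn.Module):
    def __init__(self, in_ch=6, feat_dim=64, hidden=64):
        super().__init__()
        self.encoder = StepEncoder(in_ch, feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)       # dissolved-oxygen concentration
    def forward(self, x):                      # x: (batch, 8 weeks, channels, H, W)
        b, t = x.shape[:2]
        feats = self.encoder(x.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])           # predict from the final time step

model = OxygenForecaster()
# Placeholder weekly inputs, e.g. temperature, salinity, currents, wind, reflectance
weekly_inputs = torch.randn(4, 8, 6, 32, 32)
oxygen_pred = model(weekly_inputs)             # shape (4, 1)
```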
Abstract
Explainable artificial intelligence (XAI) methods shed light on the predictions of machine learning algorithms. Several different approaches exist and have already been applied in climate science. However, the usual lack of ground-truth explanations complicates their evaluation and comparison, in turn impeding the choice of an XAI method. Therefore, in this work, we introduce XAI evaluation in the climate context and discuss different desired explanation properties, namely, robustness, faithfulness, randomization, complexity, and localization. To this end, we chose previous work as a case study in which the decade of annual-mean temperature maps is predicted. After training both a multilayer perceptron (MLP) and a convolutional neural network (CNN), multiple XAI methods are applied and their skill scores relative to a random uniform explanation are calculated for each property. Independent of the network, we find that XAI methods such as Integrated Gradients, layerwise relevance propagation, and input times gradients exhibit considerable robustness, faithfulness, and complexity while sacrificing randomization performance. Sensitivity methods (gradient, SmoothGrad, NoiseGrad, and FusionGrad) match the robustness skill but sacrifice faithfulness and complexity for randomization skill. We find architecture-dependent performance differences regarding the robustness, complexity, and localization skills of different XAI methods, highlighting the necessity of research-task-specific evaluation. Overall, our work offers an overview of different evaluation properties in the climate science context and shows how to compare and benchmark different explanation methods, assessing their suitability, based on strengths and weaknesses, for the specific research problem at hand. In this way, we aim to support climate researchers in the selection of a suitable XAI method.
Significance Statement
Explainable artificial intelligence (XAI) helps to understand the reasoning behind the prediction of a neural network. XAI methods have been applied in climate science to validate networks and provide new insight into physical processes. However, the increasing number of XAI methods can overwhelm practitioners, making it difficult to choose an explanation method. Since XAI methods’ results can vary, uninformed choices might cause misleading conclusions about the network decision. In this work, we introduce XAI evaluation to compare and assess the performance of explanation methods based on five desirable properties. We demonstrate that XAI evaluation reveals the strengths and weaknesses of different XAI methods. Thus, our work provides climate researchers with the tools to compare, analyze, and subsequently choose explanation methods.
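To make the idea of a skill score against a random uniform explanation concrete, here is a purely illustrative sketch (not the study's evaluation pipeline): it scores the robustness of a plain gradient explanation under small input perturbations and compares that score with the one obtained for a random uniform explanation. The toy network, data, and perturbation settings are placeholders.

```python
# Illustrative sketch: robustness skill of a gradient explanation vs. a
# random uniform explanation. Toy network and data; not the study's setup.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
x = torch.randn(16, 1, 8, 8)                      # stand-in "temperature maps"
target = net(x).argmax(dim=1)                     # predicted class per sample

def gradient_explanation(model, inputs, labels):
    """Plain gradient of the target logit with respect to the inputs."""
    inputs = inputs.clone().requires_grad_(True)
    model(inputs).gather(1, labels[:, None]).sum().backward()
    return inputs.grad.detach()

def robustness(explain_fn, inputs, labels, sigma=0.05, n=10):
    """Mean explanation change under small Gaussian input perturbations (lower is better)."""
    base = explain_fn(inputs, labels)
    diffs = []
    for _ in range(n):
        noisy = inputs + sigma * torch.randn_like(inputs)
        diffs.append((explain_fn(noisy, labels) - base).abs().mean())
    return torch.stack(diffs).mean()

explain = lambda inp, lab: gradient_explanation(net, inp, lab)
random_explain = lambda inp, lab: torch.rand_like(inp)      # random uniform baseline

score = robustness(explain, x, target)
baseline = robustness(random_explain, x, target)
skill = 1.0 - (score / baseline).item()                     # > 0 means better than random
print(f"Robustness skill vs. random uniform explanation: {skill:.2f}")
```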
Abstract
Wind gusts are often associated with severe hazards and can cause structural and environmental damage, making gust prediction a crucial element of weather forecasting services. In this study, we explored the utilization of machine learning (ML) algorithms integrated with numerical weather prediction outputs from the Weather Research and Forecasting (WRF) Model to align the estimation of wind gust potential with observed gusts. We used two ML algorithms, random forest (RF) and extreme gradient boosting (XGB), along with two statistical techniques, a generalized linear model with identity link function (GLM-Identity) and a generalized linear model with log link function (GLM-Log), to predict storm wind gusts for the northeast (NE) United States. We used 61 simulated extratropical and tropical storms that occurred between 2005 and 2020 to develop and validate the ML and statistical models. To assess the ML model performance, we compared our results with postprocessed gust potential from WRF. Our findings showed that the ML models, especially XGB, performed significantly better than the statistical models and the Unified Post Processor for WRF (WRF-UPP) and were able to better align predicted with observed gusts across all storms. The ML models faced challenges capturing the upper tail of the gust distribution, and the learning curves suggested that XGB was more effective than RF in generating better predictions with fewer storms.
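For orientation only, the hedged sketch below fits tree-ensemble and GLM-style baselines of the kinds named above on synthetic stand-in data; scikit-learn's TweedieRegressor with power=0 is used here as a stand-in for the Gaussian GLMs with identity and log links, and all predictor names and settings are illustrative, not the study's configuration.

```python
# Hedged sketch: compare tree-ensemble and GLM-style baselines for gust prediction.
# Synthetic placeholder data; feature names and model settings are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import TweedieRegressor    # power=0 -> Gaussian GLM
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor                      # assumes xgboost is installed

rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([
    rng.gamma(2, 4, n),         # e.g. 10-m wind speed from WRF
    rng.normal(0, 5, n),        # e.g. surface pressure anomaly
    rng.uniform(0, 1, n),       # e.g. terrain roughness proxy
])
gust = 1.4 * X[:, 0] + 0.3 * np.abs(X[:, 1]) + rng.gumbel(0, 2, n)   # toy gusts
X_tr, X_te, y_tr, y_te = train_test_split(X, gust, random_state=0)

models = {
    "GLM-Identity": TweedieRegressor(power=0, link="identity", alpha=0.0),
    "GLM-Log": TweedieRegressor(power=0, link="log", alpha=0.0, max_iter=1000),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "XGB": XGBRegressor(n_estimators=300, learning_rate=0.05, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name:12s} MAE = {mean_absolute_error(y_te, model.predict(X_te)):.2f}")
```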
Abstract
Detection and tracking of tropical cyclones (TCs) in numerical weather prediction model outputs is essential for many applications, such as forecast guidance and real-time monitoring of events. While this task was automated in the 1990s with heuristic models relying on a set of empirical rules and thresholds, the recent success of machine learning methods for detecting objects in images opens new perspectives. This paper introduces and evaluates the capacity of a convolutional neural network based on the U-Net architecture to detect the TC wind structure, including the maximum wind speed area and the hurricane-force wind speed area, in the outputs of the convective-scale AROME model. A dataset of 400 AROME forecasts over the West Indies domain has been entirely hand-labeled by experts, following a rigorous process to reduce heterogeneities. The U-Net performs well on a wide variety of TC intensities and shapes, with an average intersection-over-union metric of around 0.8. Its performance, however, strongly depends on TC strength, and the detection of weak cyclones is more challenging since their structure is less well defined. The U-Net also significantly outperforms an operational heuristic detection model, with the largest gain for weak TCs, while running much faster. In the last part, the capacity of the U-Net to generalize to slightly different data is demonstrated in the context of a domain change and a resolution increase. In both cases, the pretrained U-Net achieves performance similar to that on the original dataset.
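For readers unfamiliar with the evaluation metric quoted above, the following minimal sketch computes intersection-over-union (IoU) between a predicted wind-structure mask and a hand-labeled reference mask; the circular toy masks are purely illustrative.

```python
# Minimal sketch: IoU between a predicted and a reference segmentation mask.
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """IoU of two boolean masks; returns 1.0 if both masks are empty."""
    pred_mask, true_mask = pred_mask.astype(bool), true_mask.astype(bool)
    union = np.logical_or(pred_mask, true_mask).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred_mask, true_mask).sum() / union)

# Toy example: two overlapping circular "hurricane-force wind" areas
yy, xx = np.mgrid[0:128, 0:128]
truth = (xx - 64) ** 2 + (yy - 64) ** 2 < 30 ** 2
pred = (xx - 70) ** 2 + (yy - 60) ** 2 < 30 ** 2
print(f"IoU = {iou(pred, truth):.2f}")
```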
Abstract
CloudSat’s Cloud Profiling Radar is a valuable tool for remotely monitoring high-latitude snowfall, but its ability to observe hydrometeor activity near the Earth’s surface is limited by a radar blind zone caused by ground clutter contamination. This study presents the development of a deeply supervised U-Net-style convolutional neural network to predict cold season reflectivity profiles within the blind zone at two Arctic locations. The network learns to predict the presence and intensity of near-surface hydrometeors by coupling latent features encoded in clouds aloft of the blind zone with additional context from collocated atmospheric state variables (i.e., temperature, specific humidity, and wind speed). Results show that the U-Net predictions outperform traditional linear extrapolation methods, with low mean absolute error, a 38% higher Sørensen–Dice coefficient, and vertical reflectivity distributions 60% closer to observed values. The U-Net is also able to detect the presence of near-surface cloud with a critical success index (CSI) of 72% and cases of shallow cumuliform snowfall and virga with 18% higher CSI values compared to linear methods. An explainability analysis shows that reflectivity information throughout the scene, especially at cloud edges and at the 1.2-km blind zone threshold, along with atmospheric state variables near the tropopause, are the most significant contributors to model skill. This surface-trained generative inpainting technique has the potential to enhance current and future remote sensing precipitation missions by providing a better understanding of the nonlinear relationship between blind zone reflectivity values and the surrounding atmospheric state.
Significance Statement
Snowfall is a critical contributor to the global water–energy budget, with important connections to water resource management, flood mitigation, and ecosystem sustainability. However, traditional spaceborne remote monitoring of snowfall faces challenges due to a near-surface radar blind zone, which masks a portion of the atmosphere. In this study, a deep learning model was developed to fill in missing data across these regions using surface radar and atmospheric state variables. The model accurately predicts reflectivity, with significant improvements over conventional methods. This innovative approach enhances our understanding of reflectivity patterns and atmospheric interactions, bolstering advances in remote snowfall prediction.
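As a hypothetical, simplified illustration of the inpainting-style setup described in the abstract above (not the study's model or data), the sketch below masks the lowest bins of synthetic reflectivity profiles as a stand-in blind zone and trains a small convolutional network, with the loss evaluated only on the masked bins. The vertical grid, network, and tensor shapes are placeholders.

```python
# Hedged sketch: mask the lowest ~1.2 km of profiles and learn to fill it in.
import torch
import torch.nn as nn

n_bins, bin_km = 64, 0.24           # toy vertical grid: 64 bins x 240 m
blind_bins = int(1.2 / bin_km)      # bins inside the stand-in blind zone

profiles = torch.randn(32, 1, n_bins, 128)            # (batch, ch, height, along-track)
masked_input = profiles.clone()
masked_input[:, :, :blind_bins, :] = 0.0              # hide the blind zone from the model

net = nn.Sequential(                                   # small stand-in for the U-Net
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1))

pred = net(masked_input)
blind_zone = torch.zeros_like(profiles, dtype=torch.bool)
blind_zone[:, :, :blind_bins, :] = True
loss = nn.functional.l1_loss(pred[blind_zone], profiles[blind_zone])   # MAE in the blind zone only
loss.backward()
```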