Abstract
An ensemble postprocessing method is developed for the probabilistic prediction of severe weather (tornadoes, hail, and wind gusts) over the conterminous United States (CONUS). The method combines conditional generative adversarial networks (CGANs), a type of deep generative model, with a convolutional neural network (CNN) to postprocess convection-allowing model (CAM) forecasts. The CGANs are designed to create synthetic ensemble members from deterministic CAM forecasts, and their outputs are processed by the CNN to estimate the probability of severe weather. The method is tested using High-Resolution Rapid Refresh (HRRR) 1–24-h forecasts as inputs and Storm Prediction Center (SPC) severe weather reports as targets. The method produced skillful predictions with up to 20% Brier skill score (BSS) increases compared to other neural-network-based reference methods using a testing dataset of HRRR forecasts in 2021. For the evaluation of uncertainty quantification, the method is overconfident but produces meaningful ensemble spreads that can distinguish good and bad forecasts. The quality of CGAN outputs is also evaluated. Results show that the CGAN outputs behave similarly to a numerical ensemble; they preserved the intervariable correlations and the contribution of influential predictors as in the original HRRR forecasts. This work provides a novel approach to postprocess CAM output using neural networks that can be applied to severe weather prediction.
Significance Statement
We use a new machine learning (ML) technique to generate probabilistic forecasts of convective weather hazards, such as tornadoes and hailstorms, from the output of high-resolution numerical weather model forecasts. The new ML system generates an ensemble of synthetic forecast fields from a single forecast, and these fields are then used to train ML models for convective hazard prediction. Using this ML-generated ensemble for training improves severe weather forecast skill by 10%–20% compared with other ML algorithms that use output from only the single forecast. This work is unique in that it explores the use of ML methods for producing synthetic forecasts of convective storm events and using these to train ML systems for high-impact convective weather prediction.
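The Brier skill score (BSS) quoted in the abstract above compares the Brier score of a forecast with that of a reference forecast. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function names, the toy data, and the constant-probability reference are all assumptions made for illustration; the paper itself uses other neural-network-based methods as references.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, ref_probs, outcomes):
    """BSS = 1 - BS / BS_ref; positive values mean the forecast beats the reference."""
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0:
        raise ValueError("reference forecast is perfect; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / bs_ref


# Hypothetical toy data: 1 = a severe weather report occurred, 0 = none.
outcomes = [1, 0, 0, 1]
forecast = [0.9, 0.1, 0.2, 0.8]
reference = [0.5, 0.5, 0.5, 0.5]  # assumed constant-probability reference
print(round(brier_skill_score(forecast, reference, outcomes), 3))
```

A BSS of 0 means no improvement over the reference, and a BSS of 1 corresponds to a perfect forecast.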
Abstract
Weather radar data are critical for nowcasting and an integral component of numerical weather prediction models. While weather radar data provide valuable information at high resolution, their ground-based nature limits their availability, which impedes large-scale applications. In contrast, meteorological satellites cover larger domains but with coarser resolution. However, with the rapid advancements in data-driven methodologies and modern sensors aboard geostationary satellites, new opportunities are emerging to bridge the gap between ground- and space-based observations, ultimately leading to more skillful weather prediction with high accuracy. Here, we present a transformer-based model for nowcasting ground-based radar image sequences using satellite data up to 2-h lead time. Trained on a dataset reflecting severe weather conditions, the model predicts radar fields occurring under different weather phenomena and shows robustness against rapidly growing/decaying fields and complex field structures. Model interpretation reveals that the infrared channel centered at 10.3 μm (C13) contains skillful information for all weather conditions, while lightning data have the highest relative feature importance in severe weather conditions, particularly in shorter lead times. The model can support precipitation nowcasting across large domains without an explicit need for radar towers, enhance numerical weather prediction and hydrological models, and provide radar proxy for data-scarce regions. Moreover, the open-source framework facilitates progress toward operational data-driven nowcasting.
Significance Statement
Ground-based weather radar data are essential for nowcasting, but data availability limitations hamper usage of radar data across large domains. We present a machine learning model, rooted in transformer architecture, that performs nowcasting of radar data using high-resolution geostationary satellite retrievals, for lead times of up to 2 h. Our model captures the spatiotemporal dynamics of radar fields from satellite data and offers accurate forecasts. Analysis indicates that the infrared channel centered at 10.3 μm provides useful information for nowcasting radar fields under various weather conditions. However, lightning activity exhibits the highest forecasting skill for severe weather at short lead times. Our findings show the potential of transformer-based models for nowcasting severe weather.
Abstract
The diversity in the lightning parameterizations for numerical weather and climate models causes considerable uncertainty in lightning prediction. In this study, we take a data-driven approach to address the lightning parameterization problem, by combining machine learning (ML) techniques with the rich lightning observations from the World Wide Lightning Location Network. Three ML algorithms are trained over the contiguous United States (CONUS) to predict lightning stroke density in a 1° box based on the information about the atmospheric variables in the same grid (local) or over the entire CONUS (nonlocal). The performance of the ML-based lightning schemes is examined and compared with that of a simple, conventional lightning parameterization scheme of Romps et al. We find that all ML-based lightning schemes exhibit a performance that is superior to that of the conventional scheme in the regions and in the seasons with climatologically higher lightning stroke density. To the west of the Rocky Mountains, the nonlocal ML lightning scheme achieves the best overall performance, with lightning stroke density predictions being 70% more accurate than the conventional scheme. Our results suggest that the ML-based approaches have the potential to improve the representation of lightning and other types of extreme weather events in the weather and climate models.
Abstract
Forecast evaluation metrics have been discovered and rediscovered in a variety of contexts, leading to confusion. We look at measures from the 2 × 2 contingency table and the history of their development and illustrate how different fields working on similar problems have arrived at different approaches to, and perspectives on, the same mathematical concepts. For example, probability of detection (POD) is a quantity in meteorology that was also called prefigurance in the field, while the same thing is named recall in information science and machine learning, and sensitivity and true positive rate in the medical literature. Many of the scores that combine three elements of the 2 × 2 table can be seen as either coming from a perspective of Venn diagrams or from the Pythagorean means, possibly weighted, of two ratios of performance measures. Although there are algebraic relationships between the two perspectives, the approaches taken by authors led them in different directions, making it unlikely that they would discover scores that naturally arose from the other approach. We close by discussing the importance of understanding the implicit or explicit values expressed by the choice of scores. In addition, we make some simple recommendations about the appropriate nomenclature to use when publishing interdisciplinary work.
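The two perspectives described above can be illustrated with a pair of well-known scores. The sketch below assumes the critical success index (CSI) as an example of a Venn-diagram score and F1 as an example of a Pythagorean (harmonic) mean of two ratios; these are chosen here for illustration and are not necessarily the examples the paper emphasizes. The function name and counts are hypothetical.

```python
def contingency_scores(hits, false_alarms, misses):
    """Scores built from three cells of the 2 x 2 contingency table."""
    # POD (meteorology) = recall (ML) = sensitivity / true positive rate (medicine).
    pod = hits / (hits + misses)
    # Success ratio (meteorology) = precision (ML).
    success_ratio = hits / (hits + false_alarms)
    # Venn-diagram view: overlap of forecast and observed events over their union.
    csi = hits / (hits + false_alarms + misses)
    # Pythagorean-mean view: harmonic mean of POD and success ratio.
    f1 = 2 * pod * success_ratio / (pod + success_ratio)
    return {"pod": pod, "success_ratio": success_ratio, "csi": csi, "f1": f1}


# Hypothetical counts for illustration only.
s = contingency_scores(hits=30, false_alarms=10, misses=20)
print(s)
# The algebraic link between the two views: F1 = 2 * CSI / (1 + CSI).
print(abs(s["f1"] - 2 * s["csi"] / (1 + s["csi"])) < 1e-12)
```

Because F1 is a monotonic function of CSI, the two scores rank forecasts identically even though they arise from different lines of reasoning.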
Abstract
In this study, we developed an emulator of the Community Multiscale Air Quality (CMAQ) model by employing a one-dimensional (1D) convolutional neural network (CNN) algorithm to predict hourly surface nitrogen dioxide (NO2) concentrations over the most densely populated urban regions in Texas. The inputs for the emulator were the same as those for the CMAQ model, which include emission, meteorological, and land-use/land-cover data. We trained the model over June, July, and August (JJA) of 2011 and 2014 and then tested it on JJA of 2017, achieving an index of agreement (IOA) of 0.95 and a correlation of 0.90. We also employed temporal threefold cross validation to evaluate the model’s performance, ensuring the robustness and generalizability of the results. To gain deeper insights and understand the factors influencing the model’s surface NO2 predictions, we conducted a Shapley additive explanations analysis. The results revealed that solar radiation reaching the surface, planetary boundary layer height, and NOx (NO + NO2) emissions are key variables driving the model’s predictions. These findings highlight the emulator’s ability to capture the individual impact of each variable on the model’s NO2 predictions. Furthermore, our emulator outperformed the CMAQ model in terms of computational efficiency, being more than 900 times as fast in predicting NO2 concentrations, enabling the rapid assessment of various pollution management scenarios. This work offers a valuable resource for air pollution mitigation efforts in Texas, and with appropriate regional training data, its utility could be extended to other regions and pollutants as well.
Significance Statement
This work develops an emulator of the Community Multiscale Air Quality model, using a one-dimensional convolutional neural network to predict hourly surface NO2 concentrations across densely populated regions in Texas. Our emulator is capable of providing rapid and highly accurate NO2 estimates, enabling it to model diverse scenarios and facilitating informed decision-making to improve public health outcomes. Notably, this model outperforms traditional methods in computational efficiency, making it a robust, time-efficient tool for air pollution mitigation efforts. The findings suggest that key variables like solar radiation, planetary boundary layer height, and NOx (NO + NO2) emissions significantly influence the model’s NO2 predictions. By adding appropriate training data, this work can be extended to other regions and other pollutants such as O3, PM2.5, and PM10, offering a powerful tool for pollution mitigation and public health improvement efforts worldwide.
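The two evaluation statistics reported above, the index of agreement (IOA) and the correlation, can be sketched as follows. This assumes the common Willmott form of the IOA and the Pearson correlation; the abstract does not state which variants the authors used, and the function names and sample values are hypothetical.

```python
import math


def index_of_agreement(pred, obs):
    """Willmott index of agreement: 1 is a perfect match, 0 is no agreement."""
    obs_mean = sum(obs) / len(obs)
    squared_error = sum((p - o) ** 2 for p, o in zip(pred, obs))
    potential_error = sum(
        (abs(p - obs_mean) + abs(o - obs_mean)) ** 2 for p, o in zip(pred, obs)
    )
    return 1.0 - squared_error / potential_error


def pearson_correlation(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    p_mean = sum(pred) / len(pred)
    o_mean = sum(obs) / len(obs)
    cov = sum((p - p_mean) * (o - o_mean) for p, o in zip(pred, obs))
    var_p = sum((p - p_mean) ** 2 for p in pred)
    var_o = sum((o - o_mean) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical NO2 values: predictions carry a constant bias of +1.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
print(index_of_agreement(pred, obs), pearson_correlation(pred, obs))
```

The toy case shows why both statistics are worth reporting together: a constant bias leaves the correlation at 1 but lowers the IOA.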
Abstract
Forecasting the intensity of a tropical cyclone (TC) remains challenging, particularly when it undergoes rapid changes in intensity. This study aims to develop a convolutional neural network (CNN) for 24-h forecasts of TC intensity changes and rapid intensifications over the western Pacific. The CNN model, the DeepTC, is trained using a unique loss function, an amplitude focal loss, to better capture large intensity changes, such as those during rapid intensification (RI) events. We show that the DeepTC outperforms operational forecasts, with a lower mean absolute error (8.9%–10.2%) and a higher coefficient of determination (31.7%–35%). In addition, the DeepTC exhibits substantially better skill at capturing RI events than operational forecasts. To understand the superior performance of the DeepTC in RI forecasts, we conduct an occlusion sensitivity analysis to quantify the relative importance of each predictor. Results reveal that scalar quantities such as latitude, previous intensity change, initial intensity, and vertical wind shear play critical roles in successful RI prediction. Additionally, the DeepTC utilizes the three-dimensional distribution of relative humidity to distinguish RI cases from non-RI cases, with higher dry–moist moisture gradients in the mid-to-low troposphere and steeper radial moisture gradients in the upper troposphere shown during RI events. These relationships between the identified key variables and intensity change were successfully simulated by the DeepTC, implying that the relationships are physically reasonable. Our study demonstrates that the DeepTC can be a powerful tool for improving RI understanding and enhancing the reliability of TC intensity forecasts.
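The idea behind an occlusion sensitivity analysis can be sketched in a few lines: replace one predictor at a time with a baseline value and record how much the model output changes. The code below is a generic illustration under stated assumptions, not the DeepTC procedure. The linear toy model, its coefficients, the predictor names, and the zero baseline are all hypothetical.

```python
def occlusion_sensitivity(model, sample, baseline):
    """Absolute change in model output when each predictor is occluded in turn."""
    reference = model(sample)
    importance = {}
    for name in sample:
        occluded = dict(sample)            # copy so the original sample is untouched
        occluded[name] = baseline[name]    # replace one predictor with its baseline
        importance[name] = abs(reference - model(occluded))
    return importance


# Hypothetical stand-in for a trained network: a simple linear function.
def toy_model(x):
    return 3.0 * x["initial_intensity"] + 0.5 * x["wind_shear"]


sample = {"initial_intensity": 2.0, "wind_shear": 4.0}
baseline = {"initial_intensity": 0.0, "wind_shear": 0.0}  # assumed occlusion values
print(occlusion_sensitivity(toy_model, sample, baseline))
```

Predictors whose occlusion produces the largest output change are ranked as most important. For gridded inputs, the same loop would run over spatial patches or vertical levels instead of named scalars.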
Abstract
Tropical cyclones (TCs) are important phenomena, and understanding their behavior requires being able to detect their presence in simulations. Detection algorithms vary; here we compare a novel deep learning–based detection algorithm (TCDetect) with a state-of-the-art tracking system (TRACK) and an observational dataset (IBTrACS) to provide context for potential use in climate simulations. Previous work has shown that TCDetect has good recall, particularly for hurricane-strength events. The primary question addressed here is to what extent the structure of the systems plays a part in detection. To compare with observations of TCs, it is necessary to apply detection techniques to reanalysis. For this purpose, we use ERA-Interim, and a key part of the comparison is the recognition that ERA-Interim itself does not fully reflect the observations. Despite that limitation, TCDetect and TRACK applied to ERA-Interim mostly agree with each other. Also, when considering only hurricane-strength TCs, TCDetect and TRACK correspond well to the TC observations from IBTrACS. Like TRACK, TCDetect has good recall for strong systems; however, it finds a significant number of false positives associated with weaker TCs (i.e., events that are detected as having hurricane strength but are weaker in reality) and extratropical storms. Because TCDetect was not trained to locate TCs, a post hoc method was used to perform comparisons. Although this method was not always successful, some success in matching tracks and events in physical space was also achieved. The analysis of matches suggested that the best results were found in the Northern Hemisphere and that in most regions the detections followed the same patterns in time no matter which detection method was used.
Abstract
Superresolution is the general task of artificially increasing the spatial resolution of an image. The recent surge in machine learning (ML) research has yielded many promising ML-based approaches for performing single-image superresolution, including applications to satellite remote sensing. We develop a convolutional neural network (CNN) to superresolve the 1- and 2-km bands on the GOES-R series Advanced Baseline Imager (ABI) to a common high resolution of 0.5 km. Access to 0.5-km imagery from ABI band 2 enables the CNN to realistically sharpen lower-resolution bands without significant blurring. We first train the CNN on a proxy task that requires only ABI imagery: we degrade the resolution of ABI bands and train the CNN to restore the original imagery. Comparisons at reduced resolution and at full resolution with Landsat-8/Landsat-9 observations illustrate that the CNN produces images with realistic high-frequency detail that is not present in a bicubic interpolation baseline. Estimating all ABI bands at 0.5-km resolution allows for more easily combining information across bands without reconciling differences in spatial resolution. However, more analysis is needed to determine impacts on derived products or multispectral imagery that use superresolved bands. This approach is extensible to other remote sensing instruments that have bands with different spatial resolutions and requires only a small amount of data and knowledge of each channel’s modulation transfer function.
Significance Statement
Satellite remote sensing instruments often have bands with different spatial resolutions. This work shows that we can artificially increase the resolution of some lower-resolution bands by taking advantage of the texture of higher-resolution bands on the GOES-16 ABI instrument using a convolutional neural network. This may help reconcile differences in spatial resolution when combining information across bands, but future analysis is needed to precisely determine impacts on derived products that might use superresolved bands.
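The proxy task described in the abstract above, degrading imagery and training a network to restore it, can be sketched as a routine that builds (input, target) training pairs. This is a simplified illustration: block averaging is assumed here as the degradation step, whereas the abstract indicates that the authors rely on each channel's modulation transfer function. The function names and the tiny image are hypothetical.

```python
def block_average(image, factor):
    """Degrade a 2D image (list of rows) by averaging factor x factor blocks."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def make_training_pair(image, factor):
    """Return (degraded input, original target) for supervised superresolution."""
    return block_average(image, factor), image


# Hypothetical 4 x 4 patch standing in for a native-resolution band.
patch = [
    [1.0, 3.0, 2.0, 2.0],
    [5.0, 7.0, 2.0, 2.0],
    [0.0, 0.0, 4.0, 8.0],
    [0.0, 0.0, 8.0, 4.0],
]
coarse, target = make_training_pair(patch, factor=2)
print(coarse)
```

A network trained to map `coarse` back to `target` can then be applied to the native-resolution band to estimate detail at the next finer scale.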
Abstract
Despite the sophistication of global climate models (GCMs), their coarse spatial resolution limits their ability to resolve important aspects of climate variability and change at the local scale. Both dynamical and empirical methods are used for enhancing the resolution of climate projections through downscaling, each with distinct advantages and challenges. Dynamical downscaling is physics based but comes with a large computational cost, posing a barrier for downscaling an ensemble of GCMs large enough for reliable uncertainty quantification of climate risks. In contrast, empirical downscaling, which encompasses statistical and machine learning techniques, provides a computationally efficient alternative to downscaling GCMs. Empirical downscaling algorithms can be developed to emulate the behavior of dynamical models directly, or through frameworks such as perfect prognosis in which relationships are established between large-scale atmospheric conditions and local weather variables using observational data. However, the ability of empirical downscaling algorithms to apply their learned relationships out of distribution into future climates remains uncertain, as is their ability to represent certain types of extreme events. This review covers the growing potential of machine learning methods to address these challenges, offering a thorough exploration of the current applications and training strategies that can circumvent certain issues. Additionally, we propose an evaluation framework for machine learning algorithms specific to the problem of climate downscaling as needed to improve transparency and foster trust in climate projections.
Significance Statement
This review contributes to our understanding of how machine learning can transform climate downscaling. It serves as a guide to recent advances in machine learning and to how these advances can be better aligned with the inherent challenges of climate downscaling. We provide an overview of these advances with a critical discussion of their advantages and limitations. We also discuss opportunities to refine existing machine learning methods alongside new approaches for generating large ensembles of high-resolution climate projections.
Abstract
The abundance of gaps in satellite image time series often complicates the application of deep learning models such as convolutional neural networks for spatiotemporal modeling. Based on previous work in computer vision on image inpainting, this paper shows how three-dimensional spatiotemporal partial convolutions can be used as layers in neural networks to fill gaps in satellite image time series. To evaluate the approach, we apply a U-Net-like model to incomplete image time series of quasi-global carbon monoxide observations from the Sentinel-5 Precursor (Sentinel-5P) satellite. Prediction errors were comparable to those of two statistical approaches considered, while predictions were computed up to three orders of magnitude faster, making the approach suitable for processing large amounts of satellite data. Partial convolutions can be added as layers to other types of neural networks, making them relatively easy to integrate into existing deep learning models. However, the approach does not provide prediction uncertainties, and further research is needed to understand and improve model transferability. The implementation of spatiotemporal partial convolutions and the U-Net-like model is available as open-source software.
Significance Statement
Gaps in satellite-based measurements of atmospheric variables can make it difficult to apply complex analysis methods such as deep learning. The purpose of this study is to present and evaluate a purely data-driven method to fill incomplete satellite image time series. Application to atmospheric carbon monoxide data suggests that the method can achieve prediction errors comparable to those of other approaches with much lower computation times. Results highlight that the method is promising for larger datasets but also that care must be taken to avoid extrapolation. Future studies may integrate the approach into more complex deep learning models for understanding spatiotemporal dynamics from incomplete data.
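The idea behind a three-dimensional partial convolution can be illustrated with a minimal sketch. It is not the authors' open-source implementation: it assumes the partial convolution formulation commonly used in the image inpainting literature (the kernel is applied to observed pixels only, the result is rescaled by the fraction of observed pixels in the window, and the mask is updated), and it uses a fixed averaging kernel in place of learned weights. The function name `partial_conv3d` and the toy data are hypothetical.

```python
from itertools import product


def partial_conv3d(x, mask, kernel, bias=0.0):
    """Single-channel 3D partial convolution (no padding, stride 1).

    x, mask : nested lists indexed [time][row][col]; mask is 1 where
              a pixel is observed and 0 where it is a gap.
    kernel  : nested list indexed [dt][di][dj].
    Returns the filtered values and the updated mask.
    """
    T, H, W = len(x), len(x[0]), len(x[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    window_size = kt * kh * kw
    shape = (T - kt + 1, H - kh + 1, W - kw + 1)

    out = [[[0.0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
    new_mask = [[[0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]

    for t, i, j in product(range(shape[0]), range(shape[1]), range(shape[2])):
        weighted_sum = 0.0
        n_valid = 0
        for dt, di, dj in product(range(kt), range(kh), range(kw)):
            # Gaps are skipped entirely, so NaN placeholders never propagate.
            if mask[t + dt][i + di][j + dj]:
                weighted_sum += kernel[dt][di][dj] * x[t + dt][i + di][j + dj]
                n_valid += 1
        if n_valid > 0:
            # Rescale by window_size / n_valid to compensate for missing pixels.
            out[t][i][j] = weighted_sum * window_size / n_valid + bias
            new_mask[t][i][j] = 1  # at least one observation: output is valid
        # Otherwise the output stays 0.0 and the mask stays 0 (still a gap).
    return out, new_mask


# Toy example: a 2 x 2 x 2 block (time, row, col) with one missing pixel.
nan = float("nan")
x = [[[4.0, 4.0], [4.0, nan]], [[4.0, 4.0], [4.0, 4.0]]]
mask = [[[1, 1], [1, 0]], [[1, 1], [1, 1]]]
kernel = [[[0.125] * 2 for _ in range(2)] for _ in range(2)]  # 2x2x2 mean filter

filled, filled_mask = partial_conv3d(x, mask, kernel)
```

With the mean kernel, the rescaling makes the output equal to the average of the observed pixels in each window, so the gap does not bias the result toward zero. In a network, stacking such layers shrinks the gaps step by step because the updated mask marks every window containing at least one observation as valid.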