Abstract
This study proposes and assesses a methodology to obtain high-quality probabilistic predictions and uncertainty information of near-landfall tropical cyclone–driven (TC-driven) storm tide and inundation with limited time and resources. Forecasts of TC track, intensity, and size are perturbed according to quasi-random Korobov sequences of historical forecast errors with assumed Gaussian and uniform statistical distributions. These perturbations are run in an ensemble of hydrodynamic storm tide model simulations. The resulting set of maximum water surface elevations is reduced in dimensionality using Karhunen–Loève expansions and then used as a training set to develop a polynomial chaos (PC) surrogate model from which global sensitivities and probabilistic predictions can be extracted. The maximum water surface elevation is extrapolated over dry points, incorporating energy head loss with distance, to properly train the surrogate for predicting inundation. We find that the surrogate constructed with third-order PCs using elastic net penalized regression with leave-one-out cross validation provides the most robust fit across training and test sets. Probabilistic predictions of maximum water surface elevation and inundation area by the surrogate model at 48-h lead time for three past U.S. landfalling hurricanes (Irma in 2017, Florence in 2018, and Laura in 2020) are found to be reliable when compared to best track hindcast simulation results, even when trained with as few as 19 samples. The maximum water surface elevation is most sensitive to perpendicular track-offset errors for all three storms. Laura is also highly sensitive to storm size and has the least reliable prediction.
Significance Statement
The purpose of this study is to develop and evaluate a methodology that can be used to provide high-quality probabilistic predictions of hurricane-induced storm tide and inundation with limited time and resources. This is important for emergency management purposes during or after the landfall of hurricanes. Our results show that sampling forecast errors using quasi-random sequences combined with machine learning techniques that fit polynomial functions to the data are well suited to this task. The polynomial functions also have the benefit of producing exact sensitivity indices of storm tide and inundation to the forecasted hurricane properties such as path, intensity, and size, which can be used for uncertainty estimation. The code implementing the presented methodology is publicly available on GitHub.
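As a sketch of the quasi-random sampling step described above, a rank-1 Korobov lattice can be generated in a few lines. The generator constant and the four-parameter setup below are illustrative assumptions, not the study's configuration; the sample count of 19 echoes the abstract.

```python
import numpy as np

def korobov_points(n, dim, a=17797):
    """Rank-1 Korobov lattice: n quasi-random points in [0, 1)^dim.

    The generating vector is (1, a, a^2, ...) mod n; the value of `a`
    here is an illustrative choice, not tuned for this problem.
    """
    gen = np.array([pow(a, j, n) for j in range(dim)])
    i = np.arange(n).reshape(-1, 1)
    return (i * gen % n) / n  # fractional parts of i * gen / n

# Example: 19 samples over 4 perturbed storm parameters (e.g., track
# offset, along-track error, intensity, size; the parameter list is
# illustrative).
pts = korobov_points(19, 4)
```

Uniform points such as these would then be mapped onto the assumed Gaussian or uniform forecast-error distributions (for instance via `scipy.stats.norm.ppf` for Gaussian margins) before driving the hydrodynamic ensemble.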
Abstract
Heatwaves are extreme near-surface temperature events that can have substantial impacts on ecosystems and society. Early warning systems help to reduce these impacts by helping communities prepare for hazardous climate-related events. However, state-of-the-art prediction systems often cannot make accurate forecasts of heatwaves more than two weeks in advance, which are required for advance warnings. We therefore investigate the potential of statistical and machine learning methods to understand and predict central European summer heatwaves on time scales of several weeks. As a first step, we identify the most important regional atmospheric and surface predictors based on previous studies and supported by a correlation analysis: 2-m air temperature, 500-hPa geopotential, precipitation, and soil moisture in central Europe, as well as Mediterranean and North Atlantic sea surface temperatures, and the North Atlantic jet stream. Based on these predictors, we apply machine learning methods to forecast two targets: summer temperature anomalies and the probability of heatwaves for 1–6 weeks lead time at weekly resolution. For each of these two target variables, we use both a linear and a random forest model. The performance of these statistical models decays with lead time, as expected, but outperforms persistence and climatology at all lead times. For lead times longer than two weeks, our machine learning models compete with the ensemble mean of the European Centre for Medium-Range Weather Forecasts (ECMWF) hindcast system. We thus show that machine learning can help improve subseasonal forecasts of summer temperature anomalies and heatwaves.
Significance Statement
Heatwaves (prolonged extremely warm temperatures) cause thousands of fatalities worldwide each year. These damaging events are becoming even more severe with climate change. This study aims to improve advance predictions of summer heatwaves in central Europe by using statistical and machine learning methods. Machine learning models are shown to compete with conventional physics-based models for forecasting heatwaves more than two weeks in advance. These early warnings can be used to activate effective and timely response plans targeting vulnerable communities and regions, thereby reducing the damage caused by heatwaves.
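The two statistical models named above (a linear model and a random forest) can be sketched with scikit-learn. The synthetic predictor matrix and target below stand in for the weekly predictor fields and temperature-anomaly target; they are not the study's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic weekly predictors standing in for the fields named above
# (T2m, Z500, precipitation, soil moisture, SSTs, jet indices).
X = rng.normal(size=(500, 7))
y = X @ rng.normal(size=7) + 0.5 * rng.normal(size=500)  # anomaly target

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

linear = Ridge().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# Out-of-sample coefficient of determination for each model.
linear_r2 = linear.score(X_test, y_test)
forest_r2 = forest.score(X_test, y_test)
```

In an operational setting one model would be fit per lead time (1–6 weeks), and a classifier variant of each model would produce the heatwave probabilities.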
Abstract
Neural networks (NNs) have become an important tool for prediction tasks—both regression and classification—in environmental science. Since many environmental-science problems involve life-or-death decisions and policy making, it is crucial to provide not only predictions but also an estimate of the uncertainty in the predictions. Until recently, very few tools were available to provide uncertainty quantification (UQ) for NN predictions. However, in recent years the computer-science field has developed numerous UQ approaches, and several research groups are exploring how to apply these approaches in environmental science. We provide an accessible introduction to six of these UQ approaches, then focus on tools for the next step, namely, to answer the question: Once we obtain an uncertainty estimate (using any approach), how do we know whether it is good or bad? To answer this question, we highlight four evaluation graphics and eight evaluation scores that are well suited for evaluating and comparing uncertainty estimates (NN based or otherwise) for environmental-science applications. We demonstrate the UQ approaches and UQ-evaluation methods for two real-world problems: 1) estimating vertical profiles of atmospheric dewpoint (a regression task) and 2) predicting convection over Taiwan based on Himawari-8 satellite imagery (a classification task). We also provide Jupyter notebooks with Python code for implementing the UQ approaches and UQ-evaluation methods discussed herein. This article provides the environmental-science community with the knowledge and tools to start incorporating the large number of emerging UQ methods into their research.
Significance Statement
Neural networks are used for many environmental-science applications, some involving life-or-death decision-making. In recent years new methods have been developed to provide much-needed uncertainty estimates for NN predictions. We seek to accelerate the adoption of these methods in the environmental-science community with an accessible introduction to 1) methods for computing uncertainty estimates in NN predictions and 2) methods for evaluating such estimates.
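One common check of whether an uncertainty estimate is "good or bad," in the spirit of the evaluation tools discussed above, is the probability integral transform (PIT) histogram. The sketch below uses synthetic Gaussian predictions and is not tied to the article's notebooks.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Synthetic regression case: each prediction is a Gaussian with a stated
# mean (mu) and standard deviation (sigma, the uncertainty estimate).
mu = rng.normal(size=2000)
sigma = np.full(2000, 1.0)
obs = rng.normal(mu, 1.0)  # truth drawn consistently with sigma

# Probability integral transform: for a well-calibrated predictor the
# PIT values are uniform on [0, 1], i.e., the histogram is flat.
pit = norm.cdf(obs, loc=mu, scale=sigma)
hist, _ = np.histogram(pit, bins=10, range=(0.0, 1.0))
flatness = hist.std() / hist.mean()  # near 0 for a calibrated predictor
```

A U-shaped histogram would instead indicate underdispersion (sigma too small), and a dome shape overdispersion (sigma too large).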
Abstract
In recent decades, spaceborne microwave and hyperspectral infrared sounding instruments have significantly benefited weather forecasting and climate science. However, existing retrievals of lower-troposphere temperature and humidity profiles have limitations in vertical resolution, and often cannot accurately represent key features such as the mixed-layer thermodynamic structure and the inversion at the planetary boundary layer (PBL) top. Because of the existing limitations in PBL remote sensing from space, there is a compelling need to improve routine, global observations of the PBL and enable advances in scientific understanding and weather and climate prediction. To address this, we have developed a new 3D deep neural network (DNN) that enhances detail and reduces noise in level 2 granules of temperature and humidity profiles from the Atmospheric Infrared Sounder (AIRS)/Advanced Microwave Sounding Unit (AMSU) sounder instruments aboard NASA’s Aqua spacecraft. We show that the enhancement improves accuracy and detail including key features such as capping inversions at the top of the PBL over land, resulting in improved accuracy in estimations of PBL height.
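To illustrate why resolving the capping inversion matters for PBL-height estimation, a minimal sketch: given a temperature profile with a 2-K inversion at 1 km (synthetic values, not AIRS/AMSU retrievals), the PBL top can be located at the strongest positive temperature gradient.

```python
import numpy as np

z = np.linspace(0.0, 3000.0, 61)  # altitude above ground (m)
t = np.where(
    z <= 1000.0,
    290.0 - 0.0098 * z,                         # well-mixed (dry-adiabatic) layer
    290.0 - 9.8 + 2.0 - 0.0065 * (z - 1000.0),  # 2-K inversion, then free troposphere
)

# PBL top: altitude of the strongest positive temperature gradient,
# i.e., the capping inversion.
dtdz = np.gradient(t, z)
pbl_height = z[np.argmax(dtdz)]
```

A retrieval that smooths the inversion away flattens `dtdz` and leaves the estimated PBL height poorly constrained, which is the failure mode the enhancement described above targets.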
Abstract
This paper presents an innovative way of assimilating observations of clouds into the icosahedral nonhydrostatic weather forecasting model for regional scale (ICON-D2), which is operated by the German Weather Service (Deutscher Wetterdienst; DWD). A convolutional neural network (CNN) is trained to detect clouds in camera photographs. The network’s output is a grayscale picture, in which each pixel has a value between 0 and 1, describing the probability of the pixel belonging to a cloud (1) or not (0). By averaging over a certain box of the picture, a value for the cloud cover of that region is obtained. A forward operator is built to map an ICON model state into the observation space. A three-dimensional grid in the space of the camera’s perspective is constructed and the ICON model variable cloud cover (CLC) is interpolated onto that grid. The maximum CLC along the rays that constitute the camera grid is taken as a model equivalent for each pixel. After superobbing, monitoring experiments have been conducted to compare the observations and model equivalents over a longer time period, yielding promising results. Furthermore, we show the performance of a single assimilation step as well as a longer assimilation experiment over a time period of 6 days, which also yields good results. These findings are a proof of concept, and further research is needed before these new observations can be assimilated operationally in any numerical weather prediction (NWP) model.
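The max-along-ray model equivalent and the subsequent superobbing can be sketched with simple array operations; the grid shapes and the 4×4 superobbing box below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: cloud cover (CLC, values in [0, 1]) already interpolated
# onto a camera-perspective grid of shape
# (pixels_y, pixels_x, points_along_ray); all sizes are illustrative.
clc_on_rays = rng.uniform(0.0, 1.0, size=(8, 12, 40))

# Model equivalent per pixel: maximum CLC along the pixel's ray.
model_equivalent = clc_on_rays.max(axis=-1)

# Superobbing: average over coarse boxes (here 4x4 pixels) to reduce
# resolution mismatch and correlated observation error.
superobs = model_equivalent.reshape(2, 4, 3, 4).mean(axis=(1, 3))
```

The same box-averaging would be applied to the CNN's grayscale cloud mask so that observation and model equivalent are compared at matching resolution.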
Abstract
The safe and successful operation of offshore infrastructure relies on a detailed awareness of ocean wave conditions. Ongoing growth in offshore wind energy is focused on very large-scale projects, deployed in ever more challenging environments. This inherently increases both cost and complexity and therefore the requirement for efficient operational planning. To support this, we propose a new machine learning framework for the short-term forecasting of ocean wave conditions to support critical decision-making associated with marine operations. Here, an attention-based long short-term memory (LSTM) neural network approach is used to learn the short-term temporal patterns from in situ observations. This is then integrated with an existing, low computational cost spatial nowcasting model to develop a complete framework for spatiotemporal forecasting. The framework addresses the challenge of filling gaps in the in situ observations and undertakes feature selection, with seasonal training datasets embedded. The full spatiotemporal forecasting system is demonstrated using a case study based on independent observation locations near the southwest coast of the United Kingdom. Results are validated against in situ data from two wave buoy locations within the domain and compared to operational physics-based wave forecasts from the Met Office (the United Kingdom’s national weather service). For these two example locations, the spatiotemporal forecast is found to have an accuracy of R² = 0.9083 and 0.7409 in forecasting 1-h-ahead significant wave height and R² = 0.8581 and 0.6978 in 12-h-ahead forecasts, respectively. Importantly, this represents respectable levels of accuracy, comparable to traditional physics-based forecast products, but requires only a fraction of the computational resources.
Significance Statement
Spectral wave models, based on modeling the underlying physics and physical processes, are traditionally used to generate wave forecasts but require significant computational cost. In this study, we propose a machine learning forecasting framework developed using both in situ buoy observations and a surrogate regional numerical wave model. The proposed framework is validated against in situ measurements at two renewable energy sites and found to have very similar 12-h forecasting errors when benchmarked against the Met Office’s physics-based forecasting model but requires far less computational power. The proposed framework is highly flexible, offers a low-cost, low-computational-resource approach to short-term forecasting, and can operate with other types of observations and other machine learning algorithms to improve the availability and accuracy of predictions.
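A small sketch of the scoring used above: the coefficient of determination R², here applied to a persistence baseline on a synthetic significant-wave-height series (not the buoy records).

```python
import numpy as np

def r_squared(obs, pred):
    """Coefficient of determination used to score the forecasts."""
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
# Synthetic hourly significant wave height Hs (m), for illustration only.
hs = 1.5 + 0.5 * np.sin(np.arange(200) / 10.0) + 0.1 * rng.normal(size=200)

# Persistence baseline for a 12-h-ahead forecast: predict that Hs stays
# at its last observed value.
obs_12h = hs[12:]
persistence = hs[:-12]
score = r_squared(obs_12h, persistence)
```

Persistence skill decays quickly with lead time, which is why a learned temporal model such as the attention-based LSTM is needed for the 12-h horizon.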
Abstract
Estimating the impact of wind-driven snow transport requires modeling wind fields with a lower grid spacing than the spacing on the order of 1 or a few kilometers used in current numerical weather prediction (NWP) systems. In this context, we introduce a new strategy to downscale wind fields from NWP systems to decametric scales, using high-resolution (30 m) topographic information. Our method (named “DEVINE”) leverages a convolutional neural network (CNN) trained to replicate the behavior of the complex atmospheric model ARPS, which was previously run on a large number (7279) of synthetic Gaussian topographies under controlled weather conditions. A 10-fold cross validation reveals that our CNN is able to accurately emulate the behavior of ARPS (mean absolute error for wind speed = 0.16 m s⁻¹). We then apply DEVINE to real cases in the Alps, that is, downscaling wind fields forecast by the AROME NWP system using information from real alpine topographies. DEVINE proved able to reproduce the main features of wind fields in complex terrain (acceleration on ridges, leeward deceleration, and deviations around obstacles). Furthermore, an evaluation on quality-checked observations acquired at 61 sites in the French Alps reveals improved behavior of the downscaled winds (AROME wind speed mean bias is reduced by 27% with DEVINE), especially at the most elevated and exposed stations. Wind direction is, however, only slightly modified. Hence, despite some current limitations inherited from the ARPS simulation setup, DEVINE appears to be an efficient downscaling tool whose minimalist architecture, low input data requirements (NWP wind fields and high-resolution topography), and competitive computing times may be attractive for operational applications.
Significance Statement
Wind largely influences the spatial distribution of snow in mountains, with direct consequences on hydrology and avalanche hazard. Most operational models predicting wind in complex terrain use a grid spacing on the order of several kilometers, too coarse to represent the real patterns of mountain winds. We introduce a novel method based on deep learning to increase this spatial resolution while maintaining acceptable computational costs. Our method mimics the behavior of a complex model that is able to represent part of the complexity of mountain winds by using topographic information only. We compared our results with observations collected in complex terrain and showed that our model improves the representation of winds, notably at the most elevated and exposed observation stations.
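The reported 27% improvement refers to the mean bias of wind speed against station observations. A minimal sketch of that metric with synthetic values (the +1.0 and +0.7 m s⁻¹ biases below are illustrative, not the paper's numbers):

```python
import numpy as np

def mean_bias(pred, obs):
    """Mean bias: average signed error of the predictions."""
    return float(np.mean(pred - obs))

rng = np.random.default_rng(0)
obs = rng.uniform(2.0, 12.0, size=500)           # station wind speeds (m/s)
arome = obs + 1.0 + 0.3 * rng.normal(size=500)   # coarse NWP with a +1 m/s bias
devine = obs + 0.7 + 0.3 * rng.normal(size=500)  # downscaled, smaller bias

# Relative reduction of the mean bias achieved by downscaling.
reduction = 1.0 - mean_bias(devine, obs) / mean_bias(arome, obs)
```

Because the mean bias is signed, it should be read alongside an unsigned score such as the mean absolute error when comparing downscaling methods.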
Abstract
Heatwaves are projected to increase in frequency and severity with global warming. Improved warning systems would help reduce the associated loss of lives, wildfires, power disruptions, and reduction in crop yields. In this work, we explore the potential for deep learning systems trained on historical data to forecast extreme heat on short, medium, and subseasonal time scales. To this end, we train a set of neural weather models (NWMs) with convolutional architectures to forecast surface temperature anomalies globally, 1 to 28 days ahead, at ∼200-km resolution and on the cubed sphere. The NWMs are trained using the ERA5 reanalysis product and a set of candidate loss functions, including the mean-square error and exponential losses targeting extremes. We find that training models to minimize custom losses tailored to emphasize extremes leads to significant skill improvements in the heatwave prediction task, relative to NWMs trained on the mean-square-error loss. This improvement is accomplished with almost no skill reduction in the general temperature prediction task, and it can be efficiently realized through transfer learning, by retraining NWMs with the custom losses for a few epochs. In addition, we find that the use of a symmetric exponential loss reduces the smoothing of NWM forecasts with lead time. Our best NWM is able to outperform persistence in a regressive sense for all lead times and temperature anomaly thresholds considered, and shows positive regressive skill relative to the ECMWF subseasonal-to-seasonal control forecast after 2 weeks.
Significance Statement
Heatwaves are projected to become stronger and more frequent as a result of global warming. Accurate forecasting of these events would enable the implementation of effective mitigation strategies. Here we analyze the forecast accuracy of artificial intelligence systems trained on historical surface temperature data to predict extreme heat events globally, 1 to 28 days ahead. We find that artificial intelligence systems trained to focus on extreme temperatures are significantly more accurate at predicting heatwaves than systems trained to minimize errors in surface temperatures and remain equally skillful at predicting moderate temperatures. Furthermore, the extreme-focused systems compete with state-of-the-art physics-based forecast systems in the subseasonal range, while incurring a much lower computational cost.
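A sketch of an exponential loss targeting extremes, compared against the mean-square error: the squared error is weighted exponentially by the target anomaly so that misses on extreme temperatures dominate the loss. The exact functional form used by the authors may differ, and `gamma` here is an illustrative choice.

```python
import numpy as np

def mse_loss(pred, target):
    return float(np.mean((pred - target) ** 2))

def exp_weighted_loss(pred, target, gamma=1.0):
    """Squared error weighted exponentially by the target anomaly, so
    that errors on extremes dominate the loss; gamma sets the emphasis."""
    return float(np.mean(np.exp(gamma * np.abs(target)) * (pred - target) ** 2))

target = np.array([0.0, 0.5, 3.0])  # standardized temperature anomalies
pred = np.array([0.0, 0.5, 2.0])    # misses the 3-sigma extreme by 1 sigma
```

For this example the exponential weighting inflates the penalty on the missed extreme relative to the plain mean-square error, which is the mechanism behind the skill gain on heatwaves described above.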
Abstract
Backward-in-time predictions are needed to better understand the underlying dynamics of physical fluid flows and to improve future forecasts. However, integrating fluid flows backward in time is challenging because of numerical instabilities arising from the diffusive nature of fluid systems and the nonlinearities of the governing equations. Although this problem has long been addressed by using a nonpositive diffusion coefficient when integrating backward, that approach is notoriously inaccurate. In this study, a physics-informed deep neural network (PI-DNN) is presented to predict past states of a dissipative dynamical system from snapshots of the subsequent evolution of the system state. The performance of the PI-DNN is investigated in several systematic numerical experiments, and the accuracy of the backward-in-time predictions is evaluated in terms of different error metrics. The proposed PI-DNN can predict the previous state of Rayleigh–Bénard convection with an 8-time-step average normalized
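The instability that motivates this work can be illustrated with a toy 1D heat equation: flipping the sign of the diffusion coefficient to integrate "backward" amplifies high-wavenumber content instead of damping it. This sketch uses an explicit finite-difference scheme; the grid, time step, and step count are illustrative choices, not the study's setup.

```python
import numpy as np

def diffuse(u, D, dt, dx, steps):
    """Explicit finite-difference integration of u_t = D * u_xx
    with periodic boundaries. D < 0 mimics backward-in-time integration."""
    u = u.copy()
    for _ in range(steps):
        lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
        u = u + dt * D * lap
    return u

x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
u0 = np.sin(x)
dx = x[1] - x[0]
dt = 0.2 * dx**2  # satisfies the explicit stability bound for D = +1

u_fwd = diffuse(u0, +1.0, dt, dx, 200)  # decays smoothly, as diffusion should
u_bwd = diffuse(u0, -1.0, dt, dx, 200)  # grows; short wavelengths amplify fastest
```

Every Fourier mode that decays under forward diffusion grows under the sign-flipped integration, with the fastest growth at the grid scale, so even round-off noise is rapidly amplified. This is the ill-posedness the PI-DNN approach is designed to sidestep.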