Abstract
Sea surface height observations provided by satellite altimetry since 1993 show a rising rate (3.4 mm/year) for global mean sea level. While on average, sea level has risen 10 cm over the last 30 years, there is considerable regional variation in the sea level change. Through this work, we predict sea level trends 30 years into the future at a 2-degree spatial resolution and investigate the future patterns of the sea level change. We show the potential of machine learning (ML) in this challenging application of long-term sea level forecasting over the global ocean. Our approach incorporates sea level data from both altimeter observations and climate model simulations. We develop a supervised learning framework using fully connected neural networks (FCNNs) that can predict the sea level trend based on climate model projections. Alongside this, our method provides uncertainty estimates associated with the ML prediction. We also show the effectiveness of partitioning our spatial dataset and learning a dedicated ML model for each segmented region. We compare two partitioning strategies: one achieved using domain knowledge, and the other employing spectral clustering. Our results demonstrate that segmenting the spatial dataset with spectral clustering improves the ML predictions.
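As a quick consistency check on the figures quoted above, the following minimal sketch fits an ordinary least-squares trend to a sea level series and converts the rate into a 30-year rise. The series is synthetic and purely illustrative (it is not altimeter data), and the function name `linear_trend` is a hypothetical helper, not part of the authors' framework.

```python
def linear_trend(years, heights_mm):
    """Ordinary least-squares slope (mm/year) of height against time."""
    n = len(years)
    mean_t = sum(years) / n
    mean_h = sum(heights_mm) / n
    cov = sum((t - mean_t) * (h - mean_h) for t, h in zip(years, heights_mm))
    var = sum((t - mean_t) ** 2 for t in years)
    return cov / var


# Synthetic series (assumption): a perfectly linear rise of 3.4 mm/year from 1993.
years = list(range(1993, 2024))            # 31 annual samples spanning 30 years
heights_mm = [3.4 * (y - 1993) for y in years]

rate = linear_trend(years, heights_mm)     # recovered trend in mm/year
total_rise_cm = rate * (years[-1] - years[0]) / 10.0  # mm -> cm over 30 years

print(f"trend: {rate:.2f} mm/year, rise over 30 years: {total_rise_cm:.1f} cm")
```

A rate of 3.4 mm/year sustained over 30 years gives about 10 cm, which matches the global mean rise stated in the abstract; the regional trends the study predicts would be fitted per grid cell in the same way.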
Abstract
Weather predictions 2–4 weeks in advance, called the subseasonal time scale, are highly relevant for socioeconomic decision-makers. Unfortunately, the skill of numerical weather prediction models at this time scale is generally low. Here, we use probabilistic random forest (RF)-based machine learning models to postprocess the subseasonal to seasonal (S2S) reforecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF). We show that these models are able to improve the forecasts slightly in a 20-winter mean at lead times of 14, 21, and 28 days for wintertime central European mean 2-m temperatures compared to the lead-time-dependent mean bias-corrected ECMWF’s S2S reforecasts and RF-based models using only reanalysis data as input. Predictions of the occurrence of cold wave days are improved at lead times of 21 and 28 days. Thereby, forecasts of continuous temperatures show a better skill than forecasts of binary occurrences of cold wave days. Furthermore, we analyze if the skill depends on the large-scale flow configuration of the atmosphere at initialization, as represented by weather regimes (WRs). We find that the WR at the start of the forecast influences the skill and its evolution across lead times. These results can be used to assess the conditional improvement of forecasts initialized during one WR in comparison to forecasts initialized during another WR.
Significance Statement
Forecasts of winter temperatures and cold waves 2–4 weeks in advance done by numerical weather prediction (NWP) models are often unsatisfactory due to the chaotic characteristics of the atmosphere and limited predictive skill at this time range. Here, we use statistical methods, belonging to the so-called machine learning (ML) models, to improve forecast quality by postprocessing predictions of a state-of-the-art NWP model. We compare the forecasts of the NWP and ML models considering different weather regimes (WRs), which represent the large-scale atmospheric circulation such as the typical westerly winds in Europe. We find that the ML models generally yield better temperature forecasts for 14, 21, and 28 days in advance and better forecasts of cold wave days 21 and 28 days in advance. The quality of forecasts depends on the WR present at the forecast start. This information can be used to assess the conditional improvement of forecasts.
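The baseline mentioned above, a lead-time-dependent mean bias correction, can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple layout, the function names `fit_lead_time_bias` and `correct`, and the numbers are hypothetical, and the abstract does not say how the authors estimate the bias (in practice it would be computed from reforecasts that exclude the winter being verified).

```python
from collections import defaultdict


def fit_lead_time_bias(reforecasts):
    """Mean forecast-minus-observation error for each lead time.

    reforecasts: iterable of (lead_days, forecast, observation) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lead, forecast, observation in reforecasts:
        sums[lead] += forecast - observation
        counts[lead] += 1
    return {lead: sums[lead] / counts[lead] for lead in sums}


def correct(forecast, lead_days, bias_by_lead):
    """Subtract the mean bias estimated for this lead time."""
    return forecast - bias_by_lead[lead_days]


# Illustrative 2-m temperature values in degrees Celsius (made up).
training = [
    (14, 2.0, 1.0), (14, 3.0, 1.0),    # forecasts too warm at day 14
    (21, 0.0, 1.0), (21, -1.0, 1.0),   # forecasts too cold at day 21
]
bias = fit_lead_time_bias(training)
corrected = correct(4.0, 14, bias)

print(bias, corrected)
```

The random forest models described in the abstract are compared against forecasts corrected in this spirit, so any gain they show is on top of removing the systematic, lead-dependent offset.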
Abstract
Monitoring fine particulate matter (PM2.5) is crucial for evaluating air quality and its effects on public health. However, the limited distribution of monitoring stations presents a challenge in accurately assessing air pollution, especially in areas distant from these stations. To address this challenge, our study introduces a two-step deep learning approach for estimating daily gap-free surface PM2.5 concentrations across the contiguous United States (CONUS) from 2018 to 2022, with a spatial resolution of 4 km. In the first phase, we employ a Depthwise-Partial Convolutional Neural Network (DW-PCNN) to fill gaps between surface PM2.5 stations, utilizing Aerosol Optical Depth (AOD) data from the Moderate Resolution Imaging Spectroradiometer (MODIS). In the second phase, we integrate the PM2.5 grids imputed by the DW-PCNN with meteorological and anthropogenic variables into a Deep Convolutional Neural Network (Deep-CNN) to further enhance the accuracy of our estimation. This enables us to estimate gap-free surface PM2.5 concentrations accurately, evidenced by a Pearson’s correlation coefficient (R) of 0.92 and an Index of Agreement (IOA) of 0.96 in ten-fold cross-validation. We also introduce a grid-based method for calculating PM2.5 Design Values (DV), providing a continuous spatial representation of PM2.5 DV that enhances the traditional station-based approach. Our grid-based DV representations offer a comprehensive perspective on air quality, facilitating more detailed analysis. Furthermore, our model's ability to provide spatiotemporally consistent, gap-free PM2.5 data addresses the issue of missing values, supporting health impact research, policy formulation, and the accuracy of environmental assessments.
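The two skill scores quoted above can be computed as in the following sketch. The abstract does not state which definition of the Index of Agreement is used; this example assumes Willmott's original form, IOA = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)², and the sample values are made up for illustration.

```python
import math


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (std_o * std_p)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (assumed definition); 1 is a perfect match."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Illustrative daily PM2.5 values in micrograms per cubic metre (made up).
obs = [5.0, 8.0, 12.0, 20.0, 9.0]
pred = [o + 2.0 for o in obs]          # predictions with a constant +2 offset

r = pearson_r(obs, pred)
ioa = index_of_agreement(obs, pred)
print(f"R = {r:.3f}, IOA = {ioa:.3f}")
```

The example shows why both scores are reported: a constant offset leaves R at 1 because correlation ignores bias, while the IOA drops below 1 because it penalizes the squared differences between predictions and observations.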
Abstract
Optical turbulence poses a significant challenge for communication, directed energy, and imaging systems, particularly in the atmospheric boundary layer. Effective modeling of optical turbulence is crucial for the development and deployment of these systems, yet the lack of standardized evaluation tools and benchmark data sets hinders the development and adoption of machine learning to address these challenges. We introduce the otbench Python package, a comprehensive framework for rigorous development and evaluation of optical turbulence strength prediction models. This package provides a consistent interface for testing models across diverse data sets and tasks, including a novel, long-term data set collected over two years at the United States Naval Academy. otbench incorporates a range of baseline models (statistical, data-driven, and deep learning), enabling researchers to assess the relative quality of their approaches and identify areas for improvement. Our analysis reveals the applicability of various models across different environments, highlighting the importance of long-term data sets for robust model evaluation. By promoting standardized benchmarking and facilitating model comparison, otbench empowers researchers to accelerate the adoption of machine learning techniques for optical turbulence modeling.
Abstract
Deep learning frequently leverages satellite imagery to estimate key benchmarked properties of tropical cyclones (TCs) such as intensity. This study goes a step further to investigate the potential for using this two-dimensional information to produce a two-dimensional wind field product for the TC inner core. Here we train a product on flight-level in situ wind from center-crossing aircraft transects and focus on the ability to reproduce a full two-dimensional field of flight-level wind. The wind model, dubbed ‘TC2D,’ is a unique multi-branched UNet design with a loss function that efficiently compensates for the relative sparsity of labeled data. This model accurately captures many challenging radial wind profiles, including large eyewalls, profiles with secondary wind maxima, and TCs in transition between various states. It performs well in a variety of environments, including strong vertical wind shear. The RMS error of the estimated radius of maximum winds is 15.5 km for Category 2–5 TCs, and half of the tested cases have error less than 1.3 km. The RMS error of wind speed is 5–6 m s−1 for tropical depression to Category 1-strength TCs and 5–10 m s−1 for Category 2–5 TCs, depending on radius. The model generally lacks the ability to reproduce the storm-relative azimuthal variability of flight-level wind, but it successfully captures earth-relative variability due to the more straightforward corrections for TC translation. TC2D promises to be a good nowcasting aid, providing low-latency estimates of the TC inner core wind distribution several times per day.
Abstract
Long-term environmental monitoring is critical for managing the soil and groundwater at contaminated sites. Recent improvements in state-of-the-art sensor technology, communication networks, and artificial intelligence have created opportunities to modernize this monitoring activity for automated, fast, robust, and predictive monitoring. In such modernization, sensor locations must be optimized to capture the spatiotemporal dynamics of all monitoring variables while keeping the monitoring cost-effective. The legacy monitoring datasets of the target area are important to perform this optimization. In this study, we have developed a machine-learning approach to optimize sensor locations for soil and groundwater monitoring based on ensemble supervised learning and majority voting. For spatial optimization, Gaussian process regression (GPR) is used for spatial interpolation, while majority voting is applied to accommodate the multivariate temporal dimension. Results show that the algorithms significantly outperform the random selection of the sensor locations for predictive spatiotemporal interpolation. While the method has been applied to a four-dimensional dataset (with two-dimensional space, time, and multiple contaminants), we anticipate that it can be generalized to higher-dimensional datasets for environmental monitoring sensor location optimization.
Abstract
We introduce a machine learned surrogate model from high-resolution simulation data to capture the subgrid-scale effects in dry, stratified atmospheric flows. We use deep neural networks (NNs) to model the spatially local state differences between a coarse-resolution simulation and a high-resolution simulation. The setup enables the capture of both dissipative and antidissipative effects in the state differences. The NN model is able to accurately capture the state differences in offline tests outside the training regime. In online tests intended for production use, the NN-coupled coarse simulation has higher accuracy over a significant period of time compared to the coarse-resolution simulation without any correction. We provide evidence of the capability of the NN model to accurately capture high-gradient regions in the flow field. With the accumulation of the errors, the NN-coupled simulation becomes computationally unstable after approximately 90 coarse simulation time steps. Insights gained from these surrogate models further pave the way for formulating stable, complex, physics-based spatially local NN models which are driven by traditional subgrid-scale turbulence closure models.
Significance Statement
Flows in the atmosphere are highly chaotic and turbulent, comprising flow structures of broad scales. For effective computational modeling of atmospheric flows, the effects of the small- and large-scale structures need to be captured by the simulations. Capturing the small-scale structures requires fine-resolution simulations. Even with the current state-of-the-art supercomputers, it can be prohibitively expensive to simulate these flows when computed for the entire earth over climate time scales. Thus, it is necessary to focus on the larger-scale structures using a coarse-resolution simulation while capturing the effects of the smaller-scale structures using some parameterization (approximation) scheme and incorporating it into the coarse-resolution simulation. We use machine learning to model the effects of the small-scale structures (subgrid-scale effects) in atmospheric flows. Data from a fine-resolution simulation is used to compute the missing subgrid-scale effects in coarse-resolution simulations. We then use machine learning models to approximate these differences between the coarse- and fine-resolution simulations. We see improved accuracy for the coarse-resolution simulations when corrected using these machine learned models.
Abstract
Regional climate models (RCMs) are essential tools for simulating and studying regional climate variability and change. However, their high computational cost limits the production of comprehensive ensembles of regional climate projections covering multiple scenarios and driving global climate models (GCMs) across regions. RCM emulators based on deep learning models have recently been introduced as a cost-effective and promising alternative that requires only short RCM simulations to train the models. Therefore, evaluating their transferability to different periods, scenarios, and GCMs becomes a pivotal and complex task in which the inherent biases of both GCMs and RCMs play a significant role. Here, we focus on this problem by considering the two different emulation approaches introduced in the literature as perfect and imperfect, which we refer to here as perfect prognosis (PP) and model output statistics (MOS), respectively, following the well-established downscaling terminology. In addition to standard evaluation techniques, we expand the analysis with methods from the field of explainable artificial intelligence (XAI) to assess the physical consistency of the empirical links learnt by the models. We find that both approaches are able to emulate certain climatological properties of RCMs for different periods and scenarios (soft transferability), but the consistency of the emulation functions differs between approaches. Whereas PP learns robust and physically meaningful patterns, MOS results are GCM dependent and lack physical consistency in some cases. Both approaches face problems when transferring the emulation function to other GCMs (hard transferability), due to the existence of GCM-dependent biases. This limits their applicability to build RCM ensembles. We conclude by giving prospects for future applications.
Significance Statement
Regional climate model (RCM) emulators are a cost-effective emerging approach for generating comprehensive ensembles of regional climate projections. Promising results have been recently obtained using deep learning models. However, their potential to capture the regional climate dynamics and to emulate other periods, emission scenarios, or driving global climate models (GCMs) remains an open issue that affects their practical use. This study explores the potential of current emulation approaches incorporating new explainable artificial intelligence (XAI) evaluation techniques to assess the reliability and transferability of the emulators. Our findings show that the different global and regional model biases involved in the different approaches play a key role in transferability. Based on the results obtained, we provide some prospects for potential applications of these models in challenging problems.
Abstract
The Atlantic Meridional Overturning Circulation (AMOC) is an important component of the global climate, known to be a tipping element, as it could collapse under global warming. The main objective of this study is to compute the probability that the AMOC collapses within a specified time window, using a rare-event algorithm called Trajectory-Adaptive Multilevel Splitting (TAMS). However, the efficiency and accuracy of TAMS depend on the choice of the score function. Although the definition of the optimal score function, called the “committor function,” is known, it is impossible in general to compute it a priori. Here, we combine TAMS with a Next-Generation Reservoir Computing technique that estimates the committor function from the data generated by the rare-event algorithm. We test this technique in a stochastic box model of the AMOC for which two types of transition exist, the so-called F(ast)-transitions and S(low)-transitions. Results for the F-transitions compare favorably with those in the literature where a physically informed score function was used. We show that coupling a rare-event algorithm with machine learning allows for a correct estimation of transition probabilities, transition times, and even transition paths for a wide range of model parameters. We then extend these results to the more difficult problem of S-transitions in the same model. In both cases of F-transitions and S-transitions, we also show how the Next-Generation Reservoir Computing technique can be interpreted to retrieve an analytical estimate of the committor function.
Abstract
Producing high-quality forecasts of key climate variables, such as temperature and precipitation, on subseasonal time scales has long been a gap in operational forecasting. This study explores an application of machine learning (ML) models as postprocessing tools for subseasonal forecasting. Lagged numerical ensemble forecasts (i.e., an ensemble where the members have different initialization dates) and observational data, including relative humidity, pressure at sea level, and geopotential height, are incorporated into various ML methods to predict monthly average precipitation and 2-m temperature 2 weeks in advance for the continental United States. For regression, quantile regression, and tercile classification tasks, we consider using linear models, random forests, convolutional neural networks, and stacked models (a multimodel approach based on the predictions of the individual ML models). Unlike previous ML approaches that often use the ensemble mean alone, we leverage information embedded in the ensemble forecasts to enhance prediction accuracy. Additionally, we investigate extreme event predictions that are crucial for planning and mitigation efforts. Considering ensemble members as a collection of spatial forecasts, we explore different approaches to using spatial information. Trade-offs between different approaches may be mitigated with model stacking. Our proposed models outperform standard baselines such as climatological forecasts and ensemble means. In addition, we investigate feature importance, trade-offs between using the full ensemble or only the ensemble mean, and different modes of accounting for spatial variability.
Significance Statement
Accurately forecasting temperature and precipitation on subseasonal time scales—2 weeks–2 months in advance—is extremely challenging. These forecasts would have immense value in agriculture, insurance, and economics. Our paper describes an application of machine learning techniques to improve forecasts of monthly average precipitation and 2-m temperature using lagged physics-based predictions and observational data 2 weeks in advance for the entire continental United States. For lagged ensembles, the proposed models outperform standard benchmarks such as historical averages and averages of physics-based predictions. Our findings suggest that utilizing the full set of physics-based predictions instead of the average enhances the accuracy of the final forecast.
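The difference between using the full ensemble and only its mean can be illustrated with a small tercile example. This is a sketch under stated assumptions, not one of the models from the paper: the climatology and ensemble values are made up, and the helper names (`tercile_edges`, `tercile_probabilities`, `tercile_of_mean`) are hypothetical.

```python
import statistics


def tercile_edges(climatology):
    """Lower and upper tercile boundaries of a historical sample."""
    lower, upper = statistics.quantiles(climatology, n=3)
    return lower, upper


def tercile_probabilities(members, lower, upper):
    """Fraction of ensemble members in the below, near, and above terciles."""
    n = len(members)
    below = sum(m < lower for m in members) / n
    above = sum(m > upper for m in members) / n
    return below, 1.0 - below - above, above


def tercile_of_mean(members, lower, upper):
    """Tercile index (0 below, 1 near, 2 above) of the ensemble mean alone."""
    mean = statistics.fmean(members)
    if mean < lower:
        return 0
    if mean > upper:
        return 2
    return 1


# Hypothetical monthly mean 2-m temperatures (degrees C) for one grid cell.
climatology = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0]
# Hypothetical lagged ensemble: two cold members and three warm members.
members = [11.0, 12.0, 16.5, 17.0, 17.5]

lower, upper = tercile_edges(climatology)
probs = tercile_probabilities(members, lower, upper)
mean_class = tercile_of_mean(members, lower, upper)
print("tercile probabilities (below, near, above):", probs)
print("tercile of the ensemble mean:", mean_class)
```

In this made-up case the ensemble mean falls in the near-normal tercile even though no individual member does, which is the kind of information that is lost when the members are averaged before being passed to a downstream model.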