Abstract
NOAA global surface temperature (NOAAGlobalTemp) is NOAA’s operational global surface temperature product, which has been widely used in Earth’s climate assessment and monitoring. To improve the spatial interpolation of monthly land surface air temperatures (LSATs) in NOAAGlobalTemp from 1850 to 2020, a three-layer artificial neural network (ANN) system was designed. The ANN system was trained by repeatedly selecting, at random, 90% of the LSATs from ERA5 (1950–2019) and validating with the remaining 10%. Validations show clear improvements of ANN over the original empirical orthogonal teleconnection (EOT) method: the global spatial correlation coefficient (SCC) increases from 65% to 80%, and the global root-mean-square difference (RMSD) decreases from 0.99° to 0.57°C during 1850–2020. The improvements in SCCs and RMSDs are larger in the Southern Hemisphere than in the Northern Hemisphere and are larger before the 1950s and where observations are sparse. Finally, the ANN system was fed with observed LSATs, and its output over the global land surface was compared with that from the EOT method. Comparisons demonstrate similar improvements by ANN over the EOT method: the global SCC increases from 78% to 89%, the global RMSD decreases from 0.93° to 0.68°C, and the LSAT variability quantified by the monthly standard deviation (STD) increases from 1.16° to 1.41°C during 1850–2020. While the SCC, RMSD, and STD at the monthly time scale have been improved, long-term trends remain largely unchanged because the low-frequency component of LSAT in ANN is identical to that in the EOT approach.
Significance Statement
An artificial neural network–based spatial interpolation method has greatly improved the accuracy of land surface air temperature reconstruction, reducing the root-mean-square error and increasing spatial coherence and variability over the global land surface from 1850 to 2020.
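For readers who want a concrete picture of the kind of interpolation system this abstract describes, the following Python sketch trains a small feed-forward network on repeated 90%/10% splits, in the spirit of the ERA5-based training protocol. All array sizes, layer widths, and the synthetic data are illustrative assumptions, not the authors' configuration.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_months, n_obs_boxes, n_target_boxes = 840, 500, 2592       # hypothetical sizes (70 yr x 12 months; 5-deg grid)
X = rng.standard_normal((n_months, n_obs_boxes))             # stand-in for anomalies at "observed" grid boxes
W = rng.standard_normal((n_obs_boxes, n_target_boxes)) / n_obs_boxes
y = X @ W + 0.1 * rng.standard_normal((n_months, n_target_boxes))   # stand-in full-field anomalies

# Repeated 90%/10% splits, echoing the training/validation protocol described in the abstract.
for split in range(3):
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.1, random_state=split)
    ann = MLPRegressor(hidden_layer_sizes=(64,), max_iter=300, random_state=split)   # one hidden layer
    ann.fit(X_tr, y_tr)
    rmsd = np.sqrt(np.mean((ann.predict(X_va) - y_va) ** 2))
    print(f"split {split}: validation RMSD = {rmsd:.3f}")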
Abstract
A deep learning model is presented to nowcast the occurrence of lightning at a 5-min time resolution 60 min into the future. The model is based on a recurrent-convolutional architecture that allows it to recognize and predict the spatiotemporal development of convection, including the motion, growth, and decay of thunderstorm cells. The predictions are performed on a stationary grid, without the use of storm object detection and tracking. The input data, collected from an area in and surrounding Switzerland, comprise ground-based radar data, visible/infrared satellite data and derived cloud products, lightning detection, numerical weather prediction, and digital elevation model data. We analyze alternative loss functions, class weighting strategies, and model features, providing guidelines for future studies to select loss functions optimally and to properly calibrate the probabilistic predictions of their models. On the basis of these analyses, we use focal loss in this study but conclude that it provides only a small benefit over cross entropy, which is a viable option if recalibration of the model is not practical. The model achieves a pixelwise critical success index (CSI) of 0.45 for predicting lightning occurrence within 8 km over the 60-min nowcast period, ranging from a CSI of 0.75 at a 5-min lead time to a CSI of 0.32 at a 60-min lead time.
Significance Statement
We have developed a method based on artificial intelligence to forecast the occurrence of lightning at 5-min intervals within the next hour from the forecast time. The method utilizes a neural network that learns to predict lightning from a set of training images containing lightning detection data, weather radar observations, satellite imagery, weather forecasts, and elevation data. We find that the network is able to predict the motion, growth, and decay of lightning-producing thunderstorms and that, when properly tuned, it can accurately determine the probability of lightning occurring. This is expected to permit more informed decisions to be made about short-term lightning risks in fields such as civil protection, electricity-grid management, and aviation.
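Since the abstract weighs focal loss against cross entropy for a rare-event (lightning/no lightning) target, a minimal sketch of binary focal loss is given below; the gamma and alpha values and the toy tensors are assumptions, not the settings used in the paper.

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.5):
    """Binary focal loss; with gamma = 0 it reduces to (alpha-weighted) cross entropy."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)                                   # probability assigned to the true class
    weight = alpha * targets + (1 - alpha) * (1 - targets)  # optional class weighting
    return (weight * (1 - p_t) ** gamma * bce).mean()

# Example: per-pixel lightning/no-lightning targets on a small grid with a rare positive class.
logits = torch.randn(4, 1, 64, 64)
targets = (torch.rand(4, 1, 64, 64) < 0.05).float()
print(focal_loss(logits, targets).item())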
Abstract
Tropical cyclone (TC) track forecasts derived from dynamical models inherit those models’ errors. In this study, a neural network (NN) algorithm was proposed for postprocessing TC tracks predicted by the Global Ensemble Forecast System (GEFS) for lead times of 2, 4, 5, and 6 days over the western North Pacific. The hybrid NN is a combination of three NN classes: 1) a convolutional NN that extracts spatial features from GEFS fields; 2) a multilayer perceptron that processes TC positions predicted by GEFS; and 3) a recurrent NN that handles information from previous time steps. A dataset of 204 TCs (6744 samples) that formed from 1985 to 2019 (June–October) and survived for at least six days was separated into various track patterns. TCs in each track pattern were distributed uniformly to the validation and test datasets, each of which contained 10% of the TCs in the entire dataset, and the remaining 80% were allocated to the training dataset. Two NN architectures were developed, with and without a shortcut connection. Feature selection and hyperparameter tuning were performed to improve model performance. The results show that mean track error and dispersion could be reduced, particularly with the shortcut connection, which also corrected the systematic speed and direction biases of GEFS. Although a reduction in mean track error was not achieved by the NNs for every forecast lead time, improvement can be foreseen upon calibration for reducing overfitting, and the performance encourages further development of the present application.
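A rough sketch of how a CNN, a multilayer perceptron, and a recurrent NN can be combined with an optional shortcut connection, as the abstract outlines, is shown below. Channel counts, layer widths, the choice of a GRU, and the form of the shortcut (adding the correction to the GEFS-predicted position) are all assumptions made for illustration.

import torch
import torch.nn as nn

class HybridTrackNet(nn.Module):
    """Sketch of a CNN + MLP + GRU hybrid with an optional shortcut connection."""
    def __init__(self, n_channels=4, use_shortcut=True):
        super().__init__()
        self.use_shortcut = use_shortcut
        self.cnn = nn.Sequential(                       # spatial features from gridded fields
            nn.Conv2d(n_channels, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mlp = nn.Sequential(nn.Linear(2, 16), nn.ReLU())   # predicted (lat, lon) at each step
        self.rnn = nn.GRU(input_size=32 + 16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 2)                    # corrected (lat, lon) offset

    def forward(self, fields, positions):
        # fields: (batch, time, channels, H, W); positions: (batch, time, 2)
        b, t = fields.shape[:2]
        feats = self.cnn(fields.flatten(0, 1)).view(b, t, -1)
        pos_feats = self.mlp(positions)
        out, _ = self.rnn(torch.cat([feats, pos_feats], dim=-1))
        correction = self.head(out[:, -1])
        return positions[:, -1] + correction if self.use_shortcut else correction

model = HybridTrackNet()
print(model(torch.randn(8, 6, 4, 32, 32), torch.randn(8, 6, 2)).shape)   # -> (8, 2)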
Abstract
Many deep learning technologies have been applied to the Earth sciences. Nonetheless, the difficulty of interpreting deep learning results still prevents their application to studies of climate dynamics. Here, we applied a convolutional neural network to understand El Niño–Southern Oscillation (ENSO) dynamics from long-term climate model simulations. The deep learning algorithm successfully predicted ENSO events with a high correlation skill (∼0.82) at a 9-month lead. To interpret the deep learning results beyond the prediction itself, we present a “contribution map,” which estimates how much each grid box and variable contribute to the output, and a “contribution sensitivity,” which estimates how much the output variable changes in response to a small perturbation of the input variables. The contribution map and sensitivity are calculated by modifying the input variables fed to the pretrained deep learning model, an approach quite similar to occlusion sensitivity. Based on the two methods, we identified three precursors of ENSO and investigated the physical processes through which they affect El Niño and La Niña development. In particular, it is suggested here that the role of each precursor is asymmetric between El Niño and La Niña. Our results suggest that the contribution map and sensitivity are simple approaches but can be powerful tools for understanding ENSO dynamics, and they might also be applied to other climate phenomena.
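The occlusion-style idea behind the contribution map can be sketched as follows; the patch size, fill value, and the toy linear "model" are assumptions, and the authors' exact perturbation scheme may differ.

import numpy as np

def contribution_map(model_predict, x, patch=(4, 4), fill=0.0):
    """Occlusion-style contribution map: zero out one patch of one variable at a time
    and record how much the scalar model output changes."""
    base = model_predict(x)
    n_lat, n_lon, n_var = x.shape
    contrib = np.zeros_like(x)
    for i in range(0, n_lat, patch[0]):
        for j in range(0, n_lon, patch[1]):
            for v in range(n_var):
                x_pert = x.copy()
                x_pert[i:i + patch[0], j:j + patch[1], v] = fill
                contrib[i:i + patch[0], j:j + patch[1], v] = base - model_predict(x_pert)
    return contrib

# Toy "model": a predicted ENSO index as a weighted sum of the input field
# (e.g., SST and heat-content anomalies as the two variables).
rng = np.random.default_rng(1)
weights = rng.standard_normal((24, 72, 2))
x = rng.standard_normal((24, 72, 2))
cmap = contribution_map(lambda f: float((weights * f).sum()), x)
print(cmap.shape)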
Abstract
A simple yet flexible and robust algorithm is described for fully partitioning an arbitrary dataset into compact, nonoverlapping groups or classes, sorted by size, based entirely on a pairwise similarity matrix and a user-specified similarity threshold. Unlike many clustering algorithms, there is no assumption that natural clusters exist in the dataset, although clusters, when present, may be preferentially assigned to one or more classes. The method also does not require data objects to be compared within any coordinate system but rather permits the user to define pairwise similarity using almost any conceivable criterion. The method therefore lends itself to certain geoscientific applications for which conventional clustering methods are unsuited, including two nontrivial and distinctly different datasets presented as examples. In addition to identifying large classes containing numerous similar dataset members, it is also well suited for isolating rare or anomalous members of a dataset. The method is inductive in that prototypes identified in a representative subset of a larger dataset can be used to classify the remainder.
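One plausible (greedy) reading of the threshold-based partitioning described above is sketched below; the published algorithm may differ in detail, and the distance-based similarity used in the example is only for demonstration.

import numpy as np

def partition_by_similarity(S, threshold):
    """Greedy sketch: repeatedly seed a class with the unassigned member having the most
    above-threshold neighbors, assign those neighbors to it, and continue until all
    members are classified; classes are returned sorted by size."""
    n = S.shape[0]
    unassigned = set(range(n))
    classes = []
    while unassigned:
        idx = sorted(unassigned)
        sub = S[np.ix_(idx, idx)] >= threshold
        np.fill_diagonal(sub, True)
        seed = int(np.argmax(sub.sum(axis=1)))
        members = [idx[k] for k in np.nonzero(sub[seed])[0]]
        classes.append(members)
        unassigned -= set(members)
    return sorted(classes, key=len, reverse=True)

# Example with a similarity matrix derived from random 2D points.
rng = np.random.default_rng(0)
pts = rng.standard_normal((50, 2))
S = 1.0 / (1.0 + np.linalg.norm(pts[:, None] - pts[None, :], axis=-1))
print([len(c) for c in partition_by_similarity(S, threshold=0.6)])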
Abstract
As wind and solar power play increasingly important roles in the European energy system, unfavorable weather conditions, such as “Dunkelflaute” (extended calm and cloudy periods), will pose ever greater challenges to transmission system operators. Thus, accurate identification and characterization of such events from open data streams (e.g., reanalysis, numerical weather prediction, and climate projection) are going to be crucial. In this study, we propose a two-step, unsupervised deep learning framework [wind and solar network (WISRnet)] to automatically encode spatial patterns of wind speed and insolation and, subsequently, identify Dunkelflaute periods from the encoded patterns. Specifically, a deep convolutional neural network (CNN)–based autoencoder (AE) is first employed for feature extraction from the spatial patterns. These two-dimensional CNN-AE patterns encapsulate both amplitude and spatial information in a parsimonious way. In the second step of the WISRnet framework, a variant of the well-known k-means algorithm is used to divide the CNN-AE patterns into region-dependent meteorological clusters. For the validation of the WISRnet framework, aggregated wind and solar power production data from Belgium are used. Using a simple criterion from the published literature, all the Dunkelflaute periods are directly identified from this 6-year-long dataset. Next, each of these periods is associated with a WISRnet-derived cluster. Interestingly, we find that the majority of these Dunkelflaute periods are part of only 5 clusters (out of 25). We show that in lieu of proprietary power production data, the WISRnet framework can identify Dunkelflaute periods from public-domain meteorological data. To further demonstrate its versatility, the framework is also deployed to identify and characterize Dunkelflaute events in Denmark, Sweden, and the United Kingdom.
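The two-step idea, a convolutional autoencoder for feature extraction followed by k-means on the latent codes, can be sketched as follows; layer sizes, the 32 x 32 grid, and the untrained autoencoder (the reconstruction-loss training loop is omitted) are illustrative assumptions rather than the WISRnet configuration.

import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class ConvAE(nn.Module):
    """Small convolutional autoencoder; the latent code stands in for the CNN-AE patterns."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 8, 3, stride=2, padding=1), nn.ReLU(),    # 2 channels: wind speed, insolation
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 2, 3, stride=2, padding=1, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

x = torch.randn(100, 2, 32, 32)          # stand-in wind/insolation maps (training loop omitted)
recon, codes = ConvAE()(x)
labels = KMeans(n_clusters=25, n_init=10, random_state=0).fit_predict(
    codes.flatten(1).detach().numpy())   # cluster the encoded patterns
print(recon.shape, labels.shape)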
Abstract
A wealth of forecasting models is available for operational weather forecasting. Their strengths often depend on the lead time considered, which generates the need for a seamless combination of different forecast methods. The combined, continuous products are designed to retain or even enhance the forecast quality of the individual forecasts and to extend the lead time for potentially hazardous weather events. In this study, we further improve an artificial neural network–based combination model that was proposed in a previous paper. This model combines two initial precipitation ensemble forecasts and produces exceedance probabilities for a set of thresholds for hourly precipitation amounts. The two initial forecasts perform differently well at different lead times, whereas the combined forecast is calibrated and outperforms both initial forecasts with respect to various validation scores and for all considered lead times (from +1 to +6 h). Moreover, the robustness of the combination model is tested by applying it to a new dataset and by evaluating the spatial and temporal consistency of its forecasts. The proposed changes further improve the forecast quality and make the product more useful for practical applications. Temporal consistency of the combined product is evaluated using a flip-flop index. It is shown that the combination provides higher persistence with decreasing lead time than both input systems.
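A minimal sketch of an NN that maps predictors from two input ensembles to exceedance probabilities for several thresholds is given below; the predictor set (per-threshold member fractions plus lead time), the network size, and the synthetic data are assumptions, not the combination model of the paper.

import torch
import torch.nn as nn

# Inputs per grid point and lead time: the fraction of members exceeding each threshold
# in the two input ensembles, plus the lead time itself; targets are exceedance events.
n_thresholds = 5
x = torch.rand(1024, 2 * n_thresholds + 1)             # synthetic predictors
y = (torch.rand(1024, n_thresholds) < 0.2).float()     # synthetic exceedance labels

combiner = nn.Sequential(
    nn.Linear(2 * n_thresholds + 1, 32), nn.ReLU(),
    nn.Linear(32, n_thresholds),                        # one logit per threshold
)
opt = torch.optim.Adam(combiner.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(combiner(x), y)
    loss.backward()
    opt.step()
probs = torch.sigmoid(combiner(x))                      # probabilistic exceedance forecasts
print(probs.shape)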
Abstract
Convolutional neural networks (CNNs) have recently attracted great attention in geoscience because of their ability to capture nonlinear system behavior and extract predictive spatiotemporal patterns. Given their black-box nature, however, and the importance of prediction explainability, methods of explainable artificial intelligence (XAI) are gaining popularity as a means to explain the CNN decision-making strategy. Here, we establish an intercomparison of some of the most popular XAI methods and investigate their fidelity in explaining CNN decisions for geoscientific applications. Our goal is to raise awareness of the theoretical limitations of these methods and to gain insight into the relative strengths and weaknesses to help guide best practices. The considered XAI methods are first applied to an idealized attribution benchmark, in which the ground truth of explanation of the network is known a priori, to help objectively assess their performance. Second, we apply XAI to a climate-related prediction setting, namely, to explain a CNN that is trained to predict the number of atmospheric rivers in daily snapshots of climate simulations. Our results highlight several important issues of XAI methods (e.g., gradient shattering, inability to distinguish the sign of attribution, and ignorance to zero input) that have previously been overlooked in our field and, if not considered cautiously, may lead to a distorted picture of the CNN decision-making strategy. We envision that our analysis will motivate further investigation into XAI fidelity and will help toward a cautious implementation of XAI in geoscience, which can lead to further exploitation of CNNs and deep learning for prediction problems.
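Two of the simpler attribution methods that typically enter such intercomparisons, gradient (saliency) and input-times-gradient, can be computed as follows on a toy CNN; the network and input are placeholders, and the paper compares a broader set of XAI methods.

import torch
import torch.nn as nn

# Toy CNN producing a scalar output (e.g., a predicted count) from a single-channel map.
cnn = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
x = torch.randn(1, 1, 32, 32, requires_grad=True)
cnn(x).sum().backward()

saliency = x.grad.abs()        # gradient magnitude: unsigned attribution
input_x_grad = x * x.grad      # input*gradient: signed attribution; the two can disagree
print(saliency.shape, input_x_grad.shape)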
Abstract
Satellite low-Earth-orbiting (LEO) and geostationary (GEO) imager estimates of cloud-top pressure (CTP) have many applications, both in operations and in studying long-term variations in cloud properties. Recently, machine learning (ML) approaches have shown improvement over physically based algorithms. However, ML approaches, and especially neural networks, can suffer from a lack of interpretability, making it difficult to understand what information is most useful for accurate predictions of cloud properties. We trained several neural networks to estimate CTP from the infrared channels of the Visible Infrared Imaging Radiometer Suite (VIIRS) and the Advanced Baseline Imager (ABI). The main focus of this work is assessing the relative importance of each instrument’s infrared channels in neural networks trained to estimate CTP. We use several ML explainability methods to offer different perspectives on feature importance. These methods show many differences in the relative feature importance depending on the exact method used, but most agree on a few points. Overall, the 8.4- and 8.6-μm channels appear to be the most useful for CTP estimation on ABI and VIIRS, respectively, with other native infrared window channels and the 13.3-μm channel playing a moderate role. Furthermore, we find that the neural networks learn relationships that may account for properties of clouds such as opacity and cloud-top phase that otherwise complicate the estimation of CTP.
Significance Statement
Model interpretability is an important consideration for transitioning machine learning models to operations. This work applies several explainability methods in an attempt to understand what information is most important for estimating the pressure level at the top of a cloud from satellite imagers in a neural network model. We observe much disagreement between approaches, which motivates further work in this area, but we find agreement on the importance of channels in the infrared window region around 8.6 and 10–12 μm, informing future cloud property algorithm development. We also find some evidence suggesting that these neural networks are able to learn physically relevant variability in radiation measurements related to key cloud properties.
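One of the explainability perspectives such a study can use, permutation importance per input channel, is sketched below on a synthetic regression problem; the six "channels", the target construction, and the network size are assumptions, not the VIIRS/ABI setup of the paper.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Permutation importance: shuffle one input channel at a time and measure the loss of skill.
rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 6))                   # stand-ins for six IR brightness temperatures
y = 2.0 * X[:, 2] - 1.0 * X[:, 4] + 0.1 * rng.standard_normal(5000)   # synthetic CTP-like target
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

base_rmse = np.sqrt(np.mean((model.predict(X) - y) ** 2))
for ch in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, ch] = rng.permutation(Xp[:, ch])           # destroy the information in this channel
    rmse = np.sqrt(np.mean((model.predict(Xp) - y) ** 2))
    print(f"channel {ch}: RMSE increase = {rmse - base_rmse:.3f}")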
Abstract
A major challenge for food security worldwide is the large interannual variability of crop yield, and climate change is expected to further exacerbate this volatility. Accurate prediction of the crop response to climate variability and change is critical for short-term management and long-term planning in multiple sectors. In this study, using maize in the U.S. Corn Belt as an example, we train and validate multiple machine learning (ML) models predicting crop yield based on meteorological variables and soil properties using the leave-one-year-out approach, and compare their performance with that of a widely used process-based crop model (PBM). Our proposed long short-term memory model with attention (LSTMatt) outperforms other ML models (including other variations of LSTM developed in this study) and explains 73% of the spatiotemporal variance of the observed maize yield, in contrast to 16% explained by the regionally calibrated PBM; the magnitude of yield prediction errors in LSTMatt is about one-third of that in the PBM. When applied to the extreme drought year 2012 that has no counterpart in the training data, the LSTMatt performance drops but still shows an advantage over the PBM. Findings from this study suggest a great potential for out-of-sample application of the LSTMatt model to predict crop yield under a changing climate.
Significance Statement
Changing climate is expected to exacerbate extreme weather events, thus affecting global food security. Accurate estimation and prediction of crop productivity under extremes are crucial for long-term agricultural decision-making and climate adaptation planning. Here we seek to improve crop yield prediction from meteorological features and soil properties using machine learning approaches. Our long short-term memory (LSTM) model with attention and shortcut connection explains 73% of the spatiotemporal variance of the observed maize yield in the U.S. Corn Belt and outperforms a widely used process-based crop model even in an extreme drought year when meteorological conditions are significantly different from the training data. Our findings suggest great potential for out-of-sample application of the LSTM model to predict crop yield under a changing climate.
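A minimal sketch of an LSTM with additive attention over the growing-season time steps, the general family to which LSTMatt belongs, is shown below; the feature count, sequence length, and attention form are assumptions, and the published architecture (e.g., its shortcut connection) is more elaborate.

import torch
import torch.nn as nn

class LSTMAtt(nn.Module):
    """LSTM with simple additive attention pooling over time, mapping a weather/soil
    sequence to a single yield value."""
    def __init__(self, n_features=10, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.att = nn.Linear(hidden, 1)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        # x: (batch, time steps, features), e.g., weekly weather plus repeated soil properties
        h, _ = self.lstm(x)
        w = torch.softmax(self.att(h), dim=1)      # attention weights over the season
        context = (w * h).sum(dim=1)
        return self.head(context).squeeze(-1)      # predicted yield (anomaly)

model = LSTMAtt()
print(model(torch.randn(16, 30, 10)).shape)        # -> (16,)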