Abstract
Forecast skill from dynamical forecast models decreases quickly with projection time due to various errors. Therefore, postprocessing methods, ranging from simple bias correction to more complicated multiple linear regression–based model output statistics, are used to improve raw model forecasts. These methods usually show clear improvement over the raw model forecasts, especially for short-range weather forecasts. However, linear approaches have limitations because the relationship between predictands and predictors may be nonlinear. This is especially true for extended-range forecasts, such as week-3–4 forecasts. In this study, neural network techniques are used to model the relationships between a set of predictors and predictands, and ultimately to improve week-3–4 precipitation and 2-m temperature forecasts made by the NOAA/NCEP Climate Forecast System. Recent advances in machine learning algorithms, together with the availability of big datasets, make it possible not only to explore nonlinear features or relationships within a given large dataset, but also to extract more sophisticated pattern relationships and covariabilities hidden within the multidimensional predictors and predictands. These relationships and the associated high-level statistical information are then used to correct the model week-3–4 precipitation and 2-m temperature forecasts. The results show that neural network techniques can, to some extent, significantly improve week-3–4 forecast accuracy and greatly increase efficiency relative to traditional multiple linear regression methods.
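The two linear baselines named in this abstract, mean bias correction and multiple linear regression, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, and the function names (`fit_bias`, `fit_mlr`, `apply_mlr`) and the second predictor `aux` are hypothetical. It does not reproduce the study's predictors or its neural network.

```python
import numpy as np

def fit_bias(forecast, observed):
    """Mean forecast-minus-observation error over the training period."""
    return float(np.mean(forecast - observed))

def fit_mlr(predictors, observed):
    """Least-squares coefficients for observed ~ intercept + predictors."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return coef

def apply_mlr(coef, predictors):
    """Apply fitted regression coefficients to new predictor values."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    return design @ coef

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-ins (assumption): a raw forecast, one extra predictor,
# and observations that depend linearly on both, with an offset and noise.
rng = np.random.default_rng(0)
n = 400
raw = rng.normal(size=n)
aux = rng.normal(size=n)
obs = 0.6 * raw + 0.3 * aux - 1.0 + 0.1 * rng.normal(size=n)

train, test = slice(0, 300), slice(300, None)

# Bias correction removes only the mean error.
bias = fit_bias(raw[train], obs[train])
bias_corrected = raw[test] - bias

# Regression also rescales the forecast and uses the extra predictor.
X = np.column_stack([raw, aux])
coef = fit_mlr(X[train], obs[train])
mlr_corrected = apply_mlr(coef, X[test])

rmse_raw = rmse(raw[test], obs[test])
rmse_bias = rmse(bias_corrected, obs[test])
rmse_mlr = rmse(mlr_corrected, obs[test])
```

In this toy setup the relationship is linear by construction, so regression recovers it almost exactly. The abstract's argument is that real week-3–4 relationships may be nonlinear, which is where a linear fit of this kind would fall short.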
Abstract
Intense thunderstorms threaten life and property, impact aviation, and are a challenging forecast problem, particularly without precipitation-sensing radar data. To identify intense thunderstorms, trained forecasters often look for features in geostationary satellite images such as rapid cloud growth, strong and persistent overshooting tops, U- or V-shaped patterns in storm-top temperature (and associated above-anvil cirrus plumes), thermal couplets, intricate texturing in cloud albedo (e.g., “bubbling” cloud tops), cloud-top divergence, spatial and temporal trends in lightning, and other nuances. In this paper, a machine-learning algorithm was employed to automatically learn and extract salient features and patterns in geostationary satellite data for the prediction of intense convection. Specifically, a convolutional neural network (CNN) was trained on 0.64-μm reflectance and 10.35-μm brightness temperature from the Advanced Baseline Imager (ABI) and flash-extent density (FED) from the Geostationary Lightning Mapper (GLM) on board GOES-16. Using a training dataset of over 220 000 human-labeled satellite images, the CNN learned pertinent features that are known to be associated with intense convection and skillfully discriminated between intense and ordinary convection. The CNN also learned a more nuanced feature associated with intense convection: strong infrared brightness temperature gradients near cloud edges in the vicinity of the main updraft. A successive-permutation test ranked the most important predictors as follows: 1) ABI 10.35-μm brightness temperature, 2) GLM flash-extent density, and 3) ABI 0.64-μm reflectance. The CNN model can provide forecasters with quantitative information that often foreshadows the occurrence of severe weather, day or night, over the full range of instrument-scan modes.
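One common reading of a successive-permutation test is a multi-pass procedure: shuffle each remaining predictor in turn, rank the one whose shuffling degrades the model most, leave it shuffled, and repeat. The sketch below illustrates that reading under stated assumptions. The function name, the toy linear "model", and the synthetic data are all hypothetical, and the abstract does not say this is exactly how the authors ran their test.

```python
import numpy as np

def successive_permutation_ranking(predict, X, y, loss, rng):
    """Rank predictor columns by multi-pass permutation importance.

    At each pass, every not-yet-ranked column is shuffled in turn while
    already-ranked columns stay shuffled. The column whose shuffling
    yields the largest loss is ranked next and left shuffled.
    """
    X_work = X.copy()
    remaining = list(range(X.shape[1]))
    ranking = []
    while remaining:
        worst_loss, worst_col, worst_values = -np.inf, None, None
        for col in remaining:
            trial = X_work.copy()
            trial[:, col] = rng.permutation(trial[:, col])
            trial_loss = loss(y, predict(trial))
            if trial_loss > worst_loss:
                worst_loss = trial_loss
                worst_col = col
                worst_values = trial[:, col].copy()
        X_work[:, worst_col] = worst_values
        ranking.append(worst_col)
        remaining.remove(worst_col)
    return ranking

# Toy stand-in for a trained model (assumption): a fixed linear score.
weights = np.array([3.0, 1.5, 0.5])

def predict(X):
    return X @ weights

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 3))
y = predict(X)

# Hypothetical column labels, used only to make the ranking readable.
names = ["ir_brightness_temp", "flash_extent_density", "vis_reflectance"]
ranking = successive_permutation_ranking(predict, X, y, mse, rng)
ranked_names = [names[i] for i in ranking]
```

Keeping earlier predictors shuffled is what separates this from a single-pass test: later ranks reflect what each predictor adds once the information in the higher-ranked ones has already been destroyed, which matters when predictors are correlated.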
Abstract
Most ensembles suffer from underdispersion and systematic biases. One way to correct for these shortcomings is via machine learning (ML), which is advantageous due to its ability to identify and correct nonlinear biases. This study uses a single random forest (RF) to calibrate next-day (i.e., 12–36-h lead time) probabilistic precipitation forecasts over the contiguous United States (CONUS) from the Short-Range Ensemble Forecast System (SREF) with 16-km grid spacing and the High-Resolution Ensemble Forecast version 2 (HREFv2) with 3-km grid spacing. Random forest forecast probabilities (RFFPs) from each ensemble are compared against raw ensemble probabilities over 496 days from April 2017 to November 2018 using 16-fold cross validation. RFFPs are also compared against spatially smoothed ensemble probabilities since the raw SREF and HREFv2 probabilities are overconfident and undersample the true forecast probability density function. Probabilistic precipitation forecasts are evaluated at four precipitation thresholds ranging from 0.1 to 3 in. In general, RFFPs are found to have better forecast reliability and resolution, fewer spatial biases, and significantly greater Brier skill scores and areas under the relative operating characteristic curve compared to corresponding raw and spatially smoothed ensemble probabilities. The RFFPs perform best at the lower thresholds, which have a greater observed climatological frequency. Additionally, the RF-based postprocessing technique benefits the SREF more than the HREFv2, likely because the raw SREF forecasts contain more systematic biases than those from the raw HREFv2. It is concluded that the RFFPs provide a convenient, skillful summary of calibrated ensemble output and are computationally feasible to implement in real time. Advantages and disadvantages of ML-based postprocessing techniques are discussed.
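The Brier skill score used as a verification metric here can be written out in a few lines. This is a minimal sketch that assumes sample climatology (the observed event frequency) as the reference forecast. The abstract does not state which reference the study used, and the example data are made up.

```python
import numpy as np

def brier_score(prob, outcome):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    prob = np.asarray(prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    return float(np.mean((prob - outcome) ** 2))

def brier_skill_score(prob, outcome):
    """Skill relative to a constant forecast of the sample event frequency.

    Assumes the sample contains both events and non-events; otherwise the
    reference score is zero and the ratio is undefined.
    """
    outcome = np.asarray(outcome, dtype=float)
    climatology = np.full(outcome.shape, outcome.mean())
    reference = brier_score(climatology, outcome)
    return 1.0 - brier_score(prob, outcome) / reference

# Made-up example: eight forecast cases, two of which exceeded the threshold.
outcome = np.array([0, 0, 1, 0, 1, 0, 0, 0])
forecast = np.array([0.1, 0.2, 0.7, 0.1, 0.6, 0.3, 0.1, 0.0])

bss_forecast = brier_skill_score(forecast, outcome)
bss_perfect = brier_skill_score(outcome, outcome)
bss_climatology = brier_skill_score(np.full(8, outcome.mean()), outcome)
```

A score of 1 means a perfect forecast, 0 means no improvement over the reference, and negative values mean the forecast is worse than the reference.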