Abstract
Using a 3-km regional ensemble prediction system (EPS), this study tested a three-dimensional (3D) rescaling mask for initial condition (IC) perturbation. Whether the 3D mask-based EPS improves ensemble forecasts over the current two-dimensional (2D) mask-based EPS was evaluated in three aspects: ensemble mean, spread, and probability. The forecasts of wind, temperature, geopotential height, sea level pressure, and precipitation were examined for a summer month (1–28 July 2018) and a winter month (1–27 February 2019) over a region in North China. The EPS was run twice per day (initiated at 0000 and 1200 UTC) to 36 h in forecast length, providing 56 warm-season forecast cases and 54 cold-season cases for verification. The warm and cold seasons are verified separately for comparison. The study found the following: 1) The vertical profile of IC perturbation becomes closer to that of analysis uncertainty with the 3D rescaling mask. 2) Ensemble performance is significantly improved in all three aspects. The biggest improvement is in the ensemble spread, followed by the probabilistic forecast, and the least improvement is in the ensemble mean forecast. Larger improvements are seen in the warm season than in the cold season. 3) Improvement is greater at shorter forecast ranges (<24 h) than at longer ranges. 4) Surface and lower-level variables are improved more than upper-level ones. 5) The underlying mechanism for the improvement has been investigated. Convective instability is found to be responsible for the spread increment and, thus, the overall ensemble forecast improvement. Therefore, using a 3D rescaling mask is recommended for an EPS to increase its utility, especially for shorter forecast ranges and surface weather elements.
Significance Statement
A weather prediction model is a complex system that consists of nonlinear differential equations. Small errors in either its inputs or the model itself grow with time during model integration and contaminate the forecast. To quantify such contamination (“uncertainty”) of a forecast, the ensemble forecasting technique is used. An ensemble of forecasts is a set of multiple model runs made at the same time but with slightly “perturbed” inputs or model versions. These small perturbations are supposed to represent the true “uncertainty” in the inputs or model representation. This study proposed a technique that makes a perturbation’s vertical structure more closely resemble the real uncertainty (intrinsic error) in the input data and confirmed that it can significantly improve ensemble forecast quality, especially for shorter time ranges and lower-level weather elements. Convective instability is found to be responsible for the improvement.
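The three aspects evaluated above (ensemble mean, spread, and probability) can be illustrated with a minimal sketch for a single grid point. This is not the study's verification code: the function name, the member values, and the threshold are hypothetical, and the spread is taken here as the sample standard deviation of the members, which is one common convention among several.

```python
from statistics import mean, stdev


def ensemble_summary(members, threshold):
    """Summarize one grid point of an ensemble forecast.

    Returns (ensemble mean, ensemble spread, exceedance probability).
    Spread is assumed to be the sample standard deviation of the members.
    """
    ens_mean = mean(members)
    spread = stdev(members)
    # Probability forecast: fraction of members exceeding the threshold.
    prob = sum(1 for m in members if m > threshold) / len(members)
    return ens_mean, spread, prob


# Hypothetical 2-m temperature forecasts (deg C) from five members.
members = [20.0, 22.0, 24.0, 26.0, 28.0]
ens_mean, spread, prob = ensemble_summary(members, threshold=25.0)
print(ens_mean, round(spread, 2), prob)
```

An under-dispersive ensemble shows a spread that is small relative to the error of the ensemble mean, which is why an increase in spread can translate into better probabilistic forecasts.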
Abstract
Warm-season heavy rainfall in Minnesota can lead to flooding with serious impacts on life and infrastructure. Situated in a transition zone between humid eastern and semiarid western conditions in the United States, Minnesota experiences large spatial variability in precipitation. Previous research has often lacked the spatiotemporal detail needed for heavy rainfall analysis in Minnesota. This research used Stage-IV hourly precipitation data with 4-km grid spacing during May–September 2004–20 to analyze the spatial, seasonal, and event-based characteristics of Minnesota rainfall. Rain event frequency, accumulation, hours, and intensities were compared for all rain events (>2.5 mm) and heavy rain events (>36 mm). For all rain events, results showed the highest regional median monthly rain event frequency (>6 events) in June and the lowest (<5 events) in September. Median monthly accumulations were largest (∼75 mm) in June, followed by July and August. Monthly total rain event hours at a point peaked around 20 h in May in southeastern Minnesota. Smaller event accumulations occurred more frequently than larger accumulations, and event mean intensities were higher in summertime (June–August) than in May and September for rain events and heavy rain events. Heavy rain event region-based analyses showed monthly peaks for frequency in July–August, accumulation in July, and event hours in June–July and September. Median heavy rain event durations were shorter during June–August than in May and September. Monthly heavy rain event accumulation as a percent of all rain event accumulation was greatest in September (24%). These results establish a foundation for future research into precipitation patterns and trends.
Significance Statement
Climate analysis has indicated that Minnesota is in a region where increases in heavy rainfall are anticipated for the future. Heavy rainfall in Minnesota has led to flooding with severe adverse impacts. This study addresses a gap in information about heavy precipitation in Minnesota and provides heavy rainfall analyses useful for climate-related planning. Stage-IV hourly precipitation data for the warm season (May–September) during 2004–20 enabled the identification of rain events and heavy rain events, as well as their characteristic frequency, rainfall accumulation, duration, and intensity. The results help establish a baseline for past and future analyses of precipitation patterns and trends. They also build a foundation for future research investigating the weather patterns that lead to heavy rainfall.
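A minimal sketch of how rain events might be extracted from an hourly series at one grid point follows. The accumulation thresholds (>2.5 mm for a rain event, >36 mm for a heavy rain event) come from the abstract. The event definition itself is an assumption made for illustration: an event is taken to be a run of consecutive wet hours ended by a dry hour, which may differ from the definition used in the study. The function name and the sample data are hypothetical.

```python
def find_rain_events(hourly_mm, min_total=2.5, heavy_total=36.0):
    """Group consecutive wet hours into events and summarize each one.

    Assumption: an event is a run of hours with precipitation > 0,
    ended by the first dry hour. Runs totalling <= min_total are dropped.
    """
    series = list(hourly_mm)
    events = []
    start = None
    # A trailing dry hour closes a run that lasts until the end of the record.
    for i, p in enumerate(series + [0.0]):
        if p > 0 and start is None:
            start = i
        elif p <= 0 and start is not None:
            total = sum(series[start:i])
            hours = i - start
            if total > min_total:
                events.append({
                    "start_hour": start,
                    "hours": hours,
                    "total_mm": total,
                    "mean_intensity_mm_per_h": total / hours,
                    "heavy": total > heavy_total,
                })
            start = None
    return events


# Hypothetical hourly accumulations (mm) at a single grid point.
hourly = [0.0, 1.0, 2.0, 0.0, 0.5, 0.0, 10.0, 20.0, 8.0, 0.0]
events = find_rain_events(hourly)
for event in events:
    print(event)
```

In this sample the 0.5-mm hour is discarded because it does not exceed 2.5 mm, the 3-mm run counts as a rain event, and the 38-mm run counts as a heavy rain event.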
Abstract
The Appalachian Mountains have a considerable impact on daily weather, including severe convection, across the eastern United States. However, the impact of the Appalachians on supercells is not well understood, posing a short-term forecast challenge across the region. While case studies have been conducted, there has been no large multicase analysis of supercells interacting with complex terrain. To address this gap, we examined 62 isolated warm-season supercells that occurred within the central or southern Appalachians. Each supercell was broadly classified as “crossing” or “noncrossing” based on its maintenance of supercellular structure during interaction with significant terrain features. Rapid Update Cycle (RUC) and Rapid Refresh (RAP) model analyses were used to identify key synoptic and mesoscale factors that distinguish between environments supportive of crossing versus noncrossing supercells. Roughly 40% of supercells were sustained while crossing significant terrain. Pre-storm synoptic features common among crossing storms (relative to noncrossing storms) included a stronger polar jet, a deeper trough, a north–south-oriented cold front, a strong prefrontal low-level jet, and no wedge front leeward of the terrain. Mesoscale environmental differences were determined using near-storm model soundings collected for each supercell at three locations: upstream initiation, peak terrain, and downstream dissipation. The most significant mesoscale differences were present in the peak and downstream environments, whereby crossing storms encountered stronger low-level vertical shear, greater storm-relative helicity, and greater midlevel moisture than noncrossing storms. Such results reinforce the notion that sustained dynamical support for mesocyclones is critical to supercell maintenance when interacting with significant terrain.
Significance Statement
The ability of isolated storms with rotating updrafts to traverse complex terrain is not well understood and is a notable forecast problem in the eastern United States due to the Appalachian Mountains. This study represents the first systematic analysis of numerous warm-season supercells in the vicinity of the central and southern Appalachians. We focus on synoptic and near-storm mesoscale environmental differences between storms that maintain supercellular structure following terrain interaction (“crossing”) and those that do not (“noncrossing”). The results provide useful environmental metrics for forecasting supercell longevity in the vicinity of the Appalachian Mountains.
Abstract
The skill of operational deterministic turbulence forecasts is impacted by the uncertainties in both weather forecasts from the underlying numerical weather prediction (NWP) models and diagnoses of turbulence from the NWP model output. This study compares various probabilistic turbulence forecasting approaches to quantify these uncertainties and provides recommendations on the most suitable approach for operational implementation. The approaches considered are all based on ensembles of NWP forecasts and/or turbulence diagnostics, and include a multi-diagnostic ensemble (MDE), a time-lagged NWP ensemble (TLE), a forecast-model NWP ensemble (FME), and combined time-lagged MDE (TMDE) and forecast-model MDE (FMDE). Both case studies and statistical analyses are provided. The case studies show that the MDE approach that represents the uncertainty in turbulence diagnostics provides a larger ensemble spread than the TLE and FME approaches that represent the uncertainty in NWP forecasts. The larger spreads of MDE, TMDE, and FMDE allow for higher probabilities of detection for low percentage thresholds at the cost of increased false alarms. The small spreads of TLE and FME result in either hits with higher confidence or missed events, highly dependent on the performance of the underlying NWP model. Statistical evaluations reveal that increasing the number of diagnostics in MDE is a cost-effective and powerful method for describing the uncertainty of turbulence forecasts, considering trade-offs between accuracy and computational cost associated with using NWP ensembles. Combining either time-lagged or forecast-model NWP ensembles with MDE can further improve prediction skill and could be considered if sufficient computational resources are available.
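The contrast between a multi-diagnostic ensemble and an NWP ensemble can be sketched by treating the forecast probability as the fraction of ensemble values that reach a turbulence threshold. This is only an illustration under stated assumptions: all diagnostics are assumed to have already been mapped to a common scale (eddy dissipation rate, EDR, is used here), and the values, threshold, and function name are invented to show the effect of spread, not taken from the study.

```python
def exceedance_probability(values, threshold):
    """Fraction of ensemble values at or above the threshold."""
    return sum(1 for v in values if v >= threshold) / len(values)


# Hypothetical EDR values at one grid point, assumed to share a common scale.
# MDE-like: five different turbulence diagnostics from one NWP forecast.
diagnostics_edr = [0.10, 0.25, 0.30, 0.18, 0.22]
# NWP-ensemble-like: one diagnostic computed from five NWP forecasts.
nwp_members_edr = [0.19, 0.20, 0.21, 0.20, 0.19]

threshold = 0.22  # hypothetical turbulence threshold
p_mde = exceedance_probability(diagnostics_edr, threshold)
p_nwp = exceedance_probability(nwp_members_edr, threshold)
print(p_mde, p_nwp)
```

With these invented numbers the widely spread diagnostic values give an intermediate probability, while the tightly clustered NWP members give an all-or-nothing answer, which mirrors the hit-with-high-confidence or missed-event behavior described for small-spread ensembles.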
Abstract
An evaluation framework for tropical cyclone rapid intensification (RI) forecasts is introduced and applied to evaluate the performance of RI forecasts by the operational Hurricane Weather Research and Forecasting (HWRF) Model. The framework is based on the performance of each 5-day forecast cycle, while the conventional RI evaluation is based on the statistics of successful or false RI forecasts at individual lead times. The framework can be used to compare RI forecasts of different cycles, which helps model developers and forecasters to characterize RI forecasts under different scenarios. It also can provide the evaluation of statistical performance in the context of 5-day forecast cycles. The RI forecast of each cycle is assessed using a modified probability-based approach that takes the absolute errors in intensity changes into account. The overall performance of RI forecasts during a given period is assessed based on the fractions of the individual forecast cycles during which RI events are successfully or falsely predicted. The framework is applied to evaluate the performance of RI forecasts by the HWRF Model for the whole life cycle of a single hurricane, as well as for each of the hurricane seasons from 2009 to 2021. The metric based on the probabilities of detection and false alarm rate of RI is compared with that based on the absolute errors in the intensity and intensity change during RI events.
Significance Statement
An evaluation framework for tropical cyclone rapid intensification (RI) forecasts is introduced, focusing on the performance of RI forecasts in each 5-day forecast cycle. The cycle-based approach can help to characterize RI forecasts under different conditions such as certain synoptic scenarios, initial conditions, or vortex structures. It also can be used to assess the overall performance of RI forecasts in terms of the percentages of individual forecast cycles that successfully or falsely predict RI events.
Abstract
As part of NOAA’s Hazardous Weather Testbed Spring Forecasting Experiment (SFE) in 2020, an international collaboration yielded a set of real-time convection-allowing model (CAM) forecasts over the contiguous United States in which the model configurations and initial/boundary conditions were varied in a controlled manner. Three model configurations were employed, among which the Finite Volume Cubed-Sphere (FV3), Unified Model (UM), and Advanced Research version of the Weather Research and Forecasting (WRF-ARW) Model dynamical cores were represented. Two runs were produced for each configuration: one driven by NOAA’s Global Forecast System for initial and boundary conditions, and the other driven by the Met Office’s operational global UM. For 32 cases during SFE2020, these runs were initialized at 0000 UTC and integrated for 36 h. Objective verification of model fields relevant to convective forecasting illuminates differences in the influence of configuration versus driving model pertinent to the ongoing problem of optimizing spread and skill in CAM ensembles. The UM and WRF configurations tend to outperform FV3 for forecasts of precipitation, thermodynamics, and simulated radar reflectivity; using a driving model with the native CAM core also tends to produce better skill in aggregate. Reflectivity and thermodynamic forecasts were found to cluster more by configuration than by driving model at lead times greater than 18 h. The two UM configuration experiments had notably similar solutions that, despite competitive aggregate skill, had large errors in the diurnal convective cycle.
Abstract
Tropical cyclone (TC) genesis forecasts during 2018–20 from two operational global ensemble prediction systems (EPSs) are evaluated over three basins in this study. The two ensembles are from the European Centre for Medium-Range Weather Forecasts (ECMWF-EPS) and the Met Office in the United Kingdom (UKMO-EPS). The three basins are the northwest Pacific, the northeast Pacific, and the North Atlantic. It is found that the ensemble members in each EPS show a good level of agreement in forecast skill, but their forecasts are complementary. Probability of detection (POD) can be doubled by taking all the member forecasts in the EPS into account. Even if an ensemble member does not make a hit forecast, it may predict the presence of cyclonic vortices. Statistically, a hit forecast has more nearby disturbance forecasts in the ensemble than a false alarm. Based on the above analysis, we grouped the nearby forecasts at each model initialization time to define ensemble genesis forecasts, and verified these forecasts to represent the performance of the ensemble system. The PODs are found to be more than twice those of the individual ensemble members at most lead times, reaching about 59% and 38% at the 5-day lead time in UKMO-EPS and ECMWF-EPS, respectively, while the success ratios are smaller than those of the ensemble members. In addition, predictability differs among basins: genesis events in the North Atlantic basin are the most difficult to forecast, with PODs at the 5-day lead time of only 46% and 23% in UKMO-EPS and ECMWF-EPS, respectively.
Significance Statement
Operational forecasting of tropical cyclone (TC) genesis relies greatly on numerical models. Compared with deterministic forecasts, ensemble prediction systems (EPSs) can provide uncertainty information for forecasters. This study examined the predictability of TC genesis in two operational EPSs. We found that the forecasts of ensemble members complement each other, and the detection ratio of observed genesis will be doubled by considering the forecasts of all members, as multiple simulations conducted by the EPS partially reflect the inherent uncertainties of the genesis process. Successful forecasts are surrounded by more cyclonic vortices in the ensemble than false alarms, so the vortex information is used to group the nearby forecasts at each model initialization to define ensemble genesis forecasts when evaluating the ensemble performance. The results demonstrate that the global ensemble models can serve as a valuable reference for TC genesis forecasting.
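The two scores discussed above have standard definitions: POD = hits / (hits + misses) and success ratio = hits / (hits + false alarms). The sketch below applies them to single members and to a simple pooling of all members. The pooling by set union is a deliberate simplification and is not the grouping of nearby forecasts used in the study; the event labels, member forecasts, and function name are hypothetical.

```python
def verify(forecast_ids, observed_ids):
    """Return (POD, success ratio) for sets of forecast and observed events.

    Assumes both sets are non-empty so that neither denominator is zero.
    """
    hits = len(forecast_ids & observed_ids)
    misses = len(observed_ids - forecast_ids)
    false_alarms = len(forecast_ids - observed_ids)
    pod = hits / (hits + misses)
    success_ratio = hits / (hits + false_alarms)
    return pod, success_ratio


# Hypothetical genesis events: A-D were observed, X-Z never occurred.
observed = {"A", "B", "C", "D"}
member_forecasts = [{"A", "B", "X"}, {"C", "Y"}, {"B", "D", "Z"}]

member_scores = [verify(f, observed) for f in member_forecasts]
# Simplified "ensemble" forecast: any event predicted by at least one member.
pooled = set().union(*member_forecasts)
pooled_pod, pooled_sr = verify(pooled, observed)
print(member_scores, pooled_pod, round(pooled_sr, 3))
```

Because the members detect different events, pooling raises POD above that of any single member, while the accumulated false alarms lower the success ratio. This is the same qualitative trade-off reported for the ensemble genesis forecasts.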
Abstract
Producing an accurate and calibrated probabilistic forecast has high social and economic value. Systematic errors or biases in the ensemble weather forecast can be corrected by postprocessing models, whose development is an urgent challenge. Traditionally, the bias correction is done by employing linear regression models that estimate the conditional probability distribution of the forecast. Although this model framework works well, it is restricted to a prespecified model form that often relies on only a limited set of predictors. Most machine learning (ML) methods can tackle these problems with a point prediction, but only a few of them can be applied effectively in a probabilistic manner. The tree-based ML techniques, namely, natural gradient boosting (NGB), quantile random forests (QRF), and distributional regression forests (DRF), are used to adjust hourly 2-m temperature ensemble predictions at lead times of 1–10 days. The ensemble model output statistics (EMOS) and its boosting version are used as benchmark models. The model forecast is based on the European Centre for Medium-Range Weather Forecasts (ECMWF) for the Czech Republic domain. Two training periods, 2015–18 and 2018 only, were used to train the models, and their prediction skill was evaluated in 2019. The results show that the QRF and NGB methods provide the best performance for 1–2-day forecasts, while the EMOS method outperforms other methods for 8–10-day forecasts. Key components to improving short-term forecasting are additional atmospheric/surface state predictors and the 4-yr training sample size.
Significance Statement
Machine learning methods have great potential and have begun to be widely applied in meteorology in recent years. A new technique called natural gradient boosting (NGB) has been released and is used in this paper to refine the probabilistic forecast of surface temperature. It was found that the NGB has better prediction skill than the traditional ensemble model output statistics when forecasting 1 and 2 days in advance. The NGB has similar prediction skill to other advanced machine learning methods, such as the quantile random forests, with lower computational demands. We showed a path to employ the NGB method in this task, which can be followed for refining other, more challenging meteorological variables such as wind speed or precipitation.
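As a rough illustration of the EMOS benchmark, the following sketch assumes the common Gaussian form, in which the predictive mean and variance are linear in the ensemble mean and ensemble variance, and scores the result with the closed-form continuous ranked probability score (CRPS) for a normal distribution. The abstract does not state the exact EMOS configuration, so the Gaussian assumption, the coefficient values, and the function names are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def crps_gaussian(mu: float, sigma: float, obs: float) -> float:
    """Closed-form CRPS of a normal predictive distribution N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def emos_predict(
    members: Sequence[float], a: float, b: float, c: float, d: float
) -> Tuple[float, float]:
    """Gaussian EMOS: mean = a + b * ens_mean, variance = c + d * ens_var."""
    ens_mean = statistics.fmean(members)
    ens_var = statistics.variance(members)  # sample variance of the members
    mu = a + b * ens_mean
    sigma = math.sqrt(c + d * ens_var)
    return mu, sigma


# Hypothetical 2-m temperature members (deg C) and illustrative coefficients.
# In practice a, b, c, d would be fitted by minimizing the mean CRPS
# over a training period.
members = [1.0, 2.0, 3.0]
mu, sigma = emos_predict(members, a=0.5, b=1.0, c=0.2, d=0.8)
score = crps_gaussian(mu, sigma, obs=2.5)
```

Lower CRPS indicates a sharper and better calibrated forecast. The same score could be used to compare the tree-based methods against this benchmark.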
Abstract
High winds are one of the key forecast challenges across southeast Wyoming. The complex mountainous terrain across the region frequently results in strong gap winds in localized areas, as well as more widespread bora and chinook winds in the winter season (October–March). The predictors and general weather patterns that result in strong winds across the region are well understood by local forecasters. However, no single predictor provides notable skill by itself in separating warning-level events from others. Random forest (RF) classifier models were developed to improve high wind prediction using a training dataset constructed from archived observations and model parameters from the North American Regional Reanalysis (NARR). Three locations were selected for initial RF model development: the city of Cheyenne, Wyoming, and two gap regions along Interstate 80 (Arlington) and Interstate 25 (Bordeaux). Verification scores over two winters suggested that the RF models were beneficial relative to current operational tools when predicting warning-criteria high wind events. Three case studies of high wind events illustrate the value of the RF models to forecast operations relative to current forecast tools. The first case explores a classic, widespread high wind scenario, which was well anticipated by local forecasters. A more marginal scenario is explored in the second case, which presented greater forecast challenges relating to the timing and intensity of the strongest winds. The final case study uses Global Forecast System (GFS) data as input into the RF models, further supporting real-time implementation in forecast operations.
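The abstract does not say which verification scores were used. As a sketch only, the following computes three scores commonly applied to yes/no warning forecasts: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI). The forecast and observation series are hypothetical toy values, and `contingency_scores` is an illustrative helper name.

```python
from typing import Dict, Sequence


def contingency_scores(
    forecasts: Sequence[int], observations: Sequence[int]
) -> Dict[str, float]:
    """POD, FAR, and CSI from paired binary forecasts and observations."""
    if len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must have equal length")
    hits = sum(1 for f, o in zip(forecasts, observations) if f == 1 and o == 1)
    false_alarms = sum(1 for f, o in zip(forecasts, observations) if f == 1 and o == 0)
    misses = sum(1 for f, o in zip(forecasts, observations) if f == 0 and o == 1)
    if hits + misses == 0 or hits + false_alarms == 0:
        raise ValueError("scores are undefined without observed and forecast events")
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "CSI": hits / (hits + misses + false_alarms),
    }


# Hypothetical toy series: 1 = warning-criteria high wind event, 0 = no event.
rf_forecasts = [1, 1, 0, 0, 1, 0]
observed = [1, 0, 0, 1, 1, 0]
scores = contingency_scores(rf_forecasts, observed)
```

Running the same function on the output of an existing operational tool would give a like-for-like comparison of the kind the abstract describes.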