Search Results

You are looking at 1–10 of 15 items for:

  • Author or Editor: Gregory R. Herman
  • All content
Gregory R. Herman and Russ S. Schumacher

Abstract

Approximately 11 years of reforecasts from NOAA’s Second-Generation Global Ensemble Forecast System Reforecast (GEFS/R) model are used to train a contiguous United States (CONUS)-wide gridded probabilistic prediction system for locally extreme precipitation. This system is developed primarily using the random forest (RF) algorithm. Locally extreme precipitation is quantified for 24-h precipitation accumulations in the framework of average recurrence intervals (ARIs), with two severity levels: 1- and 10-yr ARI exceedances. Forecasts are made from 0000 UTC forecast initializations for two 1200–1200 UTC periods: days 2 and 3, comprising, respectively, forecast hours 36–60 and 60–84. Separate models are trained for each of eight forecast regions and for each forecast lead time. GEFS/R predictors vary in space and time relative to the forecast point and include not only the quantitative precipitation forecast (QPF) output from the model, but also variables that characterize the meteorological regime, including winds, moisture, and instability. Numerous sensitivity experiments are performed to determine the effects of including or excluding different aspects of forecast information in the model predictors, of the choice of statistical algorithm, and of performing dimensionality reduction via principal component analysis as a preprocessing step. Overall, it is found that the machine learning (ML)-based forecasts add significant skill over exceedance forecasts produced from both the raw GEFS/R ensemble QPFs and the European Centre for Medium-Range Weather Forecasts’ (ECMWF) global ensemble across almost all regions of the CONUS. ML-based forecasts are found to be underconfident, while raw ensemble forecasts are highly overconfident.
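
As a rough illustration of the kind of system described above, the sketch below trains a random forest on GEFS/R-style predictors and emits exceedance probabilities. It assumes scikit-learn and NumPy; the predictor layout, labels, and RF settings are placeholders, not the paper's configuration.

```python
# Illustrative sketch only (not the authors' code): a random forest trained on
# GEFS/R-style predictors to produce 1-yr ARI exceedance probabilities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Rows: forecast point/day samples; columns: QPF plus regime predictors
# (winds, moisture, instability) varying in space and time around the point.
X_train = rng.normal(size=(5000, 40))
y_train = (rng.random(5000) < 0.03).astype(int)   # 1 if 24-h QPE exceeded the 1-yr ARI

rf = RandomForestClassifier(n_estimators=500, min_samples_leaf=20,
                            n_jobs=-1, random_state=0)
rf.fit(X_train, y_train)

X_day2 = rng.normal(size=(1000, 40))              # unseen day-2 predictor matrix
p_exceed = rf.predict_proba(X_day2)[:, 1]         # gridded exceedance probabilities
print(p_exceed[:5])
```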

Full access
Gregory R. Herman and Russ S. Schumacher

Abstract

Fifteen years of forecasts from the National Oceanic and Atmospheric Administration’s Second-Generation Global Medium-Range Ensemble Reforecast (GEFS/R) dataset were used to develop a statistical model that generates probabilistic predictions of cloud ceiling and visibility. Four major airports—Seattle–Tacoma International Airport (KSEA), San Francisco International Airport (KSFO), Denver International Airport (KDEN), and George Bush Intercontinental Airport (KIAH) in Houston, Texas—were selected for model training and analysis. Numerous statistical model configurations, including the use of several different machine learning algorithms, input predictors, and internal parameters, were explored and verified through cross validation to develop skillful forecasts at each station. The final model was then compared with both probabilistic climatology-based forecasts and deterministic operational guidance. Results indicated significantly enhanced skill within both deterministic and probabilistic frameworks from the model trained in this study relative to both operational guidance and climatology at all stations. Probabilistic forecasts also showed substantially higher skill within the framework used than any deterministic forecast. Of the atmospheric fields examined, dewpoint depression and cloud cover forecast fields from the GEFS/R model were typically found to have the highest correspondence with observed flight rule conditions (FRCs). Forecast values nearest the prediction station were often not the most important FRC predictors; where applicable, forecast values along coastlines and immediately offshore often served as superior predictors. The effect of training data length on model performance was also examined; approximately 3 yr of training data from a dynamical model were required for the statistical model to robustly capture the relationships between model variables and observed FRCs.
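
A minimal sketch of the cross-validated configuration comparison described above, assuming scikit-learn; the candidate algorithms, predictor matrix, and event definition are illustrative stand-ins rather than the study's setup.

```python
# Illustrative sketch only: cross-validated comparison of candidate model
# configurations using the Brier score as the selection criterion.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(3000, 25))             # GEFS/R predictors (e.g., dewpoint depression, cloud cover)
y = (rng.random(3000) < 0.2).astype(int)    # 1 if sub-VFR flight rule conditions were observed

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
for name, model in candidates.items():
    # "neg_brier_score" returns the negated Brier score; closer to 0 is better.
    score = cross_val_score(model, X, y, cv=5, scoring="neg_brier_score").mean()
    print(f"{name}: mean Brier score = {-score:.4f}")
```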

Full access
Russ S. Schumacher and Gregory R. Herman

Abstract

We applaud Gourley and Vergara for their thorough investigation of the relationship between precipitation and flash flood reports, as well as their inclusion of information from advanced hydrologic model output. We conducted some additional analysis to identify the reasons for the substantial differences between their findings and ours. The primary reason for the differences was found to be temporal sampling. The high temporal resolution of the MRMS dataset, as well as their use of “rolling” accumulation periods, explains most of the discrepancies. For guidance related to real-time warning decisions for flash flooding, Gourley and Vergara’s analyses provide an important new guide and we recommend the use of their results for this purpose. For other applications, including model postprocessing and for precipitation datasets with lower temporal resolution, our results will continue to prove useful.

Restricted access
Gregory R. Herman and Russ S. Schumacher

Abstract

Quantitative precipitation estimate (QPE) exceedances of numerous different heavy precipitation thresholds—including spatially varying average recurrence interval (ARI) and flash flood guidance (FFG) thresholds—are compared with one another and against reported and warned flash floods to quantify existing deficiencies with QPEs and to identify best practices for using QPE for flash flood forecasting and analysis. QPEs from three different sources—NCEP Stage IV Precipitation Analysis (ST4), Climatology Calibrated Precipitation Analysis (CCPA), and Multi-Radar Multi-Sensor (MRMS) QPE—are evaluated across the United States from January 2015 to June 2017. In addition to evaluating different QPE sources, threshold types, and magnitudes, QPE accumulation interval lengths from hourly to daily are considered. Systematic errors with QPE sources are identified, including a radar distance dependence on extreme rainfall frequency in MRMS, spurious occurrences of locally extreme precipitation in the complex terrain of the West in ST4, and insufficient QPEs for many legitimate heavy precipitation events in CCPA. Overall, flash flood warnings and reports corresponded to each other far more than any QPE exceedances. Correspondence between all sources was highest in the East and lowest in the West, with ST4, CCPA, and MRMS QPE exceedances locally yielding maximal correspondence in the East, Plains, and West, respectively. Surprisingly, using a fixed 2.5 in. (24 h)⁻¹ proxy outperformed shorter accumulation exceedances and the use of ARIs and FFGs. On regional scales, different ARI exceedances achieved superior performance to the selection of any fixed threshold; FFG exceedances were consistently too rare to achieve optimal correspondence with observed flash flooding.
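
The correspondence measures referenced above can be illustrated with standard contingency-table scores. This is an illustrative NumPy sketch; the arrays, event frequencies, and the particular scores shown (POD, FAR, CSI) are assumptions, not the study's verification code.

```python
# Illustrative sketch only: contingency-table scores measuring how well QPE
# threshold exceedances correspond to flash flood reports on matched points/periods.
import numpy as np

def contingency_scores(exceeded, reported):
    """exceeded, reported: boolean arrays aligned in space and time."""
    hits = np.sum(exceeded & reported)
    false_alarms = np.sum(exceeded & ~reported)
    misses = np.sum(~exceeded & reported)
    pod = hits / (hits + misses)                  # probability of detection
    far = false_alarms / (hits + false_alarms)    # false alarm ratio
    csi = hits / (hits + misses + false_alarms)   # critical success index
    return pod, far, csi

rng = np.random.default_rng(2)
reports = rng.random(10000) < 0.02                # flash flood reports
fixed_exceed = rng.random(10000) < 0.03           # e.g., exceedance of a fixed 24-h threshold
print(contingency_scores(fixed_exceed, reports))
```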

Full access
Gregory R. Herman and Russ S. Schumacher

Abstract

A continental United States (CONUS)-wide framework for analyzing quantitative precipitation forecasts (QPFs) from NWP models from the perspective of precipitation return period (RP) exceedances is introduced using threshold estimates derived from a combination of NOAA Atlas 14 and older sources. Forecasts between 2009 and 2015 from several different NWP models of varying configurations and spatial resolutions are analyzed to assess bias characteristics and forecast skill for predicting RP exceedances. Specifically, NOAA’s Global Ensemble Forecast System Reforecast (GEFS/R) and the National Severe Storms Laboratory WRF (NSSL-WRF) model are evaluated for 24-h precipitation accumulations. The climatology of extreme precipitation events for 6-h accumulations is also explored in three convection-allowing models: 1) NSSL-WRF, 2) the North American Mesoscale 4-km nest (NAM-NEST), and 3) the experimental High Resolution Rapid Refresh (HRRR). The GEFS/R and NSSL-WRF are both found to exhibit similar 24-h accumulation RP exceedance climatologies over the U.S. West Coast to those found in observations and are found to be approximately equally skillful at predicting these exceedance events in this region. In contrast, over the eastern two-thirds of the CONUS, GEFS/R struggles to predict the predominantly convectively driven extreme QPFs, predicting far fewer events than are observed and exhibiting inferior forecast skill to the NSSL-WRF. The NSSL-WRF and HRRR are found to produce 6-h extreme precipitation climatologies that are approximately in accord with those found in the observations, while NAM-NEST produces many more RP exceedances than are observed across all of the CONUS.
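
One way to express the bias characteristics discussed above is a frequency bias of forecast versus observed return-period exceedances. The sketch below is illustrative only; the synthetic exceedance rates are placeholders chosen to mirror the qualitative findings, not the paper's numbers.

```python
# Illustrative sketch only: frequency bias of forecast vs. observed return-period
# exceedances, computed on a common verification grid.
import numpy as np

def frequency_bias(forecast_exceed, observed_exceed):
    """Ratio of forecast to observed exceedance counts (1 = unbiased)."""
    return np.sum(forecast_exceed) / np.sum(observed_exceed)

rng = np.random.default_rng(3)
obs = rng.random(50000) < 0.010          # observed RP exceedances (placeholder rate)
gefsr = rng.random(50000) < 0.004        # convection-parameterized model: too few events
nsslwrf = rng.random(50000) < 0.011      # convection-allowing model: near unbiased
print("GEFS/R frequency bias:  ", frequency_bias(gefsr, obs))
print("NSSL-WRF frequency bias:", frequency_bias(nsslwrf, obs))
```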

Full access
Gregory R. Herman and Russ S. Schumacher

Abstract

Three different statistical algorithms are applied to forecast locally extreme precipitation across the contiguous United States (CONUS) as quantified by 1- and 10-yr average recurrence interval (ARI) exceedances for 1200–1200 UTC forecasts spanning forecast hours 36–60 and 60–84, denoted, respectively, day 2 and day 3. Predictors come from nearly 11 years of reforecasts from NOAA’s Second-Generation Global Ensemble Forecast System Reforecast (GEFS/R) model and derive from a variety of thermodynamic and kinematic variables that characterize the meteorological regime in addition to the quantitative precipitation forecast (QPF) output from the ensemble. In addition to encompassing nine different atmospheric fields, predictors also vary in space and time relative to the forecast point. Distinct models are trained for eight different hydrometeorologically cohesive regions of the CONUS. One algorithm supplies the GEFS/R predictors directly to a random forest (RF) procedure to produce extreme precipitation forecasts; the second also employs RFs, but the predictors instead undergo principal component analysis (PCA), and extracted leading components are supplied to the RF. In the last algorithm, dimension-reduced predictors are supplied to a logistic regression (LR) algorithm instead of an RF. A companion paper investigated the quality of the forecasts produced by these models and other RF-based forecast models. This study is an extension of that work and explores the internals of these trained models and what physical and statistical insights they reveal about forecasting extreme precipitation from a global, convection-parameterized model.
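
A minimal sketch of the three algorithm configurations named above (RF on raw predictors, PCA followed by RF, and PCA followed by logistic regression), written as scikit-learn pipelines; predictor dimensions, component counts, and hyperparameters are illustrative assumptions, not the trained models' settings.

```python
# Illustrative sketch only: the three statistical configurations as pipelines.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(4)
X = rng.normal(size=(4000, 120))              # GEFS/R fields varying in space/time around the point
y = (rng.random(4000) < 0.05).astype(int)     # ARI exceedance labels

models = {
    "RF":     RandomForestClassifier(n_estimators=500, random_state=0),
    "PCA+RF": make_pipeline(PCA(n_components=20),
                            RandomForestClassifier(n_estimators=500, random_state=0)),
    "PCA+LR": make_pipeline(PCA(n_components=20), LogisticRegression(max_iter=1000)),
}
for name, model in models.items():
    model.fit(X, y)                           # in practice, train and verify on separate years
    print(name, model.predict_proba(X)[:, 1].mean())
```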

Full access
Gregory R. Herman, Erik R. Nielsen, and Russ S. Schumacher

Abstract

Eight years’ worth of day 1 and 4.5 years’ worth of day 2–3 probabilistic convective outlooks from the Storm Prediction Center (SPC) are converted to probability grids spanning the continental United States (CONUS). These results are then evaluated using standard probabilistic forecast metrics including the Brier skill score and reliability diagrams. Forecasts are gridded in two different ways: one with a high-resolution grid and interpolation between probability contours and another on an 80-km-spaced grid without interpolation. Overall, the highest skill is found for severe wind forecasts and the lowest skill is observed for tornadoes; for significant severe criteria, the opposite discrepancy is observed, with highest forecast skill for significant tornadoes and approximately no overall forecast skill for significant severe winds. Highest climatology-relative skill is generally observed over the central and northern Great Plains and Midwest, with the lowest—and often negative—skill seen in the West, southern Texas, and the Atlantic Southeast. No discernible year-to-year trend in skill was identified; seasonally, forecasts verified the best in the spring and late autumn and worst in the summer and early autumn. Forecasts are also evaluated in CAPE-versus-shear parameter space; forecasts struggle most in very low shear but also in high-shear, low-CAPE environments. In aggregate, forecasts for all variables verified more skillfully using interpolated probability grids, suggesting utility in interpreting forecasts as a continuous field. Forecast reliability results depend substantially on the interpretation of the forecast fields, but day 1 and day 2–3 tornado outlooks consistently exhibit an underforecast bias.
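
The verification metrics mentioned above can be written compactly. The sketch below shows a Brier skill score against climatology and a simple reliability-diagram binning in NumPy; the synthetic probabilities and bin width are assumptions, not the study's grids.

```python
# Illustrative sketch only: Brier skill score vs. climatology and simple
# reliability-diagram binning for a gridded probabilistic outlook.
import numpy as np

def brier_skill_score(p_forecast, outcome, p_climo):
    bs = np.mean((p_forecast - outcome) ** 2)
    bs_ref = np.mean((p_climo - outcome) ** 2)
    return 1.0 - bs / bs_ref

def reliability_curve(p_forecast, outcome, bins=np.arange(0.0, 1.01, 0.05)):
    """Mean forecast probability vs. observed relative frequency per bin."""
    idx = np.digitize(p_forecast, bins) - 1
    occupied = np.unique(idx)
    mean_fcst = np.array([p_forecast[idx == k].mean() for k in occupied])
    obs_freq = np.array([outcome[idx == k].mean() for k in occupied])
    return mean_fcst, obs_freq

rng = np.random.default_rng(5)
p = rng.random(20000) * 0.3                     # outlook probabilities on the grid
o = (rng.random(20000) < p).astype(float)       # 1 where a report fell in the grid box
print("BSS vs. climatology:", brier_skill_score(p, o, p_climo=o.mean()))
print("Reliability bins:", reliability_curve(p, o))
```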

Full access
Aaron J. Hill, Gregory R. Herman, and Russ S. Schumacher

Abstract

Using nine years of historical forecasts spanning April 2003–April 2012 from NOAA’s Second Generation Global Ensemble Forecast System Reforecast (GEFS/R) ensemble, random forest (RF) models are trained to make probabilistic predictions of severe weather across the contiguous United States (CONUS) at Days 1–3, with separate models for tornado, hail, and severe wind prediction at Day 1 in an analogous fashion to the Storm Prediction Center’s (SPC’s) convective outlooks. Separate models are also trained for the western, central, and eastern CONUS. Input predictors include fields associated with severe weather prediction, including CAPE, CIN, wind shear, and numerous other variables. Predictor inputs incorporate the simulated spatiotemporal evolution of these atmospheric fields throughout the forecast period in the vicinity of the forecast point. These trained RF models are applied to unseen inputs from April 2012 to December 2016, and their forecasts are evaluated alongside the equivalent SPC outlooks. The RFs objectively make statistical deductions about the relationships between various simulated atmospheric fields and observations of different severe weather phenomena that accord with the community’s physical understandings about severe weather forecasting. Using these quantified flow-dependent relationships, the RF outlooks are found to produce calibrated probabilistic forecasts that slightly underperform SPC outlooks at Day 1, but significantly outperform their outlooks at Days 2 and 3. In all cases, a blend of the SPC and RF outlooks significantly outperforms the SPC outlooks alone, suggesting that use of RFs can improve operational severe weather forecasting throughout the Day 1–3 period.
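
A minimal sketch of blending two probabilistic outlooks on a common grid and comparing Brier scores, in NumPy; the equal weighting and the synthetic probability fields are assumptions, and the study's blend may be weighted differently.

```python
# Illustrative sketch only: equal-weight blend of two probabilistic outlooks on a
# common grid, verified with the Brier score; synthetic fields, not the study's data.
import numpy as np

rng = np.random.default_rng(6)
obs = (rng.random(50000) < 0.02).astype(float)                 # report occurrence per grid box
p_spc = np.clip(0.30 * obs + 0.10 * rng.random(50000), 0, 1)   # stand-in SPC outlook probabilities
p_rf = np.clip(0.25 * obs + 0.10 * rng.random(50000), 0, 1)    # stand-in RF outlook probabilities
p_blend = 0.5 * (p_spc + p_rf)                                 # simple equal-weight blend

for name, p in [("SPC", p_spc), ("RF", p_rf), ("Blend", p_blend)]:
    print(f"{name} Brier score: {np.mean((p - obs) ** 2):.5f}")
```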

Restricted access
Erik R. Nielsen, Gregory R. Herman, Robert C. Tournay, John M. Peters, and Russ S. Schumacher

Abstract

While both tornadoes and flash floods individually present public hazards, when the two threats are both concurrent and collocated (referred to here as TORFF events), unique concerns arise. This study aims to evaluate the climatological and meteorological characteristics associated with TORFF events over the continental United States. Two separate datasets, one based on overlapping tornado and flash flood warnings and the other based on observations, were used to arrive at estimations of the instances when a TORFF event was deemed imminent and verified to have occurred, respectively. These datasets were then used to discern the geographical and meteorological characteristics of recent TORFF events. During 2008–14, TORFF events were found to be publicly communicated via overlapping warnings an average of 400 times per year, with a maximum frequency occurring in the lower Mississippi River valley. Additionally, 68 verified TORFF events between 2008 and 2013 were identified and subsequently classified based on synoptic characteristics and radar observations. In general, synoptic conditions associated with TORFF events were found to exhibit similar characteristics of typical tornadic environments, but the TORFF environment tended to be moister and have stronger synoptic-scale forcing for ascent. These results indicate that TORFF events occur with appreciable frequency and in complex meteorological scenarios. Furthermore, despite these identified differences, TORFF scenarios are not easily distinguishable from tornadic events that fail to produce collocated flash flooding, and present difficult challenges both from the perspective of forecasting and public communication.
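
A minimal sketch of the warning-overlap criterion described above: a TORFF-style overlap is flagged when a tornado warning and a flash flood warning intersect in both space and time. It assumes the shapely package; the polygons and times are made-up examples, not actual warnings.

```python
# Illustrative sketch only (assumes the shapely package): flag a TORFF-style overlap
# when a tornado warning and a flash flood warning intersect in space and time.
# Polygons and times below are made-up examples, not actual warnings.
from datetime import datetime
from shapely.geometry import Polygon

def intervals_overlap(start1, end1, start2, end2):
    return start1 < end2 and start2 < end1

tor_poly = Polygon([(-91.0, 32.0), (-90.5, 32.0), (-90.5, 32.4), (-91.0, 32.4)])
ffw_poly = Polygon([(-90.8, 31.9), (-90.2, 31.9), (-90.2, 32.3), (-90.8, 32.3)])
tor_time = (datetime(2014, 4, 28, 20, 5), datetime(2014, 4, 28, 20, 45))
ffw_time = (datetime(2014, 4, 28, 19, 30), datetime(2014, 4, 28, 22, 30))

is_torff = tor_poly.intersects(ffw_poly) and intervals_overlap(*tor_time, *ffw_time)
print("TORFF overlap:", is_torff)
```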

Full access
Jen Henderson, Erik R. Nielsen, Gregory R. Herman, and Russ S. Schumacher

Abstract

The U.S. weather warning system is designed to help operational forecasters identify hazards and issue alerts to assist people in taking life-saving actions. Assessing risks for separate hazards, such as flash flooding, can be challenging for individuals, depending on their contexts, resources, and abilities. When two or more hazards co-occur in time and space, such as tornadoes and flash floods, which we call TORFFs, risk assessment and available actions people can take to stay safe become increasingly complex and potentially dangerous. TORFF advice can suggest contradictory action—that people get low for a tornado and seek higher ground for a flash flood. The origin of risk information about such threats is the National Weather Service (NWS) Weather Forecast Office. This article contributes to an understanding of the warning and forecast system through a naturalistic study of the NWS during a TORFF event in the southeastern United States. Drawing on literature on the Social Amplification of Risk Framework, this article argues that during TORFFs, elements of the NWS warning operations can unintentionally amplify or attenuate one threat over the other. Our results reveal three ways this amplification or attenuation might occur: 1) underlying assumptions that forecasters understandably make about the danger of different threats; 2) threat terminology and coordination with national offices that shape the communication of risks during a multihazard event; and 3) organizational arrangements of space and forecaster expertise during operations. We conclude with suggestions for rethinking sites of amplification and attenuation and additional areas of future study.

Restricted access