  • Arthur, A., G. Cox, N. Kuhnert, D. Slayter, and K. Howard, 2005: The National Basin Delineation Project. Bull. Amer. Meteor. Soc., 86, 1443–1452.

  • Ashley, S., and W. Ashley, 2008: Flood fatalities in the United States. J. Appl. Meteor. Climatol., 47, 806–818.

  • Carpenter, T., J. Sperfslage, K. Georgakakos, T. Sweeney, and D. Fread, 1999: National threshold runoff estimation utilizing GIS in support of operational flash flood warning systems. J. Hydrol., 224, 21–44.

  • Davis, R., 2007: Detecting the entire spectrum of stream flooding with the Flash Flood Monitoring and Prediction (FFMP) program. Preprints, 21st Conf. on Hydrology, San Antonio, TX, Amer. Meteor. Soc., 6B.1. [Available online at https://ams.confex.com/ams/pdfpapers/120738.pdf.]

  • Gourley, J. J., J. Erlingis, T. Smith, K. Ortega, and Y. Hong, 2010: Remote collection and analysis of witness reports on flash floods. J. Hydrol., 394, 53–62.

  • Gourley, J. J., J. Erlingis, Y. Hong, and E. Wells, 2012: Evaluation of tools used for monitoring and forecasting flash floods in the United States. Wea. Forecasting, 27, 158–173.

  • Gourley, J. J., and Coauthors, 2013: A unified flash flood database over the United States. Bull. Amer. Meteor. Soc., 94, 799–805.

  • Helble, T., 2010: Definitions and general terminology. NWS Manual 10-950, 5 pp. [Available online at http://www.nws.noaa.gov/directives/sym/pd01009050curr.pdf.]

  • Koren, V., M. Smith, D. Wang, and Z. Zhang, 2000: Use of soil property data in the derivation of conceptual rainfall-runoff model parameters. Preprints, 15th Conf. on Hydrology, Long Beach, CA, Amer. Meteor. Soc., 103–106.

  • Lin, Y., cited 2012: Q&A about the new NCEP stage II/stage IV. Mesoscale Modeling Branch, Environmental Modeling Center, National Centers for Environmental Prediction. [Available online at http://www.emc.ncep.noaa.gov/mmb/ylin/pcpanl/QandA.]

  • Mogil, H., J. Monro, and H. Groper, 1978: NWS’s flash flood warning and disaster preparedness programs. Bull. Amer. Meteor. Soc., 59, 690–699.

  • NWS, cited 2012: National Weather Service glossary. [Available online at http://w1.weather.gov/glossary/index.php.]

  • RFC Development Management Team, 2003: Flash Flood Guidance Improvement Team—Final report. River Forecast Center Development Management Team Rep. to the Operations Subcommittee of the NWS Corporate Board, 47 pp. [Available online at http://www.nws.noaa.gov/oh/rfcdev/docs/ffgitreport.pdf.]

  • Schmidt, J., A. Anderson, and J. Paul, 2007: Spatially-variable, physically-derived, flash flood guidance. Preprints, 21st Conf. on Hydrology, San Antonio, TX, Amer. Meteor. Soc., 6B.2. [Available online at https://ams.confex.com/ams/pdfpapers/120022.pdf.]

  • Smith, G., 2003: Flash flood potential: Determining the hydrologic response of FFMP basins to heavy rain by analyzing their physiographic characteristics. Rep. to the NWS Colorado Basin River Forecast Center, 11 pp. [Available online at http://www.cbrfc.noaa.gov/papers/ffp_wpap.pdf.]

  • Sweeney, T., 1992: Modernized areal flash flood guidance. NOAA Tech. Rep. NWS HYDRO 44, NOAA/NWS/Hydrologic Research Laboratory, Silver Spring, MD, 21 pp. + an appendix.

  • Sweeney, T., and T. Baumgardner, 1999: Modernized flash flood guidance. Rep. to NWS Hydrology Laboratory, 11 pp. [Available online at http://www.nws.noaa.gov/oh/hrl/ffg/modflash.htm.]

Fig. 3. CONUS flash flooding observations (Storm Data) occurring between 1 Oct 2006 and 31 Aug 2010. Point reports (generally recorded prior to 30 Sep 2007) are plotted in purple and storm-based polygon reports (generally recorded after 1 Oct 2007) are plotted in blue. RFC domain boundaries are in boldface.

Fig. 4. Locations of USGS stream gauges used in the study (orange marks). Included gauges have a contributing drainage area of <260 km², have been assigned action stage heights by the NWS, and had at least one instance of the action stage being exceeded during the study period. RFC domain boundaries are in boldface.

Fig. 5. General schematic of the FFG event selection and evaluation process using Storm Data reports. The (top left) 1-h stage IV precipitation for the state of Oklahoma and (top right) 1-h FFG product for 0600 UTC 19 Aug 2007. (bottom) The precipitation grid is divided by the FFG grid to produce the ratio grid. All contiguous ratio grid cells over 1.0 are selected as an FFG event. Then all Storm Data events recorded within 2 h before or 8 h after the valid time of the ratio grid are used to determine the performance of FFG. Storm Data reports are shown in black, with a circle around each to represent the search radius used. The single black arrows refer to events where FFG correctly forecast the flash flood. The single red arrows refer to events where FFG failed to properly forecast the flash flood.

Fig. 6. The CONUS-wide skill of flash flood guidance for a variety of exceedance ratios. Observations are from reports of flash flooding in Storm Data between 1 Oct 2006 and 31 Aug 2010.

Fig. 7. As in Fig. 6, but flash flood observations are from exceedances of action stage heights at USGS stations plotted in Fig. 4 for the basin-mean QPE-to-FFG ratio.

Fig. 8. Map of FFG skill as verified by Storm Data flash flooding reports when (a) a QPE-to-FFG ratio of 1.0 is considered and (b) when any QPE-to-FFG ratio is considered.

Fig. 9. As in Fig. 8, but when verified by USGS stream gauge measurements.

Fig. 10. The skill of (a) DFFG, (b) FFPI, (c) LFFG, and (d) GFFG for a variety of exceedance ratios. Observations are from reports of flash flooding in Storm Data between 1 Oct 2006 and 31 Aug 2010.


CONUS-Wide Evaluation of National Weather Service Flash Flood Guidance Products

  • 1 Cooperative Institute for Mesoscale Meteorological Studies, and Advanced Radar Research Center, University of Oklahoma, and NOAA/National Severe Storms Laboratory, Norman, Oklahoma
  • 2 NOAA/National Severe Storms Laboratory, Norman, Oklahoma
  • 3 Advanced Radar Research Center, University of Oklahoma, and NOAA/National Severe Storms Laboratory, Norman, Oklahoma
  • 4 Advanced Radar Research Center, and Department of Civil Engineering and Environmental Science, University of Oklahoma, Norman, Oklahoma
  • 5 NOAA/National Weather Service/Office of Climate, Water, and Weather Services, Silver Spring, Maryland

Abstract

This study quantifies the skill of the National Weather Service’s (NWS) flash flood guidance (FFG) product. FFG is generated by River Forecast Centers (RFCs) across the United States; local NWS Weather Forecast Offices compare estimated and forecast rainfall with FFG to monitor and assess flash flooding potential. A national flash flood observation database consisting of reports in the NWS publication Storm Data and U.S. Geological Survey (USGS) stream gauge measurements is used to determine the skill of FFG over a 4-yr period. FFG skill is calculated at several different precipitation-to-FFG ratios for both observation datasets. Although a ratio of 1.0 nominally indicates a potential flash flooding event, this study finds that FFG can be more skillful when ratios other than 1.0 are considered. When the entire continental United States is considered, the highest observed critical success index (CSI) with 1-h FFG is 0.20 for the USGS dataset, which should be considered a benchmark for future research that seeks to improve, modify, or replace the current FFG system. Regional benchmarks of FFG skill are also determined on an RFC-by-RFC basis. When evaluated against Storm Data reports, the regional skill of FFG ranges from 0.00 to 0.19. When evaluated against USGS stream gauge measurements, the regional skill of FFG ranges from 0.00 to 0.44.

Corresponding author address: Jonathan Gourley, National Weather Center, 120 David L. Boren Blvd., Norman, OK 73072-7303. E-mail: jj.gourley@noaa.gov


1. Introduction

Flash floods are the second most deadly weather-related hazard in the United States behind extreme heat (Ashley and Ashley 2008). The National Weather Service Glossary (NWS 2012) defines a flash flood as follows:

A flash flood is a rapid and extreme flow of high water into a normally dry area, or a rapid water level rise in a stream or creek above a predetermined flood level, beginning within 6 h of the causative event (e.g., intense rainfall, dam failure, and ice jam).

This 6-h threshold is used within the NWS to divide hydrologic forecasting and monitoring responsibility between regional River Forecast Centers (RFCs), which handle fluvial floods that take place over longer time scales, and local Weather Forecast Offices (WFOs), which handle flash floods that happen on shorter time scales (Gourley et al. 2012).

Despite recent advances made in hydrologic modeling, quantitative precipitation estimation, and numerical weather prediction, some components of the system the NWS uses to forecast and monitor dangerous flash flood events are 40 yr old (RFC Development Management Team 2003). This system, which includes flash flood guidance (FFG), was originally implemented after a deadly 1969 flash flood in Ohio (Schmidt et al. 2007). FFG was significantly modified in 1992 (Sweeney and Baumgardner 1999) with additional changes undertaken in the last 10 yr (Schmidt et al. 2007; Smith 2003). Henceforth, the pre-1992 product will be referred to as original FFG. The product used between 1992 and the early 2000s (and still used at some RFCs as of 2012) will be referred to as lumped FFG (LFFG). Finally, additional FFG products developed at the Colorado Basin River Forecast Center (CBRFC) in 2003, at the Arkansas-Red Basin River Forecast Center (ABRFC) in 2005, and at the Middle Atlantic River Forecast Center (MARFC) will be called flash flood potential index (FFPI), gridded flash flood guidance (GFFG), and distributed flash flood guidance (DFFG), respectively. Flash flood guidance is a broader term that encompasses all of the above methods and refers to the general product issued by the RFCs to help WFOs monitor and forecast flash flooding events.

FFG is defined as the amount of rain required over a given time and area to produce bank-full conditions on small streams; these conditions are considered to be associated with flash flooding. FFG is produced at 12 RFCs located throughout the continental United States (see Fig. 1). Regardless of the exact flavor of FFG being produced, this RFC-derived FFG is then delivered to the NWS’s network of WFOs. WFOs overlay their most accurate and timely precipitation estimates onto FFG values, and areas where precipitation exceeds FFG are potential candidates for flash flood warnings or other actions on the part of WFO forecasters (Gourley et al. 2012).
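The comparison of precipitation against FFG described above can be sketched in a few lines. This is only an illustrative sketch: the function names, the nested-list grid layout, and the sample values are assumptions made for the example, not part of any NWS software. It computes a precipitation-to-FFG ratio for each grid cell and flags cells where the ratio is over 1.0.

```python
def exceedance_ratio(qpe_mm, ffg_mm):
    """Return the precipitation-to-FFG ratio for each grid cell.

    Both inputs are nested lists (rows of cells) in the same depth units.
    Cells with missing (None) or non-positive FFG get a ratio of None.
    """
    ratio = []
    for qpe_row, ffg_row in zip(qpe_mm, ffg_mm):
        row = []
        for qpe, ffg in zip(qpe_row, ffg_row):
            if qpe is None or ffg is None or ffg <= 0:
                row.append(None)
            else:
                row.append(qpe / ffg)
        ratio.append(row)
    return ratio


def exceeds(ratio_grid, threshold=1.0):
    """Flag cells whose ratio is strictly above the threshold."""
    return [[r is not None and r > threshold for r in row] for row in ratio_grid]


# Hypothetical 2 x 2 grids of 1-h rainfall and 1-h FFG (mm).
qpe = [[10.0, 30.0], [45.0, 5.0]]
ffg = [[25.0, 25.0], [40.0, None]]
ratio = exceedance_ratio(qpe, ffg)
flags = exceeds(ratio)
```

Passing a different `threshold` reproduces the idea of evaluating ratios other than 1.0.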

Fig. 1. Map of the 12 CONUS RFCs with the domains for each area outlined in boldface.

Citation: Weather and Forecasting 29, 2; 10.1175/WAF-D-12-00124.1

Although the NWS maintains verification statistics regarding flash flood warnings, there is presently no system to provide feedback to WFOs or RFCs about the effectiveness of the FFG product that is typically used to issue these flash flood warnings (RFC Development Management Team 2003). Therefore, this study will, for the first time, provide these verification statistics for FFG over the entire conterminous United States (CONUS) from 1 October 2006 to 31 August 2010. The general methodology is similar to that outlined in Gourley et al. (2012); they produced similar statistics over the ABRFC area of responsibility for the 2006–08 period. The specific study objectives are to determine the benchmark skill of FFG across the entire CONUS, to determine the benchmark skill of FFG at each CONUS RFC, to offer specific recommendations to the NWS regarding improvements to the use of FFG at RFCs and WFOs, and to provide the research community with information about the current state of U.S. flash flood forecasting and monitoring. The next section provides historical context for FFG leading up to the current status in the NWS. We provide the technical details of each of the operational FFG-generation methods in section 3 and the analysis methodology in section 4. Section 5 presents the results of this study, and concluding remarks are supplied in section 6.

2. The history and current status of flash flood guidance

From the 1940s to the 1970s, the average annual number of deaths due to flash flooding tripled, while the monetary damages due to flash flooding increased more than sixfold (Mogil et al. 1978). The NWS flash flood warning program was not deployed nationally until 1971, but severe thunderstorms and tornadoes had national warning programs for years or decades before that. In the early 1970s, RFCs already produced an early FFG product—original FFG—based “on drainage basin configuration and past rainfall” (Mogil et al. 1978). Through the 1970s, methods of estimating rainfall from convective activity varied from office to office within the NWS while additional local programs, including flash flood alarm systems, were explored. However, the “critical element” in any of these local programs remained the RFC-generated FFG product (Mogil et al. 1978).

During the 1970s and 1980s, the NWS developed the NWS River Forecast System (NWSRFS; RFC Development Management Team 2003). This system was initially only used to produce forecasts for larger-scale fluvial floods, but provided national consistency between the RFCs for those particular products. By the 1980s, the use of NWSRFS to produce FFG was being explored, as well, due to the local and regional differences between the FFG being produced at each RFC. This work eventually resulted in “modernized FFG” or lumped FFG (Sweeney 1992).

There were two main impetuses behind the development of LFFG: the deployment of the Advanced Weather Interactive Processing System (AWIPS) and the more-accurate and higher-resolution precipitation estimates available from the Weather Surveillance Radar-1988 Doppler (WSR-88D) network (Sweeney and Baumgardner 1999). An additional benefit was the increased consistency in the FFG generation method at each RFC (Sweeney 1992). This modernization project also put FFG generation into the same framework as the RFC river stage forecast system that had been developed during the 1980s, and now national standards were available to guide RFCs in the process of generating FFG products.

In 2003, the RFC Development Management Team issued a report regarding the state of FFG at that time as well as several recommendations regarding the future direction of the program. The most significant advance described in the report is the delineation of small, truly flash flood–scale basins. The National Basin Delineation Project (NBDP) used geographic information system (GIS) technology to produce flash flood–scale basin datasets for each NWS WFO (Arthur et al. 2005). These basins then are used as part of the Flash Flood Monitoring and Prediction (FFMP) system, which was deployed as part of the AWIPS software package. The scale of these small flash flood basins is much more similar to the scale of precipitation estimates from the WSR-88Ds. The average basin area traced out by the NBDP is around 10 km² (RFC Development Management Team 2003) and the minimum size is 5 km² (Davis 2007). However, this delineation does not eliminate the resolution gap between WSR-88D precipitation estimates and the lumped FFG basins used at the RFCs (300–5000 km²). In other words, the FFG was still representative of processes on the large basin scale, not the newly computed small basins. The RFC Development Management Team (2003) recognized this limitation, but due to computational requirements and scientific limitations, this particular issue was still partially unresolved at the end of the study period.

The RFC Development Management Team (2003) primarily focused on suggesting small changes to the FFG system rather than major modifications or a complete overhaul. Several of the issues described in that report are still observed in the FFG mosaics used in this study (1 October 2006–31 August 2010). Some Hydrologic Rainfall Analysis Projection (HRAP) grid cells always have missing FFG values; this problem occurs within RFC domains and on the boundaries between domains (see Fig. 2). Other grid cells on RFC boundaries have multiple overlapping (and different) FFG values. Additionally, FFG values can exhibit sharp gradients along RFC boundaries, in many cases for no hydrologic reason. These problems are due to software and hardware limitations, hydrologic model parameter differences between RFCs, or even model differences between RFCs. Most seriously, no national verification program for FFG has ever been developed (RFC Development Management Team 2003).

Fig. 2. Percentage of the study period in which timely FFG data were available.


During and immediately after the RFC Development Management Team recommendations, some RFCs began to modify or replace the lumped FFG product. However, lumped FFG was still being produced at multiple RFCs at the end of the study period. Starting in 2003, the Colorado Basin RFC began testing a replacement for LFFG based on an FFPI method. In 2005, the Arkansas-Red Basin RFC deployed a method known as gridded flash flood guidance. Over the next few years, both the FFPI and GFFG methods were implemented at additional RFCs. A fourth method is used at the Middle Atlantic RFC and is referred to as distributed flash flood guidance in this paper. Technical details about all four FFG generation methods are provided in section 3.

The NWS flash flood warning program underwent only minor alterations during its first 20 yr of existence. With the advent of the modernized FFG program at the beginning of the 1990s, RFC methodologies were mostly standardized across the United States. However, several problems continued to be noted by forecasters and others throughout the next 10 yr. In the last decade, hydrologists and meteorologists at some RFCs have developed their own FFG products. This patchwork of different methods of FFG generation (LFFG, FFPI, GFFG, and DFFG) was the state of flash flood operations in the NWS as of 2010.

3. Technical details of flash flood guidance

a. Lumped flash flood guidance

Sweeney and Baumgardner (1999) describe the methodology used to generate LFFG values. Rainfall-runoff models are normally used to determine the amount of runoff generated by a given amount of rainfall and a particular soil moisture condition. In LFFG, a rainfall-runoff model is run in reverse: the FFG value transmitted to a WFO is the amount of rainfall required to cause bank-full (i.e., flooding) conditions on small streams at the basin outlet. Thus, in this process, it is necessary to know the state of two variables: soil moisture and threshold runoff (ThreshR). Soil moisture data for LFFG come from the same information used by RFCs to produce river stage forecasts on large basins. The function ThreshR is used to represent basin geography; ThreshR values are most easily determined at gauged basin outlets, but some RFCs have undertaken field campaigns to determine ThreshR at ungauged locations, as well. Because RFCs must produce FFG over large areas, ThreshR values are usually contoured between gauged points to produce areal averages (Gourley et al. 2012). In some RFCs, a single ThreshR value is assigned to entire states while in others, each county has its own ThreshR value assigned. Other offices delineated basins on the order of 1000 km², calculated ThreshR at the headwaters of these basins, and then averaged these on a county-by-county basis. Thus, ThreshR values are not always representative of small basin hydrology. Additional information on the available threshold runoff calculation methods can be found in Carpenter et al. (1999).

LFFG works with any sort of soil moisture information and does not require a specific rainfall-runoff model, though most RFCs utilize the Sacramento Soil Moisture Accounting (SAC-SMA) model. Although the exact issuance schedule of LFFG varies, if precipitation data are available on schedule at the RFCs, the soil moisture information used for RFC river stage forecasts is updated every 6 h and thus LFFG can be updated every 6 h, as well (Sweeney and Baumgardner 1999). A major limitation of LFFG is that model parameters are constant for the basins of 300–5000 km² over which the various rainfall-runoff models run (RFC Development Management Team 2003). However, flash flooding events are often observed on much smaller basins, and the lumped-parameter method does not allow for variability in soil moisture conditions within each lumped basin. Additionally, the rainfall-runoff models used in the generation process are calibrated with 6-h time steps, whereas flash flood events take place on time scales of less than 6 h, so these models may not be well suited for flash flood forecasting (RFC Development Management Team 2003).

b. Flash flood potential index

The Colorado Basin RFC covers an area where flash flooding is not necessarily associated with bank-full conditions on small streams (Smith 2003). Additionally, in this area, soil moisture is believed to be a less important component of determining when and where a flash flood might occur (RFC Development Management Team 2003). Therefore, in 2003 and 2004, CBRFC undertook a project to develop FFPI, designed as a replacement for LFFG. FFPI uses gridded physiographic information [soil characteristics, vegetation cover (including forest density), slope, land use and urbanization, and seasonal effects like wildfire] to determine the relative likelihood of flash flooding in a given FFMP basin (Smith 2003). The FFPI method developed at CBRFC was eventually deployed at the California Nevada RFC (CNRFC) in 2008; the Northwest RFC (NWRFC) also uses a similar method.

Each piece of gridded physiographic information referenced above was resampled to a consistent resolution and then each HRAP grid cell was assigned a flash flood potential index on a scale from 1 (least hydrologically sensitive to rainfall) to 10 (most hydrologically sensitive to rainfall) for each layer of data. An average of these potentials yields the final FFPI product. Initially, all layers were given equal weight except for the slope parameter, which was weighted above the other data layers.
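The layer-averaging step can be illustrated with a short sketch. The layer names, the sample index values, and the slope weight of 2.0 are hypothetical; the text states only that slope was weighted above the other layers, not by how much.

```python
def ffpi(layer_indices, weights=None):
    """Weighted mean of per-layer flash flood potential indices for one cell.

    layer_indices: dict mapping a layer name to its index on the 1-10 scale.
    weights: optional dict of layer weights; layers not listed get 1.0.
    """
    weights = weights or {}
    total = 0.0
    weight_sum = 0.0
    for name, index in layer_indices.items():
        if not 1 <= index <= 10:
            raise ValueError(f"index for {name!r} must lie in 1..10")
        w = weights.get(name, 1.0)
        total += w * index
        weight_sum += w
    return total / weight_sum


# Hypothetical indices for a single HRAP grid cell.
cell = {"slope": 8, "soil": 4, "land_use": 6, "forest_density": 2}
equal_weight = ffpi(cell)                   # plain average of the layers
slope_heavy = ffpi(cell, {"slope": 2.0})    # assumed slope weight of 2.0
```

With these sample values, doubling the slope weight raises the index from 5.0 to 5.6, which shows how a steep cell moves toward the hydrologically sensitive end of the scale.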

The slope data are derived from a USGS digital elevation model dataset and sampled at a resolution of 400 m; thus, the smallest basins that can be defined from this process will have a drainage area of roughly 60 km². Although this represents a significant resolution improvement over the larger lumped FFG basins, it is still coarser than the FFMP basins or an individual HRAP grid cell. The second piece of data used in FFPI generation is soil type. A total of 16 possible soil types are defined by the State Soil Geographic (STATSGO) dataset from the Natural Resources Conservation Service. Land-use information comes from the Landsat satellite program and forest density was obtained from satellite imagery.

It is possible for FFPI values to change over seasonal time scales. Some of these changes are due to WFO requests to alter the FFPI of specific basins. Other changes are due to wildfires; after such events, an FFPI can be modified to reflect changes in soil permeability and forest cover. Still other variations in the FFPI grid are due to seasonal changes in vegetation cover and snow cover (RFC Development Management Team 2003).

The original FFPI product was interpolated onto FFMP basins over the CBRFC area of responsibility and was used at individual WFOs as a supplement to LFFG, not as a replacement for LFFG. Positive forecaster feedback resulted in CBRFC eventually replacing LFFG with FFPI. In operations at the Colorado Basin RFC, FFPI basin susceptibilities are used to adjust a 1 in. h⁻¹ (25.4 mm h⁻¹) rainfall rate; this modified rule of thumb is then used as FFG by the applicable WFOs. In the NWRFC and CNRFC regions, FFPI values are similarly used for assigning initial basin susceptibilities and are then scaled to produce the final flash flood guidance values for transmission to WFOs.

c. Gridded flash flood guidance

In 2005 and 2006, the ABRFC deployed a new method of producing FFG known as gridded FFG (Schmidt et al. 2007). This method imitates LFFG but increases the FFG’s spatial resolution to that of the HRAP grid (the cells are nominally 4 km on a side, though the actual dimensions vary with latitude). This resolution is much closer to the resolution of the FFMP basins and thus mitigates, but does not eliminate, the issue of scale mismatch between LFFG basins and FFMP basins noted earlier in this paper. Gourley et al. (2012) found that the distribution of GFFG values over the ABRFC domain from 2006 to 2008 was roughly comparable with the distribution of LFFG values over the same area and time period, which was an anticipated result, as discussed in Schmidt et al. (2007). In 2007 and 2008, the GFFG method was extended to other RFCs, including the Lower Mississippi RFC (LMRFC), the Southeast RFC (SERFC), and the West Gulf RFC (WGRFC). By the end of 2008, GFFG was in use across the entire southeastern and south-central United States.

GFFG uses a distributed hydrologic model to monitor the soil moisture component of FFG, unlike the older lumped model used in LFFG (Schmidt et al. 2007). Like LFFG, the GFFG method requires a soil moisture model, a rainfall-runoff model, and the determination of ThreshR values. Land-use and soil-type datasets are combined to yield a Natural Resources Conservation Service (NRCS) curve number (CN). Higher curve numbers are associated with larger runoff generation potential and therefore with a greater flash flood potential. These curve numbers are then adjusted to account for recent soil moisture conditions (Schmidt et al. 2007). In GFFG, this is accomplished by calculating a saturation ratio for each grid cell. The NWS Hydrology Laboratory Research Distributed Hydrologic Model (HL-RDHM) is run in continuous mode and the upper-zone tension and free-water content (UZTWC and UZFWC) parameters are obtained (Gourley et al. 2012). Then, the maximum possible values for each of these parameters are estimated using a method outlined in Koren et al. (2000). The two model parameters are added and the ratio of these parameters to their maximum possible values becomes the saturation ratio at each grid cell (Schmidt et al. 2007). This saturation ratio is used to adjust the NRCS CN to a final value.
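A minimal sketch of the saturation ratio calculation follows. It assumes one reading of the description above: the summed upper-zone contents are divided by the summed maximum values. The argument names `uztwm` and `uzfwm` for the maxima and the sample numbers are assumptions for illustration; the step that adjusts the curve number with this ratio is not specified here and is therefore not shown.

```python
def saturation_ratio(uztwc, uzfwc, uztwm, uzfwm):
    """Upper-zone saturation ratio for one grid cell.

    uztwc, uzfwc: current upper-zone tension and free-water contents.
    uztwm, uzfwm: assumed names for the corresponding maximum values.
    All four arguments must share the same depth units.
    """
    capacity = uztwm + uzfwm
    if capacity <= 0:
        raise ValueError("combined upper-zone capacity must be positive")
    ratio = (uztwc + uzfwc) / capacity
    # Clamp to [0, 1] in case model states slightly overshoot the estimated maxima.
    return min(max(ratio, 0.0), 1.0)


# Hypothetical cell: 30 mm + 10 mm stored out of 60 mm + 20 mm possible.
half_full = saturation_ratio(30.0, 10.0, 60.0, 20.0)
```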

The remaining variable in the GFFG system is ThreshR. In this method, a 3-h design rainfall event that corresponds to a 5-yr return period is used to produce the precipitation for input to the curve number model. The runoff produced by this model is then treated as the flow at flood stage (Schmidt et al. 2007). The unit hydrograph peak is found using the NRCS curve number method (Gourley et al. 2012), and requires basin slope, rainfall duration, soil moisture conditions, basin area, and other characteristics (Schmidt et al. 2007). Then, as in the LFFG method, ThreshR is the ratio of flow at flood stage to the unit hydrograph peak. Schmidt et al. (2007) note that the GFFG ThreshR is lower at high elevations and higher at low elevations and contains greater spatial variability than the legacy ThreshR values.

The final calculation of GFFG values is accomplished using the adjusted curve number S and the ThreshR values at each grid cell. The following equation,
$$Q = \frac{(P - 0.2S)^{2}}{P + 0.8S}, \tag{1}$$
is solved for P, the precipitation, where Q is the ThreshR value and S is the soil-moisture-adjusted curve number. The resulting value of P is the final gridded flash flood guidance value (Schmidt et al. 2007).
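Solving for P can be sketched as follows, assuming the relation is the standard NRCS curve number runoff equation, Q = (P − 0.2S)² / (P + 0.8S), with P, Q, and S in the same depth units. Rearranging gives a quadratic in P whose physically meaningful root (P ≥ 0.2S) is P = 0.2S + [Q + (Q² + 4QS)^(1/2)] / 2. The function names and sample values below are illustrative only.

```python
import math


def runoff_from_rainfall(p, s):
    """Forward relation: runoff Q produced by rainfall P for retention term S."""
    if p <= 0.2 * s:
        return 0.0  # rainfall does not exceed the initial abstraction
    return (p - 0.2 * s) ** 2 / (p + 0.8 * s)


def rainfall_for_runoff(q, s):
    """Inverse relation: rainfall P needed to produce runoff Q (the ThreshR value)."""
    if q < 0 or s < 0:
        raise ValueError("q and s must be non-negative")
    return 0.2 * s + 0.5 * (q + math.sqrt(q * q + 4.0 * q * s))


# Hypothetical cell with Q = 1.0 and S = 1.0 (same depth units).
p_needed = rainfall_for_runoff(1.0, 1.0)
q_check = runoff_from_rainfall(p_needed, 1.0)  # round trip back to Q
```

The round trip through `runoff_from_rainfall` is a quick check that the chosen root satisfies the original relation.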

d. Distributed flash flood guidance

A fourth type of FFG is generated at the Middle Atlantic RFC. It uses the continuous antecedent precipitation index (API) model on the HRAP grid for the soil moisture component of FFG. The net result is a spatially distributed FFG product with similar spatial variability to the GFFG method derived at ABRFC.

4. Methodology

a. Study domain

The spatial domain of the study covers the entire CONUS (see Fig. 1) while the temporal range of the study is 1 October 2006–31 August 2010. The start time of the study was fixed due to the availability of flash flooding reports from the online NWS Performance Management system; only reports of flash flooding starting on or after 1 October 2006 are available through that website. The end date of the study was set by the end of the date range of the USGS streamflow measurements that make up the flash flood event database described in Gourley et al. (2013).

b. Datasets

Four major datasets were obtained to complete this study: quantitative precipitation estimates (QPEs), the FFG products, and two observation datasets, one derived from flash flood reports in the NWS publication Storm Data and another derived from USGS stream gauge measurements.

The QPE data used in this project are the hourly stage IV products generated operationally at all CONUS RFCs. Hourly stage IV accumulations were available for all of the CONUS except for the state of Washington, the northern third of Oregon, and the Idaho panhandle. Stage IV precipitation estimates include information from the WSR-88Ds and rain gauges and benefit from manual quality control procedures undertaken by RFC personnel (Gourley et al. 2012). Algorithms used to generate stage IV products are not consistent across the country. At most RFCs, the multisensor precipitation estimator (MPE) is used operationally. However, the three western RFCs use the Mountain Mapper methodology and the ABRFC uses the P1 methodology (Lin 2012). Differences in the algorithms used to produce stage IV precipitation estimates across the country likely contribute to some of the skill differences observed between FFPI and the other three FFG methods. It is also important to note that, in operations, the WFOs use radar-derived precipitation estimates and not stage IV to monitor basins for FFG exceedance, so this study cannot exactly replicate the operational conditions under which FFG is normally used. However, stage IV is a manually quality-controlled product and is the most accurate national precipitation estimate archived by the National Centers for Environmental Prediction (NCEP) (Lin 2012). The NCEP archive contained stage IV mosaics for more than 99.7% of the hours in the study period.

Mosaics of operational FFG were obtained from the National Precipitation Verification Unit (NPVU). The FFG mosaics cover the time period from 1 October 2006 to 31 August 2010. FFG mosaics are produced at NPVU every 6 h (at 0000, 0600, 1200, and 1800 UTC), but the regular issuance schedule varies between RFCs (see Table 1). A method was developed to fill in the gaps in the national FFG mosaic for times and RFC domains where FFG was not issued or was not available. In this procedure, the most recent valid FFG issuance for a given RFC was copied forward to the next mosaic time if no new FFG issuance was available. For instance, in cases when ABRFC did not issue FFG at 0600 UTC, their FFG from 0000 UTC was copied forward in time and used to populate the 0600 UTC FFG mosaic. In some situations, an RFC may not have issued any FFG product in a 24-h period. In that case, those areas were left blank in the FFG mosaics for that day. NPVU archives contained national FFG mosaics for more than 97% of the total time in the study period (see Fig. 2). The dark areas in the western United States are due to situations where no FFG products were issued on certain days.
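
The copy-forward procedure described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function name `fill_ffg_gaps`, the assumption that issuance times coincide with mosaic times, and the reading of the 24-h rule as "leave the slot blank once the last issuance is 24 h old or older" are all assumptions made here.

```python
from datetime import datetime, timedelta

def fill_ffg_gaps(issuances, mosaic_times, max_age=timedelta(hours=24)):
    """Return {mosaic_time: grid or None} for a single RFC domain.

    issuances maps an issuance datetime to an FFG grid (any object).
    A mosaic time with no new issuance reuses the most recent earlier
    issuance; once that issuance is max_age old or older (an assumed
    reading of the 24-h rule), the slot is left blank (None).
    """
    filled = {}
    last_time, last_grid = None, None
    for t in sorted(mosaic_times):
        if t in issuances:
            last_time, last_grid = t, issuances[t]
        if last_time is not None and t - last_time < max_age:
            filled[t] = last_grid
        else:
            filled[t] = None
    return filled

# Hypothetical example: one issuance at 0000 UTC, then nothing for 30 h.
t0 = datetime(2007, 8, 19, 0)
times = [t0 + timedelta(hours=6 * k) for k in range(6)]
result = fill_ffg_gaps({times[0]: "FFG_0000"}, times)
```

In this example the 0600, 1200, and 1800 UTC slots reuse the 0000 UTC grid, and the slots 24 h and 30 h after the issuance are left blank.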

Table 1. FFG issuance schedule for each RFC.

Additionally, some RFCs changed their methods for generating FFG during the study period (see Table 2). These dates and times were established via visual examination of the FFG grids in GIS software and will be used to divide the nation spatially and temporally into LFFG, DFFG, GFFG, and FFPI areas in the results section of this paper.

Table 2. Type of FFG produced by each RFC; changes in generation method are noted if applicable.

All flash flooding reports recorded by the National Weather Service in Storm Data between 1 October 2006 and 31 December 2011 were downloaded from the online NWS Performance Management system and processed for use in this study. This resulted in 19 419 reports from across the entire United States. After filtering the database to include only those flash floods caused by heavy rains, occurring in the CONUS, and during the study period, 14 827 Storm Data events remained (see Fig. 3). These reports are recorded by NWS forecasters and are used by the NWS to produce verification statistics regarding flash flood warnings. The reports contain information about the timing (start and end times), location (WFO, state, county, NWS region, time zone, latitude, longitude, and distance to nearest place name), meteorological conditions, injuries, fatalities, and monetary damages. The sourcing of these reports varies, but most are from emergency management officials, law enforcement, trained spotters, off-duty NWS employees, broadcast media, and the public. The NWS observations fall into two categories: point based and storm based. Because storm-based observations are not of a standard size, all NWS observations (both point based and storm based) were normalized to a circular area of 49 HRAP grid cells for this study. Gourley et al. (2013) contains additional information about the character and development of this NWS report database.

Fig. 3. CONUS flash flooding observations (Storm Data) occurring between 1 Oct 2006 and 31 Aug 2010. Point reports (generally recorded prior to 30 Sep 2007) are plotted in purple and storm-based polygon reports (generally recorded after 1 Oct 2007) are plotted in blue. RFC domain boundaries are in boldface.

Citation: Weather and Forecasting 29, 2; 10.1175/WAF-D-12-00124.1

Gourley et al. (2013) obtained an archive of streamflow data from July 1927 to September 2010 for 10 106 gauges operated by the USGS. We consider only small basins on the flash flood scale (contributing drainage area of less than 260 km²), and those to which the NWS has assigned an “action stage.” An action stage is defined by Helble (2010) as “the stage which when reached by a rising stream, lake, or reservoir represents the level where the NWS or a partner/user needs to take some type of mitigation action in preparation for possible significant hydrologic activity.” Action stage is used as the stage height of interest in this study because it allows for the consideration of more events than could have been included if “minor flood” stage had been used instead. Although action stage is defined differently than bank-full conditions, the two are extremely close at many gauged sites. On average, for the gauges selected for use in this study for which both bank-full and action stage information is available, the action stage height is 0.05 m less than the bank-full height. Only those sites with at least one action stage exceedance during the study period are included. This resulted in a total of 244 gauges being used in the analysis. Each action stage exceedance is treated as an observed flash flood; 2244 of these events make up the final dataset used in the USGS analysis portion of this study. These USGS events occur in all 12 of the CONUS RFC domains and the included stream gauge sites are located at different elevations and in different hydroclimatic regimes (see Fig. 4). Each gauge is associated with an average of nine flooding events during the roughly 4-yr study period. The USGS data do not contain floods associated with ungauged basins and overland flow.
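
The gauge selection criteria above can be illustrated with a short sketch. The dictionary layout, the field names (`area_km2`, `action_stage_m`, `stage_m`), and the gauge values are hypothetical, and treating an exceedance as a stage strictly above the action stage is an assumption; the actual processing is described in Gourley et al. (2013).

```python
def exceedance_events(stages, action_stage):
    """Group consecutive samples above action_stage into (first, last) index pairs."""
    events, start = [], None
    for i, stage in enumerate(stages):
        if stage > action_stage and start is None:
            start = i                       # a new exceedance begins
        elif stage <= action_stage and start is not None:
            events.append((start, i - 1))   # the exceedance ended one sample ago
            start = None
    if start is not None:                   # series ended while still above action stage
        events.append((start, len(stages) - 1))
    return events

def select_gauges(gauges, max_area_km2=260.0):
    """Keep small basins that have an action stage and at least one exceedance."""
    selected = {}
    for name, g in gauges.items():
        if g["area_km2"] >= max_area_km2 or g["action_stage_m"] is None:
            continue
        events = exceedance_events(g["stage_m"], g["action_stage_m"])
        if events:
            selected[name] = events
    return selected

# Hypothetical gauges: only "A" passes all three criteria.
gauges = {
    "A": {"area_km2": 120.0, "action_stage_m": 2.0, "stage_m": [1.0, 2.5, 2.6, 1.5, 2.1]},
    "B": {"area_km2": 900.0, "action_stage_m": 2.0, "stage_m": [3.0, 3.0]},
    "C": {"area_km2": 50.0, "action_stage_m": None, "stage_m": [5.0]},
    "D": {"area_km2": 80.0, "action_stage_m": 3.0, "stage_m": [1.0, 1.2]},
}
selected = select_gauges(gauges)
```

Gauge "A" contributes two events here; "B" is too large, "C" has no action stage, and "D" never exceeds its action stage.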

Fig. 4. Locations of USGS stream gauges used in the study (orange marks). Included gauges have a contributing drainage area of <260 km², have been assigned action stage heights by the NWS, and had at least one instance of the action stage being exceeded during the study period. RFC domain boundaries are in boldface.

c. Flash flood guidance evaluation procedure

In normal operations, forecasters update FFG values several times a day (see Table 1). FFG values are predicated upon changes in soil moisture occurring as a result of previous rainfall. Consider an FFG mosaic nominally valid at 0000 UTC. If heavy rainfall and/or flooding are ongoing at the time this mosaic is issued, a time series of FFG values will exhibit a sharp dip. This is because the new FFG mosaic (0000 UTC) takes into account 6 h (or possibly more, if the RFC in question updates FFG less often than every 6 h) of antecedent rainfall not included in the previously issued FFG mosaic. The heavy rainfall occurring between 1800 and 0000 UTC has the effect of reducing the FFG, because correspondingly less rainfall would be needed after 0000 UTC to cause flash flooding. Problems arising from these sharp differences in FFG values from issuance to issuance are mitigated in operations by simply resetting the QPE to which the FFG is compared every time a new FFG grid is issued. However, in this evaluation, QPE cannot be “reset” with each new FFG grid because the 3- and 6-h FFG results are compared to a rolling sum of hourly stage IV QPE grids. Therefore, a precipitation-weighted FFG interpolation procedure outlined in Gourley et al. (2012) is used to create interim hourly mosaics of 1-, 3-, and 6-h FFG. This procedure eliminates the problem of sharp jumps or changes in FFG values from issuance to issuance. In all cases, original FFG grids are preserved as they were originally issued. In the example above, the 1800 and 0000 UTC grids remain unchanged and only the interim products valid at 1900, 2000, 2100, 2200, and 2300 UTC are affected by the interpolation procedure. The following equation explains the interpolation process:
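\[ \mathrm{FFG}_{\mathrm{current}} = w_t\,\mathrm{FFG}_{\mathrm{next}} + (1 - w_t)\,\mathrm{FFG}_{\mathrm{previous}}, \qquad (2) \]
where $\mathrm{FFG}_{\mathrm{current}}$ represents the interpolated FFG at that exact hour, $\mathrm{FFG}_{\mathrm{next}}$ is the FFG product issued by the RFC at the next normal issuance time, $\mathrm{FFG}_{\mathrm{previous}}$ is the FFG product issued by the RFC at the previous normal issuance time, and $w_t$ is a precipitation weight given by
\[ w_t = \frac{\sum_{i=\mathrm{previous}}^{\mathrm{current}} P_i}{\sum_{i=\mathrm{previous}}^{\mathrm{next}} P_i}. \qquad (3) \]
Here $P_i$ denotes the hourly stage IV precipitation in hour $i$.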
The result of this is a series of national hourly FFG mosaics. For each hour, the stage IV precipitation estimate is divided by the corresponding interpolated FFG mosaic. As in Gourley et al. (2012), FFG skill at QPE-to-FFG ratios of 0.5, 0.75, 1.0, 1.25, 1.5, 2.0, 2.5, and 3.0 is examined in this study.
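
A minimal sketch of a precipitation-weighted interpolation for a single grid cell is given below. It assumes that the weight is the fraction of the rainfall between two issuances that has fallen by the current hour, and that the interim FFG is a linear blend of the two issued values; the function name and the fallback to a time-based weight when no rain falls are assumptions made here, not details taken from Gourley et al. (2012).

```python
def interpolate_ffg(ffg_previous, ffg_next, hourly_precip):
    """Interim hourly FFG values between two issuances for one grid cell.

    hourly_precip[k] is the QPE that fell during hour k + 1 after the
    previous issuance; its length is the issuance interval in hours.
    Returns FFG at hours 1..n-1; the two issued values are not modified.
    """
    n = len(hourly_precip)
    total = sum(hourly_precip)
    interim = []
    running = 0.0
    for k in range(1, n):
        running += hourly_precip[k - 1]
        # Assumed fallback: with no rain in the interval, weight by elapsed time.
        wt = running / total if total > 0 else k / n
        interim.append(wt * ffg_next + (1.0 - wt) * ffg_previous)
    return interim

# Hypothetical cell: FFG drops from 50 mm (1800 UTC) to 20 mm (0000 UTC),
# with 30 mm of rain falling in the third and fourth hours.
interim = interpolate_ffg(50.0, 20.0, [0.0, 0.0, 20.0, 10.0, 0.0, 0.0])
```

In this example the interim FFG stays at 50 mm until the rain begins, falls to about 30 mm after two-thirds of the rain has fallen, and reaches 20 mm once all of it has fallen, so the sharp dip at 0000 UTC is replaced by a decline that tracks the rainfall.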

The evaluation of FFG proceeds on a case-by-case basis (see Fig. 5). For each QPE-to-FFG ratio of interest, we search the interpolated hourly ratio mosaics for collections of adjacent HRAP grid cells where the ratio of interest is exceeded; each of these collections of grid cells is saved as a forecast flash flood event (also referred to hereafter as an FFG event). The ratios of interest are exceedance thresholds, not discrete intervals. Then, the next ratio mosaic in time is searched, and if two events in subsequent mosaics overlap at all, they are combined into one event. This procedure continues throughout the entire study period. In this manner, a series of FFG events is saved for 1-, 3-, and 6-h FFG products at each of the eight QPE-to-FFG ratios of interest. The HRAP grid cell in which each FFG event is centered is recorded for comparison with the actual reported NWS flash flooding events.
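
The event identification step can be sketched as a connected-component search followed by a merge across consecutive hours. The sketch below is illustrative only: the function names are hypothetical, "adjacent" is taken to mean 4-connected cells, and "exceeded" is taken to mean strictly greater than the threshold, both of which are assumptions.

```python
from collections import deque

def find_events(ratio_grid, threshold):
    """Return a list of sets of (row, col) cells; each set is one FFG event."""
    rows, cols = len(ratio_grid), len(ratio_grid[0])
    seen = [[False] * cols for _ in range(rows)]
    events = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or ratio_grid[r][c] <= threshold:
                continue
            # Breadth-first search over 4-connected cells above the threshold.
            cells, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                cells.add((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj]
                            and ratio_grid[ni][nj] > threshold):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            events.append(cells)
    return events

def merge_in_time(previous_events, current_events):
    """Combine each current event with every earlier event it overlaps."""
    merged = [set(e) for e in previous_events]
    for cur in current_events:
        overlapping = [e for e in merged if e & cur]
        for e in overlapping:
            merged.remove(e)
            cur = cur | e
        merged.append(cur)
    return merged

# Hypothetical 3 x 3 ratio grids for two consecutive hours.
hour0 = [[0.2, 1.2, 1.1], [0.1, 0.3, 0.4], [1.5, 0.2, 0.1]]
hour1 = [[0.1, 0.2, 1.3], [0.1, 0.2, 1.4], [0.3, 0.2, 0.1]]
events0 = find_events(hour0, 1.0)
events1 = find_events(hour1, 1.0)
merged = merge_in_time(events0, events1)
```

Here the first hour yields two events, and the single event in the second hour shares a cell with one of them, so the two are combined while the isolated event is left unchanged.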

Fig. 5. General schematic of the FFG event selection and evaluation process using Storm Data reports. The (top left) 1-h stage IV precipitation for the state of Oklahoma and (top right) 1-h FFG product for 0600 UTC 19 Aug 2007. (bottom) The precipitation grid is divided by the FFG grid to produce the ratio grid. All contiguous ratio grid cells over 1.0 are selected as an FFG event. Then all Storm Data events recorded within 2 h before or 8 h after the valid time of the ratio grid are used to determine the performance of FFG. Storm Data reports are shown in black, with a circle around each to represent the search radius used. The single black arrows refer to events where FFG correctly forecast the flash flood. The single red arrows refer to events where FFG failed to properly forecast the flash flood.

In the USGS stage height evaluation, the rasterized USGS basins plotted on the HRAP grid are compared to the interpolated hourly QPE-to-FFG ratio grids. Ratio data for the HRAP grid cells that compose a given USGS basin are extracted and stored for each hour. Then, the mean QPE-to-FFG ratio is calculated at each hour for each USGS basin in the analysis. When the basin-mean ratio exceeds a threshold of interest (0.5, 0.75, 1.0, and the other values listed above), the start time, end time, and basin in which this exceedance occurred are recorded.
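
The basin-mean calculation can be sketched as follows, assuming each rasterized basin is stored as a list of (row, col) HRAP cells and each hourly ratio mosaic as a nested list. The function name and data layout are hypothetical, and an exceedance is again assumed to mean strictly greater than the threshold.

```python
def basin_exceedances(ratio_grids, basin_cells, threshold):
    """Return (start_hour, end_hour) index pairs where the basin-mean ratio exceeds threshold."""
    periods, start = [], None
    for hour, grid in enumerate(ratio_grids):
        mean = sum(grid[i][j] for i, j in basin_cells) / len(basin_cells)
        if mean > threshold and start is None:
            start = hour                          # exceedance begins
        elif mean <= threshold and start is not None:
            periods.append((start, hour - 1))     # exceedance ended in the previous hour
            start = None
    if start is not None:                         # still exceeding at the end of the record
        periods.append((start, len(ratio_grids) - 1))
    return periods

# Hypothetical basin made of two cells, tracked over four hourly ratio grids.
basin = [(0, 0), (0, 1)]
grids = [
    [[0.4, 0.4]],
    [[1.5, 0.9]],
    [[1.1, 1.1]],
    [[0.2, 0.2]],
]
periods_1p0 = basin_exceedances(grids, basin, 1.0)
periods_0p3 = basin_exceedances(grids, basin, 0.3)
```

Note that in the second hour one of the two cells is below 1.0, but the basin mean (1.2) still exceeds the threshold, which is the behavior implied by averaging before thresholding.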

For both the NWS and USGS evaluations, the list of FFG events can simply be compared in space and time with the list of reported Storm Data events or with the list of recorded USGS events. If an FFG event centroid falls within the search radius (the normalized 49 HRAP grid cell area described above) of an NWS event centroid, a “hit” is recorded. “Misses” occur when no FFG event centroid falls within the search radius of an NWS event centroid, and “false alarms” occur when there is an FFG event with no associated NWS event. For USGS events, no search radius is used. Instead, the centroid of an FFG event must be located within the drainage area of the gauge where the flooding event was recorded. As in Gourley et al. (2012), time buffers are applied to each USGS and NWS report. The matching window is extended 8 h before the start time of each report (2 h of this are for general uncertainty in the timing of the flooding reports and estimated rainfall and 6 h of this allow for heavy rainfall to translate into surface flooding impacts), and 2 h after the end time of each report (again for uncertainty).
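
A sketch of the Storm Data matching step is given below. Several details are assumptions made for illustration: events are reduced to a centroid and a start and end hour, the search radius is taken as that of a circle with an area of 49 grid cells, the buffered report window runs from 8 h before the report start to 2 h after its end, hits and misses are counted per report, and false alarms are counted per unmatched FFG event.

```python
import math

def verify(ffg_events, reports, radius_cells, pre_hours=8, post_hours=2):
    """Count (hits, misses, false_alarms).

    ffg_events and reports are lists of (row, col, start_hour, end_hour).
    """
    matched_ffg = set()
    hits = misses = 0
    for rr, rc, rstart, rend in reports:
        found = False
        for k, (fr, fc, fstart, fend) in enumerate(ffg_events):
            near = (fr - rr) ** 2 + (fc - rc) ** 2 <= radius_cells ** 2
            # FFG event must overlap the buffered report window in time.
            in_time = fstart <= rend + post_hours and fend >= rstart - pre_hours
            if near and in_time:
                matched_ffg.add(k)
                found = True
        if found:
            hits += 1
        else:
            misses += 1
    false_alarms = len(ffg_events) - len(matched_ffg)
    return hits, misses, false_alarms

# Radius of a circle whose area equals 49 grid cells (about 3.95 cells).
radius = math.sqrt(49 / math.pi)

# Hypothetical events: one FFG event matches the first report; one is too
# far away and one is too late, so both of those count as false alarms.
ffg_events = [(10, 10, 5, 6), (40, 40, 5, 6), (10, 12, 30, 31)]
reports = [(11, 11, 9, 10), (70, 70, 9, 10)]
counts = verify(ffg_events, reports, radius)
```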

From hits, misses, and false alarms, a contingency table is populated (see Tables 3 and 4). Then, the standard metrics of probability of detection (POD), false alarm rate (FAR), and critical success index (CSI) (also called skill in this study) are computed:
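\[ \mathrm{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}}, \qquad (4) \]
\[ \mathrm{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}}, \qquad (5) \]
\[ \mathrm{CSI} = \frac{\text{hits}}{\text{hits} + \text{misses} + \text{false alarms}}. \qquad (6) \]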
A CSI value of 1.0 indicates perfect forecast skill while a CSI of 0.0 indicates the forecast had no skill.
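
As a worked illustration, the three metrics can be computed from contingency counts using their standard definitions. The counts below are hypothetical and are not taken from Table 4; returning NaN for an empty denominator is a choice made for this sketch.

```python
def skill_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from contingency counts; NaN where undefined."""
    nan = float("nan")
    observed = hits + misses
    forecast = hits + false_alarms
    total = hits + misses + false_alarms
    pod = hits / observed if observed else nan          # fraction of observed events forecast
    far = false_alarms / forecast if forecast else nan  # fraction of forecast events not observed
    csi = hits / total if total else nan                # hits relative to all events of either kind
    return pod, far, csi

# Hypothetical counts: 30 hits, 70 misses, 270 false alarms.
pod, far, csi = skill_scores(30, 70, 270)
```

With these counts, POD is 0.30 and FAR is 0.90, yet CSI is only about 0.08, which shows how a large number of false alarms can hold CSI down even when a moderate share of observed events is detected.
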
Table 3. Contingency table used to evaluate flash flood forecasts.

Table 4. Contingency table for the CONUS-wide Storm Data evaluation of FFG at a QPE-to-FFG ratio of 1.0.

5. Results

a. CONUS skill of operational flash flood guidance

Skill indices are presented for operational 1-, 3-, and 6-h FFG products running over the CONUS from 1 October 2006 to 31 August 2010 at eight different QPE-to-FFG ratios and evaluated against Storm Data reports (see Fig. 6). The CONUS-wide CSI of FFG ranges between 0.01 and 0.07; the latter value is achieved by 1-h FFG at a QPE-to-FFG ratio of 1.5. In particular, for the higher QPE-to-FFG ratios (2.0, 2.5, and 3.0), the FFG tool fails to forecast, or catch, a large number of the observed Storm Data events. This means the probability of detection is much better for low QPE-to-FFG ratios but the effect is counterbalanced by a higher false alarm rate at those ratios. The net result is, in the Storm Data analysis, that FFG performs best when considering moderate ratios. In general, the 1- and 3-h FFG products display similar skill values at all ratios examined while 6-h FFG is less skillful.

Fig. 6. The CONUS-wide skill of flash flood guidance for a variety of exceedance ratios. Observations are from reports of flash flooding in Storm Data between 1 Oct 2006 and 31 Aug 2010.

A similar analysis, but where FFG is evaluated using USGS stream gauge flood stage heights instead of Storm Data reports, shows slightly higher skill values (see Fig. 7). Now the CSI of FFG ranges between 0.01 and 0.20, where the highest skill occurs with the 3-h FFG product at a ratio of 0.5. Unlike in the Storm Data analysis, here the FFG skill is highest at low QPE-to-FFG ratios and declines sharply with increasing ratio. Additionally, the 6-h product is the most skillful and the 1-h product is the least skillful, with the 3-h FFG generally falling somewhere in between the other two. Gourley et al. (2012) also reported these differences between USGS and NWS analyses of FFG skill, where the skill curves using USGS reports are pushed up and to the left compared with the Storm Data skill curves. The high false alarm rates in the NWS analysis can be partially explained by underreporting of flash flood events in sparsely populated regions of the United States. The authors are more confident about conclusions drawn from the Storm Data analysis because there are many more events available in that dataset. Not all small basins gauged by the USGS have defined action stages, so the sample size of gauges and events available for the USGS analysis is much smaller than desired.

Fig. 7. As in Fig. 6, but flash flood observations are from exceedances of action stage heights at USGS stations plotted in Fig. 4 for the basin-mean QPE-to-FFG ratio.

b. Flash flood guidance skill by River Forecast Center

FFG values are generated at the RFC level, and different methods of generating FFG were in operational use during the study period. The number of Storm Data events in each RFC domain varies, as does the number of events normalized for the area of each RFC domain (see Table 5). The three western RFCs (Northwest, California Nevada, and Colorado Basin) have the lowest number of events per 1000 km². This is probably due to underreporting of events; these areas have low population densities relative to the rest of the United States, and FFG in these areas displays very high false alarm rates, which could indicate that forecast events are not being observed. The event densities in the other nine RFCs display less variance, suggesting underreporting of events is less of a problem in those areas.

Table 5. Number of events per RFC in Storm Data during the study period.

When Storm Data reports are used to evaluate FFG at a QPE-to-FFG ratio of 1.0, the DFFG product developed at the Middle Atlantic RFC performs well, with an observed CSI of 0.15 (see Fig. 8a and Table 6). The RFCs in the western United States, using FFPI, exhibit skills ranging from 0.00 to 0.04. RFCs in roughly the northern half of the CONUS east of the Rockies, generally using LFFG, have CSIs between 0.07 and 0.12. Finally, those RFCs using GFFG, generally located in the southern half of the CONUS east of the Rockies, have CSIs between 0.05 and 0.07.

Fig. 8. Map of FFG skill as verified by Storm Data flash flooding reports when (a) a QPE-to-FFG ratio of 1.0 is considered and (b) when any QPE-to-FFG ratio is considered.

Table 6. Statistics based on Storm Data reports used to produce Fig. 8. The three leftmost columns show the best CSI and corresponding POD and FAR associated with a QPE-to-FFG ratio of 1.0. The three rightmost columns show the best CSI and corresponding POD and FAR associated with any QPE-to-FFG ratio.

If the evaluation is expanded to include any QPE-to-FFG ratios (see Fig. 8b and Table 6), the MARFC’s skill improves to 0.19. The skill of the western RFCs improves slightly, now ranging from 0.00 to 0.05. Improvement is also noted in the RFCs using LFFG (0.07–0.16) and those using GFFG (0.08–0.13).

If USGS flood stage height reports are used to evaluate FFG, a different picture emerges. At the standard QPE-to-FFG ratio of 1.0, the three western RFCs all have a CSI of 0.00 (see Fig. 9a and Table 7). The CSI of DFFG in the MARFC domain (0.16 here) is roughly the same as it was in the Storm Data analysis (0.15). The skill indices of the RFCs running GFFG all improve, some in dramatic fashion. In this analysis, those values now range from 0.12 to 0.33. Finally, some RFCs running LFFG improve their CSI numbers in this analysis, while others decline, ranging from 0.04 to 0.19.

Fig. 9. As in Fig. 8, but when verified by USGS stream gauge measurements.

Table 7. As in Table 6, but based on USGS stream gauges used to produce Fig. 9.

If the analysis is expanded to include all QPE-to-FFG ratios, FFPI skill at the three western RFCs does not improve (see Fig. 9b and Table 7). Some improvement in GFFG areas is noted when additional ratios are considered, with CSIs ranging from 0.18 to 0.43. There is also improvement over the LFFG domains, where CSIs now range from 0.16 to 0.27. Finally, the DFFG product used in the MARFC has a skill of 0.22 in this analysis. The USGS data suffer from a small sample size (see Table 8) and are less likely to include the difficult forecast locations associated with very small basins, urban runoff, wildfire scars, and overland flows. For these reasons, more credence should be given to conclusions drawn from the Storm Data analysis.

Table 8. Number of events per RFC in the USGS stage height exceedance database.

c. Flash flood guidance skill by generation method

Because this study evaluates whichever version of FFG was being produced operationally at the various RFCs, each generation method cannot be directly compared, since each type of FFG was running over different regions and at different times. However, we divide the CONUS into four spatial and temporal regions; these represent the times and places where FFPI, LFFG, GFFG, and DFFG were operational. We provide statistics about each method without providing any judgment on the relative utility of each.

FFPI was operational at the Northwest RFC and the Colorado Basin RFC during the entire study period and at the California Nevada RFC after November 2008. Using Storm Data flash flooding reports (N = 820) to verify the skill of FFPI reveals the minimal skill of the product at all QPE-to-FFG ratios considered (see Fig. 10b). The maximum skill of FFPI (CSI = 0.02) is achieved when the 3-h product is used at a ratio of 0.75. At all ratios, high false alarm rates, ranging from 99% to 100%, and low probabilities of detection, ranging from 1% to 40%, contribute to the low CSI of FFPI. The poor skill of the FFPI method can only be partially explained by the low population densities and less frequent reporting of events. Poor radar coverage over the western United States results in difficulties in the production of the stage IV precipitation estimates used in the study. Some areas of the West rely on climatological precipitation altered by rain gauge data to produce stage IV estimates. Additionally, flash floods in this region are often caused by meteorological systems of a different character than the organized convective systems in the central and eastern CONUS. This also helps explain the low skill of FFPI, which was observed in both the Storm Data and USGS flood stage height analyses.

Fig. 10. The skill of (a) DFFG, (b) FFPI, (c) LFFG, and (d) GFFG for a variety of exceedance ratios. Observations are from reports of flash flooding in Storm Data between 1 Oct 2006 and 31 Aug 2010.

LFFG was operational at 8 of the 12 CONUS RFCs at the start of the study period, but was replaced by GFFG at three of these offices and by FFPI at a fourth. A total of 7760 Storm Data reports are included in this analysis. LFFG is most skillful at ratios of 1.5 or 2.0 (see Fig. 10c) and the 1- and 3-h products are generally more skillful than the 6-h ones. False alarm rates for this product range from 85% to 99%, and probabilities of detection range from 8% to 40%.

GFFG was developed at the Arkansas-Red Basin RFC in 2005 and then later deployed at the West Gulf RFC, Lower Mississippi RFC, and Southeast RFC. In these domains, during the times in which GFFG was operational, a total of 5530 flash flooding events were recorded in Storm Data. At all QPE-to-FFG ratios (see Fig. 10d), the 1-h GFFG product is the most skillful, with 3-h GFFG less skillful, and 6-h GFFG even less skillful than 3-h GFFG. The best GFFG skill is observed when the high ratios (2.0 and higher) are considered, though once a ratio of 2.0 is reached, neither 2.5 nor 3.0 provides an improvement or a dropoff in skill.

DFFG, which was only produced at the Middle Atlantic RFC during the study period, has the same spatial variability as GFFG but is generated using a different hydrologic model. There were 710 reported Storm Data flash flood events in the MARFC domain, which is a much smaller dataset compared to the other analyses. Nonetheless, at QPE-to-FFG ratios between 1.0 and 2.0, DFFG performs well (see Fig. 10a). The best DFFG skill in the Storm Data evaluation occurs when using 1-h DFFG at a ratio of 1.25, which results in a CSI of 0.19.

6. Conclusions

This study establishes the benchmark skill of the operational flash flood guidance (FFG) product used by the National Weather Service (NWS) to forecast, monitor, and warn the public about dangerous flash flooding events. Although flash flood guidance has been produced by the NWS for over 40 yr, little literature about its performance outside of isolated case studies exists. Using a CONUS-wide observational database of flash flooding events consisting of two separate sources, FFG was evaluated on a national scale, on an RFC scale, and by the various methods used to generate the product operationally. This evaluation covers a 4-yr period with events occurring in all but one state of the CONUS (Washington), including over 2200 instances of flash flooding recorded by USGS stream gauges and over 14 000 instances of flooding recorded by NWS forecasters.

Storm Data reports include thousands of observed events because that database is intended to be comprehensive. Subjectivity due to the human element must be considered in any analysis relying upon Storm Data reports. The USGS reports, on the other hand, suffer from small sample sizes. Only a portion of the available USGS stream gauges was used in this study, because not all gauged locations have defined action stages and not all of those locations experienced an action stage exceedance during the study period. However, those flash floods that were recorded will be highly reliable because of the automated nature of the USGS observations. For this reason, the best approximation of the skill of FFG at forecasting flash floods and near–flash floods is the CONUS-wide USGS analysis.

Using Storm Data reports as verification, FFG (all methods combined) achieved a maximum skill (CSI = 0.07) using the 1-h accumulation product at a QPE-to-FFG ratio of 1.5. When using the USGS flood stage heights as the verification source, a maximum skill of 0.20 occurred with the 1-h accumulation product for a QPE-to-FFG ratio of 0.5. This latter value should serve as a benchmark skill for FFG on the national scale in subsequent research.

Four different methods for deriving FFG exist within the River Forecast Centers (RFCs). We grouped them according to FFG generation method, being either distributed (DFFG), lumped (LFFG), gridded (GFFG), or flash flood potential index (FFPI), and compared their skill. This intercomparison is not objective because each method was running in different locations and so the events and sample sizes are quite different. When using the Storm Data reports of flash flooding, LFFG, the oldest current method of FFG generation in use at the end of the study, performed best when its 3-h version was used with a ratio of either 1.5 or 2.0. GFFG, a newer method with some higher-resolution components, was most skillful when its 1-h version was used at ratios of 2.0 and higher. DFFG, though only used at the Middle Atlantic RFC, reached a skill of 0.19 when the 1-h product was evaluated against Storm Data reports while using a ratio of 1.25. FFPI, used in the West, had CSI values of 0.02 or lower at all ratios. Since these methods were not operating at the same times and in the same places, factors like topography, radar coverage, and population density prevent us from ranking the relative skill of any of them.

Future research in this area should continue, specifically looking at new ways of generating FFG with advanced distributed hydrologic models. More observational datasets, including additional small-scale USGS gauged basins, could be used to produce more detailed evaluations of FFG. Finer-scale observations like those collected by the Severe Hazards Analysis and Verification Experiment (Gourley et al. 2010) could also be used to build upon this study and may also support the more widespread adoption of modern, distributed methods of generating products to supplement or replace the current FFG system.

National Weather Service forecasters working in areas covered by the FFPI method of FFG generation should continue to use a wide range of information in the flash flood monitoring and warning process, as relying solely on FFPI-generated FFG may have undesirable results. RFCs currently running the LFFG system should consider transitioning to GFFG or DFFG, since the overall distribution of forecast values is similar, but GFFG and DFFG produce higher-resolution information on a scale similar to that used in radar precipitation estimates and in the flash flood basins used by the Flash Flood Monitoring and Prediction program. NWS forecasters should remain aware of locations where a QPE-to-FFG ratio of 1.0 is exceeded while recognizing that the skill of the guidance product is potentially maximized at either a higher or lower ratio depending on the part of the country and the FFG generation method being considered. Additionally, NWS forecasters can be trained to modify FFG within their county warning area in an effort to increase the usefulness of the current product until more permanent improvements can be made. These modifications are currently undertaken at many WFOs across the United States, particularly in urban areas, but they are not centrally archived and thus were not evaluated as a part of this study.

Over more than 40 yr, flash flood guidance has been a critical link in the system that protects Americans and their property from the most dangerous storm-related hazards. Many years of modifications have resulted in a patchwork of different generation methods and ideas about how FFG should work. The results of this study are not intended to discourage the storm-scale hydrologic community or to disparage the current state of FFG generation. Instead, the groundwork is being laid for the meteorological and hydrological communities to explore large-scale improvements to operational FFG in the hope of improving scientific understanding of flash flooding events and of making flash flood forecasts more specific, more accurate, and more useful.

Acknowledgments

Funding was provided by the NOAA/Office of Oceanic and Atmospheric Research under NOAA–University of Oklahoma Cooperative Agreement NA17RJ1227, the U.S. Department of Commerce, and the National Weather Service’s Advanced Hydrologic Prediction System funds. We appreciate the former NWS National Precipitation Verification Unit for making the QPE and GFFG grids available to us. We are also grateful to the USGS for access to the database of streamflow observations. Comments from three anonymous reviewers improved the content and readability of the manuscript. We appreciate their time.

REFERENCES

  • Arthur, A., G. Cox, N. Kuhnert, D. Slayter, and K. Howard, 2005: The National Basin Delineation Project. Bull. Amer. Meteor. Soc., 86, 1443–1452.

  • Ashley, S., and W. Ashley, 2008: Flood fatalities in the United States. J. Appl. Meteor. Climatol., 47, 806–818.

  • Carpenter, T., J. Sperfslage, K. Georgakakos, T. Sweeney, and D. Fread, 1999: National threshold runoff estimation utilizing GIS in support of operational flash flood warning systems. J. Hydrol., 224, 21–44.

  • Davis, R., 2007: Detecting the entire spectrum of stream flooding with the Flash Flood Monitoring and Prediction (FFMP) program. Preprints, 21st Conf. on Hydrology, San Antonio, TX, Amer. Meteor. Soc., 6B.1. [Available online at https://ams.confex.com/ams/pdfpapers/120738.pdf.]

  • Gourley, J. J., J. Erlingis, T. Smith, K. Ortega, and Y. Hong, 2010: Remote collection and analysis of witness reports on flash floods. J. Hydrol., 394, 53–62.

  • Gourley, J. J., J. Erlingis, Y. Hong, and E. Wells, 2012: Evaluation of tools used for monitoring and forecasting flash floods in the United States. Wea. Forecasting, 27, 158–173.

  • Gourley, J. J., and Coauthors, 2013: A unified flash flood database over the United States. Bull. Amer. Meteor. Soc., 94, 799–805.

  • Helble, T., 2010: Definitions and general terminology. NWS Manual 10-950, 5 pp. [Available online at http://www.nws.noaa.gov/directives/sym/pd01009050curr.pdf.]

  • Koren, V., M. Smith, D. Wang, and Z. Zhang, 2000: Use of soil property data in the derivation of conceptual rainfall-runoff model parameters. Preprints, 15th Conf. on Hydrology, Long Beach, CA, Amer. Meteor. Soc., 103–106.

  • Lin, Y., cited 2012: Q&A about the new NCEP stage II/stage IV. Mesoscale Modeling Branch, Environmental Modeling Center, National Centers for Environmental Prediction. [Available online at http://www.emc.ncep.noaa.gov/mmb/ylin/pcpanl/QandA.]

  • Mogil, H., J. Monro, and H. Groper, 1978: NWS’s flash flood warning and disaster preparedness programs. Bull. Amer. Meteor. Soc., 59, 690–699.

  • NWS, cited 2012: National Weather Service glossary. [Available online at http://w1.weather.gov/glossary/index.php.]

  • RFC Development Management Team, 2003: Flash Flood Guidance Improvement Team—Final report. River Forecast Center Development Management Team Rep. to the Operations Subcommittee of the NWS Corporate Board, 47 pp. [Available online at http://www.nws.noaa.gov/oh/rfcdev/docs/ffgitreport.pdf.]

  • Schmidt, J., A. Anderson, and J. Paul, 2007: Spatially-variable, physically-derived, flash flood guidance. Preprints, 21st Conf. on Hydrology, San Antonio, TX, Amer. Meteor. Soc., 6B.2. [Available online at https://ams.confex.com/ams/pdfpapers/120022.pdf.]

  • Smith, G., 2003: Flash flood potential: Determining the hydrologic response of FFMP basins to heavy rain by analyzing their physiographic characteristics. Rep. to the NWS Colorado Basin River Forecast Center, 11 pp. [Available online at http://www.cbrfc.noaa.gov/papers/ffp_wpap.pdf.]

  • Sweeney, T., 1992: Modernized areal flash flood guidance. NOAA Tech. Rep. NWS HYDRO 44, NOAA/NWS/Hydrologic Research Laboratory, Silver Spring, MD, 21 pp. + an appendix.

  • Sweeney, T., and T. Baumgardner, 1999: Modernized flash flood guidance. Rep. to NWS Hydrology Laboratory, 11 pp. [Available online at http://www.nws.noaa.gov/oh/hrl/ffg/modflash.htm.]
