Flash flooding is frequently associated with heavy precipitation (defined here as ≥1 in. h−1) occurring over a short period of time. Although a significant amount of work has been done to define the climatology of precipitation on a timescale of 24 h (Smith and Bradley 1994), this timescale is much longer than that associated with flash flood–producing rains (Maddox et al. 1978; Vieux and Bedient 1998). Unfortunately, reports of flash floods are often vague, many flash flood events are probably never reported, and there is no national database for collecting flash flood reports (Maddox et al. 1979). In the case of individual flooding events that are reported and studied, bucket surveys are often done in which any container that holds water is used to estimate precipitation totals. Quality control of bucket surveys is problematic, however, since the question of whether or not a container was empty at the beginning of a heavy precipitation event can rarely be answered with confidence. This leaves the meteorological community with little reliable data, valid over a number of years, on the climatology of flash floods in the United States.
The most useful database to begin a study of flash flood–producing rain events is the Hourly Precipitation Dataset (HPD), archived at the National Climatic Data Center (NCDC). This database is used to develop a climatology of heavy rains on timescales of 3 h or less across the contiguous United States. Although it is not possible to directly relate heavy rainfall events to flash floods, knowledge of the frequency and distribution of heavy rainfall can help to better define the potential for flash floods.
The HPD provides hourly observations of precipitation from all 50 states for a period of more than 40 yr (we use 1948–93 in this paper). Approximately 5000 sites are found in the archive, although few of the sites have records covering the whole period (Fig. 1). The number of reporting stations grows from approximately 300 in the late 1940s to approximately 2800 in the early 1980s. The latter number represents a station density approximately equivalent to a uniform network with stations spaced 50 km apart. In addition to the changes in the gauge network over time, the gauges themselves have different accuracies, with some stations reporting in hundredths of an inch, whereas others report in tenths of an inch. Although the coverage provided by the HPD is not uniform in either space or time, this dataset still represents the most complete and accurate set of measurements of precipitation presently available.
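The equivalent station spacing quoted above can be checked with a short calculation. The sketch below assumes that the stations are spread uniformly over the contiguous United States and borrows the area of approximately 7.5 × 10^6 km^2 used in section 4; the function name is ours and is purely illustrative.

```python
import math

# Area of the contiguous United States as used in section 4 (km^2).
CONUS_AREA_KM2 = 7.5e6


def equivalent_spacing_km(n_stations, area_km2=CONUS_AREA_KM2):
    """Spacing of a uniform square grid holding n_stations over area_km2.

    Assumption: each station is taken to represent an equal square cell,
    so the spacing is the side length of that cell.
    """
    return math.sqrt(area_km2 / n_stations)


if __name__ == "__main__":
    for n in (300, 2800):
        print(f"{n} stations -> {equivalent_spacing_km(n):.0f} km spacing")
```

With roughly 2800 stations the cell side comes out near 52 km, consistent with the 50-km figure; with roughly 300 stations it is closer to 160 km.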
From a meteorological perspective, flash floods may be the most difficult forecast hazard associated with thunderstorms (Maddox et al. 1979; Doswell et al. 1996). The forecasting process would be greatly assisted by a better knowledge of the climatology of heavy precipitation events, particularly if probabilistic estimates of threats are to be made. Since heavy precipitation is a rare event at any one location, the climatology of the occurrence of heavy precipitation also has implications for the experience level of weather forecasters dealing with the flash flood problem. Accurate forecasts of the threat in any given situation are crucial for the protection of life and property.
We begin by describing the format of the HPD, and briefly highlight some of the problems found in this dataset, in section 2. The observed monthly frequencies and distribution of rainfall amounts greater than 1 in. h−1 are presented in section 3. Section 4 describes an approach for estimating the frequency of more extreme rainfall rates, while section 5 discusses the implications of these results.
2. Nature of the HPD
The HPD consists of a series of records, one for each station that observes precipitation on a given day. Each record gives the hourly totals and ends with the total daily precipitation amount. As such, the HPD is well suited to computing time series of precipitation at individual stations, and to making counts of precipitation values, particularly for hourly and total daily observations. Such time series can be revealing in certain cases. For example, part of the warning problem associated with the 19–20 July 1977 Johnstown, Pennsylvania, flash flood (Bosart and Sanders 1981) is readily apparent in the time series (Fig. 2). Most of the more than 8 inches that fell at Johnstown occurred after the late evening news, which was probably the last good opportunity to reach the public with warning information. On the other hand, many events are not sampled by the HPD. With the 9 June 1972 Rapid City, South Dakota, flood (Maddox et al. 1978), two HPD sites were within 30 km of an estimated 12 in. (3 h)−1 rainfall, yet neither site recorded more than 2 inches. In an even more extreme case, no HPD sites recorded significant precipitation associated with the Big Thompson River, Colorado, flood of 31 July 1976 (Maddox et al. 1978). Gauges at Drake and Estes Park, Colorado, were not put into place until after the flood. Similarly, for some flash floods after the period of record that we have considered, such as the Fort Collins, Colorado, flood of 1997, no sites recorded significant precipitation.
As with any large observational dataset, quality control is a significant concern. Some errors are easily detected and removed from the dataset, while we cannot determine the accuracy of a number of other questionable observations. Examples of the simple errors include a number of records in which the value in the hundredths of an inch column is reproduced in the tens of inches column, leading to reported hourly accumulations of 10.01" and 20.02". These values are clearly unreasonably large and, with the repetitive pattern, can be eliminated. Somewhat less obvious errors are extremely large reports (e.g., 15.55" in an hour), but the extreme value still makes it possible to eliminate them automatically.
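The two kinds of easily detected error described above lend themselves to an automated check. The following is a minimal sketch of such a check and not the procedure actually applied to the HPD: the function names, the reading of the column-repeat pattern (tens digit equal to the hundredths digit), and the 10-in. cutoff for "unreasonably large" values are all our assumptions.

```python
# Hypothetical cutoff (inches per hour) above which a report is treated
# as unreasonably large; the paper does not state a specific value.
EXTREME_LIMIT_IN = 10.0


def is_column_repeat(amount_in):
    """True when the hundredths digit reappears in the tens-of-inches
    column, as in 10.01 or 20.02 (assumed form of the error)."""
    hundredths = round(amount_in * 100)   # 10.01 -> 1001
    tens_digit = hundredths // 1000       # 1001 -> 1
    hundredths_digit = hundredths % 10    # 1001 -> 1
    return tens_digit > 0 and tens_digit == hundredths_digit


def qc_flag(amount_in):
    """Classify one hourly amount as 'column_repeat', 'extreme', or 'keep'."""
    if is_column_repeat(amount_in):
        return "column_repeat"
    if amount_in >= EXTREME_LIMIT_IN:
        return "extreme"
    return "keep"


if __name__ == "__main__":
    for value in (10.01, 20.02, 15.55, 1.25):
        print(value, qc_flag(value))
```

A check of this kind only removes the unambiguous cases; reports that are large but physically possible would still pass through as "keep".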
Another type of error involves the recording of hundredths of an inch values in the tenths of an inch column (e.g., 7.80" in 1 h, followed by 6.50" with no precipitation on either side of those 2 h). These are problematic, as are other isolated instances of large precipitation (≥4 in. h−1), because there is no objective way to determine whether they are the result of “bad” data or extremely large, “good” data. Indeed, this is a fundamental problem in using any dataset about any rare, extreme event; bad data and interesting, good data frequently look alike. In an attempt to remove any bad data, we hand checked every hourly report of greater than 4.5" (approximately 360 reports) against reports from meteorological and climatological journals, and found that only a few are likely to be good reports. The rest fell into one of the kinds of errors mentioned above. The real difficulty comes in attempting to hand check the much larger number of reports at smaller hourly amounts. The distinction between obviously bad and good data becomes blurred, and the volume of work becomes prohibitive. Because of limited resources and no other datasets to use for confirmation, this problem with the HPD is not addressed in this study and may lead to some inaccuracies in the results presented.
Groisman and Legates (1994) have discussed some sources of observational errors in the dataset, noting that the gauges tend to underestimate true precipitation, particularly in winter, and are inadequate for use in areal-mean calculations. Since the majority of the events that we are concerned with occur in summer, and since we will use gauge accumulations, rather than areal calculations, we believe that some of these error sources will not affect our results. Nevertheless, concerns about the accuracy of the gauge record remain a problem in any analysis.
3. Observed frequency of hourly precipitation
Once obvious errors are removed from the HPD records, the data are combined in a variety of ways to explore the seasonal distribution of heavy rainfall. In particular, the average number of events per month and the mean monthly frequencies are calculated.
a. Mean monthly values
One of our purposes in investigating heavy precipitation is to consider the number of times that operational weather forecasters have to deal with it as a forecast problem, in particular from the perspective of a national forecast center. Hence, we begin by examining the average number of events in the contiguous United States over the entire period of the HPD records. Since our focus is upon rainfall amounts that could contribute to producing flash floods, only hourly rainfall totals in excess of 1 inch are investigated. The observations are binned into half-inch bins (e.g., 1–1.5 inch, 1.5–2 inch) to simplify the data analysis, and to increase the number of samples at the higher precipitation amounts.
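The half-inch binning can be sketched as follows. This is an illustration only: the treatment of bin edges (each bin includes its lower edge and excludes its upper edge, with reports below 1 inch discarded) is our assumption, as are the function names.

```python
import math
from collections import Counter


def half_inch_bin(amount_in):
    """Return the lower edge of the half-inch bin holding amount_in,
    or None for amounts below the 1-inch threshold.

    Assumed convention: bins are [1.0, 1.5), [1.5, 2.0), and so on.
    """
    if amount_in < 1.0:
        return None
    return math.floor(amount_in * 2) / 2


def count_by_bin(hourly_amounts_in):
    """Count hourly reports per half-inch bin, ignoring sub-threshold ones."""
    counts = Counter()
    for amount in hourly_amounts_in:
        lower_edge = half_inch_bin(amount)
        if lower_edge is not None:
            counts[lower_edge] += 1
    return counts


if __name__ == "__main__":
    # Made-up hourly amounts, not HPD values.
    print(sorted(count_by_bin([0.4, 1.0, 1.49, 1.5, 2.75]).items()))
```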
Results show that the annual cycle of heavy precipitation peaks in July and is symmetric about that month (Fig. 3). Twenty percent of all observations occur in July and more than 81% of all observations occur from April through September. This is very similar to the monthly distribution of flash flood events over a 5-yr period studied by Maddox et al. (1979), who find that 25% of flash floods occur in July and 86% of flash floods occur from April through September. However, the number of flash flood events is much smaller than the number of 1 in. h−1 rainfall events, with flash floods being roughly 100 times less frequent on average. Over all months, there are approximately 2400 reports of 1–1.5 inch hourly rainfalls each year and 3200 reports of 1 inch or greater hourly rainfalls, whereas there appear to be only 30 reported flash flood events in a typical year (Maddox et al. 1979).
The results of this analysis also indicate that the number of events decreases logarithmically as the precipitation value increases (Fig. 4); that is, the logarithm of the number of events falls off linearly with increasing amount. The fit to the curve for the July observations is extremely good from the 1 through 4 inch accumulation bins. Comparison of the number of reported events, given by the black squares in Fig. 4, to the logarithmic line gives us some confidence as to the number of extreme events reported in the HPD that are likely to be bad. Based on that fit, we estimate that approximately 90% of the reports greater than 4.5 in. h−1 are bad. As mentioned before, a large number of these are relatively easy to detect, and quality control procedures for them are easy to automate. Based on this analysis, about one-third of those extreme events cannot be rejected with simple techniques. Later, the logarithmic fit will provide a powerful tool for estimating the “true” number of events that occur in the United States in a year.
Although each month follows a similar logarithmic decrease, the rate of decrease shows hints of a seasonal cycle. In the summer, the number of events observed in a given half-inch increment decreases to approximately 7.5% of the value one inch lower. In the winter, it decreases more rapidly to approximately 6.5% of the value one inch lower. While this is a relatively small change, it could be an indication of the greater frequency of strong convection, and hence, high rain rates, in the summer season.
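The per-inch fall-off quoted above can be estimated from binned counts by a straight-line fit to the logarithm of the counts. The sketch below uses ordinary least squares, which is our choice of method and not necessarily the one used to draw the line in Fig. 4; the counts in the usage example are synthetic and are not HPD values.

```python
import math


def per_inch_ratio(bin_lower_edges_in, counts):
    """Fit log(count) = a + b * amount by least squares and return exp(b),
    the factor by which the count changes for each additional inch."""
    xs = list(bin_lower_edges_in)
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(slope)


if __name__ == "__main__":
    # Synthetic counts built to fall to 7.5% of their value per inch.
    edges = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
    synthetic = [1000.0 * 0.075 ** (edge - 1.0) for edge in edges]
    print(f"fitted ratio per inch: {per_inch_ratio(edges, synthetic):.3f}")
```

Fitting each month separately in this way would give the seasonal variation in the ratio described above.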
Results for accumulation times of 2 and 3 h display a few surprises (not shown). On average, the HPD contains 8000 (11 400) reports of 1 inch (greater than 1 inch) rains in 2 h and 17 400 (25 000) reports of 1 inch (greater than 1 inch) rains in 3 h. The logarithmic decay with increasing amount is slower than for the 1 in. h−1 rainfalls, and the seasonality is more pronounced. Overall, the 2-h values fall off to 11.8% for each inch of increased accumulation, while the 3-h values fall off to 13.2% per inch of increased accumulation. In summer, the 2-h observations fall off to 13% for each one inch increase in rainfall, while the value is 10% in the winter. For 3 h, the corresponding values are 16% in summer and 12% in winter. It is possible that this is due to the impact of larger-scale weather systems in producing more sustained periods of relatively heavy precipitation, leading to persistent observations of rainfall amounts of more than one inch per hour over 2- and 3-h periods. However, the extreme precipitation values (e.g., 3" or more) still result predominantly from convective environments, which are most pronounced in the summer. The mean horizontal frequency of these events over the United States is now examined.
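For reference, 2- and 3-h accumulations can be formed from an hourly series as running sums. The sketch below uses overlapping windows, which is an assumption on our part; the text does not specify whether overlapping or fixed, non-overlapping periods were used, and the function name is illustrative.

```python
def running_totals(hourly_in, window_h):
    """Return the sums over every run of window_h consecutive hours.

    Assumption: windows overlap, so one heavy hour contributes to
    several multi-hour totals.
    """
    last_start = len(hourly_in) - window_h
    return [sum(hourly_in[i:i + window_h]) for i in range(last_start + 1)]


if __name__ == "__main__":
    # Made-up hourly amounts (inches); no single hour reaches 1 inch,
    # yet one 2-h total and two 3-h totals exceed it.
    hourly = [0.0, 0.5, 0.75, 0.25, 0.0]
    print(running_totals(hourly, 2))
    print(running_totals(hourly, 3))
```

The usage example illustrates why multi-hour thresholds pick up sustained rainfall that the hourly threshold misses.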
b. Monthly mean frequency distribution
Results show a distinct seasonal cycle in the heavy rainfall frequencies (Fig. 5), with the Gulf Coast states having the greatest and most persistent maxima in hourly 1 in. or greater rainfalls. A maximum frequency of 1.13 events per year occurs over Florida during August, while earlier in the year a maximum frequency of 0.86 events per year occurs over southern Louisiana during May. Away from the coastal zones, the largest frequency calculated is 0.52 events per year over southeastern Kansas during June. This local maximum shifts northward to northeastern Kansas during July. The dataset also captures numerous local features, such as the minima in heavy rainfall frequency in Missouri and along the Appalachians during the summertime. The North American monsoon (Douglas et al. 1993) is also seen clearly in the frequencies over Arizona during July through September.
One problem with using this analysis to assist in determining the flash flood threat is the lack of a significant number of events in the western United States. In contrast with this heavy rainfall analysis, Maddox et al. (1979) find that 21% of the flash flood events occur over the western United States. These events are often associated with rainfall totals of 2–4 in. over just a few hours. However, there is little indication of frequent, heavy hourly rainfall in the HPD outside of Arizona (Fig. 5). This may be due in part to the sparser gauge network and shorter gauge record lengths over the western United States (Fig. 1). The location of the rain gauges may also contribute to this sampling problem. Whereas the high terrain features strongly influence the development of convection in the western states (e.g., Caracena et al. 1979; Maddox et al. 1978; Yoshizaki and Ogura 1988), the rain gauges are typically located at lower elevations. Therefore, the gauge network is not able to capture the rainfalls observed over the higher elevations. In addition, owing to the high variability and rocky nature of the terrain over the western United States, it does not require a large amount of rainfall to create a flash flood. These factors all contribute to the inadequate representation of western events in the HPD. Lightning strike data may be an important source of information in helping to develop a rainfall climatology over this region of the country (see Sheridan et al. 1997).
The hourly rainfall data indicate a high frequency of heavy hourly rainfall events along the Gulf Coast states that is not seen in the 5-yr climatology of flash flood events reported by Maddox et al. (1979). This suggests that although heavy rainfall events are frequent along the coastal zones, the interaction between the rainfall and the underlying ground surface may help to minimize the flash flood threat. However, an examination of severe weather reports during the month of July from 1993 to 1997 indicates that flash flooding does occur in the Gulf Coast states, with nearly two flash flood events per year on average in Alabama, Georgia, and Florida. Therefore, the complete lack of flash flood reports in the 5-yr dataset used by Maddox et al. (1979) may also be attributed in part to a reporting problem or to variability between different periods of record.
A slightly different perspective of the seasonal distribution of heavy rainfall is seen when objectively analyzing the observed number of 4 in. (3 h)−1 rainfall events. Since these events occur less frequently, the values are more uncertain than the values found using the 1 in. h−1 threshold and may be dominated by a handful of events. However, even with this additional uncertainty, a number of interesting features are present that may be realistic (Fig. 6), although a longer record length is needed to obtain statistical certainty. The July local maximum in heavy rain in Minnesota stands out, indicating that possibly the summer peak rainfall rates are sustained there longer than in other regions of the country. Numerous regions along the Gulf Coast have smaller frequencies of 4 in. (3 h)−1 rainfall than seen in regions of the midwestern states. This suggests that one potential explanation for the perceived fewer flash flood events in the southeastern states seen by Maddox et al. (1979) is a less frequent occurrence of persistent, heavy rainfall than found in the midwestern states. In essence, rainfalls of 1 in. h−1 are common in the southeast, but sustaining such rainfall amounts over several hours is not.
The Texas Hill Country also shows up as a local maximum for this heavier rainfall amount, as does a portion of the Front Range near the Colorado–Wyoming–Nebraska borders where mesoscale convective systems often are initiated (Wetzel et al. 1983). In general, the patterns seen here mimic those seen at the lower rainfall threshold (Fig. 5), but at a significantly reduced frequency. Therefore, given the caveat that the dataset spans too short a time period to produce robust results, it appears that knowledge of the 1 in. h−1 frequencies is useful in determining regions where heavier rainfall can occur over a several-hour period, except in the western United States.
4. Estimating frequency of extreme events
Although the HPD provides the most complete set of high temporal and spatial resolution observations of precipitation available, it is clearly inadequate for capturing extreme precipitation events owing to the relatively short record length of 40 yr and the large station spacing of 50 km. Since it is these extreme events that provide the most danger of a major disaster, it is important to have some basis for climatological estimates of risk. The regular, logarithmic decline of the number of events with increasing precipitation amount allows us to make estimates of the number of more extreme events, assuming that the observations in the HPD are representative of the true climatological frequency of extreme precipitation over the contiguous United States. This is of particular interest for efforts to understand the likelihood of heavy precipitation events at even higher spatial resolution.
The regular, logarithmic decline in the number of rainfall observations per inch increase in hourly rainfall amounts allows us to estimate the annual frequency of heavier hourly rainfall amounts. For example, there are approximately 0.0005 times as many 6 in. h−1 events as 3 in. h−1 events. Since approximately 20 events of 3 in. h−1 are observed each year, approximately 0.01 events of 6 in. h−1 would be observed per year by the stations in the HPD. Thus, we expect that the order of magnitude of the observed 6 in. h−1 events will be 1 per century. To find the number of 6 in. h−1 events that likely occur each year in the United States, one must determine, or estimate, the representativeness of the observations. Precipitation, particularly that associated with convection, is characterized by extremely large spatial gradients. Smith et al. (1994) provide examples of the poor spatial correlations between observational sites for convective precipitation. As a starting point, we assume that each rain gauge represents an area of 1 km × 1 km. Since the contiguous United States has an area of approximately 7.5 × 10^6 km^2, the 3000 gauges in the HPD cover a fraction of only 4 × 10^−4 of the total area of the United States. This implies that, if we had an observational dataset with 1 km horizontal spacing, we would observe approximately 2500 times as many events, for any given hourly rainfall threshold, as we currently observe with the HPD. Thus, there should be approximately 25 events of 6 in. h−1 per year in the United States, and similarly, there should be approximately 50 000 events of 3 in. h−1!
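The chain of estimates above can be reproduced in a few lines. The sketch below uses only the numbers given in the text; the function names are ours, and the area represented by one gauge is left as a parameter because the 1 km × 1 km footprint is itself only a starting assumption.

```python
# Values taken from the text.
CONUS_AREA_KM2 = 7.5e6      # area of the contiguous United States
N_GAUGES = 3000             # approximate number of HPD gauges
RATIO_6IN_TO_3IN = 0.0005   # 6 in./h events per 3 in./h event
OBSERVED_3IN_PER_YEAR = 20.0


def network_scale_factor(gauge_area_km2=1.0):
    """Factor by which observed counts are multiplied to cover the whole
    area, assuming each gauge represents gauge_area_km2."""
    coverage = N_GAUGES * gauge_area_km2 / CONUS_AREA_KM2
    return 1.0 / coverage


def estimated_events_per_year(observed_per_year, gauge_area_km2=1.0):
    """Scale a count observed by the gauge network to the full area."""
    return observed_per_year * network_scale_factor(gauge_area_km2)


if __name__ == "__main__":
    observed_6in = OBSERVED_3IN_PER_YEAR * RATIO_6IN_TO_3IN
    print(f"observed 6 in./h events per year: {observed_6in:.2f}")
    print(f"scale factor: {network_scale_factor():.0f}")
    print(f"estimated 6 in./h events per year: "
          f"{estimated_events_per_year(observed_6in):.0f}")
    print(f"estimated 3 in./h events per year: "
          f"{estimated_events_per_year(OBSERVED_3IN_PER_YEAR):.0f}")
```

The result scales inversely with the assumed gauge footprint: taking each gauge to represent 2 km × 2 km instead, for example, would reduce every estimate by a factor of 4.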
Although no such extremely high spatial resolution rain gauge network exists, the deployment of the WSR-88D radars in a national network provides an opportunity to develop a radar-estimated precipitation climatology. This promises to allow us to make estimates of the spatial and temporal correlations of precipitation on a national scale, although there are numerous problems with the estimation of rainfall from reflectivity, including the Z–R relationship (Zrnic 1996; Vieux and Bedient 1998; Baeck and Smith 1998) and chaff releases from military aircraft (Maddox et al. 1997), as well as the problem of beam blockage. Assuming that improved precipitation estimation techniques will be developed (e.g., Zrnic 1996), the WSR-88D network may also allow us to come up with the best estimates to date of heavy precipitation.
5. Discussion
An improved understanding of the threat of flash flooding requires us to develop knowledge of the climatology of heavy precipitation. The HPD represents the most complete description of short timescale precipitation measurements that covers a significant time period over the entire United States. Analyses of this dataset show a distinct seasonal cycle in the distribution of heavy rain events that begins along the Gulf Coast and expands into the midwestern states during the summer. This general evolution is very similar to that observed for flash floods (Maddox et al. 1979), suggesting that the HPD can help in defining the threat for flash floods.
With an average station spacing of 50 km, the HPD still misses most of the truly large precipitation events that actually occur. However, properties of the observed dataset allow us to make reasonable estimates of the real frequency of very heavy precipitation. These estimates should be of value, both for emergency managers and for weather forecasting concerns, in efforts to allocate resources and plan for the inevitable flash flood events associated with heavy precipitation. It is possible that national network radar estimates of precipitation can be used to refine the climatology presented, but it will be years before a significantly long period of record of radar observations exists that allows for reasonable estimates to be produced.
Danny Mitchell was invaluable in developing code that allowed us to work with the HPD and his assistance is acknowledged with gratitude. We would also like to thank Chuck Doswell for his helpful comments during this work. The HPD is available from the National Climatic Data Center.
Baeck, M. L., and J. A. Smith, 1998: Rainfall estimation by the WSR-88D for heavy rainfall events. Wea. Forecasting, 13, 416–436.
Barnes, S. L., 1964: A technique for maximizing details in numerical weather maps. J. Appl. Meteor., 3, 396–409.
——, 1973: Mesoscale objective map analysis using weighted time series observations. NOAA Tech. Memo. ERL NSSL-62, 60 pp. [NTIS COM-73-10781.].
Bosart, L. R., and F. Sanders, 1981: The Johnstown flood of July 1977: A long-lived convective storm. J. Atmos. Sci., 38, 1616–1642.
Caracena, F., R. A. Maddox, L. R. Hoxit, and C. F. Chappell, 1979: Mesoanalysis of the Big Thompson storm. Mon. Wea. Rev., 107, 1–17.
Doswell, C. A., III, H. E. Brooks, and R. A. Maddox, 1996: Flash flood forecasting: An ingredients-based approach. Wea. Forecasting, 11, 560–581.
Douglas, M. W., R. A. Maddox, K. W. Howard, and S. Reyes, 1993: The Mexican monsoon. J. Climate, 6, 1665–1677.
Groisman, P. Ya, and D. R. Legates, 1994: The accuracy of United States precipitation data. Bull. Amer. Meteor. Soc., 75, 215–227.
Maddox, R. A., L. R. Hoxit, C. F. Chappell, and F. Caracena, 1978: Comparison of the meteorological aspects of the Big Thompson and Rapid City flash floods. Mon. Wea. Rev., 106, 375–389.
——, C. F. Chappell, and L. R. Hoxit, 1979: Synoptic and meso-alpha aspects of flash flood events. Bull. Amer. Meteor. Soc., 60, 115–123.
——, K. W. Howard, and C. L. Dempsey, 1997: Intense convective storms with little or no lightning over central Arizona: A case of inadvertent weather modification? J. Appl. Meteor., 36, 302–314.
Sheridan, S. C., J. F. Griffiths, and R. E. Orville, 1997: Warm season cloud-to-ground lightning–precipitation relationships in the south-central United States. Wea. Forecasting, 12, 449–458.
Smith, J. A., and A. A. Bradley, 1994: The space–time structure of extreme storm rainfall in the southern plains. J. Appl. Meteor., 33, 1402–1417.
Vieux, B. E., and P. B. Bedient, 1998: Estimation of rainfall for flood prediction from WSR-88D reflectivity: A case study, 17–18 October 1994. Wea. Forecasting, 13, 407–415.
Wetzel, P. J., W. R. Cotton, and R. L. McAnelly, 1983: A long-lived mesoscale convective complex. Part II: Evolution and structure of the mature complex. Mon. Wea. Rev., 111, 1919–1937.
Yoshizaki, M., and Y. Ogura, 1988: Two- and three-dimensional modeling studies of the Big Thompson storm. J. Atmos. Sci., 45, 3700–3722.
Zrnic, D. S., 1996: Weather radar polarimetry–Trends toward operational applications. Bull. Amer. Meteor. Soc., 77, 1529–1534.