An estimate is made of the probability of an occurrence of a tornado day near any location in the contiguous 48 states for any time during the year. Gaussian smoothers in space and time have been applied to the observed record of tornado days from 1980 to 1999 to produce daily maps and annual cycles at any point on an 80 km × 80 km grid. Many aspects of this climatological estimate have been identified in previous work, but the method allows one to consider the record in several new ways. The two regions of maximum tornado days in the United States are northeastern Colorado and peninsular Florida, but there is a large region between the Appalachian and Rocky Mountains that has at least 1 day on which a tornado touches down on the grid. The annual cycle of tornado days is of particular interest. The southeastern United States, outside of Florida, faces its maximum threat in April. Farther west and north, the threat is later in the year, with the northern United States and New England facing its maximum threat in July. In addition, the repeatability of the annual cycle is much greater in the plains than farther east. By combining the region of greatest threat with the region of highest repeatability of the season, an objective definition of Tornado Alley as a region that extends from the southern Texas Panhandle through Nebraska and northeastward into eastern North Dakota and Minnesota can be provided.
Climatological descriptions, or “climatologies,” of the frequency and variation of hazardous weather threats are important to a wide variety of groups. Knowing what hazards are threats at different times of the year and locations around the country can help various groups, such as weather forecasters, emergency managers, insurance companies, and the public, to be better prepared. In August of 1998, the National Tornado Forum, sponsored by the Federal Emergency Management Agency, the Department of Housing and Urban Development, and the National Weather Service, recommended the development of site-specific hazard maps to assist emergency managers. Furthermore, the Storm Prediction Center (SPC) of the National Weather Service has the responsibility for issuing guidance products about the nature of threats of severe thunderstorms1 every day in the United States. In 1999, the SPC began experimental issuance of products that described the threat in probabilistic terms, specifically the occurrence of one or more severe-weather events within 25 n mi of any location in the United States, and made the forecasts operational in 2001. To issue this product, the SPC has a need for information regarding the climatological frequency of severe-thunderstorm events, including tornadoes. As part of the effort to meet these needs, we have begun a project to quantify the probabilities of a wide variety of threats on a daily basis, based on the reports archived by the SPC over the years. Here, we present the first results of that project, which address tornadoes. Other severe-thunderstorm hazards will be treated in subsequent papers.
Grazulis (1993, hereinafter G93) and Grazulis et al. (1993) recently presented histories of efforts to estimate aspects of the climatological threat of tornadoes. It appears that Finley (1887) showed the first map of tornado occurrence in the United States, displaying the locations of approximately 1300 tornado touchdowns from 1760 to 1885. The effects of low population and communication difficulties were apparent in Finley's map (reproduced in G93), particularly in what is now Oklahoma and western Texas. In fact, there were no reported tornadoes in Finley's study in a triangle between Oklahoma City, Oklahoma; Amarillo, Texas; and Lubbock, Texas! Increased settlement of the Great Plains and more efforts to collect information on tornadoes led to an increase in the estimates of tornadoes as time went on. G93 showed a map from Wolford (1960) with a maximum in tornado occurrence that runs from near Dallas, Texas, through central Oklahoma and eastern Kansas to central Iowa, based on 7206 reported tornadoes from 1916 to 1955. Values peaked at slightly over five tornadoes per year per 2° latitude–longitude box. Court (1970) presented a summary of work that had been done to date to estimate the occurrence of tornadoes. To that point, almost all tornado climatologies considered numbers of tornadoes only, without regard for the intensity, and few showed the annual cycle for anything less than large areas.
The 1970s brought significant changes in the nature of climatologies being developed, particularly in efforts to incorporate information about intensity (Fujita 1981) and to make estimates of the probability of various wind thresholds being exceeded (Abbey and Fujita 1975, 1979). This work was prompted by the need to assess threats to nuclear power plants and was sponsored by the Nuclear Regulatory Commission (NRC). Kelly et al. (1978) filtered the reports received at the National Severe Storms Forecast Center (now SPC) from 1950 to 1976, removing “doubtful” reports, and generated national annual cycles, diurnal cycles normalized by local solar time, and maps of the annual-averaged tornado reports. They identified a “tornado alley” that runs between 97° and 98°W (roughly Dallas northward into eastern Nebraska), with a secondary axis curving northeastward from the Caprock Escarpment in the Texas Panhandle through northwest Missouri and eastward into north-central Indiana. As part of the NRC-sponsored efforts, a group at the University of Chicago compiled tornado reports that date back as far as 1916, as summarized in Fujita (1987). They produced maps of tornado occurrence through the entire period of their record, as well as subsections of the record, including reports by time of day (regardless of when during the year the tornado occurred) and by month.
Thom (1963) attempted to estimate the probability of a tornado striking a point in the United States. He used data from 1916 to 1962 but broke it into two periods, with the second period starting in 1953, the year for which he believed that “a large proportion of the tornadoes in the areas of high frequency was being reported.” As a result, he felt that the reports from 1953 to 1962 were “stable,” although this assumption was overly optimistic, as we shall show. He used only this period for his computations after carrying out statistical tests for a 1° × 1° grid box in central Iowa and finding that, whereas the means of tornadoes per year between the two periods were similar, the variances had changed dramatically. He took his data on that grid and smoothed them in both the north–south and east–west directions, using weights of 0.25, 0.50, and 0.25, with the 0.50 at the analysis location. Our work here is in the spirit of Thom (1963), using gridded values that have been smoothed, although the method of smoothing is different.
Another smoothing technique involves the use of overlapping grid boxes. Schaefer et al. (1986) used this method when they presented what they referred to as a “minimum-assumption hazard” model of tornado climatology. Their model was based on the areal coverage of tornadoes, derived from reported tornado lengths and widths, within the boxes. The areal coverages were then smoothed, and a probability of a tornado hitting any point within the box could be derived.
It is important to recognize the limitations of the raw tornado report dataset at the beginning of the analysis. First, the dataset is fundamentally an extremely small sample at any particular location on any day. As a result, we want to take information from the area surrounding a point to arrive at an improved estimate of what the threat is at a particular point. Second, the dataset has problems in terms of the accuracy and temporal consistency of the reports. These limitations have been discussed elsewhere (e.g., Doswell and Burgess 1988; Grazulis 1993) and include such factors as basic errors in the reporting or recording of time and location information, spatial and temporal variability in the efforts to collect severe-weather reports for warning verification programs, changes in the nature of detailed damage surveys, increased population, increased public awareness, and the proliferation of video cameras.
As a result of these limitations, we will take a conservative approach to the analysis of these data. We believe that the most reliable and temporally consistent aspects of the reports are the date of occurrence and the approximate location of the “touchdown” point. Limiting our consideration to these two pieces of data necessarily will lead to limitations in the results (e.g., we cannot say anything about the areal coverage nor can we address the probability of a tornado occurring at a point) but should provide a reasonable estimate of what we are considering. In addition, they match up well with the needs of SPC forecasters attempting to assess spatial coverage of events over a 24-h period for their convective outlook products. In this paper, we will make estimates of the threat of one or more tornadoes touching down near any location during a 24-h period without considering the intensity or number of tornadoes at those locations. Thus, instead of looking directly at the climatology of tornadoes, we will be looking at the climatology of tornado days. Changnon (1982) discussed differences in assessing tornado trends if one looks at number of tornadoes, tornado days, or days with tornadoes that result in deaths. Showalter and Fulks (1943) and the U.S. Weather Bureau Climatological Services Division (1952) showed maps of tornado days on a state-by-state basis, with the inherent difficulty of interpretation as a result of the different sizes of states. Our results will include annual cycles at any location in the United States. Although this treatment is limited in scope, it takes advantage of the “best” aspects of the dataset.
2. Nature of the dataset
The dataset we will use is the so-called smooth log of severe-weather reports collected by the SPC and archived in the National Oceanic and Atmospheric Administration publication Storm Data. We limit consideration to the period from 1980 through 1999 over the contiguous 48 states, in order to correspond to a collateral effort to develop a new method for verification of the SPC convective outlook forecasts (Brooks et al. 1998). Another reason to use this relatively short period is the increase over time of tornado reports (Fig. 1). On average, tornado reports have increased by approximately 14 per year, such that the annual number of reports has almost doubled since the mid-1950s. A somewhat slower increase has been seen in the annual number of days that tornadoes have been reported in the United States (Fig. 2). This increase, approximately 0.5 days yr−1, means that the number of “tornado days” has increased only by about 10%–15% since the 1950s. This slow increase implies that, whereas change in the reporting of severe weather has had a large impact on the raw number of tornadoes, the tornado-day variable has been less sensitive through time than the number of tornadoes.
Our approach is to look at the log of reports on a daily basis and, for each day, assign the touchdown location of all tornadoes to the centroid of a box on a Lambert conic conformal grid with nominal 80-km horizontal spacing in both directions.2 The grid is true at 30° and 60°N. Grid boxes are within 5% of the same size over the United States. We consider a grid location to be “on” if one or more tornadoes touched down in the box and “off” if there were no reports. As will be seen later, the dichotomous approach makes some aspects of the analysis simpler. The primary final product of our approach will be the probability of a tornado touching down at any point on the grid on any day of the year. From that product, we can look at maps of threat at any time or can look at the annual cycle of threat of any point.
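The on/off gridding step can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes touchdown locations have already been projected to map coordinates in kilometers (the Lambert conformal projection itself is omitted), and the names `grid_box`, `tornado_day_boxes`, and the sample reports are hypothetical.

```python
from collections import defaultdict

GRID_SPACING_KM = 80.0  # nominal grid spacing quoted in the text


def grid_box(x_km, y_km, spacing_km=GRID_SPACING_KM):
    """Return the (i, j) index of the grid box containing a projected point."""
    return (int(x_km // spacing_km), int(y_km // spacing_km))


def tornado_day_boxes(reports, spacing_km=GRID_SPACING_KM):
    """Map each date to the set of grid boxes that are "on" for that day.

    reports: iterable of (date_key, x_km, y_km) touchdown records, where
    x_km and y_km are ASSUMED to be coordinates already projected onto
    the analysis grid. A box is "on" no matter how many tornadoes it had.
    """
    on = defaultdict(set)
    for date_key, x_km, y_km in reports:
        on[date_key].add(grid_box(x_km, y_km, spacing_km))
    return dict(on)


# Hypothetical reports: two touchdowns fall in the same box on the first day.
reports = [
    ("1999-05-03", 10.0, 15.0),
    ("1999-05-03", 70.0, 5.0),
    ("1999-05-03", 170.0, 90.0),
    ("1999-05-04", 85.0, 15.0),
]
on_boxes = tornado_day_boxes(reports)
```

Storing a set of boxes per day reflects the dichotomous definition: several tornadoes in one box on one day count once.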
3. Statistical treatment
To get reasonable signals from the noisy dataset, we have chosen to smooth it using nonparametric density estimation in space and time (Silverman 1986). (So-called objective-analysis techniques fall into this general category.) We first calculate the mean number of days with tornadoes that occur at each point on the grid, for each day of the year, for whatever number of years N with which we are concerned. In the course of this work, we will consider values of N = 1, 5, and 20. It is obvious that, if we look only at one year, the values will be 0 or 1 at each grid point. For longer periods of time, the values will be bounded by 0 and 1 and will represent the probability of a tornado touching down at that grid point on that day, according to the “frequentist” approach to probability, m = M/N, where M is the number of years during the period of record with tornadoes on the day of interest. (Note that for 29 February, N is the number of leap years in the period of interest.) Thus, m is the mean unsmoothed frequency of tornado occurrence on the day of interest. At each grid point, we next smooth in the time dimension to find the mean value fn on day n of the year, according to
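$$f_n = \sum_{k=1}^{366} \frac{m_k}{\sqrt{2\pi}\,\sigma_t} \exp\left[-\frac{1}{2}\left(\frac{n-k}{\sigma_t}\right)^2\right], \tag{1}$$

where k is the day of the year, m_k is the unsmoothed frequency m on day k, and σt is the smoothing parameter in time. To avoid problems at the beginning and end of the year, we make the data record periodic in time, so that the magnitude of n − k is less than one-half of the number of days in the year.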
The Gaussian smoother implicitly assumes that the data from one particular day of the year provides information about the probability of occurrence of a tornado at a location on days close to that particular day but provides little information about days far away. For instance, it is reasonable to believe that the occurrence of a tornado on 3 April says something about the likelihood of a tornado on 2 or 4 April but says much less about the likelihood of a tornado on 3 June.
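A minimal sketch of the periodic Gaussian smoother in time is given below. The function name `smooth_in_time` and the toy input are hypothetical, and the normalization by √(2π)σt is an assumption consistent with a standard Gaussian kernel.

```python
import math


def smooth_in_time(m, sigma_t):
    """Periodic Gaussian smoothing of a daily frequency series.

    m: list of unsmoothed daily frequencies, one value per day of the year.
    sigma_t: smoothing parameter in days.
    """
    n_days = len(m)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma_t)
    smoothed = []
    for n in range(n_days):
        total = 0.0
        for k in range(n_days):
            # Wrap the lag so that |n - k| never exceeds half a year.
            lag = abs(n - k)
            lag = min(lag, n_days - lag)
            total += m[k] * norm * math.exp(-0.5 * (lag / sigma_t) ** 2)
        smoothed.append(total)
    return smoothed


# Toy record: a single tornado day on the first day of a 365-day year.
m = [0.0] * 365
m[0] = 1.0
f = smooth_in_time(m, sigma_t=15.0)
```

Because the record is treated as periodic, the single event on the first day of the year contributes equally to the days just after it and to the last days of December.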
After the temporal smoothing is completed, we smooth in space, again using a Gaussian kernel:
$$p_{x,y,n} = \sum_{j=1}^{J}\sum_{i=1}^{I} \frac{f_n}{2\pi\sigma_x^2} \exp\left[-\frac{1}{2}\left(\frac{d_{i,j}}{\sigma_x}\right)^2\right], \tag{2}$$

where px,y,n is the probability of a tornado touchdown being reported in the grid box at location x, y on day n, fn is the time-smoothed value at the data location (i, j), di,j is the Euclidean distance between the analysis location (x, y) and the data location (i, j), and σx is the smoothing parameter in space. Here, I and J are the number of grid points in the east–west and north–south directions on the grid. In principle, different values for the spatial smoothing parameters in each direction could be chosen, but we see little physical evidence to suggest that there should be different values, except perhaps in regions of strong changes in topographic features. Moreover, anisotropic smoothing can distort the analyzed patterns. For the sake of simplicity and to use the same smoothers over the entire domain, we have chosen not to employ anisotropic smoothing.
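The isotropic spatial smoothing can be sketched in the same way. The function name `smooth_in_space` and the 21 × 21 test field are hypothetical. Measuring distances in grid units, so that the kernel weights sum to roughly one in the interior of the grid, is an assumption of this sketch rather than something stated in the text.

```python
import math


def smooth_in_space(f, sigma_x_km, spacing_km=80.0):
    """Isotropic Gaussian smoothing of a 2D field for one day of the year.

    f: list of J rows, each with I time-smoothed values.
    sigma_x_km: spatial smoothing parameter in kilometers.
    """
    n_rows, n_cols = len(f), len(f[0])
    # ASSUMPTION: distances are expressed in grid units.
    sigma = sigma_x_km / spacing_km
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    p = [[0.0] * n_cols for _ in range(n_rows)]
    for y in range(n_rows):
        for x in range(n_cols):
            total = 0.0
            for j in range(n_rows):
                for i in range(n_cols):
                    d2 = (x - i) ** 2 + (y - j) ** 2
                    total += f[j][i] * norm * math.exp(-0.5 * d2 / sigma ** 2)
            p[y][x] = total
    return p


# Hypothetical field: one grid box with a nonzero time-smoothed frequency.
field = [[0.0] * 21 for _ in range(21)]
field[10][10] = 0.05
p = smooth_in_space(field, sigma_x_km=120.0)
```

The direct double sum is quadratic in the number of grid points. It is written this way for clarity; a truncated kernel or a separable convolution would be the usual choice on a national grid.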
Because one of the goals of our work is to provide information for SPC forecasters to use for climatological baselines of threat on any day at any location, we want to have reasonably smooth fields in both space and time. The effect of the temporal smoother can be seen by comparing the raw frequency of tornado occurrence anywhere in the United States with values of different smoothers (Fig. 3). To determine the appropriate value of the temporal smoother, we have built statistical models of the annual tornado cycle for the entire United States. We would like the statistical models to produce output that cannot be distinguished statistically from the observed record. To test whether this goal is met, we create annual cycles based on a variety of values of σt from 1 to 30 days. We then create 1000 samples of 20-yr records of tornado days from the statistical models and calculate the root-mean-square difference (rmsd) between the 20-yr record and the input smoothed cycle. We can compare that rmsd to the rmsd between the observed 20-yr record and the smoothed cycle. In the ideal result, the rmsd calculated from the statistical model and the observed record would be the same. If they are, then the statistical model cannot be distinguished from the observed record by this measure. In practice, if no smoother is used, then the statistical models produce more variability than is observed, and if, in the limit, the overall annual mean frequency is used as the input probability for every day of the year, the statistical model produces too small an annual cycle. The statistical models all have rmsd values within 5% of the observed record for all values of σt from 5 to 24 days. We add the constraint that the annual cycle be simple, with one absolute maximum, one absolute minimum, and no relative extrema in the record. The “15 day” smoother is the smallest smoother that meets this second criterion. 
Thus, it is capable of producing time series of tornado occurrence on the national scale that have statistical properties that are indistinguishable from the properties of the observed record, and it produces a relatively simple function to describe the annual cycle. Although there is no a priori reason to use the same temporal smoother on the gridpoint scale that we use on the national scale, for simplicity we choose to do so.
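The resampling test used to choose σt can be sketched as follows. This is a hedged illustration only: it assumes each day of each synthetic year is an independent Bernoulli draw from the smoothed national cycle, which the text does not state, and the names `rmsd`, `simulated_rmsd`, and the sinusoidal `cycle` are hypothetical.

```python
import math
import random


def rmsd(a, b):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def simulated_rmsd(cycle, n_years=20, n_samples=1000, seed=0):
    """Mean rmsd between synthetic n_years records and the input cycle.

    cycle: smoothed daily probabilities of a tornado day.
    ASSUMPTION: days are drawn independently of one another.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Fraction of synthetic years with a tornado day on each date.
        freq = [sum(rng.random() < p for _ in range(n_years)) / n_years
                for p in cycle]
        total += rmsd(freq, cycle)
    return total / n_samples


# Hypothetical smoothed cycle with a late-spring peak.
cycle = [0.5 + 0.35 * math.cos(2.0 * math.pi * (d - 162) / 365)
         for d in range(365)]
model_rmsd = simulated_rmsd(cycle, n_samples=100)
```

The value returned for a given σt would then be compared with the rmsd between the observed 20-yr daily frequencies and the same smoothed cycle. A smoother is acceptable by this measure when the two agree closely.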
There are no clear-cut ways to evaluate the proper scale of the spatial smoother. It is related to the spatial correlation structure of tornado occurrence and reporting practices. In an ideal world, we might choose to make the smoother a function of space on each day depending on the weather situation of the day. This approach would clearly be difficult to do on many days and is impractical for looking at a large number of days. The “120 km” spatial smoother was chosen to provide smooth fields that have areal coverage comparable to SPC convective outlook products, an important consideration for the application of our results. For other purposes, it may be appropriate to choose other values of the smoothing parameters.
The approach we have taken implicitly assumes that there is some (unknown) underlying statistical distribution of tornadoes. The observed distribution has resulted from a short period of sampling; we are attempting to recover a distribution that approximates the large-scale features of the unknown, underlying statistical distribution. We do not believe it is possible, using the existing record of events, to distinguish real, physical small-scale variability from random noise. To identify smaller-scale features, a much longer period of record would be required, either from a stable period of observations or by some technique that removes the long-term trend (which we assume to be largely due to nonmeteorological factors) from the observations. We do not have the long, stable record at present, and it is not clear how to detrend the data accurately in the absence of that long record, and so we have chosen to take what we believe is a conservative approach to the analysis. Therefore, small features in space or time have been smoothed away; given that we began the work with the assumption that we are working with fundamentally small sample sizes, any small-scale features derived from an analysis with smaller values of the smoothing parameters could only be interpreted with extreme caution. Our choices mean that we only retain relatively large-scale features but that those features should be relatively reliably estimated. In locations at which events are very infrequent or at which reporting practices have changed more rapidly in time than at other locations so that the stationarity of the time series is even worse than is typical, the meaning of many derived quantities from the analysis is open to question. As a result, we have chosen to restrict analysis of the annual cycles and their variability to locations with at least 0.25 tornado days per year. 
It is not clear that there is much meaning to an annual cycle of an event that almost never occurs, and so this point is likely to be minor.
One of the advantages of choosing to define the “event” as one or more tornadoes on a day at a location is that the mean expected value of the event is the same as the probability of the event occurring, bounded between 0 and 1. As a result, we can integrate px,y,n over time during the year to get the mean expected number of days with at least one tornado in each grid box. The dual interpretation means that the method provides probabilistic estimates on any particular day, as well as the total threat during the year or other period of time.
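The dual interpretation follows from linearity of expectation: the expected number of tornado days in a period is the sum of the daily probabilities over that period. A minimal sketch, with a hypothetical function name and made-up daily values:

```python
def expected_tornado_days(daily_probabilities):
    """Expected number of tornado days over the supplied period.

    Each day is a 0/1 event, so its expected value equals its probability,
    and the expected count is the sum of the daily probabilities.
    """
    return sum(daily_probabilities)


# Hypothetical grid point: a 1% daily probability for 30 days, zero otherwise.
daily = [0.0] * 365
for day in range(120, 150):
    daily[day] = 0.01

annual_days = expected_tornado_days(daily)
spring_days = expected_tornado_days(daily[90:151])
```

Summing over a slice of the year gives the expected number of tornado days for that season, without any assumption about independence between days.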
a. Total threat
Perhaps the most basic and important quantity that can be derived from the data is the total threat of tornado touchdown, which, for our definition, is described by the mean number of days per year with at least one tornado touchdown at each grid point (Fig. 4). A broad “C”-shaped region over the central part of the United States has more than 0.75 tornado days per year. [This value is approximately the maximum value observed in Wolford (1960).] In addition, a second frequency maximum is found in Florida. Maxima within the C-shaped region are found in the southern plains of Texas and western Oklahoma and the high plains of northeastern Colorado, extending eastward into Iowa. Peak values in Florida and in northeastern Colorado are about 1.5 tornado days per year. West of the “C,” the threat drops off dramatically over the Rocky Mountains and west. The eastern extent of the highest threat, except for the Florida maximum, is limited roughly by the Appalachian Mountains.
Some caution must be attached to interpretation of details near the edges of the domain. Given that we have no data outside of the United States, the smoother generally leads to a slight underestimate near the edges of the domain. In the particular case of the Florida peninsula, with edges on either side, it is possible for the smoother to shift the apparent location of the maximum threat from the coasts into the central region. If the true maximum is along the coasts, associated perhaps with waterspouts coming on shore, then information from both coasts would get smoothed into the middle. Because there are no data off the coasts, this condition hypothetically could result in the coastal values being underestimated and the central values being overestimated. It would be very difficult to ascertain whether this is occurring on the scale of our analysis.
The variability in the number of tornado days per year can be described by looking at subperiods of the record. Dividing the 20-yr record arbitrarily into four 5-yr subperiods and running the data through the analysis process described in section 3, we can get some idea of the variability of tornado occurrence. The general region of 0.75 tornado days per year repeatedly shows up in the central part of the country, along with the absolute maxima in northeastern Colorado and Florida (Fig. 5). Details within the central region vary considerably among the different periods. For instance, estimates of tornado occurrence in extreme southwestern Oklahoma, based on the 5-yr subsets, are 1.6, 1.0, 1.2, and 1.4 days yr−1. We will return to the notion of the reliability of the tornado threat later when we consider the timing of the maximum threat during the year.
b. Annual cycle
Just as the most basic spatial quantity can be said to be the number of tornado days per year around the country, the annual cycle of the probability of a tornado day occurring anywhere in the United States is (arguably) the most basic temporal quantity. We consider the question of whether or not at least one tornado occurs in the United States on any day, use that as m to put into (1), and apply (1) to each of the 20 years separately. In terms of (1), this means setting N = 1, so that m = 0 or 1 for that day for that year. We can then calculate the mean, standard deviation, and extrema for each day of the year from those 20 annual cycles.3 The mean peaks at just over 90% on 12 June and reaches its minimum at about 17% on 28 December (Fig. 6). The standard deviation of the probability ranges from 6% in early June and early September to 10% in late April and early December. As suggested by the timing of the two primary maxima and minima, there is no obvious annual cycle to the variability. There are hints of less variability from June into early September, based on the difference between the maximum and minimum probabilities. It is tempting to associate this with the late spring and summer lack of baroclinity in the atmosphere, but the sample size is not large enough to have much confidence that the lack of variability is robust, and we did not assess it statistically.
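The day-by-day summary of the individually smoothed annual cycles can be sketched as below. The function name `daily_summary` and the tiny input are hypothetical. Using the sample standard deviation (n − 1 in the denominator) is an assumption, because the text does not say which form was used.

```python
import statistics


def daily_summary(cycles):
    """Mean, standard deviation, and extrema for each day across years.

    cycles: list of annual cycles (one per year), each a list of smoothed
    daily values of equal length.
    """
    summary = []
    for values in zip(*cycles):  # all years' values for one day
        summary.append({
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values),  # ASSUMPTION: sample std
            "min": min(values),
            "max": max(values),
        })
    return summary


# Hypothetical input: three "years" of a two-day cycle.
cycles = [[0.2, 0.8], [0.4, 0.9], [0.3, 1.0]]
summary = daily_summary(cycles)
```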
The spatial variability in the variation of the threat during the year is also of great importance. To illustrate the temporal cycle, we show the probability for selected dates during the year (Fig. 7).4 Note that the dates we have chosen are not evenly distributed through the year. In the middle of February (Fig. 7a), as the probability of a tornado somewhere in the United States starts to increase, the most significant threat (greater than 0.25% on a day) is located in Florida and from Louisiana into southern Alabama. The area of threat grows and the peak threat becomes higher (∼0.5%) by early April, with the maximum now located in northeastern Texas and southeastern Oklahoma, with the area of at least 0.25% probability extending as far west as the Texas Panhandle and as far north as Iowa (Fig. 7b). A month later, the peak threat has increased dramatically to 1.8% and has moved westward over the southern Texas Panhandle (Fig. 7c). In addition, the 0.5% probability contour has reached South Dakota and eastward through Ohio into central Kentucky, indicating the large area of relatively high tornado threat in May. June reveals a rapid increase in tornado probability in northeastern Colorado to almost 2%, with an axis of greater than 1% extending east-northeastward into Iowa (Fig. 7d). Meanwhile, the highest threat of tornadoes in the southern United States is confined to the Texas Panhandle and the Florida peninsula. By August, the only location with a probability greater than 0.75% is in northeastern Colorado, and the probability is less than 0.25% for the entire southern half of the United States, except for Florida (Fig. 7e). Note that on 5 August the probability of a tornado day anywhere in the United States is only slightly lower (75% as compared with 88%) than on 20 May (Fig. 6). The peak probability values found anywhere in the country, however, are much smaller (∼0.8% as compared with 1.8%). 
This result is likely related to the consistency and frequency of tornadoes occurring in outbreaks during the spring tornado season in the plains. Springtime outbreaks frequently result in a large number of the grid points in our analysis in a small area getting tornadoes, while the summer events tend to have fewer tornadoes and the tornadoes are more likely to be widely scattered.
By the middle of November, the greatest tornado threat is limited to the southern United States, primarily east of Texas and west of Georgia (Fig. 7f). The peak probability at this time (0.6% over southwestern Mississippi) is actually the maximum probability for any location in the United States at any time from the beginning of September through the middle of March. As we will show later, a small part of this region actually has its greatest threat for a tornado day during November.
Another way to look at the changing threat during the year is to look at the annual cycle at different locations (Fig. 4). By considering the annual cycles based on short (5-yr) subperiods, as well as the complete record, it is also possible to look at the variability at those locations (Fig. 8). In particular, locations in the high-threat, low-variability part of the southern plains are well represented by Lubbock (Fig. 8c). Here, the threat is confined to a very short period of the year, but it is very consistent in different subsets of years. The threat peaks in late May at slightly less than 2% on any given day but is less than 0.2% before 1 April and after 15 July. In northeastern Colorado, the current national peak threat location for number of tornado days (not shown), the seasonality is slightly less consistent than Lubbock's, and the period of highest threat lasts slightly longer and is centered a little later in the year. Farther east in the plains, York, Nebraska, shows a longer period of threat during the year and a somewhat reduced consistency between different subsets of years (Fig. 8a) in comparison with Lubbock. Note that the total threat is approximately the same at both York and Lubbock, as seen in Fig. 4. At the northeastern tip of the “C,” Columbus, Ohio, is characterized by much greater between-period variability and a longer period of nonzero threat than is seen at either York or Lubbock (Fig. 8b). Even greater variability is seen at Hattiesburg, Mississippi (Fig. 8d). It is practically impossible to define a tornado “season” at Hattiesburg. The area is affected by synoptic systems in winter and spring and by tropical storm–spawned tornadoes during the summer and autumn. The timing of the peak threat is very different in the four subsets of years and, in fact, the only time during the year that the threat goes to zero is during the middle of summer if there are no tropical systems. 
The annual mean number of days with tornadoes at Hattiesburg is only 20% less than at Lubbock and York (Fig. 4), but it clearly is distributed very differently. Unlike Lubbock at which the threat is very high for a few months, at Hattiesburg it never gets above 1% and is distributed over almost the entire year. This difference has implications for public safety that we will discuss later.
The cycle in peninsular Florida resembles that of Hattiesburg with one notable exception. Superimposed on the low threat throughout the year is a summer maximum centered in late June and July (Fig. 8e). From a separate analysis, outside the scope of this paper, that is restricted to tornadoes of F2 intensity and greater, it is clear that this summer maximum in peninsular Florida is almost entirely associated with weak (F0 and F1) tornadoes. We speculate that many of these tornadoes are of a nonsupercellular nature, either from waterspouts coming onto shore or from tornadoes forming in low-shear environments on convergence lines, such as the sea breeze or when the Gulf of Mexico sea breeze and Atlantic sea breeze merge, in a similar way to the nonsupercellular tornadoes of northeastern Colorado (e.g., Brady and Szoke 1989).
c. Date of maximum threat
The location of the maximum threat for a tornado in the United States changes throughout the year. It is instructive to consider the timing of the maximum at every location that gets at least 0.25 tornado days per year. At frequencies of occurrence less than that, the sample size is much too small to get meaningful results. Except for the region around Tallahassee, Florida, the date of the maximum follows a reasonably simple pattern. As can be seen from the variability of the Hattiesburg annual cycle, small changes in the timing of the absolute maximum of tornado occurrence could occur with small changes in the annual cycle. In the Tallahassee region, the peak in November is slightly higher than the peak in April, whereas the situation is reversed in Hattiesburg. As will be discussed later, the definition of the timing of maximum threat in this region is open to question, given the interannual variability. The maximum threat occurs in April over much of the southeastern United States, except Florida, as discussed earlier (Fig. 9). It progresses later in the year when moving westward toward Texas so that almost all locations between the Rockies and the Appalachians and south of a line from central Kansas eastward have their peak threat by the end of May. Locations farther north have progressively later peaks; the Mid-Atlantic states, east of the Appalachians, see their peak threat in late July. Two regions depart from this general pattern: peninsular Florida, with a summer peak associated with nonsupercellular convection as discussed before, and the Gulf Coast near Tallahassee, with a peak threat in late November.
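Finding the date of maximum threat at a grid point, subject to the minimum-frequency restriction, can be sketched as follows. The function name `date_of_maximum` and both sample cycles are hypothetical, and the function returns a zero-based day index rather than a calendar date.

```python
def date_of_maximum(cycle, min_days_per_year=0.25):
    """Zero-based day index of the peak daily probability, or None.

    cycle: smoothed daily probabilities at one grid point.
    min_days_per_year: grid points whose annual total falls below this
    threshold are skipped, following the 0.25 tornado days per year
    restriction on the analysis of annual cycles.
    """
    if sum(cycle) < min_days_per_year:
        return None
    return max(range(len(cycle)), key=cycle.__getitem__)


# Hypothetical cycle with a short, sharp spring season.
cycle = [0.0] * 365
for day in range(130, 151):
    cycle[day] = 0.02
cycle[140] = 0.03
peak_day = date_of_maximum(cycle)

# Hypothetical grid point where events are too rare to analyze.
sparse = [0.0] * 365
sparse[10] = 0.1
sparse_peak = date_of_maximum(sparse)
```

When two peaks are nearly equal, as in the Tallahassee and Hattiesburg cases, the index returned by this kind of search can switch seasons with a small change in the input.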
The variability of the timing of the maximum threat is also of significant interest. We have used the trimmed standard deviation to evaluate the variability (Wilks 1995). For each location, we have calculated the trimmed standard deviation, with two extreme values removed from each end of the dataset, of the date of the maximum for the 20-yr series. Thus, the trimmed standard deviation considers the 16 central values of the date of the maximum. The least variable season is found in the Texas Panhandle (Fig. 10), with a trimmed standard deviation of less than 10 days. Put another way, there is approximately a 70% chance of the date of the maximum of the tornado season occurring in a 20-day window in the Texas Panhandle. There is a meridional region of the Great Plains with a trimmed standard deviation of less than 20 days that extends northward from the Texas Panhandle to the Canadian border. In contrast, going only as far east as Dallas leads to a large increase to over 30 days. At that value of the standard deviation, there is a 30% chance that the maximum occurs outside of a 60-day window. Most of the southeastern United States has a standard deviation of greater than 60 days, implying a 30% chance that the maximum occurs outside of a 120-day window. As a result of this higher variability, it becomes almost impossible to define the timing of the peak threat in that part of the United States. The Gaussian assumption implicit in calculating the standard deviation reflects this fact. The standard deviation is an appropriate measure of the variability in the Great Plains in regions for which the variability is small but is much less appropriate in the southeastern United States. We tested other, more robust estimators of the variability (e.g., defining lengths of windows in which a given number of years have their maximum) and the general pattern remained the same, even if the details were different. Under all assumptions, variability was less in the Great Plains than in the Southeast.
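The trimmed standard deviation and the window probabilities quoted above can be illustrated with a short sketch. The function names are hypothetical, and the use of the sample (n − 1) standard deviation and of day-of-year values treated as ordinary numbers (with no wrap-around at the end of the year) are assumptions of this example; the text specifies only that two values are removed from each end of the 20-yr series.

```python
import math
import statistics


def trimmed_std(values, trim_each_end=2):
    """Standard deviation after dropping the extremes at each end.

    With 20 yearly dates and trim_each_end=2, the 16 central values
    are retained. Assumption: the sample (n - 1) form is used.
    """
    ordered = sorted(values)
    if len(ordered) <= 2 * trim_each_end + 1:
        raise ValueError("not enough values left after trimming")
    central = ordered[trim_each_end:len(ordered) - trim_each_end]
    return statistics.stdev(central)


def prob_within_window(window_days, std_days):
    """Probability that a Gaussian variable falls inside a window
    of the given total width centered on its mean."""
    half_width = window_days / 2.0
    return math.erf(half_width / (std_days * math.sqrt(2.0)))
```

Under the Gaussian assumption, `prob_within_window(20, 10)`, `prob_within_window(60, 30)`, and `prob_within_window(120, 60)` all describe a window of plus or minus one standard deviation, which is the origin of the approximate 70%/30% split used in the text.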
d. Quasi-objective determination of Tornado Alley
Tornado Alley as a distinct geographical location is a popular concept, but one that is historically ill-defined. Although it is perhaps of little direct scientific concern, its popularity makes it of at least indirect interest. Typically, it refers only to the frequency of events. We believe that an additional criterion, the repeatability of the season, is also an important aspect of a reasonable definition. In other words, two complementary features, frequency of occurrence and reliability of the season, are of concern. The combination leads to a simple, quasi-objective definition that requires both a large number of tornadoes and a season that can be reasonably expected to occur at about the same dates every year. We have chosen values of at least 0.5 tornado touchdown days per year (see Fig. 4) and a trimmed standard deviation in the timing of the peak threat of less than 20 days (see Fig. 10) as standards for these criteria (Fig. 11). The region from west Texas northeastward through central Minnesota meets those standards, as does a small region near southern Lake Michigan. The addition of the reliability criterion eliminates the southeastern part of the “C” from Fig. 4. Inclusion of spatial continuity as an additional criterion would lead to the plains portion as a logical, objectively based location for Tornado Alley. It is somewhat west of many of the popular descriptions of Tornado Alley (e.g., Wolford 1960; Kelly et al. 1978). The values of the criteria are admittedly arbitrary, but we believe that the underlying concepts are the important issues. Our choices emphasize the strong gradient regions in occurrence along the west side and the timing along the east side. Changes in the values owing to further increases in tornado reporting efficiency might extend or shrink the region, but the general area in the plains seems to be robust.
It is also important to note that the cutoff in the southeastern part of the region is driven by the timing constraint whereas the cutoff in the northwestern part is driven by the occurrence constraint. The criteria we have chosen affect the exact location of the boundaries. Because the occurrences are based on reports, it is possible that an increase in reporting efficiency might add area to the northwestern part of the region shown in Fig. 11, such as the Dakotas. There may also be a slight effect in the northern United States from not having Canadian reports in the dataset.
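The two-criterion definition can be expressed as a simple test at each grid point. The sketch below is illustrative only: the function name, the labels, and the sample values are hypothetical and are not read from Figs. 4, 10, or 11; only the two threshold values come from the text.

```python
MIN_TORNADO_DAYS_PER_YEAR = 0.5   # occurrence criterion quoted in the text
MAX_TRIMMED_STD_DAYS = 20.0       # timing criterion quoted in the text


def classify_grid_point(tornado_days_per_year, trimmed_std_days):
    """Return 'alley' or the name of the criterion that is not met."""
    frequent = tornado_days_per_year >= MIN_TORNADO_DAYS_PER_YEAR
    repeatable = trimmed_std_days < MAX_TRIMMED_STD_DAYS
    if frequent and repeatable:
        return "alley"
    if frequent:
        return "fails timing"
    if repeatable:
        return "fails occurrence"
    return "fails both"


# Hypothetical grid-point values, for illustration only.
examples = {
    "plains-like point": (0.9, 12.0),
    "southeast-like point": (0.8, 65.0),
    "northwest-edge point": (0.3, 15.0),
}
labels = {name: classify_grid_point(*vals) for name, vals in examples.items()}
```

Keeping track of which criterion fails, rather than returning a single yes or no, mirrors the distinction drawn above between the timing-driven southeastern boundary and the occurrence-driven northwestern boundary.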
We have developed a climatology of the daily probability of one or more tornadoes occurring anywhere in the contiguous United States that can serve a wide variety of purposes. It provides a basic description of the temporal and spatial threat associated with tornadoes. It is limited at this time in that we have not considered intensity or pathlength and width and we have not looked at time of day. By our choice of smoothers, we cannot see high-frequency behavior in either space or time. This approach has the advantage, on the other hand, of identifying the strongest large-scale signals.
In addition, we are limited by the quality of the observations. Although focusing on tornado days, rather than individual tornadoes, reduces the apparent secular increase, we can have little confidence that the record is complete. Population biases, especially for weak tornadoes, have been addressed previously (e.g., Grazulis 1993; King 1997) and are certainly present here. The relative minimum over southwestern Kansas and the Oklahoma Panhandle may be a result of low population density and lack of interstate highways and the associated reporting problems, although that is speculative. In addition, the low population in other locations on the plains may also be associated with underreporting. Reliability of estimates of any quantity in regions of low event frequency (such as the Rockies and points west) is open to question.
The peak probability on any given day at any point rarely exceeds 2%. The SPC began issuing a probabilistic convective outlook product in 2001, with forecasters indicating the probability that tornadoes (as well as wind and hail, separately) will occur. The definitions and grid are identical to those used here. The lowest probability contour that the forecasters are allowed to use is 2%, and so, for almost all days and locations, the presence of a contour indicates that the forecasters believe tornadoes to be more likely than their climatological probability would indicate.
Many meteorological questions are raised by the climatology. For instance, the rapid expansion westward of tornado threat in the early spring and its subsequent northward progression are likely associated with the climatological return of low-level moisture. Synoptic climatological studies could be useful in identifying the progression of atmospheric conditions that lead to the general climatological distribution. Those studies could then shed light on likely areas of underreporting and could, in time, form the basis for seasonal forecasting of tornado threat.
The presence of the strong seasonal cycle in the Great Plains is a dominant feature of the record. It seems likely that the consistency is tied to the proximity of the region to the Rocky Mountains and the Gulf of Mexico. Deep convection requires the presence of warm, moist air at low altitude, steep lapse rates aloft, and some mechanism to lift the warm, moist air (Doswell et al. 1996). In addition, supercell thunderstorm environments are characterized by strong vertical shear of the horizontal winds (Rasmussen and Blanchard 1998). A simple conceptual model of putting the ingredients together to produce supercells and, thus, presumably to increase the likelihood of tornadoes in the plains of the United States can be described as follows. Southerly or southeasterly low-level flow from the general vicinity of the Gulf of Mexico (providing the warm, moist air), and westerly or southwesterly midtropospheric flow from over the high terrain of the western United States (providing the steep lapse rates aloft) combine to produce the correct thermodynamic environment. This combination of flow patterns can clearly provide significant wind shear. The combination is also climatologically common in the spring of the year in the plains. As a result, most of the ingredients that are common in tornadic environments occur frequently in the spring of the year.
The identification of a quasi-objective “Tornado Alley” is meaningful when the record of killer tornadoes is considered. From 1980 to 1999, 21 tornadoes resulted in 10 or more deaths in the United States. Only two of those (Andover, Kansas, on 26 April 1991 and Oklahoma City on 3 May 1999) occurred in the area outlined by the high frequency and repeatable season. In a simplistic sense, given that approximately 40% of all U.S. tornadoes occur in that area, the expected number of tornadoes that result in 10 or more deaths would be 8, given that 21 such tornadoes occur overall in the United States.5 We speculate that the frequency and repeatability of the threat lead to improvements in many aspects of the response system in those regions identified as being in Tornado Alley. Forecasters are more experienced in handling the situations and are more likely to be prepared when they occur. It may well be easier to train and call out spotters as part of the warning process when their awareness is heightened and when they know that their volunteer service will only be required for a relatively small part of the year. Public awareness and, as a result, response are almost certainly heightened during the peak threat season in Tornado Alley. It may be easier to get the public's attention in an area where the threat is high over a limited period of time than in an area where the threat is lower for a much longer period of time.
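The expected count quoted above is simple proportional arithmetic, and it can be reproduced as follows. The binomial tail in the second function is an addition made for this illustration and is not part of the original analysis; it assumes that each high-fatality tornado falls in the region independently with probability equal to the region's share of all tornadoes, which is at best a rough approximation.

```python
from math import comb


def expected_count(n_events, share):
    """Expected number of events in a region holding a given share."""
    return n_events * share


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p).

    Assumption for this sketch: events are independent and each
    falls in the region with the same probability p.
    """
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))


expected = expected_count(21, 0.40)          # about 8 events
tail = binomial_cdf(2, 21, 0.40)             # chance of 2 or fewer
```

Here `expected` is the figure cited in the text, and `tail` gives one rough sense of how unusual an observed count of two would be if location alone determined the outcome.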
The technique described here holds promise for analyses of other properties of severe weather. It can be used on any length of record, including individual days. As such, it has been used in the verification of the convective outlook forecasts of the SPC (Brooks et al. 1998). Single-day analyses also provide the possibility of objectively estimating the rarity of outbreaks by comparing the probability distribution for a particular day to the climatological distribution for that day. If a spatial smoother is applied to one day's tornado reports, the ratio of the single day's values to the climatological values is a measure of the rarity of the event. An outbreak occurring in midwinter in the northern plains, when and where the climatological probabilities are extremely low, is a much rarer event than the exact same distribution of tornadoes shifted to the southern plains in early May. On a longer timescale, the technique also allows us to estimate the variability of the tornado threat in addition to the mean threat. In principle, this implies that estimates of what constitutes “well above normal” and “well below normal” and the regional variability of that can be made. Such estimates could be useful for setting limits for detectability of climate-change-related changes in tornado occurrence or, potentially, as a basis for seasonal forecasting of tornado threats.
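The single-day rarity measure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the operational procedure: the function names are hypothetical, the smoother width of 1.5 grid cells is a placeholder, the kernel weights are renormalized near the grid edges for convenience, and a small floor is applied to the climatological values to avoid division by zero.

```python
import math


def gaussian_smooth(grid, sigma_cells):
    """Smooth a 2D list with a Gaussian kernel.

    Assumption: weights are renormalized at each point, so a uniform
    field stays uniform even near the edges of the grid.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            weight_sum = 0.0
            for m in range(rows):
                for n in range(cols):
                    dist2 = (i - m) ** 2 + (j - n) ** 2
                    w = math.exp(-dist2 / (2.0 * sigma_cells ** 2))
                    total += w * grid[m][n]
                    weight_sum += w
            out[i][j] = total / weight_sum
    return out


def rarity_ratio(day_reports, climatology, sigma_cells=1.5, floor=1e-6):
    """Ratio of one day's smoothed reports to the climatological values.

    day_reports: 2D list with 1 where a tornado was reported, else 0.
    climatology: 2D list of climatological probabilities for that date.
    """
    smoothed = gaussian_smooth(day_reports, sigma_cells)
    return [[s / max(c, floor) for s, c in zip(s_row, c_row)]
            for s_row, c_row in zip(smoothed, climatology)]
```

Applying the same report pattern to a low-probability climatology (midwinter, northern plains) and to a high-probability one (early May, southern plains) yields a much larger ratio in the first case, which is the sense in which the first outbreak is the rarer event.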
We thank Messrs. Richard Thompson and Roger Edwards of the SPC for suggestions early in this work and Mr. Matt Briggs for calling our attention to kernel density estimation techniques. We also thank Ms. Daphne Zaras of NSSL and Messrs. Nathan Blais and Brad Flickinger of Putnam City West and North High Schools, respectively, in Oklahoma City for their work in the development of the Web interface to the climatology: http://www.nssl.noaa.gov/hazard/. The suggestions of anonymous reviewers significantly improved the manuscript.
Current affiliation: NOAA/Forecast Systems Laboratory, Boulder, Colorado
Corresponding author address: Dr. Harold E. Brooks, NOAA/NSSL, 1313 Halley Circle, Norman, OK 73069. Email: email@example.com
To be classified as severe, a thunderstorm in the United States must produce hail that is at least 3/4 in. (2 cm) in diameter, have a wind gust of 50 kt (25 m s−1) or more, or produce a tornado. In addition, damaging hail and damaging thunderstorm gusts of any magnitude are considered to be severe.
A grid box of 80 km per side has the same area as a circle with a radius of 24.6 n mi, very close to the area under consideration by the SPC probability forecasts.
For nonleap years, an artificial 29 February is introduced, as a linear interpolation between 28 February and 1 March. Given the nature of the temporal smoother and the relatively low occurrence of tornadoes at that time of year in most of the country, the results are insensitive to how the leap years are handled.
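The leap-day treatment can be sketched as follows. The function name and list layout are hypothetical, and taking the interpolated value as the simple average of the two neighboring days is an assumption of this example.

```python
def insert_leap_day(values_365):
    """Return a 366-value series with an artificial 29 February.

    Assumption: the inserted value is the midpoint of 28 February
    (index 58) and 1 March (index 59) in the 365-day series.
    """
    values = list(values_365)
    if len(values) != 365:
        raise ValueError("expected a 365-day series")
    feb28, mar1 = values[58], values[59]
    leap_value = (feb28 + mar1) / 2.0
    return values[:59] + [leap_value] + values[59:]
```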
It is impossible to convey all features of interest in a reasonable number of maps in the context of a journal publication. We have selected dates that show many of the features of interest, but readers are encouraged to look at an animation of the annual cycle with frames once per week that is available at our Severe Weather Climatology Web site, http://www.nssl.noaa.gov/hazard/. Clickable maps that allow users to see the annual cycles at any location in the United States, as described below, are also available there.
Population density may play some role here, but the population density of relatively high threat states outside of Tornado Alley, as defined here, is only slightly more than twice that of the Tornado Alley states.