## Abstract

Over the last 50 yr, the number of tornadoes reported in the United States has doubled from about 600 per year in the 1950s to around 1200 in the 2000s. This doubling is likely not related to meteorological causes alone. To account for this increase, a simple least squares linear regression was fitted to the annual number of tornado reports. A “big tornado day” is a single day when numerous tornadoes and/or many tornadoes exceeding a specified intensity threshold were reported anywhere in the country. By defining a big tornado day without considering the spatial distribution of the tornadoes, a big tornado day differs from previous definitions of outbreaks. To address the increase in the number of reports, the number of reports is compared to the expected number of reports in a year based on linear regression. In addition, the F1 and greater Fujita-scale record was used in determining a big tornado day because the F1 and greater series was more stationary over time than the F2 and greater series. Thresholds were applied to the data to determine the number and intensities of the tornadoes needed to be considered a big tornado day. Possible threshold values included fractions of the annual expected value associated with the linear regression and fixed numbers for the intensity criterion. Threshold values of 1.5% of the expected annual total number of tornadoes and/or at least 8 F1 and greater tornadoes identified about 18.1 big tornado days per year. Higher thresholds such as 2.5% and/or at least 15 F1 and greater tornadoes showed similar characteristics, yet identified approximately 6.2 big tornado days per year. Finally, probability distribution curves generated using kernel density estimation revealed that big tornado days were more likely to occur slightly earlier in the year and have a narrower distribution than any given tornado day.

## 1. Introduction

Climatological tornado data can be used to provide the basis for a scientific approach for forecasting tornadoes, establishing regional insurance premiums, and developing construction code requirements (e.g., Doswell and Burgess 1988). Assessing climatological risk and accurately forecasting tornado outbreak events are of great importance to those affected, considering the significant potential for loss of life and property. Typically, when many tornadoes are reported in a short time and over a given region, the event may be considered an outbreak. Outbreaks may occur in a small, confined area of a state, or they may be as widespread as several states. Furthermore, an outbreak may occur in a relatively short, intense 12-h period or may occur from many tornadoes over a number of days. Regardless of when or where an outbreak occurs, all outbreak definitions are subjective.

Because universally accepted definitions of outbreaks do not exist, with some obvious exceptions (e.g., the 3–4 April 1974 “Super Outbreak”; Hoxit and Chappell 1975; Locatelli et al. 2002), defining tornado outbreaks is usually left to individual investigators. For example, Pautz (1969) defined a major tornado outbreak, or “family” outbreak, as five or more tornadoes within the same weather system on a given day. Later, Galway (1975) developed an outbreak definition that included three classifications of family outbreaks: small (6–9 tornadoes), moderate (10–19 tornadoes), and large (≥20 tornadoes). He found that 73% of the tornado deaths from 1952 to 1973 were attributed to outbreaks with 10 or more tornadoes. By contrast, Hagemeyer (1997) focused strictly on Florida peninsular tornado outbreaks, which he defined as four or more tornadoes in a single day. Because 83% of the Florida peninsular tornado outbreaks occurred within 4-h periods, Hagemeyer (1997) added a 4-h period to his criteria. The distinctive nature of Florida peninsular tornado outbreaks allowed for such a small temporal window, although such a criterion would not be applicable for the rest of the United States. Galway (1977) suggested that accounting for the number of tornadoes alone is an inadequate way of defining an outbreak since the intensities of the tornadoes can be relevant. For example, an event consisting of 24 tornadoes where 12 produced F2 damage (Fujita scale) would be regarded very differently from an event consisting of 24 tornadoes producing F0 damage.

The purpose of this paper is to discuss the evolving nature of the historical tornado record and develop a new method for identifying tornado outbreaks. Section 2 outlines the methodology and dataset used in this study and also discusses some problems and inaccuracies contained in the tornado record. Section 3 discusses the characterization of tornado outbreaks through the historical tornado record, detrending of the data, and a description of some thresholds used to identify tornado outbreaks (referred to hereafter as “big tornado days,” for reasons we explain later). Finally, section 4 summarizes this paper.

## 2. Data

This study uses tornado reports compiled from 1954 to 2003 from the Storm Prediction Center's (SPC's) tornado database (McCarthy 2003). Reports were collected in convective day increments, defined by the SPC as 1200 to 1200 UTC 24 h later. The start of this dataset was chosen to be 1954 because consistent tornado reports have been available only since the early 1950s (e.g., Grazulis 1993; Bruening et al. 2002). The end of this dataset in 2003 was determined by the last full year of tornado reports available. Perhaps the foremost problem with past tornado reporting is the unorganized manner in which reporting was conducted and documented (Grazulis et al. 1993). Many tornadoes have gone and will continue to go unreported, but the underreporting has been reduced in recent decades (Brooks and Doswell 2002).
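As an illustration of the convective-day convention, the sketch below assigns a report time in UTC to its 1200–1200 UTC window. Labeling each window by the calendar date on which it begins is an assumption made here, and the function name and report times are hypothetical.

```python
from datetime import datetime, timedelta, date

def convective_day(report_time_utc: datetime) -> date:
    """Return the start date of the 1200-1200 UTC window containing a report."""
    # Shifting back 12 h maps each 1200-1200 UTC window onto one calendar date.
    # Labeling the window by its start date is an assumption of this sketch.
    return (report_time_utc - timedelta(hours=12)).date()

# Hypothetical report times, for illustration only.
late_evening = datetime(1974, 4, 3, 23, 30)   # 2330 UTC 3 April
early_morning = datetime(1974, 4, 4, 6, 15)   # 0615 UTC 4 April
next_window = datetime(1974, 4, 4, 12, 0)     # 1200 UTC 4 April

for t in (late_evening, early_morning, next_window):
    print(t.isoformat(), "->", convective_day(t))
```

Under this convention the first two reports fall in the same convective day, whereas the third opens the next one.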

Assessing the tornado record requires some care. Doswell and Burgess (1988) argued that much of the information about tornadoes comes from untrained witnesses, and there is ample reason to question the quantitative aspects of the database. Doswell and Burgess (1988) and Grazulis (1993) showed that the accuracy and temporal consistency of the tornado reports were limited. These limitations include basic errors in reporting and/or recording of time and location, spatial and temporal variability in the collection efforts for warning verification, changes in damage survey procedures, population increase and migration, and storm spotter network creation (with the increasing use of portable video cameras).

Additional errors are potentially introduced into the dataset when the Fujita scale is applied to rate the tornado damage. For example, Doswell (1985) explained that, whenever a structure is completely destroyed (F4 or F5 according to Fujita's definition), the estimates can signify only a lower limit to the wind speed. Furthermore, Doswell and Burgess (1988) cautioned that since an F-rating is determined by the maximum observed damage at a point anywhere within the total path of the tornado, a single occurrence of the highest damage level then labels the whole path. They stressed that the F-scale is a damage scale, not an intensity or wind speed scale. Determining ranges of wind speed without accounting for construction quality, population density, and location is not possible. Marshall (2002) argued that F-scale ratings are dependent upon the person reviewing the damage. A person with knowledge of how buildings fail, perhaps a structural engineer, would probably rate a building differently than a person without that knowledge. On the other hand, damage and wind speed are not unrelated (e.g., Schaefer et al. 1986), although any relationship is not straightforward (e.g., Reynolds 1971; Doswell and Burgess 1988). These issues with the tornado database affect our ability to use this dataset most effectively.

## 3. Characterization of the tornado record

### a. Historical tornado record

Over the past half century, the number of tornadoes reported in the United States has doubled from roughly 600 per year in the 1950s to around 1200 in the 2000s (Fig. 1). The 1374 reported tornadoes in 2003 were second only to the 1426 reported in 1998. The changes are not likely due to meteorological causes alone. Report discrepancies, public awareness, Doppler radar, and National Weather Service vigilance all have contributed to the increasing trend. In addition to the general increase in the annual number of tornado reports, the tornado record possesses interannual variability, too. In the late 1980s, a relative deficit of tornadoes occurred (Fig. 1). Furthermore, 2002 stands out as being a relative minimum in tornado reports, the most significant below-normal year since 1989.

Despite the increase in the number of reported tornadoes, the number of F1 and greater tornadoes has remained fairly consistent over the 50 yr at around 500 reports per year (Fig. 1). Brooks and Doswell (2001) suggested that stronger tornadoes have been reported more consistently over time. Therefore, nearly all the doubling of tornado reports over the last 50 yr is most likely due to the increased reporting of F0 tornadoes. Thus, given the obvious changes in the dataset, how can the tornado record from the 1950s and 1960s be compared with the tornado record today?

### b. Detrending the tornado record

To account for the increase in tornado reports over time, a simple least squares linear regression was fit to the annual number of tornado reports (Fig. 1). Although the background increase in the number of reports is not necessarily linear, linear regression offered a reasonable fit to the data (Bruening et al. 2002). The linear regression yielded an expected increase in the number of reports of about 14 tornadoes per year, indicating that for 2004, 1224 reported tornadoes could be anticipated. As of early October 2004, a preliminary tally of 1516 tornadoes had been reported—surpassing not only the expected number of tornadoes, but also exceeding 1998 as the year with the most reported tornadoes on record (see http://www.noaanews.noaa.gov/stories2004/s2327.htm).
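The detrending step can be sketched as an ordinary least squares fit of annual report counts against year. The annual totals below are synthetic (a 14-per-year trend with alternating scatter), not the SPC record, and the function names are hypothetical.

```python
def fit_line(years, counts):
    """Ordinary least squares fit of counts = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def expected_reports(year, slope, intercept):
    """Expected annual number of reports from the fitted line."""
    return intercept + slope * year

# Synthetic annual totals (NOT the SPC data): linear trend plus alternating scatter.
years = list(range(1954, 2004))
counts = [550 + 14 * (y - 1954) + (40 if y % 2 else -40) for y in years]

slope, intercept = fit_line(years, counts)
print("fitted slope (reports per year):", round(slope, 2))
print("expected reports in 2004:", round(expected_reports(2004, slope, intercept)))
```

Evaluating the fitted line one year beyond the record gives the kind of anticipated annual total quoted above for 2004.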

The increase in the number of reported tornadoes is expected to level off eventually, suggesting that the linear regression may only be a temporary solution to the report inflation. The absence of a complete record of tornadoes represents a challenge for detrending the data (Brooks et al. 2003). On the other hand, Dotzek et al. (2003) and Feuerstein et al. (2005) demonstrated that the annual tornado record for the United States could be statistically modeled best by Weibull distributions (over the past few decades). In particular, the modeled distributions suggest that the greatest increase would occur in the number of weak tornadoes and show a trend toward converging to an asymptotic climatological intensity distribution. If so, we would expect an increase in weak tornado occurrences and the overall report inflation rate over the last 50 yr to diminish. Still, there is no way to tell at present if, and when, stabilization of the tornado record will take place.

### c. Defining a big tornado day

For the purposes of this paper, the number of tornadoes per day was examined from a national perspective, irrespective of their spatial locations. A “big tornado day” is a single day when numerous tornadoes and/or many tornadoes exceeding some intensity threshold were reported anywhere in the country. We chose this term with the intention of distinguishing a big tornado day from previous applications of the word outbreak. To determine the number of tornadoes needed for a big tornado day, thresholds must be applied to the data. To ensure that thresholds were placed as objectively as possible, natural breakpoints in the data were sought to identify potential threshold values. Ideally, if a natural breakpoint does occur in the data, it would indicate that there are two separate populations of big tornado days and few tornado days.

First, we take into account only the number of reported tornadoes and disregard the intensities. As we have seen above, the total number of reports has increased dramatically over the 50-yr period. Therefore, choosing a fixed value for a threshold would likely bias the results toward recent years and would prove to be problematic. Consequently, the linear regression performed on the annual number of reported tornadoes is used to level the dataset and support threshold placement. A reasonable technique is to take a fraction of the annual expected value from the linear regression to determine threshold placement. Figure 2 shows a log–linear plot of varying percentages of the annual expected number of tornadoes. Fractions of the annual expected value associated with the linear regression were considered possible threshold values. This method of determining a threshold takes into account the general increase in reports and does not assign a fixed threshold value for all 50 yr. In other words, the minimum number of tornado reports needed to be considered a big tornado day is dependent upon the year under investigation. For instance, say 1% of the linear regression value is considered the minimal threshold. Thus, a big tornado day would possess more than 5.5 tornadoes in 1954, more than 8.3 tornadoes in 1975, and more than 12.1 tornadoes in 2003 and would occur roughly 25 times per year. To identify big tornado days occurring roughly once a decade, 7% of the linear regression value (or 84 tornadoes in 2003) would satisfy that criterion (Fig. 2).
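The year-dependent threshold can be illustrated with a short calculation. The expected annual totals used here (550 in 1954 and 1210 in 2003) are back-calculated from the 1% examples quoted above and are an assumption of this sketch, not the fitted regression coefficients; the function names are hypothetical.

```python
# Expected annual totals implied by the 1% examples in the text
# (5.5 in 1954, 12.1 in 2003). Assumed values, not the fitted coefficients.
EXPECTED_1954 = 550.0
EXPECTED_2003 = 1210.0

def expected_annual_total(year):
    """Straight line through the two assumed endpoint values."""
    slope = (EXPECTED_2003 - EXPECTED_1954) / (2003 - 1954)
    return EXPECTED_1954 + slope * (year - 1954)

def daily_threshold(year, fraction):
    """Number of reports in one day equal to a fraction of the expected annual total."""
    return fraction * expected_annual_total(year)

def exceeds_count_threshold(n_tornadoes, year, fraction):
    """True if a day has more reports than the year-dependent threshold."""
    return n_tornadoes > daily_threshold(year, fraction)

for year in (1954, 1975, 2003):
    print(year, "1% threshold:", round(daily_threshold(year, 0.01), 1))
print("7% threshold in 2003:", round(daily_threshold(2003, 0.07), 1))
```

With these assumed endpoints, a day with 10 reports would exceed the 1% threshold in 1975 but not in 2003, which is the behavior the fractional threshold is designed to produce.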

Next, we consider the effect of adding intensity information to determine a big tornado day. The Fujita-scale damage ratings were used to establish intensity. Previously, we saw that the F1 and greater series had been more consistent over the last 50 yr, so we may be able to choose a fixed value for intensity thresholds (Fig. 1). Similarly, Fig. 3 demonstrates that the F4 and greater record has also remained relatively consistent throughout the time period, yet the F2 and greater and F3 and greater records have a declining trend over this period, with a possible discontinuity in the late 1970s. The F2 and greater record has the greatest discontinuity in the annual number of F2 and greater rated tornadoes, suggesting overrating problems with the F2 rated series. To describe the degree of statistical stationarity in the F1 and greater series and the F2 and greater series, we sorted the number of big tornado days per year in decreasing order and separated the early and later portions of the dataset (Fig. 4). A stationary series would have similar distributions in both periods, while a nonstationary series would not. Figure 4 shows that the distributions from the earlier and later portions of the F1 and greater series are more similar to each other than are those of the F2 and greater series. Hence, *the F1 and greater series is more stationary over the last 50 yr in comparison with the F2 and greater series*. The large separation between the two curves in the F2 and greater series is consistent with overrating problems of tornadoes present in the early period of the record, as suggested by Grazulis (1993) and Brooks and Craven (2002). Moreover, Brooks and Doswell (2001) suggested that stronger tornadoes had been reported more consistently over time; the ratings were apparently just “shifted” out of F1 and greater ratings to higher categories. In other words, the nonstationarity of the F2 and greater and the F3 and greater records may be a result of overrating problems with the individual tornado reports (Fig. 3). The F4 and F5 record may be overrated as well, but the relative rarity of these violent tornadoes may make detecting such an overrating signal difficult. Because of this overrating, we have chosen to use the F1 and greater record for intensity information in determining big tornado day events over the 50-yr record.
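The early-versus-late comparison can be sketched as follows. The annual values are synthetic, and summarizing the separation between the two rank-ordered curves by a mean absolute difference is an assumption made here for illustration; the comparison described above is based on the curves themselves.

```python
def rank_ordered(values):
    """Sort annual values in decreasing order."""
    return sorted(values, reverse=True)

def early_late_gap(annual_values):
    """Mean absolute difference between rank-ordered early and late halves.

    A small gap suggests similar distributions in both periods. The metric
    is an illustrative assumption. Requires at least two annual values.
    """
    half = len(annual_values) // 2
    early = rank_ordered(annual_values[:half])
    late = rank_ordered(annual_values[-half:])
    return sum(abs(e - l) for e, l in zip(early, late)) / half

# Synthetic annual numbers of big tornado days (NOT the observed record).
stationary_like = [10, 12, 8, 11, 9, 12, 10, 8, 9, 11]
declining_like = [20, 18, 19, 17, 16, 9, 8, 10, 7, 6]

print("gap, stationary-like series:", early_late_gap(stationary_like))
print("gap, declining series:", early_late_gap(declining_like))
```

A series whose early and late curves nearly coincide yields a gap near zero, whereas a series with a declining trend yields a large gap.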

Figure 5 shows a log–linear plot of the number of big tornado days per year as a function of the minimum number of F1 and greater and F2 and greater tornadoes required for a big tornado day (similar to Fig. 2). This figure illustrates that there is no obvious break in the F1 and greater or F2 and greater dataset to identify an objective threshold, again except for events that occurred roughly once per decade. The presence of a relatively stationary record over many years allows for an arbitrarily chosen fixed threshold value for the minimum number of F1 and greater tornadoes needed to be labeled a big tornado day. Consequently, thresholds must be chosen arbitrarily and depend on the number of events one wishes to analyze. For example, an average of one big tornado day per year corresponds to days with at least 3% of the linear regression value. This would be comparable to about 36 tornadoes of any F-scale rating (for 2003), 26 or more F1 and greater tornadoes, or 15 or more F2 and greater tornadoes. If a big tornado day is defined as a once-in-a-decade event, then days with at least 7% of the expected annual number of tornadoes would be considered. This threshold corresponds to requiring 84 tornadoes of any F-scale rating (for 2003), 50 or more F1 and greater tornadoes, or 30 or more F2 and greater tornadoes to occur in a day. Such a threshold level is highly constrained and should be used to identify only the most extreme events. Analogous to Fig. 2, Table 1 lists the top 25 big tornado days identified by the percentage of the linear regression for any tornado, total number of F1 and greater tornadoes, and total number of F2 and greater tornadoes. The 3 April 1974 Super Outbreak ranked as the top big tornado day in all three categories and approached a once-in-a-century event (Fig. 2).

### d. Examples of thresholds

As shown previously, there is no completely objective procedure to set thresholds. Any definition, therefore, is necessarily arbitrary, and the choice of an appropriate threshold for any user depends upon the user's purpose. For instance, if a user intends to carry out detailed analyses of hourly observations for the entire day of the tornadoes, practical constraints will force the user to choose a relatively high threshold, so that a small number of big tornado days are identified. On the other hand, if the purpose of the research is to generate a large number of proximity soundings, the threshold must be set much lower, so a large number of big tornado days are identified.

Besides the constraints implied by user requirements, testing a threshold could demonstrate how well the big tornado days selected by that threshold agree with lists of big tornado days identified by independent experts. Although individuals would be unlikely to create the exact same list, many days would probably be in common. For instance, the Palm Sunday 11 April 1965 outbreak (Fujita et al. 1970) would almost certainly appear on any reasonable list. Farther down any list of big tornado days, the agreement would likely lessen, but many would still be in common. In terms of the arbitrary thresholds, the highest threshold would include the days that all people would agree were big tornado days, and lower thresholds would include a large fraction of all of the days that people would include on a list. Formally, the high threshold could be thought of as approaching the intersection of all expert opinions, and the low threshold as approaching the union of expert opinions. We cannot carry out this test with experts in the course of this work, but such a test provides a framework for considering the underlying nature of the problem.
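The intersection and union framing can be made concrete with sets. The three expert lists below are hypothetical; only the two dates named in this paper are real events, and the remaining entries are placeholders.

```python
# Hypothetical lists of big tornado days from three experts.
# Only the 1965 and 1974 dates come from the text; the rest are placeholders.
expert_lists = [
    {"1965-04-11", "1974-04-03", "placeholder day A"},
    {"1965-04-11", "1974-04-03", "placeholder day B"},
    {"1965-04-11", "1974-04-03", "placeholder day A", "placeholder day C"},
]

# A high threshold approaches the days every expert would list.
agreed_by_all = set.intersection(*expert_lists)

# A low threshold approaches the days any expert would list.
named_by_any = set.union(*expert_lists)

print("intersection:", sorted(agreed_by_all))
print("union:", sorted(named_by_any))
```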

Time series of the number of big tornado days for each year can be created to demonstrate the behavior of certain thresholds. As an example, thresholds of 1.5% of the expected annual total for any tornado, eight or more F1 and greater tornadoes, and four or more F2 and greater tornadoes were chosen. These thresholds, on average, yielded approximately 14–15 big tornado days per year. Figure 6 illustrates the time series of these thresholds with each series offset along the vertical axis to allow easier examination. A closer look at each series revealed that the F2 and greater series (hollow circles) consistently identified more big tornado days than the any-tornado series (gray triangles) and the F1 and greater series (dashed hollow circles) for the first 19 yr. The F2 series, however, frequently identified fewer big tornado days in the last 19 yr of the dataset than the any-tornado and F1 series (as was suggested by Fig. 2). Thus, the F2 and greater series is not an appropriate choice to identify big tornado days.

To incorporate both total number of tornadoes and intensity information, it is logical to consider a day to be a big tornado day if it meets the minimum total number of reported tornadoes and/or the minimum number of F1 and greater tornadoes. Figure 7 shows some examples of the sensitivity of the number of big tornado days to the choice of thresholds. First, the thresholds of 1.5% of any tornado and/or at least eight F1 and greater tornadoes (black circles) are demonstrated. The years 1957 and 1973 are highly noticeable with this threshold, and the late 1980s, 2000, and 2002 stand out as years with relatively few big tornado days. From 1954 to 2003, a total of 905 days (18.1 per year) were identified, with 550 days (11 per year) selected by both thresholds, 158 (3.2 per year) by the any-tornado threshold alone, and 197 (3.9 per year) by the F1 threshold alone. Moreover, Fig. 7 illustrates some higher thresholds (2.5% for any tornado and/or 15 F1 and greater tornadoes) that showed similar characteristics but identified 1967 and 2003 as the third and fourth highest big tornado day years. The higher threshold obviously selected fewer days, with a total of 311 days (6.2 per year) found, 169 (3.4 per year) that met both criteria, 72 (1.4 per year) from the any-tornado criterion, and 70 (1.4 per year) from the F1 and greater criterion.
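A minimal sketch of the combined and/or criterion follows. The function name is hypothetical, and the treatment of values exactly at a threshold ("more than" for the total count, "at least" for the F1 and greater count) is an assumption. The final lines check the day counts quoted above for the 1.5% and/or eight F1 and greater thresholds.

```python
def classify_day(n_total, n_f1_plus, count_threshold, f1_threshold=8):
    """Classify one day by which big tornado day criterion it satisfies."""
    # Boundary handling is an assumption of this sketch.
    meets_count = n_total > count_threshold
    meets_f1 = n_f1_plus >= f1_threshold
    if meets_count and meets_f1:
        return "both"
    if meets_count:
        return "any-tornado only"
    if meets_f1:
        return "F1+ only"
    return "not a big tornado day"

# Hypothetical days in a year whose 1.5% count threshold is 18.15 reports.
print(classify_day(20, 10, 18.15))
print(classify_day(20, 3, 18.15))
print(classify_day(9, 8, 18.15))

# Day counts reported in the text for 1954-2003.
both, count_only, f1_only = 550, 158, 197
total_days = both + count_only + f1_only
print(total_days, "days,", round(total_days / 50, 1), "per year")
```

Days meeting either criterion are counted once, so the total is the sum of the three mutually exclusive categories.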

### e. Annual cycle of big tornado days

Another interesting way to examine the tornado record is to estimate the daily probability of a tornado anywhere in the United States. Figure 8 shows the daily mean number of reported tornadoes per year over the 50-yr period. To construct a statistical model of the data in Fig. 8, kernel density estimation with a Gaussian smoother ($\sigma_t$ = 15 days) was used to smooth the data in time and space (Brooks et al. 2003). The resulting graph displays the probability of any tornado at any location in the United States on each day of the year (Fig. 9). The peak probability, a 90% chance of any tornado occurring in the United States, is near 12 June (Fig. 9). In contrast, constraining the data using a threshold of 1.5% of the annual expected value and at least eight F1 and greater tornadoes shifts the peak of the probability approximately three weeks earlier in the year. This curve is scaled by a factor of eight, and the actual probability is listed on the right axis. Hence, big tornado days are more likely to occur slightly earlier in the year than just any day with a tornado. Furthermore, the peak in the distribution is narrower, indicating that big tornado days are more concentrated in the spring and early summer, with few occurring in late summer. Finally, the secondary peak in the late fall is prominent in the big tornado day record, implying that fall tornadoes may be even more concentrated in outbreak-type events than spring tornadoes.
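The temporal smoothing can be sketched as a Gaussian-weighted average over day of year with a 15-day standard deviation. This is a minimal sketch, not the procedure of Brooks et al. (2003): wrapping the kernel across the end of the year, normalizing the weights, and the function name are all assumptions made here, and the input series is synthetic.

```python
import math

def gaussian_smooth_annual_cycle(daily_values, sigma_days=15.0):
    """Smooth a day-of-year series with a periodic, normalized Gaussian kernel."""
    n = len(daily_values)
    smoothed = []
    for i in range(n):
        weight_sum = 0.0
        value_sum = 0.0
        for j in range(n):
            # Circular distance, so late December and early January are neighbors.
            # The wrap-around is an assumption of this sketch.
            d = abs(i - j)
            d = min(d, n - d)
            w = math.exp(-0.5 * (d / sigma_days) ** 2)
            weight_sum += w
            value_sum += w * daily_values[j]
        smoothed.append(value_sum / weight_sum)
    return smoothed

# Synthetic input (NOT observed frequencies): a single spike at index 150.
raw = [0.0] * 365
raw[150] = 1.0
smoothed = gaussian_smooth_annual_cycle(raw)

peak_index = max(range(365), key=lambda k: smoothed[k])
print("index of smoothed peak:", peak_index)
print("sum before and after:", sum(raw), round(sum(smoothed), 6))
```

In practice the input would be the fraction of years in which each calendar day met the chosen criterion, so that the smoothed curve can be read as a daily probability.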

## 4. Conclusions

Completely objective approaches to defining a big tornado day do not exist. Any definition, therefore, is necessarily arbitrary, and the choice of an appropriate threshold for any user will depend upon the user's purpose. A simple least squares linear regression was fit to the annual number of tornado reports to offset the general inflation (Fig. 1). Fractions of the annual expected value associated with the linear regression were considered as the possible minimum number of tornado reports needed to be identified as a big tornado day. This method of determining a threshold accounted for the general increase in reports and did not assign a fixed value for all 50 yr. In other words, the minimum number of tornado reports needed to be considered a big tornado day was dependent upon the year under investigation.

The increase in the number of reported tornadoes was quite evident in the annual number of tornadoes, yet the number of F1 and greater tornadoes remained fairly consistent (Fig. 1). The F2 and greater series had far more tornadoes rated at least an F2 in the earlier part of the dataset than in the later part (Fig. 3). The distributions from the earlier and later portions of the F1 and greater series were more similar to each other than were those of the F2 and greater series (Fig. 4). Thus, *the F1 and greater series is more stationary over the last 50 yr in comparison with the F2 and greater series*. Therefore, we chose the F1 and greater record for intensity information in determining big tornado day events over the 50-yr record. The presence of a reasonably stationary record over many years allowed for a fixed threshold value to be arbitrarily chosen for the minimum number of F1 or greater tornadoes needed to be labeled a big tornado day. Therefore, a big tornado day must meet some minimum number of reported tornadoes (based on the linear regression) and/or the minimum number of F1 or greater tornadoes. Although many combinations of thresholds are possible (e.g., Fig. 7), such thresholds must be chosen arbitrarily based on the user's purpose.

Finally, big tornado days are more likely to occur slightly earlier in the year and have a narrower distribution than just any day with a tornado (Fig. 9). Regardless of when or where a big tornado day occurs, all outbreak definitions require some degree of subjectivity and should be treated with discretion.

## Footnotes

*Corresponding author address:* Stephanie M. Verbout, School of Meteorology, University of Oklahoma, 100 East Boyd St., Room 1326, Norman, OK 73019. Email: Stephanie.Nordin@ou.edu