## 1. Introduction

Mapping tornado density from historical tornado reports or observations has been an active branch of research for decades, addressing various research questions that are important to tornado climatology (e.g., Thom 1963; Schaefer et al. 1986; Concannon et al. 2000; Coleman and Dixon 2014; Widen et al. 2015). Data quality issues have been well documented and are widely accepted as a key limiting factor in such efforts (Brooks et al. 2003; Ray et al. 2003; Doswell et al. 2005; Verbout et al. 2006; Coleman and Dixon 2014) because of biases in tornado reporting, temporal or spatial inconsistency of tornado reports (e.g., caused by unevenness in population density, radar locations, and road networks; see Elsner et al. 2013), problems in defining tornado intensity, and inaccuracies or incompleteness of tornado properties. Various approaches have been taken to produce tornado density maps (Coleman and Dixon 2014), but little agreement has been reached regarding the mapping methods or parameters (e.g., Brooks et al. 2003; Dixon et al. 2011; Dixon and Mercer 2012; Marsh and Brooks 2012). We believe this situation is related to several geographical information system (GIS) complications, present at conceptual as well as methodological levels, that need to be fully recognized and systematically addressed. At the very least, individual tornadoes are discrete real-world events in space–time and are directly observable (reportable), whereas tornado density is not (see section 2b for details). As a result, many challenges emerge when a geographic translation needs to be made from one to the other. We define tornado density in this paper as the inverse-distance-weighted (IDW) count of tornado touchdown points or tornado-affected cells (when pathlength data are used). Though not addressed directly in this paper, we believe the density values derived from the past records of tornadoes have implications for future tornado risk or probability.

We examine how GIS factors function in the process of tornado density mapping, in hopes that better density maps can be produced in the future by wisely manipulating these factors. The next section presents GIS definitions of tornado and tornado density as the conceptual support for tornado density mapping. The third section demonstrates a few GIS complications and their effects on tornado density mapping. Conclusions and suggestions are given in the last section.

## 2. Tornado and tornado density viewed in GIS

### a. A GIS definition of tornado and its representation

Tornado density mapping is a GIS activity even though it may be conducted by a climatologist, as geographic space plays a vital role in these efforts. As a result, this mapping activity needs conceptual guidance regarding the questions “what is a tornado?” (see American Meteorological Society 2015) and “what is a tornado in GIS?” To accomplish this, we add the following elements in the tornado definition:

- a tornado is a dynamic event in space–time (Yuan 1998) with a rather discrete geographic boundary and life span;
- a tornado is truly three-dimensional in geographic space in that it covers a land surface area with horizontal shape and size and it has a complete vertical dimension; and
- a tornado has its own geometric (e.g., pathlength, width, area, and direction), thematic (e.g., intensity, wind speed), and temporal (e.g., occurring time and speed/pattern of traveling) properties (Schaefer et al. 1986) that can be tied both to the geographic space–time (e.g., wind speed at a certain space–time point during the tornado) and to the tornado itself (e.g., intensity of a tornado).

The above definition allows the GIS representation of tornadoes to be evaluated more completely. Specifically, when individual tornadoes are conceptualized as discrete objects in geographic space [see Bian (2007) and Goodchild et al. (2007) for a definition of “objects” in GIS], they are usually mapped as various vector features: touchdown points, track polylines, or damage area polygons. Their third or vertical dimension is usually ignored, at least for the purpose of tornado density mapping. The temporal dimension is either ignored (such as in this study) or greatly aggregated (e.g., Concannon et al. 2000). Occasionally, an individual tornado (or tornado area) may also be conceptualized as a continuous field (Goodchild et al. 2007) and mapped using raster [e.g., see Fig. 4 of Standohar-Alfano and van de Lindt (2014)] or isoline (e.g., Yuan et al. 2002) data models. Tornado properties can be linked to any of the above spatial representations to serve various tornado mapping purposes.

According to the above tornado definition, all tornado density maps are simplifications of the real-world “tornado” phenomena. Users’ choices often vary (and vary in quality as well) with regard to how to simplify tornadoes into tornado density maps. However, it is difficult to assess which of the above-listed tornado simplifications or representations is absolutely “better,” because the assessment has to be linked to the actual mapping situation and the application purpose. For instance, to evaluate tornado hazards of “places” across the United States, it would be useful to map tornadoes to (named) counties and states (Boruff et al. 2003), but to identify the tornado alley, it would be better to map tornado density continuously using (small enough) grid cells or isolines (e.g., Dixon et al. 2011).

### b. A GIS interpretation of tornado density mapping

When examined in GIS, differences between the two concepts of “tornado” and “tornado density” are essential. Tornado density is a typical density field. To calculate and report a density value at a location, we must first specify an area (e.g., a county, a circle, or a rectangular cell) surrounding this location. The adjustment of the area in shape and size would result in a different density reading for the same location. Therefore, all tornado density values are relative (to the choice of the area) instead of absolute. In contrast, the existence of tornadoes and most other weather phenomena is fully independent of human definitions. A value on a density field usually describes a two-dimensional geographic area rather than an individual point (Goodchild et al. 2007), even though the value may be assigned to a point (e.g., central point) within the area. In this sense, tornado density is also different from many climatic variables that are ordinary fields in GIS, such as mean air temperature. Ordinary fields are strictly properties of individual points in geographic space and have nothing to do with the surrounding areas of these points (even with the existence of spatial autocorrelation). Tornado probability should be viewed as an ordinary field rather than a density field as well, since it is a property, unknown as it is, of individual locations (i.e., points), even though tornado probability of points often needs to be estimated from area-based tornado density.

Meanwhile, tornado density—once defined—is expected to vary across geographic space in a more or less continuous manner. Sudden beginnings and endings of tornadoes are common, but sudden changes of tornado density (and tornado probability) in geographic space are hardly natural, at least in vast, relatively low relief areas such as the central United States. This observation may explain why some researchers would smooth the crisp tornado density field they created (e.g., Ray et al. 2003; Ashley 2007). Nevertheless, how exactly tornado density should vary (continuously) across geographic space—such as its rate of change and its geographic scale (wavelength; see Dixon et al. 2014) of variability—is again relative and is controlled by human choices. From this perspective, tornado density is again different from many other climatic variables, including tornado probability.

More specifically, a tornado density value depends on at least four geographic factors:

1. the shape of the neighborhood area to which this value refers,
2. the size of the neighborhood area,
3. the relative importance (i.e., distinguishable weights or contributions) of individual tornadoes within the neighborhood area in the computation of the density value (e.g., because of their variable relationships with the value point), and
4. the thematic and geometric properties of individual tornadoes considered for the derivation of the density value.

The first two factors correspond neatly to the well-known modifiable areal unit problem (MAUP)—both scale and zoning problems—in GIS (Openshaw 1984; Jelinski and Wu 1996), just in a special computational environment (e.g., in a raster neighborhood computation rather than vector environment) and application background (i.e., a GIS problem hidden in climatology). These two factors determine that the tornado density computation is relative and modifiable because of the modifiable shape and size of the neighborhood area. Neighborhoods of three kinds are common in tornado density mapping, necessarily resulting in different outputs: circular areas (e.g., Dixon et al. 2011; Coleman and Dixon 2014), rectangular (or similar) grids (e.g., Kelly et al. 1978; Brooks et al. 2003; Ashley 2007), and irregular polygons such as counties or states (e.g., Paul 2001; Boruff et al. 2003; Broyles and Crosbie 2004). The neighborhood size adopted in previous works varies greatly as well, typically from 40 km (e.g., Coleman and Dixon 2014) to 200 km or larger (e.g., Kelly et al. 1978; Ray et al. 2003; Dixon and Mercer 2012; Smith et al. 2012).

An additional complication in tornado density literature is that neighborhoods (both the shape and size) are sometimes defined in separate stages. For instance, Ashley (2007) counted tornado variables into cells of 60 km in size and then applied a 3 × 3 (i.e., a 180 km × 180 km rectangle window) low-pass filter to generate smoother surfaces. Coleman and Dixon (2014) first accumulated tornado pathlengths into 1-km^{2} cells using a 250-km circular window and a kernel function and then aggregated the results into a 5-km grid. “Neighborhood” in these cases is a compound term combining the effects of area (e.g., window, kernel, and grid) choices in both of the stages. When a regular neighborhood is used, it would become incomplete and irregular in both shape and size along the edge of the study area (e.g., along the coastline), causing so-called edge effects (Yamada and Rogerson 2003) that need to be corrected (see section 3a).

The third and fourth factors involve how tornado density is reported. Ideally, the numbers should be read to spatially continuous points or sufficiently small raster cells (e.g., many times smaller than the neighborhood area) to better represent the spatial continuity of tornado density. The choice of small cells also implies that few cells contain multiple tornadoes so that most tornadoes can be differentiated in the density computation. In return, the density difference from a cell to its neighboring cells is primarily because they are associated with different neighborhood areas, but the significant overlapping of these areas would result in a density pattern that is more or less continuous in geographic space. When the geographic distribution of tornadoes is positively autocorrelated (meaning tornadoes tend to have a clustered distribution, as is the case in the United States), this spatial autocorrelation will strengthen the spatial continuity of the resultant density pattern.

For a cell to carry the density value of its surrounding area, though, tornadoes closer to this cell should be given a heavier weight because they are more closely related to the density reading at this cell, and more distant tornadoes a lighter one. This weight adjustment is equivalent to distance decay or inverse-distance weighting in GIS (Franke 1982; Silverman 1986; Mitas and Mitasova 1999) and has been implemented in multiple previous works (e.g., Brooks et al. 2003; Dixon et al. 2011), although the terminology varies. Distance decay further enhances the geographic continuity of density transition between neighboring cells, because the weight of a given tornado changes only slightly from one cell to the next.

When distance decay is implemented, the distinction between a circular neighborhood and a rectangular or irregular one becomes obvious. In a circular neighborhood, the distance decay rate and the distance limit (of the neighborhood area) are uniform in all directions, or isotropic. Circular neighborhoods consequently introduce the fewest artifacts and need the least justification. When rectangular or irregular shapes are used, however, artifacts related to the zoning problem of MAUP (Jelinski and Wu 1996) will be more severe. For instance, there will be cardinal versus diagonal differences in weight reduction rates within a rectangular neighborhood that are hardly justifiable.
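
The cardinal versus diagonal difference can be illustrated with a short worked example. It assumes, hypothetically, a square window of side *s* centered on the cell and a linear decay stretched so that the weight reaches zero at the window boundary. The symbols *s*, *ρ*, and *θ* are introduced here for illustration only. The distance from the center to the boundary in direction *θ* is

$$\rho(\theta) = \frac{s}{2\max\left(|\cos\theta|, |\sin\theta|\right)}, \qquad \rho(0^\circ) = \frac{s}{2}, \qquad \rho(45^\circ) = \frac{s}{\sqrt{2}}.$$

Under this assumption the decay rate 1/*ρ*(*θ*) is √2 ≈ 1.41 times faster along the cardinal directions than along the diagonals. For a 180 km × 180 km window, the boundary lies 90 km from the center in the cardinal directions but about 127 km away along the diagonals. A circular window has *ρ*(*θ*) = *r* in every direction, so no such difference arises.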

Some researchers counted tornadoes into large nonoverlapping grids (Kelly et al. 1978; Ashley 2007) or polygons (Paul 2001; Boruff et al. 2003; Widen et al. 2015), with the benefit of easier and more meaningful (e.g., tornadoes in a county or in a rectangular area) interpretation of resultant patterns, as well as easy integration with socioeconomic or demographic data. However, it necessarily causes loss of geographic details in tornado density variability, because there are too few neighborhoods defined and each tornado can only be counted into one of these neighborhoods with one of two weights: 0 or 1. Sudden transitions of tornado density will be produced across the boundaries of these geographic units, but no changes will be mapped within. A subsequent spatial interpolation or smoothing is sometimes applied to these crisp density readings to generate a more continuous density variability (e.g., Ray et al. 2003; Ashley 2007), but this secondary step is less objective and less respectful of the original data in comparison with a continuous density field that is directly derived from data.

## 3. GIS-related methodological complications and interpretations

We used the Storm Prediction Center (SPC) tornado database (1973–2013) of the eastern half of the United States and experimented on a few key methodological issues in tornado density mapping. We only included tornadoes at or above an enhanced Fujita (EF) scale of 2 (EF2; 6455 in all; see Fig. 1) because of a higher level of data stability and consistency of these more significant tornadoes (e.g., Coleman and Dixon 2014).

### a. Neighborhood size choice, cell size choice, and edge effects

Dixon et al. [2011; also see Dixon and Mercer (2012)] indicate that the choice of the neighborhood size is more important than that of distance decay functions. Indeed, neighborhood size is essential because it reflects the scale problem of MAUP (Openshaw 1984; Jelinski and Wu 1996) in tornado density mapping and controls the scale (i.e., level of detail) of the resultant density pattern [e.g., see Figs. 5–7 of Dixon et al. (2014)]. However, it is still a subjective choice and is modifiable. When a precise choice of the neighborhood size is difficult to justify, an exploratory attitude and an examination of multiple sizes may become necessary. For example, Dixon et al. (2014) used global Moran’s *I* as an objective reference to identify the “ideal” radii as the distances at which the strongest spatial autocorrelation of tornado pathlengths was observed, since these distances correspond to the maximum spatial stability of tornado distributions.

We agree that “one size fits all” does not exist (Dixon et al. 2014). It is therefore important to understand exactly how the results differ between neighborhood sizes (and what the tornado density map will be used for) before judging which choice is appropriate. The range of the tested neighborhood sizes needs to be specified first. Dixon et al. (2014) identified a scale range of 55–180 km that can serve as a useful reference. Putting the methodological and data quality factors aside, we believe the minimum tested neighborhood size should be chosen in consideration of the distance between neighboring tornadoes, so that the density readings can be supported by more than one tornado record. On the other hand, the maximum neighborhood size should be much smaller than the study area so as not to oversmooth the geographic variability of the density surface. We calculated the distance from each touchdown to its nearest neighbor in the study area to guide the choice of the neighborhood size, obtaining an average distance of 11.70 km, a maximum of 127.43 km, and a high standard deviation of 9.06 km in the eastern half of the United States (see Table 1).

Table 1. Summary statistics of distances from each tornado (touchdown point) to its nearest neighbor, for all EF2–EF5 tornadoes in the study area and for those in selected states (AR, IA, KS, MO, NE, and OK).

The above results were influenced by adjacency to large water bodies and to mountainous areas where there are few tornado reports (see, e.g., Elsner et al. 2013), as well as by the inclusion of tornado-rare states. To remove the influence, we did the same distance computation in a more tornado-frequent region of the United States and included only a selection of contiguous, inland, and largely low relief states—Arkansas (AR), Iowa (IA), Kansas (KS), Missouri (MO), Nebraska (NE), and Oklahoma (OK)—involving 1896 tornado records. A moderate yet consistent distance reduction was observed (Table 1). A distance of 20 km, which corresponds to the 88.5th percentile of tornado–tornado distances (i.e., distance from each touchdown to its nearest neighbor) in the six selected states (86.2% in the entire study area) but is much smaller than the maximum, was chosen to be the minimum neighborhood size (i.e., radius) tested. With this radius, most neighborhood areas contain more than one tornado to meaningfully report tornado density variability at a local scale.
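
The nearest-neighbor computation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names and the four coordinates are hypothetical, and the points are assumed to be in a projected coordinate system with units of kilometres.

```python
import numpy as np

def nearest_neighbor_distances(xy):
    """Distance from each point to its nearest other point (same units as xy)."""
    xy = np.asarray(xy, dtype=float)
    diff = xy[:, None, :] - xy[None, :, :]       # pairwise coordinate differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # full n-by-n distance matrix
    np.fill_diagonal(dist, np.inf)               # ignore a point's distance to itself
    return dist.min(axis=1)

def share_within(nn_dist, radius):
    """Fraction of points whose nearest neighbor lies within `radius`."""
    return float(np.mean(np.asarray(nn_dist) <= radius))

# Hypothetical touchdown coordinates in projected kilometres (not SPC data).
touchdowns_km = np.array([[0.0, 0.0], [3.0, 4.0], [30.0, 0.0], [30.0, 25.0]])
nn = nearest_neighbor_distances(touchdowns_km)

print("mean:", nn.mean(), "max:", nn.max(), "std:", nn.std())
print("share within 20 km:", share_within(nn, 20.0))
```

The full pairwise matrix is simple but memory hungry for several thousand records; a spatial index such as `scipy.spatial.cKDTree` would likely be a better choice at that size.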

To understand how tornado density patterns vary with the change of neighborhood size, we calculated tornado density using touchdown point data across a range of 12 neighborhood sizes from 20 to 360 km: 20, 40, 60, 80, 100, 120, 150, 180, 210, 240, 300, and 360 km. The size 360 km was chosen in consideration of the size of the study area (see Fig. 1). We used a linear inverse-distance-weighting function [see Eq. (4) in section 3b] to count the weighted number of tornadoes within the moving neighborhood area of the tested radii.
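
As a rough sketch of the weighted count, the snippet below sums linear distance-decay weights, assumed here to take the form 1 − *d*/*r*, over all tornadoes within radius *r* of each cell center. The function name, the two cells, and the three tornado locations are hypothetical, and coordinates are assumed to be in kilometres.

```python
import numpy as np

def linear_weighted_count(cell_xy, tornado_xy, r):
    """Sum of linear decay weights (1 - d/r) of tornadoes within r of each cell."""
    cell_xy = np.asarray(cell_xy, dtype=float)
    tornado_xy = np.asarray(tornado_xy, dtype=float)
    # Distance from every cell center to every tornado (cells x tornadoes).
    d = np.linalg.norm(cell_xy[:, None, :] - tornado_xy[None, :, :], axis=-1)
    w = np.clip(1.0 - d / r, 0.0, None)   # weight 1 at the cell, 0 at and beyond r
    return w.sum(axis=1)

# Hypothetical cell centers and touchdown points (km).
cells = np.array([[0.0, 0.0], [10.0, 0.0]])
tornadoes = np.array([[0.0, 0.0], [10.0, 0.0], [50.0, 0.0]])

for r in (20.0, 100.0):
    print(r, linear_weighted_count(cells, tornadoes, r))
```

With the small radius the distant tornado contributes nothing to either cell, whereas the larger radius lets it add a partial weight, which is why the magnitude of the readings changes with neighborhood size.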

Cell size is another important parameter, which determines how well the role of each tornado in tornado density mapping can be distinguished from its nearest neighbor(s). We used 1-km (the same for future maps unless otherwise specified) cells. Cells of this size are sufficiently small to distinguish almost all (≥97%) neighboring tornadoes for tornado density computations. Edge effects artificially reduce tornado density readings (e.g., weighted counts in neighborhood areas) along the boundary of the study area (e.g., in Florida). We removed these effects with a scale-specific edge effect remover (for this and all future density computations) in three steps:

1. Convert the entire study area (only) into a raster layer of 1-km cells and count the number of cells within an *r*-radius circular area of each cell, where *r* is the chosen neighborhood size.
2. Calculate the percentage of each circular area (*r*) covered by the study area (the count obtained in step 1 divided by the number of cells in a full circle), which would be less than 100% along the edges, hence being able to serve as an edge effect remover at *r*.
3. Correct the original density readings by dividing them with the percentages obtained above.
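
The three steps above can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the tool actually used: the study area is assumed to be available as a boolean raster mask, the radius is expressed in cells, and all function names are hypothetical.

```python
import numpy as np

def disk_offsets(radius_cells):
    """Row/column offsets of all cells whose centers lie within the radius."""
    k = int(np.floor(radius_cells))
    dy, dx = np.mgrid[-k:k + 1, -k:k + 1]
    inside = dy ** 2 + dx ** 2 <= radius_cells ** 2
    return dy[inside], dx[inside]

def coverage_fraction(study_mask, radius_cells):
    """Share of each cell's circular neighborhood that lies inside the study area."""
    mask = np.asarray(study_mask, dtype=float)
    k = int(np.floor(radius_cells))
    padded = np.pad(mask, k, mode="constant", constant_values=0.0)
    rows, cols = mask.shape
    count = np.zeros_like(mask)
    dys, dxs = disk_offsets(radius_cells)
    for dy, dx in zip(dys, dxs):          # step 1: count study-area cells in the disk
        count += padded[k + dy:k + dy + rows, k + dx:k + dx + cols]
    return count / len(dys)               # step 2: divide by the full-circle count

def correct_density(density, study_mask, radius_cells):
    """Step 3: divide raw readings by the coverage fraction (NaN outside the area)."""
    frac = coverage_fraction(study_mask, radius_cells)
    out = np.full(np.shape(density), np.nan)
    valid = np.asarray(study_mask, dtype=bool) & (frac > 0)
    out[valid] = np.asarray(density, dtype=float)[valid] / frac[valid]
    return out

# Toy 5 x 5 study area that is entirely "land", with a radius of one cell.
mask = np.ones((5, 5), dtype=bool)
frac = coverage_fraction(mask, 1.0)
raw = 3.0 * frac                          # a uniform field as an edge-biased reading
corrected = correct_density(raw, mask, 1.0)
print(frac[0, 0], frac[0, 2], frac[2, 2])
```

In the toy case a spatially uniform field is recovered exactly after correction, which is the behavior the correction is meant to have along coastlines and other study-area boundaries.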

Figure 2 shows the outputs from the experiments on neighborhood sizes. The legend in Fig. 2 only displays the relative difference (e.g., high vs low) in density values on each of the maps, so as to facilitate a visual comparison of spatial patterns between the maps. The actual density values on the density maps have very different magnitudes when the neighborhood size varies. The adjustment of the neighborhood size primarily resulted in dramatically variable wavelengths (i.e., amount of detail) of the mapped tornado density patterns, especially when neighborhood sizes are smaller than 210 km. These patterns are all interpretable, just at different scales (hence fitting different applications). That is, the choice of the neighborhood size for tornado density mapping can (and needs to) be based on the targeted level of detail or wavelength in the tornado density pattern, which is necessarily related to the mapping purpose. In this paper, the above statement is made without considering data quality issues (e.g., the stability and temporal span covered by the records).

Fig. 2. Tornado density maps at 12 different scales (neighborhood sizes; km) using the linear distance decay function and tornado touchdown points: (a) 20, (b) 40, (c) 60, (d) 80, (e) 100, (f) 120, (g) 150, (h) 180, (i) 210, (j) 240, (k) 300, and (l) 360.

Citation: Journal of Applied Meteorology and Climatology 55, 2; 10.1175/JAMC-D-15-0141.1


It should be noted that coarsening the neighborhood size did not shift the locations of tornado centers. Instead, it smoothed out many local tornado centers to reveal only those regional- to continental-scale tornado centers. With the use of the linear distance decay function, five prominent tornado centers are stably observed across most of the tested neighborhood sizes: central Oklahoma, northeastern Texas, north-central Arkansas, southwestern Mississippi, and northern Alabama.

### b. The role of distance decay functions

While the incorporation of distance decaying is common in literature, the choice of distance decay functions, as well as the very role of these functions, is not well agreed on (e.g., Dixon and Mercer 2012; Marsh and Brooks 2012). If the direct mapping purpose is to depict tornado density patterns instead of estimating the underlying probability surface, we believe the choice of distance decay functions is largely a subjective one and that attention should be paid primarily to their specialty and differences [e.g., see Silverman (1986), sections 3.3 and 3.4]. Our discussion here, then, is not to evaluate which is a “better” function, but to interpret and evaluate differences between a few functions. Some of the evaluated functions were modified in this paper from their original form for comparison and interpretation purposes, so that all of the evaluated functions would assign a weight from 0 (or nearly 0) to 1 within a circular neighborhood area of radius *r* (200 km in Fig. 3 for the demonstration). Tornadoes near the neighborhood edge would receive a weight close to 0, whereas those at the center would receive a weight of 1. The four distance decay functions are as follows:

- The adjusted Epanechnikov quadratic probability density function (Silverman 1986; implemented in Esri ArcGIS), similar to the one used by Dixon et al. (2011), is expressed as (where the constant ratio ¾ was removed so that the maximum weight could be 1)

  $$W(x) = 1 - \left(\frac{d_x}{r}\right)^2, \tag{1}$$

  where *W*(*x*) is the weight of tornado *x*, *d*_{x} is the distance from *x* to the circular neighborhood center, and *r* is the radius (size) of the neighborhood (*d*_{x} ≤ *r*), controlling the scale of the resultant density pattern.
- A modified Gaussian kernel function (with the normalization term being removed for comparison purposes), similar to the one used by Brooks et al. (2003), is

  $$W(x) = \exp\left(-\frac{d_x^2}{2\sigma^2}\right), \tag{2}$$

  where *σ* is a smoothing parameter whose role is equivalent to *r*. The variable *σ* is 60 km in Fig. 3 for this comparison, so that the modified Gaussian function can generate a similar range of weight reduction (i.e., from 1 to close to 0) within a similar distance range (200 km in Fig. 3) to be comparable with the other functions.
- A quasi-IDW (see Franke 1982) function, where the *d*_{x} factor has been removed from the denominator [see Eq. (2) of Franke (1982)] for a weakened distance decaying, is expressed as

  $$W(x) = \left(\frac{r - d_x}{r}\right)^2. \tag{3}$$

- The simplest case, a linear inverse-distance-weighted function, is expressed as

  $$W(x) = 1 - \frac{d_x}{r}. \tag{4}$$
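
A minimal sketch of the four weight functions follows, written so that each returns 1 at the neighborhood center. The closed forms used here, 1 − (*d*/*r*)², exp(−*d*²/2*σ*²), (1 − *d*/*r*)², and 1 − *d*/*r*, are assumptions inferred from the descriptions in this section, and the function names are hypothetical.

```python
import numpy as np

def quadratic_weight(d, r):
    """Adjusted Epanechnikov form: 1 - (d/r)^2 inside r, 0 outside."""
    u = np.asarray(d, dtype=float) / r
    return np.where(u <= 1.0, 1.0 - u ** 2, 0.0)

def gaussian_weight(d, sigma):
    """Gaussian form without the normalization term: exp(-d^2 / (2 sigma^2))."""
    u = np.asarray(d, dtype=float) / sigma
    return np.exp(-0.5 * u ** 2)

def quasi_idw_weight(d, r):
    """Quasi-IDW form: ((r - d)/r)^2 inside r, 0 outside."""
    u = np.asarray(d, dtype=float) / r
    return np.where(u <= 1.0, (1.0 - u) ** 2, 0.0)

def linear_weight(d, r):
    """Linear form: 1 - d/r inside r, 0 outside."""
    u = np.asarray(d, dtype=float) / r
    return np.where(u <= 1.0, 1.0 - u, 0.0)

# Parameters matching the demonstration values quoted for Fig. 3.
r, sigma = 200.0, 60.0
d = np.array([0.0, 50.0, 100.0, 150.0, 200.0])

print("quadratic:", quadratic_weight(d, r))
print("gaussian: ", gaussian_weight(d, sigma))
print("quasi IDW:", quasi_idw_weight(d, r))
print("linear:   ", linear_weight(d, r))
```

Keeping all four functions on a common 0 to 1 scale makes their shapes directly comparable, which is the purpose of the adjustments described above.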

Fig. 3. Four distance decay functions: 1) adjusted quadratic probability, 2) modified Gaussian, 3) quasi IDW, and 4) linear.


The linear function generates a constant rate of weight reduction within the neighborhood and can be used as a reference for the evaluation of other functions. In comparison, the weights assigned by the quadratic function are heavily biased toward tornadoes closer to the neighborhood center for a long distance (i.e., up to 100 km, or *r*/2, where the weight reduction rate reached the rate of the linear function). In other words, tornadoes within *r*/2 receive biased high weights with relatively slow weight reduction, which allows high weights of these tornadoes to be combined to identify large tornado clusters, or large zones of high tornado density. Small but separate tornado clusters (i.e., density peaks) within *r*/2 may be combined while their individual prominence is reduced (e.g., see Fig. 6a, described in greater detail below). The weight reduces sharply beyond *r*/2, especially near the edge of the neighborhood (Fig. 3). Change of the neighborhood size (Fig. 4) would result in a new set of high-weight tornadoes that are located within a new *r*/2 distance, allowing this function to effectively detect the wavelengths of tornado density variability with changing neighborhood size or kernel radius (e.g., see Dixon and Mercer 2012; Dixon et al. 2014).
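
The *r*/2 threshold can be checked with a short derivation, assuming the quadratic weight takes the form *W* = 1 − (*d*_{x}/*r*)² and the linear weight the form *W* = 1 − *d*_{x}/*r*. Equating the magnitudes of the two slopes gives

$$\left|\frac{\mathrm{d}W_{\text{quad}}}{\mathrm{d}d_x}\right| = \frac{2 d_x}{r^2} = \frac{1}{r} = \left|\frac{\mathrm{d}W_{\text{lin}}}{\mathrm{d}d_x}\right| \quad\Longrightarrow\quad d_x = \frac{r}{2}.$$

Inside *r*/2 the quadratic weight therefore falls more slowly than the linear weight, and beyond *r*/2 it falls faster. With *r* = 200 km the crossover is at 100 km, where the quadratic weight is still 0.75 compared with 0.5 for the linear function.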

Fig. 4. Change of the adjusted quadratic distance decay function with varying neighborhood sizes (km): 1) 60, 2) 100, 3) 150, 4) 180, and 5) 200.


The smoothing parameter *σ* in the modified Gaussian function plays a similar role to *r* (e.g., in the quadratic function) but takes a much smaller value (60 km, or 30% of *r*, in Fig. 3) than *r* when the corresponding functions spread weights within a similar distance range (see Fig. 3). Additionally, the Gaussian function assigns high weights only to tornadoes immediately next to the neighborhood center (<30 km in Fig. 3, roughly *σ*/2 or 15% of *r*; the rate of weight reduction reaches the rate of the linear function at a distance of 19 km). A sharp decline of weights follows (30–110 km). Across the second half of the radius (especially >120 km, or 2*σ*) the assigned weights stay very low. The Gaussian function therefore has a different “scale” implication compared to other functions: its use would result in strong bias or preference toward tornadoes immediately next to the neighborhood center (roughly *σ*/2 only), and tornadoes beyond 2*σ* are given very low weights in density computations. As a result, use of the Gaussian function would effectively identify small but densely packed tornado clusters such as the ones in eastern North Dakota and southeastern Nebraska (see Fig. 6b, described in greater detail below). In comparison to the quadratic function, the choice of the Gaussian function has an effect that is similar, but not equivalent, to reducing the neighborhood size, since heavy weights are assigned within 15% of *r* rather than within *r*/2 when the two functions distribute weights within neighborhoods of similar size.
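
The 19-km figure can be reproduced numerically. The sketch below assumes the Gaussian weight has the form exp(−*d*²/2*σ*²) and the linear weight the form 1 − *d*/*r*, and uses bisection to find the distance at which the Gaussian slope first matches the linear slope 1/*r*. The function names are hypothetical.

```python
import math

def gaussian_slope(d, sigma):
    """Magnitude of dW/dd for W = exp(-d^2 / (2 sigma^2))."""
    return (d / sigma ** 2) * math.exp(-d ** 2 / (2.0 * sigma ** 2))

def crossover_distance(sigma, r, tol=1e-6):
    """Smallest d in (0, sigma) where the Gaussian slope equals the linear slope 1/r."""
    lo, hi = 0.0, sigma                  # the slope magnitude increases on [0, sigma]
    target = 1.0 / r
    if gaussian_slope(hi, sigma) < target:
        raise ValueError("Gaussian slope never reaches the linear rate")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_slope(mid, sigma) < target:
            lo = mid                     # slope still too gentle: move outward
        else:
            hi = mid                     # slope already steep enough: move inward
    return 0.5 * (lo + hi)

# Demonstration values quoted for Fig. 3: sigma = 60 km and r = 200 km.
d_cross = crossover_distance(sigma=60.0, r=200.0)
print(f"crossover at about {d_cross:.1f} km")
```

The search is restricted to distances up to *σ* because the Gaussian slope magnitude peaks at *d* = *σ*; a second, outer crossing exists where the slope falls back below the linear rate, but it is not the one discussed here.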

Change of *σ* (equivalent to change of neighborhood size *r*) in the Gaussian function primarily affects the width of the first weight reduction stage, the high-weight and short-range (<*σ*/2) stage (Fig. 5), making this function an effective detector of scale-specific tornado clusters (spikes). Since the Gaussian function can assign some (very low) weights to much more remote tornadoes than the other functions (which is another key difference between the Gaussian function and the quadratic function), a secondary effect of adjusting *σ* is that the range of weight reduction (i.e., the distance at which weights approach 0) can be greatly affected. Figure 5 displays the modified Gaussian function curves with five *σ* values: 17, 32, 45, 55, and 71 km.

Fig. 5. Change of the modified Gaussian distance decay function with varying *σ* (km): 1) 17, 2) 32, 3) 45, 4) 55, and 5) 71.


In comparison with the modified Gaussian function, the quasi-IDW function produces a more stable weight decline throughout. Relative to the linear function, however, its weight reduction is sharper in the first *r*/2 (the weight reduction rate lowered to the rate of the linear function at 100 km, or *r*/2, in Fig. 3), but much slower at a low level near the neighborhood edge (>140 km). In comparison with the other functions, the quasi-IDW function assigns the lowest (relative) weights to tornadoes near the neighborhood center, making it ineffective in detecting either prominent regional high-density zones or local density spikes (see Fig. 6c).

Fig. 6. Three-dimensional views of tornado density maps created using different distance decay functions with a 60-km neighborhood window: (a) adjusted quadratic probability, (b) modified Gaussian, (c) quasi IDW, and (d) linear.


Figure 6 shows 3D tornado density maps produced using the four distance decay functions after the weights were spread (i.e., reduced to 0, or close to 0) within a neighborhood of 60 km (roughly equivalent to *σ* of 17 km). This neighborhood size was chosen to demonstrate how the four functions may behave differently at a regional wavelength (i.e., within individual states of the United States), as well as to identify possible regional tornado concentrations or clusters.

It should be noted that the sum of weights assigned by each distance decay function (e.g., the area under each curve in Fig. 3) is unique, so the adjusted distance functions yield tornado density values of different magnitudes (Fig. 6). On average, the quadratic function consistently reports higher density values than the other functions. However, the primary difference between distance decay functions lies in how they spread weights among tornadoes within the neighborhood area, so as to distinguish the relative importance of each tornado in comparison with others. The magnitude difference of the assigned weights (hence density values) should therefore not affect the comparison and evaluation results of the functions.
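The point about magnitudes versus relative patterns can be illustrated with a minimal sketch. It assumes textbook forms of the linear and Epanechnikov-style quadratic weights (the adjusted functions used in this paper may differ) and uses synthetic points; the helper names `linear_weight`, `quadratic_weight`, `density`, and `ranking` are hypothetical.

```python
import math
import random

def linear_weight(d, r):
    """Linear decay: 1 at the center, falling to 0 at distance r."""
    return max(0.0, 1.0 - d / r)

def quadratic_weight(d, r):
    """Epanechnikov-style quadratic decay: 1 - (d/r)^2 inside r, 0 outside."""
    t = d / r
    return max(0.0, 1.0 - t * t)

def density(points, location, r, weight_fn):
    """Weighted count of points within radius r of one location."""
    x0, y0 = location
    return sum(weight_fn(math.hypot(x - x0, y - y0), r) for x, y in points)

random.seed(0)
# Synthetic touchdown points and evaluation locations (km), for illustration only.
points = [(random.uniform(0, 300), random.uniform(0, 300)) for _ in range(200)]
locations = [(random.uniform(60, 240), random.uniform(60, 240)) for _ in range(50)]

r = 60.0
d_lin = [density(points, loc, r, linear_weight) for loc in locations]
d_quad = [density(points, loc, r, quadratic_weight) for loc in locations]

def ranking(values):
    """Indices of locations ordered from lowest to highest density."""
    return sorted(range(len(values)), key=values.__getitem__)

# The quadratic weight is never below the linear one, so its densities are higher.
print(all(quad >= lin for quad, lin in zip(d_quad, d_lin)))
# Rescaling one map by a constant changes magnitudes but not the ranking of locations.
print(ranking(d_lin) == ranking([2.0 * v for v in d_lin]))
```

Under these assumptions, the quadratic weights exceed the linear ones at every distance inside the window, which is consistent with the higher density values reported for the quadratic function, while a constant rescaling leaves the ordering of locations unchanged.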

### c. Choice of tornado properties and uncertainty in tornado density mapping

So far, tornado density maps in this paper are all based on ≥EF2 touchdown points. These maps simply report inverse-distance-weighted “counts” of tornadoes. The diversity of tornado events suggests that tornado-to-tornado differences (i.e., geometric, thematic, and temporal) should be incorporated as far as possible for more informative density mapping. In fact, the produced density maps often need to carry direct “real world” meanings to be more useful, which requires the incorporation and manipulation of more tornado properties in mapping. The side effect, however, is that stronger uncertainty in data (Verbout et al. 2006) and more complications in analyses will propagate through the analytical procedure (Heuvelink 1998) to make the output more uncertain. This uncertainty issue makes Concannon et al. (2000) and Brooks et al. (2003) advocate the use of simple data (viz., tornado occurrences and touchdown locations) in tornado density mapping because these data are comparatively reliable and consistent.

The literature abounds in efforts to report tornado risk or probability (e.g., Brooks et al. 2003; Doswell et al. 2005; Coleman and Dixon 2014), frequency (e.g., Kelly et al. 1978), hazards (e.g., Boruff et al. 2003), fatalities (e.g., Ashley 2007), tornado clusters (e.g., Broyles and Crosbie 2004; Dixon et al. 2011), and so on. Some of these efforts involve more complex analyses and/or more data than simply counting touchdowns, which implies that more sources of uncertainty may be present. However, the outputs of existing works, including maps presented in this paper, are typically reported in an “as is” manner. Since “ground truth” data are unavailable [see Tyrrell (2003) for a rare exception] to support the evaluation of the mapping outputs, what is needed is an uncertainty and propagation analysis for tornado density (or probability) mapping [see, e.g., Meyer et al. (2002) for a similar effort]. We therefore suggest that a backward step be taken: while the ultimate goal may still be to map tornado probability, efforts should first be made to define the uncertainty level of the involved density maps.

As an experiment, we conducted a Monte Carlo analysis after incorporating tornado pathlengths and their errors into density mapping. This analysis starts with an effort to map tornado density according to the pathlengths of involved tornadoes. In the United States, SPC data report tornado paths as straight lines connecting start and end points, regardless of the actual shape of tornado tracks. We used the linear distance decay function and created a tornado density map by incorporating the SPC tornado path data (Fig. 7), following two steps. First, we converted each tornado path (line) into one or multiple square “tornado” cells of 291 m × 291 m in size, with the tornado pathlength determining the number of cells the tornado covers. The value of 291 m is the mean width (with a standard deviation of 383 m) of ≥EF2 tornadoes that have width information available, allowing the converted tornado cells to represent land areas that were affected by tornadoes. Second, we counted weighted tornado cells by use of a circular neighborhood window of 60 km in radius. The resultant density values are necessarily affected by the propagation of the length and shape error of the SPC path data. Figure 8 is a scatterplot of 146 randomly sampled density values based on touchdown points (Fig. 2c) and pathlengths (Fig. 7), revealing rather strong variation of these two sets of density values from each other.
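The two steps above can be sketched as follows. This is only an illustration under stated assumptions: cells are placed as evenly spaced centers along the straight path rather than produced by a GIS rasterization, the cell count is the pathlength divided by 291 m rounded up (the rounding rule is an assumption), and `path_to_cells` and `path_density` are hypothetical names.

```python
import math

CELL_SIZE_M = 291.0   # mean width of >=EF2 tornadoes, as reported in the text
RADIUS_M = 60_000.0   # 60-km circular neighborhood window

def path_to_cells(start, end, cell_size=CELL_SIZE_M):
    """Split a straight path (coordinates in m) into cell centers along the line.

    The number of cells is the pathlength divided by the cell size, rounded up,
    with a minimum of one cell so that very short tornadoes still register.
    """
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    n = max(1, math.ceil(length / cell_size))
    return [(x0 + (x1 - x0) * (i + 0.5) / n, y0 + (y1 - y0) * (i + 0.5) / n)
            for i in range(n)]

def linear_weight(d, r=RADIUS_M):
    """Linear decay: 1 at the center, 0 at the window edge and beyond."""
    return max(0.0, 1.0 - d / r)

def path_density(location, paths):
    """Weighted count of tornado cells within the window around one location."""
    x, y = location
    return sum(linear_weight(math.hypot(cx - x, cy - y))
               for start, end in paths
               for cx, cy in path_to_cells(start, end))

# Synthetic straight-line paths (m), for illustration only.
paths = [((0.0, 0.0), (2910.0, 0.0)),            # 2.91-km path
         ((50_000.0, 0.0), (50_000.0, 100.0))]   # very short path
print(len(path_to_cells(*paths[0])), len(path_to_cells(*paths[1])))
print(path_density((0.0, 0.0), paths))
```

Because each tornado contributes one weighted term per cell, a long-path tornado counts many times and a short one only once, which is why the density readings shift from counts toward affected area.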

Tornado density map created after the SPC tornado pathlength data were incorporated, by use of the linear distance decay function and a 60-km circular neighborhood window. The values are weighted counts of 291-m tornado path cells within the neighborhood, displayed using a 10-class quantile classification.

Scatterplot of tornado density values calculated from touchdown point data (at a spatial resolution of 1 km) and pathlength data (at a spatial resolution of 291 m), both using the linear distance decay function and a 60-km circular neighborhood window. The two sets of values were taken from 146 randomly sampled points of the study area, with a second-order polynomial trend line fitted.

In comparison with the density map produced using tornado touchdown data (e.g., see Fig. 2c), the choice of pathlength data resulted in a different geographic pattern (Fig. 7), exemplifying the strong impact of tornado property choices (also see Fig. 8). For instance, Mississippi, Alabama, and Arkansas are shown to be states containing the densest tornado path cells. The tornado density is reduced in Oklahoma and dramatically reduced in northeastern Texas, indicating the abundance of frequent but short-path tornadoes in both of the states. States such as Iowa, Nebraska, Indiana, and Missouri have similar density on the two tornado density maps.

Next, we introduced a random error term into the SPC path data to repeat the above density calculation, as required by the Monte Carlo procedure. In the case study, this random error term was introduced as a quantitative measure of the “length” error in tornado path data (but not the track “location” or “shape” error), and it is based on an examination of the differences between straight-line paths and the curved, “real” tornado tracks. The random error term was created and incorporated in five steps:

1. Digitize 100 tracks of historical tornadoes using NOAA satellite tornado damage images, into two sets of lines: real, often curved tornado polylines and straight lines (e.g., see Fig. 9).

2. Calculate the lengths of these two sets of lines, as well as the real/straight error ratio of the two sets of lines (expected to be ≥1), and then obtain summary statistics of the error ratio for the 100 sample tornadoes (Table 2).

3. For all tornadoes included in our study, create a random error term that has the same mean and standard deviation as the error ratio defined in step 2 (boldface numbers in Table 2) and then assign the random error term to all the tornadoes as a new attribute, which also becomes the attribute of the converted tornado cells.

4. Recalculate the tornado density at the selected scale and cell size (e.g., the same as those used for Fig. 7) by accumulating the random error term of the tornado cells in the same weighted manner, which will result in a new tornado density map.

5. Repeat steps 3 and 4 100 times to complete the 100-repetition Monte Carlo analysis (Heuvelink 1998).
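Steps 3 to 5 can be sketched in a few lines. The sketch below is not the authors' implementation: the mean and standard deviation are placeholders (the Table 2 values are not reproduced here), drawing the error term from a normal distribution is an assumption, and the tornado cells are synthetic. The names `draw_ratio`, `density`, and `monte_carlo` are hypothetical.

```python
import math
import random
import statistics

# Placeholder statistics: the actual mean and standard deviation of the
# real/straight length ratio come from Table 2 and are NOT reproduced here.
RATIO_MEAN = 1.05   # hypothetical
RATIO_STD = 0.05    # hypothetical
RADIUS_KM = 60.0
N_RUNS = 100

def linear_weight(d, r=RADIUS_KM):
    """Linear decay: 1 at the center, 0 at the window edge and beyond."""
    return max(0.0, 1.0 - d / r)

def draw_ratio(rng):
    """Draw one error term; the normal distribution is an assumption, and it
    can occasionally return values below 1."""
    return rng.gauss(RATIO_MEAN, RATIO_STD)

def density(location, tornadoes, ratios):
    """Weighted sum over tornado cells, each cell carrying its tornado's error term."""
    x, y = location
    total = 0.0
    for cells, ratio in zip(tornadoes, ratios):
        for cx, cy in cells:
            total += ratio * linear_weight(math.hypot(cx - x, cy - y))
    return total

def monte_carlo(location, tornadoes, n_runs=N_RUNS, seed=0):
    """Repeat steps 3 and 4 n_runs times; return the mean and std dev of density."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        ratios = [draw_ratio(rng) for _ in tornadoes]       # step 3
        runs.append(density(location, tornadoes, ratios))   # step 4
    return statistics.mean(runs), statistics.stdev(runs)

# Synthetic tornadoes, each a list of cell centers (km); illustration only.
tornadoes = [[(0.0, 0.0), (0.3, 0.0)], [(20.0, 5.0)], [(40.0, -10.0), (40.3, -10.0)]]
mean_density, std_density = monte_carlo((0.0, 0.0), tornadoes)
baseline = density((0.0, 0.0), tornadoes, [1.0] * len(tornadoes))
print(mean_density, std_density, baseline)
```

Running the same loop for every output cell would give one mean and one standard deviation per cell, that is, a mean density map and an accompanying uncertainty map.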

Examples of digitized real tornado paths (solid lines) and straight-line tracks (dashed lines) as reported by SPC. Shown are (top left) El Reno, Oklahoma; (top right) Moore, Oklahoma; (bottom left) Atlanta, Georgia; and (bottom right) Joplin, Missouri.

Differences between digitized real tornado path and straight-line lengths of 100 tornadoes. See the text for an explanation of the boldface numbers.

From the new set of 100 tornado density maps, an average density map was obtained (Fig. 10a). In contrast to Fig. 7, this map incorporates the sample statistics (Table 2) of the tornado pathlength error, used because the actual all-population error was unavailable. It is also a more stable solution for tornado density mapping because it averages over many repetitions. This density map is accompanied by an uncertainty map (the standard deviation of the 100 repetitions for each cell; Fig. 10b) that reports the variability (or stability) of each tornado density reading in Fig. 10a. We also defined a relative error of tornado density as the difference between the Monte Carlo mean density and the SPC (pathlength) density (the former minus the latter), expressed as a percentage of the density value derived from SPC paths. The corresponding error map (Fig. 10c) reports the magnitude of error in each density reading that was caused by the error of tornado pathlengths. Although there are numerous minor errors in reported tornado pathlengths (Table 2), they resulted in relatively small errors in tornado density maps (Fig. 10c). We believe efforts like this are important in future tornado density mapping, especially when more tornado properties need to be incorporated. It should be noted that the above Monte Carlo analysis is based on the sample error of tornado pathlengths. It ignores the effects of other error sources and error-related factors, such as the possibility that long- and short-path tornadoes have different error patterns.
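For reference, the three mapped quantities can be written compactly. The notation is introduced here for illustration and is not taken from the source: *D_k(c)* is the density at cell *c* in repetition *k*, *N* = 100 is the number of repetitions, and *D*_SPC(*c*) is the density derived from the unperturbed SPC paths (Fig. 7). Whether the standard deviation uses *N* or *N* − 1 in the denominator is not stated in the text; the sample form is assumed.

```latex
\bar{D}(c) = \frac{1}{N}\sum_{k=1}^{N} D_k(c), \qquad
s(c) = \sqrt{\frac{1}{N-1}\sum_{k=1}^{N}\bigl(D_k(c)-\bar{D}(c)\bigr)^{2}}, \qquad
E(c) = \frac{\bar{D}(c) - D_{\mathrm{SPC}}(c)}{D_{\mathrm{SPC}}(c)} \times 100\%
```

Here the mean corresponds to Fig. 10a, the standard deviation to the uncertainty map in Fig. 10b, and the relative error to Fig. 10c.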

(a) The average tornado density map after a Monte Carlo analysis, as well as the corresponding (b) uncertainty (std dev of density values) map and (c) density error (relative difference between Monte Carlo mean and SPC density) map.

## 4. Concluding remarks

From a GIS perspective, we defined tornadoes as multidimensional real-world events within a framework of geographic space, time, and theme. Tornado density, in contrast, was recognized as a density field that depends on various human choices to exist, involving the neighborhood window, the method used to compute the density, the considered tornado properties, and the output resolution. Tornado density values can therefore only be relative. The identification of the above factors provided a conceptual foundation for the later interpretation of the tornado mapping process, to allow us to clarify the involved complications.

Three vital factors were identified as causing complications in tornado density mapping: the neighborhood size, the choice of the distance decay function, and the involved tornado properties. We tested 12 neighborhood sizes ranging from 20 to 360 km. Once a neighborhood size is chosen, the choice results in a density pattern that can be interpreted at the corresponding scale and may suit applications at that scale. This choice should therefore be made in consideration of the application purpose.

The tested distance decay functions have “scale” implications. They also have their own specialty strengths for tornado density mapping. The adjusted Epanechnikov quadratic function can be used to delineate large zones of high tornado density at various scales and to detect the variable wavelengths of tornado density variability with the adjustment of neighborhood sizes. The modified Gaussian function can identify prominent tornado density “spikes,” or small areas of high tornado density. The quasi-IDW function cannot effectively identify tornado clusters at any distance range or scale.

Changing the choice of tornado properties in tornado density mapping changes the resultant tornado density maps in both their geographic patterns and the meanings of the density readings. Northeastern Texas, for instance, was identified as a high-density center on the touchdown-based density maps but had a dramatically reduced importance when the tornado pathlength was incorporated in density mapping. In addition, changing from the touchdown choice to the pathlength choice changed the meaning of the tornado maps from counts to coverage areas.

The adjustment of the considered tornado properties is also directly related to the level of uncertainty in the resultant density maps. Uncertainty and propagation analyses were therefore prescribed for future tornado density mapping. An experiment based on tornado pathlength data was conducted to demonstrate how a simple Monte Carlo analysis with manageable efforts can characterize uncertainty in resultant tornado density maps. A few GIS technical issues, such as the choice of the cell size and the removal of edge effects, as well as the corresponding GIS methods, were also discussed and demonstrated in the paper.

## REFERENCES

American Meteorological Society, 2015: “Tornado.” Glossary of Meteorology. [Available online at http://glossary.ametsoc.org/wiki/Tornado.]

Ashley, W., 2007: Spatial and temporal analysis of tornado fatalities in the United States: 1880–2005. *Wea. Forecasting*, **22**, 1214–1228, doi:10.1175/2007WAF2007004.1.

Bian, L., 2007: Object-oriented representation of environmental phenomena: Is everything best represented as an object? *Ann. Assoc. Amer. Geogr.*, **97**, 267–281, doi:10.1111/j.1467-8306.2007.00535.x.

Boruff, B. J., J. A. Easoz, S. D. Jones, H. R. Landry, J. D. Mitchem, and S. L. Cutter, 2003: Tornado hazards in the United States. *Climate Res.*, **24**, 103–117, doi:10.3354/cr024103.

Brooks, H. E., C. A. Doswell, and M. P. Kay, 2003: Climatological estimates of local daily tornado probability for the United States. *Wea. Forecasting*, **18**, 626–640, doi:10.1175/1520-0434(2003)018<0626:CEOLDT>2.0.CO;2.

Broyles, J. C., and K. C. Crosbie, 2004: Evidence of smaller tornado alleys across the United States based on a long track F3–F5 tornado climatology study from 1880–2003. *22nd Conf. Severe Local Storms*, Hyannis, MA, Amer. Meteor. Soc., P5.6. [Available online at https://ams.confex.com/ams/11aram22sls/techprogram/paper_81872.htm.]

Coleman, T. A., and P. G. Dixon, 2014: An objective analysis of tornado risk in the United States. *Wea. Forecasting*, **29**, 366–376, doi:10.1175/WAF-D-13-00057.1.

Concannon, P. R., H. E. Brooks, and C. A. Doswell III, 2000: Climatological risk of strong and violent tornadoes in the United States. Preprints, *Second Symp. on Environmental Applications*, Long Beach, CA, Amer. Meteor. Soc., 9.4. [Available online at http://ams.confex.com/ams/annual2000/techprogram/paper_6471.htm.]

Dixon, P. G., and A. E. Mercer, 2012: Reply to “Comments on ‘Tornado risk analysis: Is Dixie Alley an extension of Tornado Alley?’” *Bull. Amer. Meteor. Soc.*, **93**, 408–410, doi:10.1175/BAMS-D-11-00219.1.

Dixon, P. G., A. E. Mercer, J. Choi, and J. S. Allen, 2011: Tornado risk analysis: Is Dixie Alley an extension of Tornado Alley? *Bull. Amer. Meteor. Soc.*, **92**, 433–441, doi:10.1175/2010BAMS3102.1.

Dixon, P. G., A. E. Mercer, K. Grala, and W. H. Cooke, 2014: Objective identification of tornado seasons and ideal spatial smoothing radii. *Earth Interact.*, **18**, doi:10.1175/2013EI000559.1.

Doswell, C. A., H. E. Brooks, and M. P. Kay, 2005: Climatological estimates of daily nontornadic severe thunderstorm probability for the United States. *Wea. Forecasting*, **20**, 577–595, doi:10.1175/WAF866.1.

Elsner, J. B., L. E. Michaels, K. N. Scheitlin, and I. J. Elsner, 2013: The decreasing population bias in tornado reports across the Central Plains. *Wea. Climate Soc.*, **5**, 221–232, doi:10.1175/WCAS-D-12-00040.1.

Franke, R., 1982: Scattered data interpolation: Tests of some methods. *Math. Comput.*, **38**, 181–200.

Goodchild, M. F., M. Yuan, and T. J. Cova, 2007: Towards a general theory of geographic representation in GIS. *Int. J. Geogr. Inf. Sci.*, **21**, 239–260, doi:10.1080/13658810600965271.

Heuvelink, G. B. M., 1998: *Error Propagation in Environmental Modelling with GIS*. Taylor and Francis, 146 pp.

Jelinski, D. E., and J. Wu, 1996: The modifiable areal unit problem and implications for landscape ecology. *Landscape Ecol.*, **11**, 129–140, doi:10.1007/BF02447512.

Kelly, D. L., J. T. Schaefer, R. P. McNulty, C. A. Doswell III, and R. F. Abbey Jr., 1978: An augmented tornado climatology. *Mon. Wea. Rev.*, **106**, 1172–1183, doi:10.1175/1520-0493(1978)106<1172:AATC>2.0.CO;2.

Marsh, P. T., and H. E. Brooks, 2012: Comments on “Tornado risk analysis: Is Dixie Alley an extension of Tornado Alley?” *Bull. Amer. Meteor. Soc.*, **93**, 405–407, doi:10.1175/BAMS-D-11-00200.1.

Meyer, C. L., H. E. Brooks, and M. P. Kay, 2002: A hazard model for tornado occurrence in the United States. *16th Conf. on Probability and Statistics*, Orlando, FL, Amer. Meteor. Soc., J3.6. [Available online at https://ams.confex.com/ams/annual2002/techprogram/paper_27595.htm.]

Mitas, L., and H. Mitasova, 1999: Spatial interpolation. *Geographical Information Systems: Principles, Techniques, Management and Applications*, P. Longley et al., Eds., Wiley, 481–492.

Openshaw, S., 1984: *The Modifiable Areal Unit Problem*. Geo Books, 40 pp.

Paul, F., 2001: A developing inventory of tornadoes in France. *Atmos. Res.*, **56**, 269–280, doi:10.1016/S0169-8095(00)00077-6.

Ray, P. S., P. Bieringer, X. Niu, and B. Whissel, 2003: An improved estimate of tornado occurrence in the Central Plains of the United States. *Mon. Wea. Rev.*, **131**, 1026–1031, doi:10.1175/1520-0493(2003)131<1026:AIEOTO>2.0.CO;2.

Schaefer, J. T., D. L. Kelly, and R. F. Abbey Jr., 1986: A minimum assumption tornado-hazard probability model. *J. Climate Appl. Meteor.*, **25**, 1934–1945, doi:10.1175/1520-0450(1986)025<1934:AMATHP>2.0.CO;2.

Silverman, B. W., 1986: *Density Estimation for Statistics and Data Analysis*. Chapman and Hall, 176 pp.

Smith, B. T., R. L. Thompson, J. S. Grams, C. Broyles, and H. E. Brooks, 2012: Convective modes for significant severe thunderstorms in the contiguous United States. Part I: Storm classification and climatology. *Wea. Forecasting*, **27**, 1114–1135, doi:10.1175/WAF-D-11-00115.1.

Standohar-Alfano, C. D., and J. van de Lindt, 2014: Empirically based probabilistic tornado hazard analysis of the United States using 1973–2011 data. *Nat. Hazards Rev.*, **16**, 04014013, doi:10.1061/(ASCE)NH.1527-6996.0000138.

Thom, H. C. S., 1963: Tornado probabilities. *Mon. Wea. Rev.*, **91**, 730–736, doi:10.1175/1520-0493(1963)091<0730:TP>2.3.CO;2.

Tyrrell, J., 2003: A tornado climatology for Ireland. *Atmos. Res.*, **67–68**, 671–684, doi:10.1016/S0169-8095(03)00080-2.

Verbout, S. M., H. E. Brooks, L. M. Leslie, and D. M. Schultz, 2006: Evolution of the U.S. tornado database: 1954–2003. *Wea. Forecasting*, **21**, 86–93, doi:10.1175/WAF910.1.

Widen, H. M., T. Fricker, and J. B. Elsner, 2015: New methods in tornado climatology. *Geogr. Compass*, **9**, 157–168, doi:10.1111/gec3.12205.

Yamada, I., and P. A. Rogerson, 2003: An empirical comparison of edge effect correction methods applied to *K*-function analysis. *Geogr. Anal.*, **35**, 97–109, doi:10.1111/j.1538-4632.2003.tb01103.x.

Yuan, M., 1998: Representing spatiotemporal processes to support knowledge discovery in GIS databases. *Proceedings of the 8th International Symposium on Spatial Data Handling*, T. K. Poiker and N. Chrisman, Eds., International Geographic Union, 431–440.

Yuan, M., M. Dickens-Micozzi, and M. A. Magsig, 2002: Analysis of tornado damage tracks from the 3 May tornado outbreak using multispectral satellite imagery. *Wea. Forecasting*, **17**, 382–398, doi:10.1175/1520-0434(2002)017<0382:AOTDTF>2.0.CO;2.