## Abstract

The statistical relationship between elevation roughness and tornado activity is quantified using a spatial model that controls for the effect of population on the availability of reports. Across a large portion of the central Great Plains the model shows that areas with uniform elevation tend to have more tornadoes on average than areas with variable elevation. The effect amounts to a 2.3% [(1.6%, 3.0%) = 95% credible interval] increase in the rate of a tornado occurrence per meter of decrease in elevation roughness, defined as the highest minus the lowest elevation locally. The effect remains unchanged if the model is fit to the data starting with the year 1995. The effect strengthens for the set of intense tornadoes and is stronger using an alternative definition of roughness. The elevation-roughness effect appears to be strongest over Kansas, but it is statistically significant over a broad domain that extends from Texas to South Dakota. The research is important for developing a local climatological description of tornado occurrence rates across the tornado-prone region of the Great Plains.

## 1. Introduction

A tornado is a rotating column of air swirling upward from the surface and extending from a cumuliform cloud. The strongest tornadoes develop under rotating thunderstorms (i.e., supercells). Not all supercells produce tornadoes. This fact suggests that tornado initiation is sensitive to an interplay of many processes across a range of spatial scales, including the scale of a few kilometers at which the flow is described as a converging, swirling plume (Lewellen et al. 2000). Research shows that the underlying surface can affect this convergent inflow (Bluestein 2000; Dunn and Vasiloff 2001; Prociv 2012). A statistically significant decrease in the number of intense tornadoes with an increase in topographic variability is found across the eastern two-thirds of the United States (Karpman et al. 2013), but it is not clear to what extent population variability confounds this effect.

Research also shows that surface roughness can affect inflow—in particular, the velocity distribution and flow curvature (Lewellen 1962; Davies-Jones 1973; Dessens 1972; Leslie 1977). Rough terrain reduces the tangential velocity (Leslie 1977). An experimental study argues, however, that the amount of roughness used in these studies is outside the range encountered in regions where tornadoes most often occur (Church et al. 1979). A study by Jagger et al. (2015) for Kansas found fewer tornadoes in counties with greater elevation variation. They used elevation standard deviation computed within each county as a metric for elevation roughness and found a 1.8% reduction in the number of tornadoes for every 1-m increase in roughness.

In this paper, we examine the finding of Jagger et al. (2015) in more detail by 1) aggregating the tornado paths to cell counts over varying grid resolutions and domains, 2) applying alternative definitions of elevation roughness, 3) examining the effect for different ratings on the enhanced Fujita scale (EF), and 4) excluding early data records. The main question posed here is, How general is the elevation-roughness effect? The approach is ecological, fitting spatial statistical models to aggregate data rather than to individual tornadoes. An important application of the model is local tornado climatological behavior. Methods are described in section 2, including the data sources, the procedure to aggregate the data to grid cells, and the spatial model. Results are presented in section 3. Raw and corrected tornado rates are compared. The magnitude and uncertainty of the elevation-roughness effect are quantified. The robustness of the effect is examined in section 4. The paper ends with a short summary of the main findings and a discussion of the results.

## 2. Methods

### a. Data

The large-scale climatological behavior of tornadoes has been extensively studied (e.g., Brooks et al. 2003; Dixon et al. 2014). Tornadoes are rare at any one location, however, making it difficult to uncover small-scale relationships like those related to variations in elevation or other terrain features. The uneven quality of tornado records (Verbout et al. 2006; Diffenbaugh et al. 2008; Brooks 2013) confounds simple grouping and averaging methods. Therefore, in this study we use a spatial model that is capable of statistically controlling for the inconsistencies in the data quality. We apply the model to data aggregated over various domains and at various spatial scales.

Analysis and modeling are performed using the open-source R language for statistical computing and freely available government data, including tornado reports from the National Weather Service’s Storm Prediction Center (SPC), population and administrative boundaries from the U.S. Census Bureau, and elevation from NASA’s Shuttle Radar Topography Mission. We overlay straight-line tornado paths, population, and elevation roughness onto a regular latitude–longitude grid covering the central Great Plains between 95° and 102°W longitude and between 36° and 42°N latitude. The choice is based on our interest in the effect of elevation variation separate from the effects of other surface-roughness features like forests or other vegetation.

The data used in the study and the code to create the figures and results are archived online (https://github.com/jelsner/Roughness). The SPC database, which contains all reported tornadoes in the United States over the period 1950–2014, was obtained from the Internet (http://www.spc.noaa.gov/gis/svrgis/zipped/tornado.zip). The database is in “shapefile” format, with each tornado provided as a straight-line track. The native coordinate reference system of the tornado database is Lambert conformal conic (LCC). Start locations are recorded to two-digit decimal precision prior to 2009 and four-digit precision afterward. Locations are more accurate later in the record, when estimates were made with GPS. Elevation is obtained from a digital elevation model as a georeferenced tagged image file (TIF; http://www.viewfinderpanoramas.org/DEM/TIF15/) at 3-arc-s resolution (~80 m). The raster dataset contains elevation above mean sea level in meters with an accuracy of ±7 m per pixel. Population is obtained from the “gridded population of the world” dataset, version 3, from the Socioeconomic Data and Applications Center at Columbia University (http://sedac.ciesin.columbia.edu/data/collection/gpw-v3). The dataset contains raw estimates of population density for 2010. Values are given as persons per square kilometer.

### b. Tornado counts by grid cell

We start with a regional domain between 95° and 102°W longitude and between 36° and 42°N latitude (Fig. 1). First, a grid over the domain is created at a spatial resolution of 0.25° in latitude and longitude, resulting in 24 cells in the north–south direction and 28 cells in the east–west direction (672 total). The resolution is chosen as a compromise between too fine and too coarse; cells are chosen to be large enough to capture at least a few tornadoes but small enough to be relevant to the scale of the physical mechanisms involved. Sensitivity of the results to this choice is examined in section 4.

Next, the tornado paths are placed onto the grid, and the number of paths that intersect each cell are counted. The tornado path is a buffered line segment of the straight-line track given in the tornado database. The size of the buffer is determined by one-half of the path width. Path width is given as the average cross-path distance before 1994 and the maximum distance after that year. No adjustment is made for this difference because the effect on the counts is estimated to be less than 1% given the size of the cells relative to the average path width. The overlay is performed after projecting the geographic grid to the LCC coordinate reference system of the tornado database.

There are 6749 tornadoes with paths that intersect the domain, which results in an average of 14.3 tornadoes per 0.25° grid cell with a standard deviation of 7.19 tornadoes (Fig. 2). The variance-to-mean ratio is 3.61. The highest number of tornadoes in any cell is 47 (two cells), and the lowest number is 1 (Fig. 3). There are 81 cells with 12 or 13 tornadoes. Because others have used the Poisson distribution for modeling tornado counts (Tippett et al. 2012, 2014), we show it here for comparison. The Poisson distribution is clearly not the appropriate distribution for our modeling purpose because the observed cell counts have a large spread.

### c. Elevation, elevation roughness, and population density

Values of elevation, elevation roughness, and population data are added to each grid cell. The 80-m-pixel elevation raster is first cropped to the extent of the domain. Elevation roughness is computed as the largest difference in elevation in a pixel relative to the elevation in the eight neighboring pixels (Wilson et al. 2007). The high-resolution values of elevation and elevation roughness are aggregated and interpolated (bilinear) to match the resolution of the cells in the domain. The distribution of roughness values is shown in Fig. 4. Population-density values are similarly cropped, aggregated, and interpolated to match the resolution of the cells in the domain.

Elevation roughness is a physical variable that is related to the spatial variation in elevation and that differs from roughness length, which is the vertical length scale (parameter) of the logarithmic wind profile and which is important for estimating momentum, heat, and mass exchange between the ground and the atmosphere (Toda and Sugita 2003). The roughness length is approximately one-tenth of the height of the surface-roughness elements (WMO 2008). Therefore, in the open areas, without trees or other obstructions, that are typical across most of the study domain, an elevation-roughness value of 15 m is comparable to a roughness length of 0.75 (1.5/2).

The average population density is 11.7 persons per square kilometer, with a standard deviation of 44.07 persons per square kilometer. The correlation (Pearson, here and throughout) between population density and tornado frequency is +0.28 [(+0.21, +0.35) = 95% confidence interval]. The correlation between population density and elevation is −0.21 [(−0.28, −0.14) = 95% confidence interval], indicating, as expected, that relatively fewer people live on the higher plains across the western part of the domain. The correlation between population density and elevation roughness is +0.007 [(−0.07, +0.08) = 95% confidence interval], indicating no relationship between where people live and roughness in this part of the country. Population density and elevation roughness are mapped in Fig. 5.

The relationships between tornado occurrence and population density and between tornado occurrence and elevation roughness across the cells are displayed in Fig. 6. The graphs illustrate that tornadoes are reported with greater frequency in grids with higher population density and are reported with a significantly lower frequency in grids with higher elevation roughness, although there is considerable scatter about these log-linear bivariate relationships. The relationships suggest that further investigation is warranted. In particular, the relationship between tornadoes and elevation roughness needs to account for the relationship between tornadoes and population. Spatial correlation also needs to be considered. Therefore, we choose a model that includes elevation roughness and population density as well as a term for the spatial correlation.

### d. Spatial statistical model

The number of tornadoes in each cell *T*_{s} is assumed to follow a negative binomial distribution (Elsner and Widen 2014) with mean *μ*_{s}. The model is

where NegBin(*μ*_{s}, *n*) indicates that the conditional tornado counts (*T*_{s} | *μ*_{s}, *n*) are described by a negative binomial distribution with mean *μ*_{s} and dispersion *n*. The mean depends on the cell area (exposure) and is linearly related to the fixed and random effects through the logarithmic link function *ν*_{s}. The fixed effects include population density Pop_{s} and elevation roughness ER_{s}. The random effect *u*_{s} follows a Besag formulation (Besag 1975):

where *N* is the normal distribution with mean (1/*m*_{i}) × and variance (1/*m*_{i})*τ*, with *m*_{i} being the number of neighboring cells to cell *i* and *τ* being the precision; *i* ~ *j* indicates that cells *i* and *j* are neighbors.

We assign vague Gaussian priors with known precision to the *β*s. To complete the model, the dispersion *n* is assigned a vague log-gamma prior and the precision *τ* is assigned a vague log-Gaussian prior. The priors and the likelihood are combined with the Bayes rule to obtain the posterior distributions for the model parameters. The integrals cannot be solved analytically, and therefore we use the method of integrated nested Laplace approximation (INLA), which provides a fast alternative to Markov chain Monte Carlo simulation for models that have a latent Gaussian structure (Rue et al. 2009). This is done with functions from the INLA package (Rue et al. 2014).

## 3. Results

### a. Smoothed raw and corrected rates

We first fit the model using only the random-effect term. The result is a map of smoothed report anomalies relative to the regional average (Fig. 7). The map is similar to one obtained using kernel-density estimation (Dixon et al. 2014). Brownish colors indicate cells with tornado rates above the domain average, and blue shades indicate cells with rates below the average. Plots in the top and right margins of the map show the cumulative anomalies in the east–west and north–south directions, respectively. The pattern features regions of above-average activity in south and west-central Kansas and south-central Nebraska. A region of below-average activity is noted over west-central Nebraska.

Next we add population density (log base 2) and elevation roughness to the model as fixed effects. We tested elevation as a fixed effect but found it to be insignificant. The deviance information criterion (DIC) for the model with the two fixed effects is 4035, which compares with 4149 for the random-effects-only model. DIC measures the relative quality of a statistical model given the set of data. The smaller the DIC is, the better is the model. Taken individually, population density reduces the DIC to 4055 and elevation roughness reduces it to 4118. Thus, it is clear that population density and roughness significantly improve the model.

The random-effect term is the tornado anomaly *after* controlling for population and roughness (Fig. 7b). The random effect is the best guess at regional tornado activity independent of population and roughness (nonclimatic influences). Values are again expressed as a percent difference from the regional average. The map features a corridor of above-average activity from north-central Oklahoma northward through west-central Kansas and then northeastward through south-central Nebraska. There is a distinct westward shift in the dominant north–south axis of above-average anomalies relative to the smoothed-report anomalies. This is clear by comparing the plots in the top margins of Fig. 7.

### b. Population and elevation-roughness effects

Magnitudes of the fixed effects are summarized by the corresponding coefficient’s posterior density. The coefficient on the logarithm (base 2) of population density has a posterior mean of 0.1171 [(0.0948, 0.1395) = 95% credible interval] (Fig. 8). This translates to a 12.4% {[exp(0.1171) − 1] × 100%} increase in the tornado rate for a doubling of the population holding elevation roughness constant. This result is consistent with Jagger et al. (2015) who show an 11% increase for a population doubling using population values at the lower-resolution county level from Kansas.

The coefficient on the elevation-roughness term has a posterior mean of −0.0230 [(−0.0301, −0.0159) = 95% credible interval] (Fig. 8b). This translates to a 2.3% increase in the tornado rate for every meter of decrease in roughness, holding population constant. The magnitude of the elevation-roughness effect can be interpreted as follows: a cell with 10 tornadoes and a roughness of 25 m would expect to have on average almost 16 tornadoes if it had a roughness of only 5 m [10(1 + 0.023)^{25−5} = 15.8].

This relationship between tornado frequency and roughness has been quantified for tornado activity using county-level aggregation (Jagger et al. 2015), but here we provide a quantification at a finer spatial scale. The magnitude of the effect is consistent with Jagger et al. (2015), who show a similar 2% increase in tornado occurrence for every meter of decrease in roughness across Kansas, controlling for population, where the “terrain” roughness was defined as the standard deviation in elevation in each county.

## 4. Robustness of the elevation-roughness effect

After establishing a model for tornadoes that includes elevation roughness as a significant fixed effect, we examine by how much the relationship between tornadoes and elevation roughness changes as we adjust the period of record, the EF rating, the spatial resolution, the definition of roughness, and the domain. We start with changing the study period. The number of tornadoes is reduced to 4898 using data only since 1975 and to 2940 using data only since 1995. The magnitude of the roughness term increases slightly for the model fit using the later years but the difference is not large (Table 1). The slight increase might be related to better data quality later in the period and to the fact that the population density is for 2010 only (see, e.g., Elsner et al. 2013a). The ability to spot a distant tornado improves when the landscape is flat. This observational bias could masquerade as an elevation-roughness effect. If this is the case, the data bias should decrease over time with the increasing number of storm spotters and chasers getting close to tornadoes to diminish this observational bias. Because we see that the effect actually increases slightly using data over the most recent period, we conclude that an observational bias related to visibility is not likely to be the explanation.

Next, we examine the roughness effect for subsets of the tornado database stratified by EF rating. The effect increases from a 2.3% increase in the tornado rate per meter of decrease in roughness for all tornadoes to a 3.6% increase in the tornado rate per meter of decrease in roughness for EF3+ (intense) tornadoes. We note that the *difference in effects* between all tornadoes and intense tornadoes is likely not significant, however.

The elevation-roughness effect changes with gridcell size. To demonstrate, we increase the resolution by reducing the cell size from the original 0.25° to 0.125° (2688 cells) and then to 0.0625° (10 752 cells). The model is fit to the data aggregated at these two additional resolutions, and we note that increasing the resolution drops the effect down to 1.7% per meter of decrease in roughness (Table 1). To check whether the elevation-roughness effect continues to decrease with increasing resolution, we treat the set of tornadogenesis locations as a point process. A point process is a realization of spatial locations that can be statistically modeled in a manner similar to data that are aggregated by area [see Elsner et al. (2013b) for an application of a point-process model to tornadoes]. The model has the same form as that used in Eq. (3). We find that the effect is a 1.9% increase in tornado rate per meter of decrease in roughness.

The elevation-roughness effect also depends on the definition of roughness. We show this influence by replacing elevation roughness with an index of ruggedness. The ruggedness index is defined as the mean of the absolute difference between the value of a pixel and the value of its eight surrounding pixels (Wilson et al. 2007). The larger the differences are, the more rugged (less smooth) is the elevation. In this case, the effect increases to 6.5% [(4.5%, 8.4%) = 95% credible interval] increase in the tornado rate per meter of decrease in roughness, which is significantly higher than the effect when using the maximum minus the minimum to define roughness.

The elevation-roughness effect might be specific to the domain chosen. This hypothesis is tested by fitting the model to tornadoes occurring over broader and smaller domains. Here we expand and contract the study domain by 1° to the north, south, east, and west. Within the expanded domain, there are 11 024 tornadoes (1955–2014), and within the contracted domain, there are 3480 tornadoes. The model is fit separately to the data over these domains. Over the larger domain, the magnitude of the roughness effect, as quantified by the posterior mean, is smaller, at a 1.8% increase in the tornado rate per meter of decrease in roughness. Over the smaller domain, the magnitude is larger, at a 3.1% increase in the tornado rate per meter of decrease in roughness. More homogeneous terrain type might explain the larger effect over the smaller domain.

Last, the model is fit to data over an extended domain that includes areas farther north into South Dakota (45°N) and farther south into central Texas (30°N). Over this much-larger area, the elevation-roughness term has a posterior mean of 1.6% [(1.2%, 2.0%) = 95% credible interval] increase in tornado rate per meter of decrease in roughness, consistent with results from the model fit to data over the original, smaller domain. Thus, we conclude that a significant elevation-roughness effect occurs throughout much of the Great Plains, although it appears to be largest in a region centered on Kansas.

## 5. Summary and discussion

Studies about the influence of surface features on tornadoes date back to at least the mid-twentieth century, yet the observational literature on this topic is limited. A recent study by Karpman et al. (2013) shows an elevation influence on tornadoes. That study used relatively coarse data (150 arc s, or ~5 km) and did not control for population. Here we provide a comprehensive treatment of this problem by using the best available data across a large portion of the central Great Plains and by employing a model that controls statistically for population.

Three-quarters of all tornadoes in this domain occur in April–June. On average, at this time of the year, a dryline forms over the high plains and separates moist air originating over the Gulf of Mexico from dry air originating over the southwestern United States and high plateau of Mexico (e.g., Schultz et al. 2007). Thunderstorms, in the form of discrete supercells, tend to form east of the dryline along a roughly north–south axis. This climatological behavior is evident in the smoothed tornado-rate anomalies shown in Fig. 7.

Regions with uniform elevation over this tornado-favorable domain tend to have more tornadoes relative to regions with greater elevation roughness. The effect amounts to a 2.3% increase in the tornado rate (using 0.25° cells) per meter of decrease in elevation roughness. The magnitude of the effect is consistent over time. The effect is stronger for an alternative definition of roughness (ruggedness). The effect is also stronger for the subset of tornadoes with higher damage ratings, but the difference is not significant.

Speculation on the physical interpretation centers on the possibility that elevation roughness reduces the fluxes of angular momentum into the mesocyclones, leading to fewer tornadoes. Although decreasing angular momentum may tend to decrease tornado intensity, some exceptions occur. For example, surface roughness is shown to decrease corner-flow swirl ratio and, in some cases, could cause a vortex flow to make a transition from a medium swirl ratio to a critical swirl ratio (Lewellen et al. 2000). In a flow with critical swirl ratio (i.e., vortex breakdown), the near-surface vortex is strengthened substantially relative to the vortex flow aloft (Fiedler and Rotunno 1986; Lewellen et al. 2000), and the change in swirl ratio could offset the roughness effect. This possibility will need to be tested with a high-resolution dynamical model. In particular, as noted in section 2c, the surface roughness length is different than the elevation roughness used here, and therefore relating the two might help in the design of numerical simulations.

Results are valid only for the domains considered. The significance of the roughness effect will likely be less in regions outside the Great Plains where factors like variations in vegetation might play a role similar to elevation roughness. We did not look at vegetation directly in this study, but vegetation is strongly correlated with elevation in this part of the country. We found that elevation is not a significant factor when included in the model. Further, it is noted that elevation roughness is negatively correlated with the tornado frequency at the county level in South Dakota and Illinois and significantly so for Kansas and Mississippi [see Table 1 of Jagger et al. (2015)], indicating that the results are somewhat general across the eastern half of the United States. Further, the study used aggregate data and so the interpretations do not necessarily apply at the level of individual tornadoes. Last, the effect cannot be extrapolated to infer that tornadoes will never occur where elevation roughness is extreme (see, e.g., Monteverdi et al. 2014). That is, no amount of elevation variation will guarantee safety from a tornado. Tornadoes can and do occur in mountainous regions.

## Acknowledgments

We thank three anonymous reviewers who helped us strengthen the interpretation of the results.

## REFERENCES

*Guide to Meteorological Instruments and Methods of Observation*. 7th ed. World Meteorological Organization, 681 pp. [Available online at https://www.wmo.int/pages/prog/gcos/documents/gruanmanuals/CIMO/CIMO_Guide-7th_Edition-2008.pdf.]