## Abstract

Categorization of storm surge with the Saffir–Simpson hurricane scale has been a useful means of communicating potential impacts for decades. However, storm surge was removed from this scale following Hurricane Katrina (2005), leaving no scale-based method for storm surge risk communication despite its significant impacts on life and property. This study seeks to create a new, theoretical storm surge scale based on fiscal damage for effective risk analysis. Advanced Circulation model simulation output data of maximum water height and velocity were obtained for four storms: Hurricane Katrina, Hurricane Gustav, Hurricane Ike, and Superstorm Sandy. Four countywide fiscal loss methods were then considered. The first three use National Centers for Environmental Information Storm Events Database (SED) property damages and Bureau of Economic Analysis (BEA) population, per capita personal income, or total income. The fourth uses National Flood Insurance Program total insured coverage and paid claims. Initial correlations indicated the statistical mode of storm surge data above the 90th percentile was most skillful; this metric was therefore chosen to represent countywide storm surge. Multiple linear regression assessed the most skillful combination of storm surge variables (height and velocity) and fiscal loss method (SED property damages and BEA population, i.e., loss per capita), and defined the proposed scale, named the Kuykendall scale. Comparison with the four storms’ actual losses shows skillful performance, notably a 20% skill increase over surge height-only approaches. The Kuykendall scale demonstrates promise for skillful future storm surge risk assessment in the analytical, academic, and operational domains.

## 1. Introduction

When quantifying and communicating natural phenomena, scientists have often employed categorization-based approaches. Storm surge, in particular, has been subject to categorization since the creation of the Saffir–Simpson hurricane scale (SSHS). In 1969, Dr. Herbert S. Saffir first presented his scale linking hurricanes to structural damage. The scale was based on maximum wind speeds, ranging from 33 m s^{−1} to over 69 m s^{−1}, and contained five categories tied to the qualitative amount of damage incurred. Saffir’s scale underwent additional revisions, the most noteworthy being the addition of associated pressure and storm surge values by Dr. Robert Simpson, before taking its final form (Saffir 1973, 2003; Simpson 1974). Storm surge is formally defined as “a rise and onshore surge of seawater as the [result] primarily of the winds of a storm,” so its inclusion in a wind speed–oriented scale made sense (American Meteorological Society 2017).

The SSHS was the standard in storm surge risk communication for decades. However, in 2005, Hurricane Katrina struck the Gulf Coast and became one of the top-three deadliest, costliest, and most intense (in terms of minimum pressure) landfalling hurricanes in the United States (Blake et al. 2011). Despite possessing a category 5 magnitude storm surge, Katrina’s official SSHS rank was only category 3 at landfall. This discrepancy brought the scale’s effectiveness into question (Knabb et al. 2011).

Kantha (2006) was one of the first to propose replacing the SSHS with a new, more comprehensive scale. The new scale included the maximum wind speed, the radius of maximum wind speed, and the translational speed of the overall hurricane. Others soon followed Kantha’s lead, with some developing a scale around the integration of kinetic energy (Powell and Reinhold 2007) and others splitting their scale into separate values for size and intensity (Hebert et al. 2008). None of these scales ever transitioned into operational use. Instead, in 2010 the SSHS was changed to the Saffir–Simpson hurricane wind scale (SSHWS). This version is currently in use and still describes structural damage with the same five categories, but the categories are now solely dependent on maximum wind speed and thus do not include pressure or storm surge estimates (Rappaport and Welshinger 2010). The National Hurricane Center (NHC) stated that the exclusion was due to the dependence of storm surge on a larger number of factors besides maximum wind speed (Schott et al. 2012), such as hurricane size, forward speed, approach angle, local bathymetry, and landfall topography (Jelesnianski 1972; Irish et al. 2008). However, there was now no scale commensurate to the SSHWS to characterize surge, which in the last half century has proven the deadliest component of tropical cyclones and has caused tremendous amounts of damage (Rappaport 2014).

Research into developing a more comprehensive hurricane damage scale continues. Applications of statistics and neural networks led to the development of the Hurricane Impact Level model (Pilkington and Mahmoud 2016), which uses a storm’s pressure, winds, storm surge, and other variables to create a measure of economic impact. This model shows promise for both operational use (Pilkington and Mahmoud 2017a) and for the theoretical assessment of risk (Pilkington and Mahmoud 2017b), but it only offers analysis of potential economic loss on an overall storm-level scale. It is not currently built to provide information about which areas experienced greater economic impacts.

Historically speaking, storm surge has never been assigned an independent scale beyond the SSHS, though there have been attempts to develop one. A year before the switch to the SSHWS, Irish et al. (2009) developed a hydrodynamics-based scale for storm surge, taking into account the factors suggested by Jelesnianski (1972) and Irish et al. (2008). While the scale has an extensive theoretical and mathematical background, its numerical values are not intuitive, and assumptions made during the scale’s derivation have been questioned (Kantha 2010). After the switch to the SSHWS, focus shifted to faster forecasting of storm surge instead of scale development. Irish et al. (2011) took a probabilistic approach, using joint probability statistics and scaling laws to develop storm surge response functions. These functions allow for the rapid creation of storm surge forecasts for a tropical cyclone prelandfall given its current meteorological characteristics. Others built upon this idea, proposing that the probability forecasts for various historic or simulated storms be collected in an archive for comparison with future tropical cyclones (Taflanidis et al. 2013; Condon et al. 2013). The methods developed after the SSHWS switch, while useful for forecasters, offer little help for communicating storm surge damage to the public.

There have, however, been recent advancements in storm surge risk communication. The NHC made the Potential Storm Surge Flooding Map operational for the 2016 hurricane season to spatially represent storm surge risk. The map is created by running the Sea, Lake, Ocean, and Overland Surges from Hurricanes (SLOSH) model multiple times using an approaching hurricane’s characteristics. After accounting for track and intensity errors from historic NHC forecasts, a probabilistic storm surge is created based on the SLOSH model simulations. The method is similar to those proposed by Irish et al. (2011), Taflanidis et al. (2013), and Condon et al. (2013), but the NHC goes one step further and visualizes the results. An example is provided in Fig. 1, which displays storm surge heights that have at maximum a 10% chance of being exceeded. The map can be created for multiple landfall regions, excludes leveed areas, and provides the option to omit natural wetland areas, all useful traits for public understanding and use (NHC 2016). However, the map does not link the probabilistic storm surge heights to resulting damage. Storm surge watches and warnings were also made operational during the 2017 hurricane season, but these tools do not give quantitative measures of storm surge or its potential damages (NHC 2017).

We argue that any scale that attempts to characterize storm surge impacts or risk should have three important characteristics. First, because the scale’s primary function is communication, the scale should reflect the damage done to people, property, or both. Second, the scale should be quantitative to avoid differing interpretations of the damage. Third, the scale should be continuous and avoid saturation issues at the ends of the scale (Kantha 2006). The scale used to define earthquake severity, the Richter scale, has two of these three characteristics because it is a continuous, quantitatively based scale that links the magnitude of an earthquake to the measured amplitude of the waves produced. Unlike the SSHWS, the Richter scale is not limited to integer values, but instead offers continuity by adding a tenths place to the magnitude value. The logarithmic relationship underlying the scale resolves the issue of saturation at higher values (Richter 1935). In part as a result of the Richter scale’s use in postevent analysis, the public has also acquired an intuitive understanding of such logarithmic scales of risk. The Richter scale has many qualities that would be beneficial in a prospective storm surge risk scale.

The storm surge scale proposed here is designed to have the intuitive, continuous, and quantitatively based nature of the Richter scale, as well as the link to a tangible impact of the SSHWS, but with fiscal damage instead of structural damage as the communicated risk. Where the SSHWS links structural damage to observed wind speeds, this surge scale links real fiscal damage estimates to storm surge variables. This storm surge index offers a theoretical new way to categorize and analyze hurricanes alongside the SSHWS.

## 2. Data and methods

### a. The storms

Four storms were used for this study: Hurricane Katrina, Hurricane Gustav, Hurricane Ike, and Superstorm Sandy. These four storms were chosen for the following reasons.

Hurricane Katrina (2005) was chosen because of its devastating storm surge impacts to the Louisiana and Mississippi coasts. Measurements from the Federal Emergency Management Agency (FEMA) indicated a storm surge of 7.3–8.5 m along the western Mississippi coast and 5.2–6.7 m along the eastern Mississippi coast. In Louisiana, storm surge heights ranged from 1.5 to 5.8 m in the areas surrounding Lake Pontchartrain. In New Orleans, the storm surge caused multiple levee failures and widespread flooding across the city (Knabb et al. 2011).

Hurricane Gustav’s (2008) storm surge heights were not as impressive as they were for Katrina, with maximum heights of 3.6–3.9 m recorded in Louisiana and approximately 2.7 m in Mississippi. However, due to its similar landfall location (approximately 110 km west of Hurricane Katrina’s landfall location) this storm offered a chance to study the differences in storm surge impacts for an area that had recently experienced a major hurricane event (Beven and Kimberlain 2009; Knabb et al. 2011).

Hurricane Ike (2008) was chosen because it was another major storm surge event in the Gulf of Mexico but for a different landfall location. The largest storm surge was seen on the eastern side of Galveston Bay in Texas, where estimates measured between 4.5 and 6.1 m on average. Bolivar Peninsula, arguably the most devastated area due to its lack of protections, was exposed to at least 3 m of storm surge across most of the peninsula. In addition, Ike’s landfall location placed it in the middle of an area with extensive shipping and energy infrastructure (Berg 2014).

Superstorm Sandy was chosen because of its unique landfall location. The East Coast of the United States has a large population and infrastructure density, so when Sandy made landfall its storm surge had a widespread impact. The largest storm surge was seen along the New Jersey and New York coastlines, with heights up to 2.7 m in each state (Staten Island and Manhattan for New York, and Monmouth and Middlesex Counties for New Jersey). While these values are moderate compared to storms like Ike and Katrina, significant storm surge events are rarer in the Northeast than along the Gulf Coast. Also, the type and density of infrastructure differs from that observed in the Gulf of Mexico. Finally, the opportunity to include a storm that was posttropical led to its inclusion in this study (Blake et al. 2013).

The focus areas for all four storms were narrowed to the states with the most notable storm surge to remove potential noise and focus the scale on areas with the largest impacts. Louisiana, Mississippi, and Alabama were chosen for Hurricanes Katrina and Gustav. Texas and Louisiana were chosen for Hurricane Ike. New York and New Jersey were chosen for Superstorm Sandy. The reason only four storms were used for this study is discussed later in section 2c(2).

### b. Fiscal loss data

There is currently no established method for quantifying fiscal damages from storm surge, so new methods had to be developed. We developed two methods, each based on a distinct source of information about fiscal damages from flooding. The SED method uses version 3.0 of the National Centers for Environmental Information (NCEI) Storm Events Database (SED) and demographic data from the Bureau of Economic Analysis (BEA), both of which are publicly available. The NFIP method uses data from the National Flood Insurance Program (NFIP), which are not publicly available and were obtained directly from FEMA.

The SED catalogues county-level information on major meteorological events and can be searched for specific dates and event types, such as storm surge or coastal flooding, both of which were used for this study. The formal definitions of storm surge and coastal flooding are nearly identical in the database, but the key difference is whether the storm responsible was of a tropical nature (MacAloney 2016). Sandy was posttropical when it impacted New Jersey and New York, and thus records associated with its storm surge are classified as coastal flooding. Here, we used the property damage estimate from the SED storm surge and coastal flooding records, which is defined as the damage inflicted to personal property and to public infrastructure and facilities. Property damage estimates in the SED are derived from a combination of sources by local National Weather Service (NWS) offices, including emergency managers, the U.S. Geological Survey, the U.S. Army Corps of Engineers (USACE), power utility companies, and newspapers (MacAloney 2016). There are shortcomings of the SED property damage data that must be noted. First, the damage estimates may not reflect actual values of property damage incurred. Second, entries are limited by how NWS offices designate losses. This was problematic for Sandy because the only records of coastal flooding found were for New Jersey; New York only had recorded losses from high winds. Hurricane Katrina had a similar issue for Alabama. There is no doubt that the additional data would have affected the results of this study, but the degree of impact unfortunately cannot be tested. Finally, the focus on property damages means that other, indirect impacts of storm surge are not taken into account.

For a given storm, fiscal damage is expected to increase as population, income, or both increase. Hence, a normalization method is needed to allow fair comparison between counties, which is why the BEA demographic data were used. Population, income per capita, and total income were obtained from the BEA for the same impacted counties and same years of each storm. The fiscal loss measure for the SED method was calculated by dividing SED property damage estimates by each of the three BEA data types, leading to three versions of the SED method: the SED-POP method uses county population, SED-PCPI uses county income per capita, and the SED-TI method uses total county income.

The data used for the NFIP method were the total claims paid as of December 2016 and the total insurance coverage at the time of the respective storm, each aggregated at the county level. A normalized level of damage was calculated by dividing the paid claims by the insurance coverage. One drawback of the NFIP method is that the data do not distinguish between storm surge flooding and freshwater flooding, which is outside the scope of this study. Counties not directly affected by storm surge were discarded, but there is still a possibility of overestimation with this method. Also, NFIP data for only two of the four storms could be acquired: Hurricane Katrina and Superstorm Sandy.

### c. Storm surge data

Two main questions needed answering to determine how best to incorporate storm surge variables into this study: “Which variables of storm surge should be included in the scale?” and “Which model should the data for the variables come from?” The answer to the first question influences the answer to the second, so it is addressed first.

#### 1) Storm surge variables: Height and velocity

Traditionally, storm surge height is the standard metric for communicating possible threats to life and property, but by no means is it the only possible metric. Damage to a structure during a storm surge event is expected to be proportional to the force of moving water on that structure. The force *F* on a solid rectangular body of width *w* and height *h* immersed in a fluid flow, a well-studied problem in fluid mechanics (Fox and McDonald 1978), is given by

$$F = \tfrac{1}{2}\,C_{D}\,\rho\,w\,h\,\upsilon^{2}, \tag{1}$$

where *C*_{D} is the drag coefficient of the body, *ρ* is the density of the fluid, and *υ* is the fluid speed. Equation (1) shows that, as expected, height is an important factor behind the destructive force of storm surge, but it also shows that the force is very sensitive to the velocity of the storm surge. Velocity has yet to be included with height in storm surge threat communication and analysis metrics, but the basic dynamics imply both are important to consider. Therefore, the following five parameter combinations are tested: height only, velocity only, height and velocity, height and velocity squared, and height and velocity cubed. Velocity squared and velocity cubed are explored to see how well storm surge force and power, respectively, are linked to fiscal loss.
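The sensitivity argument can be illustrated with a minimal sketch, assuming the standard drag law *F* = ½ *C*_{D} *ρ* *w* *h* *υ*^{2}. The function name and all numerical values below are illustrative assumptions and are not taken from the study.

```python
def drag_force(c_d, rho, width, height, speed):
    """Drag force (N) on a flat-faced body: 0.5 * C_D * rho * w * h * v**2."""
    return 0.5 * c_d * rho * width * height * speed ** 2


# Illustrative values only (assumptions, not from the study).
C_D = 2.0      # drag coefficient, order-of-magnitude choice for a flat wall
RHO = 1025.0   # seawater density, kg m^-3
WIDTH = 10.0   # wall width, m

base = drag_force(C_D, RHO, WIDTH, height=2.0, speed=1.0)
double_h = drag_force(C_D, RHO, WIDTH, height=4.0, speed=1.0)
double_v = drag_force(C_D, RHO, WIDTH, height=2.0, speed=2.0)

print(double_h / base)  # doubling the height doubles the force
print(double_v / base)  # doubling the speed quadruples the force
```

Under this assumed form, the force is linear in height and quadratic in speed, which is why velocity-squared and velocity-cubed terms are reasonable candidates alongside height.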

#### 2) Storm surge data source: SLOSH versus ADCIRC

Two models were considered for use in this project: the SLOSH model and the Advanced Circulation (ADCIRC) model. SLOSH is a finite-difference, numerical-dynamic model used by the NHC for tropical cyclone storm surge height prediction (Jelesnianski et al. 1992; FEMA–URS–U.S. Army Corps of Engineers 2003). The model uses the equations of motion within a polar frame of reference to generate potential storm surge heights on a polar, continuous grid. However, the polar grids suffer resolution issues at the edges, and thus coastal areas that are in the grid but farther away from the centroid see more error in predictions. More importantly, even though velocity is simulated by the model, only storm surge height predictions are reported. The exploration of storm surge velocity was crucial to this study, so SLOSH was eliminated as a possible storm surge data source.

The ADCIRC model, specifically the ADCIRC 2DDI version, is a two-dimensional, depth-integrated, numerical-hydrodynamic model (Luettich et al. 1992; Blain et al. 1994; USACE 2017). The model utilizes a generalized wave-continuity equation and the momentum balance equations, with a finite-element method in space and a finite-difference method in time, to simulate the water level and velocity. Grids for ADCIRC are external inputs that all share an unstructured and spatially extensive design, covering at maximum the Gulf of Mexico and most of the western Atlantic Ocean in the Northern Hemisphere. The large spatial extent eliminates dependence on approximate boundary conditions, though it is possible to use smaller areas if desired. The grids can be modified to give areas of interest higher resolution for refined local estimates. The high levels of detail achievable by ADCIRC unfortunately make it computationally expensive; hence, it is mostly used for poststorm analysis. The large file sizes associated with the ADCIRC output data are also why only four storms were used in this study (a 2-TB hard drive was necessary to transfer data for three of the four storms). However, ADCIRC does offer maximum depth-averaged water velocities alongside maximum water elevations during the storm of interest, so it was chosen for this study.

The ADCIRC data used for this study were acquired from two sources: the USACE and the ADCIRC website. Data for Katrina came from the ADCIRC website, and details of the run and the data files can be found and downloaded from the Example Problems subsection of the Documentation page. Data for Gustav, Ike, and Sandy came from the USACE; details and data for these storms may be requested directly from the organization.

### d. Data preparation

This section will describe the multiple steps taken to prepare both the storm surge data and the fiscal loss data for subsequent use in the main analysis.

The ADCIRC model output covers a large spatial area, most of which is outside the scope of this study. The majority of fiscal losses from storm surge occur on land, so only the model output needed to represent storm surge inundation, which we define as the water surface height above normally dry ground level (NHC 2013), was extracted. Therefore, at every normally dry ground grid point, the storm surge inundation was computed by subtracting the provided land elevations from the raw maximum water elevation data. The ADCIRC maximum water velocities are depth averaged and, thus, required no adjustment. All grid points associated with open water were removed to limit the areas of focus to the affected dry coastlines. Topologically Integrated Geographic Encoding and Referencing (TIGER)/Line shapefiles for state boundaries (circa 2016) were downloaded from the U.S. Census Bureau and used in the Quantum Geographic Information System (QGIS) program to limit the data to the states of interest for the respective storms. County boundary shapefiles (circa 2016) were then used to label the storm surge grid points with the county they fell within. After these initial steps, it was discovered that some counties had a very small number of grid points experiencing storm surge. A minimum requirement of 50 grid points was set to remove these counties and help limit the amount of noise present in the analysis. Finally, data associated with Orleans Parish in Louisiana were also removed from the study, due to the large levee-protected area within its borders. Upon completion of these preparations, the final numbers of counties remaining for the SED-POP, SED-PCPI, SED-TI, and NFIP methods were 45, 45, 45, and 52, respectively.
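The inundation and filtering steps described above can be sketched as follows. This is a simplified illustration and not the QGIS workflow used in the study: the record layout (dictionary keys), the helper `make_points`, and the rule that a point "experiences surge" when the water elevation exceeds the land elevation are all assumptions. Only the 50-point minimum comes from the text.

```python
from collections import defaultdict

MIN_POINTS = 50  # minimum number of surge-affected grid points per county (from the text)


def county_inundation(points, min_points=MIN_POINTS):
    """Return {county: [inundation depths in m]} for well-sampled counties.

    Each element of `points` is a dict with hypothetical keys:
    "county", "max_water_elev_m", "land_elev_m", and "is_open_water".
    """
    by_county = defaultdict(list)
    for p in points:
        if p["is_open_water"]:
            continue  # keep only normally dry ground
        depth = p["max_water_elev_m"] - p["land_elev_m"]
        if depth > 0:  # assumption: surge is present when water tops the ground
            by_county[p["county"]].append(depth)
    return {c: d for c, d in by_county.items() if len(d) >= min_points}


def make_points(county, n, water, land, open_water=False):
    """Build n identical synthetic grid points (illustration only)."""
    point = {"county": county, "max_water_elev_m": water,
             "land_elev_m": land, "is_open_water": open_water}
    return [point] * n


grid = (make_points("A", 60, 3.0, 1.0)                      # 60 flooded land points
        + make_points("B", 10, 2.0, 1.5)                    # too few points, dropped
        + make_points("A", 5, 3.0, 0.0, open_water=True))   # open water, ignored
result = county_inundation(grid)
print(sorted(result), len(result["A"]))
```

Depth-averaged velocities would pass through the same grouping unchanged, since the text notes they need no elevation adjustment.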

Two steps were taken to process the fiscal loss data. The first was to adjust all data for inflation so that fair comparisons could be made; January 2015 was chosen as the reference date for this study. The second step was to express the financial losses as base-10 logarithms (log_{10}) so that the scale avoids saturation at high values, as in the Richter scale.
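As a minimal sketch of how the normalization and the two processing steps fit together for the SED-POP variant (loss per capita), consider the function below. The name `log_loss_per_capita` and the `inflation_factor` argument are hypothetical; the study does not state which price index was used, so the factor is left as an input.

```python
import math


def log_loss_per_capita(damage_usd, population, inflation_factor=1.0):
    """Return log10 of inflation-adjusted property damage per resident.

    inflation_factor is a hypothetical multiplier that converts event-year
    dollars to reference-date dollars (January 2015 in the text).
    """
    if damage_usd <= 0 or population <= 0:
        raise ValueError("damage and population must be positive to take log10")
    lpc = damage_usd * inflation_factor / population
    return math.log10(lpc)


# Synthetic example: $250 million in damage, 100,000 residents, 20% inflation.
value = log_loss_per_capita(250e6, 100_000, inflation_factor=1.2)
print(round(value, 3))
```

Counties with zero recorded damage cannot be log-transformed directly, so the sketch raises an error for them instead of guessing how they were handled.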

### e. Main analysis procedure

The main analysis of this study comprised three steps: 1) determining the statistical metric used for the storm surge variables, 2) choosing one of the four fiscal loss metrics and one of the five storm surge variable combinations, and 3) creating the storm surge scale.

Step 1 involved independent correlation of the two possible storm surge variables to the four fiscal loss metrics for each individual storm. For both height and velocity, three main representations [all the model output, model output above the 90th percentile, and model output within the interquartile range (IQR)] and three subrepresentations (the mean, median, and mode of each main representation) for a given county were considered, resulting in nine possible storm surge representations (All-Mean, All-Median, All-Mode, >90th-Mean, >90th-Median, >90th-Mode, IQR-Mean, IQR-Median, and IQR-Mode). The three main representations were chosen to determine whether the full range of data, the “worst-case scenario,” or the most common scenario for the surge would offer the best property damage predictability. The three subrepresentations were chosen to determine the most appropriate statistical measure for the surge in a given county. With four storms, two storm surge variables, four fiscal loss metrics, and nine representations, the total number of cases explored was 4 × 2 × 4 × 9 = 288. The goal of this step was to determine which representation would be the most appropriate for the storm surge variables at subsequent steps in the analysis.
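The nine county-level representations can be sketched with the Python standard library. Two choices here are assumptions and are not described in the text: the mode of continuous model output is taken after rounding to a fixed number of decimals (ties go to the smallest value), and percentiles use the "inclusive" interpolation method.

```python
import statistics
from collections import Counter


def binned_mode(values, decimals=2):
    """Mode after rounding; ties are resolved toward the smallest value (assumption)."""
    counts = Counter(round(v, decimals) for v in values)
    top = max(counts.values())
    return min(v for v, n in counts.items() if n == top)


def representations(values, decimals=2):
    """Return the nine summaries keyed like 'All-Mean' or '>90th-Mode'."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    p90 = statistics.quantiles(values, n=10, method="inclusive")[-1]
    subsets = {
        "All": list(values),
        ">90th": [v for v in values if v > p90],
        "IQR": [v for v in values if q1 <= v <= q3],
    }
    out = {}
    for name, subset in subsets.items():
        # A subset can be empty if all values are identical; not handled here.
        out[f"{name}-Mean"] = statistics.mean(subset)
        out[f"{name}-Median"] = statistics.median(subset)
        out[f"{name}-Mode"] = binned_mode(subset, decimals)
    return out


# Synthetic county: heights 1..100 plus two extra points at 97.
heights = [float(i) for i in range(1, 101)] + [97.0, 97.0]
summary = representations(heights)
print(summary[">90th-Mode"])
```

Applying `representations` to both height and velocity for every county, and correlating each summary with each fiscal loss metric per storm, would reproduce the structure of the 288 cases, though not necessarily the study's exact numbers.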

Step 2 used Minitab statistical software to explore linear and multiple linear regression models using the five possible storm surge variable combinations for each of the four fiscal loss metrics. In contrast to step 1, which examined the four storms separately, the step 2 regressions were conducted for all storms at once (hence the number of points in the regressions equals the total number of counties affected by all four storms). Various statistical checks were used to determine the “best” model, including key checks to confirm that the assumptions of linear regression are satisfied, evaluation of statistical significance (i.e., *p* values), and measurement of resolved variance through coefficients of determination *R*^{2}. Equally critical, we evaluated the predictive skill of the resulting statistical models when predicting new data through statistical cross validation, as measured by the predicted coefficient of determination *R*^{2}_{pred}. The results from these checks determined the best storm surge variable(s) and fiscal loss metric for the storm surge scale. A robustness check of the chosen model was also done by removing one storm, rerunning the regression with the remaining three, and then using the new model to predict the removed storm. The *R*^{2} for the removed storm using the new model was compared with the *R*^{2} for that storm using the original model. The model is deemed robust if the *R*^{2} of the new model is only slightly less than the *R*^{2} of the original model.
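The two coefficients of determination can be sketched without Minitab. The sketch assumes that *R*^{2}_{pred} is the leave-one-out, PRESS-based statistic (1 − PRESS/SST); the helper names and the synthetic data are illustrative and are not the study's.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, y):
    """Ordinary least squares with an intercept; rows are predictor tuples."""
    x = [[1.0, *r] for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(k)]
    return solve(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def r2_and_r2_pred(rows, y):
    """Return (R^2, predicted R^2), the latter from leave-one-out refits."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    beta = fit(rows, y)
    sse = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    press = 0.0
    for i in range(len(y)):
        b = fit(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(b, rows[i])) ** 2
    return 1 - sse / sst, 1 - press / sst


# Synthetic (height, velocity) pairs and noisy log10 losses, illustration only.
rows = [(1.0, 0.2), (2.0, 0.5), (3.0, 0.1), (4.0, 0.9), (5.0, 0.4), (6.0, 1.2)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0]
y = [1 + 2 * h + 3 * v + e for (h, v), e in zip(rows, noise)]
r2, r2_pred = r2_and_r2_pred(rows, y)
print(round(r2, 3), round(r2_pred, 3))
```

The storm-level robustness check follows the same pattern, except that all counties of one storm are withheld at once and *R*^{2} is computed on that storm alone.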

Step 3 involved the creation of the storm surge scale itself. The model equation chosen in step 2 served as the basis for the scale. Predicted versus actual losses were then examined for a detailed look at how the scale performed for the individual storms.

## 3. Results

### a. Step 1: Choosing the statistical metric for the storm surge variables

The first eight tables in the online supplemental material display the results of each correlation between the four fiscal loss metrics and the various representations of storm surge height and velocity, and are the source of the values calculated for Tables 1 and 2. The number of data points used for each correlation is equal to the number of counties affected by the given storm for the given fiscal loss metric. Table 1 contains the average of the four storm correlations when using all data, data > 90th percentile, and data within the IQR and shows that using storm surge data > 90th percentile yields the best overall correlation results for both storm surge height and velocity. Table 2 contains the average of the four storm correlations when using the statistical mean, median, or mode of the data > 90th percentile. The statistical mode consistently shows the best correlations for velocity, while the statistical mean shows the best correlations for height in three of the four fiscal loss metrics. Closer inspection of the difference between the mode and mean averages suggests that choosing the mode over the mean preserves higher correlation values overall. The mode of the storm surge data > 90th percentile is thus chosen to represent the storm surge height and velocity from this point onward.

### b. Step 2: Choosing the fiscal loss metric and the storm surge variable combination

We now present the various statistical checks applied to the regressions of the mode of the storm surge data > 90th percentile for the five height and velocity combinations versus the four fiscal loss metrics for all storms at once, which is a total of 20 regressions.

The first statistical check implemented is based on five key assumptions of linear and multiple linear regression:

1) Linear relationship: there must be an inherent linear relationship between each of the predictors (*x*_{1}, *x*_{2}, etc.) and the outcome (*y*). To validate this assumption, the correlation coefficients must be reasonably high.
2) No or little multicollinearity: predictors must not show interdependence. To validate this assumption, the variance inflation factor must be less than 10.
3) Multivariate normality: there must be a normal distribution of residuals. To validate this assumption, a normal probability plot of residuals must show a linear relationship (approximately follow the *y* = *x* line).
4) Homoscedasticity: residuals must have consistent magnitude throughout the regression. To validate this assumption, residuals must be evenly scattered in the vertical across a horizontal zero line in a residuals-versus-fits plot with no signs of fanning outward/inward.
5) No autocorrelation: residuals must be independent of each other. To validate this assumption, the Durbin–Watson statistic must be between 1.5 and 2.5.

If the assumptions are not satisfied, then the model should not be explored further. Analysis of the four fiscal loss methods showed that both the SED-PCPI and NFIP methods were unable to satisfy the assumptions using any of the five storm surge variable combinations. For the SED-POP method, the height-only model was eliminated, while for the SED-TI method the height-only and the height and velocity models were both eliminated. The remaining storm surge variable models for the SED-POP and SED-TI methods were used for the rest of the analysis. Results from the assumptions analysis are also available in the online supplement.
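Two of the numerical thresholds above (variance inflation factor below 10, Durbin–Watson statistic between 1.5 and 2.5) are simple to compute by hand. The sketch below is illustrative and is not the Minitab procedure: it covers only the two-predictor case, where the variance inflation factor reduces to 1/(1 − *r*^{2}) with *r* the correlation between the predictors, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den


# Synthetic predictors and residuals, for illustration only.
height = [1.0, 2.0, 3.0, 4.0]
velocity = [1.0, 3.0, 2.0, 4.0]
vif = vif_two_predictors(height, velocity)
dw = durbin_watson([0.2, -0.1, 0.3, -0.2, 0.1])

print(vif < 10, 1.5 <= dw <= 2.5)
```

A Durbin–Watson value near 2 indicates little first-order autocorrelation; values toward 0 or 4 indicate positive or negative autocorrelation, respectively.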

The second statistical check requires the *p* values of the storm surge variable coefficients to reach a 95% significance level, the results of which are also shown in the supplemental material. All the coefficients of the remaining storm surge model variables for both the SED-POP and SED-TI methods reach the 95% significance level, so none of the remaining models can be eliminated as a result of this statistical check.

Finally, we estimate the actual variance resolved by the statistical model via coefficients of determination. These include the nominal coefficient of determination from the regression *R*^{2} and, arguably more relevant, cross-validated estimates of resolved variance via *R*^{2}_{pred}. Our results are provided in Table 3 and show that a combination of height and velocity for the SED-POP method, in which fiscal loss is measured by loss per capita (LPC), yields the best results (i.e., greatest cross-validated skill); thus, this model was chosen for the final scale. The model is given by

$$\log_{10}(\mathrm{LPC}) = -0.244 + \beta_{h}\,h + \beta_{\upsilon}\,\upsilon, \tag{2}$$

where LPC is in dollars, *h* is storm surge height (m), *υ* is the storm surge velocity (m s^{−1}) (both taken as the mode of data > 90th percentile), and *β*_{h} and *β*_{υ} are the positive fitted regression coefficients for height and velocity. Further comparison of the height-only model to the height and velocity model of SED-POP showed that *R*^{2} and *R*^{2}_{pred} increased from 55.77% to 66.10% and from 51.93% to 62.04%, respectively. The inclusion of velocity thus raises both coefficients of determination by about 10 percentage points, a relative improvement in predictive skill of ~20% (18.5% for *R*^{2} and 19.5% for *R*^{2}_{pred}).

Before creating the scale, the robustness of the model needed to be tested. To do this, the regression was first redone without one of the four storms. The new equation was then used on the removed storm to see if the *R*^{2} values were still comparable to the *R*^{2} achieved using the regression done with all four storms. Results of this analysis (Table 4) show similar values of *R*^{2} even with one storm removed when developing the regression coefficients. Overall, the model equation created with the SED-POP method and storm surge height and velocity appears robust.

### c. Step 3: Scale creation

We chose to have our storm surge scale, named the Kuykendall scale (*K* scale for short, name explained in appendix A), be linear in log_{10}(LPC) so that if *K* increases by 1, then LPC increases tenfold (similar to the Richter scale). Furthermore, we set *K* = 0 to correspond to the lowest possible value for the LPC ($0.01). Hence, *K* = log_{10}(LPC) + 2, and the *K* value corresponds to the number of zeros to the right of $1, including cents. For example, *K* = 2.00 corresponds to an LPC = $1.00, *K* = 5.00 to an LPC = $1,000.00, and so on. Values of the *K* scale are given with two decimal places to address the larger spread of potential loss with higher values of the scale, which is a consequence of the scale’s logarithmic nature.
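The two anchor conditions in this paragraph (*K* = 0 at an LPC of $0.01, and a tenfold increase in LPC per unit of *K*) fix the relationship between *K* and LPC. Written out as a short check against the examples in the text:

```latex
\begin{align}
K &= \log_{10}(\mathrm{LPC}) + 2, \qquad \mathrm{LPC} = 10^{K-2}\ \text{(dollars)},\\
K = 0 &\;\Rightarrow\; \mathrm{LPC} = 10^{-2} = \$0.01,\\
K = 2 &\;\Rightarrow\; \mathrm{LPC} = 10^{0} = \$1.00,\\
K = 5 &\;\Rightarrow\; \mathrm{LPC} = 10^{3} = \$1{,}000.00,\\
\frac{\mathrm{LPC}(K+1)}{\mathrm{LPC}(K)} &= \frac{10^{K-1}}{10^{K-2}} = 10.
\end{align}
```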

Combining Eq. (2) with the definition of the *K* scale gives the final, quantitative relationship for the scale:

a visual representation of which is shown in Fig. 2. The largest calculated mode of storm surge height > 90th percentile was 7.38 m (Harrison County, Mississippi, from Hurricane Katrina), and the largest mode of storm surge velocity > 90th percentile was 1.71 m s^{−1} (St. Bernard Parish, Louisiana, from Hurricane Katrina); these values set the chosen axis ranges. The contours of Fig. 2 have a consistent linear slope as storm surge height and velocity increase. However, note that values of *K* less than 1.756 are not present in the plot because storm surge height and/or velocity would need to be negative to achieve such values. This is not physically possible; thus, according to Eq. (3), losses per capita less than $0.57 are also not possible. Therefore, any values less than *K* = 1.76 (rounded to the hundredths place to mirror *K*-scale format) would mean an LPC = $0.00 (Table 5). In the real data, there was one instance (Cameron County, Texas, during Hurricane Ike) where a *K* value less than 1.76 was observed, and this discrepancy and its implications are discussed later.

An example shows how the scale works. Assume the modes of storm surge height and velocity > 90th percentile are 4.25 m and 1.00 m s^{−1}, respectively, for a given county. Equation (3) gives *K* = 5.78. The projected LPC in dollars is 10^{5.78−2} = 10^{3.78}, or approximately $6,000. Alternatively, one could use the whole integer value of 5 to say the projected LPC is on the order of thousands of dollars. Either way is acceptable, depending on the desired final message. One drawback of the scale’s logarithmic nature is that errors have greater impacts at higher *K* values. For example, a change in *K* from 3.11 to 3.12 results in a loss increase of ~$0.30, while a change from 5.11 to 5.12 results in a loss increase of ~$30.00. While these examples are not extreme, they do highlight the need for accurate measures of storm surge height and velocity when looking for more precise measures of loss.
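The arithmetic in this example can be reproduced with a few lines of Python. The sketch uses only the scale definition (*K* = 0 at an LPC of $0.01, tenfold per unit of *K*) and the *K* values quoted in the text; the function name `k_to_lpc` is a hypothetical helper, and the regression coefficients of Eq. (3) are not needed.

```python
def k_to_lpc(k):
    """Loss per capita in dollars implied by K = log10(LPC) + 2."""
    return 10.0 ** (k - 2.0)

# Worked example from the text: K = 5.78.
example_lpc = k_to_lpc(5.78)

# Sensitivity to a 0.01 change in K at low and high values of the scale.
low_step = k_to_lpc(3.12) - k_to_lpc(3.11)
high_step = k_to_lpc(5.12) - k_to_lpc(5.11)

# Smallest loss per capita the fitted scale can return (K = 1.756).
floor_lpc = k_to_lpc(1.756)

print(f"K = 5.78  -> LPC of about ${example_lpc:,.0f}")
print(f"3.11->3.12 adds about ${low_step:.2f}; 5.11->5.12 adds about ${high_step:.2f}")
print(f"K = 1.756 -> LPC of about ${floor_lpc:.2f}")
```

The same 0.01 step in *K* costs exactly 100 times more at *K* ≈ 5 than at *K* ≈ 3, which is the error amplification described above.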

It is useful to examine the four analyzed storms together to see how the scale performed. Figure 3 is a scatterplot of the actual versus predicted *K*-scale values and shows a good match for Katrina, overestimation for Gustav, slight but consistent underestimation for Ike, and a smaller spread of predicted losses than actual losses for Sandy. The reason for Gustav’s consistent overestimation is likely that its impact area was similar to Katrina’s. Structures lost during Katrina had not been rebuilt by the time of Gustav, and hence the loss is less than predicted. If Gustav had made landfall prior to Katrina, the actual losses likely would have been much higher. The reason for Ike’s consistent underestimation is not as clear but could be linked to discrepancies between reported SED property damages and ADCIRC model output. Closer examination of counties with higher residuals (>1.0) showed that some areas with reported storm surge impacts did not have associated ADCIRC output, which could explain some of the underestimation. Sandy shows a stronger vertical orientation of data in Fig. 3 than any of the other storms, suggesting that fiscal losses from Sandy were more sensitive to storm surge heights and velocities than the *K* scale predicts. The sensitivity could be tied to the infrastructure and/or preparedness differences of New Jersey versus a location along the Gulf Coast. Despite any outliers, we find a significant and positive linear relationship between fiscal loss per capita and both storm surge height and velocity.

Additional spatial plots with the affected counties and their associated *K*-scale values for each storm are provided in appendix B.

## 4. Discussion

### a. Additional findings from steps 1–3

In the correlations for step 1 (first eight tables in the online supplemental material), some negative correlations were seen for Sandy and Gustav. For Sandy, these appeared occasionally in the All and IQR representations for height across all fiscal loss methods but never in the >90th percentile representation. This suggests there are limitations when using the SED, particularly because counties in New York known to have been devastated by storm surge were not included in the database. Gustav also saw negative values, but for all height representations in the SED-PCPI method and in a single instance in each of the SED-POP and SED-TI methods. This may be due to the respective denominators from the BEA, but the data would need further study to evaluate this hypothesis. In general, the correlations with storm surge velocity are more consistently positive than the correlations with storm surge height, regardless of the method or representation selected.

In the regressions for step 2, the model skill for velocity is consistently higher than for velocity squared and velocity cubed. Height combined with velocity squared is related to force, while height combined with velocity cubed is related to power. However, as the exponent on velocity increases, the *R*^{2} and *R*^{2}_{pred} values appear to decrease asymptotically to the values seen in the height-only model. One possible explanation is the inherent nature of the fiscal losses. Log transformation of the original fiscal losses was done to ensure behavior similar to the Richter scale. Initial explorations of the step 1 correlations without the log transformation yielded poorer results, implying that the fiscal losses are not linear in terms of surge height and velocity. The log transformation appears to have captured this nonlinearity, which may be why further regression using other nonlinear terms, such as velocity squared or cubed, produced poorer results. A redone step 2 analysis without the log transformation would be useful in exploring the potential skill of these nonlinear terms.
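One way to read the force and power interpretation is through standard flux arguments for an idealized, depth-uniform flow. The water density *ρ* and the per-unit-width framing below are assumptions introduced for illustration and are not stated in the text:

```latex
\begin{align}
\text{momentum flux (force per unit width):} \quad & F' \sim \rho\, h\, \upsilon^{2},\\
\text{kinetic energy flux (power per unit width):} \quad & P' \sim \tfrac{1}{2}\,\rho\, h\, \upsilon^{3}.
\end{align}
```

Under these assumptions, the squared and cubed velocity terms tested in step 2 stand in for the load a surge exerts on a structure and the rate at which it delivers energy, respectively.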

In the robustness test from step 2, removing Hurricane Gustav from the regression counterintuitively improved the model performance. It is possible that Gustav is a low outlier among other landfalling tropical systems due to its occurrence in the same geographic location but only three years after Katrina. Inclusion of more historic and recent storms (e.g., Hurricanes Harvey and Irma) in future research could determine if Gustav is truly an outlier and should be removed from the development process of the *K* scale.

Finally, in the creation of the scale from step 3, one county from Hurricane Ike, Cameron County, was observed with a *K* value less than the minimum value of Eq. (3). Closer examination of the county was conducted, but no significant protections (such as the levees in Orleans Parish) were found, and the county fulfilled all remaining criteria for use in this study. Since this county could not be removed, its implications must be considered carefully. We hypothesize that this county may be one of several such cases currently not covered by the *K* scale and that an increased sample size is necessary to bring the first coefficient in Eq. (3) (the minimum value of the scale) closer to zero to account for them. This would better align the scale’s equation with the design shown in Table 5 for smaller values of loss.

### b. Other storm surge variables

Storm surge height and velocity were considered for this study, but other variables might also increase the scale’s predictive skill. One variable considered, but ultimately not explored, was the duration of the storm surge in an area. Other variables not yet considered may also add skill to the *K* scale.

### c. Further refinement of impacted area representation

There are a number of ways in which the spatial representation of areas impacted by storm surge could be improved for the *K* scale. In terms of resolution, counties were used for this study, but smaller geographic units such as zip code tabulation areas or even individual structures could be explored as well. Higher resolution would also require refinement of the population estimates used in the *K* scale. Population density maps could be compared to the areal spread of a given storm surge to achieve finer estimates of the impacted population. Finally, different landfall locations have different levels of experience with hurricanes, which leads to differences in protections (e.g., the seawall in Galveston or the levees in New Orleans) and preparedness. A quantification method for protection and preparedness would be necessary first, but their consideration could improve our scale’s predictive performance.

### d. Academic and operational use

The Kuykendall scale has great potential in the world of research. A consistent scale could be used in historical or poststorm analyses of landfalling hurricanes to compare losses from the past to losses of the present. Such a scale would also be an important component of any analysis of future storm surge losses, particularly in a changing climate. Studies of the impacts of sea level rise on storm surge have already been conducted, but the scale could help bolster that kind of analysis and make the results even clearer to a general audience (Reed et al. 2015).

The *K* scale requires refinement and further testing before it could be considered operational, but its use can be envisioned in a multitude of settings. The most immediately feasible use would be in risk analysis, via historical storm analysis and archiving in particular. Insurance and reinsurance providers would greatly benefit from a metric that could determine which areas are at greatest risk for storm surge fiscal loss based on historical hurricane landfalls. This kind of analysis would also assist emergency managers in preparing and then delegating their efforts toward their most at-risk communities. Analogs based on archived storms would also provide context for improved understanding of the potential impacts of an approaching storm.

Risk communication could also benefit from the *K* scale in the future. As was shown in the example from section 3c, the *K*-scale values can be interpreted with different levels of precision based on the desired message, which then could be used by various government officials, broadcast networks, etc. The *K*-scale value could also be shown alongside the SSHWS to better illustrate the expected main impact of the approaching storm (greater wind impact than storm surge impact, or vice versa). The dollar sign is eye-catching, and the monetary losses can be compared to personal income to make individual decisions and preparations. However, as was also mentioned in section 3c, errors in storm surge variable measurement lead to errors in potential loss, and this is especially impactful for larger *K* values due to the logarithmic design of the scale. Higher levels of caution are therefore necessary when interpreting and communicating the potential losses calculated by the *K* scale at higher levels of precision. We advise that for communication-oriented uses, the *K* scale should first be represented as a whole number with a magnitude of impact, such that a 3 represents tens of dollars of loss, a 4 represents hundreds, a 5 represents thousands, and so on. Additional decimal places and their more detailed potential losses could then be shown, depending on the desired final message. We also recommend that the *K* scale first be tested to assess the public’s understanding, perceptions, and reactions before implementation as an operational tool.
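The whole-number presentation recommended above can be sketched as a small lookup. Only the labels for 3, 4, and 5 come from the text; the remaining entries extend the same power-of-ten rule and, like the names `MAGNITUDE_WORDS` and `describe_k`, are illustrative assumptions.

```python
import math

# Order-of-magnitude wording for the whole-number part of K.
# Entries for 3, 4, and 5 follow the text; the others apply the same rule.
MAGNITUDE_WORDS = {
    0: "cents",
    1: "tens of cents",
    2: "dollars",
    3: "tens of dollars",
    4: "hundreds of dollars",
    5: "thousands of dollars",
    6: "tens of thousands of dollars",
    7: "hundreds of thousands of dollars",
}

def describe_k(k):
    """Return a coarse, communication-oriented message for a K-scale value."""
    whole = math.floor(k)
    # Fall back to an explicit power of ten outside the tabulated range.
    label = MAGNITUDE_WORDS.get(whole, f"10^{whole - 2} dollars")
    return f"K = {whole}: loss per capita on the order of {label}"

print(describe_k(5.78))
print(describe_k(3.11))
```

Truncating rather than rounding keeps the message consistent with the worked example in section 3c, where *K* = 5.78 is reported as a 5.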

The communication potential of the *K* scale is also dependent on it being operable in a real-world time frame, which it is not capable of in its current form. At the time of this study, neither SLOSH nor ADCIRC could be used to generate the *K* scale on an operational timeline. The SLOSH model is able to run within an operational time frame, and while it does simulate storm surge velocity, that parameter is currently not available as SLOSH output. The ADCIRC model produces outputs of storm surge velocity, but it is too computationally expensive to be used operationally. A balance between the two models is necessary first. This could be achieved via modifications to one of the existing models or via new, more computationally efficient models (e.g., Mandli 2013). Creation of such a model would be beneficial for the future use of the *K* scale and more broadly for the academic, analytical, and operational domains.

## 5. Conclusions

The goal of this study was to create a fiscally based scale for tropical cyclone storm surge with the capacity to evaluate potential losses in a skillful, readily communicated manner. The scale builds on concepts from two already well-established scales: the Saffir–Simpson hurricane wind scale (SSHWS) and the Richter scale. The emphasis of the SSHWS on damage motivated the fiscal basis of the scale. The emphasis of the Richter scale on a nonsaturating measure of damage influenced the use of a log_{10} scale as well as additional decimal places for added precision of loss estimates.

The ADCIRC model was used in the study to explore the possible inclusion of storm surge velocities as well as surge heights in defining a risk scale. Four different county-level fiscal loss methods were also explored: three relying on combinations of NCEI SED property damages and BEA population, per capita personal income, or total income data, and another utilizing NFIP insured coverage and paid claims data. Data were preprocessed to maximize the signal-to-noise ratio, to ensure that storm surge inundation is measured, to remove Orleans Parish because of its large levee-protected area, to account for inflation, and to apply the log transform to the fiscal loss metrics. Among the various metrics of storm surge variables considered, the mode of the storm surge data above the 90th percentile was found to yield the strongest relationship with fiscal loss. Among the various fiscal loss metrics and storm surge variable combinations considered, loss per capita was found to have the strongest cross-validated relationship when both storm surge height and velocity were used as predictors.

The logarithmic basis of the storm surge scale, named the Kuykendall scale or *K* scale, makes communication of the scale’s meaning straightforward: every integer increase in *K* leads to a tenfold increase in loss per capita. Comparison with the real data from the four storms revealed that, while there was variance within the individual storms, the *K* scale was able to approximately capture the actual losses and the relationship of increasing LPC with increasing storm surge height and velocity. The *K* scale is the first known incorporation of both storm surge height and velocity into a single metric describing storm surge impacts.

The *K* scale has potential for further improvement and future use. The addition of more storm data and storm surge variables, as well as refinements to resolution and impacted populations, exploration of nonlinear relationships, and consideration of the landfall location’s protections are all possible areas for further exploration and potential refinement of the scale. After some real-world testing, the *K* scale could find important applications in academic, analytical, and operational domains.

## Acknowledgments

The author thanks Chris Massey of the USACE and Joseph Nimmich, Elizabeth Asche, and Amanda Pieschek of FEMA for their respective contributions to this research project. This research was supported by Penn State’s Center for Solutions to Weather and Climate Risk and the Penn State Earth Systems Science Center, which is part of the Penn State Earth and Environmental Systems Institute.

### APPENDIX A

#### Kuykendall Scale

The name “Kuykendall” was given to the storm surge scale presented in this study to honor the lead author’s (AW) aunt, Paula Kuykendall. She watched over AW and her younger brother while both parents were out of the country during Hurricane Ike in 2008. Mrs. Kuykendall had no prior experience with hurricanes and so overprepared for the coming storm, despite AW’s insistence at the time that a category 2 storm was not worth the worry. The preparation proved beneficial during the following week-and-a-half without power. This experience taught AW valuable first-hand lessons about hurricane risk communication and the consequences of risk misrepresentation.

### APPENDIX B

#### Spatial Representation of *K*-Scale Values

Figures B1–B4 show spatial representations of *K*-scale values for the four storms used in this study.

## REFERENCES

*Introduction to Fluid Mechanics*. 2nd ed. John Wiley and Sons, 683 pp.

*Tropical Meteorology Special Symp./19th Conf. on Probability and Statistics*, New Orleans, LA, Amer. Meteor. Soc., JP1.4, https://ams.confex.com/ams/88Annual/techprogram/paper_135054.htm.

*Storm Data* preparation. NWS Instruction 10-1605, 97 pp., http://www.nws.noaa.gov/directives/sym/pd01016005curr.pdf.

*64th Interdepartmental Hurricane Conf.*, Savannah, GA, Office of the Federal Coordinator for Meteorology, http://documentslide.com/documents/64th-interdepartmental-hurricane-conference-56ec5f412f3c3.html.

*Hurricane! Coping with Disaster: Progress and Challenges since Galveston 1900*, R. Simpson et al., Eds., Amer. Geophys. Union, 155–164.

## Footnotes

Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/WAF-D-17-0174.s1.

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).