Efficient Probabilistic Prediction and Uncertainty Quantification of Tropical Cyclone–Driven Storm Tides and Inundation

William J. Pringle, Environmental Science Division, Argonne National Laboratory, Lemont, Illinois (https://orcid.org/0000-0002-2877-4812)

Zachary Burnett, NOAA/Coast Survey Development Laboratory, Silver Spring, Maryland, and University Corporation for Atmospheric Research, Boulder, Colorado

Khachik Sargsyan, Sandia National Laboratories, Livermore, California

Saeed Moghimi, NOAA/Coast Survey Development Laboratory, Silver Spring, Maryland (https://orcid.org/0000-0002-5255-7068)

Edward Myers, NOAA/Coast Survey Development Laboratory, Silver Spring, Maryland

Open access

Abstract

This study proposes and assesses a methodology to obtain high-quality probabilistic predictions and uncertainty information of near-landfall tropical cyclone–driven (TC-driven) storm tide and inundation with limited time and resources. Forecasts of TC track, intensity, and size are perturbed according to quasi-random Korobov sequences of historical forecast errors with assumed Gaussian and uniform statistical distributions. These perturbations are run in an ensemble of hydrodynamic storm tide model simulations. The resulting set of maximum water surface elevations is reduced in dimensionality using Karhunen–Loève expansions and then used as a training set to develop a polynomial chaos (PC) surrogate model from which global sensitivities and probabilistic predictions can be extracted. The maximum water surface elevation is extrapolated over dry points incorporating energy head loss with distance to properly train the surrogate for predicting inundation. We find that the surrogate constructed with third-order PCs using elastic net penalized regression with leave-one-out cross validation provides the most robust fit across training and test sets. Probabilistic predictions of maximum water surface elevation and inundation area by the surrogate model at 48-h lead time for three past U.S. landfalling hurricanes (Irma in 2017, Florence in 2018, and Laura in 2020) are found to be reliable when compared to best track hindcast simulation results, even when trained with as few as 19 samples. The maximum water surface elevation is most sensitive to perpendicular track-offset errors for all three storms. Laura is also highly sensitive to storm size and has the least reliable prediction.

Significance Statement

The purpose of this study is to develop and evaluate a methodology that can be used to provide high-quality probabilistic predictions of hurricane-induced storm tide and inundation with limited time and resources. This is important for emergency management purposes during or after the landfall of hurricanes. Our results show that sampling forecast errors using quasi-random sequences combined with machine learning techniques that fit polynomial functions to the data are well suited to this task. The polynomial functions also have the benefit of producing exact sensitivity indices of storm tide and inundation to the forecasted hurricane properties such as path, intensity, and size, which can be used for uncertainty estimation. The code implementing the presented methodology is publicly available on GitHub.

© 2023 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: William J. Pringle, wpringle@anl.gov

1. Introduction

Tropical and subtropical storms build up storm surges that affect populated coastal regions in the United States and internationally. The temporarily higher sea levels from these storm surges result in widespread inundation of coastal low-lying areas, causing flood and wave damage to residential and commercial structures. Storm surges from named storm events are estimated to cause billions of dollars in damages in the United States annually (NCEI 2022). Under the requirements of the Consumer Option for an Alternative System to Allocate Losses (COASTAL) Act, the National Oceanic and Atmospheric Administration (NOAA) is responsible for determining the extent of storm surge and storm tide to inform response and application of relief funding from the Federal Emergency Management Agency (FEMA) after a storm event.

Though not currently in operational use, in this project we are investigating the application of a Hurricane Surge On-demand Forecast System (HSOFS) that could be employed when a tropical cyclone (TC) approaches and makes landfall along U.S. coastlines to provide predictions of TC-driven storm surge and inundation (Vinogradov et al. 2018). HSOFS uses a hydrodynamic storm tide model to simulate coastal water levels and inundation on high-resolution unstructured meshes, which may also be coupled to a wind-wave model to capture wave setup effects (Dietrich et al. 2011; Moghimi et al. 2020). The system would be utilized to produce either 1) near-landfall forecasts for support of recovery and response in the immediate aftermath of TC landfall or 2) hindcasts for allocating flooding-related insurance losses as part of the COASTAL Act (Abdolali et al. 2021). But as with any modeling, the uncertainty in the results is dependent on the uncertainty and accuracy of the input parameters, predominantly those of the TC track, intensity, and size. Therefore, it is becoming increasingly important to provide probabilistic predictions and uncertainty estimates for decision making. Obtaining the probabilistic result makes the predictions more informative and robust and reduces the likelihood of overestimation or underestimation of the severity of storm surge.

However, the complexity of HSOFS leads to a relatively high computational load, limiting the ensemble size achievable in a time- and resource-limited environment. This has been an obstacle in the development of a probabilistic version. In contrast, the National Hurricane Center’s (NHC) Probabilistic Tropical Storm Surge (P-Surge) model (Taylor and Glahn 2008) performs hundreds of ensemble simulations within the allotted 1-h time frame (∼30 min per simulation per CPU) through an ad hoc full factorial perturbation of estimated TC track, intensity, and size errors. P-Surge is based on the Sea, Lake, and Overland Surges from Hurricanes (SLOSH) hydrodynamic code (Jelesnianski et al. 1992), which makes several physical simplifications (Joyce et al. 2019) and employs limited-area meshes for efficiency. Thus, the aim of this study is to develop an efficient ensemble prediction framework (requiring few model simulations) that can be used by the more comprehensive and computationally intensive HSOFS model for accurate near-landfall probabilistic forecasts of TC storm tide and inundation.

To this end, Davis et al. (2010) divided the range of the TC track errors into equal-area bins depending on a user-defined priority level and estimated that 27 ensemble members resolved 90% of inundation. Additionally, Kyprioti et al. (2021a) showed that quasi–Monte Carlo methodologies can be used to improve sampling efficiency of TC parameter errors over the full factorial approach used by P-Surge. However, there may still be limitations in the information available from smaller model ensembles [O(10)] that we aim for here. A possible solution is to seek a surrogate approximation that can be used to rapidly sample a wider distribution of input TC parameters and obtain robust statistical quantities, without having to query and iterate over the costly hydrodynamic model.

Many such surrogate models for storm surge prediction have been proposed, using machine learning techniques such as Gaussian processes (GP; kriging), artificial neural networks (ANN), and convolutional neural networks (CNN), often combined with dimensionality reduction and k-means clustering via principal component analysis (PCA) (e.g., Jia and Taflanidis 2013; Taflanidis et al. 2013; Kim et al. 2015; Hashemi et al. 2016; Lee et al. 2021; Kyprioti et al. 2021b; Plumlee et al. 2021). This approach often involves training a surrogate model using a large ensemble of synthetic TCs that can then be used to predict storm surge based on the current TC parameters as inputs (Taflanidis et al. 2013; Kim et al. 2015; Hashemi et al. 2016; Lee et al. 2021). As noted by Lee et al. (2021), one of the limitations of this approach is that nonlinear interactions of surge with other processes (e.g., astronomical tides, sea level change, and hydrology) are ignored, which could be particularly important for inundation behavior. A potential solution is to generate a new surrogate model for the current TC forecast (e.g., Plumlee et al. 2021) that includes (at least some of) these interactions in the hydrodynamic model, and we seek to do so here considering the interaction between surge and astronomical tides.

In this study we seek a solution that develops a surrogate model on-the-fly to provide both robust statistics and uncertainty information of storm tide and flooding predictions for the current storm. A method potentially well suited to this application is polynomial chaos (PC) theory, which is a convenient means to propagate uncertainties from inputs to outputs of interest for general computational models (Sargsyan 2017). It can further be interrogated to rapidly evaluate moments and sensitivities due to their analytical availability, or quantiles and probability density functions (PDFs) via computationally inexpensive sampling. PC has been recently used for developing probabilistic predictions and analyzing the sensitivity of surge to TC parameters with good success (Sochala et al. 2020; Ayyad et al. 2021). These studies focused only on the surge: the best track hindcast problem (Sochala et al. 2020) and analysis of the sensitivity to idealized TCs approaching perpendicular to the shoreline (Ayyad et al. 2021). In contrast, our focus is on the real-life forecast problem for storm tide and the desire to efficiently train the surrogate model with a small sample set.

Thus, we adopt PC theory and develop strategies around efficient random variable sampling, dimensionality reduction, penalized regression with cross validation, and manipulation of the training set to optimize the setup for PC construction. We evaluate the accuracy of this PC-based surrogate model and demonstrate the reliability of the probabilistic prediction for three historical U.S. landfalling hurricanes (Irma in 2017, Florence in 2018, and Laura in 2020). Statistical quantities and variance-based sensitivities from the PC surrogate can be distributed along with the surrogate itself as a product of the ensemble HSOFS modeling system. The ensemble generation and PC analysis methodology presented in this paper is implemented in an open-source Python framework called EnsemblePerturbation.

2. Methods and experiment

A flowchart of the proposed methodology in this study is shown in Fig. 1. To fully comprehend components of the flowchart we refer the reader to the rest of this section (sections 2a–2d), as well as to the results (section 3) for details. Finally, in section 2e we describe the experiments we conduct to assess the accuracy of different options and evaluate the reliability of the probabilistic prediction.

Fig. 1.

Flowchart of the proposed methodology for efficient probabilistic predictions and uncertainty quantification of TC-driven storm tide and inundation.

Citation: Artificial Intelligence for the Earth Systems 2, 2; 10.1175/AIES-D-22-0040.1

a. Storm surge modeling component of the on-demand system

The hydrodynamic storm tide model is based on the advanced circulation (ADCIRC; Luettich and Westerink 2004) code, which solves the shallow water equations using the continuous Galerkin finite-element method over an unstructured triangular mesh. ADCIRC is also the modeling engine for the Global Surge and Tide Operational Forecast System (https://registry.opendata.aws/noaa-gestofs/). In this study we use version 55 of ADCIRC (Pringle et al. 2021) including both astronomical tides and atmospheric-driven surge, but without coupling to a wind-wave model to capture the wave setup effect. Parametric representations of the TC vortex (based on track advisories provided by the NHC) are used to construct surface wind and pressure forcing driving storm surge in the ADCIRC model. Here, we use the classical symmetrical Holland vortex model (Holland 1980) that is built directly into the ADCIRC code. The unstructured mesh used in this study encompasses the western North Atlantic and Gulf Coast region with 1.81 million vertices, and resolution ranges from roughly 200 m at the coast and overland up to a maximum of 46 km in the open ocean (Technology Riverside Inc. and AECOM 2015). The vertical datum is mean sea level (MSL), and the floodplain extends up to an elevation of 10 m above MSL. Manning’s n friction coefficients, surface canopy coefficients, and surface directional effective roughness lengths based on land-use data are used to account for surface roughness effects on the hydrodynamics and to modify the atmospheric forcing overland, respectively (Technology Riverside Inc. and AECOM 2015). The minimum water depth parameter in the ADCIRC model is set to 0.05 m. Note that this actually allows for the water depth to become as low as 80% of this value, that is, 0.04 m.

b. TC perturbation

In this study, the forecasted TC is perturbed according to historical NHC forecast error statistics of position, intensity, and size (Taylor and Glahn 2008). Positional members are perturbed based on estimated errors for cross track (CT) and along track (AT), whereby CT refers to a perpendicular offset of the forecast track and AT refers to a slowdown or speedup of the TC along the forecasted track. Intensity is described by both maximum sustained wind speed (Vmax) and minimum central pressure (Pc), which are related through the B parameter of the Holland model (Holland 1980), $V_{\max}^2 = B(P_b - P_c)/(e\rho)$, where e is the base of natural logarithms, ρ is the air density, and Pb is the background (ambient) air pressure. We choose Vmax as the independent perturbed variable, and Pc becomes a dependent variable determined by keeping the B parameter consistent with the original forecast. The size of the TC is changed by perturbing the radius of maximum wind (Rmax).
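The relation above can be inverted to recompute Pc for a perturbed Vmax while holding B fixed at its original-forecast value. A minimal sketch (the air density and background pressure values here are assumptions for illustration, not taken from the paper):

```python
import math

RHO_AIR = 1.15       # air density (kg m^-3); assumed value
KT_TO_MS = 0.514444  # knots to m/s

def holland_b(vmax_kt: float, pc_mb: float, pb_mb: float = 1013.0) -> float:
    """Holland B parameter from Vmax (kt) and central/background pressures (mb)."""
    vmax = vmax_kt * KT_TO_MS
    dp = (pb_mb - pc_mb) * 100.0  # pressure deficit, mb -> Pa
    return vmax**2 * math.e * RHO_AIR / dp

def pc_from_vmax(vmax_kt: float, b: float, pb_mb: float = 1013.0) -> float:
    """Invert the relation: central pressure (mb) for a perturbed Vmax at fixed B."""
    vmax = vmax_kt * KT_TO_MS
    dp = vmax**2 * math.e * RHO_AIR / b  # Pa
    return pb_mb - dp / 100.0
```

For example, computing B from the original forecast and then calling `pc_from_vmax` with a perturbed Vmax yields the consistent dependent Pc; a stronger Vmax lowers Pc, as expected.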

Perturbation of CT, AT, and Vmax across the whole forecast is achieved through a lookup table of historical NHC mean absolute forecast errors for certain lead times, distributed with the P-Surge model (Penny and Cangialosi 2019; included here in Tables A1–A3). TCs are divided into three intensity bins based on the initial 0-h Vmax due to different error statistics between these categories. For instance, the mean absolute CT error is 11.6 n mi (1 n mi = 1.852 km) at the 12-h lead time and 27.8 n mi at the 48-h lead time for the medium-strength TC (50–95 kt; 1 kt ≈ 0.51 m s−1). In this way if we perturb the CT variable by one “mean absolute error” the track will be offset a perpendicular distance from the original position of 11.6 n mi at the 12-h mark and 27.8 n mi at the 48-h mark. The same idea is true for AT and Vmax. The CT, AT and Vmax errors are treated as Gaussian random variables (μ = 0, σ = 1) whereby the mean absolute error is defined as 0.7979σ (Gonzalez and Taylor 2018). A negative value will perturb the CT and AT in one direction and a positive value in the other; similarly for Vmax, a negative value corresponds to an intensity underestimate of the forecast and vice versa for a positive value (see Fig. 2 for an illustration of ±1σ perturbations). Note that we enforce [15, 175] kt bounds on Vmax based on the lowest values we found in the NHC data and the highest values of historically recorded TCs in the North Atlantic.

Fig. 2.

Example perturbations of an NHC advisory (Hurricane Florence) along the forecast. The ±1σ perturbations for the Gaussian-distributed (a) CT, (b) AT, and (c) maximum sustained wind speed (Vmax) errors, and ±0.5 perturbations of the uniformly distributed (d) radius of maximum wind (Rmax) errors.


For Rmax the idea is similar to the other variables except that at each lead time the perturbation is only from the initialized 0-h value as NHC does not provide estimates of Rmax along the forecast. Unlike the other variables, the Rmax error is bounded and treated as a random variable with a uniform distribution (∈[−1, 1]) where the upper and lower error bounds at each lead time are found through a lookup table with the TCs divided into five size bins based on the initialized 0-h Rmax (Table A4). We determined these upper and lower error bounds by linearly extrapolating the values used by P-Surge at the 15th, 50th, and 85th percentile to the 0th and 100th percentile. [Referring to Table A4, an initially small storm is skewed toward having negative Rmax forecast errors (becoming larger along the forecast), and vice versa for initially large storms (see Fig. 2 for an illustration of ±0.5 perturbations).] Note that we enforce [5,200] n mi bounds on Rmax.

c. Ensemble generation

An ensemble of TC forecasts is generated by equally sampling the random variables (CT, AT, Vmax, and Rmax errors) based on the probabilistic property of each variable for forward uncertainty propagation analysis (section 2d). The idea is to build a surrogate model based on relatively few samples [O(10)], which can be then trivially queried to generate the probabilistic forecast, as well as conduct a global sensitivity analysis or obtain a forecast given user-defined values of the TC error variables. In contrast, P-Surge employs an ad hoc full factorial sampling methodology whereby seven perturbations of AT (slow to fast) and three of Vmax and Rmax (weak/large, “medium,” and strong/small) are used, along with enough perturbations of CT to cover 90% of the Gaussian distribution with spacing Rmax at the 48-h forecast (Gonzalez and Taylor 2018). Each possible permutation is used where each TC event is assigned a weight based on the combined probability and the probabilistic result is determined through summation of the weighted model results. This leads to 63 TC events per CT perturbation, or ∼400–900 TC events based on 7–15 CT perturbations (Kyprioti et al. 2021a), which would be prohibitive for HSOFS in a resource and time-limited environment.

We sample the variables using a quasi-random low-discrepancy sequence, of which several are available in the chaospy Python package (Feinberg and Langtangen 2015) employed by EnsemblePerturbation, including widely used Sobol and Halton types—the latter was used by Sochala et al. (2020). Here, we recommend the use of the Korobov sequence (Korobov 1959) because the random variables are sampled symmetrically about zero and cover a predictable range across all variables for any given sample size, which is not the case for the other chaospy sequence implementations. The benefit of such low-discrepancy sequences is avoidance of the “curse of dimensionality” that is associated with quadrature integration, of which the P-Surge methodology could be viewed as a subset. For instance, third-order quadrature integration for the four-dimensional problem requires 4⁴ = 256 samples, as all possible permutations of just four perturbations of each variable are used. Smolyak sparse grid quadrature can be used to alleviate the issue, although ∼150 samples are still required for third-order quadrature. Instead, a low-discrepancy sequence can be used to sample each variable more densely without imposing the condition of using all permutations across the four-dimensional space, as illustrated in Fig. 3. Later we show that these low-discrepancy sequences with sample size O(10) can be used to generate a surrogate model that, due to enhanced regression techniques, is indeed of improved quality over the sparse quadrature.
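A Korobov lattice on the unit hypercube, mapped through the inverse CDFs of the error distributions, can be sketched as follows. The lattice construction and generator value mirror common defaults (e.g., chaospy's) but are assumptions here, and `perturbation_matrix` is an illustrative helper name, not part of EnsemblePerturbation:

```python
import numpy as np
from statistics import NormalDist

def korobov_samples(n: int, dim: int, generator: int = 17797) -> np.ndarray:
    """Korobov lattice of n points in (0, 1)^dim (one common construction).
    n should be chosen so the generator powers are coprime with n + 1
    (true for the sample sizes in the paper, e.g., 19, 39, 59)."""
    vec = np.empty(dim)
    vec[0] = 1
    for j in range(1, dim):
        vec[j] = generator * vec[j - 1] % (n + 1)
    i = np.arange(1, n + 1)[:, None]  # shape (n, 1)
    return (i * vec[None, :] / (n + 1)) % 1.0

def perturbation_matrix(n: int) -> np.ndarray:
    """Map the unit-cube lattice to the 4 TC error variables:
    standard Gaussian for CT, AT, Vmax errors; uniform on [-1, 1] for Rmax."""
    u = korobov_samples(n, 4)
    inv = NormalDist().inv_cdf  # stdlib inverse normal CDF
    out = np.empty_like(u)
    for j in range(3):
        out[:, j] = [inv(v) for v in u[:, j]]
    out[:, 3] = 2.0 * u[:, 3] - 1.0
    return out
```

Because each lattice column is symmetric about 0.5, the Gaussian-mapped columns are symmetric about zero, matching the symmetry property noted above.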

Fig. 3.

Perturbed values of random variables (λ) across the 4D space for (a) sparse quadrature and (b) Korobov sequence sampling methodologies. Sparse quadrature has variable weights attached to each perturbation sample (indicated by marker grayscale and size) while the Korobov sequence is equally weighted (marker colors are used to distinguish unique samples across the panels).


d. Forward uncertainty propagation

We wish to know a probabilistic form of the model output Z = f(λ, x) (maximum water surface elevation in x) that is dependent on the set of input TC error variables, λ = (CT, AT, Vmax, Rmax). However, the underlying hydrodynamic storm tide model is too computationally expensive to sample a large number of times to properly understand the uncertainty and sensitivity of Z to the TC error variables. Therefore, we employ a surrogate approximation g(λ, x) ≈ f(λ, x) through PC theory (Sargsyan 2017; Sochala et al. 2020), which is constructed from a training set (section 2c). The resulting PC surrogate model is then a parametric representation of Z that can be trivially sampled a large number of times, and from which moments and global sensitivities can be analytically extracted. To ensure a suitable training set for generating an accurate PC surrogate (which requires a degree of smoothness) we manipulate Z and apply a dimensionality reduction technique for computational efficiency. These methodologies are outlined in the rest of this section.

1) Model output manipulation

There are two related difficulties that we encounter here with using the maximum water surface elevation for training the PC surrogate: 1) Some mesh points are inundated during some TC events and not in others, and 2) water surface elevation cannot physically go below the ground elevation (water depth must be positive or zero) but the surrogate model cannot be easily constrained to prevent a physically unrealistic negative depth prediction.

In the first problem the intuitive solution is to set Z to that of the ground elevation for mesh points that are not inundated (denoted NaN) in a certain TC event. However, this does not distinguish between TC events where the mesh point of concern might have been close to being inundated or very far from being inundated, resulting in a poor PC fit for predicting inundation. The solution we propose is to artificially extrapolate Z from wet mesh points over dry mesh points for PC surrogate training purposes (Fig. 4); this artificially extrapolated Z can be negative, i.e., below the ground elevation. Here, we use an inverse-distance weighting (IDW) extrapolation (Plumlee et al. 2021) with a hydraulic head loss factor,
$$\tilde{Z}_d = \frac{\sum_{w=1}^{k} \left( Z_{d_w} - h_{d_w} \right) D_{d_w}^{-p}}{\sum_{w=1}^{k} D_{d_w}^{-p}}, \quad (1)$$
where $\tilde{Z}_d$ is the artificial maximum water surface elevation of the dry mesh point $d$, $Z_{d_w}$ is the maximum water surface elevation of the $w$th closest wet mesh point to $d$, $D_{d_w}$ is the distance from mesh point $w$ to $d$, $p$ is the IDW extrapolation order, $k$ is the number of nearest neighbors to use for extrapolation, and $h_{d_w}$ is the head loss (Rucker et al. 2021) from mesh point $w$ to $d$,
$$h_{d_w} = D_{d_w} f_f, \quad (2)$$
where ff is a hydraulic friction factor. This methodology is similar to the weighted k nearest neighbor pseudosurge methodology used by Kyprioti et al. (2021c), in which there are four free parameters of the weighting scheme that require calibration. Ostensibly, the head loss factor we use here adds some physical meaning to the extrapolation. The factor ff can be related to Manning’s equation like in Rucker et al. (2021), requiring the Manning’s n friction coefficient, flow velocity, and flow depth. These flow quantities are not available once Z is extrapolated over the dry regions, so we simply view this relation in terms of guiding ff to a physically reasonable value. In section 3a(2) we compare values for ff, k, and p and how they affect surrogate model prediction accuracy. Note that for some originally dry mesh points it is possible that the extrapolated artificial depths using Eq. (1) will be greater than the minimum allowable water depth (0.04 m), thus become “wet.” Kyprioti et al. (2021c) corrected these mesh points by setting them to a small negative water depth. We did not perform this correction but found that about 0.1%–2% of the mesh points fell into this criterion, depending on the value of ff.
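The extrapolation of Eqs. (1) and (2) can be sketched with a brute-force nearest-neighbor search (at HSOFS scale a KD-tree would be used instead; the function name and defaults are illustrative, and distances and ff are assumed to share the same length unit):

```python
import numpy as np

def extrapolate_dry(xy_wet, z_wet, xy_dry, k=1, p=1, ff=0.0004):
    """Artificial max water surface elevation at dry points via
    inverse-distance weighting with hydraulic head loss [Eqs. (1)-(2)].
    Assumes dry points never coincide exactly with wet points."""
    z_dry = np.empty(len(xy_dry))
    for i, pt in enumerate(xy_dry):
        d = np.linalg.norm(xy_wet - pt, axis=1)
        nn = np.argsort(d)[:k]        # indices of the k closest wet points
        dist = d[nn]
        head_loss = dist * ff         # Eq. (2): h = D * ff
        w = dist ** (-p)              # IDW weights of Eq. (1)
        z_dry[i] = np.sum((z_wet[nn] - head_loss) * w) / np.sum(w)
    return z_dry
```

With k = p = 1 this reduces to the nearest-neighbor case shown in Fig. 4: the dry point inherits the closest wet elevation minus a head loss proportional to the separation distance.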
Fig. 4.

Example of extrapolating maximum water surface elevations (a) over dry mesh points to obtain artificial maximum water surface elevations (b) for a Hurricane Florence training set member. Here, the extrapolation uses nearest neighbor (k = p = 1), and a friction factor, ff = 0.0004. The gray line indicates the coastline, and the shaded gray area indicates land. Mesh points with ocean depths greater than 25 m have been removed from the dataset.


A possible solution to the second problem is to build the surrogate model based on log(H) (Plumlee et al. 2021), where H is the simulated maximum water depth, which is physically always positive, guaranteeing that the surrogate prediction will be positive. However, an issue we find here is that our water surface elevation extrapolation technique proposed above leads to artificial negative water depths for otherwise dry mesh points in the training set. Therefore, H in the training data would need to be modified to be positive for these points by adding a constant, which can be subtracted back from the surrogate prediction. Of course, this means that the surrogate model can actually predict a negative real water depth for such dry mesh points, just as was provided to it for training. We test the accuracy of constructing the surrogate in log space versus linear space in section 3a(2).
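The shift-log transform described above can be sketched as a pair of helpers (the function names and the size of the positive offset are assumptions for illustration):

```python
import numpy as np

def to_log_space(H, shift=None):
    """Shift-log transform of max water depth so that artificially negative
    depths (from the dry-point extrapolation) become positive before taking
    the log. Returns the transformed data and the shift needed to undo it."""
    if shift is None:
        shift = max(0.0, -float(H.min())) + 0.01  # small positive offset; assumed
    return np.log(H + shift), shift

def from_log_space(logH, shift):
    """Invert the transform on a surrogate prediction (may yield a negative
    real depth for dry points, just as in the training data)."""
    return np.exp(logH) - shift
```

The round trip is exact, and any surrogate prediction in log space maps back to a depth no smaller than −shift, mirroring the behavior described in the text.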

2) Dimensionality reduction and PC surrogate

Building a surrogate model for all HSOFS mesh points (1.81 million), or even a subset of points around TC landfall [O(105)], would be prohibitively expensive; therefore, we seek a method to reduce the dimensionality of the problem. Such dimensionality reduction is common practice and has been used for building other surge surrogate models (e.g., Jia et al. 2016; Sochala et al. 2020; Kyprioti et al. 2021b; Lee et al. 2021). Here, we achieve dimensionality reduction via Karhunen–Loève (KL) expansions that are then coupled with PC surrogates.

As before, our model output of maximum water surface elevations, Z = f(λ, x) is dependent on the set of input TC error parameters λ and is spatially varying with x. The KL expansion can be written as
$$Z = f(\lambda, x) = \bar{f}(x) + \sum_{j=1}^{L} \xi_j(\lambda)\,\sqrt{\mu_j}\,\phi_j(x), \quad (3)$$
in terms of uncorrelated, zero-mean, unit-variance random variables $\xi_j(\lambda)$ and eigenvalue–eigenfunction pairs $[\mu_j, \phi_j(x)]$ of the covariance,
$$C(x, x') = E_\lambda\!\left\{[f(\lambda, x) - \bar{f}(x)][f(\lambda, x') - \bar{f}(x')]\right\}, \quad (4)$$
truncated at eigenvalue $L$ (≪ dimensions of $x$) that explains a user-defined level of variance. The expectation $E_\lambda$ indicates averaging across parameter λ, as does the bar symbol, that is, $\bar{f}(x) = E_\lambda[f(\lambda, x)]$. The forward uncertainty propagation problem therefore reduces to seeking a function approximation for the KL coefficient functions $\xi_j(\lambda)$, for which we employ a PC form here:
$$\xi_j(\lambda) \approx \sum_{k=0}^{K} c_{jk}\,\Psi_k(\xi), \quad (5)$$
where $\Psi_k(\xi)$ are multivariate orthogonal polynomials with respect to the PDF of the stochastic germ ξ, which is a vector whose elements are standard random variables chosen according to the expected PDF of the corresponding element of λ, that is, Gauss–Hermite for the Gaussian-distributed CT, AT, and Vmax errors and Legendre–Uniform for the uniformly distributed Rmax errors in this study (see section 2b). Finally, by substituting the PC expansion (5) into the KL expansion (3) and switching the summations we arrive at the following joint KL-PC surrogate expansion,
$$Z = f(\lambda, x) \approx g(\lambda, x) = \sum_{k=0}^{K} c_k(x)\,\Psi_k(\xi), \quad (6)$$
where
$$c_k(x) = \delta_{k0}\,\bar{f}(x) + \sum_{j=1}^{L} c_{jk}\,\sqrt{\mu_j}\,\phi_j(x), \quad (7)$$
and $\delta$ is the Kronecker delta. Given the PC coefficients $c_k(x)$, moments and global sensitivity indices can be analytically extracted from the KL-PC surrogate expansion (Sargsyan 2017; Sargsyan et al. 2021).

In our Python implementation of the joint KL-PC surrogate expansion in EnsemblePerturbation, the scikit-learn PCA class is used to perform the KL decomposition, and the chaospy (Feinberg and Langtangen 2015) package is used to perform the PC expansion, utilizing linear regression models from scikit-learn. These regression models are introduced and assessed in section 3. Also, chaospy functions are used to compute global sensitivity indices and percentiles from the KL-PC surrogate.
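The joint KL-PC construction can be illustrated with a numpy-only sketch. Note the substitutions: plain monomials and ordinary least squares stand in for the orthogonal PC basis (chaospy) and elastic net regression (scikit-learn) used in the actual implementation, and all function names are illustrative:

```python
import numpy as np
from itertools import combinations_with_replacement

def design_matrix(lam, order=3):
    """Monomial basis in the input variables up to total degree `order`
    (a stand-in for the orthogonal polynomials Psi_k)."""
    n, d = lam.shape
    cols = [np.ones(n)]
    for deg in range(1, order + 1):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(np.prod(lam[:, idx], axis=1))
    return np.column_stack(cols)

def fit_kl_pc(lam, Z, var_level=0.9999, order=3):
    """KL decomposition of the ensemble [Eq. (3) via SVD] plus per-mode
    polynomial regression [Eq. (5)]. lam is (n_samples, 4); Z is
    (n_samples, n_points)."""
    mean = Z.mean(axis=0)
    U, s, Vt = np.linalg.svd(Z - mean, full_matrices=False)
    var = s**2 / (s**2).sum()
    L = int(np.searchsorted(np.cumsum(var), var_level)) + 1  # truncation
    coords = U[:, :L] * s[:L]          # KL coordinates for each sample
    A = design_matrix(lam, order)
    C, *_ = np.linalg.lstsq(A, coords, rcond=None)  # OLS fit of coefficients
    return mean, Vt[:L], C, order

def predict(model, lam_new):
    """Evaluate the surrogate g(lambda, x) [Eq. (6)] at new inputs."""
    mean, modes, C, order = model
    return mean + design_matrix(lam_new, order) @ C @ modes
```

Once fitted, `predict` is trivially cheap, so quantiles and PDFs can be estimated by sampling it many thousands of times, which is the key efficiency gain over querying the hydrodynamic model.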

e. Experimental design

HSOFS is primarily concerned with forecasts of TC surge and inundation near (<48 h) landfall. To provide a rigorous test of the methodologies (section 3a), in this study we explore results for the 48-h lead time NHC advisory for three historical U.S. hurricanes: Irma in 2017, Florence in 2018, and Laura in 2020 (e.g., Fig. 5b). The storm tide model is spun up from a quiescent state with astronomical tides and best track hindcast forcing (e.g., Fig. 5a) for 7 days prior to the forecast so that the storm tide signal becomes effectively established (Westerink et al. 2008). For each hurricane we analyze a subset of the model based on mesh points that have ocean depths ≤ 25 m and that fall within the 34-kt wind speed swath of the NHC advisory. The spatial dimension of this subset is reduced through KL decomposition with truncation at the 99.99% variance level. We find that the number of eigenvalues at the variance level is close to but slightly fewer than the number of members in the training ensemble.

Fig. 5.

Hurricane Florence 2018 track and intensity information; (a) best track hindcast and (b) NHC forecast advisory 48-h prior to landfall. Figure generated using the troPYcal Python package (https://github.com/tropycal/tropycal).

Citation: Artificial Intelligence for the Earth Systems 2, 2; 10.1175/AIES-D-22-0040.1

In the first set of methodology experiments [section 3a(1)] we compare regression models, sampling methods, and sample sizes for construction of the surrogate model. In these experiments we conduct the analysis in linear space only on mesh points that are inundated across all TC events, to avoid complicating the experiment with the water surface elevation extrapolation over dry mesh points [Eqs. (1) and (2) are not used in section 3a(1)]. First, we compare sparse quadrature integration (158 samples) to a 59-member Korobov sequence using different scikit-learn regression models for constructing the surrogate model. Then, using the best regression model from that experiment, we compare Korobov sequences with sample sizes of 19, 39, and 59, which cover 90.0%, 95.0%, and 96.7% of the distributions of λ, respectively. In addition, at the 99.99% truncation level we obtain 18, 35–37, and 45–55 eigenvalues, respectively. In the second set of methodology experiments [section 3a(2)] we conduct the analysis on all mesh points in the subset, in log space versus linear space, and with varying parameters of the water surface elevation extrapolation method. Note that in all cases we use third-order PCs, which we found to be the only reliable PC order; second-order PCs are not flexible enough, while the desired training set size is too small to allow higher-order four-dimensional PCs to be constructed accurately.
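A Korobov sequence is a rank-1 lattice, so generating the sampling points reduces to a few lines. A stand-alone numpy sketch (the generator value here is illustrative, and chaospy/EnsemblePerturbation may use a different one), with the uniform points subsequently mapped through the inverse CDFs of the λ error distributions:

```python
import numpy as np

def korobov_sequence(n_samples, dim, generator=3):
    """Rank-1 Korobov lattice on the unit hypercube [0, 1)^dim.

    Uses the generating vector z = (1, a, a**2, ...) mod n and points
    x_i = frac(i * z / n), a standard low-discrepancy construction.
    """
    z = np.array([pow(generator, j, n_samples) for j in range(dim)])
    i = np.arange(1, n_samples + 1)[:, None]
    return (i * z % n_samples) / n_samples
```

When n_samples is prime (e.g., the 19 used here) and the generator is not a multiple of it, each coordinate visits every lattice value exactly once, which is what gives the sequence its even coverage at small sample sizes.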

To validate the surrogate model, we use a 128-member test set for each storm where λ is randomly chosen from their distributions for each test set member. Note that this test set is never used to train the model. To evaluate the accuracy of the surrogate model we compute the root-mean-square error (RMSE) across the mesh points between the model simulation and the surrogate model for a single test set member. To compare surrogate models across all test set members we plot cumulative distribution functions (CDFs) of the RMSE and compute the two-sided t-test statistic of the RMSE distribution between two surrogate models. In the second set of experiments, we also compute the percentage of mesh points that are falsely classified as wet or dry in the surrogate model prediction. In section 3b we compare modeled and surrogate-predicted Z for the test set at a coastal location near the forecast track of each hurricane, and relate both to the simulated astronomical tide. The model simulation provides the timing of Z, which we use to understand how Z coincides with the tidal phase and how well the surrogate model predicts the storm tide (the sum of surge and astronomical tide).
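The t-test statistic used in the comparisons can be computed directly; a numpy sketch of Welch's unequal-variance form (the paper does not state which variant is used, so this is an assumption), signed so that a large positive value indicates the second model has the smaller mean RMSE:

```python
import numpy as np

def welch_t(rmse_baseline, rmse_candidate):
    """Welch's t statistic between two RMSE samples.

    Positive when the baseline (e.g., sparse quadrature) has the larger
    mean RMSE, i.e., when the candidate surrogate is more accurate on
    average over the test set members.
    """
    a = np.asarray(rmse_baseline, dtype=float)
    b = np.asarray(rmse_candidate, dtype=float)
    se2 = a.var(ddof=1) / a.size + b.var(ddof=1) / b.size
    return (a.mean() - b.mean()) / np.sqrt(se2)
```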

Finally, using the recommended methodology based on the experiments we construct joint KL-PC surrogate models for each hurricane to produce example products of an ensemble HSOFS, that is, global sensitivities of Z with respect to λ, and exceedance probability maps (section 3c). The probabilistic predictions are compared to simulated model results of the best track hindcast hurricane forcing. The reliability of the probabilistic prediction is assessed by comparing the fraction of elevation exceedances in the best track results above the height of the given exceedance probability. Here, mesh points where the predicted exceedance elevation is NaN (dry) are ignored in the computation, while a NaN in the best track simulation is set to ground elevation. In addition, inundation area is compared, where we would intuitively expect the prediction in the vicinity of the median (50% exceedance probability) to be similar to that of the best track; although strictly speaking exceedance probabilities are related to water levels only and this comparison for inundation areas should only be used as a guide.
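The reliability computation described above reduces to: substitute ground elevation where the best track is dry, drop points where the prediction is dry, and count exceedances; for a reliable forecast the counted fraction at exceedance probability p should be close to p. A numpy sketch under those assumptions:

```python
import numpy as np

def reliability_fraction(z_best, z_exceed_p, ground_elev):
    """Fraction of best-track elevations above a p-exceedance surface.

    z_best:      best track max elevations (NaN where dry).
    z_exceed_p:  predicted elevations at exceedance probability p
                 (NaN where predicted dry -> point ignored).
    ground_elev: ground elevation, substituted where z_best is NaN.
    For a perfectly reliable forecast the returned fraction ~= p.
    """
    z_best = np.where(np.isnan(z_best), ground_elev, z_best)
    valid = ~np.isnan(z_exceed_p)       # ignore predicted-dry points
    return np.mean(z_best[valid] > z_exceed_p[valid])
```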

For reference, error statistics against high-water mark (HWM) observations for the best track hindcast in the subset domain are shown in Table 1. For this comparison, model results were adjusted to the NAVD88 vertical datum and the three-monthly average sea level measured at a selected reference tide gauge located near landfall. HWMs greater than 400 m (∼2 mesh elements) from the nearest wet model grid point were discarded.

Table 1

Error statistics for best track hindcast model simulations against high-water mark observations. Datum correction and mean sea level adjustments are added to the model results before comparing with the observations; N = number of observations.


3. Results

a. Methodology experiments

1) Regression

Results across all three storms show that fitting the surrogate model using (sparse) quadrature integration is superior to ordinary least squares (OLS) linear regression but far inferior to penalized linear regression from the 59-member Korobov sequence (Fig. 6, Table 2). For this test we compare Lasso and elastic net regression, which use l1-norm regularization and combined l1-norm and l2-norm regularization with equal weighting, respectively. The l1-norm regularization penalizes nonzero coefficients to form sparse models, and l2-norm regularization penalizes the size of the coefficients to form smooth models. These linear regression models avoid overfitting the KL parameters ξ in the training set that we see for OLS regression (Fig. 7), which results in poor OLS performance against the test set. For instance, the penalized regression returns a null polynomial for mode 6 and mode 9 in the example (Fig. 7), presumably because these modes with smaller eigenvalues are too noisy; this is an advantage of building the surrogate for the KL expansion, as noisy components of the data can be effectively ignored by the overall surrogate model. Further, as a rule, the penalized regression gives a poorer fit to the KL training parameters than OLS regression does, although correlation is generally high for the large-eigenvalue modes. On the other hand, sparse quadrature gives a very poor fit to the KL training parameters, which indeed leads to poor performance against the test set relative to the penalized regression. Sparse quadrature projection is known to perform poorly for noisy data due to the presence of negative quadrature weights and amplification of small errors during PC construction (Sargsyan 2012). Nevertheless, sparse quadrature does validate better than OLS regression.

Fig. 6.

CDF curves of the surrogate model RMSE across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura), using a 158-member sparse quadrature training set and a 59-member Korobov sequence training set with different regression methodologies. The ts is the two-sided t-test statistic between the sparse quadrature RMSE and the Korobov RMSEs corresponding to the colors in the legend (largest positive value indicates smallest average RMSE); sparse_quadrature: sparse quadrature integration. Lasso_SS: Lasso penalized linear regression with shuffle-split cross validation. Lasso_LOO: Lasso penalized linear regression with leave-one-out cross validation. ElasticNet_SS: Elastic net penalized linear regression with shuffle-split cross validation. ElasticNet_LOO: Elastic net penalized linear regression with leave-one-out cross validation.


Table 2

Statistics of surrogate model RMSE (m) across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura) and the three experiments: regression methodology test (Fig. 6), sample size test (Fig. 8), and extrapolation parameter test (Fig. 9). The integers after k and p are parameters from the extrapolation Eq. (1), and the decimal after n is the Manning’s n coefficient used to compute ff in Eq. (2). Min = minimum, Max = maximum, S.D. = standard deviation.

Fig. 7.

Comparison of the predicted (surrogate) and modeled KL training parameters ξ of the top nine KL modes for the Hurricane Florence forecast. The R2 is the coefficient of determination of the surrogate prediction corresponding to the colors in the legend; μ indicates the eigenvalues of each KL mode evaluated from the sparse quadrature (sq) and 59-member Korobov sequence (k59) training data.


With both Lasso and elastic net, we use built-in cross-validation estimators to automatically select the best penalization weight and return the most robust fit to the data. Shuffle-split (SS) and leave-one-out (LOO) cross-validation estimators are used to automatically divide the overall 59-member training set into a number of different splits of training and validation subsets during the regression fitting process. We used the scikit-learn default options for SS here where 10 unique splits are used each with 10% of the members used for validation. For LOO, the number of splits equals the number of members each using a different single member (the one left out) for validation. Results show that, overall, there are relatively small differences between the four combinations of cross validators and regularization strategies, although ElasticNet_LOO notably outperforms for Irma, as well as having a strong performance for both Florence and Laura. Therefore, we decide to use ElasticNet_LOO for the remainder of this paper, remarking also that LOO is attractive because there are no parameter choices to be made, while SS requires choosing the relative size of the training and validation subsets, in which the optimal choice may differ with sample sizes and storms. Furthermore, for elastic net regression we can also use cross validation to select the optimal weighting between l1 and l2 penalties if so desired, although we used 0.5 (equal weighting) here for simplicity.
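In scikit-learn terms, the per-mode fit can be sketched as below (illustrative; the actual EnsemblePerturbation implementation routes this through chaospy, and the basis matrix Ψ is assumed to be precomputed):

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import LeaveOneOut

def fit_mode_elasticnet_loo(Psi, xi_mode, l1_ratio=0.5):
    """Fit PC coefficients for one KL mode with elastic net + LOO CV.

    Psi:     (n_members, K+1) evaluations of the PC basis at the
             training germ samples.
    xi_mode: (n_members,) KL coordinates of this mode.
    The penalization weight (alpha) is chosen automatically by
    leave-one-out cross validation; l1_ratio=0.5 gives the equal
    l1/l2 weighting used in the paper.
    """
    model = ElasticNetCV(l1_ratio=l1_ratio, cv=LeaveOneOut(),
                         fit_intercept=False, max_iter=10000)
    model.fit(Psi, xi_mode)
    return model.coef_  # PC coefficients; sparse (some exactly zero)
```

LOO is requested simply by passing a LeaveOneOut splitter as cv, which is what makes it parameter free compared to shuffle-split.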

When reducing the training sample size using the Korobov sequence, test performance remains notably superior to the sparse quadrature baseline but does degrade as expected (Fig. 8, Table 2). The 39-member sequence performs about as well as the 59-member sequence in the lower half of the distribution (or better, in the case of Irma) but noticeably worse in the upper half, where the 19-member sequence performs similarly to the 39-member sequence. If, for practical purposes, we select the 90th percentile of the RMSE as an arbitrary measure of performance (RMSE90), for all three storms the 59-member sequence has an RMSE90 accuracy of approximately 0.3 m, whereas the RMSE90 accuracy is approximately 0.5 m for the 39-member sequence and 0.5–0.7 m for the 19-member sequence. Therefore, in section 3a(2) we choose the 59-member Korobov sequence, as it provides about twice the accuracy of the smaller sample sizes under this assessment.

Fig. 8.

CDF curves of the surrogate model RMSE across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura), using 19-, 39-, and 59-member Korobov sequence training sets with ElasticNet_LOO regression. The ts is the two-sided t-test statistic between the sparse quadrature RMSE and the Korobov RMSEs corresponding to the colors in the legend (largest positive value indicates smallest average RMSE); korobov59: 59-member Korobov sequence; korobov39: 39-member Korobov sequence; korobov19: 19-member Korobov sequence.


2) Water surface elevation extrapolation and logarithmic transformation

Here we vary the number of IDW neighbors (k = [1, 4, 16]), IDW order (p = [1, 2]), and the friction factor (ff = [0.0001, 0.0004, 0.0016]) for the water surface elevation extrapolation over dry mesh points. The friction factor values are derived through the Manning's relation, $f_f = n^2 U_f^2 / H_f^{4/3}$, with n = [0.025, 0.05, 0.1] s m−1/3 (Manning's n coefficient), Hf = 1 m (flow depth), and Uf = 0.4 m s−1 (flow velocity). The surrogate model for the k = 1, p = 1, n = 0.05 case is computed in log space as well as linear space. Results show that surrogate model accuracy in log space is worse, in terms of both RMSE and false wet/dry classification, than in linear space (Fig. 9, Table 2). We also tried surrogate model generation in log space only for mesh points that are always wet in the training set [as in section 3a(1)], which did indeed improve the false dry classification percentage compared to linear space (not shown). However, it appears that when extrapolation is used over dry points and negative depths are introduced to the training set, this benefit disappears. Therefore, we only show the other extrapolation parameter experiment results in linear space.
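The friction factor arithmetic, and the simplified nearest-neighbor (k = p = 1) head-loss extrapolation it feeds, can be sketched as follows [the full IDW form of Eqs. (1) and (2) is defined earlier in the paper; this sketch only reproduces the ff values quoted above]:

```python
import numpy as np

def manning_friction_factor(n, U_f=0.4, H_f=1.0):
    """Friction factor from the Manning relation: ff = n^2*U_f^2/H_f^(4/3)."""
    return n**2 * U_f**2 / H_f ** (4.0 / 3.0)

def head_loss_extrapolation(z_wet, dist, f_f):
    """Extrapolated max elevation at a dry point, nearest-wet-neighbor form.

    The wet-neighbor elevation is reduced by a head loss growing linearly
    with distance, so extrapolated values can drop below ground elevation
    (negative depths), which trains the surrogate on wet/dry transitions.
    """
    return z_wet - f_f * np.asarray(dist)
```

The three n values above give ff = 0.0001, 0.0004, and 0.0016, matching the tested range.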

Fig. 9.

Comparison of the surrogate model accuracy for different extrapolation parameters and construction in linear space vs log space (linear and log in the legend, respectively). The integers after k and p in the legend are parameters from the extrapolation Eq. (1), and the decimal after n is the Manning’s n coefficient used to compute ff in Eq. (2). Results are shown across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura), using 59-member Korobov sequence training sets with ElasticNet_LOO regression. (left) RMSE CDF curves, where ts is the two-sided t-test statistic between the log-space RMSE and the linear-space RMSEs corresponding to the colors in the legend (largest positive value indicates smallest average RMSE). (right) Total percentage of mesh points across all test set members with a false wet/dry classification.


Surrogate model RMSE tends to increase with the friction factor, especially for n = 0.1 (Fig. 9). Performance for n = 0.025 and 0.05 is similar except for Laura, for which the smaller n = 0.025 is clearly superior. Laura made landfall in the east Texas–west Louisiana region, which is composed of low-gradient wetlands and lakes with low n coefficients. Perhaps for this reason, false wet/dry classifications (inundation) for Laura are also more sensitive to the friction factor. In comparison, Florence made landfall in the steeper and more urbanized Carolinas, and while Irma impacted the low-gradient Everglades of southwest Florida, reef-like islands where high n coefficients are used in the model limit the inundation potential.

Overall, false wet/dry classifications follow a clear pattern: the surrogate model with a larger friction factor gives more false dry predictions but fewer false wet predictions, and vice versa for a smaller friction factor. In addition to the fact that the smaller friction factor produces smaller RMSEs, we prefer the smaller n = 0.025, since from an emergency management standpoint it is generally considered preferable to be biased toward false wet classifications. As for the IDW extrapolation parameters, using more neighbors and going to second order does not provide any discernible benefit over nearest neighbor (k = p = 1). As such, and with a preference toward simplicity, nearest neighbor using n = 0.025 is selected for presentation of the results in section 3c.

b. Storm tide prediction

Modeled maximum water surface elevations (Z) and their timing at coastal locations show an interesting relationship between the time of forecasted landfall and the astronomical tide (Fig. 10). For the Florence and Laura landfall locations, the timing of Z tends to coincide with the high tide(s) just prior to the forecasted landfall. North Carolina has semidiurnal-dominated tides, so Z for Florence mostly coincides with either the lower or higher high tide before landfall, while Louisiana has diurnal-dominated tides and Z for Laura mostly coincides with the one high tide just prior to landfall. In the case of Irma, the tides around the Florida Keys are very small and there is only a minor relation between the phasing of Z and the astronomical tide. Furthermore, this location is along the track, south of the point we count as landfall (near the Florida Everglades, where there is no tide gauge), and the maxima likely occur later here because of the buildup of surge inside Florida Bay after the hurricane makes landfall along the Florida Everglades coast.

Fig. 10.

Comparison of the modeled maximum water surface elevation (Z) and its timing, the surrogate predicted Z, and the modeled astronomical tide time series at a coastal location near the forecasted track of each hurricane (Irma, Florence, Laura). Results are shown across all test set members (128 samples) using the surrogate model trained with the 59-member Korobov sequence with ElasticNet_LOO regression. The x-axis position of the surrogate predicted Z (for visualization purposes only) is chosen by offsetting from the modeled timing by 2 h to the left; the surrogate model does not make any predictions of timing in this study. Coastal locations are NOAA tide gauges (https://tidesandcurrents.noaa.gov) with station names and IDs shown in the panel titles. ME: mean error; SDE: standard deviation of the error between the modeled and predicted Z.


The best surrogate model setup from section 3a [59-member Korobov sequence training set with ElasticNet_LOO regression and extrapolation using n = 0.025 (ff = 0.0001) and k = p = 1] is shown to make accurate predictions of storm tide (combined surge and astronomical tide) for most test members [mean absolute error (MAE) less than 0.5 m for all hurricanes]. In particular, the surrogate model appears to perform best for the test set members where Z is far above the astronomical tide. However, for some test members the surrogate prediction can be erroneously low and below that of the astronomical high tide, which could not occur in reality. This appears to occur for the test set members where the surge is very small or perhaps even negative, that is, where the modeled Z is just above the maximum astronomical tide elevation for the 5-day simulation time period.

c. Probabilistic predictions and global sensitivities

Maps of the sensitivities and probabilistic predictions extracted [see section 2d(2)] from the best surrogate model setup from section 3a [59-member Korobov sequence training set with ElasticNet_LOO regression and extrapolation using n = 0.025 (ff = 0.0001) and k = p = 1] are shown here to demonstrate the product output. First, total effect sensitivity indices of λ plotted in Fig. 11 indicate that the CT error is the most sensitive variable across all storms and over most of the region. The CT sensitivity tends to be smaller just to the right-hand side of the forecasted track and larger on the left-hand side, particularly for Florence and Irma, where landfall is approximately perpendicular to the coastline. This is likely because the anticlockwise rotation of TC winds is such that surge will develop on the right-hand side for most tracks, while winds can even push water away from the coastline on the left-hand side for certain tracks. We find a more uniform sensitivity to CT for Irma because it makes landfall parallel to the long side of the Florida Peninsula, which means both east and west coasts are similarly sensitive to a left or right shift in the track.

Fig. 11.

Total effect sensitivity indices of maximum water surface elevation (Z) to λ (CT, AT, Vmax, Rmax) for the three hurricane forecasts (Irma, Florence, Laura). The dashed black line is the track of the NHC forecast advisory 48 h prior to landfall. The gray line indicates the coastline, and shaded gray area indicates land. Mesh points with ocean depths greater than 25 m have been removed from the dataset.


The importance of the other error variables is somewhat storm and location dependent. Vmax is the second-most important for Irma, while Rmax is for Laura. Florence is approximately equally sensitive to AT, Vmax, and Rmax. Irma was forecast to be the strongest hurricane at landfall (125 kt), followed by Florence (100 kt) and Laura (80 kt), and indeed the sensitivity to Vmax follows this order. Since surge is proportional to surface wind stress, which is proportional to the square of the wind speed, more intense TCs should have greater sensitivity to Vmax, as shown. Furthermore, Laura's Vmax was just 65 kt at the 48-h forecast time, leading to reduced mean absolute forecast errors compared to the more intense (Vmax > 95 kt) Irma and Florence hurricanes at the 48-h forecast time (see Table A3). However, best track data show that Laura rapidly intensified after the 48-h forecast time in the lead up to landfall, reaching a maximum of 130 kt just before landfall despite just 100 kt being forecast. This could help explain why the best track maximum water surface elevation just to the right of landfall surpasses even the 10% exceedance probability prediction (Fig. 12).

Additionally, at the forecast time, the size of Laura is the largest (Rmax = 40 n mi), followed by Irma (Rmax = 30 n mi) and then Florence (Rmax = 15 n mi). Because Laura is so large (it fits into the largest category for Rmax forecast errors in Table A4), it is biased to become much smaller by landfall (indeed the best track Rmax was 20 n mi at landfall). This leads to a larger sensitivity for regions located in this uncertain zone of storm extent, which in this case is the Atchafalaya region on the right end of the Laura domain. Similarly, because Florence is a smaller storm at forecast time it is biased to be larger at landfall (best track Rmax was 30 n mi at landfall), and Pamlico Sound, also located on the right end of the Florence domain, is hence quite sensitive to Rmax. The sensitivity to Rmax is also smaller for Irma likely because it approaches parallel to the narrow peninsula and a change in size should have little effect on what regions it can impact.

Second, 10%, 50%, and 90% exceedance probabilities of the maximum water surface elevations are shown in Fig. 12, illustrating how the surrogate model can predict changes to both water levels and inundation extents across the distribution. Indeed, over most of the domain the 50% exceedance probability tends to show a closer match to the best track hindcast than the 10% or 90% probabilities. However, as expected, in the regions where a large or small maximum water surface elevation occurs in the hindcast, the match appears closer to the 10% and 90% exceedance, respectively.

Fig. 12.

Best track hindcast and probabilistic predictions of maximum water surface elevations (10%, 50%, and 90% exceedance probabilities) for the three hurricane forecasts (Irma, Florence, Laura). The dashed black line is the best track for the left-hand side panels or the track of the NHC forecast advisory 48 h prior to landfall in the other panels. The gray line indicates the coastline, and shaded gray area indicates land. Mesh points with ocean depths greater than 25 m have been removed from the dataset.


More quantifiably, reliability assessments show that the surrogate model's probabilistic forecasts have generally reasonable accuracy for all three storms (Fig. 13). The 10% exceedance for all three storms is greater than that of the best track, indicating a high bias at this extreme end of the distribution. In comparison, at the low end (toward 90% exceedance), the Irma and Florence predictions are biased low, while Laura is consistently biased high over the whole distribution. This may be related to the higher sensitivity to Rmax for Laura than for the other storms; best track results show a relatively localized region of high water surface elevation. Notably, Rmax is treated differently than the other error variables and is difficult to measure and forecast, motivating alternative treatment of storm size in the future.

Fig. 13.

Reliability plot of the 48-h probabilistic forecast of maximum water surface elevation against the best track hindcast for the three hurricane events (Irma, Florence, Laura). The surrogate model results are compared to direct empirical evaluation of the Korobov sequence training set used to generate the surrogate model, as well as to direct empirical evaluation of the randomly generated 128-member test set (random128).


Interestingly, however, in terms of inundation area, the 50% exceedance probability is close to the best track hindcast for Laura, as it is for Irma (Fig. 14), although the inundation area for Laura is more sensitive overall to the choice of exceedance probability than that of the other storms, highlighting the larger uncertainty for this hurricane. The inundation area for Florence instead matches the best track area at the 30% exceedance. Notably, the 10% exceedance inundation area for Florence is about a factor of 2 greater than the best track, demonstrating large uncertainty at the low-probability end of the distribution.

Fig. 14.

Predicted inundation area of the 48-h probabilistic forecast as a fraction of the inundation area of the best track hindcast for the three hurricane events (Irma, Florence, Laura). The surrogate model results are compared to direct empirical evaluation of the Korobov sequence training set used to generate the surrogate model, as well as to direct empirical evaluation of the randomly generated 128-member test set.


Comparing results for surrogate models trained on smaller sample sizes of the Korobov sequence shows remarkably consistent results in terms of reliability and inundation area across all storms (Figs. 13 and 14). Generally, the reliability of the surrogate model trained on the 19-member Korobov sequence is at least as good as that for 59 members. For Laura, the reliability increases slightly as the sample size increases, and the best result is found from direct empirical evaluation of the 128-member test set. For Irma and Florence, the surrogate model clearly shows improvement in reliability over direct empirical evaluation of the training and test sets, highlighting its potential added value.

4. Discussion

The framework developed here has demonstrated that reliable probabilistic predictions of storm tide elevations and inundation can be achieved by training a KL-PC surrogate model on just O(10) perturbed storm events using low-discrepancy Korobov sequences. The use of the surrogate model was found to provide generally more reliable probabilistic predictions than the direct empirical evaluation of the training set—the added value of the surrogate model—except for Laura at moderate-high probabilities of exceedance. Moreover, the surrogate model offers additional benefits: 1) it can be used to rapidly predict the water surface elevations and inundation for any new storm perturbation, 2) it provides robust global sensitivity information, and 3) the water surface elevation extrapolation step can be used to purposely bias surrogate inundation prediction low or high, as desired.

For determining an adequate number of training samples from the Korobov sequence, our results show that a surrogate trained on a smaller number of samples (19 here) can provide similarly reliable probabilistic predictions as a surrogate trained on more samples (39 or 59 here). This could be related to the fact that we are focused on the forecast problem near landfall (48 h here), which means that the variation can be expected to be small relative to forecasts > 48 h. Nevertheless, when evaluating surrogate model errors against the test set, using 59 training samples is indeed more accurate (RMSE90 ≈ 0.3 m) than using 19 samples (RMSE90 ≈ 0.6 m). We therefore suggest that if only the probabilistic prediction is required, and in a short time frame, the user can train the surrogate using a smaller sample size (e.g., 19, which covers 90% of the distribution of λ). This is in line with previous related research that suggested 27 samples were sufficient for this purpose (Davis et al. 2010) and with Kyprioti et al. (2021a), who also demonstrate the efficiency of low-discrepancy sequences. In contrast, if the user is interested in predicting the water surface elevations for a given perturbation of the storm, we recommend that a higher number of training samples (e.g., 59) be used to produce a more accurate surrogate model. It is also possible that for forecasts closer to landfall, fewer training samples will be required. Note that the user is free to choose any number of samples and is not restricted to the 19, 39, and 59 used here.

In this study we used PCs due to their simplicity and usefulness in treating uncertainties. The use of penalized regression with cross validation (we recommend elastic net with LOO cross validation) was able to robustly fit the PCs across both the training and test sets, whereas OLS overfits the training set. Furthermore, it was found that the head-loss water extrapolation technique for filling in dry mesh points was critical to fitting PCs accurately to overland areas. We also tried extrapolation without head loss, but this resulted in poor estimation of inundation onset (too many false wet predictions). In addition, we tested the correction to the extrapolation technique for dry mesh points that become artificially wet, as in Kyprioti et al. (2021c), and found that while false wet predictions are decreased, false dry predictions increase in tandem, which we view as less desirable. Other studies have used extrapolation without head loss but adopted CNN and GP machine learning methods (Lee et al. 2021; Plumlee et al. 2021) that have more degrees of freedom than PCs, which may help to hide the deficiency in the extrapolation technique. Whether this is viewed positively or negatively, the use of PCs does lead to strong knowledge of the effect of inputs on surrogate performance. This, and the fact that PCs allow for exact extraction of variance-based sensitivity indices without additional sampling, highlights their usefulness for understanding uncertainty. Nevertheless, future work could explore whether ANNs/CNNs or GPs can improve surrogate model accuracy and reliability in our framework.

In addition to the accuracy of the surrogate model, reliability is also dependent on the TC perturbation methodology. In this study we followed the NHC P-Surge methodology, which utilizes historical statistics of forecast errors. Future work may consider how to perturb TC tracks in a less self-similar fashion (see Fig. 2) and with consideration of the current storm dynamics. In addition, we validated the reliability against the model simulation of the best track hindcast, but real-world observational validation should be assessed in the future. This may require the use of more sophisticated TC vortex models to better capture the asymmetric wind structure, such as the Generalized Asymmetric Holland Model (GAHM) that is built into ADCIRC (Dietrich et al. 2018). In this case a new methodology that perturbs the quadrant-based wind speed contours will be necessary. As noted, Laura was the most sensitive to Rmax while having the lowest reliability and highest RMSEs, and this could be improved if the storm size and structure could be varied with more control.

Though this study focused only on the spatially varying maximum water surface elevation, the KL-PC methodology can be generally applied to a spatiotemporal surrogate construction, to account for the temporal evolution of the water surface elevations and hence predict the timing of the peak flood. We showed that when the tidal range is fairly large, as along the North Carolina coast where Florence made landfall, modeled maximum water surface elevations almost always coincide with high tide. In most cases the surrogate accurately predicts the surge and tide sum, but if the surge is very small, the surrogate can make erroneously low predictions of the maximum surface elevations that are well below that of the high tide elevation. Training the surrogate model on the time varying water surface elevation would allow us to better control this behavior. Furthermore, use of log-space surrogate construction to preserve surrogate model positivity might be more useful than found in this study. Here, when pseudonegative water depths were introduced into the training set from the maximum water surface elevation extrapolation, the log-space surrogate construction was found to be deleterious instead of beneficial. This is in contrast to Plumlee et al. (2021) who found the log-transform necessary for use with GPs.

5. Conclusions

A methodology for efficient ensemble perturbation of TC wind forcing forecasts and for uncertainty quantification of the resulting simulated storm tide and coastal flooding has been presented. Probabilistic predictions based on the 48-h forecast prior to landfall for three historical hurricanes are promising when compared to model simulations of the best track hindcasts. The methodology has been implemented in a general Python framework that can be extended to develop new TC perturbation methodologies, use more sophisticated TC vortex models, and facilitate perturbations to hydrodynamic model parameters such as bottom roughness.

Acknowledgments.

WP was supported by the U.S. National Oceanic and Atmospheric Administration (NOAA) Hurricane Supplemental Grant NA19OAR0220123 through the University Corporation for Atmospheric Research (UCAR) under a work-for-others agreement A20049 to Argonne National Laboratory through U.S. Department of Energy Contract DE-AC02-06CH11357. ZB and SM were supported by the NOAA Coastal Act program through UCAR. KS was supported in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) Program through the FASTMath Institute. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under Contract DE-NA0003525. We gratefully acknowledge the computing resources provided on Bebop, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory. We would also like to acknowledge support from NOAA Research and Development High Performance Computing (NOAA RDHPCS) for providing access to the HERA supercomputer and Cloud resources through the ca-hsofs-c project.

Data availability statement.

Data and scripts related to this manuscript are available from Pringle et al. (2022). EnsemblePerturbation is hosted on the NOAA Office of Coast Survey—Modeling GitHub website (https://github.com/noaa-ocs-modeling/EnsemblePerturbation) with “readthedocs” documentation linked therein.

APPENDIX

NHC Historical Forecast Error Tables

Tables A1–A4 show the mean absolute forecast errors and upper- and lower-bound forecast errors.

Table A1. Mean absolute forecast error: cross track (n mi). VT = forecast validation time.

Table A2. Mean absolute forecast error: along track (n mi).

Table A3. Mean absolute forecast error: Vmax (kt).

Table A4. Upper- and lower-bound forecast errors: Rmax (sm); sm = U.S. statute mile.

REFERENCES

  • Abdolali, A., A. V. D. Westhuysen, Z. Ma, A. Mehra, A. Roland, and S. Moghimi, 2021: Evaluating the accuracy and uncertainty of atmospheric and wave model hindcasts during severe events using model ensembles. Ocean Dyn., 71, 217–235, https://doi.org/10.1007/s10236-020-01426-9.
  • Ayyad, M., M. R. Hajj, and R. Marsooli, 2021: Spatial variation in sensitivity of hurricane surge characteristics to hurricane parameters. J. Eng. Mech., 147, 04021070, https://doi.org/10.1061/(ASCE)EM.1943-7889.0001984.
  • Davis, J. R., V. A. Paramygin, D. Forrest, and Y. P. Sheng, 2010: Toward the probabilistic simulation of storm surge and inundation in a limited-resource environment. Mon. Wea. Rev., 138, 2953–2974, https://doi.org/10.1175/2010MWR3136.1.
  • Dietrich, J. C., and Coauthors, 2011: Modeling hurricane waves and storm surge using integrally-coupled, scalable computations. Coast. Eng., 58, 45–65, https://doi.org/10.1016/j.coastaleng.2010.08.001.
  • Dietrich, J. C., A. Muhammad, M. Curcic, A. Fathi, C. N. Dawson, S. S. Chen, and R. A. Luettich Jr., 2018: Sensitivity of storm surge predictions to atmospheric forcing during Hurricane Isaac. J. Waterw. Port Coastal Ocean Eng., 144, 04017035, https://doi.org/10.1061/(asce)ww.1943-5460.0000419.
  • Feinberg, J., and H. P. Langtangen, 2015: Chaospy: An open source tool for designing methods of uncertainty quantification. J. Comput. Sci., 11, 46–57, https://doi.org/10.1016/j.jocs.2015.08.008.
  • Gonzalez, T., and A. Taylor, 2018: Development of the NWS’ probabilistic tropical storm surge model. 33rd Conf. on Hurricanes and Tropical Meteorology, Ponte Vedra, FL, Amer. Meteor. Soc., 186, https://ams.confex.com/ams/33HURRICANE/webprogram/Paper340247.html.
  • Hashemi, M. R., M. L. Spaulding, A. Shaw, H. Farhadi, and M. Lewis, 2016: An efficient artificial intelligence model for prediction of tropical storm surge. Nat. Hazards, 82, 471–491, https://doi.org/10.1007/s11069-016-2193-4.
  • Holland, G. J., 1980: An analytic model of the wind and pressure profiles in hurricanes. Mon. Wea. Rev., 108, 1212–1218, https://doi.org/10.1175/1520-0493(1980)108<1212:AAMOTW>2.0.CO;2.
  • Jelesnianski, C., J. Chen, and W. Shaffer, 1992: SLOSH: Sea, lake, and overland surges from hurricanes. NOAA Tech. Rep. NWS 48, 71 pp., https://sites.law.lsu.edu/coast/2014/01/sea-lake-and-overland-surge-from-hurricanes-slosh/.
  • Jia, G., and A. A. Taflanidis, 2013: Kriging metamodeling for approximation of high-dimensional wave and surge responses in real-time storm/hurricane risk assessment. Comput. Methods Appl. Mech. Eng., 261–262, 24–38, https://doi.org/10.1016/j.cma.2013.03.012.
  • Jia, G., A. A. Taflanidis, N. C. Nadal-Caraballo, J. A. Melby, A. B. Kennedy, and J. M. Smith, 2016: Surrogate modeling for peak or time-dependent storm surge prediction over an extended coastal region using an existing database of synthetic storms. Nat. Hazards, 81, 909–938, https://doi.org/10.1007/s11069-015-2111-1.
  • Joyce, B. R., J. Gonzalez-Lopez, A. J. Van der Westhuysen, D. Yang, W. J. Pringle, J. J. Westerink, and A. T. Cox, 2019: U.S. IOOS coastal and ocean modeling testbed: Hurricane-induced winds, waves and surge for deep-ocean, reef fringed islands in the Caribbean. J. Geophys. Res. Oceans, 124, 2876–2907, https://doi.org/10.1029/2018JC014687.
  • Kim, S.-W., J. A. Melby, N. C. Nadal-Caraballo, and J. Ratcliff, 2015: A time-dependent surrogate model for storm surge prediction based on an artificial neural network using high-fidelity synthetic hurricane modeling. Nat. Hazards, 76, 565–585, https://doi.org/10.1007/s11069-014-1508-6.
  • Korobov, N. M., 1959: Approximate evaluation of repeated integrals. Dokl. Akad. Nauk SSSR, 124, 1207–1210.
  • Kyprioti, A. P., E. Adeli, A. A. Taflanidis, J. J. Westerink, and H. L. Tolman, 2021a: Probabilistic storm surge estimation for landfalling hurricanes: Advancements in computational efficiency using quasi-Monte Carlo techniques. J. Mar. Sci. Eng., 9, 1322, https://doi.org/10.3390/jmse9121322.
  • Kyprioti, A. P., A. A. Taflanidis, N. C. Nadal-Caraballo, and M. Campbell, 2021b: Storm hazard analysis over extended geospatial grids utilizing surrogate models. Coastal Eng., 168, 103855, https://doi.org/10.1016/j.coastaleng.2021.103855.
  • Kyprioti, A. P., and Coauthors, 2021c: Improvements in storm surge surrogate modeling for synthetic storm parameterization, node condition classification and implementation to small size databases. Nat. Hazards, 109, 1349–1386, https://doi.org/10.1007/s11069-021-04881-9.
  • Lee, J.-W., J. L. Irish, M. T. Bensi, and D. C. Marcy, 2021: Rapid prediction of peak storm surge from tropical cyclone track time series using machine learning. Coastal Eng., 170, 104024, https://doi.org/10.1016/j.coastaleng.2021.104024.
  • Luettich, R. A., and J. J. Westerink, 2004: Formulation and numerical implementation of the 2D/3D ADCIRC finite element model version 44.XX. University of North Carolina at Chapel Hill & University of Notre Dame Tech. Rep., 74 pp., https://adcirc.org/wp-content/uploads/sites/2255/2018/11/adcirc_theory_2004_12_08.pdf.
  • Moghimi, S., and Coauthors, 2020: Development of an ESMF based flexible coupling application of ADCIRC and WAVEWATCH III for high fidelity coastal inundation studies. J. Mar. Sci. Eng., 8, 308, https://doi.org/10.3390/jmse8050308.
  • NCEI, 2022: Billion-dollar weather and climate disasters. NOAA, https://www.ncei.noaa.gov/access/billions/.
  • Penny, A., and J. P. Cangialosi, 2019: P-surge error update. NOAA Tech. Rep., 13 pp.
  • Plumlee, M., T. G. Asher, W. Chang, and M. V. Bilskie, 2021: High-fidelity hurricane surge forecasting using emulation and sequential experiments. Ann. Appl. Stat., 15, 460–480, https://doi.org/10.1214/20-AOAS1398.
  • Pringle, W. J., D. Wirasaet, K. J. Roberts, and J. J. Westerink, 2021: Global storm tide modeling with ADCIRC v55: Unstructured mesh design and performance. Geosci. Model Dev., 14, 1125–1145, https://doi.org/10.5194/gmd-14-1125-2021.
  • Pringle, W. J., Z. R. Burnett, K. Sargsyan, S. Moghimi, and E. Myers, 2022: Efficient probabilistic prediction and uncertainty quantification of tropical cyclone-driven storm tides and inundation: Model data and analysis code, version 3. Zenodo, https://doi.org/10.5281/zenodo.6588626.
  • Rucker, C. A., N. Tull, J. C. Dietrich, T. E. Langan, H. Mitasova, B. O. Blanton, J. G. Fleming, and R. A. Luettich Jr., 2021: Downscaling of real-time coastal flooding predictions for decision support. Nat. Hazards, 107, 1341–1369, https://doi.org/10.1007/s11069-021-04634-8.
  • Sargsyan, K., 2012: Adaptive basis selection and dimensionality reduction with Bayesian compressive sensing. SIAM Conf. on Uncertainty Quantification, Raleigh, NC, Society for Industrial and Applied Mathematics, https://www.osti.gov/servlets/purl/1078647.
  • Sargsyan, K., 2017: Surrogate models for uncertainty propagation and sensitivity analysis. Handbook of Uncertainty Quantification, R. Ghanem, D. Higdon, and H. Owhadi, Eds., Springer, 673–698, https://doi.org/10.1007/978-3-319-11259-6_22-1.
  • Sargsyan, K., and Coauthors, 2021: UQTk version 3.1.1 user manual. Sandia National Laboratories Tech. Rep. SAND2021-3655, 140 pp., https://doi.org/10.2172/1777090.
  • Sochala, P., C. Chen, C. Dawson, and M. Iskandarani, 2020: A polynomial chaos framework for probabilistic predictions of storm surge events. Comput. Geosci., 24, 109–128, https://doi.org/10.1007/s10596-019-09898-5.
  • Taflanidis, A. A., A. B. Kennedy, J. J. Westerink, J. Smith, K. F. Cheung, M. Hope, and S. Tanaka, 2013: Rapid assessment of wave and surge risk during landfalling hurricanes: Probabilistic approach. J. Waterw. Port Coast. Ocean Eng., 139, 171–182, https://doi.org/10.1061/(ASCE)WW.1943-5460.0000178.
  • Taylor, A., and B. Glahn, 2008: Probabilistic guidance for hurricane storm surge. 19th Conf. on Probability and Statistics, New Orleans, LA, Amer. Meteor. Soc., 7.4, https://ams.confex.com/ams/pdfpapers/132793.pdf.
  • Technology Riverside Inc. and AECOM, 2015: Mesh development, tidal validation, and hindcast skill assessment of an ADCIRC model for the hurricane storm surge operational forecast system on the US Gulf-Atlantic Coast. National Oceanic and Atmospheric Administration/National Ocean Service Coast Survey Development Laboratory, Office of Coast Survey Tech. Rep., 179 pp., https://doi.org/10.7921/G0MC8X6V.
  • Vinogradov, S. V., Y. Funakoshi, E. Myers, S. Moghimi, and J. Calzada, 2018: ADCIRC-based storm surge operational forecast systems at NOAA. Annual ADCIRC User Group Meeting, College Park, MD, ADCIRC User Group, https://doi.org/10.17615/w9kz-kf59.
  • Westerink, J. J., and Coauthors, 2008: A basin- to channel-scale unstructured grid hurricane storm surge model applied to southern Louisiana. Mon. Wea. Rev., 136, 833–864, https://doi.org/10.1175/2007MWR1946.1.
  • Fig. 1.

    Flowchart of the proposed methodology for efficient probabilistic predictions and uncertainty quantification of TC-driven storm tide and inundation.

  • Fig. 2.

    Example perturbations of an NHC advisory (Hurricane Florence) along the forecast. The ±1σ perturbations for the Gaussian distributed (a) CT, (b) AT, and (c) maximum sustained wind speed (Vmax) errors, and ±0.5 value perturbation of the uniformly distributed (d) radius of maximum wind (Rmax) errors.

  • Fig. 3.

    Perturbed values of random variables (λ) across the 4D space for (a) sparse quadrature and (b) Korobov sequence sampling methodologies. Sparse quadrature has variable weights attached to each perturbation sample (indicated by marker grayscale and size) while the Korobov sequence is equally weighted (marker colors are used to distinguish unique samples across the panels).

  • Fig. 4.

    Example of extrapolating maximum water surface elevations (a) over dry mesh points to obtain artificial maximum water surface elevations (b) for a Hurricane Florence training set member. Here, the extrapolation uses nearest neighbor (k = p = 1), and a friction factor, ff = 0.0004. The gray line indicates the coastline, and the shaded gray area indicates land. Mesh points with ocean depths greater than 25 m have been removed from the dataset.

  • Fig. 5.

    Hurricane Florence 2018 track and intensity information: (a) best track hindcast and (b) NHC forecast advisory 48 h prior to landfall. Figure generated using the Tropycal Python package (https://github.com/tropycal/tropycal).

  • Fig. 6.

    CDF curves of the surrogate model RMSE across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura), using a 158-member sparse quadrature training set and a 59-member Korobov sequence training set with different regression methodologies. The ts is the two-sided t-test statistic between the sparse quadrature RMSE and the Korobov RMSEs corresponding to the colors in the legend (largest positive value indicates smallest average RMSE); sparse_quadrature: sparse quadrature integration. Lasso_SS: Lasso penalized linear regression with shuffle-split cross validation. Lasso_LOO: Lasso penalized linear regression with leave-one-out cross validation. ElasticNet_SS: Elastic net penalized linear regression with shuffle-split cross validation. ElasticNet_LOO: Elastic net penalized linear regression with leave-one-out cross validation.

  • Fig. 7.

    Comparison of the predicted (surrogate) and modeled KL training parameters ξ of the top nine KL modes for the Hurricane Florence forecast. The R2 is the coefficient of determination of the surrogate prediction corresponding to the colors in the legend; μ indicates the eigenvalues of each KL mode evaluated from the sparse quadrature (sq) and 59-member Korobov sequence (k59) training data.

  • Fig. 8.

    CDF curves of the surrogate model RMSE across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura), using 19-, 39-, and 59-member Korobov sequence training sets with ElasticNet_LOO regression. The ts is the two-sided t-test statistic between the sparse quadrature RMSE and the Korobov RMSEs corresponding to the colors in the legend (largest positive value indicates smallest average RMSE); korobov59: 59-member Korobov sequence; korobov39: 39-member Korobov sequence; korobov19: 19-member Korobov sequence.

  • Fig. 9.

    Comparison of the surrogate model accuracy for different extrapolation parameters and construction in linear space vs log space (linear and log in the legend, respectively). The integers after k and p in the legend are parameters from the extrapolation Eq. (1), and the decimal after n is the Manning’s n coefficient used to compute ff in Eq. (2). Results are shown across all test set members (128 samples) for the three hurricane forecasts (Irma, Florence, Laura), using 59-member Korobov sequence training sets with ElasticNet_LOO regression. (left) RMSE CDF curves, where ts is the two-sided t-test statistic between the log-space RMSE and the linear-space RMSEs corresponding to the colors in the legend (largest positive value indicates smallest average RMSE). (right) Total percentage of mesh points across all test set members with a false wet/dry classification.

  • Fig. 10.

    Comparison of the modeled maximum water surface elevation (Z) and its timing, the surrogate predicted Z, and the modeled astronomical tide time series at a coastal location in the nearby vicinity of the forecasted track of each hurricane (Irma, Florence, Laura). Results are shown across all test set members (128 samples) using the surrogate model trained with the 59-member Korobov sequence with ElasticNet_LOO regression. The x-axis position of the surrogate predicted Z (for visualization purposes only) is chosen by offsetting from the modeled timing by 2 h to the left; the surrogate model does not make any predictions of timing in this study. Coastal locations are NOAA tide gauges (https://tidesandcurrents.noaa.gov) with station names and IDs shown in the panel titles. ME: mean error; SDE: standard deviation of the error between the modeled and predicted Z.

  • Fig. 11.

    Total effect sensitivity indices of maximum water surface elevation (Z) to λ (CT, AT, Vmax, Rmax) for the three hurricane forecasts (Irma, Florence, Laura). The dashed black line is the track of the NHC forecast advisory 48 h prior to landfall. The gray line indicates the coastline, and shaded gray area indicates land. Mesh points with ocean depths greater than 25 m have been removed from the dataset.

  • Fig. 12.

    Best track hindcast and probabilistic predictions of maximum water surface elevations (10%, 50%, and 90% exceedance probabilities) for the three hurricane forecasts (Irma, Florence, Laura). The dashed black line is the best track for the left-hand side panels or the track of the NHC forecast advisory 48 h prior to landfall in the other panels. The gray line indicates the coastline, and shaded gray area indicates land. Mesh points with ocean depths greater than 25 m have been removed from the dataset.

  • Fig. 13.

    Reliability plot of the 48-h probabilistic forecast of maximum water surface elevation against the best track hindcast for the three hurricane events (Irma, Florence, Laura). The surrogate model results are compared to direct empirical evaluation of the Korobov sequence training set used to generate the surrogate model, as well as to direct empirical evaluation of the randomly generated 128-member test set (random128).

  • Fig. 14.

    Predicted inundation area of the 48-h probabilistic forecast as a fraction of the inundation area of the best track hindcast for the three hurricane events (Irma, Florence, Laura). The surrogate model results are compared to direct empirical evaluation of the Korobov sequence training set used to generate the surrogate model, as well as to direct empirical evaluation of the randomly generated 128-member test set.
