## Abstract

The majority of structural damage and loss of life during a hurricane is due to storm surge; thus, it is important for communities in hurricane-prone regions to understand their risk due to surge. Storm surge is largely influenced by coastal features such as topography/bathymetry and bottom roughness, where bottom roughness determines the resistance to flow. Manning's formula can be used to model the bottom stress through the Manning's *n* coefficient, a spatially dependent field. Given a storm surge model and a set of model outputs, an inverse problem may be solved to determine probable Manning's *n* fields to use in predictive simulations.

The inverse problem is formulated and solved in a measure-theoretic framework using the state-of-the-art Advanced Circulation (ADCIRC) storm surge model. The use of measure theory requires minimal assumptions and involves the direct inversion of the physics-based map from model inputs to output data determined by the ADCIRC model. Thus, key geometric relationships in this map are preserved and exploited. By using a recently available subdomain implementation of ADCIRC that significantly reduces the computational cost of forward model solves, the authors demonstrate the method on a case study using data obtained from an ADCIRC hindcast study of Hurricane Gustav (2008) to quantify uncertainties in Manning’s *n* within Bay St. Louis. However, the methodology is general and could be applied to any inverse problem that involves a map from model input to output quantities of interest.

## 1. Introduction

We use a recently developed measure-theoretic method that requires minimal assumptions, can be used to identify optimal measurement locations, and can be applied to any computational model to estimate parameters. The focus is on coastal ocean modeling during a hurricane, with specific attention on critical parameters affecting storm surge, as a way of introducing this advanced mathematical approach to the community. We lay the groundwork for future research that could apply this methodology to atmospheric parameters in a coupled atmospheric–ocean hurricane model. Such an application could use both water surface elevation and meteorological data to improve estimates of parameters critical to accurate hurricane forecasting, such as the central pressure, radius of maximum wind, maximum wind speed, forward speed, and the hurricane track. Furthermore, this methodology is general enough to be broadly applicable to any physics-based predictive model, provided the quantities-of-interest map varies smoothly with respect to small perturbations of the model parameters.

### a. Storm surge and Manning’s n

In recent years a succession of hurricanes has caused significant damage to communities along the U.S. coastline, with the majority of damage and loss of life due to storm surge (i.e., inundation caused as hurricanes make landfall). During a hurricane, the balance between momentum gain from wind stress and momentum loss from bottom stress is one of the primary controlling factors of storm surge (Schubert et al. 2008). Water levels and currents during a storm surge event are usually modeled by the shallow-water equations (SWEs), with bottom friction described using a standard parameterization, such as the Gauckler–Manning–Strickler formula. In this formula, the variation of bottom friction due to varying roughness is described through the Manning’s *n* parameter. Correctly characterizing bottom friction was shown to be especially important in hindcasting Hurricane Ike (Zheng et al. 2013; Martyr et al. 2013; Kerr et al. 2013; Kennedy et al. 2011). It is therefore critical to determine the Manning’s *n* values in order to ensure accurate and precise predictions of storm surge.

### b. Uncertainty in Manning’s n

Manning’s *n* is currently used to model momentum loss due to bottom stress in a variety of coastal applications, but determination of a proper value of Manning’s *n* in any model is fraught with uncertainties. On the modeling side, Manning’s *n* as developed in the Gauckler–Manning–Strickler formula only relates average cross-sectional velocity, hydraulic radius, and the channel bed slope for open-channel flow in a fully turbulent, fairly regular, unvegetated domain (Gioia and Bombardelli 2001; Yen 1992a,b, 2002; Leopold et al. 1964). Manning’s *n* can be empirically calculated for environmental flows that fit within the original scope of the Gauckler–Manning–Strickler formula by sampling the diameters of roughness elements or from field or laboratory measurements (Myers et al. 1999; Barnes 1967; Bathurst 1985; Clifford et al. 1992). However, the Gauckler–Manning–Strickler formula is commonly used beyond this original scope to model flow resistance for various environmental flows (Forzieri et al. 2011; Schubert et al. 2008). For environmental flows, Arcement and Schneider (1989), along with Chow (2008) and Barnes (1967), have provided guidelines for selecting Manning’s *n* that extend the original domain of application to flood plains, vegetated waterways, and channels with irregular boundaries. There are multiple subgrid processes that cause momentum loss, two of which are 1) the bottom friction due to bed forms and bed material (sand, silt, mud, etc.) and 2) the form drag due to flexible vegetation. Manning’s formula represents only bottom friction, but it is used as a convenient model to capture form drag caused by vegetation (Lane 2014; Wamsley et al. 2010; Kouwen and Unny 1973). Regardless, complex models require a representation of momentum loss due to subgrid processes such as bottom friction, form drag due to vegetation, bed forms, and various porous-media-like structures that are present in coastal estuaries (Dietrich et al. 2010, 2011a; Mignot et al. 2006).

It is possible to determine the Manning’s coefficient through hydraulic calibration for a specific geographic location (Orlandini 2002). Unfortunately, using this process to develop a Manning’s *n* field for a highly resolved finite element (FE) mesh over a large, complex physical domain, such as those often used in storm surge modeling, would be prohibitively expensive. Alternatively, a Manning’s *n* value can be developed for specific types of land cover. The Coastal Change Analysis Program (C-CAP) Regional Land Cover database (NOAA) and the National Land Cover Database (NLCD, USGS) document the geographical distribution of different types of land cover using a land cover classification scheme. We use these datasets to determine the spatial distribution of land cover classifications (Agnew 2012; Bunya et al. 2010; Schumann et al. 2007; Medeiros et al. 2012; Wamsley et al. 2010; Tao and Kouwen 1989).

Because of the modeling error and uncertainty inherent in empirical observations discussed above, the actual values of Manning’s *n* are subject to a great deal of uncertainty. This uncertainty must be quantified, and reduced wherever possible, in order to improve inferences drawn from storm surge predictions.

### c. Quantifying uncertainties in Manning’s n

Model input parameters such as Manning’s *n* are often inferred by solving an inverse problem using model output data sensitive to the model input parameters of interest. The primary goal of this paper is to demonstrate how a measure-theoretic approach for inverse sensitivity analysis can be used to both *quantify and reduce* uncertainties in the inputs of a storm surge model. Specific focus is given to how this novel mathematical approach can aid in the design of observation networks in order to increase precision in quantified uncertainties. While the approach used here is quite general, we focus on the important Manning’s *n* parameter in the Bay St. Louis area using simulations of Hurricane Gustav (2008) to design spatial configurations of buoys recording maximum water elevation data. This is a compelling case study and also serves to make the mathematical ideas concrete. To simulate data from Hurricane Gustav, we use the Advanced Circulation model for oceanic, coastal, and estuarine waters (ADCIRC). The ADCIRC model solves the SWEs on highly resolved, unstructured FE meshes (Luettich and Westerink 2004, 2010).

There are many types of uncertainty quantification (UQ) methods currently adopted in the scientific community for quantifying uncertainties in the inputs of a model. Below, in order to locate and distinguish the contribution of this work within the vast field of UQ research, we provide a brief literature review of some of the more popular UQ methods and draw clear distinctions between them and the measure-theoretic approach.

First, to provide a common framework for understanding the differences between UQ methods that formulate and solve inverse problems, we introduce some general notation. Define the quantities of interest (QoI) as the set of observable output data of a physics-based model used to formulate the inverse problem. The QoI define a physics-based map from the input parameter space to the data space, which we denote by $Q: \Lambda \to \mathcal{D}$, where $\Lambda$ is the domain of input parameters and $\mathcal{D}$ is the range of possible model output data. The first complication in the formulation and the solution to an inverse problem is that, in general, the inverse map $Q^{-1}$ defines a *set-valued* function. Essentially, the mapping between the input and output is nonunique in that a single datum maps to a set of parameters rather than a single point in parameter space. This is what is often referred to as ill-posedness of the inverse problem. Another complication to the formulation and solution of the inverse problem is that observed data, even from the computational model, are often uncertain due to measurement, modeling, and discretization errors. It is common to describe these uncertainties in terms of probabilities (e.g., using a probability measure $P_{\mathcal{D}}$ or density $\rho_{\mathcal{D}}$). The solution to the inverse problem is then a probability measure or density ($P_{\Lambda}$ or $\rho_{\Lambda}$, respectively) on the input parameter space. We emphasize that the introduction of stochasticity/probability on $\mathcal{D}$ does not alter the fundamental fact that the direct inverse of the physics-based QoI map is a set-valued function.
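To make the set-valued inverse concrete, consider a toy two-parameter map (a hypothetical stand-in, not the ADCIRC QoI map): every datum pulls back to an entire contour of parameter values, which can be approximated by sampling.

```python
import numpy as np

# Toy QoI map from a 2D parameter space to a single datum.
# Many distinct parameter vectors produce the same output, so the
# inverse of one datum is a set (a contour), not a point.
def qoi_map(lam):
    lam1, lam2 = lam
    return lam1 ** 2 + lam2 ** 2  # circular generalized contours

# Two different parameter vectors map to the same datum
assert np.isclose(qoi_map((1.0, 0.0)), qoi_map((0.0, 1.0)))

# Approximate the inverse set Q^{-1}(1) by dense uniform sampling
rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(100_000, 2))
outputs = qoi_map(samples.T)  # vectorized over all samples
preimage = samples[np.abs(outputs - 1.0) < 0.01]
print(len(preimage))  # hundreds of samples lie near the unit circle
```

Here the preimage approximates one "generalized contour"; in the application below, the contours live in a four-dimensional space of Manning's *n* coefficients.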

The set-valued inverses of the QoI map impart a significant amount of geometric information on the parameter space that should be respected if we are to maintain confidence in the physical interpretation of solutions to the inverse problem. This is precisely the advantage of using a measure-theoretic framework for the formulation, solution, analysis, and numerical approximation of the inverse problem for scientific inference (Breidt et al. 2011; Butler and Estep 2013; Butler et al. 2012, 2014; T. Butler et al. 2014, unpublished manuscript, hereafter B14). We have previously studied this problem for the hydrodynamic model of an idealized inlet in Butler et al. (2015). The takeaway message is that the solution to the stochastic inverse problem is a nonparametric probability measure that fully exploits the geometric information imparted by the physics-based QoI map. A major contribution of this work is a demonstration that the geometry can be further exploited to aid in optimal experimental design, which here means choosing the optimal spatial configuration of buoys to record maximum water elevations and reduce uncertainties in Manning’s *n*. This is significantly different from other work in this community that has studied the effectiveness of targeted measurements, with mixed results, without the use of such geometric analysis (e.g., see Hamill et al. 2013).

Approaches for solving the core deterministic inverse problem of mapping a datum to parameters often form the foundation of UQ methods for solving stochastic inverse problems. One common approach for deterministic inversion is to “regularize the map” or use “regularization,” which implies that the map *Q* is fundamentally changed, usually in favor of defining a misfit functional mapping between $\Lambda$ and $\mathbb{R}$, and the inverse problem is formulated and solved as an optimization problem with a unique solution (Ding and Wang 2005). Since altering the map *Q*, which comes purely from the physics in the model, creates a new mapping from $\Lambda$ to $\mathbb{R}$ using information not present in the model, the choice of regularization should be physically supported. Thus, any UQ method based upon regularization fundamentally formulates and solves a different type of inverse problem and requires additional assumptions.

Alternate methods for solving stochastic inverse problems include Bayesian approaches (Yang et al. 2007; Hall et al. 2011). A Bayesian approach often formulates the solution on $\Lambda$ directly in terms of a posterior taken to be proportional to the product of a likelihood function and a prior density on $\Lambda$. The map *Q* often appears as an argument of the likelihood function [e.g., using discrepancies between the model-predicted value $Q(\lambda)$ for a particular $\lambda \in \Lambda$ and an actual datum $d$]. This essentially replaces the physics-based map *Q* with a statistical map given by the likelihood. As with regularization-based approaches, since the map is fundamentally different between the spaces $\Lambda$ and $\mathcal{D}$, we see that Bayesian approaches again define and solve a different type of inverse problem requiring additional assumptions. In fact, under certain conditions, the maximum likelihood point of a posterior density defined using a Bayesian approach and the minimum of a misfit functional solved using regularization can be shown to be the same.

In the coastal ocean modeling and atmospheric science communities, data assimilation techniques based upon the ensemble Kalman filter are perhaps the most commonly used methods for estimating both parameters and state variables of models (Mayo et al. 2014; Aksoy 2015; Ruiz et al. 2013). The attractiveness of these types of data assimilation methods lies in both their ease of implementation and their ability to provide so-called real-time updates to parameter estimates and state variable forecasts as new data become available. In Ruiz et al. (2013), it was shown that proper estimation of parameters can have positive effects on predictions both in the short term and in medium-range forecasts. However, one of the most difficult steps in using data assimilation techniques is the initialization step. For example, in Mayo et al. (2014), it was shown that incorrect initialization of Manning’s *n* values in Galveston Bay led to final ensemble estimates of Manning’s *n* values that were incorrect by as much as 63% despite relatively little error in predicting water elevations from tidal forcing. Fundamentally, this is due to the fact that the inverse of the map from Manning’s *n* to water elevation data is set valued. The consequences of such incorrect estimates of Manning’s *n* values for predictions of storm surge during extreme events are uncertain. We demonstrate that it may be possible to exploit the measure-theoretic approach to provide better initialization of an ensemble of model parameters in real-time data assimilation schemes. In fact, since the probability measures are often nonparametric, this approach may in particular enhance initializations of ensembles for particle filtering methods (van Leeuwen 2009). While a full exploration is outside the scope of this study, we provide some results showing how to utilize the measure-theoretic solution to generate and analyze ensembles of Manning’s *n* fields.
Moreover, open source software, data files used in this study, and instructions are provided online for the interested reader to reconstruct the numerical results and experiment with different applications of these results.

### d. Outline

We describe the state-of-the-art ADCIRC model in section 2. In section 3, we give a concrete description of the measure-theoretic approach for quantifying uncertainties as well as considerations in designing experiments in the context of storm surge and Manning’s *n*. In section 4, we provide details on the setup of the parameter and data spaces for the Hurricane Gustav case study as well as computational details on the simulation. In section 5, we show numerical results for the solution to the stochastic inverse problem to estimate Manning’s *n*, and we describe how choices of buoy configurations affect the ensembles of Manning’s *n* fields generated from the solution to the inverse problem. Finally, conclusions and directions of possible future research are provided in section 6.

## 2. ADCIRC coastal circulation model and Manning’s *n*

We model hurricane storm surge using ADCIRC, a prominent FE model that solves the SWEs on an unstructured mesh, with a finite-difference temporal discretization (Luettich and Westerink 2010). The ADCIRC model has been extensively verified and validated using hindcasts of numerous hurricanes including but not limited to Hurricanes Ike, Katrina, and Gustav (Bunya et al. 2010; Dietrich et al. 2010; Kennedy et al. 2011; Hope et al. 2013; Dietrich et al. 2011a). The SWEs are commonly used to model fluid dynamic systems where the horizontal length scale is large relative to the vertical length scale. Denote the total water column height $H = \zeta + h$, where *h* is the bathymetric depth and *ζ* is the free surface elevation above the geoid or datum; then the continuity equation is

$$\frac{\partial \zeta}{\partial t} + \frac{\partial Q_x}{\partial x} + \frac{\partial Q_y}{\partial y} = 0, \tag{2.1}$$

where $Q_x = UH$ and $Q_y = VH$ are the $x$- and $y$-directed fluxes per unit width, and $U$ and $V$ are the depth-averaged velocities (Luettich and Westerink 2010). The fluxes per unit width $Q_x$ and $Q_y$ in the shallow-water equations should not be confused with the QoI map denoted by *Q*. Note that in ADCIRC the continuity equation (2.1) is replaced by the generalized wave continuity equation (GWCE) for its superior accuracy properties (Luettich and Westerink 2010).

The accompanying momentum equations in conservative form are

$$\frac{\partial Q_x}{\partial t} + \frac{\partial (U Q_x)}{\partial x} + \frac{\partial (V Q_x)}{\partial y} - f Q_y = -gH \frac{\partial}{\partial x}\left[\zeta + \frac{p_s}{g\rho_0} - \alpha\eta\right] + \frac{\tau_{sx}}{\rho_0} - \frac{\tau_{bx}}{\rho_0} + M_x - D_x - B_x, \tag{2.2}$$

$$\frac{\partial Q_y}{\partial t} + \frac{\partial (U Q_y)}{\partial x} + \frac{\partial (V Q_y)}{\partial y} + f Q_x = -gH \frac{\partial}{\partial y}\left[\zeta + \frac{p_s}{g\rho_0} - \alpha\eta\right] + \frac{\tau_{sy}}{\rho_0} - \frac{\tau_{by}}{\rho_0} + M_y - D_y - B_y, \tag{2.3}$$

where *g* is gravitational acceleration, *f* is the Coriolis parameter, $p_s$ is the atmospheric pressure at the sea surface, $\rho_0$ is the reference density of water, *η* is the Newtonian equilibrium tidal potential, and *α* is the effective earth elasticity factor. The terms $\tau_{bx}$ and $\tau_{by}$ are the bottom stresses, $\tau_{sx}$ and $\tau_{sy}$ are the imposed surface stresses, $M_x$ and $M_y$ are the vertically integrated lateral stress gradients, $D_x$ and $D_y$ are momentum dispersion terms, and $B_x$ and $B_y$ are the vertically integrated baroclinic pressure gradients (Luettich and Westerink 2010).

The bottom stress terms are defined using a quadratic drag law:

$$\frac{\tau_{bx}}{\rho_0} = C_f |\mathbf{u}| U, \qquad \frac{\tau_{by}}{\rho_0} = C_f |\mathbf{u}| V, \tag{2.4}$$

where $|\mathbf{u}| = \sqrt{U^2 + V^2}$ denotes the magnitude of the depth-averaged water velocity. The coefficient $C_f$ is defined using the Gauckler–Manning–Strickler formula:

$$C_f = \frac{g n^2}{H^{1/3}},$$

where *n* is a spatially heterogeneous parameter denoting the Manning’s *n* roughness coefficient. For *H* sufficiently small, *H* is set to a constant in this formula to prevent division by zero. Because of this formulation, in regions where *H* is small (i.e., estuaries, tidal flats, shallow channels, etc.) the bottom stress is especially sensitive to Manning’s coefficient.
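As a sketch of this sensitivity (the variable names and the minimum-depth cutoff below are our illustrative choices, not ADCIRC's internal constants):

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)

def drag_coefficient(n, H, H_min=0.1):
    """Gauckler-Manning-Strickler drag coefficient C_f = g n^2 / H^(1/3).

    H is floored at H_min (an illustrative cutoff) to avoid division
    by zero in nearly dry regions.
    """
    H_eff = np.maximum(H, H_min)
    return G * n ** 2 / H_eff ** (1.0 / 3.0)

# Bottom stress grows rapidly as the water column becomes shallow,
# so shallow regions are especially sensitive to Manning's n:
shallow = drag_coefficient(n=0.03, H=0.5)
deep = drag_coefficient(n=0.03, H=10.0)
assert shallow > deep
```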

## 3. Measure-theoretic inversion for Manning’s *n*

To the authors’ knowledge this is the first time the measure-theoretic approach used in this work for formulating and solving stochastic inverse problems has been presented to the intended audience of this journal. We therefore provide a brief but completely abstract description of the measure-theoretic approach in appendix A, the computational algorithm for computing approximate nonparametric probability measures in appendix B, and some details about the structure of solutions that lead to optimal experimental design in appendix C. The interested reader may use these appendices as well as the references therein to formulate and solve additional measure-theoretic inverse problems in their desired research applications. Here, we summarize the framework and formulation of the stochastic inverse problem from Butler and Estep (2013) and B14 placed in the context of the given application.

### a. Formulation of the stochastic inverse problem

In this application, the ADCIRC model provides a mapping from the Manning’s *n* values to the space of temporally variable free surface elevations and velocities. The mapping is complicated to study for a variety of reasons; for example, the Manning’s *n* values enter into the ADCIRC model nonlinearly through the bottom stress terms [(2.1)–(2.4)] and the values of Manning’s *n* are uncertain for a variety of reasons (see section 1). Moreover, the space of functions defining the free surface elevations and velocities is an infinite dimensional space. Practical considerations demand the determination of a finite set of observations that are useful for quantifying uncertainties in Manning’s *n*. Since the ADCIRC model has been well validated against maximum storm surge data (Bunya et al. 2010; Dietrich et al. 2010; Kennedy et al. 2011; Hope et al. 2013; Dietrich et al. 2011a), we limit the focus in this study to observation stations recording maximum free surface elevations (i.e., maximum storm surge).

Another key consideration in the use of this data is that small perturbations to input parameters do not lead to arbitrarily large variations in such output data. This has been numerically observed in the responses of the ADCIRC model in a previous work demonstrating both continuity and differentiability of maximum water elevation data with respect to Manning’s *n* (Butler et al. 2015). Thus, given a defined set of observation stations and a compact set of physically possible Manning’s *n* values associated with a set of land classification types, the range of observable data can be well approximated by evaluating the ADCIRC model using a finite sampling of Manning’s *n* values.

The use of the ADCIRC model and maximum storm surge data cannot, in general, determine a unique vector of Manning’s *n* values. However, given the smooth response of the storm surge data to Manning’s *n*, the inverse set of Manning’s *n* values associated to a particular output datum is identifiable and can be approximated (see appendix A). This implies that the map from Manning’s *n* to storm surge data defines a bijection between *sets of Manning’s n values* and maximum storm surge data. Conceptually, it is useful to think of this simply as a way to generalize a contour map.

By simple properties of random variables/vectors, if a set of maximum storm surge data is assigned a particular probability, then the set of all Manning’s *n* values that map to this set must also be assigned the same probability. In other words, any description of uncertainty in the maximum storm surge data as a probability measure uniquely defines a probability measure on the generalized contour map of Manning’s *n* values. This is one type of solution to the stochastic inverse problem, but it is not particularly satisfying since we generally would prefer to be able to compute the probabilities of more general sets of Manning’s *n* values and not be forced to work with “contour events” that are complicated to approximate.

Obtaining the type of probability measure on the space of Manning’s *n* values we desire requires the adoption of an ansatz. The details of this are rather technical requiring understanding of a disintegration theorem (see appendix A). Suffice it to say that we simply proportion out probabilities onto subsets of the set-valued inverses according to their relative volumes (although other types of ansatz are possible). This particular ansatz further exploits the generalized contour map resulting in geometrically complex nonparametric probability measures solving the inverse problem (e.g., see the marginal probability plots in section 5).

### b. Approximation of the inverse solution

The basic idea is to exploit the smoothness of the maximum storm data with respect to Manning’s *n* values to reduce evaluation of the ADCIRC model to a small number of samples of Manning’s *n* values. Specifically, the collection of samples of Manning’s *n* values implicitly defines a partitioning of the space of all possible Manning’s *n* values into nonoverlapping subsets. Evaluation of the ADCIRC model at each sample of Manning’s *n* values defines a particular maximum storm surge datum. Then, having discretized the probability measure on the space of maximum storm surge data, the smoothness provides the justification for identifying which subsets of Manning’s *n* values approximate which contour events of specified probability. Finally, the ansatz is applied to proportion the probability of a contour event to each subset approximating it. The end result is a probability measure defined on the original space of Manning’s *n* values. The interested reader is referred to appendix B for a more thorough description of the details on the algorithm used in this work.
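The steps above can be sketched in a few lines, with a cheap synthetic surrogate standing in for an ADCIRC solve and a simple indicator-based binning standing in for the cell-based algorithm of appendix B (all names and numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# 1) Sample the parameter space (stand-in for Manning's n vectors).
lam = rng.uniform(0.01, 0.2, size=(5000, 2))

# 2) Evaluate the forward map at each sample (a smooth synthetic
#    response standing in for ADCIRC maximum surge data).
def forward(lam):
    return 2.0 * lam[:, 0] + 0.5 * lam[:, 1] ** 2

q = forward(lam)

# 3) Discretize the probability measure on the data space: here the
#    observed datum is uncertain within an interval of probability 1.
d_lo, d_hi = 0.10, 0.14

# 4) Identify which parameter samples map into the probable data set
#    (these cells approximate the contour event) ...
in_event = (q >= d_lo) & (q <= d_hi)

# 5) ... and apply the volume-based ansatz: with uniform sampling,
#    each cell receives probability proportional to its share of
#    the approximated contour event.
prob = in_event / in_event.sum()
assert np.isclose(prob.sum(), 1.0)
print(f"{in_event.sum()} of {len(lam)} samples carry the probability")
```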

### c. The condition of the stochastic inverse problem and optimal observations

In Butler et al. (2015), we saw that a quantifiable aspect of the QoI map defined as skewness provides a type of condition number for the stochastic inverse problem. We provide some brief remarks on how this may be used to identify optimal configurations of buoys (i.e., how we may define and solve an optimal experimental design problem). Here, we have chosen to focus on optimal buoy configuration; however, this methodology is applicable to any set of potential observations including sets composed of varying types of observables. For more details, we refer the interested reader to appendix C.

First, to define the perspective in which we consider one configuration of buoys to be more optimal than others, we consider a general application of the solution of the stochastic inverse problem, which is to sample the resulting probability measure in order to generate an ensemble of predictions. The predictions can either be for data at future times or for unobservable data at current (or even past) times (e.g., where there are no experimentally verifiable data due to limitations in the ability to deploy buoys). Thus, for simplicity, we assume the goal of choosing a particular configuration of buoys used to solve the stochastic inverse problem is to subsequently provide ensembles of predictions such that the mean of the ensemble is more accurate and the variance of the ensemble is reduced.

We note that the problem studied in this work is nominally a problem of parameter identification under uncertainty (i.e., there is a true vector of Manning’s *n* values leading to the noisy/uncertain maximum storm surge data). If a particular configuration of buoys leads to more probability concentrated in *smaller sets* containing the true Manning’s *n* value than other configurations of buoys, then we expect that the accuracy of a prediction ensemble mean is improved and the variance of the ensemble is reduced. In appendix C, we connect the skewness of a map to this basic idea. We show in the numerical results that this skewness provides a reasonable surrogate for determining which possible configuration of buoy stations is optimal for solving the inverse problem, and we investigate the effect on predictions. Given this knowledge, we can then algorithmically select which observations would be best used for a particular set of parameters. When there are many types of observations to choose from, these may simply be added to the space of potential observables from which we choose the subset of observables with the smallest skew as defined in appendix C. An interesting consequence of the measure-theoretic perspective and skewness results is that we often find that “data rich” problems where many QoI are recorded are actually “information poor” in the sense that the information contained in many QoI is redundant and fails to improve the solution to the stochastic inverse problem in meaningful ways. While a full discussion of this is beyond the scope of this work, we demonstrate in the numerical results of section 5 that removing certain components of a “poorly skewed” QoI map may have negligible effect on solutions to the stochastic inverse problem. On the other hand, a careful selection of a small number of optimal (or nearly optimal) buoy stations can provide significant precision, implying that the overall cost of collecting useful data may be mitigated by such an analysis.
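A minimal sketch of this selection idea follows, under the assumption that rows of a hypothetical Jacobian encode each candidate buoy's sensitivity to the parameters; the skewness computed here follows the spirit of appendix C rather than its exact formula:

```python
import itertools
import numpy as np

def skewness(J):
    """Skewness of a QoI map with Jacobian rows J (one row per QoI).

    For each row, compare its norm to the norm of its component
    orthogonal to the span of the remaining rows; the map's skewness
    is the worst (largest) such ratio. Orthogonal rows give 1 (best);
    nearly parallel (redundant) rows give large values.
    """
    J = np.asarray(J, dtype=float)
    ratios = []
    for k in range(J.shape[0]):
        others = np.delete(J, k, axis=0)
        # Project row k onto the span of the other rows via least squares
        coef, *_ = np.linalg.lstsq(others.T, J[k], rcond=None)
        perp = J[k] - others.T @ coef
        ratios.append(np.linalg.norm(J[k]) / np.linalg.norm(perp))
    return max(ratios)

# Hypothetical sensitivities (rows: candidate buoys, cols: parameters)
J_candidates = np.array([[1.0, 0.0],    # buoy A
                         [0.9, 0.1],    # buoy B (nearly redundant with A)
                         [0.0, 1.0]])   # buoy C

# Choose the pair of buoys with the smallest skewness
best = min(itertools.combinations(range(3), 2),
           key=lambda pair: skewness(J_candidates[list(pair)]))
assert best == (0, 2)  # A and C: orthogonal sensitivities win
```

Note that the "data rich, information poor" observation falls out directly: adding buoy B to the set {A, C} contributes an almost redundant row and does not sharpen the inverse solution.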

## 4. Hurricane Gustav case study

In this section we examine parameter estimation of Manning’s *n* using a hindcast of Hurricane Gustav.^{1} We focus on Bay St. Louis, a coastal region in southern Louisiana and Mississippi affected by Hurricane Gustav. Hurricane simulations on meshes fine enough to resolve inundation are computationally expensive. To reduce this cost we employ a recently available subdomain implementation of ADCIRC (Baugh et al. 2013; Simon 2011; Altuntas 2012; Baugh et al. 2015). This also allows us to focus on specific regions of interest. It is important to note that this methodology can be applied to any region of the coast.

### a. Model development and hindcast details

The full-domain ADCIRC model for Hurricane Gustav is based on the model in Dietrich et al. (2011a) using a mesh of the domain containing the bathymetry/topography data shown in Fig. 1 developed over a number of years by various researchers. The corresponding Manning’s *n* field for the full domain is shown in Fig. 2. We use the same wind field and riverine inflows developed in Dietrich et al. (2011a). After a 20-day tidal and riverine spinup, we begin wind forcing at 15-min intervals starting at 0000 UTC 26 August 2008 (approximately 6.5 days before landfall) and ending at 0000 UTC 4 September 2008 (approximately 2.5 days after landfall). However, this model differs from the hindcast in that we do not model waves with Simulating Waves Nearshore (SWAN) (Dietrich et al. 2011b). The full-domain model contains 2 720 591 elements and, using Parallel ADCIRC (PADCIRC), runs to completion on 288 processors on the Stampede supercomputer at the Texas Advanced Computing Center (TACC 2015) in approximately 11.5 h.

We reduce the cost of forward model evaluations by using Subdomain ADCIRC (Simon 2011; Altuntas 2012; Baugh et al. 2013, 2015). Results are shown for a subdomain extracted from the full-domain mesh. This subdomain was chosen for its proximity to existing observation station data sites and its geographic interest (e.g., inundation and complex local land classification distributions). This subdomain focuses on Bay St. Louis in Mississippi, contains 15 001 elements, and is shown in Fig. 3. Subdomain ADCIRC forces the boundary of the subdomain with conditions from the full-domain solution: velocities, sea surface heights, and wet/dry states (Simon 2011; Altuntas 2012; Baugh et al. 2013, 2015). A full 29-day hurricane simulation of the subdomain runs to completion on Stampede in approximately 5.5 h on two processors (this time includes reconstructing and distributing the Manning’s *n* mesh across the processors). For more details of the steps involved in subdomain modeling in ADCIRC, we refer the interested reader to appendix D.

### b. Defining the parameter domain

We apply domain-specific knowledge inherent in the land cover classification system to reduce the dimension of the parameter space due to spatial variation. Additionally, subdomains can feature as few as 4 dominant land cover classifications, whereas the full domain may feature as many as 23 land cover classifications, thus further reducing the dimension of the parameter space (Dietrich et al. 2011a; Agnew 2012). This *drastically* reduces the dimension of the space of Manning’s *n* values while retaining both a highly spatially variable and high-resolution representation of the Manning’s *n* field.

The parameter space for any subdomain model is determined as in Butler et al. (2015) using a mesoscale representation of a Manning’s *n* field parameterization by land cover classification. The result is a linear mapping from to , where *M* is the total number of nodes in the FE mesh and is the vector representation of the continuous, piecewise-linear mesoscale parameter field for a given set of empirical pixelated land classification data. We construct this representation offline in parallel using PolyADCIRC (Graham 2016), which contains a Python wrapped version of GridData (Tanaka 2009). The process of creating this type of mesoscale representation has been well studied and is a generally accepted method for determining the Manning’s *n* field using *fixed* values of Manning’s *n* coefficients (Agnew 2012; Butler et al. 2015). We refer the interested reader to Butler et al. (2015) for a detailed description of the construction of the land cover classification meshes.
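A minimal sketch of this linear mapping follows, with a made-up nodal-fraction matrix in place of the one PolyADCIRC assembles from land classification data:

```python
import numpy as np

# Hypothetical fractions of each land cover class at each mesh node:
# T[i, j] = fraction of the area around node i covered by class j.
# A real T comes from overlaying land classification pixels on the
# FE mesh; these numbers are made up for illustration.
T = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.2, 0.8],
              [0.0, 0.0, 1.0]])  # M = 4 nodes, 3 classes

# Manning's n coefficient per land cover classification
n_class = np.array([0.02, 0.07, 0.14])

# The mesoscale nodal field is the linear map applied to n_class
n_nodal = T @ n_class
assert n_nodal.shape == (4,)
assert np.isclose(n_nodal[1], 0.045)  # 0.5*0.02 + 0.5*0.07
```

Because the map is linear, changing the per-class coefficients never requires reprocessing the classification data; only the cheap matrix-vector product is recomputed for each new parameter sample.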

In this parameterization the uncertain parameters are the Manning’s *n* coefficients associated with each land cover classification. We use land cover classification data from C-CAP (NOAA Coastal Services Center 2013) to create a set of land cover classification meshes. We fill in regions not covered by the C-CAP data with the existing Manning’s *n* nodal data from the original mesh used for the hindcast. The Manning’s *n* coefficients used for the full-domain model are included in Table 1; the values on the continental shelf in the Gulf of Mexico are set to fixed values for sand/gravel bottoms and for muddy bottoms.

We determine the dominant land cover classifications by calculating the total fraction of each land cover classification as the sum of the nodal fractions of that classification normalized by the total number of nodes; these fractions are tabulated by percentage in Table 1. We define our uncertain parameters as the Manning’s *n* coefficients for the four most dominant land cover classifications and hold the Manning’s *n* coefficients constant for the remaining classifications. We allow the Manning’s *n* coefficients to vary between 33% and 175% of their original values based on tables available in Furniss et al. (2006). The resulting four-dimensional parameter domain is shown in Table 2 and the corresponding land cover classification meshes are shown in Fig. 4. These define a specific mesoscale representation of the Manning’s *n* field that is parameterized by the dominant land cover classifications in the physical domain. This parameterization removes the consideration of spatial uncertainty in the bottom friction field (as opposed to, e.g., using a Karhunen–Loève expansion), thus significantly reducing the problem size. In the event such land cover classification data were unavailable, satellite images and airborne lidar could be used to estimate the number of dominant land cover classifications (Jung et al. 2014; Song et al. 2002; Antonarakis et al. 2008; Brennan and Webster 2006; Im et al. 2008; Zhou et al. 2009).
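The parameterization above treats the nodal Manning's *n* field as a linear combination of land-classification basis functions weighted by the uncertain coefficients. A minimal sketch of this idea (all array names and values are hypothetical, not taken from the paper's meshes):

```python
import numpy as np

# Sketch of the mesoscale parameterization: the nodal Manning's n field is a
# linear map from the K uncertain land-classification coefficients lam to the
# M mesh nodes. B[j, k] is the (hypothetical) fraction of node j covered by
# land classification k; rows sum to <= 1, and any remainder keeps a fixed
# default value (e.g., open water).

def manning_field(lam, B, n_default=0.02):
    """Nodal Manning's n field: n = B @ lam, plus fixed default contribution."""
    lam = np.asarray(lam, dtype=float)
    n = B @ lam
    # nodes not covered by the varying classifications keep the default value
    residual = 1.0 - B.sum(axis=1)
    n += residual * n_default
    return n

# toy example: 3 nodes, 2 varying classifications
B = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 0.0]])   # last node is entirely open water (default)
lam = np.array([0.15, 0.03])
field = manning_field(lam, B)
```

The key design point is that `B` is built once (offline) from the pixelated land cover data, so each new parameter sample only costs a matrix-vector product.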

### c. Defining the data domain

Here, we describe an approach for determining optimal buoy configurations for recording the simulated maximum storm surge data that define the QoI map, in order to reduce uncertainties in the stochastic inverse problem. As discussed in section 3 and appendix C, an advantage of the measure-theoretic approach is that a relatively small number of carefully chosen buoy stations defining the QoI can yield precise solutions to the stochastic inverse problem. We therefore limit our focus to the practical problem of deploying a small number of buoy stations (in this case, at most four) to record maximum storm surge data in Bay St. Louis. The motivation is that, since the cone of uncertainty of a hurricane’s trajectory narrows significantly 24–36 h prior to landfall, there is a limited window of time to either deploy new, or reconfigure existing, buoy stations. The numerical results demonstrate the significant impact a carefully chosen configuration of buoys can have on the quantification and reduction of uncertainties.

The configuration of buoys is based upon a preliminary study using a small set of ADCIRC model solves that can be analyzed quickly once the cone of uncertainty narrows. Specifically, in the subdomain of interest, we computed approximate gradient data of proposed QoI using finite forward differences in order to determine the skewness of possible QoI maps. Sixteen clusters comprising a total of 80 samples of Manning’s *n* values, shown in Fig. 5, were chosen at random in the parameter space to approximate the gradients of the various proposed QoI.

We note that the numerical experiments herein use the full high-resolution ADCIRC simulation, which implies every node in the FE mesh is a potential observation station. Examining every possible combination of maximum water elevation at every point in the FE mesh results in an infeasibly large set of possible observation stations. We therefore interpolate the FE solution onto a set of hypothetical observation stations on a regular grid and then further reduce the problem size by limiting the set of possible observation stations to those where the maximum storm surge data exhibit sensitivity to the parameters. Specifically, we consider only the stations where the maximum change in maximum water elevation over the set of forward model solves associated with evaluating the samples in the clusters shown in Fig. 5 is greater than 0.10 m. This results in a set of 194 possible observation stations (see Table 3).
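The sensitivity screening described above amounts to thresholding the spread of the simulated maximum elevation across the preliminary runs. A minimal sketch (hypothetical array names; `0.10` m matches the threshold stated in the text):

```python
import numpy as np

# Sketch of the station screening: keep only candidate observation stations
# whose maximum water elevation varies appreciably over the preliminary
# forward solves. eta_max has shape (n_runs, n_stations), one row per run.

def sensitive_stations(eta_max, threshold=0.10):
    """Indices of stations whose max-elevation spread exceeds `threshold` (m)."""
    spread = eta_max.max(axis=0) - eta_max.min(axis=0)
    return np.flatnonzero(spread > threshold)

# toy example: 4 runs, 3 stations; only station 1 varies by more than 0.10 m
eta_max = np.array([[1.00, 2.00, 0.50],
                    [1.02, 2.20, 0.51],
                    [1.01, 2.10, 0.52],
                    [1.03, 1.95, 0.50]])
keep = sensitive_stations(eta_max)
```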

We use the Butler–Estep–Tavener (BET)^{2} Python package (Graham et al. 2015) to choose QoI that result in a well-conditioned stochastic inverse problem with optimal (approximate) skewness (Walsh 2015). Specifically, we use the average condition numbers of the Jacobians of the possible QoI maps over the set of 16 clusters of parameter samples mentioned above as an approximation to the skewness of a QoI map. For each possible QoI map *Q*, we compute this average condition number, and Table 4 shows the top 10 QoI maps based on its ranking. Figure 6 shows the locations of the observation stations chosen by this method. Since the stations in the first (blue) and last (white) columns of Table 4 are nearly collocated, it makes intuitive sense that the skewness for these sets of stations is nearly identical.
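The selection idea, stripped of the BET implementation details, can be sketched as follows: build finite-difference Jacobians of the candidate QoI at each cluster, then rank station subsets by the average condition number of the corresponding Jacobian rows (all names and toy values below are illustrative assumptions, not the BET API):

```python
import numpy as np
from itertools import combinations

# Sketch of skewness-based QoI selection: for each candidate set of d stations,
# extract the corresponding rows of the (approximate) Jacobian at each cluster
# and rank candidate sets by the average condition number.

def average_condition_number(jacobians, stations):
    """jacobians: (n_clusters, n_stations, n_params) gradient estimates."""
    conds = [np.linalg.cond(J[list(stations), :]) for J in jacobians]
    return float(np.mean(conds))

def rank_station_sets(jacobians, d):
    """All d-station subsets, sorted best (smallest avg condition) first."""
    n_stations = jacobians.shape[1]
    scored = [(average_condition_number(jacobians, s), s)
              for s in combinations(range(n_stations), d)]
    return sorted(scored)

# toy example: one cluster, 3 stations, 2 parameters
J = np.array([[[1.0, 0.0],     # station 0 senses parameter 0
               [0.0, 1.0],     # station 1 senses parameter 1
               [1.0, 0.01]]])  # station 2 nearly parallel to station 0
best_score, best_set = rank_station_sets(J, d=2)[0]
```

Stations 0 and 1 have orthogonal gradient rows and are ranked first; pairing the nearly parallel stations 0 and 2 would give a badly conditioned (highly skewed) map.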

## 5. Numerical results for measure-theoretic inversion

We define and solve five stochastic inverse problems to quantify uncertainty in Manning’s *n* values using five different QoI maps. In the numerical results below, we refer to a particular QoI map by the station numbers that define it. We define the output probability measure as a uniform distribution on a hyper-rectangle^{3} centered at the output of the reference parameter *λ* = (0.1523, 0.1751, 0.1561, 0.0269). Since this distribution is uniform on a hyper-rectangle, we can represent it exactly with a simple function by choosing the partition carefully. We then follow the steps of Algorithm 1 from appendix B using 480 uniform independent and identically distributed (i.i.d.) random samples (see Fig. 7) to estimate the inverse probability measure. Note that each of these random samples represents a full high-resolution subdomain ADCIRC simulation with the accompanying high-resolution Manning’s *n* field defined by a particular *λ*.
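Drawing the uniform i.i.d. parameter samples is straightforward; a minimal sketch (the nominal values and 33%–175% bounds below are illustrative, not the actual Table 2 entries):

```python
import numpy as np

# Sketch: draw uniform i.i.d. samples from the 4D hyper-rectangle parameter
# domain. Each sample would then define one full subdomain ADCIRC run.

rng = np.random.default_rng(0)
nominal = np.array([0.15, 0.17, 0.15, 0.03])       # hypothetical nominal n values
lam_min, lam_max = 0.33 * nominal, 1.75 * nominal  # per-parameter bounds

def sample_parameter_domain(n_samples):
    """Uniform i.i.d. samples on the hyper-rectangle [lam_min, lam_max]."""
    return rng.uniform(lam_min, lam_max, size=(n_samples, nominal.size))

samples = sample_parameter_domain(480)
```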

### a. Probability densities

We compare the results of the stochastic inverse problem for three QoI maps using the optimal, near-optimal, and nonoptimal sets of stations. The estimated skewness of the QoI maps for the optimal and near-optimal sets of stations is very similar (see Table 4). Figure 7 shows the 480 uniform i.i.d. random samples in the parameter space. Figures 8 and 9 show the propagation of these samples into the data spaces for the optimal and nonoptimal QoI maps, respectively, colored by probability. Note that the reference parameter and reference solution are denoted by a highlighted circle in Figs. 7–9. We omit the data samples of the near-optimal QoI map as they are visually indistinguishable from those of the optimal QoI map. Observe how maps with significant skewness result in sets of output samples that appear more correlated. This effect is explored further when we consider generating ensembles below.

Figures 10 and 11 (and 13) show the resulting approximate marginals for the stochastic inverse problems using the optimal, near-optimal, and nonoptimal stations, respectively. The reference solution is denoted by a white circle in Figs. 10 and 11 (and 13). Consider the contour event of highest probability, which represents the set of all probable Manning’s *n* values based on using a particular QoI map. For the optimal and near-optimal sets of stations defining QoI maps, this set is almost identical in both shape and size. For example, the size of this set as measured by its volume is approximately 7.904 × 10^{−3} for the optimal QoI choice and 8.390 × 10^{−3} for the near-optimal QoI choice. Subsequently, the solutions to the stochastic inverse problems associated with these two QoI maps are nearly identical.

Notice that even though we have the most difficulty identifying this parameter, the region of nonzero probability still contains the reference parameter. Part of this difficulty may be attributed to the lower density of contributing nodes in inland regions (see Fig. 3). Table 5 shows the ratios of wetted nodes weighted by the nodal contributions of each parameter. We have only included nodes defined by the C-CAP data (NOAA Coastal Services Center 2013) in these ratios, which may be why some ratios are so low, as there are open water nodes defined by the default value that are not included in this ratio.

The nonoptimal QoI map, shown in Fig. 12, identifies a larger event in the parameter space, as shown in Fig. 13. Figure 9 shows that the QoI map formed from the nonoptimal set of stations is highly skewed in comparison to the QoI maps formed from the optimal and near-optimal sets of stations. The estimated volume of the region of interest for the nonoptimal QoI map, 1.917 × 10^{−2}, is roughly 2.3–2.4 times that obtained when *Q* is formed from an optimal or near-optimal set of stations. We can examine the marginals that result if we only use subsets of the nonoptimal stations (Figs. 14 and 15). The marginals in Figs. 14 and 15 do not appear substantially different from the marginals shown in Fig. 13. Including additional QoI that result in a QoI map with large skewness may not improve the solution to the stochastic inverse problem computed from fewer QoI with smaller skewness. In other words, QoI maps with high skewness in their components have solutions to the stochastic inverse problem that are similar to solutions from a QoI map with fewer component maps. In a sense, the solution of the stochastic inverse problem also provides insight into the quantitative improvement of hindcasting simulations based on these results.

For any set of stations used in inversion, using the inverse probability measure to produce hindcast simulations of maximum water elevation data at those same stations will always result in reductions in ranges of at least 90% compared to a noninformed analysis. This is because we imposed a uniform probability density on the data space that was 10% of the range of the preliminary study. Therefore, to truly judge the impact of a choice of stations used in inversion on hindcast simulations, we must analyze the reduction in ranges at all of the locations not used for inversion. Given any station, we calculate the reduction in the range of hindcast simulated maximum water elevation at this station as

which compares the range of probable QoI values to the range of possible QoI values for the *i*th station. For the stations not used in the inversion based on the optimal and near-optimal choices of buoys, the reduction in ranges was on average 89% compared to an uninformed analysis. We tabulate the reduction in ranges for the optimal, near-optimal, and nonoptimal sets of stations, as well as for subsets of three and two stations, in Table 6. Notice that in the case where we use nonoptimal stations there is still a significant average reduction (approximately 37%) in the ranges of associated hindcast simulations of maximum water elevation at the unused locations compared to the ranges obtained by an uninformed analysis. However, the average reduction in that case is significantly less than when the optimal or near-optimal stations were used to solve the stochastic inverse problem. In conclusion, by determining the optimality of stations based on skewness, we have also chosen the stations that result in the largest reduction of range over all station locations.
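The range-reduction diagnostic can be sketched in a few lines (hypothetical array names; since the display equation is omitted above, the reduction is taken here as one minus the ratio of the probable range to the possible range, consistent with the verbal description):

```python
import numpy as np

# Sketch: compare the range of *probable* QoI values (runs retaining nonzero
# inverse probability) to the range of *possible* QoI values (all runs) at
# each station not used for the inversion.

def range_reduction(eta_max, probable):
    """eta_max: (n_runs, n_stations); probable: boolean mask of probable runs.
    Returns the per-station fractional reduction in range."""
    full = eta_max.max(axis=0) - eta_max.min(axis=0)
    prob = eta_max[probable].max(axis=0) - eta_max[probable].min(axis=0)
    return 1.0 - prob / full

# toy example: 3 runs, 2 stations; the last run is deemed improbable
eta_max = np.array([[1.0, 2.0],
                    [2.0, 2.5],
                    [3.0, 3.0]])
probable = np.array([True, True, False])
reduction = range_reduction(eta_max, probable)
```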

### b. Generating ensembles of Manning’s *n*

We now describe the process of generating an ensemble of samples from the nonparametric probability measures that solve the measure-theoretic inverse problem. Such ensembles can be used as part of an initialization step in data assimilation or other predictor/corrector forecasting schemes. Here, we focus on the description of the process using the available code and the analysis of such ensemble results in terms of the effect of the QoI map on the structure of the resulting ensemble. We refer the interested reader to appendix E for more information on how to obtain the data files and code to generate initial ensemble estimates of Manning’s *n* values for any application in this subdomain of Bay St. Louis.

There are multiple ways of generating an ensemble of samples according to the inverse probability measure; here we do so via bootstrap sampling, i.e., resampling with replacement. We show the effect of choosing either the near-optimal QoI map or the nonoptimal QoI map on the ensemble generation. This reflects two design-of-experiment choices, which we refer to as choice “*o*” for the near-optimal stations and choice “*n*” for the nonoptimal stations. The resulting probabilities from using these two QoI maps to solve the stochastic inverse problem described above are denoted accordingly and are defined the same way as in section 5.

In Table 7 we report the sample statistics for a single ensemble of 50 and 100 bootstrap samples, and in Table 8 we report sample statistics for 100 such ensembles of bootstrap samples. We denote the estimates of the mean and variance for the *i*th ensemble of bootstrap samples accordingly. Since the range of each parameter varies, for ease of comparison we also report the 2-norm of these vector values over a normalized parameter domain. Notice in Table 8 that the variance of the mean, the mean of the variance, and the variance of the variance for the 100 ensembles of bootstrapped samples are all smaller when sampling from the measure associated with the near-optimal map than from that associated with the nonoptimal map. This is to be expected.
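The bootstrap ensemble generation can be sketched as follows (toy samples and probabilities are illustrative; the key point is resampling the original parameter samples with replacement, weighted by their inverse probabilities):

```python
import numpy as np

# Sketch: generate a bootstrap ensemble by resampling the original parameter
# samples with replacement, with weights given by the (approximate) inverse
# probabilities of their Voronoi cells, then compute ensemble statistics.

rng = np.random.default_rng(1)

def bootstrap_ensemble(samples, probs, size):
    """Resample rows of `samples` with replacement according to `probs`."""
    idx = rng.choice(len(samples), size=size, replace=True, p=probs)
    return samples[idx]

# toy example: 4 samples in a 2D parameter space with concentrated probability
samples = np.array([[0.1, 0.1], [0.2, 0.1], [0.1, 0.2], [0.9, 0.9]])
probs = np.array([0.5, 0.3, 0.2, 0.0])  # last sample has zero probability
ens = bootstrap_ensemble(samples, probs, size=50)
mean, var = ens.mean(axis=0), ens.var(axis=0)
```

A sample with zero inverse probability can never appear in an ensemble, which is exactly how a concentrated probability measure shrinks the effective ensemble.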

Another point of emphasis is the accuracy of certain point estimates from an ensemble of samples generated from either probability measure. We consider the difference between the reference value and the *k*th sample of the *i*th ensemble of bootstrapped samples, along with the corresponding mean and variance of these differences for the *i*th ensemble. These values are reported in Table 9 for a single set of bootstrap samples and in Table 10 for 100 sets of bootstrap samples. As expected, the means of ensembles associated with choosing the near-optimal map to solve the stochastic inverse problem are more accurate with respect to the reference truth than ensemble averages associated with choosing the nonoptimal map.

We also examine the supports of the two probability measures by considering the number of unique samples associated with a given ensemble of bootstrap samples. In terms of the original set of samples used to solve the inverse problem, we can express the estimate of the mean for the *i*th ensemble of bootstrap samples as

where the weights for each of the 480 samples reflect the number of times they appear in the *i*th ensemble of bootstrap samples. The number of unique samples associated with the *i*th ensemble is then the number of nonzero weights; a weight of zero indicates that the corresponding sample is not in the *i*th ensemble. Since the near-optimal map has less skewness than the nonoptimal map, we expect the number of unique samples to be smaller for the near-optimal map. In Table 11 we report the number of these unique samples, and in Table 12 we report the corresponding sample statistics for 100 ensembles of bootstrap samples. We indeed see that the number of unique samples per ensemble for the near-optimal map (i.e., 4) is less than for the nonoptimal map (i.e., 9). This indicates a tighter concentration of probability for the less skewed map than for the more skewed map. In other words, the more optimal map does a better job of concentrating probability around the reference truth, resulting in less expected variance in each ensemble, and we expect such results to improve the accuracy of any subsequent predictions in general.
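The weight bookkeeping described above can be sketched as follows (toy data; a 1D parameter is used for brevity, and all values are hypothetical):

```python
import numpy as np

# Sketch: express a bootstrap ensemble through weights w_j counting how often
# each original sample appears, then count the samples with nonzero weight
# (the "unique" samples in the ensemble). The weighted mean reproduces the
# ensemble mean.

def ensemble_weights(indices, n_samples):
    """indices: bootstrap draws into the original sample set of size n_samples."""
    return np.bincount(indices, minlength=n_samples)

# toy ensemble of 10 draws from 6 original samples
indices = np.array([0, 0, 2, 2, 2, 2, 5, 5, 5, 0])
w = ensemble_weights(indices, n_samples=6)
n_unique = int(np.count_nonzero(w))

# weighted ensemble mean for 1D toy parameter values 0..5
values = np.arange(6, dtype=float)
mean = (w @ values) / indices.size
```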

## 6. Conclusions

We formulated and numerically solved a stochastic inverse problem involving spatially heterogeneous Manning’s *n* fields parameterized by land cover classification data using maximum water elevation data obtained from a hindcast of Hurricane Gustav using the ADCIRC model. The novel measure-theoretic framework and computational algorithm used were based on the authors’ previous work (Butler et al. 2015; Breidt et al. 2011; Butler and Estep 2013; Butler et al. 2012, 2014; Graham 2015). This previous work explored the conditioning of the inverse problem defined in terms of the skewness of contour events. However, it did not address the practical computation of skewness numbers, nor did it use those numbers to inform experimental design through the choice of QoI maps based on the underlying geometric structure of the generalized contours to obtain probability measures with small support (relative to other QoI maps). We have done both in this work and provided a tool that automates this process. We have extended the definition of the inverse problem for the Manning’s *n* field from the relatively simple idealized inlet example to a complex hurricane hindcast example with a higher-dimensional data space. We employed a recently available subdomain implementation of ADCIRC (Baugh et al. 2013; Simon 2011; Altuntas 2012; Baugh et al. 2015) to reduce simulation time and to focus on specific areas of interest rather than the much larger domain required for hurricane simulations.

We wish to emphasize that the methodology for parameter estimation presented herein is model agnostic, meaning that it is widely applicable to many physics-based models. Generally, the steps required to apply this methodology are as follows:

1. Define the parameter space; this often involves domain or expert knowledge and a discretization of the parameters.
2. Determine the optimal set of QoI for estimating this particular set of parameters; this often involves estimating gradient or Jacobian information through a preliminary or sensitivity study.
3. Sample the parameter space and run the computational model for these parameter samples.
4. Use the resulting input–output data to solve the stochastic inverse problem using BET (Graham et al. 2015).
5. Use these results to generate ensembles from the resulting probability measures to enrich future predictions or calculate probabilities of events.

If the parameter space is high dimensional, either spatially or in terms of types of parameters, it may be useful to develop a surrogate or reduced-order model for the system or to reduce the dimensionality of the parameter space.

We have also demonstrated that the solution of the stochastic inverse problem can be used to generate an ensemble of samples in the parameter space whose accuracy and precision depend on the skewness of the QoI map. A map with less skewness generally results in more accuracy and precision. Moreover, when using bootstrap sampling, we showed that a map with less skewness may result in smaller effective ensemble sizes defined by the unique subset of samples. This has the potential to lower the computational complexity of a data assimilation scheme when parameters are initialized by such ensemble generation, and is a topic of future work.

## Acknowledgments

L. Graham’s work is supported by the Department of Energy Office of Science (DE-SC0009324 and DE-SC0010678) and the National Science Foundation Graduate Research Fellowship (DGE-1110007). T. Butler’s work is supported in part by the National Science Foundation (DMS-1228206). C. Dawson’s work is supported by the National Science Foundation (DMS-1228243). C. Dawson would also like to acknowledge the support of the Gulf of Mexico Research Initiative Consortium for Advanced Research on Transport of Hydrocarbons in the Environment (CARTHE). T. Butler and C. Dawson recognize the support of the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Awards DE-SC0009286 and DE-SC0009279 as part of the DiaMonD Multifaceted Mathematics Integrated Capability Center. S. Walsh’s work is supported by the National Science Foundation (DMS-1228206). J. J. Westerink’s work is supported by the National Science Foundation (DMS-1228212) and by the Henry J. Massman and the Joseph and Nona Ahearn endowments at the University of Notre Dame. We thank the developers of Subdomain ADCIRC, J. Baugh, A. Altuntas, T. Dyer, and J. Simon at North Carolina State University, for access to their code, without which this work would not have been possible. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. The author(s) also acknowledge the Texas Advanced Computing Center (TACC 2015) at The University of Texas at Austin for providing HPC resources that have contributed to the research results reported within this work.

### APPENDIX A

#### Measure-Theoretic Inversion Outline

The following is a brief summary of the abstract framework for the formulation and solution of stochastic inverse problems using measure theory. For more details, especially as they pertain to the existence and uniqueness of solutions, and error analysis of approximate solutions, we refer the interested reader to Breidt et al. (2011), Butler and Estep (2013), and Butler et al. (2012, 2014).

Consider a physics-based model that relates a physical state *Y* (often called the solution to the model) to model input parameters *λ*. Solutions to the model, *Y*, are implicitly functions of the parameters *λ*. However, no system is completely observable, and we are often interested in a small number of directly observable QoI defined as functionals of the state. Since the state variables are implicitly functions of the parameters, so are the QoI, and we make this dependence explicit in the notation. We denote the vector-valued QoI map by *Q*, whose range is the set of possible model QoI values (i.e., the set of maximum storm surge values at a finite set of observation stations) and whose components are the *d* individual scalar QoI. In general, *d* can be any positive integer, but we are particularly interested in the mathematically challenging (and common) case where *d* is smaller than the number of parameters. It is technically easier to describe the solution to the inverse problem in terms of certain manifolds under this assumption. While the theory and algorithms do not require it, we demonstrate in the numerical results that using more QoI does not always lead to a significantly improved solution to the inverse problem. A point of emphasis of applying this method is that it is possible to obtain reasonable results by choosing a specific small number of QoI for which we solve the inverse problem.

We now describe the minimal assumption required for formulating and applying the measure-theoretic approach in practice. We assume that the QoI map *Q* is piecewise smooth (i.e., differentiable almost everywhere). The notion of differentiability implies some assumption of metrics or norms on the input and output spaces so that distances between points can be computed. With this minimal assumption, *σ*-algebras on the input and output sets are easily constructed. These simply represent the families of sets for which it makes sense to ask questions about their measure (i.e., volume). Moreover, we can use the metrics or norms to induce measures on these measurable spaces so that it is possible to describe a natural notion of volume for sets in the input and output spaces. In other words, we represent the input parameter and output data spaces as measure spaces.

Defining probability measures on these spaces allows us to define and study stochastic forward and inverse sensitivity analysis problems. We are particularly interested in the inverse problem where a probability measure is prescribed on the output data space and the goal is to determine a probability measure on the parameter space that is *consistent* with the data (i.e., that propagates forward to the prescribed output measure). Understanding what we mean by consistency of the solution to the inverse problem first requires careful consideration of the structure of the QoI map and the possible inclusion of an additional assumption formulated as an ansatz.

Here, we give some basic terminology useful in describing the structure of solutions to the inverse problem. Assuming, for simplicity, that the Jacobian of *Q* has full rank, the implicit function theorem guarantees the existence of locally smooth lower-dimensional manifolds defining the set-valued inverses of *Q*. In two dimensions with a scalar QoI, these are simply contour curves (see Fig. A1). Thus, we call any particular set-valued inverse of *Q* a *generalized contour*, which defines an equivalence relation on the parameter space.

We denote the measure space of equivalence classes accordingly. The *σ*-algebra and measure on this space are induced by the model structure and do *not* require additional assumptions. It is possible to explicitly represent the space of generalized contours as a *d*-dimensional manifold in the parameter space, which is called a *transverse parameterization* (see Fig. A1).

We emphasize that it is not necessary to explicitly construct the transverse parameterization or generalized contours in order to solve the inverse problem. However, we find it useful to describe the solution of the inverse problem in terms of these manifolds.

Note that a single point of the transverse parameterization is identified with a single generalized contour, so the map *Q* defines a bijection between the transverse parameterization and the data space, and also between the data space and the partitioning of the parameter space in terms of the (uncountable family of) generalized contours. In other words, an event in the data space maps uniquely to an event in the transverse parameterization, which maps uniquely to a *contour event* in the parameter space. The implication is that the QoI map induces a separate *σ*-algebra of contour events on the parameter space. At this point we have exploited all of the geometric information in the map. We now describe how this geometric information manifests itself in the structure of solutions to the inverse problem. Given a probability measure on the data space, the marginal probability on the contour events is completely and uniquely determined by the bijection between the spaces, as depicted in Fig. A1. Thus, the solution to the inverse problem is uniquely determined if we stop at measuring probabilities of *contour events* (Butler et al. 2014).

However, the goal is to determine a solution on the parameter space . This requires an application of the disintegration theorem, which is a powerful result in probability theory that allows for the rigorous construction of conditional probability measures on lower-dimensional manifolds (Butler and Estep 2013; Dellacherie and Meyer 1978). The result is like a version of Fubini’s theorem for nonlinear manifolds that allows us to write joint probabilities in terms of iterated integrals of marginal and conditional probabilities.

**Theorem A0.1. ***(Disintegration of probability theorem).* Assume that is a probability measure on . There exists a family of conditional probability measures on giving the disintegration:

where denotes a projection map.
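In the notation of the disintegration results in Butler and Estep (2013), the statement can be written in the following standard form (a reconstruction, with the space of generalized contours denoted $\mathcal{L}$, the projection onto it denoted $\pi_{\mathcal{L}}$, and $\mathcal{B}_\Lambda$ the $\sigma$-algebra on the parameter space $\Lambda$):

```latex
P_\Lambda(A) \;=\; \int_{\mathcal{L}}
\left( \int_{\pi_{\mathcal{L}}^{-1}(\ell)\,\cap\, A} dP_\ell(\lambda) \right)
dP_{\mathcal{L}}(\ell),
\qquad \forall\, A \in \mathcal{B}_\Lambda ,
```

that is, the probability of any parameter event is obtained by integrating the conditional probabilities along each generalized contour against the marginal probability on the space of contours.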

To apply this result, we must obtain a family of probability measures on each measurable generalized contour. We adopt the “nonpreferential” standard choice of ansatz illustrated in Fig. A2, where we set the family of probability measures along the generalized contours to be uniform with respect to the volume (length) of the generalized contours, which is based on the disintegration of the volume measure (B14). Note that the standard choice of ansatz is valid when the parameter space is bounded and finite dimensional (and therefore precompact), which is easily satisfied in many engineering and scientific applications. We also note that this ansatz exploits the same geometric information contained in the map *Q* in the determination of the volumes, resulting in geometrically complex nonparametric probability measures solving the inverse problem, as seen in the examples and numerical results of Butler and Estep (2013), Butler et al. (2014, 2015), as well as in the numerical results in section 5.

### APPENDIX B

#### Algorithm for Approximating Inverse Measure

The computational algorithms in this section are available as a fully documented Python package, BET (Graham et al. 2015). We describe the basic algorithm for approximating the inverse probability measure and briefly outline issues in its practical computation. We use Algorithm 1 to form the simple function approximation (B14; Butler et al. 2016, manuscript submitted to *SIAM J. Sci. Comput.*, hereafter B16)

where each term involves the indicator function on a Voronoi cell. The samples implicitly define a Voronoi tessellation with respect to a specified metric on the parameter space (B14; B16). We can then approximate the probability of any arbitrary event in the usual way with sums over Voronoi cells or by direct integration (Butler et al. 2014; B14; B16; Breidt et al. 2011; Butler et al. 2012). If we are interested in events that are larger than the Voronoi cells, we can estimate their probability using the counting measure defined by

Note that can be calculated as the summation of the probabilities of the Voronoi cells associated with samples binned to a particular event (Butler et al. 2015).

**Algorithm 1:** A Sample-Based Approximation of the Inverse Density (B14; B16)

1. Choose samples defining the partition (Voronoi tessellation) of the parameter space.
2. Assign values (QoI map evaluations) for the sample points.
3. Choose a discretization partition of the data space.
4. Form the simple function approximation of the prescribed output probability density on this partition.
5. Bin each parameter sample to the set in the data-space partition containing its QoI value, and approximate the volume of each Voronoi cell.
6. For each Voronoi cell, set its probability to the probability of the data-space set it maps into, weighted by its relative volume among all cells mapping to the same set.

Within the for-loop of Algorithm 1, we use a discrete version of the ansatz to set the probability of a particular Voronoi cell to be the probability of the data-space set it maps into, normalized by the ratio of the (approximate) volume of the particular Voronoi cell to the volume of all the Voronoi cells that map to the same set in the data space. These volumes are approximated using volume emulation, which does *not* require explicit construction of the Voronoi tessellation (B16).
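The counting-measure version of this computation can be sketched as follows (hypothetical array names; nearest-neighbor binning stands in for the data-space partition, and cell volumes are supplied directly rather than emulated):

```python
import numpy as np

# Sketch of the discrete ansatz behind Algorithm 1: each parameter sample
# inherits the probability of the data-space cell its QoI value lands in,
# split among samples in that cell by (approximate) Voronoi-cell volume.

def approximate_inverse_density(q_samples, data_centers, data_probs, volumes):
    """q_samples:    (N, d) QoI values of the N parameter samples.
    data_centers: (B, d) centers of the data-space partition cells.
    data_probs:   (B,) prescribed output probability of each cell.
    volumes:      (N,) approximate volumes of the parameter Voronoi cells."""
    # bin each sample to its nearest data-space partition cell
    dists = np.linalg.norm(q_samples[:, None, :] - data_centers[None, :, :],
                           axis=2)
    bins = dists.argmin(axis=1)
    probs = np.zeros(len(q_samples))
    for b in range(len(data_centers)):
        members = bins == b
        total_vol = volumes[members].sum()
        if total_vol > 0:
            # discrete ansatz: split the cell's probability by relative volume
            probs[members] = data_probs[b] * volumes[members] / total_vol
    return probs

# toy example: 4 samples, 1D QoI, two data cells with probability 0.7 / 0.3
q = np.array([[0.1], [0.2], [0.8], [0.9]])
centers = np.array([[0.0], [1.0]])
p = approximate_inverse_density(q, centers, np.array([0.7, 0.3]),
                                np.ones(4) / 4)
```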

### APPENDIX C

#### Condition of the Measure-Theoretic Inverse Problem

We refer the interested reader to Butler et al. (2015) for an in-depth description of the condition of the inverse problem. It is somewhat similar to the condition number for a matrix, which we use to provide a more familiar mathematical context. Suppose a linear system is defined by a matrix; then a solution exists for every right-hand side if and only if the matrix has full rank. The condition number can be viewed as a measure of accuracy in the numerical inversion of the operator to obtain a numerical solution. Specifically, the condition number can be used to bound the numerical errors in computing the solution for a particular approximation method. Similarly, the accuracy of the numerical solution of the stochastic inverse problem depends on a “skewness” property of the Jacobian of *Q*, which can be used to inform the number of samples required in Algorithm 1 to compute accurate approximations of inverse sets.

When *Q* is a linear mapping, we can represent it as a matrix, which is also its Jacobian. We can restrict the map to a transverse parameterization, which is analogous to restricting to an invertible submatrix. The condition of the inverse problem depends on the dependencies between the rows of this matrix. If each row represents a vector, then the dependencies between the rows are reflected by the angles the row vectors make with respect to each other; these correspond to the angles of intersection of the generalized contours of the component maps. If the generalized contours of component maps are parallel, then the corresponding rows are linearly dependent. This implies that the space of output data actually exists as a lower-dimensional manifold (i.e., there is redundancy in the output data). Therefore, assuming all redundant data have been removed from the map, the accuracy of pointwise solutions to a deterministic inverse problem is determined by the condition of the matrix. When *Q* is nonlinear, we obtain similar interpretations locally by analysis of the Jacobian of *Q*.
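A small numerical illustration of this analogy (toy matrices only): nearly parallel rows, i.e., nearly parallel generalized contours of the component maps, mean a badly conditioned inversion.

```python
import numpy as np

# Toy illustration: the condition number of the (Jacobian) matrix degrades as
# its rows, i.e., the gradients of the component QoI maps, become parallel.

well = np.array([[1.0, 0.0],
                 [0.0, 1.0]])          # orthogonal rows: ideal conditioning
skewed = np.array([[1.0, 0.0],
                   [1.0, 1e-3]])       # nearly parallel rows

cond_well = np.linalg.cond(well)       # = 1
cond_skewed = np.linalg.cond(skewed)   # large
```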

We are interested in the estimation of *inverse events* in parameter space to compute probabilities, not in the pointwise approximation of individual generalized contours. The skewness of the Jacobian of *Q* is a quantifiable measure of the deformation of inverse events by the map *Q*, which serves as a type of condition number for the stochastic inverse problem (see Butler et al. 2015). As shown in Butler et al. (2015), the number of samples required to solve the stochastic inverse problem accurately using Algorithm 1 is proportional to the local skewness of the Jacobian, where the Jacobian is taken to be locally linear in the Voronoi cell associated with each sample.
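A sketch of a local skewness measure, following the idea in Butler et al. (2015), compares each row of the Jacobian to its component orthogonal to the span of the remaining rows; the function name and exact normalization here are illustrative assumptions, not the paper's definition verbatim:

```python
import numpy as np

# Skewness proxy for a Jacobian J whose rows are the QoI gradients: for each
# row j_k, compare its length to the length of its component orthogonal to
# the span of the other rows. The measure is 1 for mutually orthogonal rows
# and grows without bound as the rows approach linear dependence.
def skewness(J):
    J = np.asarray(J, dtype=float)
    s = 1.0
    for k in range(J.shape[0]):
        others = np.delete(J, k, axis=0)
        Qb, _ = np.linalg.qr(others.T)      # orthonormal basis of the span
        perp = J[k] - Qb @ (Qb.T @ J[k])    # component of j_k off that span
        s = max(s, np.linalg.norm(J[k]) / np.linalg.norm(perp))
    return s

s_orth = skewness([[1.0, 0.0], [0.0, 1.0]])  # 1.0: orthogonal rows
s_skew = skewness([[1.0, 0.0], [1.0, 0.1]])  # large: rows nearly parallel
```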

If the QoI map used for the stochastic inverse problem has large skewness, the solution of the stochastic inverse problem may have large support even if the corresponding output region has small support. Thus, the volume of possible outcomes from the prediction problem may also be large. In other words, the forward sensitivity analysis problem of propagating events of high probability will likely result in a loss of precision when the QoI map used for inversion suffers from large skewness. Careful selection of quantities of interest to form the QoI map therefore plays a significant role in solutions of *both* the stochastic inverse problem and related prediction problems. Algorithms for estimating the skewness and selecting optimal sets of quantities of interest are included in Graham et al. (2015).
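The support-inflation effect is easy to see for a linear map (an illustrative example, not from the paper): the inverse image of an output set of volume *V* under *Q*(*x*) = *Jx* has volume *V*/|det *J*|, so a skewed Jacobian with the same row lengths produces a much larger inverse event.

```python
import numpy as np

# A small output event of fixed volume, inverted through an orthogonal map
# versus a skewed map whose rows are nearly parallel.
box_vol = 0.01  # volume of a small output event

J_orth = np.array([[1.0, 0.0], [0.0, 1.0]])
J_skew = np.array([[1.0, 0.0], [1.0, 0.1]])  # nearly parallel rows

vol_orth = box_vol / abs(np.linalg.det(J_orth))  # 0.01
vol_skew = box_vol / abs(np.linalg.det(J_skew))  # 0.1: 10x larger support
```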

### APPENDIX D

#### Overview of Subdomain ADCIRC

The steps involved in subdomain modeling in ADCIRC are as follows:

1. Create the full-domain ADCIRC model.

2. Create the subdomain ADCIRC model:

    1. Extract input files from the full-domain ADCIRC model.

    2. Create subdomain-specific files.

    3. Link meteorological forcing files in the subdomain directory.

3. Generate the full-domain control file to control output of subdomain boundary forcing data.

4. Run ADCIRC on the full domain.

5. Extract subdomain boundary forcing data from the full domain.

6. Run ADCIRC on the subdomain.
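The steps above can be sketched as a simple orchestration script. The command and helper-script names (`gensub.py`, `genfull.py`, `genbcs.py`, `padcirc`) are assumptions patterned on the Subdomain ADCIRC tooling, not verbatim paths; the `runner` hook allows a dry run that records commands instead of executing them.

```python
import subprocess

def run_subdomain_workflow(runner=subprocess.run):
    """Run the subdomain workflow; 'runner' executes each command."""
    steps = [
        ["python", "gensub.py", "fulldomain", "subdomain"],  # extract subdomain inputs (assumed name)
        ["python", "genfull.py", "fulldomain"],              # full-domain control file (assumed name)
        ["padcirc", "-I", "fulldomain"],                     # run ADCIRC on the full domain
        ["python", "genbcs.py", "fulldomain", "subdomain"],  # extract boundary forcing (assumed name)
        ["padcirc", "-I", "subdomain"],                      # run ADCIRC on the subdomain
    ]
    for cmd in steps:
        runner(cmd, check=True)
    return steps

# Dry run: record the commands rather than invoking ADCIRC.
log = []
run_subdomain_workflow(runner=lambda cmd, check: log.append(cmd))
```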

There are two requirements for effectiveness of subdomain ADCIRC with respect to the imposed boundary conditions: 1) “locality of topographic and other changes” and 2) “sufficient forcing frequency of boundary conditions” (Baugh et al. 2015). We ran a series of initial tests to ensure that the subdomain was large enough and to determine a sufficient forcing frequency. Based on these initial tests we chose a boundary condition forcing interval of 10 min. We also altered the Manning’s *n* values only at nonboundary nodes, ensuring a linear “buffer zone” between the full-domain and subdomain Manning’s *n* fields. We developed the Python package PolyADCIRC (Graham 2016) to efficiently conduct parameter sweeps of P(arallel)ADCIRC on HPC systems. Given the desired samples in parameter space, a set of land classification basis functions used for parameterizing the Manning’s *n* field, and other required input files for a particular ADCIRC simulation, PolyADCIRC runs batches of PADCIRC simulations and saves the output after each batch.
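The boundary-node constraint can be illustrated with a minimal sketch (function and variable names are hypothetical, not PolyADCIRC's API): a sampled Manning's *n* field is applied only at interior subdomain nodes, while boundary nodes retain their full-domain values so the imposed boundary conditions remain consistent.

```python
import numpy as np

def apply_manning_field(n_full, n_sample, boundary_mask):
    """Return the subdomain Manning's n field: sampled values at interior
    nodes, full-domain values at boundary nodes (illustrative sketch)."""
    return np.where(boundary_mask, n_full, n_sample)

n_full = np.full(6, 0.02)     # full-domain Manning's n at subdomain nodes
n_sample = np.full(6, 0.035)  # one sampled subdomain field
boundary = np.array([True, False, False, False, False, True])

n_sub = apply_manning_field(n_full, n_sample, boundary)
# boundary nodes keep 0.02; interior nodes receive the sampled 0.035
```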

### APPENDIX E

#### Scripts and Data

Data files and scripts used to generate initial ensemble estimates of Manning’s *n* values for any application in this subdomain of Bay St. Louis are available upon request from the corresponding author through www.designsafe-ci.org. These files were produced using ADCIRC v50 (adcirc.org), Subdomain ADCIRC (www4.ncsu.edu/~jwb/subdomain), PolyADCIRC (dx.doi.org/10.5281/zenodo.45083), and BET v1.0-dev (github.com/lcgraham/BET/commit/f4ac1bc0dc95eff2cd536f6148ff896c5994c90a).


## Footnotes

© 2017 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

^{1} An initial presentation of the stochastic inverse problem was included in Graham (2015).

^{2} BET is a Python-based package for solving stochastic inverse and forward problems within a measure-theoretic framework.

^{3} We choose the lengths of the sides to be 10% the length of .