## 1. Introduction

Providing accurate and timely forecasts of storm surge is a problem of critical importance. We consider the problem of improving the relative accuracy of short-range forecasts of storm surge using sophisticated models solved on numerically coarse grids to provide timely predictions of elevated water levels. While coarse discretizations of models may be used to quickly forecast storm surge, we expect large numerical errors to arise due to the discretizations. We implement various data assimilation methodologies to compare the relative performances and capabilities of these schemes in improving the accuracy of forecasts. Below, we summarize the recent history of storm surge events that has spurred the mathematical development of state-of-the-art hydrodynamic models.

The effects of storm surge from a number of extreme weather events dating back several decades have motivated efforts to accurately forecast water elevations in order to minimize both the impact on economic activities and the loss of human life. In 1953, a catastrophic storm in the North Sea flooded approximately 600 km^{2} of land in the United Kingdom and the Netherlands and is considered responsible for the deaths of 2007 people (Wolf 2003). In 1970, the Bhola cyclone struck Bangladesh, and the resulting storm surge contributed to a death toll estimated as high as 500 000. In August 2005, Hurricane Katrina made landfall in New Orleans causing the death of approximately 1200 people (Blake et al. 2011).

The modeling and numerical simulation of storm surge have undergone several stages of evolution since the 1953 North Sea flooding. Until 1979, the primary set of equations used in storm surge modeling was empirically derived (Heaps 1983). Since 1980, two-dimensional models solving the hydrodynamic equations in vertically integrated form have been the models of choice for predicting water elevations in coastal areas. The numerical frameworks used to evaluate these models have been enhanced by employing efficient solvers and more sophisticated (unstructured and adaptive) grids. In Fleming et al. (2008), variations of the wind speed or track of the storm were used in order to obtain five different forecasts representing worst case scenarios. In Brown et al. (2007), a storm surge model was coupled with a flood model to study various sources of modeling uncertainty in an urban area including the ways that buildings may impede the water flux. In Blain et al. (1995), the influences of the model domain size and its discretization were studied, showing that finer grids provided more accurate results but at an increased computational cost.

To reduce the computational cost, a parallel architecture and advanced numerical discretization schemes were adopted in the advanced circulation (ADCIRC) storm surge model, enabling 20 min of wall clock time per day of real-time simulation on very fine grids using 16 384 cores (Tanaka 2010). The parallel performance of the ADCIRC model continues to improve as computer architectures evolve. While improving the resolution and discretization may lead to more accurate forecasts, it could ultimately make the required computing time too long in a setting where providing timely forecasts is crucial. Moreover, the accuracy of storm surge forecasts depends on the quality of the input data. In particular, the forecasts are sensitive to the specifications of many wind and model parameters (e.g., wind drag and bottom drag coefficients). Many of these parameters cannot be measured directly, which introduces uncertainty in the forecasts. The challenge is to build a storm surge prediction system capable of quickly assimilating data to provide timely and accurate forecasts for authorities in charge of evacuation and rescue plans.

Data assimilation (DA) methodologies can enrich model simulations and predictions by constraining their outputs with available observations. DA methods generally fall into one of two categories: variational methods, which are essentially least squares model-data fitting methods, and sequential methods based on the Kalman filter (Bennett 1992; Evensen 2003). Because of their ease of implementation, remarkable efficiency and robustness, and reasonable computational burden, the sequential ensemble Kalman filter (EnKF) methods have seen widespread use in many geophysical applications.

Different EnKF variants were developed in recent years. Depending on whether or not the observations are perturbed before assimilation, it is customary to classify these variants of the EnKF as belonging to one of two types (Tippett et al. 2003): stochastic EnKF (SEnKF; see, e.g., Burgers et al. 1998; Houtekamer and Mitchell 1998) or deterministic ensemble square root filters (SR-EnKF; see, e.g., Anderson 2001; Bishop et al. 2001; Whitaker and Hamill 2002; Hoteit et al. 2002). A SEnKF essentially updates each forecast ensemble member with perturbed observations using the Kalman filter correction step. A SR-EnKF updates the ensemble mean and a specific square root form of the sample error covariance matrix without perturbing the observations. Among the SR-EnKFs, several have publicly available codes and have become increasingly popular. These include the singular evolutive interpolated Kalman (SEIK) filter (Pham 2001; Hoteit et al. 2002), which bears close similarities with another SR-EnKF, the ensemble transform Kalman filter (ETKF; see Bishop et al. 2001; Wang et al. 2004), as revealed in a recent work by Nerger et al. (2012b). There is also the ensemble adjustment Kalman filter (EAKF; see, e.g., Anderson 2001), which was developed under the umbrella of the Data Assimilation Research Testbed (DART) at the National Center for Atmospheric Research (NCAR). Many other ensemble-based Kalman filters have been developed using similar strategies, for example, Cohn and Todling (1996), Verlaan and Heemink (1997), Zupanski (2005), Beezley and Mandel (2008), Luo and Moroz (2009), and Luo and Hoteit (2012, 2013) to name but a few.

Specific to storm surge modeling resulting from hurricanes, the SEIK filter was recently applied to the short-range forecasting problem using the extensively validated ADCIRC model (Butler et al. 2012; Altaf et al. 2013). This particular problem exhibits unique fast-evolving dynamics. Thus, it remains an open question in the storm surge community how the forecasts obtained using other EnKFs would compare to those obtained using the SEIK filter. In this study, we will investigate the performances of some of the most common EnKFs, namely, the SEnKF, ETKF, SEIK, and EAKF for the assimilation of storm surge data using Hurricane Ike as the test case. This is also the first study that compares these filters under exactly the same conditions.

In practice, the ensemble sizes of EnKFs are significantly smaller than the numerical dimension of the system state [e.g., ensemble sizes are often

The paper is organized as follows: Section 2 presents an overview of the various EnKFs implemented in this study. Section 3 describes the auxiliary techniques of inflation and localization. An overview of the storm surge model, ADCIRC, is presented in section 4. In section 5, the performances of the various filters for forecasting the storm surge of Hurricane Ike are analyzed. Concluding remarks follow in section 6.

## 2. Ensemble Kalman filters

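Consider the state estimation problem in the system

$$\mathbf{x}_k = \mathcal{M}_{k,k-1}(\mathbf{x}_{k-1}) + \mathbf{u}_k,$$

$$\mathbf{y}_k = \mathcal{H}_k(\mathbf{x}_k) + \mathbf{v}_k,$$

where $\mathbf{x}_k$ is the $m_x$-dimensional system state at time instant $k$, $\mathbf{y}_k$ is the corresponding observation of $\mathbf{x}_k$, $\mathcal{M}_{k,k-1}$ is the transition operator that maps $\mathbf{x}_{k-1}$ to $\mathbf{x}_k$, and the observation operator $\mathcal{H}_k$ projects $\mathbf{x}_k$ from the state space onto the observation space. When $\mathcal{M}_{k,k-1}$ and $\mathcal{H}_k$ are linear operators, for example, matrices, it is common to rewrite them in a different font style, as $\mathbf{M}_{k,k-1}$ and $\mathbf{H}_k$, respectively, to distinguish them from the operators in the nonlinear cases (see, e.g., the appendix). It is also assumed that $\mathbf{u}_k$ and $\mathbf{v}_k$ are independent white noise of mean zero and covariance matrices $\mathbf{Q}_k$ and $\mathbf{R}_k$, respectively.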

The EnKFs estimate the system state **x**_{k} at time instant *k*, given the observations {**y**_{k}, **y**_{k−1}, …} up to and including time *k* and some prior knowledge of the system state **x**_{i} at some instant *i* ≤ *k*. If both the dynamical and observation systems are linear, the minimum variance [and maximum a posteriori (MAP)] solution to the state estimation problem is determined by the Kalman filter (Kalman 1960). The conventional Kalman filter cannot be applied directly if the system or observation operator is nonlinear. The EnKF is a modification that uses a Monte Carlo approach to estimate the minimum variance solution to the state estimation problem. At the analysis step of an EnKF, an ensemble of the system state, called the analysis ensemble, is generated with sample mean and covariance as the analysis state and error covariance matrix, with the ensemble size *n* typically much smaller than the dimension *m*_{x} in large-scale applications. By propagating the analysis ensemble through the dynamical model (i.e., through the transition operator), we obtain a forecast ensemble at the next data assimilation cycle. When *m*_{x} is very large and *n* ≪ *m*_{x}, the computational savings in using the EnKF compared to other methods can be substantial (Evensen 2003). For example, in a linear system, the computational cost at the prediction step is

The literature provides many variations on the implementation of the classical EnKF. In this study, we confine ourselves to the following variants: the SEnKF and three SR-EnKFs, namely, the ETKF, EAKF, and SEIK. For conciseness, we outline the main procedures of these filters in the appendix. To avoid complicating the discussion, we have focused on introducing the “plain” forms of these ensemble filters in the appendix without covariance inflation or localization. However, these two important auxiliary techniques are adopted in all of the numerical experiments and are briefly discussed in section 3 below.

## 3. Two auxiliary techniques in the EnKF

When an EnKF is used for data assimilation in large-scale models, more often than not we can only afford to implement the filter with a relatively small ensemble size. This results in some undesirable effects such as rank deficiency, underestimation of variances of the system state, and overestimation of the corresponding cross covariances (Hamill et al. 2009; Whitaker and Hamill 2002). It is customary to introduce covariance inflation (Anderson and Anderson 1999) and localization (Hamill et al. 2001) in order to mitigate these effects.

Covariance inflation addresses the problem of variance underestimation (Anderson and Anderson 1999). The motivation for covariance inflation is based on the observation that the sample variances of the system state tend to be underestimated with a relatively small ensemble size (and often neglected model errors), so we deliberately inflate the variances by a prescribed amount.^{1} In many situations, proper covariance inflation not only improves the estimation accuracy of the filter (Anderson and Anderson 1999), but also enhances its robustness from the point of view of robust filtering (Luo and Hoteit 2011) or “residual nudging” in the observation space (Luo and Hoteit 2013). Various inflation methods have been proposed and studied in the literature (see, e.g., Altaf et al. 2013; Anderson and Anderson 1999; Anderson 2007, 2009; Bocquet and Sakov 2012; Hamill and Whitaker 2011; Luo and Hoteit 2011, 2013; Meng and Zhang 2007; Miyoshi 2011; Whitaker and Hamill 2012; Zhang et al. 2004). A numerical comparison of different inflation schemes is beyond the scope of the current work. In this study we adopt the conventional inflation scheme originally proposed by Anderson and Anderson (1999) in all of the numerical experiments. Specifically, we implemented this scheme in such a way that the forecast sample covariance is (in effect) multiplied by a constant factor *λ*^{2} ≡ (1 + *δ*)^{2} for a positive scalar *δ*.
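As a minimal sketch of the multiplicative inflation described above (not the authors' implementation), the forecast anomalies can be scaled by *λ* = 1 + *δ* about the ensemble mean, which multiplies the sample covariance by *λ*^{2} while leaving the mean unchanged. The function name `inflate_ensemble`, the array layout (one member per column), and the toy ensemble are assumptions made for illustration.

```python
import numpy as np

def inflate_ensemble(ensemble, delta):
    """Scale ensemble anomalies by lambda = 1 + delta about the ensemble mean.

    ensemble: array of shape (m_x, n), one ensemble member per column.
    """
    lam = 1.0 + delta
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + lam * (ensemble - mean)

# Toy forecast ensemble (hypothetical): 5 state variables, 10 members.
rng = np.random.default_rng(0)
forecast = rng.normal(size=(5, 10))
inflated = inflate_ensemble(forecast, delta=0.2)

cov_before = np.cov(forecast)   # rows are variables, columns are members
cov_after = np.cov(inflated)    # equals (1 + delta)**2 * cov_before
```

Scaling the anomalies rather than the covariance matrix itself keeps the operation cheap for large state dimensions, since the full covariance never needs to be formed.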

Localization is introduced into the EnKF in order to tackle the problems of rank deficiency and spuriously large cross covariances between different state variables (Hamill et al. 2001). One popular localization method is covariance localization (CL; see, e.g., Hamill et al. 2001). In this method, a tapering matrix based on the distances between the grid points of a physical model is computed. The Kalman update is then applied based on the Schur product (Horn and Johnson 1991, chapter 5) between the tapering matrix and the original sample forecast ensemble covariance. Compared to the original sample covariance matrix, the resulting “filtered” covariance matrix should have higher (full) rank and local cross covariances. A potential limitation of CL is that it is not fully consistent with the analysis ensemble sampling step of the square root EnKFs (Nerger et al. 2012a). As an alternative, we adopt another standard localization technique, called local analysis (LA; see, e.g., Cohn et al. 1998), in which the whole state space is divided into a set of disjoint local analysis domains (LADs), and the system state in a LAD is updated only using the observations within a preset distance to the LAD. The LA used in the experiments below is implemented in a way similar to that in CL. Specifically, the observation weighting is determined by the fifth-order polynomial tapering function [see, e.g., Hunt et al. (2007) and the references therein], together with a prespecified radius taken as the half-width of the cutoff distance. The relation between the CL and LA techniques was discussed and studied in Greybush et al. (2011), Janjić et al. (2011), and Sakov and Bertino (2011).
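The distance-based observation weighting can be illustrated with a short sketch. It assumes that the fifth-order tapering function is the compactly supported piecewise function of Gaspari and Cohn that is commonly used for this purpose, and that the prescribed radius is the half-width `half_width`, so that weights vanish at twice that distance. The function name and the sample distances are hypothetical.

```python
import numpy as np

def fifth_order_taper(distance, half_width):
    """Fifth-order piecewise taper (Gaspari-Cohn form), zero beyond 2 * half_width."""
    r = np.abs(np.atleast_1d(np.asarray(distance, dtype=float))) / half_width
    w = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    w[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                - (5.0 / 3.0) * ri**2 + 1.0)
    w[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return w

# Weights for observations at several distances (km) from a local analysis domain.
distances_km = np.array([0.0, 50.0, 100.0, 150.0, 200.0, 300.0])
weights = fifth_order_taper(distances_km, half_width=100.0)
```

In a local analysis, weights of this kind are typically applied to the observations (for example, by scaling the inverse observation error variances), so that distant stations contribute progressively less to the update of a given local domain.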

## 4. The ADCIRC model

The ADCIRC model (Luettich and Westerink 2005) solves the shallow water equations (SWEs) that describe the changes in sea surface elevation and depth-integrated horizontal flow on spatial domains such as the Gulf of Mexico, possibly including the western North Atlantic, as seen in Fig. 1. The ADCIRC model discretizes the SWEs using a finite element method defined on unstructured meshes in space and finite difference schemes in time. The wind-wave model Simulating Waves Nearshore (SWAN) for capturing wave-induced initial states was recently coupled to ADCIRC (Dietrich et al. 2011).

Many hindcast studies of hurricanes from 1965 to 2008 have been used to verify and validate the ADCIRC model (see, e.g., Westerink et al. 2008; Bunya et al. 2010; Dietrich et al. 2010; Kennedy et al. 2011; Hope et al. 2013). The model may be run in forecast mode where data on the hurricane track and forward speed, and wind characteristics (wind speed, central pressure, and radius-to-maximum winds), are obtained every 6 h from the National Weather Service, and a parametric wind field is generated that provides forcing to ADCIRC. The theoretical, numerical, algorithmic, and high-performance computing developments for the ADCIRC model are well documented, and we direct the interested readers to Luettich and Westerink (2005) as a good starting point. In the numerical experiments below, data were obtained from the ADCIRC hindcast studies, while the data assimilation experiments used the forecast mode of ADCIRC to propagate the state variables forward in time; see section 5 for more details.

## 5. Numerical experiments

In this section, results of the various EnKFs discussed in section 2—all equipped with the LA and inflation techniques—are presented. We use meteorological data from Hurricane Ike, which at its peak was a category 4 hurricane and was a category 2 hurricane upon making landfall along the upper Texas coast (Berg 2009). Hurricane Ike traveled through the Atlantic, Caribbean, and Gulf of Mexico before making landfall early on 13 September 2008, as shown in Fig. 2.

### a. Configuration

The assimilation experiments are conducted using two different configurations of ADCIRC. The first configuration uses a fine-resolution grid including the Gulf of Mexico and western North Atlantic and high-fidelity wind fields that are computed from wind data collected during the actual hurricane. We refer to this as the hindcast configuration. Specifically, the hindcast simulation is forced with data-assimilated winds and atmospheric pressure fields provided by Ocean Weather, Inc. (OWI). The hindcast simulation used 1-s time steps on a grid of 3 322 439 nodes corresponding to 6 615 381 elements discretizing the Gulf of Mexico and the western North Atlantic seaboard (see Fig. 1). Measurement data of water levels are extracted from the hindcast simulation and used for assimilation.

The second configuration contains model errors (with respect to the hindcast configuration) and is used as the forecast model in the filters. The forecast model is configured using a coarser-resolution grid including only the Gulf of Mexico and is forced with coarse global wind fields generated by the dynamic Holland model (Holland 1980) using the best possible hurricane track data obtained from the National Oceanic and Atmospheric Administration (NOAA) archive. We generally refer to the forecasts as coming from a “coarse model” to indicate the coarser resolutions used in this second configuration. Specifically, the forecasts from the coarse model used a time step of 10 s on a grid of 8006 nodes and 14 269 elements covering the Gulf of Mexico, as shown in Fig. 3. The main differences between the hindcast and the forecast configurations are summarized in Table 1. Observations extracted from the hindcast simulation are assimilated into the coarser model using the various EnKFs.

Table 1. Summary of differences between the hindcast (truth) simulation used to generate data and the simulations used in the data assimilation forecasting experiments.

Since the results of the hindcast studies have been validated, the corresponding global output is taken as the truth and is compared to the solution of the coarse model to evaluate and compare the performance of the various ensemble filters. In all the experiments, we set the standard deviation of the measurement noise of the hindcast data to produce an assumed 95% confidence interval of ±0.01 m, as in Butler et al. (2012). We expect that there can be large errors in absolute terms between the coarse model forecasts and the hindcast study due to the dissipation of water elevations across large elements. Thus, we use the relative improvements of errors in the forecasts of the coarse model to evaluate and compare the performances of the filtering methodologies. Another reason for using synthetic data from a hindcast simulation is that it limits the source of uncertainties in the error covariance matrices to the choice of the filtering scheme only. This allows for a full evaluation and a direct comparison of the various EnKFs’ performances.

For the coarse model, after a 24-h spinup period between 0000 UTC 9 September and 0000 UTC 10 September 2008, data are assimilated every 2 h until 0600 UTC 14 September 2008, 1 day after Hurricane Ike made landfall, resulting in 51 assimilation steps. The data that are assimilated come from 43 observation stations from the hindcast simulation. The locations of these observation stations are shown in Fig. 4. These are actual observation stations and their data may be exploited in any real-time extreme event scenario. We also note that the stations are all located near shore where the coarse model forecasts typically have significant errors and often fall below the recorded surge values. While the numerical experiments assimilate only synthetic data at these stations, these experiments demonstrate whether this existing distribution of the stations enables a relative improvement of the short-range forecast from the coarse model.
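The number of assimilation steps follows from the dates above. The short check below assumes that the first analysis takes place one 2-h interval after the end of the spinup period (0200 UTC 10 September); the variable names are illustrative only.

```python
from datetime import datetime, timedelta

spinup_end = datetime(2008, 9, 10, 0, 0)      # 0000 UTC 10 September 2008
last_analysis = datetime(2008, 9, 14, 6, 0)   # 0600 UTC 14 September 2008
interval = timedelta(hours=2)

# Assumption: the first analysis occurs one interval after the spinup ends.
analysis_times = []
t = spinup_end + interval
while t <= last_analysis:
    analysis_times.append(t)
    t += interval

n_steps = len(analysis_times)   # 102 h of assimilation at 2-h spacing
```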

To generate a representative initial ensemble with a small number of ensemble members, we apply an empirical orthogonal function (EOF) analysis by second-order exact sampling as done in earlier studies (Pham 2001; Hoteit et al. 2013). We simulated the ADCIRC model for 60 days using only tidal forcing to eliminate all transient behavior and recorded the model state every 5 h. The perturbations of these states from their mean are used to define a sample covariance matrix $\mathbf{P}$. With *σ*_{j} being the *j*th eigenvalue of $\mathbf{P}$ (in descending order), the ratio $\sum_{j=1}^{n-1}\sigma_j / \sum_{j}\sigma_j$ measures the quality, in the *L*^{2} norm, of approximations to the state in an (*n* − 1)-dimensional space and is useful in determining the ensemble size *n* given a prescribed *L*^{2} error tolerance; this ratio is also the percentage of variance retained by the EOFs. In the experiments below, we start with an ensemble size of *n* = 10 that retains approximately 90% of the variance of this sequence of states, suggesting, as expected, that the water elevation exhibits a low-dimensional structure when forced with tidal data.
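The variance-based choice of ensemble size can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the experiments: the snapshot matrix is synthetic, the helper names `variance_retained` and `ensemble_size_for` are hypothetical, and the eigenvalues of the sample covariance are obtained from the singular values of the anomaly matrix.

```python
import numpy as np

def variance_retained(snapshots):
    """Cumulative fraction of variance captured by the leading EOFs.

    snapshots: array of shape (m_x, n_snap), one recorded model state per column.
    """
    anomalies = snapshots - snapshots.mean(axis=1, keepdims=True)
    # Squared singular values of the anomalies are proportional to the
    # eigenvalues of the sample covariance matrix.
    s = np.linalg.svd(anomalies, compute_uv=False)
    eigvals = s**2 / (snapshots.shape[1] - 1)
    return np.cumsum(eigvals) / eigvals.sum()

def ensemble_size_for(snapshots, target=0.90):
    """Smallest n such that the n - 1 leading EOFs retain at least `target` variance."""
    frac = variance_retained(snapshots)
    n_modes = int(np.searchsorted(frac, target)) + 1
    return n_modes + 1

# Synthetic low-dimensional data: 288 snapshots (60 days sampled every 5 h).
rng = np.random.default_rng(1)
modes = rng.normal(size=(200, 3))
amplitudes = rng.normal(size=(3, 288)) * np.array([[10.0], [3.0], [1.0]])
snapshots = modes @ amplitudes + 0.01 * rng.normal(size=(200, 288))

frac = variance_retained(snapshots)
n = ensemble_size_for(snapshots, target=0.90)
```

Working with the singular values of the anomaly matrix avoids forming the *m*_{x} × *m*_{x} covariance matrix, which matters when the state dimension is large.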

### b. Results and discussion

To quantify and compare the various filter performances, an rms error metric is used. Figure 5 plots the average rms errors of the maximum water level forecasts for the Ike simulations using the SEnKF and the three SR-EnKFs with different values of inflation factor *λ* and radii (in kilometers). The assimilation results show that the SR-EnKFs perform very well with an ensemble of 10 members, though, as expected, the results are dependent on the localization radius. The optimal size for the LA varies from 25 to 100 km for all the SR-EnKFs.

The rms error of the SEIK filter varies from 0.58 to 0.75 m, with the smallest rms error obtained using *λ* = 1.2 and a radius of 100 km. Overall, the SEIK filter is able to reduce the rms error by almost 23% as compared to the forecasted average rms error when no localization is used. The ETKF and EAKF exhibit similar trends. The smallest rms error for the ETKF is obtained using *λ* = 1.2 and a radius of 25 km and for the EAKF using *λ* = 1.3 and a radius of 100 km. The SEIK and the ETKF showed very similar trends, while the EAKF provides comparable results with appropriate choices of localization and inflation. The EAKF is more sensitive to (and requires larger values of) the inflation. In particular, the EAKF requires stronger localization than the ETKF and SEIK and failed to provide significant improvements with large radii. Such a difference in behavior can possibly be attributed to the serial assimilation of the observations in the EAKF when it is equipped with LA. For any filter using a 2000-km radius (which is a large radius compared to the size of the Gulf of Mexico), we observe results that differ only slightly from the case where no localization is used.

By comparison, improvements are not as pronounced in the SEnKF with an ensemble of 10 members. The rms errors for the SEnKF vary between 0.66 and 0.75 m, with the smallest rms error obtained using *λ* = 1.2 and a radius of 500 km. Overall, in contrast to the three SR-EnKFs, the SEnKF shows no clear pattern of improvement over the forecasted average rms error when LA is used. It is likely that the large rms errors in the SEnKF are due to the observation sampling errors being amplified with the use of a small ensemble size in these runs, which is a documented phenomenon (Nerger et al. 2005).

Figure 6 shows the average rms errors of the maximum water level forecasts using the SEnKF and the ETKF for ensembles with *n* = 10, *n* = 20, and *n* = 40 members, respectively. Here, the SEnKF is compared only against the ETKF based on the results with an ensemble of size *n* = 10, where all three SR-EnKFs demonstrated comparable performances. As expected, the results show that the SEnKF performs better with increasing ensemble size, and a pattern becomes visible in the rms errors when the ensemble size reaches 40, as we get close to the number of assimilated observations. The rms errors for the SEnKF now vary between 0.54 and 0.75 m, with the smallest rms error obtained using *λ* = 1.1 and a radius of 100 km. Although the results from the ETKF remain comparatively better than the SEnKF, we expect that the SEnKF will converge to similar results with larger ensemble sizes. It is evident from Fig. 6 that the ETKF forecasts are only slightly improved as we increase the ensemble size, and the improvements are not as pronounced as in the SEnKF.

While the averaged rms errors provide a summary statistic of the estimation errors, they fail to provide useful information about the time or location where they occur. We are also interested in certain pointwise errors of maximum water level forecasts along the coast (29°–29.8°N, 94.4°–95.25°W; see Fig. 7) and forecasts of water elevations at particular times along the coast. Specifically, the forecast errors in the times leading up to the landfall event for Hurricane Ike are of particular importance and interest. Since it is not possible to study each configuration, the figures presented below illustrate the improvements in the errors obtained using the ETKF compared to the SEnKF for 2-h forecasts of the storm surge using the best values of the inflation factor *λ* and radii in the LA.

Figures 7 and 8 show plots of the errors of the forecasts and analyses of water elevations with respect to the truth at 0600 UTC 13 September 2008 (about an hour before Ike made landfall at 0710 UTC) and 0800 UTC 13 September 2008 (about an hour after Ike made landfall), respectively. The results are obtained from the empirically determined best choices of inflation factor and LA for the ETKF (*n* = 10) and the SEnKF (*n* = 40). In general, all forecasts underpredict the level of the surge, which we expect given that the coarse discretization in the forecast model causes the dissipation of water levels to be more pronounced. The analysis step efficiently improves the quality of the state estimates and brings the model into better agreement with the data, and the pertinent comparative question is which filter provides the better relative errors in the forecast. It is evident that the ETKF provides more accurate forecasts, and especially analyses, compared to the SEnKF across a majority of the area near the coastline during the landfall period. The errors inside the bay are not resolved after the analysis. These errors are more pronounced in the SEnKF and are again due to the different configurations of the forecast and hindcast models, as described in section 5a. Because of the lack of observation stations in this area, these errors are not reduced after the filter update step.

Figures 9 and 10 show plots of the hydrographs of data from the hindcast at two stations close to the landfall areas. In these hydrographs, the stars denote the true measurements at the assimilation times, the plus signs denote the forecasted results with the 95% confidence intervals represented by the vertical dashed lines centered at plus signs, and the circles are the analyzed results for the ETKF filter with *λ* = 1.2 and a radius of 25 km and the SEnKF filter with *λ* = 1.1 and a radius of 100 km, respectively. We observe that forecast errors increase right before or during the surge. The analysis steps bring the model closer to the truth over the entire assimilation window. In particular, the ETKF filter performs very well, providing accurate forecast updates. Overall, the estimated uncertainties are quite reasonable with the truth falling within the estimated 95% confidence intervals.

Finally, Fig. 11 compares the forecast ensemble standard deviation and rms errors between the forecast ensemble members and the truth for the three stations close to the landfall area during the landfall period. These results are again for the best choices of inflation factor and localization radii (i.e., the ETKF with *λ* = 1.2 and a radius of 25 km and the SEnKF with *λ* = 1.1 and a radius of 100 km). We observe that the ensemble variances are generally comparable to the rms error. The ETKF produces rms errors that are consistently smaller than those of the SEnKF during the storm period. By comparison, the rms error in the SEnKF is more consistent with the forecasted ensemble variances, particularly during the few hours preceding the landfall.

## 6. Conclusions

We investigated and compared the impacts of covariance inflation and localization on four ensemble Kalman filters, including the stochastic EnKF (SEnKF), the singular evolutive interpolated Kalman (SEIK) filter, the ensemble transform Kalman filter (ETKF), and the ensemble adjustment Kalman filter (EAKF), in the context of real-time short-range storm surge forecasting. To the best of the authors’ knowledge, this is the first study in which the local analysis (LA) technique is incorporated into these ensemble filters for realistic storm surge forecasting. The experimental results showed that the LA technique can improve the reliability of the surge forecast if the range of influence of the observations is properly specified, although it may not be possible to completely solve the problem of loss of accuracy during the storm surge using a coarse forecast model. Such an issue may instead be treated by including model error into the Kalman filter equations, resolving coarse meshes further, and/or expanding the state vector to include atmospheric parameters defining the wind field. These are topics of ongoing and future research.

The assimilation results also suggest that the (deterministic) square root ensemble Kalman filters (SR-EnKFs) may perform reasonably well even when implemented with small ensemble sizes. Overall, they provided comparable performances, particularly the ETKF and the SEIK. The EAKF was shown to be more sensitive to the choice of inflation and localization, requiring more inflation and stronger localization than the SEIK and the ETKF. The optimal localization radius seems to lie in the same range for all the SR-EnKFs. The SEnKF requires larger ensemble sizes in order to provide results comparable to the other filtering schemes. This is expected, as observation sampling errors are more pronounced in the SEnKF when implemented with small ensemble sizes, consistent with the findings of earlier studies.

## Acknowledgments

Research reported in this publication was supported by the King Abdullah University of Science and Technology (KAUST). X. Luo would like to thank the IRIS/CIPR cooperative research project “Integrated Workflow and Realistic Geology,” which is funded by industry partners ConocoPhillips, Eni, Petrobras, Statoil, and Total, as well as the Research Council of Norway (PETROMAKS) for financial support.

## APPENDIX

### Assimilation Schemes

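Suppose that there is an *n*-member analysis ensemble $\mathbf{X}_{k-1}^a = \{\mathbf{x}_{k-1,i}^a : i = 1, \dots, n\}$ available at the (*k* − 1)th analysis step, then the set

$$\mathbf{X}_k^f = \{\mathbf{x}_{k,i}^f : \mathbf{x}_{k,i}^f = \mathcal{M}_{k,k-1}(\mathbf{x}_{k-1,i}^a) + \mathbf{u}_{k,i},\ i = 1, \dots, n\}$$

is the forecast ensemble at the *k*th step, where $\mathcal{M}_{k,k-1}$ is the transition operator and $\mathbf{u}_{k,i}$ are samples of the dynamical noise. The various ensemble Kalman filters differ from each other in their implementations of the assimilation schemes at the analysis step. These are discussed in more detail below.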
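The forecast step can be sketched in a few lines. The example is a toy illustration under stated assumptions, not ADCIRC: the function name `forecast_step`, the two-dimensional linear transition matrix `A`, and the Gaussian dynamical noise with covariance `noise_cov` are all hypothetical.

```python
import numpy as np

def forecast_step(analysis_ensemble, transition, noise_cov, rng):
    """Propagate each analysis member through the model and add sampled dynamical noise."""
    m_x, n = analysis_ensemble.shape
    propagated = np.column_stack(
        [transition(analysis_ensemble[:, i]) for i in range(n)]
    )
    noise = rng.multivariate_normal(np.zeros(m_x), noise_cov, size=n).T
    return propagated + noise

# Toy linear model (hypothetical): a small rotation with slight damping.
theta = 0.1
A = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
rng = np.random.default_rng(0)
analysis = rng.normal(size=(2, 10))            # 2 state variables, 10 members
forecast = forecast_step(analysis, lambda x: A @ x, 1e-4 * np.eye(2), rng)

forecast_mean = forecast.mean(axis=1)          # sample mean of the forecast ensemble
forecast_cov = np.cov(forecast)                # sample covariance of the forecast ensemble
```

Each member is propagated independently, which is why this step parallelizes naturally over the ensemble.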

#### a. The stochastic ensemble Kalman filter

The SEnKF is the original ensemble Kalman filter as introduced by Evensen (1994) and later updated by Burgers et al. (1998) to include stochastic perturbations to the observations; hence its name, stochastic EnKF.

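Let $\hat{\mathbf{x}}_k^f$ and $\hat{\mathbf{P}}_k^f$ denote the sample mean and covariance of the forecast ensemble $\{\mathbf{x}_{k,i}^f\}_{i=1}^{n}$,

$$\hat{\mathbf{x}}_k^f = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_{k,i}^f, \qquad \hat{\mathbf{P}}_k^f = \frac{1}{n-1}\sum_{i=1}^{n}(\mathbf{x}_{k,i}^f - \hat{\mathbf{x}}_k^f)(\mathbf{x}_{k,i}^f - \hat{\mathbf{x}}_k^f)^T.$$

When a new observation $\mathbf{y}_k$ is made available, one updates each member of the forecast ensemble by

$$\mathbf{x}_{k,i}^a = \mathbf{x}_{k,i}^f + \mathbf{K}_k\,(\mathbf{y}_{k,i}^s - \mathbf{H}_k\mathbf{x}_{k,i}^f), \qquad i = 1, \dots, n, \tag{A3}$$

where $\mathbf{H}_k$ is the (linear) observation operator, $\mathbf{y}_{k,i}^s$ are *n* samples from the normal distribution of mean $\mathbf{y}_k$ and covariance $\mathbf{R}_k$, and $\mathbf{K}_k = \hat{\mathbf{P}}_k^f\mathbf{H}_k^T(\mathbf{H}_k\hat{\mathbf{P}}_k^f\mathbf{H}_k^T + \mathbf{R}_k)^{-1}$ is the Kalman gain matrix. The analysis ensemble $\{\mathbf{x}_{k,i}^a\}_{i=1}^{n}$ at the *k*th analysis step is thus obtained from Eq. (A3). Propagating the analysis ensemble forward through the dynamical model, one obtains the forecast ensemble at the next assimilation cycle.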
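A stochastic EnKF analysis step of this kind can be sketched as follows. The sketch assumes a linear observation operator given as a matrix `H`, Gaussian observation noise with covariance `R`, and no inflation or localization; the function name `senkf_analysis` and the toy setup are hypothetical.

```python
import numpy as np

def senkf_analysis(forecast, y, H, R, rng):
    """Update every forecast member with its own perturbed observation.

    forecast: (m_x, n) ensemble, y: (m_y,) observation,
    H: (m_y, m_x) linear observation operator, R: (m_y, m_y) error covariance.
    """
    n = forecast.shape[1]
    P_f = np.atleast_2d(np.cov(forecast))                   # sample forecast covariance
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)        # Kalman gain
    y_pert = rng.multivariate_normal(y, R, size=n).T        # (m_y, n) perturbed observations
    return forecast + K @ (y_pert - H @ forecast)

# Toy setup: three state variables, only the first one is observed.
rng = np.random.default_rng(0)
forecast = rng.normal(size=(3, 500))
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.01]])
y = np.array([2.0])
analysis = senkf_analysis(forecast, y, H, R, rng)
```

Because the perturbed observations are random draws, the analysis statistics carry sampling noise that shrinks only as the ensemble grows, which is the mechanism usually invoked to explain the weaker small-ensemble behavior of stochastic filters.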

#### b. The (deterministic) square root ensemble Kalman filters

The SR-EnKFs do not require perturbing the observations in order to update the forecast ensemble. In contrast with the SEnKF, which updates each ensemble member, the SR-EnKFs only update the forecast mean and a square root matrix of the forecast covariance matrix, in the same way as in the square root Kalman filter (see, e.g., Simon 2006, chapter 6). An analysis ensemble is then generated based on the updated mean and square root matrix.

In what follows we discuss three of the SR-EnKFs, namely, the ETKF (see Bishop et al. 2001; Wang et al. 2004), SEIK (see, e.g., Pham 2001), and EAKF (see Anderson 2001).

##### 1) The ensemble transform Kalman filter

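Suppose that the forecast ensemble has the sample mean $\bar{\mathbf{x}}^f_k$, and that $\mathbf{S}^f_k$ is a square root matrix of the sample forecast covariance, with columns $(\mathbf{x}^f_{k,i} - \bar{\mathbf{x}}^f_k)/\sqrt{n-1}$. Given the incoming observation $\mathbf{y}_k$, the forecast mean $\bar{\mathbf{x}}^f_k$ and the square root matrix $\mathbf{S}^f_k$ are updated by

$$\bar{\mathbf{x}}^a_k = \bar{\mathbf{x}}^f_k + \mathbf{K}_k\left(\mathbf{y}_k - \mathbf{H}_k\bar{\mathbf{x}}^f_k\right), \qquad \mathbf{S}^a_k = \mathbf{S}^f_k\mathbf{T}_k\boldsymbol{\Xi}_k,$$

where $\mathbf{K}_k = \mathbf{S}^f_k(\mathbf{H}_k\mathbf{S}^f_k)^T\left[\mathbf{H}_k\mathbf{S}^f_k(\mathbf{H}_k\mathbf{S}^f_k)^T + \mathbf{R}_k\right]^{-1}$ is the Kalman gain of the ETKF. The updated state $\bar{\mathbf{x}}^a_k$ is the analysis mean; $\mathbf{T}_k$, called the transform matrix (Bishop et al. 2001; Wang et al. 2004), is a square root of $\boldsymbol{\Phi}_k = \left[\mathbf{I}_n + (\mathbf{H}_k\mathbf{S}^f_k)^T\mathbf{R}_k^{-1}\mathbf{H}_k\mathbf{S}^f_k\right]^{-1}$ (Bishop et al. 2001; Wang et al. 2004); $\boldsymbol{\Xi}_k$, called the centering matrix, satisfies $\boldsymbol{\Xi}_k(\boldsymbol{\Xi}_k)^T = \mathbf{I}_n$ and $\mathbf{S}^f_k\mathbf{T}_k\boldsymbol{\Xi}_k\mathbf{1}_n = \mathbf{0}$, with $\mathbf{I}_n$ being the $n$-dimensional identity matrix and $\mathbf{1}_n$ being the $n$-dimensional vector whose elements are all equal to 1.

In this work, the centering matrix used in the ETKF is the same as that in Wang et al. [2004, their Eq. (C15)]. Other ways of constructing centering matrices are also available in the literature (see, e.g., Pham 2001). For the experiments in this work, it seems that different centering matrices might not significantly modify the overall behavior of the filters. For instance, if the centering matrix used in the ETKF is replaced by the one in Pham (2001), the assimilation results do not change much.

Given $\bar{\mathbf{x}}^a_k$ and $\mathbf{S}^a_k$, the analysis ensemble is generated by

$$\mathbf{x}^a_{k,i} = \bar{\mathbf{x}}^a_k + \sqrt{n-1}\,\left(\mathbf{S}^a_k\right)_i, \quad i = 1, \dots, n,$$

where $(\mathbf{S}^a_k)_i$ denotes the $i$th column of $\mathbf{S}^a_k$.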
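The ETKF update can be sketched in NumPy as follows. This is an illustrative variant, not the exact configuration used in this work: it assumes a linear observation operator `H` and uses the symmetric square root of the transform, for which the identity matrix already acts as a valid centering matrix, rather than the centering matrix of Wang et al. (2004). All function and variable names are hypothetical.

```python
import numpy as np

def etkf_update(forecast, y, H, R):
    """ETKF analysis step with a symmetric square root transform.

    forecast: (m, n) array, one member per column; H: (p, m); R: (p, p).
    """
    n = forecast.shape[1]
    mean_f = forecast.mean(axis=1, keepdims=True)
    S_f = (forecast - mean_f) / np.sqrt(n - 1)       # square root of P^f
    HS = H @ S_f
    # Eigen-decomposition of (HS)^T R^{-1} (HS) = E diag(gamma) E^T.
    gamma, E = np.linalg.eigh(HS.T @ np.linalg.solve(R, HS))
    gamma = np.clip(gamma, 0.0, None)                # guard against round-off
    # Symmetric square root of [I + (HS)^T R^{-1} HS]^{-1}. With this choice the
    # columns of S_f @ T sum to zero, so no extra centering matrix is applied.
    T = (E / np.sqrt(1.0 + gamma)) @ E.T
    # Mean update with K = S_f (HS)^T [HS (HS)^T + R]^{-1}.
    K = S_f @ np.linalg.solve(HS @ HS.T + R, HS).T
    mean_a = mean_f + K @ (y.reshape(-1, 1) - H @ mean_f)
    S_a = S_f @ T
    # Rebuild the analysis ensemble from the updated mean and square root.
    return mean_a + np.sqrt(n - 1) * S_a

rng = np.random.default_rng(1)
Xf = rng.normal(size=(4, 10))                        # 4 state variables, 10 members
H = rng.normal(size=(2, 4))
R = np.diag([0.5, 2.0])
y = np.array([0.3, -1.2])
Xa = etkf_update(Xf, y, H, R)
```

In this sketch the sample mean and sample covariance of `Xa` coincide with the Kalman filter analysis mean and covariance computed from the sample forecast statistics, which is a convenient sanity check for any square root implementation.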

##### 2) The singular evolutive interpolated Kalman filter

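In the SEIK, the forecast covariance is factored as $\mathbf{P}^f_k = \mathbf{L}_k\mathbf{G}\mathbf{L}_k^T$, where $\mathbf{L}_k = \left[\mathbf{x}^f_{k,1} - \bar{\mathbf{x}}^f_k, \dots, \mathbf{x}^f_{k,n-1} - \bar{\mathbf{x}}^f_k\right]$ and $\mathbf{G} = (n-1)^{-1}\left(\mathbf{I}_{n-1} + \mathbf{1}_{n-1}\mathbf{1}_{n-1}^T\right)$. Given $\mathbf{L}_k$ and $\mathbf{y}_k$, the analysis mean $\bar{\mathbf{x}}^a_k$ and the analysis covariance $\mathbf{P}^a_k$ are given by

$$\bar{\mathbf{x}}^a_k = \bar{\mathbf{x}}^f_k + \mathbf{L}_k\mathbf{U}_k(\mathbf{H}_k\mathbf{L}_k)^T\mathbf{R}_k^{-1}\left(\mathbf{y}_k - \mathbf{H}_k\bar{\mathbf{x}}^f_k\right), \qquad \mathbf{P}^a_k = \mathbf{L}_k\mathbf{U}_k\mathbf{L}_k^T,$$

with $\mathbf{U}_k = \left[\mathbf{G}^{-1} + (\mathbf{H}_k\mathbf{L}_k)^T\mathbf{R}_k^{-1}\mathbf{H}_k\mathbf{L}_k\right]^{-1}$. The matrix $\mathbf{L}_k\mathbf{C}_k$ is a square root of the analysis covariance, provided that $\mathbf{C}_k$ is a square root of $\mathbf{U}_k$. In practice, $\mathbf{C}_k$ is obtained by conducting Cholesky decomposition on $\mathbf{U}_k^{-1}$. An $(n-1) \times n$ matrix $\boldsymbol{\Xi}_k$ is also constructed, following the method in Pham (2001), such that the conditions $\boldsymbol{\Xi}_k\mathbf{1}_n = \mathbf{0}$ and $\boldsymbol{\Xi}_k(\boldsymbol{\Xi}_k)^T = \mathbf{I}_{n-1}$ hold. The analysis ensemble is then generated by

$$\mathbf{x}^a_{k,i} = \bar{\mathbf{x}}^a_k + \sqrt{n-1}\,\left(\mathbf{L}_k\mathbf{C}_k\boldsymbol{\Xi}_k\right)_i, \quad i = 1, \dots, n,$$

where $(\mathbf{L}_k\mathbf{C}_k\boldsymbol{\Xi}_k)_i$ denotes the $i$th column vector of $\mathbf{L}_k\mathbf{C}_k\boldsymbol{\Xi}_k$.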
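A compact NumPy sketch of a SEIK-type analysis step is given below, written in the standard error-subspace form with an unbiased $1/(n-1)$ covariance normalization. Several pieces are assumptions made for illustration: the observation operator `H` is linear, and the zero-sum orthonormal matrix `Xi` is built deterministically from a QR factorization instead of the random construction of Pham (2001). The names are hypothetical.

```python
import numpy as np

def seik_update(forecast, y, H, R):
    """SEIK-type analysis step; forecast is (m, n) with one member per column."""
    n = forecast.shape[1]
    mean_f = forecast.mean(axis=1, keepdims=True)
    # T maps the ensemble to n-1 deviations from the mean: L = X^f T.
    T = np.vstack([np.eye(n - 1), np.zeros((1, n - 1))]) - 1.0 / n
    L = forecast @ T
    HL = H @ L
    # U^{-1} = G^{-1} + (HL)^T R^{-1} HL, where G^{-1} = (n - 1) T^T T.
    U_inv = (n - 1) * (T.T @ T) + HL.T @ np.linalg.solve(R, HL)
    # Analysis mean: mean_f + L U (HL)^T R^{-1} (y - H mean_f).
    innovation = y.reshape(-1, 1) - H @ mean_f
    mean_a = mean_f + L @ np.linalg.solve(
        U_inv, HL.T @ np.linalg.solve(R, innovation))
    # Cholesky factor U^{-1} = B B^T, so C = B^{-T} satisfies C C^T = U.
    B = np.linalg.cholesky(U_inv)
    LC = np.linalg.solve(B, L.T).T                   # equals L @ C
    # Xi: (n-1) x n with orthonormal rows that each sum to zero, taken from a
    # QR factorization whose first column is parallel to the vector of ones.
    Q, _ = np.linalg.qr(np.column_stack([np.ones(n), np.eye(n)[:, : n - 1]]))
    Xi = Q[:, 1:].T
    return mean_a + np.sqrt(n - 1) * LC @ Xi

rng = np.random.default_rng(2)
Xf = rng.normal(size=(4, 10))                        # 4 state variables, 10 members
H = rng.normal(size=(2, 4))
R = np.diag([0.5, 2.0])
y = np.array([0.3, -1.2])
Xa = seik_update(Xf, y, H, R)
```

The update is carried out in the $(n-1)$-dimensional space spanned by the columns of `L`, so only an $(n-1) \times (n-1)$ matrix has to be factored, regardless of the state dimension.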

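A side remark is that the ETKF can be derived through the SEIK and vice versa (Nerger et al. 2012b). Indeed, such a link is manifested if one rewrites the forecast covariance matrix in the ETKF as $\mathbf{P}^f_k = \mathbf{S}^f_k\mathbf{I}_n(\mathbf{S}^f_k)^T$, where $\mathbf{S}^f_k$ is a square root of the forecast covariance, so that $\mathbf{S}^f_k$ and $\mathbf{I}_n$ play similar roles to the matrices $\mathbf{L}_k$ and $\mathbf{G}$ that factor the forecast covariance in the SEIK, $\mathbf{P}^f_k = \mathbf{L}_k\mathbf{G}\mathbf{L}_k^T$.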

##### 3) The ensemble adjustment Kalman filter

Compared with the ETKF and the SEIK, the EAKF (Anderson 2001) follows an alternative path to conduct the square root update. As discussed above, the square root update formulae in the ETKF and SEIK are in the form of *right multiplication*. In contrast, in the EAKF, the square root update formula is in the form of *left multiplication* instead (Anderson 2001).

In the context of the EAKF, it is customary to assume that the observation error covariance matrix $\mathbf{R}_k$ is diagonal (otherwise a prewhitening procedure can be applied to achieve this). Under this assumption, one can assimilate the incoming observations serially. Following Anderson (2007, 2009), we use a single scalar observation to demonstrate the assimilation algorithm in the EAKF. To this end, we first assume that the observation vector $\mathbf{y}_k \equiv y_k$ is a scalar random variable whose observation error has zero mean and variance $R_k$. If the observation vector has more than one element, then one can assimilate it serially: the analysis ensemble obtained after assimilating, say, the first observation element serves as the forecast ensemble for assimilating the second element, and so on.

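Suppose that the $i$th ensemble member $\mathbf{x}^f_{k,i}$ of the forecast ensemble has $m_x$ elements $(\mathbf{x}^f_{k,i})_j$ ($j = 1, \dots, m_x$), such that $\mathbf{x}^f_{k,i} = \left[(\mathbf{x}^f_{k,i})_1, \dots, (\mathbf{x}^f_{k,i})_{m_x}\right]^T$, and let $y^f_{k,i} = \mathbf{H}_k\mathbf{x}^f_{k,i}$ be its projection onto the observation space, with sample mean $\bar{y}^f_k$ and sample variance $\sigma^2_{f,k}$. Given the observation $y_k$, one updates the projections to

$$y^a_{k,i} = \bar{y}^a_k + \sqrt{\sigma^2_{a,k}/\sigma^2_{f,k}}\,\left(y^f_{k,i} - \bar{y}^f_k\right), \qquad \sigma^2_{a,k} = \left(\sigma^{-2}_{f,k} + R_k^{-1}\right)^{-1}, \qquad \bar{y}^a_k = \sigma^2_{a,k}\left(\bar{y}^f_k/\sigma^2_{f,k} + y_k/R_k\right),$$

which defines the increment $\delta y_{k,i} = y^a_{k,i} - y^f_{k,i}$ with respect to $y^f_{k,i}$. The $j$th element $(\delta\mathbf{x}_{k,i})_j$ of the state increment $\delta\mathbf{x}_{k,i}$ is then obtained by linear regression,

$$(\delta\mathbf{x}_{k,i})_j = \frac{c_{k,j}}{\sigma^2_{f,k}}\,\delta y_{k,i},$$

where $c_{k,j}$ is the sample covariance between $(\mathbf{x}^f_{k,i})_j$ and $y^f_{k,i}$, and the analysis ensemble member is $\mathbf{x}^a_{k,i} = \mathbf{x}^f_{k,i} + \delta\mathbf{x}_{k,i}$.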
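The scalar update and its serial use can be sketched in plain Python. The sketch follows the commonly used two-step form of the EAKF (update the observed quantity, then regress the increments onto each state element) and makes a simplifying assumption that is not part of the filter itself: each scalar observation measures a single state element, selected by the hypothetical argument `obs_index`. It also assumes the ensemble has nonzero spread in the observed element.

```python
def eakf_scalar_update(ensemble, obs_index, y, r):
    """Assimilate one scalar observation of state element `obs_index`.

    ensemble: list of n members, each a list of m_x floats.
    y: observed value; r: observation error variance.
    """
    n = len(ensemble)
    m_x = len(ensemble[0])
    # Project the ensemble onto observation space (here H picks one element).
    y_f = [member[obs_index] for member in ensemble]
    y_mean = sum(y_f) / n
    var_f = sum((v - y_mean) ** 2 for v in y_f) / (n - 1)
    # Scalar Kalman update of the mean and variance of the observed quantity.
    var_a = 1.0 / (1.0 / var_f + 1.0 / r)
    mean_a = var_a * (y_mean / var_f + y / r)
    # Shift and contract the projected ensemble; dy[i] is the increment.
    shrink = (var_a / var_f) ** 0.5
    dy = [mean_a + shrink * (v - y_mean) - v for v in y_f]
    # Regress the increments onto every state element j.
    updated = [list(member) for member in ensemble]
    for j in range(m_x):
        x_mean = sum(member[j] for member in ensemble) / n
        cov = sum((member[j] - x_mean) * (v - y_mean)
                  for member, v in zip(ensemble, y_f)) / (n - 1)
        for i in range(n):
            updated[i][j] += cov / var_f * dy[i]
    return updated

def eakf_serial(ensemble, observations):
    """Assimilate (obs_index, y, r) triples one at a time."""
    for obs_index, y, r in observations:
        # The analysis ensemble of one step is the forecast ensemble of the next.
        ensemble = eakf_scalar_update(ensemble, obs_index, y, r)
    return ensemble

members = [[0.0, 1.0], [2.0, 3.0], [4.0, 2.0]]
analysis = eakf_serial(members, [(0, 1.0, 1.0), (1, 2.5, 0.5)])
```

No matrix inversion is needed in this form, which is one practical reason for assimilating observations serially when the observation error covariance is diagonal.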

## REFERENCES

Altaf, M. U., T. Butler, X. Luo, C. Dawson, T. Mayo, and H. Hoteit, 2013: Improving short-range ensemble Kalman storm surge forecasting using robust adaptive inflation. *Mon. Wea. Rev.*, **141**, 2705–2720, doi:10.1175/MWR-D-12-00310.1.

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.

Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. *Tellus*, **59A**, 210–224, doi:10.1111/j.1600-0870.2006.00216.x.

Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83, doi:10.1111/j.1600-0870.2008.00361.x.

Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. *Mon. Wea. Rev.*, **127**, 2741–2758, doi:10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.

Beezley, J. D., and J. Mandel, 2008: Morphing ensemble Kalman filters. *Tellus*, **60A**, 131–140, doi:10.1111/j.1600-0870.2007.00275.x.

Bennet, A., 1992: *Inverse Methods in Physical Oceanography.* Cambridge University Press, 346 pp.

Berg, R., 2009: Tropical cyclone report: Hurricane Ike. National Hurricane Center Rep., 55 pp.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436, doi:10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.

Blain, C. A., J. J. Westrink, and R. A. Luettich, 1995: Application of a domain size and gridding strategy in the prediction of hurricane storm surge. *Computer Modeling of Seas and Coastal Regions II,* L. C. Wrobel, C. A. Brebbia, and L. Traversoni, Eds., Computational Mechanics Publications, 301–308.

Blake, E. S., C. W. Landsea, and E. J. Gibney, 2011: The deadliest, costliest, and most intense United States tropical cyclones from 1851 to 2010 (and other frequently requested hurricane facts). NOAA Tech. Memo. NWSNHC-6, 47 pp.

Bocquet, M., and P. Sakov, 2012: Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems. *Nonlinear Processes Geophys.*, **19**, 383–399, doi:10.5194/npg-19-383-2012.

Brown, J. D., T. Spencer, and I. Moeller, 2007: Modelling storm surge flooding of an urban area with particular reference to modelling uncertainties: A case study of Canvey Island, United Kingdom. *Water Resour. Res.*, **43**, W06402, doi:10.1029/2005WR004597.

Bunya, S., and Coauthors, 2010: A high-resolution coupled riverine flow, tide, wind, wind wave, and storm surge model for southern Louisiana and Mississippi. Part I: Model development and validation. *Mon. Wea. Rev.*, **138**, 345–377, doi:10.1175/2009MWR2906.1.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.

Butler, T., M. U. Altaf, C. Dawson, I. Hoteit, X. Luo, and T. Mayo, 2012: Data assimilation within the advanced circulation (ADCIRC) modeling framework for hurricane storm surge forecasting. *Mon. Wea. Rev.*, **140**, 2215–2231, doi:10.1175/MWR-D-11-00118.1.

Cohn, S. E., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. *J. Meteor. Soc. Japan*, **74**, 63–75.

Cohn, S. E., A. da Silva, J. Guo, M. Sienkiewicz, and D. Lamich, 1998: Assessing the effects of data selection with the DAO physical-space statistical analysis system. *Mon. Wea. Rev.*, **126**, 2913–2926, doi:10.1175/1520-0493(1998)126<2913:ATEODS>2.0.CO;2.

Dietrich, J. C., and Coauthors, 2010: A high-resolution coupled riverine flow, tide, wind, wind wave, and storm surge model for southern Louisiana and Mississippi. Part II: Synoptic description and analyses of Hurricanes Katrina and Rita. *Mon. Wea. Rev.*, **138**, 378–404, doi:10.1175/2009MWR2907.1.

Dietrich, J. C., and Coauthors, 2011: Hurricane Gustav (2008) waves and storm surge: Hindcast, synoptic analysis, and validation in southern Louisiana. *Mon. Wea. Rev.*, **139**, 2488–2522, doi:10.1175/2011MWR3611.1.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99**, 10 143–10 162, doi:10.1029/94JC00572.

Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. *Ocean Dyn.*, **53**, 343–367, doi:10.1007/s10236-003-0036-9.

Fleming, J. G., C. W. Fulcher, R. A. Luettich, B. D. Estrade, G. D. Allen, and H. S. Winer, 2008: A real time storm surge forecasting system using ADCIRC. *Estuarine and Coastal Modelling,* M. L. Spaulding, Ed., American Society of Civil Engineers, 893–912, doi:10.1061/40990(324)48.

Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. R. Hunt, 2011: Balance and ensemble Kalman filter localization techniques. *Mon. Wea. Rev.*, **139**, 511–522, doi:10.1175/2010MWR3328.1.

Hamill, T. M., and J. S. Whitaker, 2011: What constrains spread growth in forecasts initialized from ensemble Kalman filters? *Mon. Wea. Rev.*, **139**, 117–131, doi:10.1175/2010MWR3246.1.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

Hamill, T. M., J. S. Whitaker, J. L. Anderson, and C. Snyder, 2009: Comments on “Sigma-point Kalman filter data assimilation methods for strongly nonlinear systems.” *J. Atmos. Sci.*, **66**, 3498–3500, doi:10.1175/2009JAS3245.1.

Heaps, N. S., 1983: Storm surges, 1967–1982. *Geophys. J. Int.*, **74**, 331–376, doi:10.1111/j.1365-246X.1983.tb01883.x.

Holland, G., 1980: An analytic model of the wind and pressure profiles in hurricanes. *Mon. Wea. Rev.*, **108**, 1212–1218, doi:10.1175/1520-0493(1980)108<1212:AAMOTW>2.0.CO;2.

Hope, M. E., and Coauthors, 2013: Hindcast and validation of Hurricane Ike (2008) waves, forerunner, and storm surge. *J. Geophys. Res. Oceans*, **118**, 4424–4460, doi:10.1002/jgrc.20314.

Horn, R., and C. Johnson, 1991: *Topics in Matrix Analysis.* Cambridge University Press, 607 pp.

Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. *J. Mar. Syst.*, **36**, 101–127, doi:10.1016/S0924-7963(02)00129-X.

Hoteit, I., T. Hoar, G. Gopalakrishnan, J. Anderson, N. Collins, B. Cornuelle, A. Kohl, and P. Heimbach, 2013: A MITgcm/DART ensemble analysis and prediction system with application to the Gulf of Mexico. *Dyn. Atmos. Oceans*, **63**, 1–23, doi:10.1016/j.dynatmoce.2013.03.002.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811, doi:10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.

Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. *Physica D*, **230**, 112–126, doi:10.1016/j.physd.2006.11.008.

Janjić, T., L. Nerger, A. Albertella, J. Schröter, and S. Skachko, 2011: On domain localization in ensemble-based Kalman filter algorithms. *Mon. Wea. Rev.*, **139**, 2046–2060, doi:10.1175/2011MWR3552.1.

Kalman, R., 1960: A new approach to linear filtering and prediction problems. *J. Fluids Eng.*, **82**, 35–45, doi:10.1115/1.3662552.

Kennedy, A., and Coauthors, 2011: Origin of the Hurricane Ike forerunner surge. *Geophys. Res. Lett.*, **38**, L08608, doi:10.1029/2011GL047090.

Luettich, R., and J. Westerink, 2005: ADCIRC: A parallel advanced circulation model for oceanic, coastal and estuarine waters. User’s manual for version 45.08. [Available online at http://adcirc.org/.]

Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. *Physica D*, **238**, 549–562, doi:10.1016/j.physd.2008.12.003.

Luo, X., and I. Hoteit, 2011: Robust ensemble filtering and its relation to covariance inflation in the ensemble Kalman filter. *Mon. Wea. Rev.*, **139**, 3938–3953, doi:10.1175/MWR-D-10-05068.1.

Luo, X., and I. Hoteit, 2012: Ensemble Kalman filtering with residual nudging. *Tellus*, **64A**, 17130, doi:10.3402/tellusa.v64i0.17130.

Luo, X., and I. Hoteit, 2013: Covariance inflation in the ensemble Kalman filter: A residual nudging perspective and some implications. *Mon. Wea. Rev.*, **141**, 3360–3368, doi:10.1175/MWR-D-13-00067.1.

Meng, Z., and F. Zhang, 2007: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part II: Imperfect model experiments. *Mon. Wea. Rev.*, **135**, 1403–1423, doi:10.1175/MWR3352.1.

Miyoshi, T., 2011: The Gaussian approach to adaptive covariance inflation and its implementation with the local ensemble transform Kalman filter. *Mon. Wea. Rev.*, **139**, 1519–1535, doi:10.1175/2010MWR3570.1.

Nerger, L., W. Hiller, and J. Schröter, 2005: A comparison of error subspace Kalman filters. *Tellus*, **57A**, 715–735, doi:10.1111/j.1600-0870.2005.00141.x.

Nerger, L., T. Janjić, J. Schröter, and W. Hiller, 2012a: A regulated localization scheme for ensemble-based Kalman filters. *Quart. J. Roy. Meteor. Soc.*, **138**, 802–812, doi:10.1002/qj.945.

Nerger, L., T. Janjić, J. Schröter, and W. Hiller, 2012b: A unification of ensemble square root Kalman filters. *Mon. Wea. Rev.*, **140**, 2335–2345, doi:10.1175/MWR-D-11-00102.1.

Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. *Mon. Wea. Rev.*, **129**, 1194–1207, doi:10.1175/1520-0493(2001)129<1194:SMFSDA>2.0.CO;2.

Pham, D. T., J. Verron, and M. C. Roubauda, 1998: Singular evolutive Kalman filters for data assimilation in oceanography. *J. Mar. Syst.*, **16**, 323–340, doi:10.1016/S0924-7963(97)00109-7.

Sakov, P., and L. Bertino, 2011: Relation between two common localisation methods for the EnKF. *Comput. Geosci.*, **15**, 225–237, doi:10.1007/s10596-010-9202-6.

Simon, D., 2006: *Optimal State Estimation: Kalman, H-Infinity, and Nonlinear Approaches.* Wiley-Interscience, 552 pp.

Tanaka, S., S. Bunya, J. J. Westerink, C. Dawson, and R. A. Luettich Jr., 2011: Scalability of an unstructured grid continuous Galerkin based hurricane storm surge model. *J. Sci. Comput.*, **46**, 329–358, doi:10.1007/s10915-010-9402-1.

Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. *Mon. Wea. Rev.*, **131**, 1485–1490, doi:10.1175/1520-0493(2003)131<1485:ESRF>2.0.CO;2.

Verlaan, M., and A. W. Heemink, 1997: Tidal flow forecasting using reduced rank square root filters. *Stochastic Hydrol. Hydraul.*, **11**, 349–368, doi:10.1007/BF02427924.

Wang, X., C. H. Bishop, and S. J. Julier, 2004: Which is better, an ensemble of positive–negative pairs or a centered simplex ensemble. *Mon. Wea. Rev.*, **132**, 1590–1605, doi:10.1175/1520-0493(2004)132<1590:WIBAEO>2.0.CO;2.

Westerink, J. J., and Coauthors, 2008: A basin- to channel-scale unstructured grid hurricane storm surge model applied to southern Louisiana. *Mon. Wea. Rev.*, **136**, 833–864, doi:10.1175/2007MWR1946.1.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.

Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. *Mon. Wea. Rev.*, **140**, 3078–3089, doi:10.1175/MWR-D-11-00276.1.

Wolf, P., 2003: 1953 U.K. floods: 50-year retrospective. Risk Management Solutions Rep., 11 pp. [Available online at http://storage.pardot.com/15772/68008/fl_1953_uk_floods_50_retrospective.pdf.]

Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. *Mon. Wea. Rev.*, **132**, 1238–1253, doi:10.1175/1520-0493(2004)132<1238:IOIEAO>2.0.CO;2.

Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. *Mon. Wea. Rev.*, **133**, 1710–1726, doi:10.1175/MWR2946.1.

^{1} Covariance inflation is also done through the forgetting factor in Pham et al. (1998).