1. Introduction
The ensemble Kalman filter (EnKF; see Burgers et al. 1998; Evensen 2006) and its variants (including, e.g., Anderson 2001; Bishop et al. 2001; Hoteit et al. 2002; Luo and Moroz 2009; Pham 2001; Tippett et al. 2003; Wang et al. 2004; Whitaker and Hamill 2002) can be considered as Monte Carlo implementations of the celebrated Kalman filter (Kalman 1960), in the sense that the mean and covariance of the Kalman filter are evaluated based on a finite (often small) number of samples of the underlying model states. Because of its ability to handle large-scale data assimilation problems, and its relative simplicity in implementation, the EnKF has received great attention from researchers in various fields.
In data assimilation, there are certain factors that may influence the performance of the EnKF. For instance, if the EnKF is implemented with a relatively small ensemble size, then the filter will often be subject to sampling errors. This may lead to some adverse effects (especially in high-dimensional models), including, for instance, underestimation of the variances of state variables, overestimation of the correlations between different state variables, and rank deficiency of the sample error covariance matrix (Whitaker and Hamill 2002; Hamill et al. 2009). In the literature, it is customary to adopt two auxiliary techniques, covariance inflation (Anderson and Anderson 1999) and covariance localization (Hamill et al. 2001), to improve the performance of the EnKF. Intuitively, covariance inflation compensates for the underestimated variances by artificially increasing them to some extent. It also increases the robustness of the EnKF from the point of view of H∞ filtering theory (Luo and Hoteit 2011). Various methods of covariance inflation have been proposed in the literature (e.g., see Altaf et al. 2013; Anderson and Anderson 1999; Anderson 2007, 2009; Bocquet 2011; Bocquet and Sakov 2012; Luo and Hoteit 2011, 2013; Miyoshi 2011; Meng and Zhang 2007; Ott et al. 2004; Song et al. 2013; Triantafyllou et al. 2013; Whitaker and Hamill 2012; Zhang et al. 2004). On the other hand, covariance localization aims to taper the overestimated correlations through, for instance, a Schur product between the sample error covariance matrix and a certain tapering matrix. In effect, this also increases the rank of the sample error covariance matrix (Hamill et al. 2009).
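To make these two techniques concrete, the following minimal sketch illustrates multiplicative inflation of the ensemble perturbations and Schur-product localization of a sample covariance matrix. It is an illustration only, not the implementation used in this work; in particular, the Gaussian-shaped taper and the periodic distance (chosen with a cyclic model such as L96 in mind) are our assumptions.

```python
import numpy as np

def inflate(ensemble, delta):
    """Multiplicative covariance inflation: rescale the ensemble
    perturbations about the mean by a factor delta (> 1)."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + delta * (ensemble - mean)

def localize(P, half_width):
    """Covariance localization: Schur (element-wise) product between the
    sample covariance P and a distance-dependent tapering matrix. The
    Gaussian-shaped taper below is an illustrative choice."""
    n = P.shape[0]
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    dist = np.minimum(dist, n - dist)   # distance on a periodic domain
    taper = np.exp(-(dist / half_width) ** 2)
    return P * taper                    # tapering also raises the rank of P
```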
Even equipped with both covariance inflation and localization, the EnKF may still suffer from filter divergence in certain circumstances, especially when there is substantial uncertainty, for example, in terms of model and/or observation errors, in data assimilation problems (see, e.g., the numerical results in Luo and Hoteit 2012). To mitigate filter divergence, in previous studies (Luo and Hoteit 2014, 2013, 2012) we considered a strategy, called data assimilation with residual nudging (DARN), which monitors and, if necessary, adjusts the distances (called residual norms) between the real observations and the simulated ones. Our numerical results showed that, under certain circumstances, a data assimilation algorithm equipped with residual nudging not only is more stable against filter divergence, but also performs better in terms of estimation accuracy.
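Although the relevant equations are not reproduced in this excerpt, the quantities monitored by DARN can be summarized as follows. This is a hedged reconstruction: the form of the upper bound is inferred from the value 8.94 quoted in section 4, which equals 2√20, i.e., β_u √(trace R) with β_u = 2 for p = 20 observations of unit error variance.

```latex
% Residual of a state estimate x with respect to the observation y^o,
% and the nudging requirement on its norm (reconstruction; beta_u is a
% prechosen coefficient, R the observation-error covariance):
\mathbf{r} = \mathbf{y}^{o} - \mathcal{H}(\mathbf{x}), \qquad
\|\mathbf{r}\|_{2} \le \beta_{u}\,\sqrt{\operatorname{trace}(\mathbf{R})}.
```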
The analytical and numerical results in Luo and Hoteit (2014, 2013, 2012) also show that, for linear observation operators, one is able to control the magnitudes of the residual norms under suitable conditions. An issue that we have not yet addressed is the nonlinearity in the observation operators. Our main motivation here is thus to fill this gap. To this end, we recast DARN as a least squares problem and adopt an iterative filtering framework¹ to tackle the nonlinearity in the observation operators. Using this iterative filtering framework, one can achieve the objective of residual nudging under suitable conditions. For convenience, we refer to the observations from a linear (or nonlinear) observation operator as “linear observations” (or “nonlinear observations”) when this causes no confusion.
This work is organized as follows. Section 2 introduces the idea of DARN and outlines the method used in Luo and Hoteit (2013) for residual nudging with linear observations. In section 3, the aforementioned method is extended and modified to tackle problems with nonlinear observations. In section 4, various experiments are conducted to compare the proposed method with some existing algorithms in the literature. In addition, the stability of the proposed method is also investigated under different experimental settings. Finally, section 5 details our conclusions.
2. Residual nudging with linear observations
In this section, we focus on the case with linear observations. To this end, we rewrite the observation operator
Two methods were proposed in Luo and Hoteit (2014, 2013, 2012) for the purpose of residual nudging. In Luo and Hoteit (2014, 2012) it was suggested to solve a linear equation first, and then combine the resulting solution (called the “observation inversion”) with the original state estimate. In a follow-up work (Luo and Hoteit 2013), residual nudging was recast as a problem of choosing a proper covariance inflation factor, and some sufficient conditions in this regard were explicitly derived for the analysis residual norm to be bounded in the interval
Please note that from the above deduction, one can relate residual nudging to certain forms of covariance inflation. As discussed in Luo and Hoteit (2011), a Kalman filter (or ensemble Kalman filter) with covariance inflation is essentially an H∞ filter (or its ensemble implementation). Compared with the Kalman filter (or its ensemble variants), the H∞ filter (or its ensemble variants) puts more emphasis on the robustness of the estimation (Simon 2006). For more details on the similarities and differences between the Kalman and H∞ filtering methods, readers are referred to Luo and Hoteit (2011) and the references therein.
3. Residual nudging with nonlinear observations
When the observation operator
In data assimilation practice, one often has an initial state estimate with a relatively large residual norm, whereas a state estimate with a sufficiently small residual norm may be more difficult to obtain. Therefore, in what follows, we present an iterative framework that aims to construct a sequence of model states with gradually decreasing residual norms as the iteration index increases. If the iteration process [see Eq. (11) later] is long enough, the residual norm may become sufficiently low such that Eq. (3) is satisfied.
a. Iteration process to reduce the residual norm
Some remarks regarding the cost function in Eq. (9) are in order. First, for the objective in Eq. (3) of residual nudging, it is intuitive to use only the first term (called the data mismatch term hereafter) in Eq. (9) as the cost function (see, e.g., Kalnay and Yang 2010), which corresponds to the choice of γ = 0 in Eq. (9). In many situations, minimizing the term
In the literature, certain iteration processes are derived based on the cost function in Eq. (9) with γ = 1. As in the maximum likelihood ensemble filter (MLEF; see Zupanski 2005) and other similar iterative ensemble filters (see, e.g., Lorentzen and Nævdal 2011; Sakov et al. 2012), the rationale behind the choice of γ = 1 may be largely explained from the point of view of Bayesian filtering, in the sense that the solution of Eq. (9) corresponds to the maximum a posteriori (MAP) estimate when both the model state and the observation follow certain Gaussian distributions. However, from a practical point of view, such an interpretation may be only approximately valid in many situations. This is not only because the Gaussianity assumption may be invalid in many nonlinear dynamical models, but also because in reality it is often very challenging to accurately evaluate certain statistics (e.g., the error covariance matrices) of both the model state and the observation in large-scale problems.
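For reference, a plausible form of the cost function in Eq. (9), consistent with the two terms discussed above, is sketched below; the notation (x^b for the background estimate, B and R for the background and observation error covariances) is our assumption rather than taken from this excerpt.

```latex
% Hedged reconstruction of Eq. (9): data mismatch term plus a
% regularization term weighted by gamma.
J(\mathbf{x}) =
  \bigl(\mathbf{y}^{o} - \mathcal{H}(\mathbf{x})\bigr)^{\mathrm{T}}
  \mathbf{R}^{-1}
  \bigl(\mathbf{y}^{o} - \mathcal{H}(\mathbf{x})\bigr)
  + \gamma\,
  \bigl(\mathbf{x} - \mathbf{x}^{b}\bigr)^{\mathrm{T}}
  \mathbf{B}^{-1}
  \bigl(\mathbf{x} - \mathbf{x}^{b}\bigr).
```

Under this form, γ = 0 retains only the data mismatch term, while γ = 1 yields the MAP estimate under the Gaussian assumptions mentioned above.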
With that said, in the iteration process below, we do not confine ourselves to a fixed cost function with either γ = 0 or γ = 1. Instead, we let γ adapt over the iteration steps, which facilitates the gradual reduction of the residual norm of the state estimate and is thus useful for the purpose of residual nudging. In Bocquet and Sakov (2012), an iteration process with essentially adaptive γ values was also introduced by combining the original inflation method in Bocquet (2011) and the iterative EnKF in Sakov et al. (2012). Note that in Bocquet and Sakov (2012) and Sakov et al. (2012) the cost functions involve observations either at the same time as the model states to be optimized (the so-called EnKF-N) or ahead of them in time (the so-called IEnKF-N), while in the current work the observations and the model states to be estimated are in the same assimilation cycles.
For convenience of discussion, let
In this work, task 1 is undertaken by introducing a local linearization to the cost function in Eq. (9) at each iteration step, following Engl et al. (2000, chapter 11). More precisely, this involves linearizing the nonlinear operator
From the point of view of the deterministic inverse problem theory, Eq. (11) can also be considered as an implementation of the regularized Levenberg–Marquardt method (see, e.g., Engl et al. 2000, chapter 11), with the weight matrices for the data mismatch and regularization terms being
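A single step of a regularized Levenberg–Marquardt iteration of this kind can be sketched as follows. Since Eq. (11) is not reproduced in this excerpt, the exact weighting may differ; the function names and the gain-form update below are our assumptions.

```python
import numpy as np

def lm_step(x, y_obs, h, jacobian, P, R, gamma):
    """One regularized Levenberg-Marquardt step (sketch): linearize the
    observation operator h at the current iterate x and move along a
    residual-norm descent direction. Larger gamma => smaller step."""
    Hx = jacobian(x)            # local linearization of h at x
    r = y_obs - h(x)            # current residual
    # Gain form of the regularized update; equivalent to minimizing the
    # linearized data mismatch plus gamma times a quadratic penalty.
    increment = P @ Hx.T @ np.linalg.solve(Hx @ P @ Hx.T + gamma * R, r)
    return x + increment
```

Consistent with footnote 3, a larger γ yields a smaller, more conservative increment, which is the handle exploited by the adaptive parameter rule in section 3b.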
Figure 1 provides a schematic outline of the iteration process. Given a pair of quantities
It is worth noting that Eq. (11) is similar to the iteration formulas used in Chen and Oliver (2013) and Emerick and Reynolds (2013) in the context of the ensemble smoother (ES; see Evensen and van Leeuwen 2000), and in Stordal and Lorentzen (2014) in the context of the iterative adaptive Gaussian mixture (AGM) filter (Stordal et al. 2011). In Emerick and Reynolds (2013), a constraint,
b. Implementation in the framework of the ETKF
In this section, we consider incorporating the proposed iteration process [Eq. (11)] into the ETKF. The resulting filter is thus referred to as the iterative ETKF with residual nudging (IETKF-RN) hereafter. The idea here is to use the final model state
The remaining issues then involve specifying the following quantities in the iteration process: the covariance
1) Specifying the covariance
To evaluate the gain matrix in Eq. (4b), one needs to compute the matrix product,
Please note that if
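Since Eq. (4b) is not reproduced here, the sketch below shows only the standard ensemble device for such matrix products: the sample covariance never needs to be formed explicitly, because products of the form P Ĥ^T and Ĥ P Ĥ^T can be assembled from state and observation-space perturbation matrices. The function and variable names are ours.

```python
import numpy as np

def ensemble_cov_products(X, h):
    """Assemble covariance-observation products from an n x N ensemble X
    and an observation operator h, without forming the full covariance.
    Returns sample approximations of P H^T and H P H^T."""
    N = X.shape[1]
    A = (X - X.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)
    Y = np.column_stack([h(X[:, j]) for j in range(N)])
    B = (Y - Y.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)
    return A @ B.T, B @ B.T     # ~ P H^T and ~ H P H^T
```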
2) Evaluating the Jacobian matrix
If the derivative of the observation operator
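When an analytic derivative is unavailable (footnote 2 mentions, e.g., neural networks or certain commercial software as observation operators), one generic fallback is a finite-difference approximation of the Jacobian, sketched below. Whether this particular scheme is the one used here cannot be confirmed from the excerpt; the step size eps is an illustrative choice.

```python
import numpy as np

def fd_jacobian(h, x, eps=1e-6):
    """Forward-difference approximation of the Jacobian of the observation
    operator h at the state x (p x n matrix for p observations, n states)."""
    hx = np.asarray(h(x), dtype=float)
    J = np.empty((hx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps                       # perturb the j-th state variable
        J[:, j] = (np.asarray(h(xp), dtype=float) - hx) / eps
    return J
```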
3) Updating the parameter
The initial value γ0 is chosen in a way such that relatively small changes are introduced to
The deterministic inverse problem theory (see, e.g., Engl et al. 2000, chapter 11) suggests that any parameter rule satisfying Eq. (13) is sufficient for the purpose of residual nudging. In our implementation, however,
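Since Eq. (13) is not reproduced here, the following is only a generic Levenberg–Marquardt-style parameter rule of the kind alluded to above: loosen the regularization when a trial step reduces the residual norm, and tighten it otherwise. The function and the factors 2.0 and 0.5 are illustrative assumptions, not the paper's rule.

```python
def update_gamma(gamma, r_new, r_old, grow=2.0, shrink=0.5):
    """Adapt the regularization parameter from the residual norms of the
    trial iterate (r_new) and the current iterate (r_old). Returns the
    updated gamma and whether the trial step should be accepted."""
    if r_new < r_old:
        return shrink * gamma, True    # accept the step; allow larger steps
    return grow * gamma, False         # reject; retry with a smaller step
```

Combined with a step of the form in Eq. (11), one would accept or retry each trial iterate according to the returned flag.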
4. Experiments
a. Experimental settings
The L96 model is integrated by the fourth-order Runge–Kutta method with a constant integration step of 0.05. In many of the experiments below, the following default settings are adopted unless otherwise stated: the L96 model is integrated from time 0 to 75 [section 4b(1)] or 525 [section 4b(2)] with the forcing term F = 8. To avoid transient effects, the trajectory between 0 and 25 is discarded, and the rest [1000 and 10 000 integration steps in sections 4b(1) and 4b(2), respectively] is used as the truth in data assimilation. For convenience, we relabel the time step at 25.05 as step 1. The synthetic observation yk is obtained by measuring the odd-numbered elements (xk,1, xk,3, …) of the state vector xk = (xk,1, xk,2, …, xk,40)T every four time steps (k = 4, 8, 12, …), in which the observation operator is given by
To generate the initial background ensemble, we run the L96 model from 0 to 5000 (100 000 integration steps in total), and compute the temporal mean x_lt and covariance
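The setup described above is straightforward to reproduce; the sketch below follows the stated settings (40 variables, F = 8, RK4 with step 0.05, observations of the odd-numbered elements with unit-variance noise, initial ensemble drawn from a long-run climatology). The specific cubic form f(x) = x**3 and the shortened spin-up are our assumptions.

```python
import numpy as np

def l96_rhs(x, F=8.0):
    """Lorenz-96 tendency: dx_j/dt = (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """Fourth-order Runge-Kutta step with the constant step size 0.05."""
    k1 = l96_rhs(x, F)
    k2 = l96_rhs(x + 0.5 * dt * k1, F)
    k3 = l96_rhs(x + 0.5 * dt * k2, F)
    k4 = l96_rhs(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def observe(x, rng):
    """Cubic observation of the odd-numbered elements x_{k,1}, x_{k,3}, ...
    (indices 0, 2, ... here); f(x) = x**3 is assumed, since the exact cubic
    form is not reproduced in this excerpt. Unit-variance noise is added."""
    return x[::2] ** 3 + rng.standard_normal(x[::2].size)

# Initial background ensemble from a long model run, following the text
# (the 100 000-step climatology is shortened here for illustration).
rng = np.random.default_rng(0)
x = rng.standard_normal(40)
traj = np.empty((5000, 40))
for i in range(5000):
    x = rk4_step(x)
    traj[i] = x
x_lt, B_lt = traj.mean(axis=0), np.cov(traj.T)
ensemble = rng.multivariate_normal(x_lt, B_lt, size=20).T   # 20 members
```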
In all the experiments below, neither covariance inflation nor covariance localization is applied to the IETKF-RN. The former choice is made because, in the presence of the parameter γi in Eq. (11), conducting extra covariance inflation is equivalent to changing the initial value γ0, which is investigated in an experiment below. With regard to localization, our experience suggests that, in some cases (e.g., that with the default experimental settings at the beginning of this section and 20 ensemble members), conducting covariance localization may be beneficial for the IETKF-RN in the L96 model. In general, however, the presence of covariance localization may alter the behavior of the IETKF-RN, in the sense that there is no longer any guarantee that the iteration process [Eq. (11)], when equipped with covariance localization, moves along a residual-norm descent direction. Therefore, for our purpose, it appears more illustrative and conclusive to demonstrate only the performance of the IETKF-RN without localization.
b. Experiment results
1) A comparison study among some algorithms
A comparison study is first conducted to investigate the performance of the IETKF-RN relative to the following algorithms: the normal ETKF (Bishop et al. 2001; Wang et al. 2004), the approximate Levenberg–Marquardt ensemble randomized maximum likelihood (LM-EnRML) method (Chen and Oliver 2013), and the iteration process of Eq. (11) with γi = 1 held fixed for all i during the iteration [for distinction, we call this algorithm “IETKF-RN (constant γ)”]. In the last algorithm, the iteration process aims to find a (local) minimum of the cost function in Eq. (9) with γ = 1, which is essentially the same cost function adopted in, for example, the MLEF (Zupanski 2005). In this sense, the IETKF-RN (constant γ) algorithm can be considered an alternative to the MLEF, with one of the differences being the chosen optimization algorithm: the MLEF adopts the conjugate gradient algorithm to minimize the cost function, while the IETKF-RN (constant γ) algorithm uses the Levenberg–Marquardt method instead. To show the necessity of using adaptive γ values in certain circumstances, it is desirable to conduct the comparison under conditions that are as similar as possible. Therefore, in what follows, we compare the IETKF-RN (with adaptive γ) with the IETKF-RN (constant γ), rather than directly with the MLEF.
It is also worth commenting on a difference between the iteration processes of the IETKF-RN and the LM-EnRML. In the LM-EnRML, the terms
The normal ETKF is tested with the cubic observation function defined in section 4a, with both covariance inflation and localization. In the experiments, we vary the inflation factor and the half-width of covariance localization within certain chosen ranges,⁴ and we observe that the normal ETKF ends up with large root-mean-squared errors (RMSEs) in all tested cases, suggesting that the normal ETKF has in fact diverged. Divergences of the EnKF have also been reported in other studies with nonlinear observations (see, e.g., Jardak et al. 2010).
Figure 2 reports the time series of residual norms (top panel) and the corresponding RMSEs (bottom panel) obtained by applying the approximate LM-EnRML method with the same cubic observation function. The top panel of Fig. 2 plots the background residual norm (dash–dotted line) and that of the final iterative estimate (also called the final analysis hereafter) of the iteration process (solid line), together with the targeted upper bound (dashed line), which is the threshold
Figure 3 plots the time series of the residual norms over the assimilation time window (top panel); the residual norm reduction of the iteration process at time step 500 (middle panel), an example that illustrates the gradual residual norm reduction during the iteration process; and the time series of the corresponding RMSEs of the final estimates (bottom panel), when the IETKF-RN (constant γ) algorithm is adopted to assimilate the cubic observations. Compared with Fig. 2, it is clear that in the top panel of Fig. 3 the residual norms of the final analysis estimates tend to be lower than the background ones in each assimilation cycle. In particular, in some cases the final analysis residual norms approach, or even fall slightly below, the prechosen upper bound of 8.94, while the corresponding initial background residual norms are often larger than 100. As a consequence of residual norm reduction, the corresponding time mean RMSE in the bottom panel is reduced to 3.38, lower than that in Fig. 2. Also note that the time series of the residual norms (top panel) appears spiky. This may be because the estimation errors at certain time instants are relatively large (although the corresponding final analysis residual norms may have reasonable magnitudes). Consequently, after model propagation, the resulting background ensembles may have relatively large residual norms. In addition, the iteration process at those particular time instants may converge slowly, or may be trapped around certain local optima, such that the final analysis residual norms are only slightly lower than the background ones (hence the spikes). This phenomenon is also found in other experiments later.
Similar results (see Fig. 4) are also observed when the iteration process in Eq. (11) is adopted, in conjunction with the adaptive parameter rule as described in section 3b, to assimilate the cubic observations. One may see that the time mean RMSEs in Figs. 3 and 4 are close to each other. In this sense, it appears acceptable in this case to simply take γi = 1 for all i, instead of adopting the more sophisticated parameter rule in section 3b.
In what follows, though, we show with an additional example that the iteration process, when equipped with the adaptive parameter rule in section 3b, tends to make the IETKF-RN more stable against filter divergence. To this end, we consider an exponential observation function
The better stability of the IETKF-RN with adaptive γ, in comparison to the IETKF-RN (constant γ) algorithm in the case of exponential observations, may be understood from the optimization-theoretic point of view, when the iteration process in Eq. (11) is interpreted as a gradient-based optimization algorithm. For this type of optimization algorithm, it is usually suggested to start with a relatively small step size, so that the linearization involved in the algorithms may remain roughly valid (Nocedal and Wright 2006). In this regard, the IETKF-RN with adaptive γ may appear to be more flexible (e.g., one may make the initial step size small enough by choosing a large enough value for γ0), while there is no guarantee that the IETKF-RN (constant γ) algorithm may produce a small enough step size in general situations.
2) Stability of the IETKF-RN under various experimental settings
Here we mainly focus on examining the stability of the IETKF-RN with adaptive γ under various experimental settings. To this end, in the experiments below, we adopt assimilation time windows that are longer than those in the previous section. We note that the stability of the algorithm demonstrated below should be interpreted within the relevant experimental settings, and should not be taken for granted under different conditions (e.g., with longer assimilation time windows).
Unless otherwise mentioned, in this section, the default experimental settings are as follows. The IETKF-RN is applied to assimilate cubic observations of the odd number state variables every 4 time steps, with the length of the assimilation time window being 10 000 time steps. The variances of observation errors are 1. The IETKF-RN runs with 20 ensemble members and a maximum of 15 000 iteration steps.
In the first experiment, we examine the performance of the IETKF-RN with both linear and nonlinear observations [linear observations are obtained by applying f(x) = x to specified state variables, plus certain observation errors]. For either linear or nonlinear observations, there are two observation scenarios: one with all 40 state variables being observed (the full observation scenario), and the other with only the odd-numbered state variables being observed (the half observation scenario). In each observation scenario, we consider the following four ensemble sizes: 5, 10, 15, and 20. For each ensemble size, we also vary the frequency, in terms of the number fa of time steps, with which the observations are assimilated (i.e., the observations are assimilated every fa time steps). In the experiment, the variances of observation errors are 1, and fa is taken from the set (1, 2, 4: 4: 60), where the notation υi: δυ: υf is used to denote an array of numbers that grows from the initial value υi to the final one υf, with an equal increment δυ each time.
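For clarity, the υi: δυ: υf notation expands as in the following snippet (variable names are ours):

```python
import numpy as np

# fa taken from (1, 2, 4:4:60): the values 1 and 2, followed by 4 to 60
# in equal increments of 4.
fa_values = np.concatenate(([1, 2], np.arange(4, 61, 4)))
# -> [ 1  2  4  8 12 16 20 24 28 32 36 40 44 48 52 56 60]
ensemble_sizes = [5, 10, 15, 20]   # tested ensemble sizes
```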
Figure 6 shows the time mean RMSEs (averaged over 10 000 time steps) as functions of the ensemble size and the observation frequency, in the full and half observation scenarios, respectively. In the full observation scenario (top panels) and the half observation scenario with linear observations (Fig. 6c), for each ensemble size the corresponding time mean RMSE appears to be a monotonically increasing function of the number fa of time steps. On the other hand, when fa is relatively small, the time mean RMSEs of all ensemble sizes appear close to each other. As fa increases, a larger ensemble size tends to yield a smaller time mean RMSE, although violations of this tendency may also be spotted in some cases of Fig. 6b, possibly owing to the sampling errors in the filter. In the half observation scenario with nonlinear observations (Fig. 6d), the behavior of the IETKF-RN is similar to that in the other cases. There is, however, also a clear difference: instead of being a monotonically increasing function of fa, the time mean RMSE in Fig. 6d exhibits V-shaped behavior when fa is relatively small, achieving its lowest value at fa = 2, rather than at fa = 1 (possibly because the observations are overfitted at fa = 1).
Overall, Fig. 6 indicates that the time mean RMSEs with linear observations (Figs. 6a,c) tend to be lower than those with nonlinear observations (Figs. 6b,d), suggesting that the nonlinearity in the observations may degrade the performance of the filter. On the other hand, the time mean RMSEs in the full observation scenarios (Figs. 6a,b) tend to be lower than those in the half observation scenarios (Figs. 6c,d). The latter may be explained from the point of view of solving a (linear or nonlinear) equation. In the full observation scenarios, at each time step that has an incoming observation yo, the number of state variables is equal to the observation size p. Therefore (provided that it is solvable), the equation
As a side remark, we note that one may further improve the performance of the IETKF-RN in Fig. 6 with other experimental settings. For instance, for the half observation scenario with linear observations (Fig. 6c), if one lets
We also follow Sakov et al. (2012) to test the IETKF-RN with a longer assimilation time window that consists of 100 000 time steps. Here the half (nonlinear) observation scenario is investigated, with the ensemble size being 20 and the observation frequency being every 4 time steps. Under these experimental settings, Fig. 7 shows that the IETKF-RN runs stably, and its time mean RMSE is around 3.30, close to the values in the corresponding panels of Figs. 4 and 6.
Next, we test the performance of the IETKF-RN with different variances of observation errors. The experimental settings here are similar to those in Fig. 6d, except that the variances of observation errors are 0.01 and 10, respectively. As can be seen in Fig. 8, for these two variances the IETKF-RN also runs stably for all tested ensemble sizes and observation frequencies. Comparing Fig. 6d with the panels of Fig. 8, the IETKF-RN exhibits similar behaviors in these cases. The comparison also indicates that when fa is relatively small (e.g., fa = 4), smaller variances tend to lead to lower time mean RMSEs, while when fa is relatively large (e.g., fa = 60) the situation appears to be the opposite. Since both fa and the variances of observation errors affect the quality of the subsequent background ensembles, we conjecture that the above phenomenon occurs because, for different combinations of fa and variances, different relative weights are assigned to the background ensembles and the observations at the analysis steps. As a result, when fa is relatively large, smaller variances do not necessarily lead to lower time mean RMSEs. Similar results are also found in, for example, Luo and Hoteit (2014, their Fig. 7).
We also investigate the effect of the maximum number of iterations on the performance of the IETKF-RN. The set of maximum numbers of iterations tested in the experiment is (1, 10, 100, 1000, 10 000, 100 000). Figure 9 shows the time mean RMSE of the IETKF-RN as a function of the maximum number of iterations. There, one can see that the time mean RMSE tends to decrease as the maximum number of iterations increases.
Finally, we examine the impacts of (potentially) mis-specifying the forcing term F in the L96 model and/or the variances of observation errors on the performance of the IETKF-RN. In the experiment, the true forcing term F is 8, and the true variances of observation errors are 1 for all elements of the observations. The tested F values are taken from the set (4: 2: 12), and the tested variances of observation errors are {0.25, 0.5, 1, 2, 5, 10}. Figure 10 reports the time mean RMSE as a function of the forcing term F and the variances of observation errors. One can see that the time mean RMSE seems not very sensitive to the (potential) mis-specification of the variances of observation errors, possibly because with cubic observations,
The performance of the IETKF-RN in Fig. 10, on the other hand, does appear to be sensitive to the potential mis-specification of F. Interestingly, for all tested variances of observation errors, the filter’s best performance is obtained at F = 6, rather than at F = 8.⁶ This suggests that, in certain situations, the filter might actually achieve better performance in the presence of certain suitable model errors, rather than with the perfect model. Similar observations have been reported in the literature (e.g., Gordon et al. 1993; Whitaker and Hamill 2012), in which it is found that introducing certain artificial model errors may in fact improve the performance of a data assimilation algorithm. Overall, Fig. 10 suggests that the IETKF-RN can run stably even with substantial uncertainty in the system.
5. Conclusions
In this work, we revisited the concept of data assimilation with residual nudging (DARN). Based on the method derived in a previous study, we proposed an iterative filtering framework to handle nonlinear observations in the context of residual nudging. The proposed iteration process is related to the regularized Levenberg–Marquardt algorithm in inverse problem theory. This interpretation motivated us to implement the proposed algorithm with an adaptive coefficient γ.
For demonstration, we implemented an iterative filter based on the ensemble transform Kalman filter (ETKF). Numerical results showed that the resulting iterative filter exhibited remarkable stability in handling nonlinear observations under various experimental settings, and that the filter achieved reasonable performance in terms of root-mean-squared errors.
For data assimilation in large-scale problems, it may not be realistic to conduct a large number of iterations because of the limitation in computational resources. In this regard, one topic in our future research is to explore the possibility of enhancing the convergence rate of the iterative filter.
Acknowledgments
We thank three anonymous reviewers for their constructive comments and suggestions. The first author would also like to thank the IRIS–CIPR cooperative research project “Integrated Workflow and Realistic Geology,” which is funded by industry partners ConocoPhillips, Eni, Petrobras, Statoil, and Total, as well as the Research Council of Norway (PETROMAKS) for financial support.
REFERENCES
Altaf, U. M., T. Butler, X. Luo, C. Dawson, T. Mayo, and H. Hoteit, 2013: Improving short range ensemble Kalman storm surge forecasting using robust adaptive inflation. Mon. Wea. Rev., 141, 2705–2720, doi:10.1175/MWR-D-12-00310.1.
Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.
Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210–224, doi:10.1111/j.1600-0870.2006.00216.x.
Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83, doi:10.1111/j.1600-0870.2008.00361.x.
Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758, doi:10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.
Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436, doi:10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.
Bocquet, M., 2011: Ensemble Kalman filtering without the intrinsic need for inflation. Nonlinear Processes Geophys., 18, 735–750, doi:10.5194/npg-18-735-2011.
Bocquet, M., and P. Sakov, 2012: Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems. Nonlinear Processes Geophys., 19, 383–399, doi:10.5194/npg-19-383-2012.
Bocquet, M., and P. Sakov, 2013: Joint state and parameter estimation with an iterative ensemble Kalman smoother. Nonlinear Processes Geophys., 20, 803–818, doi:10.5194/npg-20-803-2013.
Bocquet, M., and P. Sakov, 2014: An iterative ensemble Kalman smoother. Quart. J. Roy. Meteor. Soc., 140, 1521–1535, doi:10.1002/qj.2236.
Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: On the analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.
Chen, Y., and D. Oliver, 2013: Levenberg–Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci., 17, 689–703, doi:10.1007/s10596-013-9351-5.
Emerick, A. A., and A. C. Reynolds, 2013: Ensemble smoother with multiple data assimilation. Comput. Geosci., 55, 3–15, doi:10.1016/j.cageo.2012.03.011.
Engl, H. W., M. Hanke, and A. Neubauer, 2000: Regularization of Inverse Problems. Springer, 322 pp.
Evensen, G., 2006: Data Assimilation: The Ensemble Kalman Filter. Springer, 279 pp.
Evensen, G., and P. J. van Leeuwen, 2000: An ensemble Kalman smoother for nonlinear dynamics. Mon. Wea. Rev., 128, 1852–1867, doi:10.1175/1520-0493(2000)128<1852:AEKSFN>2.0.CO;2.
Gordon, N. J., D. J. Salmond, and A. F. M. Smith, 1993: Novel approach to nonlinear and non-Gaussian Bayesian state estimation. IEE Proc., F, Radar Signal Process., 140, 107–113, doi:10.1049/ip-f-2.1993.0015.
Grcar, J. F., 2010: A matrix lower bound. Linear Algebra Appl., 433, 203–220, doi:10.1016/j.laa.2010.02.014.
Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919, doi:10.1175/1520-0493(2000)128<2905:AHEKFV>2.0.CO;2.
Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.
Hamill, T. M., J. S. Whitaker, J. L. Anderson, and C. Snyder, 2009: Comments on “Sigma-point Kalman filter data assimilation methods for strongly nonlinear systems.” J. Atmos. Sci., 66, 3498–3500, doi:10.1175/2009JAS3245.1.
Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. J. Mar. Syst., 36, 101–127, doi:10.1016/S0924-7963(02)00129-X.
Jardak, M., I. M. Navon, and M. Zupanski, 2010: Comparison of sequential data assimilation methods for the Kuramoto–Sivashinsky equation. Int. J. Numer. Methods Fluids, 62, 374–402, doi:10.1002/fld.2020.
Kalman, R., 1960: A new approach to linear filtering and prediction problems. Trans. ASME, Ser. D, J. Basic Eng., 82, 35–45, doi:10.1115/1.3662552.
Kalnay, E., and S.-C. Yang, 2010: Accelerating the spin-up of ensemble Kalman filtering. Quart. J. Roy. Meteor. Soc., 136, 1644–1651, doi:10.1002/qj.652.
Liu, C., Q. Xiao, and B. Wang, 2008: An ensemble-based four-dimensional variational data assimilation scheme. Part I: Technical formulation and preliminary test. Mon. Wea. Rev., 136, 3363–3373, doi:10.1175/2008MWR2312.1.
Lorentzen, R., and G. Nævdal, 2011: An iterative ensemble Kalman filter. IEEE Trans. Autom. Control, 56, 1990–1995, doi:10.1109/TAC.2011.2154430.
Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. J. Atmos. Sci., 55, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.
Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. Physica D, 238, 549–562, doi:10.1016/j.physd.2008.12.003.
Luo, X., and I. Hoteit, 2011: Robust ensemble filtering and its relation to covariance inflation in the ensemble Kalman filter. Mon. Wea. Rev., 139, 3938–3953, doi:10.1175/MWR-D-10-05068.1.
Luo, X., and I. Hoteit, 2012: Ensemble Kalman filtering with residual nudging. Tellus,64A, 17130, doi:10.3402/tellusa.v64i0.17130.
Luo, X., and I. Hoteit, 2013: Covariance inflation in the ensemble Kalman filter: A residual nudging perspective and some implications. Mon. Wea. Rev., 141, 3360–3368, doi:10.1175/MWR-D-13-00067.1.
Luo, X., and I. Hoteit, 2014: Efficient particle filtering through residual nudging. Quart. J. Roy. Meteor. Soc., 140, 557–572, doi:10.1002/qj.2152.
Meng, Z., and F. Zhang, 2007: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part II: Imperfect model experiments. Mon. Wea. Rev., 135, 1403–1423, doi:10.1175/MWR3352.1.
Miyoshi, T., 2011: The Gaussian approach to adaptive covariance inflation and its implementation with the local ensemble transform Kalman filter. Mon. Wea. Rev., 139, 1519–1535, doi:10.1175/2010MWR3570.1.
Nocedal, J., and S. J. Wright, 2006: Numerical Optimization. 2nd ed. Springer, 664 pp.
Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428, doi:10.1111/j.1600-0870.2004.00076.x.
Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194–1207, doi:10.1175/1520-0493(2001)129<1194:SMFSDA>2.0.CO;2.
Sakov, P., D. S. Oliver, and L. Bertino, 2012: An iterative EnKF for strongly nonlinear systems. Mon. Wea. Rev., 140, 1988–2004, doi:10.1175/MWR-D-11-00176.1.
Simon, D., 2006: Optimal State Estimation: Kalman, H-Infinity, and Nonlinear Approaches. Wiley-Interscience, 552 pp.
Song, H., I. Hoteit, B. D. Cornuelle, X. Luo, and A. C. Subramanian, 2013: An adjoint-based adaptive ensemble Kalman filter. Mon. Wea. Rev., 141, 3343–3359, doi:10.1175/MWR-D-12-00244.1.
Spall, J. C., 1992: Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control, 37, 332–341, doi:10.1109/9.119632.
Stordal, A. S., and R. J. Lorentzen, 2014: An iterative version of the adaptive Gaussian mixture filter. Comput. Geosci., doi:10.1007/s10596-014-9402-6, in press.
Stordal, A. S., H. A. Karlsen, G. Nævdal, H. J. Skaug, and B. Vallès, 2011: Bridging the ensemble Kalman filter and particle filters: The adaptive Gaussian mixture filter. Comput. Geosci., 15, 293–305, doi:10.1007/s10596-010-9207-1.
Tarantola, A., 2005: Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM, 352 pp.
Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490, doi:10.1175/1520-0493(2003)131<1485:ESRF>2.0.CO;2.
Triantafyllou, G., I. Hoteit, X. Luo, K. Tsiaras, and G. Petihakis, 2013: Assessing a robust ensemble-based Kalman filter for efficient ecosystem data assimilation of the Cretan Sea. J. Mar. Syst., 125, 90–100, doi:10.1016/j.jmarsys.2012.12.006.
Wang, X., C. H. Bishop, and S. J. Julier, 2004: Which is better, an ensemble of positive-negative pairs or a centered simplex ensemble? Mon. Wea. Rev., 132, 1590–1605, doi:10.1175/1520-0493(2004)132<1590:WIBAEO>2.0.CO;2.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.
Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089, doi:10.1175/MWR-D-11-00276.1.
Yang, S.-C., E. Kalnay, and B. Hunt, 2012: Handling nonlinearity in an ensemble Kalman filter: Experiments with the three-variable Lorenz model. Mon. Wea. Rev., 140, 2628–2646, doi:10.1175/MWR-D-11-00313.1.
Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 132, 1238–1253, doi:10.1175/1520-0493(2004)132<1238:IOIEAO>2.0.CO;2.
Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. Mon. Wea. Rev., 133, 1710–1726, doi:10.1175/MWR2946.1.
¹ Here, by “iterative” we mean the presence of an iteration process [Eq. (11)] in each data assimilation cycle.
² Examples may include, for instance, neural networks or certain commercial software.
³ If necessary, one may choose a larger value for γ0 [meaning a smaller step size in Eq. (11)] in order to obtain a more accurate first-order Taylor approximation in Eq. (10). A consequence of such a choice, however, is that more iteration steps may be needed to reduce the residual norm by the same amount.
⁴ Specifically, the inflation factor δ ∈ {1.05, 1.1, 1.15, …, 1.30}, and the half-width lc ∈ {0.1, 0.3, 0.5, 0.7, 0.9}.
⁵ For instance, when the inflation factor δ = 0.08, the half-width lc = 0.1, and βu = 1, it is found that the time mean RMSEs of the IETKF-RN are around 0.50, 0.68, and 1.15, respectively, given fa = 1, 2, and 4.
⁶ In the full observation scenario, however, the lowest time mean RMSE is indeed achieved at F = 8 (results not shown).