## 1. Data assimilation with residual nudging

A finite, often small, ensemble size has some well-known effects that may substantially influence the behavior of an ensemble Kalman filter (EnKF). These effects include, for instance, rank-deficient sample error covariance matrices, systematically underestimated error variances, and, in contrast, exceedingly large error cross covariances of the model state variables (Whitaker and Hamill 2002). In the literature, the latter two issues are often tackled through covariance localization (Hamill et al. 2001), while the first issue, underestimation of sample variances, is often handled by covariance inflation (Anderson and Anderson 1999), in which one artificially increases the sample variances, either multiplicatively (see, e.g., Anderson and Anderson 1999; Anderson 2007, 2009; Bocquet and Sakov 2012; Miyoshi 2011) or additively (see, e.g., Hamill and Whitaker 2011), or in a hybrid way by combining both multiplicative and additive inflation methods (see, e.g., Whitaker and Hamill 2012), or through other ways such as relaxation to the prior (Zhang et al. 2004), multischeme ensembles (Meng and Zhang 2007), modification of the eigenvalues of sample error covariance matrices (Altaf et al. 2013; Luo and Hoteit 2011; Ott et al. 2004; Triantafyllou et al. 2013), or back projection of the residuals to construct new ensemble members (Song et al. 2010), to name but a few. In general, covariance inflation tends to increase the robustness of the EnKF against uncertainties in data assimilation (Luo and Hoteit 2011) and, often, also improves the filter performance in terms of estimation accuracy.

The focus of this article is on the study of the effect of covariance inflation from the point of view of residual nudging (Luo and Hoteit 2012). Here, the “residual” with respect to an *m*-dimensional system state **x** is a vector in the observation space, defined as **Hx** − **y**,^{1} where **y** is the corresponding *p*-dimensional observation vector and **H** is the observation operator. Throughout this paper, our discussion is confined to the filtering (or analysis) step of the EnKF, so that the time index in the EnKF is dropped, and the observation operator **H** is assumed linear.

Under this linearity assumption, the observation is modeled as **y** = **Hx** + **v**, where **v** is the vector of observation error, with zero mean and a nonsingular covariance matrix **R** = **R**^{1/2}**R**^{T/2}, where **R**^{1/2} is a nonsingular square root of **R** and **R**^{T/2} denotes the transpose of **R**^{1/2}. For a vector **z** in the observation space, we adopt the following weighted Euclidean norm: ‖**z**‖_{**R**} = ‖**R**^{−1/2}**z**‖_{2}, where ‖•‖_{2} denotes the standard Euclidean norm. As a result, many topological properties with respect to the standard Euclidean norm, for example, the triangle inequality [see (3) below], still hold with respect to the weighted Euclidean norm.
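As a concrete illustration of the weighted norm just defined, the following sketch (variable names are ours) computes ‖**Hx** − **y**‖_{**R**} via a Cholesky solve, so that no explicit square root of **R** needs to be formed:

```python
import numpy as np

def weighted_residual_norm(H, x, y, R):
    """Weighted norm ||Hx - y||_R = ||R^{-1/2}(Hx - y)||_2, computed via a
    Cholesky solve so that no explicit square root of R is formed."""
    r = H @ x - y                     # residual in the observation space
    L = np.linalg.cholesky(R)         # R = L L^T
    return np.linalg.norm(np.linalg.solve(L, r))

# With R = I the weighted norm reduces to the standard Euclidean norm.
H = np.eye(3)
x = np.array([1.0, 2.0, 3.0])
y = np.zeros(3)
print(weighted_residual_norm(H, x, y, np.eye(3)))  # sqrt(14) ~ 3.742
```

Because ‖**R**^{−1/2}**z**‖²_{2} = **z**^{T}**R**^{−1}**z** for any nonsingular square root, the particular (here lower-triangular) choice of **R**^{1/2} does not affect the value of the norm.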

Let **x**^{tr} be the true system state (truth), and **y**^{o} = **Hx**^{tr} + **v**^{o} the recorded observation for a specific realization **v**^{o} of the observation error, so that the residual norm of the truth is ‖**Hx**^{tr} − **y**^{o}‖_{**R**} = ‖**v**^{o}‖_{**R**}. As a result, one may obtain an upper bound of the analysis residual norm in terms of ‖**v**^{o}‖_{**R**} (e.g., in the form of *β*‖**v**^{o}‖_{**R**}, where *β* is a nonnegative scalar coefficient). In practice, though, ‖**v**^{o}‖_{**R**} is often unknown. As a remedy, we replace ‖**v**^{o}‖_{**R**} by an upper bound of the expectation of ‖**v**‖_{**R**}. Since **R**^{−1/2}**v** has zero mean and covariance **I**_{p}, where **I**_{p} is the *p*-dimensional identity matrix, the expectation of ‖**v**‖²_{**R**} equals *p*. From (4), we thus have the upper bound √*p*, and the target residual norm becomes *β*√*p*. It is worth mentioning that in general it may be difficult to identify which *β* gives the best state estimation accuracy with respect to the truth **x**^{tr}. Therefore, in Luo and Hoteit (2012), we mainly used data assimilation with residual nudging (DARN) as a safeguard strategy; that is, a state estimate is accepted if its residual norm is no larger than *β*√*p*, and is otherwise nudged back toward this target.

In Luo and Hoteit (2012), we introduced DARN to the analysis step of the EnKF.
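The safeguard logic can be sketched as follows. This is our illustrative variant, not the exact nudging scheme of Luo and Hoteit (2012): when the residual norm exceeds *β*√*p*, the sketch removes just enough of the residual (via the pseudo-inverse of **H**, assumed to have full row rank) to bring the norm onto the target:

```python
import numpy as np

def nudge(x, H, y, R, beta):
    """Safeguard sketch of residual nudging (illustrative, not the exact
    scheme of Luo and Hoteit 2012): accept x if ||Hx - y||_R <= beta*sqrt(p);
    otherwise remove just enough of the residual to reach the target norm.
    Assumes H has full row rank, so that H @ pinv(H) = I."""
    p = y.size
    r = H @ x - y
    L = np.linalg.cholesky(R)
    rnorm = np.linalg.norm(np.linalg.solve(L, r))
    target = beta * np.sqrt(p)
    if rnorm <= target:
        return x                            # residual already acceptable
    shrink = 1.0 - target / rnorm           # fraction of the residual to remove
    return x - np.linalg.pinv(H) @ (shrink * r)

rng = np.random.default_rng(0)
H = rng.standard_normal((3, 6))             # p = 3 observations of an m = 6 state
x = 10.0 * rng.standard_normal(6)           # deliberately poor estimate
y, R, beta = np.zeros(3), np.eye(3), 1.0
x_new = nudge(x, H, y, R, beta)
```

With **HH**^{+} = **I**, the new residual is (1 − shrink) times the old one, so its weighted norm equals the target exactly whenever nudging is triggered.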

## 2. Covariance inflation from the point of view of residual nudging

Our objective here is to examine under which conditions the analysis residual norm is bounded between *β*_{l}√*p* and *β*_{u}√*p*, where *β*_{l} and *β*_{u} (0 ≤ *β*_{l} ≤ *β*_{u}) represent, respectively, the lower and upper values of *β* that one wants to set for the analysis residual norm in DARN. Different from the previous works (Luo and Hoteit 2012, 2013), the lower bound is also taken into account here.

Normalizing the update formula by **R**^{−1/2}, one obtains an expression for the analysis residual in terms of the normalized matrix **R**^{−1/2}**HP**^{b}**H**^{T}**R**^{−T/2}, where **P**^{b} denotes the background covariance in the update formula. Let *λ*_{max} and *λ*_{min} be the maximum and minimum eigenvalues of this matrix. If *λ*_{min} is very small, then there is no guarantee that (13) will hold. A small *λ*_{min} may appear, for instance, when the ensemble size *n* is smaller than the dimension *p* of the observation space. In such circumstances, the normalized matrix becomes singular with *λ*_{min} = 0, and the singularity may not be avoided only through multiplicative covariance inflation. If one cannot afford to increase the ensemble size *n*, then a few alternative strategies may be adopted to address (or at least mitigate) the problem of singularity. These include, for instance, (a) introducing covariance localization (Hamill et al. 2001) to the sample covariance, or (b) reducing the dimension *p* of the observation in the update formula, for instance, by assimilating the observation in a serial way (see, e.g., Whitaker and Hamill 2002) or by assimilating the observation within the framework of a local EnKF (see, e.g., Bocquet 2011; Ott et al. 2004). In what follows we assume that the problem of singularity is solved, so that the smallest eigenvalue of the normalized matrix is positive.
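The rank-deficiency mechanism behind a vanishing *λ*_{min} is easy to demonstrate numerically. In this sketch (all names are ours; the *p* × *n* matrix stands in for the ensemble projected into the observation space), the sample covariance built from *n* < *p* members has rank at most *n* − 1, and multiplicative inflation merely rescales the spectrum without lifting the null space:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 20                                  # ensemble size n smaller than p
Y = rng.standard_normal((p, n))               # ensemble projected into obs space
Yp = Y - Y.mean(axis=1, keepdims=True)        # perturbations: rank <= n - 1
S = Yp @ Yp.T / (n - 1)                       # p x p sample covariance

eig = np.linalg.eigvalsh(S)
print(np.linalg.matrix_rank(S))               # at most n - 1 = 4
print(eig.min())                              # ~0: lambda_min vanishes

# Multiplicative inflation only rescales the spectrum; the null space stays:
print(np.linalg.eigvalsh(1.5 * S).min())      # still ~0
```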

Consider a gain matrix of the form **K** = *α***P**^{b}**H**^{T}(*δ***HP**^{b}**H**^{T} + *γ***R**)^{−1}, where *α*, *δ*, and *γ* are some positive coefficients and **P**^{b} is the background covariance. Without loss of generality, one may take *α* = 1 (e.g., by moving *α* inside the parentheses) so that the gain matrix is simplified to **K** = **P**^{b}**H**^{T}(*δ***HP**^{b}**H**^{T} + *γ***R**)^{−1}. If *δ* = 1, then dividing the terms inside the inverse by *γ* shows that the gain coincides with the standard Kalman gain computed from the background covariance scaled by 1/*γ*, with 1/*γ* being analogous to the multiplicative covariance inflation factor, as used in Anderson and Anderson (1999). In our discussion below, we first derive some inflation constraints in the general case with *δ* > 0, and then examine the more specific situation with *δ* = 1. It is expected that one can also obtain constraints for other types of inflation in a similar way, but the results themselves may be case dependent.
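The equivalence between the damping factor *γ* (with *α* = *δ* = 1) and multiplicative inflation by 1/*γ* can be checked numerically. The sketch below uses a randomly generated background covariance and our own variable names, under the gain form stated above:

```python
import numpy as np

def gain(P, H, R, alpha=1.0, delta=1.0, gamma=1.0):
    # K = alpha * P H^T (delta * H P H^T + gamma * R)^{-1}
    return alpha * P @ H.T @ np.linalg.inv(delta * H @ P @ H.T + gamma * R)

rng = np.random.default_rng(1)
m, p = 6, 4
A = rng.standard_normal((m, m))
P = A @ A.T                                   # background covariance
H = rng.standard_normal((p, m))
R = np.eye(p)
gamma = 0.25

K_damped = gain(P, H, R, gamma=gamma)         # alpha = delta = 1, gamma < 1
K_inflated = gain(P / gamma, H, R)            # covariance inflated by 1/gamma
print(np.allclose(K_damped, K_inflated))      # True
```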

For any vector **z** with suitable dimensions, one has the inequalities in (17), where ‖Φ‖_{2} denotes the induced 2-norm of the matrix Φ therein. First, ‖Φ‖_{2} is equal to the square root of the largest eigenvalue of ΦΦ^{T} (Horn and Johnson 1990, chapter 5). Second, if in addition Φ is symmetric, let its maximum and minimum eigenvalues be *μ*_{max} and *μ*_{min}, respectively. Then, by (17), the analysis residual norm can be bounded in terms of *μ*_{max} and *μ*_{min}. Note that both *μ*_{max} and *μ*_{min} can be negative (e.g., when *δ* < 1 and *γ* → 0); therefore, ‖Φ‖_{2} = max(|*μ*_{max}|, |*μ*_{min}|). By (18) and (19), a sufficient condition for the targeted upper bound to hold can then be derived.

Depending on the signs and magnitudes of *μ*_{max} and *μ*_{min}, there are in general four possible scenarios: (a) *μ*_{max} ≥ 0 and *μ*_{min} ≥ 0, so that ‖Φ‖_{2} = *μ*_{max}; (b) *μ*_{max} ≤ 0 and *μ*_{min} ≤ 0, so that ‖Φ‖_{2} = −*μ*_{min}; (c) *μ*_{max} ≥ 0, *μ*_{min} ≤ 0, and *μ*_{max} + *μ*_{min} ≥ 0, so that ‖Φ‖_{2} = *μ*_{max}; and (d) *μ*_{max} ≥ 0, *μ*_{min} ≤ 0, and *μ*_{max} + *μ*_{min} ≤ 0, so that ‖Φ‖_{2} = −*μ*_{min}. Inserting (21) into the above conditions, one obtains some inequalities with respect to the variables *δ* and *γ* (subject to *δ* > 0 and *γ* > 0), which are omitted in this paper for brevity.
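The identity underlying all four scenarios, that for a symmetric Φ the induced 2-norm equals max(|*μ*_{max}|, |*μ*_{min}|), can be verified directly on a random symmetric, possibly indefinite matrix (names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
Phi = 0.5 * (B + B.T) - 0.8 * np.eye(5)       # symmetric, possibly indefinite

mu = np.linalg.eigvalsh(Phi)                  # ascending eigenvalues
mu_min, mu_max = mu[0], mu[-1]

# For symmetric Phi the induced 2-norm is the largest absolute eigenvalue,
# i.e., max(|mu_max|, |mu_min|), which covers all four sign scenarios at once.
print(np.isclose(max(abs(mu_max), abs(mu_min)), np.linalg.norm(Phi, 2)))  # True
```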

Similarly, let the maximum and minimum eigenvalues of Φ^{−1} be *ν*_{max} and *ν*_{min}, respectively; then, there are again four possible scenarios: (a) *ν*_{max} ≥ 0 and *ν*_{min} ≥ 0, so that ‖Φ^{−1}‖_{2} = *ν*_{max}; (b) *ν*_{max} ≤ 0 and *ν*_{min} ≤ 0, so that ‖Φ^{−1}‖_{2} = −*ν*_{min}; (c) *ν*_{max} ≥ 0, *ν*_{min} ≤ 0, and *ν*_{max} + *ν*_{min} ≥ 0, so that ‖Φ^{−1}‖_{2} = *ν*_{max}; and (d) *ν*_{max} ≥ 0, *ν*_{min} ≤ 0, and *ν*_{max} + *ν*_{min} ≤ 0, so that ‖Φ^{−1}‖_{2} = −*ν*_{min}. Again, inserting (23) into the above conditions, one obtains some inequalities with respect to the variables *δ* and *γ*.

The analysis in the case of *δ* = 1 (corresponding to the update formula in the EnKF) is significantly simplified. Indeed, when *δ* = 1, the maximum and minimum eigenvalues in (21) and (23) are all positive. Therefore, the following conditions hold. If *ξ*_{u} ≥ 1, that is, if the background residual norm is already no larger than *β*_{u}√*p*, then any *γ* > 0 would guarantee that ‖Φ‖_{2} ≤ 1 [with *δ* = 1] and that inequality (24a) holds. On the other hand, if *ξ*_{l} ≥ 1, that is, if the background residual norm is already no larger than *β*_{l}√*p*, then it is impossible for the EnKF to have an analysis residual norm above *β*_{l}√*p*,^{2} since with *δ* = 1 the analysis residual norm does not exceed the background one. Hereafter, we assume *ξ*_{u}, *ξ*_{l} ∈ [0, 1). With some algebra, it can be shown that *γ* should be bounded as in (25). Let *κ* = *λ*_{max}/*λ*_{min} be the condition number of the (normalized) matrix **R**^{−1/2}**HP**^{b}**H**^{T}**R**^{−T/2}; then, (25) can also be expressed in terms of *β*_{l} and *β*_{u} and the condition number *κ*.

Inequality (25) suggests that the upper and lower bounds of *γ* are related to the minimum and maximum eigenvalues of the normalized matrix. In particular, *γ* should be lower bounded; hence, its inverse 1/*γ*, resembling the multiplicative inflation factor, should be upper bounded, as mentioned previously.

If the dimension *p* of the observation space is large, then it may be expensive to evaluate *λ*_{max} and *λ*_{min}. In certain circumstances, though, there may be cheaper ways to compute an interval for *γ*. For instance, if the background covariance takes the hybrid form *c*_{1}**P̂**^{b} + *c*_{2}**B**^{lt}, with *c*_{1} and *c*_{2} being some positive scalars, **P̂**^{b} the sample covariance, and **B**^{lt} a pre-chosen (e.g., long term) background covariance, then one obtains the following bounds of *λ*_{max} and *λ*_{min}: *λ*_{max} ≤ *c*_{1}*τ*_{max} + *c*_{2}*ρ*_{max} and *λ*_{min} ≥ *c*_{1}*τ*_{min} + *c*_{2}*ρ*_{min}, where *τ* and *ρ* are the eigenvalues of **R**^{−1/2}**HP̂**^{b}**H**^{T}**R**^{−T/2} and **R**^{−1/2}**HB**^{lt}**H**^{T}**R**^{−T/2}, respectively. Here, *τ*_{max} is equal to the largest eigenvalue of an *n* × *n* matrix built from the normalized ensemble perturbations, which has dimension *n* and is, in fact, the same as the one used in the ensemble transform Kalman filter (ETKF; Bishop et al. 2001; Wang et al. 2004) in order to obtain the transform matrix. Therefore, *τ*_{max} can be taken as a by-product within the framework of the ETKF. On the other hand, if both **H** and **R** are time invariant, the maximum and minimum eigenvalues *ρ*_{max} and *ρ*_{min} of **R**^{−1/2}**HB**^{lt}**H**^{T}**R**^{−T/2} can be calculated offline once and for all. Taking these considerations into account, (25) can be modified accordingly, leading to (28).
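The bracketing of the hybrid eigenvalues by those of the two component matrices is an instance of Weyl's inequalities for sums of symmetric matrices. The sketch below checks it on random positive definite stand-ins (names are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(4)
p = 10

def spd(rng, p):                              # random symmetric positive definite
    A = rng.standard_normal((p, p))
    return A @ A.T

M_hat = spd(rng, p)   # stand-in for R^{-1/2} H (sample covariance) H^T R^{-T/2}
M_lt = spd(rng, p)    # stand-in for R^{-1/2} H B^lt H^T R^{-T/2}
c1, c2 = 0.7, 0.3

tau = np.linalg.eigvalsh(M_hat)               # eigenvalues tau (ascending)
rho = np.linalg.eigvalsh(M_lt)                # eigenvalues rho (ascending)
lam = np.linalg.eigvalsh(c1 * M_hat + c2 * M_lt)

print(lam[-1] <= c1 * tau[-1] + c2 * rho[-1])  # True: lambda_max bounded above
print(lam[0] >= c1 * tau[0] + c2 * rho[0])     # True: lambda_min bounded below
```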

## 3. Numerical verification

Here, we focus on using the 40-dimensional Lorenz 96 (L96) model (Lorenz and Emanuel 1998) to verify the above analytic results, while more intensive filter (with residual nudging) performance investigations are reported in Luo and Hoteit (2012). The experiment settings are the following. A reference trajectory (truth) is generated by numerically integrating the L96 model (with the driving force term *F* = 8) forward through the fourth-order Runge–Kutta method, with the integration step being 0.05 and the total number of integration steps being 1500. The first 500 steps are discarded to avoid the transient effect, and the remaining 1000 steps are used for data assimilation. To obtain a long-term “background covariance” **B**^{lt} (“background mean” **x̄**^{lt}, respectively), we also conduct a separate long model run with 100 000 integration steps, and take **B**^{lt} (**x̄**^{lt}) as the temporal covariance (mean) of the generated model trajectory. The synthetic observations are generated by adding the Gaussian white noise *N*(0, 1) to each odd-numbered element (*x*_{1}, *x*_{3}, …, *x*_{39}) of the state vector **x** = (*x*_{1}, *x*_{2}, …, *x*_{40})^{T} every four integration steps. This corresponds to the ½ observation scenario used in Luo and Hoteit (2012). An initial ensemble with 20 ensemble members is generated by drawing samples from the Gaussian distribution *N*(**x̄**^{lt}, **B**^{lt}), and the ETKF is adopted for data assimilation.
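The model integration and observation setup described above can be sketched compactly. The L96 tendency and the RK4 stepping follow the standard formulation of Lorenz and Emanuel (1998); the observation indexing and noise follow the experiment settings (variable names are ours):

```python
import numpy as np

def l96(x, F=8.0):
    """L96 tendency: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F (cyclic)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step of length dt."""
    k1 = l96(x, F)
    k2 = l96(x + 0.5 * dt * k1, F)
    k3 = l96(x + 0.5 * dt * k2, F)
    k4 = l96(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

rng = np.random.default_rng(5)
x = 8.0 + 0.01 * rng.standard_normal(40)      # small perturbation around F
for _ in range(500):                          # discard the transient
    x = rk4_step(x)

obs_idx = np.arange(0, 40, 2)                 # x_1, x_3, ..., x_39 (1-based odd)
y = x[obs_idx] + rng.standard_normal(20)      # add N(0, 1) observation noise
print(x.shape, y.shape)                       # (40,) (20,)
```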

For distinction later, we call the ETKF without residual nudging the normal ETKF, and the ETKF with residual nudging the ETKF-RN. In the normal ETKF, (6) is used for the mean update, with the background covariance being the sample covariance of the ensemble.^{3} Neither covariance inflation nor covariance localization is introduced to the normal ETKF, since for our purposes we wish to use this plain filter setting as the baseline for comparison. One may adopt various inflation and localization techniques to enhance the filter performance, but such an investigation is beyond the scope of this paper.

In the ETKF-RN, we adopt the hybrid scheme with *α* = *δ* = 1 and *γ* constrained by (28) and (29). For convenience, we denote the lower and upper bounds of *γ* in (28) by *γ*_{min} and *γ*_{max}, respectively, and rewrite *γ* in terms of *γ* = *γ*_{min} + *c*(*γ*_{max} − *γ*_{min}), with *c* being a corresponding scalar coefficient that is involved in our discussion later. Note that in general the background residual norm changes with time, and so do *ξ*_{u} and *ξ*_{l} in (25). This implies that, in general, *γ*_{min} and *γ*_{max} (hence *γ*) also change with time; therefore, they need to be calculated at each data assimilation cycle.

An additional remark is that the normal ETKF and the ETKF-RN share the same square root update formula as in Wang et al. (2004), where it is the sample error covariance, rather than the hybrid covariance involving **B**^{lt}, that is used to generate the background square root. Such a choice is based on the following considerations. On the one hand, if one uses the hybrid covariance for the square root update, then it would require a matrix factorization (e.g., singular value decomposition) in order to compute a square root of the hybrid covariance at each data assimilation cycle, which can be very expensive in large-scale applications. On the other hand, for the L96 model used here, numerical investigations show that using the hybrid covariance for the square root update does not necessarily improve the filter performance (results not shown).

The procedures in the ETKF-RN are summarized as follows. Because the matrix **R**^{−1/2}**HB**^{lt}**H**^{T}**R**^{−T/2} is time invariant, its maximum and minimum eigenvalues, *ρ*_{max} and *ρ*_{min} [cf. (28)], respectively, are calculated and saved for later use. Then, with the background ensemble at each data assimilation cycle, we calculate the sample mean and the maximum eigenvalue *τ*_{max} of the associated *n* × *n* matrix, compute *γ*_{min} and *γ*_{max} in (28), and, hence, obtain *γ* = *γ*_{min} + *c*(*γ*_{max} − *γ*_{min}) for a given value of *c* (*c* can be constant or variable during the whole data assimilation time window). This *γ* value is then inserted into (14) (with *α* = *δ* = 1 there) to obtain the analysis mean.
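The claim that *τ*_{max} comes essentially for free rests on the fact that the *p* × *p* matrix **SS**^{T} and the *n* × *n* matrix **S**^{T}**S** share the same nonzero eigenvalues, where **S** stands for the normalized, observed ensemble perturbations. A quick numerical check (our own names and random data):

```python
import numpy as np

rng = np.random.default_rng(7)
p, n = 20, 10
# S stands in for the normalized, observed perturbations R^{-1/2} H X'
S = rng.standard_normal((p, n)) / np.sqrt(n - 1)

big = S @ S.T      # p x p matrix whose largest eigenvalue is tau_max
small = S.T @ S    # n x n matrix of the kind the ETKF already forms

tau_max_big = np.linalg.eigvalsh(big)[-1]
tau_max_small = np.linalg.eigvalsh(small)[-1]
print(np.isclose(tau_max_big, tau_max_small))  # True: same nonzero spectrum
```

When *n* ≪ *p*, working with the *n* × *n* matrix is the much cheaper route, which is what makes *τ*_{max} a by-product of the ETKF transform computation.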

The experiment below aims to show that, at each data assimilation cycle, if a *γ* value lies in the interval [*γ*_{min}, *γ*_{max}] given by (28), then the corresponding analysis residual norm is indeed bounded in the targeted interval determined by *β*_{l} and *β*_{u} satisfying the constraint (29). In the experiment we fix *β*_{u} = 2, and let *β*_{l} vary.^{4}

Figure 1 shows the time series of the background (dashed–dotted) and analysis (thick solid) residual norms in different filter settings (for convenience of visualization, the residual norm values are plotted in the logarithmic scale). For reference we also plot the targeted lower and upper bounds (dashed and thin solid lines, respectively), *β*_{l}√*p* and *β*_{u}√*p* (with *p* = 20), respectively. In the normal ETKF (Fig. 1a), most of the time the analysis residual norms are larger than the targeted upper bound (no targeted lower bound is calculated and plotted in this case). With residual nudging, the analysis residual norms of the ETKF-RN migrate into the targeted interval, as long as the coefficient *c* lies in [0, 1] (Figs. 1b–d). Also see the caption of Fig. 1 to find out how the corresponding *c* values are chosen. When *c* is outside the interval [0, 1], the corresponding *γ* is not bounded by [*γ*_{min}, *γ*_{max}]; hence, there is no guarantee that the corresponding analysis residual norms are bounded in the targeted interval, as illustrated in Figs. 1e and 1f with *c* being 2.5 and −0.005, respectively (e.g., for *c* = −0.005 in Fig. 1f, breakthroughs of the lower bound are found around time step 220 and at a few other places). As “side” results, we also report in Table 1 the time mean root-mean-square errors (RMSEs) [see Eq. (13) of Luo and Hoteit (2012)] that correspond to the different filter settings in Fig. 1. In these tested cases, the filter performance of the ETKF-RN appears improved, in terms of the time mean RMSE, when compared to that of the normal ETKF.

## 4. Discussion and conclusions

We derived some sufficient inflation constraints in order for the analysis residual norm to be bounded in a certain interval. The analytic results showed that these constraints are related to the maximum and minimum eigenvalues of certain matrices [cf. (11)]. In certain circumstances, the constraint with respect to the minimum eigenvalue [e.g., (13)] may impose a nonsingularity requirement on relevant matrices. A few strategies in the literature that can be adopted to address or mitigate this issue are highlighted.

Some remaining issues are manifest in our deduction. These include, for instance, the nonlinearity in the observation operator and the choice of *β*_{u} and *β*_{l}. For the former problem, under a suitable smoothness assumption on the observation operator, one may also obtain inflation constraints similar to those in section 2. On the other hand, though, more investigations may be needed to make the results more practical in terms of computational complexity. For the latter problem, numerical results in Luo and Hoteit (2012) show that the *β* values influence the overall performance of the EnKF in terms of filter stability and accuracy. Intuitively, smaller (larger) *β* values tend to make residual nudging happen more (less) often. Therefore, if the normal EnKF performs well (poorly), then a larger (smaller) *β* value may be suitable. In this aspect, it is expected that an objective criterion is needed. This will be investigated in the future.

## Acknowledgments

We thank two anonymous reviewers for their constructive comments and suggestions. The first author would also like to thank the IRIS/CIPR cooperative research project “Integrated Workflow and Realistic Geology,” which is funded by industry partners ConocoPhillips, Eni, Petrobras, Statoil, and Total, as well as the Research Council of Norway (PETROMAKS) for financial support.

## REFERENCES

Altaf, U. M., T. Butler, X. Luo, C. Dawson, T. Mayo, and I. Hoteit, 2013: Improving short-range ensemble Kalman storm surge forecasting using robust adaptive inflation. *Mon. Wea. Rev.*, **141**, 2705–2720.

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. *Tellus*, **59A**, 210–224.

Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83.

Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. *Mon. Wea. Rev.*, **127**, 2741–2758.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Bocquet, M., 2011: Ensemble Kalman filtering without the intrinsic need for inflation. *Nonlinear Processes Geophys.*, **18**, 735–750.

Bocquet, M., and P. Sakov, 2012: Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems. *Nonlinear Processes Geophys.*, **19**, 383–399.

Grcar, J. F., 2010: A matrix lower bound. *Linear Algebra Appl.*, **433**, 203–220.

Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. *Mon. Wea. Rev.*, **128**, 2905–2919.

Hamill, T. M., and J. S. Whitaker, 2011: What constrains spread growth in forecasts initialized from ensemble Kalman filters? *Mon. Wea. Rev.*, **139**, 117–131.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790.

Hamill, T. M., J. S. Whitaker, J. L. Anderson, and C. Snyder, 2009: Comments on “Sigma-point Kalman filter data assimilation methods for strongly nonlinear systems.” *J. Atmos. Sci.*, **66**, 3498–3500.

Horn, R., and C. Johnson, 1990: *Matrix Analysis*. Cambridge University Press, 575 pp.

Horn, R., and C. Johnson, 1991: *Topics in Matrix Analysis*. Cambridge University Press, 607 pp.

Hoteit, I., D. T. Pham, and J. Blum, 2002: A simplified reduced order Kalman filtering and application to altimetric data assimilation in tropical Pacific. *J. Mar. Syst.*, **36**, 101–127.

Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. *J. Atmos. Sci.*, **55**, 399–414.

Luo, X., and I. M. Moroz, 2009: Ensemble Kalman filter with the unscented transform. *Physica D*, **238**, 549–562.

Luo, X., and I. Hoteit, 2011: Robust ensemble filtering and its relation to covariance inflation in the ensemble Kalman filter. *Mon. Wea. Rev.*, **139**, 3938–3953.

Luo, X., and I. Hoteit, 2012: Ensemble Kalman filtering with residual nudging. *Tellus*, **64A**, 17130, doi:10.3402/tellusa.v64i0.17130.

Luo, X., and I. Hoteit, 2013: Efficient particle filtering through residual nudging. *Quart. J. Roy. Meteor. Soc.*, doi:10.1002/qj.2152, in press.

Meng, Z., and F. Zhang, 2007: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part II: Imperfect model experiments. *Mon. Wea. Rev.*, **135**, 1403–1423.

Miyoshi, T., 2011: The Gaussian approach to adaptive covariance inflation and its implementation with the local ensemble transform Kalman filter. *Mon. Wea. Rev.*, **139**, 1519–1535.

Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. *Tellus*, **56A**, 415–428.

Song, H., I. Hoteit, B. Cornuelle, and A. Subramanian, 2010: An adaptive approach to mitigate background covariance limitations in the ensemble Kalman filter. *Mon. Wea. Rev.*, **138**, 2825–2845.

Triantafyllou, G., I. Hoteit, X. Luo, K. Tsiaras, and G. Petihakis, 2013: Assessing a robust ensemble-based Kalman filter for efficient ecosystem data assimilation of the Cretan Sea. *J. Mar. Syst.*, **125**, 90–100, doi:10.1016/j.jmarsys.2012.12.006.

Wang, X., C. H. Bishop, and S. J. Julier, 2004: Which is better, an ensemble of positive–negative pairs or a centered simplex ensemble? *Mon. Wea. Rev.*, **132**, 1590–1605.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924.

Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. *Mon. Wea. Rev.*, **140**, 3078–3089.

Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. *Mon. Wea. Rev.*, **132**, 1238–1253.

^{1}

In the literature, the vector with the opposite sign, **y** − **Hx**, is often called the innovation.

^{2}

An exception is the case that *γ* = +∞ and *ξ*_{l} = 1, in which the analysis coincides with the background, so that the analysis residual norm equals *β*_{l}√*p* exactly.

^{3}

One may also let the background covariance in the normal ETKF be a hybrid of the sample covariance and **B**^{lt}. In this case, both residual norms and RMSEs of the normal ETKF may become smaller (results not shown), while the validity of the analytic results in the previous section is not affected.