1. Introduction
This paper begins to address what is in our view the vital question facing those who employ the ensemble Kalman filter (EnKF; Evensen 1994): What are the systematic biases in the statistics on which it is based, resulting from a limited ensemble size? Ensemble forecasting is clearly an expensive proposition, and the forecaster faces many choices to reduce the computational burden of the task. For example, is it better to have a small ensemble of high-resolution models (thereby potentially reducing model error at the risk of sampling error) or a large ensemble of lower-resolution models (thereby reducing sampling error at the risk of model error)? The answer depends on many factors, and the focus of this study is much less ambitious. As a first step in the process, we attempt to quantify the sampling error resulting from the use of a relatively small (i.e., less than infinite) ensemble. Of course, we are not the first to be aware of these issues, and our statistical error-scaling arguments are employed to understand and improve previous attempts to deal with these biases, such as covariance inflation and the subdivision of the ensemble into smaller subensembles to remove “inbreeding” (see references below).
Blending of background and observation information is a necessary step in data assimilation (hereafter DA), because neither the model estimate nor the observation is ever perfect. In this study, we consider a sequential (as opposed to variational) DA approach, where the product of the analysis cycle is used as the initial state of a numerical prediction system. The basic tool in this approach is filtering; that is, all of the information up to and including the considered analysis time is used. We focus on one particular technique, the EnKF, and its numerical weather prediction (NWP) application. The EnKF has been the subject of studies in various meteorological and oceanographic contexts, many of them reviewed by Evensen (2007). However, to date, only the Canadian Meteorological Centre (CMC) has implemented it operationally (Houtekamer and Mitchell 2005). The Kalman filter (e.g., Gelb 1974) and its nonlinear extension, the extended Kalman filter (EKF; e.g., Evensen 1992), are common examples of filtering methods. Nevertheless, it has been recognized that the EKF is impossible to implement in an operational NWP context because of the computational burden (Evensen 1992; Keppenne 2000; Lawson and Hansen 2004). Moreover, unbounded error variance growth due to the linearization procedure has been observed (Evensen 1992). Alternatives to the EKF have been proposed, relying on the assumption that “the trajectory of the system state in a high-dimensional dynamical system typically lies on a smaller dimensional subspace of the entire phase space” (Farrell and Ioannou 2001); they can thus be called reduced-state filters (Lawson and Hansen 2004). One class of these employs Monte Carlo–based methods, in which one generates an ensemble of N forecasts that samples the state space. These provide an estimate of the state’s probability density function (pdf), whose quality depends on the size of the ensemble. The generated pdf can naturally be used to initiate an ensemble prediction system, in which one performs “stochastic dynamic prediction” (Epstein 1969). This latter approach is possible with the atmosphere’s deterministic laws because of the chaotic nature of the solutions. Monte Carlo filters can be divided into two subclasses, deterministic and stochastic, as noted by Tippett et al. (2003) and Lawson and Hansen (2004). In the stochastic filter (Houtekamer and Mitchell 1998; Burgers et al. 1998), observations are treated as random variables and thus perturbed to avoid an underestimation of the analysis variance (Burgers et al. 1998). This version has proven effective in various contexts (Houtekamer and Mitchell 1998; Hansen and Smith 2000; Reichle et al. 2002; Snyder and Zhang 2003). Deterministic formulations of Monte Carlo–based techniques, as reviewed by, e.g., Tippett et al. (2003), do not require perturbed observations.
2. Sampling issues
Previous experiments with realistic models have highlighted a few issues concerning the EnKF’s operational application: (a) the limited number of members [usually no more than O(100)] can cause spurious covariances between widely separated locations; (b) the unavoidable presence of model errors of different types (unresolved scales, imperfectly known forcing, etc.) can also cause spurious covariances; and (c) the observations suffer from errors of representation and from errors in their statistical description. In the current study, we address only issue (a). However, issue (a) [as well as issue (b)] can potentially lead to “underestimation” of the analysis variance. In some cases, this is known to be fatal to the EnKF: the analysis loses track of the truth, whereas its spread remains “too small.” As the spread represents the uncertainty, the filter “believes” it performs better than it does in reality. This behavior is called filter divergence (e.g., Maybeck 1979, p. 338) and was observed for the EnKF by, e.g., Houtekamer and Mitchell (1998) and Hamill et al. (2001). We consider it important to clarify the words “too small” and “underestimation.” For this purpose, we recall that “the goal of the assimilation is to produce a random sample of the conditional probability distribution that is consistent with the truth while minimizing the rms error of the ensemble mean from the truth” (Anderson and Anderson 1999). Hence, the aim of Monte Carlo filtering is to satisfy the following two criteria:
(i) to produce a sample with an ensemble mean, commonly used as the state’s best estimate, having the smallest expected error, and
(ii) to produce a reliable ensemble of analyses; that is, the variations of the ensemble mean error, though not directly accessible, should be well represented by the ensemble of analyses given by the filter itself (in other words, the true state should be statistically indistinguishable from a randomly selected member of the ensemble).
In fact, criterion ii is a necessary condition for a perfect assimilation (Anderson and Anderson 1999). A requirement for satisfying it is that the spread among the ensemble members remain representative of the difference between the ensemble mean and the true state as the assimilation cycle proceeds (Anderson and Anderson 1999; Houtekamer and Mitchell 1998). We define the optimal analysis in the EnKF framework as the one obtained with an infinite number of members, and the ideal analysis as the one satisfying criteria i and ii when using a finite ensemble. Then, as only N members are used, and not an infinite number, the ideal value of the spread is not the one obtained with an infinite ensemble. It is instead the mean squared error (MSE) of the analysis mean calculated over the N ensemble members, which is itself different from the MSE of an analysis using an ensemble of infinite size. Now, if the analysis spread calculated over the ensemble of size N under- or overestimates this ideal value, the error statistics are not well represented (henceforth “misrepresented”), and criterion ii is not achieved. In fact, the analysis spread of an N-member ensemble systematically underestimates the MSE of the analysis mean calculated over these members.
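Criterion ii can be stated compactly. In the notation used below, with ΔaN denoting the MSE of the ensemble mean (as in the appendixes) and angle brackets denoting an average over realizations, it may be summarized (up to the usual trace convention for the squared spread) as the requirement that

\[
\operatorname{tr}\,\big\langle \mathsf{P}^a_N \big\rangle \;=\; \big\langle \|\bar{\mathbf{x}}^a_N - \mathbf{x}^t\|^2 \big\rangle \;\equiv\; \big\langle \Delta^a_N \big\rangle ,
\]

where x̄aN is the analysis ensemble mean and xt the true state.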
To illustrate this, results of a twin-experiment simulation of one analysis cycle for the very simple case of a Gaussian process represented by a scalar are shown in Fig. 1. The curves show an average over 10⁵ realizations. Forecasts and observations are taken from a normal distribution of zero mean and unit variance.
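The following minimal Python sketch (our own illustrative reconstruction, not the code used for the figure) reproduces the flavor of this experiment: scalar forecasts and observations drawn from 𝒩(0, 1), a perturbed-observation analysis of an N-member ensemble, and averages over 10⁵ realizations.

```python
import numpy as np

rng = np.random.default_rng(0)
R = 1.0        # observational error variance
M = 100_000    # number of realizations (10**5, as in Fig. 1)

for N in (4, 8, 16, 32, 64):
    # The truth is zero; forecasts and observations are N(0, 1) draws.
    xf = rng.standard_normal((M, N))              # forecast ensembles
    y = rng.standard_normal((M, 1))               # observations
    yo = y + rng.standard_normal((M, N))          # perturbed observations
    Pf = xf.var(axis=1, ddof=1, keepdims=True)    # sample background variance
    K = Pf / (Pf + R)                             # ensemble-based scalar gain
    xa = xf + K * (yo - xf)                       # perturbed-observation update
    spread = xa.var(axis=1, ddof=1).mean()        # average analysis variance
    mse = (xa.mean(axis=1) ** 2).mean()           # MSE of the ensemble mean
    print(f"N={N:3d}  spread={spread:.3f}  MSE of mean={mse:.3f}")
```

The printed spread falls below the printed MSE for every N, and the gap closes only slowly as N grows, which is precisely the systematic underestimation discussed above.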
Note that in experimental applications of the EnKF, underestimation of the analysis variance with respect to the optimal value is more often observed than overestimation, whereas it is known that an overestimate is “safer” than an underestimate (Maybeck 1979, p. 339; Daley 1991, section 4.9). Houtekamer and Mitchell (1998) claimed that besides the sampling noise related to the limited size of the ensemble, “inbreeding” problems may come into play and also cause spread underestimation. This refers to the fact that the estimate of the analysis error variance is produced with the same ensemble used to compute the analysis. This is another way of saying that the gain and the analysis variance depend nonlinearly on the background error covariance matrix (Whitaker and Hamill 2002). Various techniques have been proposed to correct for inbreeding-sampling errors and stabilize the EnKF. They include covariance localization (Houtekamer and Mitchell 2001; Ott et al. 2004), a hybrid EnKF–three-dimensional variational data assimilation (3DVAR) formulation (Hamill and Snyder 2000), a double-ensemble Kalman filter (DEnKF; Houtekamer and Mitchell 1998), and covariance inflation (Anderson and Anderson 1999; Hamill et al. 2001; Whitaker and Hamill 2002; Ott et al. 2004; Gillijns et al. 2005). Finally, Pham et al. (1998) used a “forgetting factor” to reduce the observation error covariance matrix 𝗥. We do not consider covariance localization or hybrid formulations here, though conclusions may be drawn from our analytic results. In this study, we want to improve the theoretical understanding of the misrepresentation of the analysis variance in the EnKF, as well as investigate the potential of the DEnKF and covariance inflation techniques to produce stable analyses (i.e., not subject to filter divergence).
To address the question quantitatively, we need to compare representative measures of the spread and of the error of the ensemble mean. The Gaussian hypothesis for the statistics leads one naturally to calculate the second-order moments of the pdf of the error of the ensemble mean and of the analysis error to verify whether the criteria are satisfied. We consider here an idealized context that intentionally does not include any localization procedure, though Houtekamer and Mitchell (2001) show that localization can be effective in forcing the filter to satisfy criterion ii. Moreover, signal contamination by errors present in an operational context (e.g., model or representation errors) is not considered either. Using the established results, we then propose a theoretically based covariance inflation technique. The technique presented, which modifies the analysis update, turns out to be easier to implement and less costly than the DEnKF, whose performance is also considered. Section 3 describes the formulation of the perturbed-observation EnKF used here. In section 4, we establish the analytical expressions for the average analysis error covariance matrix and the MSE of the ensemble mean for the EnKF. In section 5, similar analytic results are demonstrated for the DEnKF, and a new covariance inflation technique is described. For each analytical result, we give a simple application using the analysis of a Gaussian scalar process. This application, though very simplistic, gives fairly good insight into the problem. Moreover, we will see in Part II of this paper that our analytic results compare well with analyses made with a barotropic model.
3. Formulation of the EnKF
The Kalman filter is a “recursive state estimation technique” (Lawson and Hansen 2004). It consists of two steps: (i) a propagation step, using the system dynamics to evolve the first two moments of the state’s pdf (which characterize a Gaussian distribution completely), and (ii) an analysis step, using a solution derived from Bayes’s rule that gives “either a maximum likelihood estimate (e.g., Lorenc 1986) or a minimum error variance estimate (e.g., within estimation theory; Cohn 1997).” It can also be identified “as a recursive least squares problem properly weighted by the inverses of the relevant error covariance matrices (e.g., Wunsch 1996),” as reviewed by Lawson and Hansen (2004).
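For reference, these two steps take the familiar textbook form below (reproduced here for completeness; the linear model operator 𝗠 and the model error covariance 𝗤 are standard notation not defined elsewhere in this paper):

\[
\mathbf{x}^f = \mathsf{M}\,\mathbf{x}^a, \qquad
\mathsf{P}^f = \mathsf{M}\,\mathsf{P}^a\,\mathsf{M}^{\mathrm{T}} + \mathsf{Q},
\]
\[
\mathbf{x}^a = \mathbf{x}^f + \mathsf{K}\big(\mathbf{y} - \mathsf{H}\,\mathbf{x}^f\big), \qquad
\mathsf{K} = \mathsf{P}^f\mathsf{H}^{\mathrm{T}}\big(\mathsf{H}\,\mathsf{P}^f\mathsf{H}^{\mathrm{T}} + \mathsf{R}\big)^{-1}, \qquad
\mathsf{P}^a = (\mathsf{I} - \mathsf{K}\mathsf{H})\,\mathsf{P}^f,
\]

where 𝗛 is the observation operator and 𝗥 the observation error covariance matrix.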
In the EnKF, these moments are estimated from an ensemble of N model states. During the propagation step, each member is advanced with the full (possibly nonlinear) model M,

\[
\mathbf{x}^f_i = M(\mathbf{x}^a_i), \qquad i = 1, \dots, N, \tag{1}
\]

and the background statistics are estimated from the sample:

\[
\bar{\mathbf{x}}^f = \frac{1}{N}\sum_{i=1}^{N}\mathbf{x}^f_i, \tag{2}
\]
\[
\mathsf{P}^f_N = \frac{1}{N-1}\sum_{i=1}^{N}\big(\mathbf{x}^f_i - \bar{\mathbf{x}}^f\big)\big(\mathbf{x}^f_i - \bar{\mathbf{x}}^f\big)^{\mathrm{T}}. \tag{3}
\]

At measurement times, each member i is updated with its own observation yᵢ (defined below):

\[
\mathbf{x}^a_i = \mathbf{x}^f_i + \mathsf{K}_N\big(\mathbf{y}_i - \mathsf{H}\,\mathbf{x}^f_i\big), \tag{4}
\]
\[
\mathsf{K}_N = \mathsf{P}^f_N\mathsf{H}^{\mathrm{T}}\big(\mathsf{H}\,\mathsf{P}^f_N\mathsf{H}^{\mathrm{T}} + \mathsf{R}\big)^{-1}. \tag{5}
\]
This definition leads to an interpretation of the EnKF as a Monte Carlo method in which the ensemble of model states evolves in state space, with the mean as the best estimate and the ensemble spread as an estimate of the error variance. Observations are similarly represented by another ensemble, generated by the addition of random noise sampled from the observational error distribution (Burgers et al. 1998). Hence, at measurement times, each ensemble member is updated using a different observation taken from this ensemble of perturbations. Here, 𝗞N is the Kalman gain evaluated over the ensemble. Note the nonlinear dependence in (5) of 𝗞N on 𝗣fN.
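As a concrete illustration of (2)–(5), the following Python sketch (our own; the function name and the tiny test case are illustrative assumptions) implements the perturbed-observation analysis step for a general linear observation operator:

```python
import numpy as np

def enkf_analysis(Xf, y, H, R, rng):
    """Perturbed-observation EnKF analysis step, Eqs. (2)-(5).

    Xf : (n, N) forecast ensemble; y : (p,) observation vector;
    H : (p, n) observation operator; R : (p, p) obs error covariance.
    """
    n, N = Xf.shape
    Xm = Xf.mean(axis=1, keepdims=True)               # ensemble mean, Eq. (2)
    A = Xf - Xm                                       # deviations from the mean
    Pf = A @ A.T / (N - 1)                            # sample covariance, Eq. (3)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)    # ensemble gain, Eq. (5)
    # One perturbed observation per member (Burgers et al. 1998).
    Yo = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, N).T
    return Xf + K @ (Yo - H @ Xf)                     # update, Eq. (4)

# Tiny usage example: n = 3 state variables, p = 2 observed, N = 20 members.
rng = np.random.default_rng(1)
Xf = rng.standard_normal((3, 20))
H = np.eye(2, 3)
R = 0.5 * np.eye(2)
Xa = enkf_analysis(Xf, rng.standard_normal(2), H, R, rng)
```

In practice the explicit inverse would be replaced by a linear solve, and 𝗣fN would never be formed for large n; the sketch favors a literal transcription of (2)–(5).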
4. On sampling errors in the EnKF
a. Averaged analysis error covariance matrix
Following van Leeuwen (1999), let us now assume that the ensemble estimates are not too far off; hence, we express the statistical error with respect to the unknown true background covariance matrix 𝗣f as ϵ: 𝗣fN = 𝗣f + ϵ, with ‖ϵ‖ ≪ ‖𝗣f‖. We want to express the analysis error covariance matrix 𝗣aN as a series expansion in ϵ. Similarly, we suppose that the ensemble-based gain 𝗞N can be expanded about its infinite-ensemble value (see appendix A).
Henceforth, in order to simplify the algebra, we define 𝗟N = 𝗜 − 𝗞N𝗛, as well as ΠaN = 𝗟N𝗣fN. Also, Θ = (𝗛𝗣f𝗛T + 𝗥) and Φ is the inverse of Θ. Then, the optimal Kalman gain, given by (5) when taking an infinite ensemble, can be written 𝗞 = 𝗣f𝗛TΦ, and its transpose 𝗞T = (𝗣f𝗛TΦ)T = Φ𝗛𝗣f. We finally define the gainlike matrix κ = ϵ𝗛TΦ.
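With these definitions, the leading term of the expansion of 𝗞N (carried out in appendix A) can be sketched in one line. Expanding the inverse of (Θ + 𝗛ϵ𝗛T) to first order in ϵ,

\[
\mathsf{K}_N = (\mathsf{P}^f + \epsilon)\,\mathsf{H}^{\mathrm{T}}\big(\Theta + \mathsf{H}\epsilon\mathsf{H}^{\mathrm{T}}\big)^{-1}
\approx (\mathsf{P}^f + \epsilon)\,\mathsf{H}^{\mathrm{T}}\big(\Phi - \Phi\,\mathsf{H}\epsilon\mathsf{H}^{\mathrm{T}}\Phi\big)
= \mathsf{K} + (\mathsf{I} - \mathsf{K}\mathsf{H})\,\kappa + O(\epsilon^2),
\]

so that 𝗞N = 𝗞 + 𝗟κ + O(ϵ²), where 𝗟 = 𝗜 − 𝗞𝗛 is the infinite-ensemble counterpart of 𝗟N.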

b. Mean squared error of the ensemble mean


The second-order truncation is a very good approximation for 〈𝗣aN〉, even for small values of N: in this case, the higher-order terms, though not individually negligible, almost cancel one another. This cancellation does not quite hold in the expression for 〈ΔaN〉, for which higher-order terms in ϵ may have to be retained, although the effect is barely perceptible in the scalar case.
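In the scalar case, the second-order result takes a particularly simple form. Writing k for the scalar Kalman gain and σ²a for the optimal analysis variance, the average analysis variance reads, consistent with the analytic curve quoted in the caption of Fig. 6,

\[
\big\langle \sigma^2_a(N) \big\rangle = \sigma^2_a\left[\,1 - \frac{2k(1-k)}{N-1}\,\right] + O\!\big((N-1)^{-2}\big),
\]

so the fractional underestimation decays only as (N − 1)⁻¹ and is largest for k = 1/2, i.e., when background and observation are equally accurate.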
5. Proposed solutions: DEnKF and covariance inflation
The results of the previous section show that it is impossible for the EnKF to satisfy criterion ii, and therefore that it is naturally subject to divergence for small ensembles. The double-ensemble Kalman filter (DEnKF; Houtekamer and Mitchell 1998), where the N-member ensemble is split into two N/2-member subsets, is an attempt to fix this problem. The covariance information from one of the subsets is used in the data assimilation of the other subset. By using one subensemble to calculate the background error covariance needed to update the forecast of the other subensemble, the DEnKF is expected to remove the negative bias in the analysis error variance described in the last section, because the dominant term in (10), −𝗟κΘκT, which is responsible for this systematic negative bias, will average to zero.
The authors further suggested that the ensemble of size N can be divided into l subensembles of size N/l, each N/l-member subset being updated by using the others [of combined size (l − 1)N/l] to calculate 𝗞N. As each update still involves two ensembles (the subset being updated and its complement), we will refer to this below as the lDEnKF (2 ≤ l ≤ N).
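A minimal scalar sketch of this cross update (our own illustration; the function name and the random repartitioning are assumptions for concreteness):

```python
import numpy as np

def l_denkf_analysis(xf, y, R, l, rng):
    """Scalar lDEnKF analysis: split the N members into l subensembles and
    update each subset with a gain computed from the complementary members."""
    N = xf.size
    groups = np.array_split(rng.permutation(N), l)     # random partition
    xa = np.empty_like(xf)
    for g in groups:
        others = np.setdiff1d(np.arange(N), g)         # complement of size (l-1)N/l
        Pf = xf[others].var(ddof=1)                    # cross background variance
        K = Pf / (Pf + R)                              # gain from the complement
        yo = y + rng.normal(0.0, np.sqrt(R), g.size)   # perturbed observations
        xa[g] = xf[g] + K * (yo - xf[g])               # update of the subset
    return xa

# Usage: N = 32 members; l = 2 recovers the DEnKF of Houtekamer and Mitchell (1998).
rng = np.random.default_rng(2)
xa = l_denkf_analysis(rng.standard_normal(32), rng.standard_normal(), 1.0, 2, rng)
```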
Alternatively, some authors correct directly for the spread misrepresentation by using the so-called covariance inflation technique, in which the ensemble-based covariances are multiplied by a tunable factor r ≥ 1 (Anderson and Anderson 1999; Anderson 2001; Whitaker and Hamill 2002; Ott et al. 2004). As seen previously, in order to satisfy criterion ii, these methods should seek an enhanced analysis variance that is as close as possible to the ideal value (and not to the optimal value), since the analysis is necessarily suboptimal owing to the use of an ensemble of limited size. We stress here that this inflation is a priori dependent on N as well as on 𝗞, as verified below.
In section 5a, we examine the ability of the DEnKF to produce an ideal analysis variance in a perfect-model context by extending our theoretical analysis. In section 5b, this analysis is employed to formulate covariance inflation and randomly perturbed analysis methods that give, on average, an analysis satisfying criteria i and ii.
a. Sampling errors in the DEnKF
Here, we examine separately the quality of the partitioned and of the merged analysis. This distinction concerns only the way the analysis is used once produced: the merged analysis considers the whole ensemble obtained by merging all of the subensembles, whereas the partitioned analysis considers each subensemble separately. In a partitioned analysis, each subensemble is defined at the first analysis cycle and evolves independently. In a merged analysis, the l subensembles of size N/l are unified into a single ensemble of size N at the end of each analysis, and then randomly redivided. We consider both because previous studies have used both presentations. We treat the 2DEnKF fully, and infer the result for the general case of an lDEnKF.
1) Partitioned analysis









We stress here that for small values of N, the second-order truncation is still a good one for tr〈𝗣a1〉, but it is not as precise as for the EnKF. Indeed, the higher-order terms no longer compensate one another: those containing products of ϵ1 and ϵ2 average to zero, whereas the others do not.
2) Merged analysis





Figure 2 shows the results obtained with a scalar process and a 2DEnKF. One can see that in this case the analysis variance given by the merged analysis matches the average error of the ensemble mean very well. This means that in this case (k = 0.5), l = 2 is the optimal value, a value effectively given by solving the scalar version of (32). On the other hand, the partitioned analysis (the first ensemble in this case) gives an analysis with degraded accuracy, whose variance still underestimates the MSE of the ensemble mean. However, as predicted above, it overestimates the optimal value. Furthermore, the experimental results match the theoretical curves quite closely, as in the case of the EnKF. This is a good indication of the accuracy of the approximations made. Higher values of l have also been tested, namely l = 4 and l = 8. The results, shown in Figs. 3 and 4, are also quite consistent with the theory. Figure 5 shows the simulation for the extreme case of an NDEnKF. As predicted, the analysis variance underestimates the MSE of the ensemble mean. The error with respect to the optimal value is almost exactly equal and opposite to that given by the EnKF.
We see that the DEnKF is a very good alternative to the EnKF: it can give an analysis that satisfies criteria i and ii, which is impossible with the EnKF. Nevertheless, it appears that, in this case, dividing the ensemble into more than two subensembles brings a smaller improvement.
b. Optimal covariance inflation
To correct for systematic sampling errors in 𝗣aN and to satisfy criterion ii, some authors multiply the forecast or analysis covariances by an inflation factor r. We have seen that, to properly satisfy criterion ii, this factor necessarily depends on the ensemble size. Instead of using a blindly chosen factor, we propose here two inflation techniques: inflation of the deviations from the background mean before the analysis, and the addition of random errors after the analysis.
The analyses given by the two methods satisfy criteria i and ii on average, and they do not lead to underestimated error covariances. We therefore expect these analyses to be less subject to the problem of filter divergence.
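As an illustration of the first technique, the following scalar Python sketch (our own; it tunes rN numerically by bisection rather than evaluating the analytic expression, which is not reproduced here) inflates the forecast deviations before the analysis and searches for the factor that makes the average spread match the MSE of the ensemble mean, i.e., that satisfies criterion ii:

```python
import numpy as np

def spread_minus_mse(r, N, R=1.0, M=200_000, seed=3):
    """Average (analysis spread - MSE of ensemble mean) over M scalar
    realizations, with forecast deviations inflated by sqrt(r) before the
    update. The r_N satisfying criterion ii makes this difference vanish."""
    rng = np.random.default_rng(seed)             # fixed seed: deterministic in r
    xf = rng.standard_normal((M, N))              # forecasts (truth = 0)
    y = rng.standard_normal((M, 1))               # observations
    xm = xf.mean(axis=1, keepdims=True)
    xfi = xm + np.sqrt(r) * (xf - xm)             # inflate deviations by sqrt(r)
    Pf = xfi.var(axis=1, ddof=1, keepdims=True)   # inflated background variance
    K = Pf / (Pf + R)
    yo = y + rng.standard_normal((M, N))          # perturbed observations
    xa = xfi + K * (yo - xfi)
    return (xa.var(axis=1, ddof=1) - xa.mean(axis=1) ** 2).mean()

# Bisection for the r_N that zeroes the spread-MSE mismatch (here N = 16).
lo, hi = 1.0, 1.5
for _ in range(30):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if spread_minus_mse(mid, 16) < 0 else (lo, mid)
print(f"tuned inflation factor r_16 ~ {0.5 * (lo + hi):.3f}")
```

The second technique would replace the inflation line with the addition, after the update, of random perturbations whose variance makes up the deficit; the tuning logic is identical.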











6. Conclusions
Theoretical expressions have been established for the analysis error covariance matrices and the mean squared error of the ensemble mean given by various versions of a perturbed-observation EnKF. The performance of each version is examined by verifying whether it produces a sample with an ensemble mean having the smallest error (a requirement termed criterion i) and whether the MSE of the ensemble mean is well represented by the spread drawn from the ensemble of analyses given by the filter (termed criterion ii). For the EnKF used with a finite ensemble, our results show that it never fully satisfies criterion ii. That is, the analysis spread among the ensemble members always underestimates the mean squared difference between the ensemble mean and the true state, and the filter is therefore naturally subject to divergence.

The possibility of properly correcting this problem with the double-ensemble Kalman filter technique was studied. We showed that an lDEnKF (i.e., cutting the ensemble into l subensembles of size N/l) can potentially give adequate analyses when using the ensemble resulting from the merging of all the subensembles. Nevertheless, the best value of l is problem dependent, typically depending on the observation density and frequency. In addition, using an lDEnKF requires the computation of l Kalman gains, whose cost may be prohibitive. For the partitioned lDEnKF, that is, using each subensemble independently, we showed that the analysis never satisfies criterion ii. It performs relatively better than the EnKF for small values of l, but at the expense of a degradation of the accuracy of the ensemble mean (criterion i is not satisfied) that increases with l. To correct for the misrepresentation problems and satisfy criteria i and ii on average, we presented flow-dependent techniques, namely an optimal covariance inflation and a randomly perturbed analysis. They are motivated by the present theoretical analysis, which takes into account the dependence of criterion ii on the number of members and on the Kalman gain. We expect these methods to be less subject to the problem of filter divergence.

It should be stressed that all of the calculations have been done in an idealized context, ignoring model and representativity errors. Accounting for these errors could invalidate some of our results and would affect our theoretical value for the inflation factor r. It is also important to keep in mind that the analytic results concern means over a large number of realizations of analyses. For one given analysis, they might not be as relevant, especially if the variance of the squared error of the ensemble mean is relatively high. Since NWP deals with state spaces of high dimension, we are confident that the averages will have some significance. We should also stress that the theory developed here follows the traditional assumption of a Gaussian process. As a result, we may expect corrections to r in a strongly nonlinear context. Nevertheless, we expect our results to better guide the user in the determination of the r factor in an operational context, where it was previously chosen by trial and error (Anderson and Anderson 1999). We have performed simulations using a very nonlinear two-dimensional turbulence model that have led to very encouraging results regarding the capacity of these techniques to avoid filter divergence. They will be presented in the forthcoming Part II of this paper.
Acknowledgments
This work would not have been possible without the contributions of Peter Houtekamer and Herschel Mitchell, with whom we had hours of stimulating and fruitful discussions. We thank two anonymous reviewers for their very helpful and productive comments. We acknowledge financial support from the Canadian Foundation for Climate and Atmospheric Sciences (CFCAS) in the form of Grant GR-500B.
REFERENCES
Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.
Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758.
Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
Charron, M., P. L. Houtekamer, and P. Bartello, 2006: Assimilation with an ensemble Kalman filter of synthetic radial wind data in anisotropic turbulence: Perfect model experiments. Mon. Wea. Rev., 134, 618–637.
Cohn, S., 1997: An introduction to estimation theory. J. Meteor. Soc. Japan, 75, 257–288.
Corazza, M., E. Kalnay, D. J. Patil, E. Ott, J. Yorke, B. R. Hunt, I. Szunyogh, and M. Cai, 2002: Use of the breeding technique in the estimation of the background error covariance matrix for a quasi-geostrophic model. Preprints, Symp. on Observations, Data Assimilation and Probabilistic Prediction, Orlando, FL, Amer. Meteor. Soc., 6.4. [Available online at http://ams.confex.com/ams/pdfpapers/28755.pdf.]
Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 457 pp.
Epstein, E., 1969: Stochastic dynamic prediction. Tellus, 21, 739–759.
Evensen, G., 1992: Using the extended Kalman filter with a multilayer quasi-geostrophic ocean model. J. Geophys. Res., 97 (C11), 17905–17924.
Evensen, G., 1994: Inverse methods and data assimilation in nonlinear ocean models. Physica D, 77, 108–129.
Evensen, G., 2007: Data Assimilation: The Ensemble Kalman Filter. Springer, 280 pp.
Farrell, B., and P. Ioannou, 2001: State estimation using a reduced-order Kalman filter. J. Atmos. Sci., 58, 3666–3680.
Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. J. Multivariate Anal., 98, 227–255.
Gelb, A., 1974: Applied Optimal Estimation. The MIT Press, 382 pp.
Gillijns, S., D. S. Bernstein, and B. De Moor, 2005: The reduced rank transform square root filter for data assimilation. Proc. 14th IFAC Symp. on System Identification (SYSID-2006), Newcastle, Australia, Int. Federation on Automatic Control, FrB2.5.
Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.
Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790.
Hansen, J. A., and L. A. Smith, 2000: The role of operational constraints in selecting supplementary observations. J. Atmos. Sci., 57, 2859–2871.
Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
Houtekamer, P. L., and H. L. Mitchell, 1999: Reply. Mon. Wea. Rev., 127, 1378–1379.
Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137.
Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. Quart. J. Roy. Meteor. Soc., 131, 3269–3289.
Keppenne, C., 2000: Data assimilation into a primitive-equation model with a parallel ensemble Kalman filter. Mon. Wea. Rev., 128, 1971–1981.
Lawson, W. G., and J. A. Hansen, 2004: Implications of stochastic and deterministic filters as ensemble-based data assimilation methods in varying regimes of error growth. Mon. Wea. Rev., 132, 1966–1981.
Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.
Maybeck, P. S., 1979: Stochastic Models, Estimation, and Control. Vol. 1. Mathematics in Science and Engineering, Vol. 141-1, Academic Press, 442 pp.
Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.
Pham, D. T., J. Verron, and M. C. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.
Reichle, R. H., D. B. McLaughlin, and D. Entekhabi, 2002: Hydrologic data assimilation with the ensemble Kalman filter. Mon. Wea. Rev., 130, 103–114.
Snyder, C., and F. Zhang, 2003: Assimilation of simulated Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 131, 1663–1677.
Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.
van Leeuwen, P. J., 1999: Comment on “Data assimilation using an ensemble Kalman filter technique.” Mon. Wea. Rev., 127, 1374–1377.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.
Wunsch, C., 1996: The Ocean Circulation Inverse Problem. Cambridge University Press, 458 pp.
APPENDIX A
Taylor Expansion of 𝗞N and ΠaN in the EnKF
APPENDIX B
Calculation of tr(〈𝗟κΘκT𝗟T〉) for a Gaussian Multivariate Process





APPENDIX C
Calculation of ΔfN for a Gaussian Process


APPENDIX D
Taylor Expansion for ΔaN

APPENDIX E
Taylor Expansion for 𝗣aN in the Partitioned DEnKF Method

APPENDIX F
Taylor Expansion for 𝗣aN in the Merged DEnKF Method

APPENDIX G
Taylor Expansion for ΔaN in the Merged DEnKF Method








Fig. 1. Simulation of one analysis cycle with a Gaussian scalar, taken from a normal distribution of zero mean and unit variance.
Fig. 2. Same simulation as in Fig. 1 but for a 2DEnKF. (a) Statistics for the merged analysis. The scalar versions of (27) and (30) are drawn as continuous lines. (b) Statistics for the analysis of the first ensemble. The scalar versions of (21) and (24) are drawn as continuous lines.
Fig. 3. Same simulation as in Fig. 1 but for a 4DEnKF. (a) Statistics for the merged analysis. The scalar versions of (28) and (31) for l = 4 are drawn as continuous lines. (b) Statistics for the analysis of the first ensemble. The scalar versions of (22) and (25) for l = 4 are drawn as continuous lines.
Fig. 4. Same simulation as in Fig. 1 but for an 8DEnKF. (a) Statistics for the merged analysis. The scalar versions of (28) and (31) for l = 8 are drawn as continuous lines. (b) Statistics for the analysis of the first ensemble. The scalar versions of (22) and (25) for l = 8 are drawn as continuous lines.
Fig. 5. Same simulation as in Fig. 1 but for an NDEnKF, N being the number of ensemble members. The scalar versions of (28) and (31) for l = N are drawn as continuous lines.
Fig. 6. Same simulation as in Fig. 1 but for a sampling-error-corrected EnKF, using an inflation factor. (a) The inflation coefficient rN, which depends on the number of ensemble members, has been used. (b) The constant value r = 1.035 has been used. The curves closely match the corresponding analytical curve ⟨σ²a(N)⟩ = σ²a[1.035 − 2k(1 − k)(N − 1)⁻¹], represented by the continuous line.
Fig. 7. Isolines of the scalar version of the inflation coefficient rN, as a function of the number of ensemble members and the gain k.