## 1. Introduction

Ensemble Kalman filters are widely used for data assimilation in numerical weather prediction on both global (Houtekamer et al. 2014; Whitaker et al. 2008) and regional scales (Zhu et al. 2013; Cavallo et al. 2013), ocean prediction (Keppenne et al. 2008; Karspeck et al. 2013), land surface (Rosolem et al. 2014), and hydrology prediction (Reichle et al. 2002), as well as many other types of applications (Chen and Oliver 2010; Emerick and Reynolds 2011; Shaman and Karspeck 2012). State-of-the-art prediction models for many of these applications have evolved to use all available computational resources. Therefore, there is a natural desire to minimize ensemble sizes while still providing reasonably high-quality ensemble analyses.

Many variants of ensemble filters provide the exact Kalman filter solution for linear prediction models with Gaussian observational error and sufficiently large ensembles (Anderson 2009b; Tippett et al. 2003). However, all of these prerequisites are violated for large geophysical applications. Consequently, basic ensemble filters without ad hoc adjunct algorithms like inflation and localization and with affordable ensemble sizes generally diverge from the observed system. Other filter variants (Burgers et al. 1998; Ott et al. 2004) also tend to diverge without additional algorithmic enhancements.

Two fundamental problems associated with small ensembles for large geophysical applications are insufficient prior ensemble variance and spurious prior correlations between observations and state variables. Heuristic methods addressing these problems have been developed, in particular inflation (Anderson and Anderson 1999) and localization (Houtekamer and Mitchell 1998; Hamill et al. 2001; Furrer and Bengtsson 2007). Inflation algorithms reduce the loss of variance during the assimilation (Zhang et al. 2004) or restore variance when it is lost (Anderson 2009a). Localization algorithms attempt to reduce errors in the correlations. In most cases, small ensembles are implicitly assumed to overestimate the magnitude of correlation and localization algorithms reduce ensemble correlations (Anderson 2012, hereafter A12). Adaptive algorithms that estimate the required inflation (Whitaker and Hamill 2012) and localization (Bishop and Hodyss 2007, 2009; Zhang and Oliver 2010) as adjunct parts of the ensemble assimilation have been developed.

Localization and inflation often correct for more than just errors from small ensembles. In large geophysical applications, inflation may correct primarily for systematic model errors (Li et al. 2009a,b) and localization may also correct for model errors that produce incorrect correlations.

A12 ignored these other aspects of localization and inflation and explicitly assumed that ensemble sampling error was the only source of correlation errors. By assuming that ensembles were random draws from a specified prior distribution of correlations, it was possible to correct for some sampling errors in the correlations between observations and state variables. Applying this sampling error correction (SEC) algorithm led to ensemble filters that still required traditional fixed localization, but were less sensitive to the tuning of traditional localization width.

This manuscript extends the approach of A12 by explicitly estimating the distribution of correlations between observations and state variables. The correlation distributions are estimated for subsets of pairs of observations and state variables as in Anderson and Lei (2013, hereafter AL13). This distribution is then used to calculate a correction for sampling error in the computation of correlation in the ensemble filter. This correlation error reduction (CER) algorithm is described in section 2. Section 3 describes low-order perfect-model experiments used to test the algorithm. Sections 4 and 5 present results from linear and nonlinear low-order models with various observing system configurations. Section 6 presents results using a low-order dry atmospheric general circulation model. Sections 7 and 8 discuss details of the algorithmic performance and present conclusions.

## 2. Reducing correlation sampling error

Ensemble filters that process observations sequentially compute increments for the impact of an observation *y* on a single state vector component *x* (Anderson 2003):

$$\Delta x_{n} = \frac{\sigma_{xy}}{\sigma_{y}^{2}}\,\Delta y_{n} \qquad (1)$$

$$\Delta x_{n} = r\,\frac{\sigma_{x}}{\sigma_{y}}\,\Delta y_{n}. \qquad (2)$$

Here *n* indexes the *N* ensemble members, $\Delta y_{n}$ is the *n*th ensemble estimate of the increment for the observed quantity, $\Delta x_{n}$ is the increment for the *n*th ensemble estimate of the state variable component, $\sigma_{xy}$ is the prior sample covariance, *r* is the prior sample correlation, and $\sigma_{x}$ and $\sigma_{y}$ are the prior sample standard deviations of *x* and *y*, respectively.

Following A12, sampling error is assumed to occur in the computation of the sample statistics from the ensemble of size *N*. Sampling error in the standard deviations is assumed to be small as justified in the appendix of A12, so only sampling error in the correlation *r* is considered. The algorithm estimates the prior distribution of the correlation between *y* and *x* using information from previous state variable updates. Let the prior distribution for the correlation be $p(r)$. Bayes's rule gives the posterior distribution for *r* given that the ensemble sample correlation is $r_s$:

$$p(r \mid r_s) = \frac{p(r_s \mid r)\,p(r)}{\int p(r_s \mid r')\,p(r')\,dr'}. \qquad (3)$$

The mean of this posterior distribution, $\hat{r}$, is used in (2) to compute increments for *x* instead of the sample value $r_s$:

$$\Delta x_{n} = \hat{r}\,\frac{\sigma_{x}}{\sigma_{y}}\,\Delta y_{n}. \qquad (4)$$

Implementing this algorithm requires a discrete representation of the prior and posterior distributions of a correlation and the likelihood. The range [−1, 1] of the correlation is divided into *S* congruent subintervals and the mean probability density in each subinterval is used as the discrete representation. Subintervals of width 0.01 (*S* = 200) are used in all results presented here (see section 7).
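A minimal sketch of this discrete representation (the helper name is illustrative, and the mean density in each subinterval is approximated by the midpoint value):

```python
import numpy as np

def discretize_pdf(density, S=200):
    """Represent a correlation PDF on [-1, 1] by S congruent subintervals.

    density is any nonnegative callable on [-1, 1]; the mean density in each
    subinterval is approximated by its midpoint value, then normalized so the
    discrete representation sums to 1.
    """
    edges = np.linspace(-1.0, 1.0, S + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    p = np.clip(np.array([density(m) for m in mids]), 0.0, None)
    return p / p.sum()
```

The uniform initial prior used in all experiments here is simply `discretize_pdf(lambda r: 0.5, S)`.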

The likelihood of a sample correlation given a true correlation and ensemble size *N* is computed by an offline Monte Carlo computation like the one used in A12. It is assumed that before the assimilation starts, the correlation between an observation and a state variable is uniformly distributed on [−1, 1]. A set of *K* correlation values (*K* = 10^{8} here) is defined that equally partitions the interval [−1, 1] with

$$r_k = -1 + \frac{2k - 1}{K}, \qquad k = 1, \ldots, K.$$

For each $r_k$, a random sample of size *N* is drawn from a bivariate normal distribution with covariance

$$\begin{pmatrix} 1 & r_k \\ r_k & 1 \end{pmatrix},$$

and the sample correlation of the *N* pairs is computed. The set of pairs of true and sample correlation intervals is binned to give a discrete estimate of the likelihood of each sample correlation subinterval given each true correlation subinterval.
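The offline Monte Carlo construction of the likelihood table can be sketched as follows (a NumPy illustration with a much smaller *K* than the paper's 10^8; the function name and the row-normalization choice are assumptions):

```python
import numpy as np

def likelihood_table(N, S=200, K=100_000, seed=0):
    """Monte Carlo estimate of p(sample-correlation bin | true-correlation bin)
    for ensemble size N, on an S-subinterval discretization of [-1, 1]."""
    rng = np.random.default_rng(seed)
    # K true correlations equally partitioning [-1, 1]: r_k = -1 + (2k - 1)/K
    r = -1.0 + (2.0 * np.arange(1, K + 1) - 1.0) / K
    # N-member samples from bivariate normals with correlation r_k
    z1 = rng.standard_normal((K, N))
    z2 = rng.standard_normal((K, N))
    x = z1
    y = r[:, None] * z1 + np.sqrt(1.0 - r[:, None] ** 2) * z2
    # Sample correlation of each N-member ensemble
    xm = x - x.mean(axis=1, keepdims=True)
    ym = y - y.mean(axis=1, keepdims=True)
    rs = (xm * ym).sum(axis=1) / np.sqrt((xm**2).sum(axis=1) * (ym**2).sum(axis=1))
    # Bin (true, sample) pairs into an S x S table, then normalize each row
    i = np.clip(((r + 1.0) / 2.0 * S).astype(int), 0, S - 1)
    j = np.clip(((rs + 1.0) / 2.0 * S).astype(int), 0, S - 1)
    table = np.zeros((S, S))
    np.add.at(table, (i, j), 1.0)
    rows = table.sum(axis=1, keepdims=True)
    return table / np.where(rows == 0.0, 1.0, rows)
```

Each row of the returned table is then a discrete likelihood of the sample correlation given the true correlation subinterval.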

The normalized product in (3) of the prior and the likelihood (blue curve in Fig. 1) is the best available estimate of the PDF of correlations for the update of state variable *x* by observation *y*.

Optimal prior distributions for the correlation are expected to differ for different pairs of *y* and *x*. AL13 developed a method to compute empirical localization functions (ELFs) for subsets of pairs of state variables and observations. The ELF algorithm is not based on sampling error arguments, but rather does an a posteriori optimization using the output of an observing system simulation experiment to produce a localization coefficient for each subset. That use of such subsets is adopted here with an independent estimate of the correlation distribution being computed for each subset, but it is important to note that the algorithm here is an extension of A12, not AL13. In the simple examples in subsequent sections, the subsets are distinguished only by the distance between *y* and *x* unless otherwise noted in sections 5c, 6, and 7. AL13, Lei and Anderson (2014a), and Lei and Anderson (2014b) discuss the selection of subsets for more complex applications.

Starting an assimilation experiment requires an initial estimate of the prior correlation distribution for each subset. All experiments here use a uniform distribution on [−1, 1]; sensitivity to this choice is discussed in section 7.

If the true correlation distribution for a (*y*, *x*) subset were the same throughout the assimilation experiment, the posterior distribution for the correlation for a given subset in (3) would be used as the prior for the next (*y*, *x*) pair from that subset. However, the true (*y*, *x*) distribution cannot be assumed to be stationary in this way for real applications. As the synoptic situation changes, the expected correlation between an observation 200 km from a state variable might change significantly (e.g., depending on whether a front existed between the two). Instead, for potentially nonstationary correlation distributions, the posterior distribution in (3) is treated as an independent estimate of distributions that can occur for this subset. The subsequent prior for the subset is computed as a weighted average of the current prior and the posterior:

$$p_{\mathrm{new}}(r) = \frac{p(r) + \beta\,p(r \mid r_s)}{1 + \beta}. \qquad (9)$$

The relative weight *β* is small; all experiments here use *β* = 1/10 000 and sensitivity to *β* is discussed in section 7.

A summary of the steps in the CER algorithm for observation *y* impacting state variable *x* is as follows:

- Use the distance between *x* and *y* to determine which subset to use.
- Use the sample correlation (red dashed value in Fig. 1) to select the appropriate likelihood (red curve in Fig. 1) using (8).
- Compute the posterior correlation distribution (blue curve in Fig. 1) as the product of the prior distribution for the subset (green curve in Fig. 1) and the likelihood using (3).
- Use the mean of the posterior (blue dashed value in Fig. 1) to compute increments for *x* using (4).
- Compute an updated prior estimate for this subset by adding a small fraction times the posterior PDF to the prior (green curve plus a fraction of the blue curve) using (9).
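The steps above can be sketched as a single update on the discrete grid (an illustrative NumPy implementation; the function and argument names are assumptions, and `beta=1e-4` matches the 1/10 000 used in the experiments):

```python
import numpy as np

def cer_update(prior_pdf, likelihood, sample_corr, beta=1e-4):
    """One CER correlation update on an S-subinterval grid over [-1, 1].

    prior_pdf   : length-S discrete prior PDF for this subset (sums to 1)
    likelihood  : S x S table, likelihood[i, j] = p(sample bin j | true bin i)
    sample_corr : ensemble sample correlation for this (y, x) pair
    Returns the corrected correlation (posterior mean) and the updated prior.
    """
    S = prior_pdf.size
    mids = -1.0 + (2.0 * np.arange(S) + 1.0) / S        # subinterval midpoints
    j = min(int((sample_corr + 1.0) / 2.0 * S), S - 1)  # bin of the sample correlation
    post = prior_pdf * likelihood[:, j]                 # product of prior and likelihood, as in (3)
    # (If this sum is zero the prior and likelihood do not overlap and the
    # posterior is undefined; handling that case is application specific.)
    post = post / post.sum()
    r_hat = float((mids * post).sum())                  # posterior mean used in (4)
    new_prior = (prior_pdf + beta * post) / (1.0 + beta)  # weighted average, as in (9)
    return r_hat, new_prior
```

The increment for each ensemble member then uses `r_hat` in place of the sample correlation, and the implied equivalent localization is `r_hat / sample_corr`.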

There can be instances when the product of the prior and the likelihood in (3) is zero in every subinterval, so that the normalized posterior is undefined; such cases must be detected and handled separately.

## 3. Evaluation of CER algorithm

Perfect-model assimilation experiments are used to evaluate the CER algorithm. Forward observation operators are applied to the state vector from a single long run of a forecast model, the “truth” run, and random samples from a specified observational error distribution are added to generate synthetic observations. These observations are then assimilated by an ensemble filter using the same forecast model.

The assimilation quality is measured by the time mean RMSE of the ensemble mean,

$$\mathrm{RMSE} = \sqrt{\frac{1}{M}\sum_{m=1}^{M}\left(\bar{x}_{m} - x_{m}^{t}\right)^{2}}, \qquad (10)$$

where $\bar{x}_{m}$ is the ensemble mean, $x_{m}^{t}$ is the true value, and *m* indexes the *M* model state variables. All results are for prior estimates but no qualitative differences were found for posterior estimates. No traditional localization is applied with the CER algorithm. Comparison cases that do not use CER use standard Gaspari–Cohn (GC) localization (Gaspari and Cohn 1999) with a specified half-width.
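The GC comparison baseline is the standard compactly supported fifth-order piecewise rational function of Gaspari and Cohn (1999, their Eq. 4.10); a direct transcription, with `c` the half-width so that the weight reaches zero at separation `2c`:

```python
def gaspari_cohn(dist, c):
    """Gaspari-Cohn localization weight for separation dist >= 0 and
    half-width c; returns a value in [0, 1] that is exactly 0 beyond 2c."""
    z = abs(dist) / c
    if z <= 1.0:
        # -z^5/4 + z^4/2 + 5z^3/8 - 5z^2/3 + 1
        return (((-0.25 * z + 0.5) * z + 0.625) * z - 5.0 / 3.0) * z**2 + 1.0
    if z <= 2.0:
        # z^5/12 - z^4/2 + 5z^3/8 + 5z^2/3 - 5z + 4 - 2/(3z)
        return ((((z / 12.0 - 0.5) * z + 0.625) * z + 5.0 / 3.0) * z - 5.0) * z \
               + 4.0 - 2.0 / (3.0 * z)
    return 0.0
```

Multiplying each sample correlation (or covariance) by this weight as a function of observation-to-state distance is the tuned baseline against which the CER is compared.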

## 4. Simple linear model

For sufficiently large ensembles, the standard ensemble adjustment Kalman filter (EAKF) has no sampling error for this model and reproduces the Kalman filter solution to machine precision, but for smaller ensembles sampling error degrades the analysis. Since the *i*th observation impacts only the *i*th state variable, the EAKF converges to the Kalman filter solution for sufficiently large ensemble sizes. In the definition of the linear forecast model, subscript *u* refers to the posterior at the previous assimilation time, and subscript *p* is the prior at the current assimilation time. Figure 2 shows the time mean RMSE and ensemble spread as a function of ensemble size for the CER. Ensemble size 5 gave the smallest RMSE (0.324) and most consistent spread. Ensemble sizes of 20 and larger gave nearly identical values of RMSE (0.331) and spread (0.368). An ensemble size of 201 that would give the exact answer without CER and adaptive inflation also has RMSE of 0.331. Ensemble size 4 periodically diverged and reconverged to the truth while ensemble size 3 diverged.

Figure 2 can be compared to Fig. 2 from A12, which shows results using the SEC algorithm applied with no GC localization and the same adaptive inflation as used here. For ensemble sizes of 160 and 200 the SEC gives nearly identical results to CER, but for smaller ensemble sizes the SEC error is larger and the SEC diverged for ensemble sizes less than 50. In this simple case where the only source of error is sampling error, the CER algorithm does nearly as well as the exact solution for ensemble sizes greater than 4 and required no tuning of localization.

## 5. Lorenz-96 40-variable model

The second model examined is the 40-variable configuration of the Lorenz-96 model (L96; Lorenz and Emanuel 1998) with standard parameter settings of forcing *F* = 8.0, a time step of 0.05 units, and fourth-order Runge–Kutta time differencing. To facilitate comparison to the ELF results in AL13, the same three observation distributions examined there are discussed here.
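The model equations and time stepping just described can be written compactly (a standard NumPy transcription of the cyclic Lorenz-96 tendency with fourth-order Runge–Kutta; function names are illustrative):

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz-96 tendency dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F
    on a cyclic domain (indices wrap around)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def l96_step(x, dt=0.05, F=8.0):
    """Advance one model time step with fourth-order Runge-Kutta."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
```

With *F* = 8 and 40 variables the model is chaotic, so a small perturbation to the uniform steady state `x = F` grows while the trajectory remains bounded.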

### a. Frequent low-quality observations

For the first test, all 40 state variables are observed every model time step with an observational error variance of 16. Figure 3 shows the time mean RMSE for ensemble sizes of 5, 10, 20, and 40 for a variety of half-widths using standard GC localization and for the CER algorithm (horizontal lines). The CER RMSE is slightly larger than for the best GC half-width for all ensemble sizes. The GC results for the 20-member ensemble are identical to those shown in AL13’s Fig. 2, while the 10- and 40-member ensemble results are the same as in AL13’s Fig. 3. The RMSE for the CER for the 10-, 20-, and 40-member ensemble cases is very similar to the RMSE for the best ELF from AL13.

Figure 4 plots the time mean equivalent localization from the CER [the ratio of the corrected correlation to the sample correlation, as in (6)] along with the best GC localization for each ensemble size; the CER curves show a dip at a distance of one grid interval.

For five members the dip is a local minimum with the localization for distance one smaller than for distance two. For all ensemble sizes, the localization from the CER is smaller than the best GC whenever the GC localization is greater than 0.1. This suggests that assuming that the ensembles act as random samples from an underlying distribution of correlations is not a very good choice in this case. In other words, the problem is fairly close to one in which a sufficiently large ensemble gives the correct result without sampling error. Therefore, the CER always localizes too aggressively, resulting in larger RMSE than the best GC.

The CER localizations can be compared to the ELF localizations in AL13’s Fig. 1 for 20 members and Fig. 5 for 10 members. The shapes are very similar with the same kink at distance one. It is unclear why the ELF RMSE was not smaller than that from the best GC for this case. However, the similarity between the ELF and the CER localization patterns suggests that sampling error, the only thing used to construct the CER, is a major error source in this case.

### b. Infrequent high-quality observations

The second case has observations of all state variables every 12 model time steps with an observational error variance of 1. For this observation system, the prior RMSE is about twice as large as for the previous case, but the posterior RMSE is much smaller. The fact that the prior error is larger suggests that this case may be further from the linear limit where the Kalman filter is exact so that the sampling error approximation may be more appropriate than in more linear cases.

Figure 5 shows RMSE for 10-, 20-, and 40-member ensembles for a selection of GC half-widths and for the CER. The CER RMSE is smaller than that for the best GC half-width for all ensemble sizes and is again very similar to that for the best ELFs (AL13’s Fig. 7); the CER is better than the best ELF for 10- and 40-member ensembles and slightly worse for 20.

Figure 6 shows the time mean localization for the CER along with the best GC for 10, 20, and 40 members. In this case, there is a local minimum in localization for distance of one grid interval for all ensemble sizes. The CER localizations are smaller than the best GC for small distances between observation and state variable, but larger for distances greater than about three grid intervals. This suggests that the CER does better than the best GC in this case for two reasons. First, because the prior errors are much larger, the sampling error that the CER corrects is a more dominant source of error. Second, because the best localization is not as similar to a Gaussian as in the previous case, the baseline GC localization is further from optimal so it is easier for the CER to do better. The CER can have a more aggressive localization for small distances and less aggressive for larger distances than the best GC. Again, the CER localizations are very similar in shape to the best ELFs shown in AL13’s Fig. 8, implying that correlation sampling error is the dominant error source in the filter for this case.

Figure 7a shows the evolution of the prior correlation PDF in the 20-member CER algorithm for the subset with the closest observations and state variables, one grid interval apart. The initial distribution (not shown) is uniform on [−1, 1], as for all subsets.

Figure 7b shows the evolution of the 20-member correlation distribution for observations and state variables that are relatively far apart with a distance of 8 grid intervals. The distribution evolves to have large probability of correlation close to 0 and negligible probability that the absolute value of correlation is greater than 0.5. The mode of the distribution is almost always in one of the two discrete intervals bounded above or below by zero.

Figure 8 shows the equilibrated 20-member correlation distributions for a selection of distances between the observation and state variable. The one and eight grid interval distances were already discussed. The two grid interval distance has a mode of about −0.4 with a skewed tail extending to large positive correlations. It does not, however, appear to have a second mode as for the one interval case. As discussed above, this may explain why the localization for one grid interval is a local minimum for the 20-member ensemble in Fig. 6. As the distance increases past two intervals, the correlation modes in Fig. 8 are close to 0 with increasingly narrow distributions. Correlation PDFs for distances greater than 8 are similar to the distance 8 case.

Although the prior correlation PDFs approximately converge as the assimilation proceeds, the localizations computed with (6) for subsets with small distance between observation and state variable do not. Figure 9 shows the localization for different distance subsets as sequences of pairs of observations and state variables are updated for the 20-member case. For distances of 3 grid intervals or less, localization varies from more than 1 to less than 0.1 over a sequence of 500 pairs. The variation is smaller for larger distances. Localization for the 6 grid interval subset is almost always between 0.3 and 0.5 and for the 16 grid interval subset is almost always between 0.03 and 0.05. This time variation is a result of the fact that the correlations for a subset vary in time and the correlation error reduction equivalent localizations are functions of the current sample correlation in the ensemble and the PDF of correlations for the subset.

### c. Many observations of sums of state variables

Naive application of the CER algorithm with 20-member ensembles gave time mean RMSE that was much larger than for the best GC localization or the ELFs from AL13. This is because the subsets were distinguished only by the distance between *y* and *x*. However, the parallel sequential assimilation algorithm used here (Anderson and Collins 2007) works in a joint phase space where the forward operators for all observations are computed at the beginning of each assimilation step. The model state vector becomes the set consisting of the standard 40 state variables plus the (320 in this case) observed variables. When an observation is assimilated, increments are computed and applied not only to the 40 original state variables but also to all of the 320 observed variables that have not yet been assimilated. In this case, that means that a given distance subset includes two distinct types of pairs. The first type contains an observation *y*, the sum of 17 adjacent state variables, and a state variable that is *k* grid intervals from the center of these 17. The second type contains an observation *y* and another observation with a center that is *k* grid intervals from the center of *y*. The correlation PDFs for these two types of pairs are quite different. When the two types of pairs are combined to estimate a single correlation PDF, the result is not very similar to either alone.
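The forward operator and joint state vector just described can be sketched as follows (an illustrative NumPy version; the helper names, the choice of centers, and the half-width-of-8 reading of "17 adjacent variables" are assumptions):

```python
import numpy as np

def sum_observation(x, center, half_width=8):
    """Forward operator: sum of 17 adjacent state variables (half_width = 8
    on each side of center) on the cyclic 40-variable domain."""
    M = x.size
    idx = np.arange(center - half_width, center + half_width + 1) % M
    return x[idx].sum()

def joint_state(x, centers):
    """Joint phase space used by the parallel sequential filter: the model
    state extended with the forward-operator values for all observations."""
    return np.concatenate([x, [sum_observation(x, c) for c in centers]])
```

During assimilation, increments to an element of the extended part of this vector update an observed variable that has not yet been assimilated, which is why observation–observation pairs arise.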

To improve the performance of the CER, correlation PDFs are instead estimated for 41 subsets of pairs, with observation–state pairs and observation–observation pairs at a given distance assigned to separate subsets.

The RMSE for a 20-member ensemble for this CER application and a variety of GC localization half-widths are shown in Fig. 10. The CER produces time-averaged RMSE of less than 1.3 while the best GC is just less than 1.7. The best ELF from AL13 gives an RMSE of about 1.45, but only a single ELF was computed rather than one for pairs of observations with state variables and one for observations with observations as discussed in the previous paragraph.

The time mean localization for this case is shown in Fig. 11 along with the best GC function. Note that there are two curves for the CER localization: one for state variables and one for other observed variables. The state variable localization is somewhat reminiscent of ELFs for this case with values of about 0.5 for small distance and largest localization values at 7 grid intervals. However, the state localization at distances larger than seven intervals is larger and has more structure than for the ELFs. The localization for observations has values greater than 1 for distances between 2 and 6 and is larger than the localization for state variables until distances of 15 grid intervals. The best GC function is bracketed between the two localizations for distances out to six intervals. The small RMSE for the CER suggests that sampling error explains a significant portion of the error in this case. Since the CER localizations are distinctly non-Gaussian, the GC is unable to compete. The ELF is also not as good, presumably because it did not use separate subsets for state variables and for observed variables.

The localization values greater than 1 for the observed variables suggest that in some cases, correlation sampling error can cause correlations to be too small, rather than too large. AL13 also found some instances of ELFs with values larger than one for small distances. It was argued that the ELFs were acting as an empirical inflation in those cases.

## 6. Low-order dry dynamical core

The low-resolution GFDL AM2 B-grid dynamical core (GFDL Global Atmospheric Model Development Team 2004) with 30 latitudes, 60 longitudes, and 5 levels that was used in A12 and Anderson et al. (2005) with forcing from Held and Suarez (1994) is used next. As in A12, surface pressure along with wind components and temperature at all five levels are observed every 12 h at 180 approximately regularly spaced latitudes and longitudes. Simulated observational errors are drawn from specified normal distributions for surface pressure (PS), the wind components (*U*, *V*), and temperature *T* (units are m s^{−1} and K, respectively).

The initial truth and ensemble for the experiment are identical to those in A12. Observations are generated from a 200-day truth integration and a 20-member ensemble assimilation is then applied. The first 50 days are discarded, and the final 150 days (300 assimilation times) are used to compute statistics. Baseline cases are run with GC localizations of 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, and 1.6 rad (the last case results in very large RMSE and is not included in Fig. 12).

Corresponding assimilations with CER are also run. The CER cases have no GC localization, but they only allow observations to impact state variables that are within the region where the GC case had nonzero localization (i.e., observations impact state variables within twice the half-width). For the 1.6 rad case, this means that each observation in the CER impacts all state variables.

Subsets of observation–state pairs are a function of the horizontal separation (60 categories), vertical separation (4 categories), observation type (4 categories for PS, *U*, *V*, *T*), and state variable type (4 categories) for a total of 3840 subsets. This is similar to the subsets used for the ELF method in Lei and Anderson (2014a).
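The mapping from an observation–state pair to one of the 60 × 4 × 4 × 4 = 3840 subsets is simple index arithmetic (the category counts come from the text; the ordering of the factors is an illustrative assumption):

```python
def subset_index(h_sep, v_sep, obs_type, state_type,
                 n_h=60, n_v=4, n_obs=4, n_state=4):
    """Map (horizontal separation, vertical separation, observation type,
    state variable type) to a unique subset index in [0, 3840)."""
    assert 0 <= h_sep < n_h and 0 <= v_sep < n_v
    assert 0 <= obs_type < n_obs and 0 <= state_type < n_state
    return ((h_sep * n_v + v_sep) * n_obs + obs_type) * n_state + state_type
```

Each subset then carries its own discrete correlation prior, updated independently as pairs from that subset are assimilated.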

Unlike in A12, no inflation is applied to either the baseline or CER cases in order to highlight the impact of localization as opposed to adaptive inflation. As noted in Anderson et al. (2005), this low-order GCM is unusually robust when used with no inflation. This makes it a convenient tool for looking at the impacts of the CER method without the complications of interaction with the adaptive inflation that is required for general good performance in the L96 applications.

The assimilation quality is evaluated by the spatial and temporal mean of the RMSE for the ensemble mean of the model PS variables [as in (10) with *m* indexing only the *M* = 1800 surface pressure variables from the model]. Results are qualitatively similar for any of the other model variables. Figure 12 shows the time mean PS RMSE as a function of background horizontal GC localization radius half-width for the baseline and CER cases (cf. to Fig. 6 in A12, which has inflation). The CER RMSE is always smaller than that for the corresponding GC case, and the CER values change very little as the impact region increases.

Figure 13 shows the time mean equivalent localization from the CER for the impact of a north–south wind component observation on the model’s middle level on east–west wind state variables on the same level (cf. to Fig. 10 in A12 for sampling error correction and A12’s Fig. 12 for a group filter; Anderson 2007) for the 1.6 half-width case. The maximum localization values are significantly less than one with non-Gaussian structure apparent close to the observation. For larger separations, the CER algorithm has very effectively determined that localization should be close to 0.

## 7. Discussion

The CER algorithm estimates the correlation distribution for different subsets of observation–state variable pairs as the assimilation proceeds.

The CER algorithm has several free parameters that impact performance. The relative weight, *β*, in (9) controls how quickly the prior correlation distribution for a subset can evolve; all experiments here use *β* = 1/10 000.

The second free parameter is the initial prior correlation distribution for each subset; all experiments here start from a uniform distribution on [−1, 1] (section 2).

A third free parameter is the number of subintervals, *S*, in the discrete representation of the correlation PDFs and likelihoods. The computational cost is a linear function of *S*, so smaller is cheaper. Reducing *S* to 40 (subinterval width 0.05) had a negligible impact on any of the CER assimilations. Reducing *S* to 20 led to small increases in RMSE for all experiments.

The number of subsets for which correlation is estimated clearly has significant impact on the results as demonstrated by the case discussed in section 5c. In that case, the correlation distributions required for state variables and observed variables at the same distance were very different. However, it may be possible to combine subsets that do not have such large differences in correlation distributions. To explore this, the experiment in section 5b was repeated with fewer subsets. Cases were done with 2, 3, 4, 5, 7, 10, and 20 pairs of consecutive distances combined into single subsets.

The CER algorithm only considers sampling error in the prior correlation, not in the sample standard deviations. Sample estimates of standard deviation have a small bias, but sample estimates of the quotient of standard deviations in (2) can be more significantly biased. An algorithm similar to the CER can correct for sampling error in the quotient. This algorithm was tested for all cases here, but had a negligible impact for ensemble sizes of 10 or greater. For smaller ensemble sizes, very small reductions in RMSE were found for most cases.

The CER algorithm requires *O*(*S*) multiplications and several sums of *S* elements to normalize PDFs for each pair of an observation and a state variable, where *S* is the number of subintervals used in the discrete representation of PDFs. The cost of the standard filter is *O*(*N*) for each pair, where *N* is the ensemble size. For *N* = 20, the CER algorithm implemented in the Data Assimilation Research Testbed (Anderson et al. 2009) required approximately 1.5 times as much computation as the basic ensemble filter. This ratio can be reduced by noting that the likelihood functions (red curve in Fig. 1) and correlation prior PDFs (Fig. 8) are significantly nonzero only for a fraction of the [−1, 1] correlation range. More sophisticated implementations could only perform multiplications and sums for subintervals in which both are significantly nonzero. Since the correlation PDFs converge fairly rapidly during the assimilation, practical applications, especially for large ensemble sizes, could estimate PDFs for an initial spinup period. After that, a simple lookup table could return the value of the CER correlation given the ensemble sample correlation for a given subset. For subsets where the underlying correlation is always small (e.g., subsets at large horizontal distance in the low-order GCM), it takes fewer than 100 observation–state variable pairs for a 10-member ensemble (and even fewer pairs for larger ensembles) to be able to state confidently that the correlation is very small. An efficient implementation could very quickly identify such subsets and cease to compute updates to reduce computation time.
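The lookup-table idea can be sketched as follows: after spinup, the corrected correlation for each possible sample-correlation subinterval is precomputed once per subset, reducing the per-pair cost to a single array lookup (the names and the midpoint fallback for non-overlapping prior and likelihood are illustrative assumptions):

```python
import numpy as np

def build_lookup(prior_pdf, likelihood):
    """Precompute the CER-corrected correlation for every sample-correlation
    subinterval, given an equilibrated prior PDF for one subset.
    Returns a length-S array: corrected[j] for a sample correlation in bin j."""
    S = prior_pdf.size
    mids = -1.0 + (2.0 * np.arange(S) + 1.0) / S
    corrected = np.empty(S)
    for j in range(S):
        post = prior_pdf * likelihood[:, j]
        total = post.sum()
        # Non-overlapping prior and likelihood: fall back to the bin midpoint
        corrected[j] = mids[j] if total == 0.0 else (mids * post).sum() / total
    return corrected

def corrected_correlation(lookup, sample_corr):
    """O(1) replacement for the per-pair Bayesian product after spinup."""
    S = lookup.size
    j = min(int((sample_corr + 1.0) / 2.0 * S), S - 1)
    return lookup[j]
```

Building the table costs *O*(*S*²) once per subset, which is amortized over the many subsequent pairs drawn from that subset.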

The ELF approach has been used as a benchmark here since it does an a posteriori computation of a good localization. However, as implemented in work to date, the ELF requires an iterative process using a sequence of long observing system simulation experiments, so it is much more expensive and complex than the CER algorithm used here. In addition, the CER provides the capability to introduce a priori information about background correlation distributions; there is no similar capability for the ELF.

## 8. Conclusions and next steps

An algorithm that allows ensemble data assimilation without tuning localization functions has been developed and applied to low-order models. The algorithm assumes that ensemble sampling error in the computation of correlations is the primary source of error in an assimilation. This assumption is clearly false for some applications (like the example in section 4) since the standard ensemble adjustment Kalman filter with a large enough ensemble and no localization is the optimal solution. The fact that the new algorithm is competitive with other empirical methods for computing localization suggests, however, that the assumption may be approximately valid for many applications. Further research in larger models is required to determine if the new algorithm will be effective. For instance, issues related to the interaction of localization and model balance (Greybush et al. 2011; Kepert 2009) are not addressed by the results here (Oke et al. 2007). In addition, all experiments presented here are in situations with no model systematic errors. Realistic applications will include bias and it is possible that tuned a priori localization will be more effective in such cases. Applying the correlation error reduction algorithm to the large atmospheric model applications explored with empirical localization functions in Lei and Anderson (2014a,b) will be the next step.

Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. Thanks to the DART team for support of the code and to Lili Lei, Abhishek Chatterjee, and three anonymous reviewers for constructive comments on earlier drafts.

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.

Anderson, J. L., 2003: A local least squares framework for ensemble filtering. *Mon. Wea. Rev.*, **131**, 634–642, doi:10.1175/1520-0493(2003)131<0634:ALLSFF>2.0.CO;2.

Anderson, J. L., 2007: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111, doi:10.1016/j.physd.2006.02.011.

Anderson, J. L., 2009a: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83, doi:10.1111/j.1600-0870.2008.00361.x.

Anderson, J. L., 2009b: Ensemble Kalman filters for large geophysical applications. *IEEE Contr. Syst. Mag.*, **29**, 66–82, doi:10.1109/MCS.2009.932222.

Anderson, J. L., 2012: Localization and sampling error correction in ensemble Kalman filter data assimilation. *Mon. Wea. Rev.*, **140**, 2359–2371, doi:10.1175/MWR-D-11-00013.1.

Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. *Mon. Wea. Rev.*, **127**, 2741–2758, doi:10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.

Anderson, J. L., and N. Collins, 2007: Scalable implementations of ensemble filter algorithms for data assimilation. *J. Atmos. Oceanic Technol.*, **24**, 1452–1463, doi:10.1175/JTECH2049.1.

Anderson, J. L., and L. Lei, 2013: Empirical localization of observation impact in ensemble Kalman filters. *Mon. Wea. Rev.*, **141**, 4140–4153, doi:10.1175/MWR-D-12-00330.1.

Anderson, J. L., B. Wyman, S. Zhang, and T. Hoar, 2005: Assimilation of surface pressure observations using an ensemble filter in an idealized global atmospheric prediction system. *J. Atmos. Sci.*, **62**, 2925–2938, doi:10.1175/JAS3510.1.

Anderson, J. L., T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Arellano, 2009: The Data Assimilation Research Testbed. *Bull. Amer. Meteor. Soc.*, **90**, 1283–1296, doi:10.1175/2009BAMS2618.1.

Bishop, C. H., and D. Hodyss, 2007: Flow adaptive moderation of spurious ensemble correlations and its use in ensemble based data assimilation. *Quart. J. Roy. Meteor. Soc.*, **133**, 2029–2044, doi:10.1002/qj.169.

Bishop, C. H., and D. Hodyss, 2009: Ensemble covariances adaptively localized with ECO-RAP. Part 1: Tests on simple error models. *Tellus*, **61A**, 84–96, doi:10.1111/j.1600-0870.2008.00371.x.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.

Cavallo, S. M., R. D. Torn, C. Snyder, C. Davis, W. Wang, and J. Done, 2013: Evaluation of the Advanced Hurricane WRF data assimilation system for the 2009 Atlantic hurricane season. *Mon. Wea. Rev.*, **141**, 523–541, doi:10.1175/MWR-D-12-00139.1.

Chen, Y., and D. S. Oliver, 2010: Cross-covariances and localization for EnKF in multiphase flow data assimilation. *Comput. Geosci.*, **14**, 579–601, doi:10.1007/s10596-009-9174-6.

Emerick, A., and A. Reynolds, 2011: Combining sensitivities and prior information for covariance localization in the ensemble Kalman filter for petroleum reservoir applications. *Comput. Geosci.*, **15**, 251–269, doi:10.1007/s10596-010-9198-y.

Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. *J. Multivar. Anal.*, **98**, 227–255, doi:10.1016/j.jmva.2006.08.003.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757, doi:10.1002/qj.49712555417.

GFDL Global Atmospheric Model Development Team, 2004: The new GFDL Global Atmosphere and Land Model AM2–LM2: Evaluation with prescribed SST simulations. *J. Climate*, **17**, 4641–4673, doi:10.1175/JCLI-3223.1.

Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. R. Hunt, 2011: Balance and ensemble Kalman filter localization techniques. *Mon. Wea. Rev.*, **139**, 511–522, doi:10.1175/2010MWR3328.1.

,*Mon. Wea. Rev.***139**, 511–522, doi:10.1175/2010MWR3328.1.Hamill, T. M., , J. S. Whitaker, , and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter.

,*Mon. Wea. Rev.***129**, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.Held, I. M., , and M. J. Suarez, 1994: A proposal for the intercomparison of the dynamical cores of atmospheric general circulation models.

,*Bull. Amer. Meteor. Soc.***75**, 1825–1830, doi:10.1175/1520-0477(1994)075<1825:APFTIO>2.0.CO;2.Houtekamer, P. L., , and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique.

,*Mon. Wea. Rev.***126**, 796–811, doi:10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.Houtekamer, P. L., , X. Deng, , H. L. Mitchell, , S.-J. Baek, , and N. Gagnon, 2014: Higher resolution in an operational ensemble Kalman filter.

,*Mon. Wea. Rev.***142**, 1143–1162, doi:10.1175/MWR-D-13-00138.1.Karspeck, A. R., , S. Yeager, , G. Danabasoglu, , T. Hoar, , N. Collins, , K. Raeder, , J. Anderson, , and J. Tribbia, 2013: An ensemble adjustment Kalman filter for the CCSM4 ocean component.

,*J. Climate***26**, 7392–7413, doi:10.1175/JCLI-D-12-00402.1.Kepert, J. D., 2009: Covariance localisation and balance in an ensemble Kalman filter.

,*Quart. J. Roy. Meteor. Soc.***135**, 1157–1176, doi:10.1002/qj.443.Keppenne, C. L., , M. M. Rienecker, , J. P. Jacob, , and R. Kovach, 2008: Error covariance modeling in the GMAO ocean ensemble Kalman filter.

,*Mon. Wea. Rev.***136**, 2964–2982, doi:10.1175/2007MWR2243.1.Lei, L., , and J. L. Anderson, 2014a: Comparisons of empirical localization techniques for serial ensemble Kalman filters in a simple atmospheric general circulation model.

,*Mon. Wea. Rev.***142**, 739–754, doi:10.1175/MWR-D-13-00152.1.Lei, L., , and J. L. Anderson, 2014b: Empirical localization of observations for serial ensemble Kalman filter data assimilation in an atmospheric general circulation model.

,*Mon. Wea. Rev.***142**, 1835–1851, doi:10.1175/MWR-D-13-00288.1.Li, H., , E. Kalnay, , and T. Miyoshi, 2009a: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter.

,*Quart. J. Roy. Meteor. Soc.***135**, 523–533, doi:10.1002/qj.371.Li, H., , E. Kalnay, , T. Miyoshi, , and C. M. Danforth, 2009b: Accounting for model errors in ensemble data assimilation.

,*Mon. Wea. Rev.***137**, 3407–3419, doi:10.1175/2009MWR2766.1.Lorenz, E. N., , and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model.

,*J. Atmos. Sci.***55**, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.Oke, P. R., , P. Sakov, , and S. P. Corney, 2007: Impacts of localisation in the EnKF and EnOI: Experiments with a small model.

,*Ocean Dyn.***57**, 32–45, doi:10.1007/s10236-006-0088-8.Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation.

,*Tellus***56A**, 415–428, doi:10.1111/j.1600-0870.2004.00076.x.Reichle, R. H., , D. B. McLaughlin, , and D. Entekhabi, 2002: Hydrologic data assimilation with the ensemble Kalman filter.

,*Mon. Wea. Rev.***130**, 103–114, doi:10.1175/1520-0493(2002)130<0103:HDAWTE>2.0.CO;2.Rosolem, R., , T. Hoar, , A. Arellano, , J. L. Anderson, , W. J. Shuttleworth, , X. Zeng, , and T. E. Franz, 2014: Assimilation of near-surface cosmic-ray neutrons improves summertime soil moisture profile estimates at three distinct biomes in the USA.

,*Hydrol. Earth Syst. Sci. Discuss.***11**, 5515–5558, doi:10.5194/hessd-11-5515-2014.Shaman, J., , and A. Karspeck, 2012: Forecasting seasonal outbreaks of influenza.

,*Proc. Natl. Acad. Sci. USA***109**, 20 425–20 430, doi:10.1073/pnas.1208772109.Tippett, M. K., , J. L. Anderson, , C. H. Bishop, , T. M. Hamill, , and J. S. Whitaker, 2003: Ensemble square root filters.

,*Mon. Wea. Rev.***131**, 1485–1490, doi:10.1175/1520-0493(2003)131<1485:ESRF>2.0.CO;2.Whitaker, J. S., , and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation.

,*Mon. Wea. Rev.***140**, 3078–3089, doi:10.1175/MWR-D-11-00276.1.Whitaker, J. S., , T. M. Hamill, , X. Wei, , Y. Song, , and Z. Toth, 2008: Ensemble data assimilation with the NCEP Global Forecast System.

,*Mon. Wea. Rev.***136**, 463–482, doi:10.1175/2007MWR2018.1.Zhang, F., , C. Snyder, , and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter.

,*Mon. Wea. Rev.***132**, 1238–1253, doi:10.1175/1520-0493(2004)132<1238:IOIEAO>2.0.CO;2.Zhang, Y., , and D. S. Oliver, 2010: Improving the ensemble estimate of the Kalman gain by bootstrap sampling.

,*Math. Geosci.***42**, 327–345, doi:10.1007/s11004-010-9267-8.Zhu, K., , Y. Pan, , M. Xue, , X. Wang, , J. S. Whitaker, , S. G. Benjamin, , S. S. Weygandt, , and M. Hu, 2013: A regional GSI-based ensemble Kalman filter data assimilation system for the rapid refresh configuration: Testing at reduced resolution.

,*Mon. Wea. Rev.***141**, 4118–4139, doi:10.1175/MWR-D-13-00039.1.