## 1. Introduction

Ensemble Kalman filters were developed for data assimilation in oceanic and atmospheric applications during the 1990s (Evensen 1994; Burgers et al. 1998). Basic ensemble filters worked well for low-order models, but performed poorly or diverged from the observed system when applied to large geophysical models. Houtekamer and Mitchell (1998) determined that small ensembles could not accurately estimate the small correlations between a state variable and a physically remote observation. To reduce errors, they did not use observations that were further than a cutoff distance from a state variable; this procedure is called localization. A properly tuned localization allows ensemble filters with fewer than 100 members to work with the largest atmosphere and ocean models. However, tuning the cutoff distance for a particular application is expensive.

Theoretically motivated functions that approximate the spatial covariance of geophysical variables for data assimilation purposes were described by Gaspari and Cohn (1999). Data assimilation experiments with variational methods (Courtier et al. 1998) and Kalman filters (Lyster et al. 1997) using these functions were already under way and it was a natural extension to apply them in ensemble Kalman filters. The compactly supported polynomial approximation of a normal probability distribution in Gaspari and Cohn (1999) became the standard solution for localizing in the horizontal in atmospheric applications of the ensemble Kalman filter. Only a single real coefficient defining the width of the localization must be tuned, but even this can be expensive for large models.

Localization in the vertical is also required for good filter performance in atmospheric applications. In this case, there is less theoretical foundation for choosing a localization and a number of functions have been used (Whitaker et al. 2004; Houtekamer and Mitchell 2005). To further increase the complexity of tuning, ensemble filter performance is also improved when different localizations are used for the impact of different observation types (Houtekamer and Mitchell 2005; Tong and Xue 2005). An additional challenge is that increasingly strong localization of observation impacts can lead to increasingly unbalanced posterior model states and consequent transient nonequilibrium oscillations such as gravity waves in atmospheric models (Mitchell et al. 2002; Kepert 2009). This complexity motivates the design of algorithms to automatically tune localizations.

Developing a better understanding of why localization is needed for good filter performance is important for improving filter algorithms. Bishop and Hodyss (2007) were among the first to propose a method for adaptive ensemble covariance localization. Anderson (2007) suggested that localization was primarily needed because of sampling error and proposed a method using a group of ensemble filters to detect and correct for this error. The nature of sampling error in ensemble filters is explored further here leading to an algorithm that computes localization as a function of ensemble sample correlation and ensemble size. A number of other studies including Chen and Oliver (2009) and Bishop and Hodyss (2009a,b) have related localization to the correlation between an observation and a state variable.

Section 2 discusses methods for automatically tuning localization and hypothesizes that localization is required because of sampling error. Section 3 proposes an algorithm for reducing sampling error and section 4 presents results in two low-order models and the dynamical core of an atmospheric circulation model.

## 2. Sampling error in ensemble filters

The Kalman filter is the optimal solution to a data assimilation problem with a linear forecast model, linear observation operators, and normal observational error. A deterministic ensemble Kalman filter like the ensemble adjustment Kalman filter (EAKF; Anderson 2001) with an ensemble size that exceeds a threshold is simply an algorithm for computing the Kalman filter solution (Anderson 2009b). The ensemble sample covariance of an observation and a state vector component is the optimal estimate. When the EAKF is used with an ensemble that is too small, or any of the conditions for the Kalman filter to be optimal are violated, the ensemble sample covariance is no longer guaranteed to be optimal. In addition, stochastic ensemble Kalman filters like the perturbed observation filter (Burgers et al. 1998) are subject to sampling error for any ensemble size even when the Kalman filter is an optimal solution.

Localization is a standard algorithm for reducing the impact of errors in ensemble Kalman filters (Houtekamer and Mitchell 1998; Hamill et al. 2001; Furrer and Bengtsson 2007). The regression coefficient, or gain, relating ensemble increments for observed quantity **y** to increments for state variable **x** is multiplied by a factor between 0 and 1 called a localization. The localization is often a function of the physical distance between **y** and **x**. A common form for this function is a compactly supported polynomial approximation to a normal known as the Gaspari–Cohn (GC) function (Gaspari and Cohn 1999). The GC function works well for many applications where localization in the horizontal is important, but specifying localization in the vertical is more challenging (Whitaker et al. 2004). In addition, some observations, for instance a satellite radiance (Houtekamer and Mitchell 2005) or a Constellation Observing System for Meteorology Ionosphere and Climate (COSMIC) radio occultation (Liu et al. 2007), are not associated with a unique spatial location. Campbell et al. (2010) explored localization functions for observations, like satellite radiances, that have forward operators that perform weighted averages of a large number of state variables. As noted in Anderson (2007) localizations for such variables are expected to be quite different from a GC function in some cases. Good localizations may also be a function of the types of the observation and the state variable; for instance, an observation of temperature might require different localizations for temperature and wind state variables (Anderson 2007). Finally, localization is also expected to be a function of the time difference between an observation and a state variable (Anderson 2007); this is relevant for ensemble Kalman smoothing. Some ensemble filters like the local ensemble transform Kalman filter (Ott et al. 2004) implicitly localize by implementing the assimilation algorithm on local patches although they can be modified to have more general localization (Miyoshi et al. 2007).
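As an illustration of the distance-based localization described above, the following sketch evaluates the fifth-order piecewise rational function of Gaspari and Cohn (1999). The function name `gaspari_cohn` and the convention that the function reaches zero at twice the half-width are assumptions made for this example, not details given in the text.

```python
import numpy as np

def gaspari_cohn(dist, half_width):
    """Fifth-order piecewise rational GC localization.

    Assumed convention: the value is 1 at zero distance and exactly 0
    for distances of 2 * half_width or more.
    """
    z = np.abs(np.atleast_1d(dist).astype(float)) / half_width
    loc = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi = z[inner]
    loc[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                  - (5.0 / 3.0) * zi**2 + 1.0)
    zo = z[outer]
    loc[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                  + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return loc
```

For example, `gaspari_cohn([0.0, 0.1, 0.5], half_width=0.2)` returns one localization factor per distance, with the last entry equal to zero because it lies beyond the cutoff.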

There are a number of techniques to adapt to or correct for errors in ensemble Kalman filter sample covariances. Mitchell and Houtekamer (2000) developed a model of model error and compared prior statistics to observations. Their algorithm improved filter performance and provided estimates of the error terms. Li et al. (2009b) explored several models of model error and compared their efficacy in perfect model assimilations with and without inflation of ensemble priors. Li et al. (2009a) presented a method to simultaneously correct for a lack of variance in prior ensembles and estimate the error variance of observations.

Methods for dynamically computing localization have been developed. In an algorithm closely related to the one presented here, Emerick and Reynolds (2010) based localization on observation sensitivity matrices in a petroleum reservoir application. A similar algorithm, where localization is related to the prior ensemble covariance between an observation and a state variable, is presented in Chen and Oliver (2009). Bishop and Hodyss (2007, 2009a,b) present algorithms where localization is equal to a power of the sample correlation of a smoothed ensemble.

Anderson (2007) hypothesized that most of the errors necessitating localization in atmospheric applications were literally sampling error (i.e., dependent on arbitrary choices in the selection of a finite initial ensemble). Running several ensemble filters, differing only in the initial ensemble, produces a sample of regression coefficients (gains) for an observation **y** and a state vector component **x**. A localization that minimizes the expected, ensemble mean, root-mean-square error for the state variable component, given the set of regression coefficients, can be computed analytically. Time mean values of this group filter localization are similar to the best heuristically derived localizations for a variety of applications. However, this method ignores the possibility that the regression coefficients from the group of filters are biased compared to the optimal value. A similar approach that estimates errors in ensemble priors by splitting the ensemble into pieces is presented in Houtekamer and Mitchell (1998). A bootstrap method presented in Zhang and Oliver (2010) for estimating errors in ensemble covariances is also similar.
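The analytic group filter localization can be sketched as follows. The text does not state the objective being minimized, so this example assumes one plausible reading: the localization *α* minimizes the summed squared difference between each group's localized regression coefficient and the coefficients of all other groups. The function name `group_localization` and the clipping to [0, 1] are choices made for this illustration.

```python
import numpy as np

def group_localization(b):
    """Localization from a group of regression coefficients.

    Assumed objective: minimize sum_i sum_{j != i} (alpha * b_i - b_j)^2.
    Setting the derivative to zero gives
    alpha = ((sum b)^2 / sum b^2 - 1) / (m - 1).
    """
    b = np.asarray(b, dtype=float)
    m = b.size
    sum_sq = np.sum(b**2)
    if m < 2 or sum_sq == 0.0:
        return 0.0
    alpha = (np.sum(b) ** 2 / sum_sq - 1.0) / (m - 1)
    # Negative values mean the groups disagree on the sign; use no impact.
    return float(np.clip(alpha, 0.0, 1.0))
```

When all groups agree the result is 1, and it falls toward 0 as the coefficients scatter around zero, which matches the qualitative behavior described for remote observations.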

The group filter algorithm is expensive because it requires several ensemble filter assimilations. Anderson (2007) suggested that the time mean localization from a short group filter assimilation could be used as a static localization for a traditional ensemble filter. The next section describes a less costly algorithm for estimating sampling error using a single ensemble filter. This algorithm assumes that the ensemble filter is more similar to a true Monte Carlo algorithm in which the ensemble is a random draw from some underlying distribution than to a deterministic exact Kalman filter algorithm. If applied in the case where the ensemble filter is the optimal solution, the algorithm can only degrade performance. However, it may lead to improved performance in large geophysical applications with small ensemble sizes.

## 3. Sampling error correction algorithm

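Describing the impact of a single observed quantity **y** on a single state variable component **x** is sufficient to define all commonly used ensemble filter algorithms without loss of generality (see Anderson 2003 for a derivation). In this paper, a localization is defined as a factor that multiplies the regression coefficient that is used to compute increments to the prior estimate of **x** given increments for **y**,

$$\Delta \mathbf{x}_n = \alpha \hat{b}\, \Delta \mathbf{y}_n, \qquad n = 1, \ldots, N, \tag{1}$$

where *α* is a localization, *n* indexes the *N* ensemble members, Δ**y**_{n} is the increment for the *n*th ensemble estimate of the observed quantity, Δ**x**_{n} is the corresponding increment for the *n*th ensemble estimate of the state variable component, and $\hat{b}$ is the sample regression coefficient computed from the prior ensemble estimates of **x** and **y**.

Suppose that the *N*-member sample regression coefficient $\hat{b}$ is a random draw from a distribution with mean $\bar{b}$ and standard deviation $\sigma_{b,N}$; a nonzero $\sigma_{b,N}$ indicates that there is sampling error in the ensemble filter. The algorithm computes a localization *α* such that multiplying a random draw *z* from $\mathrm{Normal}(\bar{b}, \sigma_{b,N}^2)$ by *α* minimizes the expected RMS difference between *αz* and the maximum likelihood estimate $\bar{b}$ of the regression coefficient for **x**. Minimizing requires that

$$\frac{\partial}{\partial \alpha} \int_{-\infty}^{\infty} (\alpha z - \bar{b})^2\, p(z)\, dz = 0, \tag{3}$$

where

$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma_{b,N}} \exp\left[-\frac{(z - \bar{b})^2}{2\sigma_{b,N}^2}\right].$$

Taking the partial derivative in (3) results in

$$\alpha \int_{-\infty}^{\infty} z^2 p(z)\, dz - \bar{b} \int_{-\infty}^{\infty} z\, p(z)\, dz = 0.$$

Using the integral identities

$$\int_{-\infty}^{\infty} z\, p(z)\, dz = \bar{b}$$

and

$$\int_{-\infty}^{\infty} z^2 p(z)\, dz = \bar{b}^2 + \sigma_{b,N}^2$$

gives

$$\alpha \left(\bar{b}^2 + \sigma_{b,N}^2\right) - \bar{b}^2 = 0.$$

Solving for *α* gives

$$\alpha = \frac{Q^2}{1 + Q^2}, \tag{9}$$

where *Q* is the ratio of the mean regression to the standard deviation:

$$Q = \frac{\bar{b}}{\sigma_{b,N}}.$$

A basic sampling error correction algorithm would proceed as follows. First, the sample regression coefficient is computed. Then a sampling error correction localization is computed with (9). The state variable ensemble is then updated using (1). To compute (9), one needs to compute $\bar{b}$ and $\sigma_{b,N}$ given the ensemble sample $\hat{b}$. It is convenient to work with the correlation instead, because the correlation *r* is independent of the relative scale of the priors and is bounded on [−1, 1]. Let $\bar{r}$ and $\sigma_{r,N}$ be the mean and standard deviation of the distribution from which the sample correlation $\hat{r}$ is drawn. Treating the prior ensemble sample standard deviations $\hat{\sigma}_x$ and $\hat{\sigma}_y$ as fixed scale factors, so that $\hat{b} = \hat{r}\,\hat{\sigma}_x / \hat{\sigma}_y$, the regression statistics follow from the correlation statistics as

$$\bar{b} = \frac{\hat{\sigma}_x}{\hat{\sigma}_y}\, \bar{r} \tag{12}$$

and

$$\sigma_{b,N} = \frac{\hat{\sigma}_x}{\hat{\sigma}_y}\, \sigma_{r,N}. \tag{13}$$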
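The closed form for the localization in (9) can be checked numerically. The sketch below assumes that *z* is normally distributed with a mean regression coefficient `b_mean` and standard deviation `b_std` (both values are arbitrary choices for the illustration), evaluates the expected squared difference between *αz* and the mean on a grid of *α* values, and compares the minimizer with *Q*²/(1 + *Q*²).

```python
import numpy as np

def sec_localization(b_mean, b_std):
    """alpha = Q^2 / (1 + Q^2) with Q = b_mean / b_std.

    Written as b_mean^2 / (b_mean^2 + b_std^2) so that b_std = 0 is safe.
    """
    denom = b_mean**2 + b_std**2
    return b_mean**2 / denom if denom > 0.0 else 0.0

rng = np.random.default_rng(0)
b_mean, b_std = 0.6, 0.4            # illustrative values only
z = rng.normal(b_mean, b_std, size=200_000)

alphas = np.linspace(0.0, 1.0, 1001)
# E[(alpha z - b_mean)^2] expanded in the sample moments of z.
cost = alphas**2 * np.mean(z**2) - 2.0 * alphas * b_mean * np.mean(z) + b_mean**2
alpha_numeric = alphas[np.argmin(cost)]
alpha_formula = sec_localization(b_mean, b_std)
```

The two values agree to within the Monte Carlo and grid resolution error. The formula also makes the limiting behavior explicit: *α* approaches 1 when the standard deviation is small relative to the mean and approaches 0 when the mean is small relative to the standard deviation.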

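For all results shown here, the prior information assumed about *r* is that it is uniformly distributed on [−1, 1]. This is a relatively uninformative prior. If more detailed information about the prior is available, for instance that the correlation is greater than 0.5, it could be used (see section 5). An additional complication is that the sample correlation coefficient is a biased estimate of the actual correlation.

An offline Monte Carlo computation provides $\bar{r}$ and $\sigma_{r,N}$ given the sample correlation $\hat{r}$ that corresponds to the sample regression coefficient. A set of *K* correlation values is selected that equally partition the interval [−1, 1] with

$$r_k = -1 + \frac{2(k - 1)}{K - 1}, \qquad k = 1, \ldots, K. \tag{14}$$

For results shown here, *K* = 201. For each $r_k$, *M* random samples (*M* = 10^{8} for all results here) of size *N* are drawn from a bivariate normal distribution with covariance

$$\begin{pmatrix} 1 & r_k \\ r_k & 1 \end{pmatrix},$$

and the sample correlation of each is computed, giving a total of *K* × *M* sample correlation values:

$$\hat{r}_{k,i}, \qquad k = 1, \ldots, K, \quad i = 1, \ldots, M.$$

The sample correlations are divided into *K* subsets $R_k$ where the *k*th subset contains all sample correlations that are closer to $r_k$ than to any other value $r_j$ from the set defined in (14). Each sample correlation is associated with the actual correlation $r_k$ of the distribution from which it was drawn. The mean and standard deviation of the actual correlations in each subset are computed. Suppose that an observation and a state variable have sample correlation $\hat{r}$. Then $\bar{r}$ and $\sigma_{r,N}$ are the mean and standard deviation of the actual correlations in the subset $R_k$ for the value of $r_k$ that is closest to $\hat{r}$.

The final sampling error correction algorithm proceeds as follows: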

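- The ensemble sample regression coefficient $\hat{b}$ and correlation $\hat{r}$ are computed.
- The mean $\bar{r}$ and standard deviation $\sigma_{r,N}$ are obtained from the offline computation outlined in the previous paragraph given $\hat{r}$.
- The mean regression coefficient $\bar{b}$ and its standard deviation $\sigma_{b,N}$ are computed from (12) and (13).
- *α* is computed from (9).
- Increments for **x** are computed as

$$\Delta \mathbf{x}_n = \alpha \bar{b}\, \Delta \mathbf{y}_n = \alpha \frac{\bar{r}}{\hat{r}}\, \hat{b}\, \Delta \mathbf{y}_n.$$

The term $S = \alpha \bar{r} / \hat{r}$ is the product of the localization and the bias correction and is referred to as the sampling error correction (SEC) hereafter.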
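The steps above can be sketched for a single scalar observation and a single state variable. This is a minimal illustration, not the implementation used for the experiments: it assumes the standard scalar EAKF observation-space update, and it takes the SEC as a caller-supplied function `sec_factor` of the sample correlation (a hypothetical stand-in for the offline table lookup).

```python
import numpy as np

def eakf_obs_increments(y_ens, y_obs, obs_var):
    """Observation-space increments for a scalar EAKF update."""
    prior_mean = y_ens.mean()
    prior_var = y_ens.var(ddof=1)
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + y_obs / obs_var)
    # Shift the mean and contract the spread deterministically.
    y_post = np.sqrt(post_var / prior_var) * (y_ens - prior_mean) + post_mean
    return y_post - y_ens

def sec_state_increments(x_ens, y_ens, dy, sec_factor):
    """Regress observation increments onto x, scaled by the SEC."""
    cov = np.cov(x_ens, y_ens, ddof=1)
    b_hat = cov[0, 1] / cov[1, 1]                       # sample regression
    r_hat = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])  # sample correlation
    s = sec_factor(r_hat)                               # hypothetical lookup
    return s * b_hat * dy
```

Passing `sec_factor=lambda r: 1.0` recovers the unlocalized regression, which makes it easy to compare updates with and without the correction on the same ensemble.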

Values of *S* as a function of ensemble size and ensemble sample correlation can be computed offline and referenced during an assimilation. Figure 1 shows *S* as a function of the absolute value of the sample correlation. *S* is smaller for small ensembles and small correlations, indicating that the relative sampling error is larger in these cases.
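A reduced version of the offline computation can be sketched as follows. It assumes the uniform prior on the correlation, unit-variance bivariate normal samples, and a far smaller number of samples per correlation value (`m`) than the 10^{8} used for the results, so the resulting table is noisy. The function name `sec_table` and its arguments are choices made for this illustration.

```python
import numpy as np

def sec_table(n_ens, k=201, m=1000, seed=0):
    """Monte Carlo estimate of S on a grid of k sample correlations."""
    rng = np.random.default_rng(seed)
    r_grid = np.linspace(-1.0, 1.0, k)
    count = np.zeros(k)
    sum_r = np.zeros(k)
    sum_r2 = np.zeros(k)
    for r_true in r_grid:
        # m ensembles of size n_ens with actual correlation r_true.
        a = rng.standard_normal((m, n_ens))
        e = rng.standard_normal((m, n_ens))
        b = r_true * a + np.sqrt(1.0 - r_true**2) * e
        a = a - a.mean(axis=1, keepdims=True)
        b = b - b.mean(axis=1, keepdims=True)
        r_hat = (a * b).sum(axis=1) / np.sqrt(
            (a**2).sum(axis=1) * (b**2).sum(axis=1))
        # Assign each sample correlation to the nearest grid value.
        idx = np.rint((r_hat + 1.0) * (k - 1) / 2.0).astype(int)
        hits = np.bincount(np.clip(idx, 0, k - 1), minlength=k)
        count += hits
        sum_r += r_true * hits
        sum_r2 += r_true**2 * hits
    with np.errstate(invalid="ignore", divide="ignore"):
        r_bar = sum_r / count                      # mean actual correlation
        var_r = np.maximum(sum_r2 / count - r_bar**2, 0.0)
        alpha = r_bar**2 / (r_bar**2 + var_r)      # Q^2 / (1 + Q^2)
        s = np.where(np.abs(r_grid) > 1e-12, alpha * r_bar / r_grid, np.nan)
    return r_grid, s
```

Calling `sec_table(10)` and `sec_table(80)` and comparing the entries near a sample correlation of 0.5 shows the dependence on ensemble size described in the text. Entries for empty subsets and for a sample correlation of exactly zero are returned as NaN in this sketch.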

## 4. Results

Perfect model assimilation experiments with increasingly complex models are used to evaluate the SEC algorithm. In a perfect model experiment, a single long run of the numerical forecast model, referred to as the “truth” run here, is used as an analog for an evolving physical system. Synthetic observations of this long run are generated by applying forward operators to the state vector from the truth run and adding in random samples from a specified observational error distribution.

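The quality of an assimilation is measured by the root-mean-square error (RMSE) of the ensemble mean,

$$\mathrm{RMSE} = \sqrt{\frac{1}{M} \sum_{m=1}^{M} \left(\bar{x}_m - x_{m,t}\right)^2}, \tag{18}$$

where $\bar{x}_m$ is the ensemble mean, **x**_{m,t} is the true value, and the subscript *m* indexes the *M* model state variables.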
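The error measure can be computed as in the sketch below, which assumes the ensemble is stored as an array with one row per member and one column per state variable. The spread definition (square root of the mean ensemble variance) is an assumption for this example, since the text does not define spread explicitly.

```python
import numpy as np

def ensemble_mean_rmse(ens, truth):
    """RMSE of the ensemble mean over all state variables.

    ens has shape (N, M): N members, M state variables.
    truth has shape (M,).
    """
    return float(np.sqrt(np.mean((ens.mean(axis=0) - truth) ** 2)))

def ensemble_spread(ens):
    """Assumed definition: root of the mean ensemble sample variance."""
    return float(np.sqrt(np.mean(ens.var(axis=0, ddof=1))))
```

Time mean values like those reported in the figures would then be averages of these quantities over the assimilation times retained for statistics.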

### a. Simple linear model

The first test uses a linear model with 200 state variables that evolve independently. For *N* > 200 the EAKF has no sampling error and produces the Kalman filter solution. For *N* ≤ 200, the EAKF solution diverges, leading to unbounded growth in the ensemble estimates of the state. The optimal localization for this problem is a delta function so that the *i*th observation impacts only the *i*th state variable because the state variables evolve independently.

Ensemble assimilations are performed for 100 000 steps. Figure 2 displays the time mean root-mean-square error [(18)] of the ensemble mean averaged over all 200 state vector components for various ensemble sizes using the SEC algorithm. The dashed horizontal line indicates both the spread and the expected RMSE of the optimal solution from a 201-member filter with no SEC. The largest SEC assimilation ensemble shown has 201 members and an RMSE larger than the optimal value. All of the smaller ensembles would diverge without SEC and their RMSE increases as ensemble size decreases. Even with SEC, ensembles smaller than 50 generally diverged during the 100 000 steps. Figure 2 also displays the ensemble spread. It is larger than the optimal spread in all cases and is larger than the corresponding RMSE for ensemble sizes greater than 100.

Figure 3 displays the time mean estimate of the SEC averaged over all pairs of observations and state variables that are not collocated; the optimal value in this case is 0. The average SEC is very close to 0 for ensemble sizes of 201 and 200 and increases to 0.4 as ensemble size decreases to 50. These nonzero values reflect the inability of the SEC algorithm to accurately determine that all nonzero correlations are spurious in this case. However, the sampling error is systematically decreased for all ensemble sizes greater than 50. The value of SEC for all collocated pairs of observations and state variables is 1 in this case. Figure 3 also displays the time mean value of the adaptive inflation averaged over all state variables. As the ensemble gets smaller, more inflation is needed to offset the erroneous loss of variance that occurs when observations improperly impact state variables that are not collocated.

### b. Lorenz-96 40-variable model

The Lorenz-96 model is configured with 40 state variables, forcing *F* = 8.0, a time step of 0.05 units, and the fourth-order Runge–Kutta time differencing scheme. At each assimilation time, 20 observations that are equally spaced in the model domain are assimilated. Observations are generated from a 110 000 step control integration with observational error simulated by adding a random draw from Normal(0, 1). The initial ensemble is composed of random draws from the long run. The first 10 000 steps of the assimilation are discarded and results are the average of the last 100 000 steps. A background GC localization is used in most experiments, both with and without SEC, with a half-width expressed as a fraction of the model domain.
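The model configuration above can be sketched with the standard Lorenz-96 equations and a fourth-order Runge–Kutta step. The forward operator `observe_pairs`, which averages adjacent pairs of state variables to give 20 observations, is only an assumed form for illustration; the text says that each observation averages two state variables but the exact operator is not given here.

```python
import numpy as np

def l96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, cyclic in i."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def l96_step(x, dt=0.05, forcing=8.0):
    """Advance the state one time step with fourth-order Runge-Kutta."""
    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = l96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = l96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def observe_pairs(x, obs_sd=1.0, rng=None):
    """Hypothetical forward operator: average each adjacent pair of
    state variables, optionally adding Normal(0, obs_sd^2) error."""
    y = 0.5 * (x[0::2] + x[1::2])
    if rng is not None:
        y = y + rng.normal(0.0, obs_sd, size=y.shape)
    return y
```

A truth run is obtained by perturbing the fixed point `x = F` slightly and calling `l96_step` repeatedly; synthetic observations follow from `observe_pairs(x, rng=np.random.default_rng(0))` at each assimilation time.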

Figure 4 displays the time mean ensemble mean RMSE [(18)] as a function of the half-width of the GC localization for ensemble sizes of 10, 20, and 40 with and without SEC. For 10-member ensembles, the SEC RMSE is larger for GC half-width less than 0.3 but smaller for larger half-widths. Without SEC, the filter blows up for GC half-widths greater than 0.5. Both filters with and without SEC give the smallest RMSE for GC half-width 0.2 but the case without SEC has a significantly smaller RMSE.

Qualitatively, the behavior for 20-member ensembles is similar. The case without SEC has smallest RMSE for GC half-width 0.3 while the case with SEC is smallest for GC half-width 0.5. The SEC case has larger RMSE for GC half-width less than 0.7 and smaller for larger half-widths. Again, the smallest overall RMSE is for the case without SEC. Finally, for 40-member ensembles the case without SEC has smaller RMSE for all GC half-widths with the largest differences from the SEC case being for smaller half-widths.

The SEC degrades filter performance for many of the Lorenz-96 cases. Only for large GC halfwidths and small ensemble sizes does it reduce the RMSE. Figure 5 shows time mean values of *S* for a particular observation location and 10-, 20-, and 40-member ensembles with no background GC localization. The largest values of *S* are for the two state variables that are averaged in the forward operator for this observation. For 10 members, the maximum value of *S* is approximately 0.75 and this increases to nearly 0.9 for 40 members. For state variables far from the observation, *S* has a minimum of about 0.2 for all three ensemble sizes; the figure includes the 10-member curve on the plots of 20- and 40-member results to facilitate this comparison.

Figure 5 also shows the time mean localization that results from applying a group filter (Anderson 2007) with 4 groups for each ensemble size; for example, 4 times 10 model forecasts are used for the group results in Fig. 5a. The group filter localization is larger than the SEC near the observation location and smaller for remote state variables. The differences between the group filter and SEC time means become smaller, especially close to the observation, as the ensemble size increases. These results suggest that the SEC is giving too little weight to observations close to a state variable, and too much weight to observations that are remote. As the ensemble size gets large, the difference becomes predominantly giving too much weight to remote observations.

The top panel of Fig. 5 also shows a GC function for comparison. With the exceptions of the long tails, the GC is fairly similar to the group filter time mean localizations. The GC localizations that lead to the smallest RMSE results for Lorenz-96 are quite similar to the corresponding group filter time mean localizations. The best SEC cases are not quite as good because state variables receive too much impact from nearly unrelated distant observations and too little from nearby observations. Nevertheless, using the SEC does stabilize the RMSE for larger background GC cases for the 10- and 20-member ensembles. In particular, the SEC leads to the 10-member case being stable even for no background GC, while the base case becomes unstable for GC greater than 0.5.

The similarity between the 40-member SEC time mean in Fig. 5 and the four group filter result suggests that the SEC is an inexpensive way to estimate time mean localization that could then be used in a standard filter. Anderson (2007) explored the impact of using time mean localizations from a group filter in this way.

### c. Low-order dry dynamical core

A low-resolution version of the B-grid dynamical core of the Geophysical Fluid Dynamics Laboratory Atmospheric Model version 2.0 (GFDL AM2) general circulation model (GFDL Global Atmospheric Model Development Team 2004) with 30 latitudes, 60 longitudes, and 5 levels is used for perfect model assimilation experiments. The model is forced with a pole to equator temperature gradient as described in Held and Suarez (1994) and is the same model used in Anderson et al. (2005) to examine assimilation of only surface pressure (PS) observations. The model has midlatitude baroclinic instability and a total of 28 800 variables.

… Normal(0, …^{2}) to PS (units are Pa), and Normal(0, 3^{2}) to wind components and temperature (units are m s^{−1} and K).

A 100-yr free run of the model starting from no motion is used to generate a climatological sample from the model attractor. This preliminary initial condition is extended an additional 150 days during which synthetic observations are generated. A 320-member ensemble is generated by adding small perturbations to the preliminary initial condition and a preliminary ensemble assimilation with no localization is done for these 150 days.

The end of the 150-day control integration is the initial condition for the ensemble assimilation experiments. This initial condition is integrated for an additional 200 days during which observations are generated. The first *N* members from the final state of the 320-member preliminary assimilation are used as initial conditions for an ensemble assimilation with *N* members. This *N* member ensemble assimilation is then applied to the next 200 days of observations, the first 50 days are discarded, and the final 150 days (300 assimilation times) are used to compute statistics.

The relative quality of filter solutions is measured by the spatial and temporal mean of the RMSE for the ensemble mean of the model PS variables [(18) with *m* indexing only the *M* = 1800 surface pressure variables from the model]. Results are qualitatively similar for any of the other model variables. Figure 6 shows the time mean PS RMSE as a function of background horizontal GC localization radius half-width for 20-, 40-, and 80-member ensembles with and without SEC. For 20- and 40-member ensembles, the SEC has slightly larger RMSE for GC localization of 0.2, but smaller RMSE for all other GC values. For 80 members, the SEC has slightly larger RMSE for GC values less than 1, but smaller RMSE for larger values of GC. For all ensemble sizes, the absolute minimum of RMSE is for a SEC case. For larger values of GC, the SEC RMSE is much smaller than for cases without SEC. Since exhaustively tuning the GC half-width becomes prohibitively expensive in large models, the fact that applying the SEC makes the RMSE less sensitive to GC values is potentially useful.

Figure 7 displays the spatial and temporal mean of the ratio of ensemble spread to RMSE for PS as a function of GC half-width for 20- and 80-member ensembles with and without SEC. For small GC, cases with and without SEC have too much spread. As the GC half-width is increased, the ratio decreases for all four cases. For values of GC for which the RMSE was smaller in the SEC case, the ratio of spread to RMSE is closer to 1 for the SEC case. It is not surprising that the SEC spread is less deficient for large GC half-width since each state variable is being affected by many remote observations that are only weakly related and the SEC reduces some of this noise.

Figure 8 shows the spatial and temporal mean of the adaptive inflation applied to the PS field as a function of GC half-width for the 20- and 80-member ensemble cases. For small GC, inflation values in all cases are very close to 1 since the ensemble has sufficient spread. As the GC half-width increases, inflation increases in all cases. The increase is faster for the 20-member ensembles and for the cases without SEC, again consistent with the spread ratios in Fig. 7. The 80-member SEC case with GC half-width of 3 requires mean inflation of about 1.05, while the case without requires inflation of 1.7. The smaller inflation for the SEC cases suggests that the SEC is acting mostly to remove noise while retaining much of the signal in the assimilation.

A number of studies have pointed out that ensemble assimilation can lead to unbalanced posterior states that result in transient nonequilibrium oscillations such as gravity waves in atmospheric models (Mitchell et al. 2002; Kepert 2009; Oke et al. 2007; Greybush et al. 2011). One common measure of imbalance in primitive equation models is the RMS of the time tendency of the PS (Fillion et al. 1995). The spatial mean of the time tendency of PS for the first model time step after each assimilation is averaged to measure imbalance. Figure 9 shows this measure as a function of GC half-width for the 20- and 80-member ensembles. The figure also shows a baseline value obtained from the last 10 years of the free model run that generated the preliminary initial condition.

In general, the 20-member ensemble cases are much less balanced than their 80-member counterparts (Fig. 9). For all but the largest values of GC half-width, the SEC cases are more unbalanced than the corresponding case without SEC. The smallest imbalance occurs for intermediate values of the background GC. For very small GC, imbalance is occurring because strongly correlated state variables that are close to one another may be impacted quite differently by a legitimately correlated observation that is slightly closer to one of them. For large GC, the introduction of noise due to accidental correlation between observations and distant state variables leads to imbalance.

Figure 10 shows the time mean SEC for the impact of a PS observation near the equator on PS state variables for an 80-member ensemble with a 1.2-rad background GC half-width. For the closest state variables, the SEC is very close to 1 while for distant state variables, the values are between 0.3 and 0.4. One-dimensional cross sections are roughly symmetric about the observation and similar in shape to those for Lorenz-96 in Fig. 5. Results for smaller ensembles (not shown) are similar in shape but have smaller values near the observations and roughly the same values at larger distances. As in the Lorenz-96 case, the SEC is somewhat similar in shape to a GC function in the horizontal.

Figure 11 shows the time mean SEC for the impact of a north–south wind component observation on the model’s middle level on east–west wind state variables on the same level. For remote observations (not shown in the limited domain of the figure) the SEC values are roughly the same as for the PS observation in Fig. 10. However, close to the observation there is a more complicated pattern with four independent maxima. Interestingly, the largest values of SEC are not for the state variables that are closest to the observation in this case. Also, the maximum values are not nearly as large as those in Fig. 10. This is consistent with results in Anderson (2007) suggesting that localization needs to be applied not only as a function of the spatial relation between an observation and state variable, but also depending on the “types” of the observation and state.

Figure 12 shows the time mean localization from a two-group filter for the same observation as in Fig. 11. The pattern is quite similar, with four maxima located around the observation location. As in the comparisons of group filters and SEC for Lorenz-96, the maximum values are slightly larger for the group filter than for the SEC. However, the similarity suggests that the SEC is detecting much of the sampling error that is detected by the group filter.

There are several reasons why the SEC RMSE is generally better than for cases without SEC for this model. First, having three spatial dimensions means that the number of distant observations that are potentially contaminated with noise relative to the number of nearby observations is larger than in the one-dimensional Lorenz-96 model. Second, the group filter results in Fig. 12 suggest that there are cases where the appropriate localization for sampling error is less similar in shape to an appropriately tuned GC function than for the Lorenz-96 model. Third, there is evidence that the horizontal width of localization suggested by the group filter varies as a function of the horizontal location of observations. The SEC for a PS observation in midlatitudes (not shown) is found to have a much larger area of large values than for an equatorial PS observation. Finally, there is limited experience with applying vertical localization in large model applications. It is not clear that a GC function is appropriate in the vertical for this application.

## 5. Conclusions

Practical applications of ensemble Kalman filters in large geophysical models must use ensembles that are small compared to the model state vector. Some form of localization of observation impact is then required to avoid poor performance or filter divergence.

A sampling error correction (SEC) algorithm that computes a localization for the impact of each observation on each state variable has been described. In low-order models, applying this SEC algorithm reduces the sensitivity of assimilation quality to the width of a specified background localization. With the SEC algorithm, small ensembles can produce reasonable results with values of background localization that result in model failure without SEC. In a three-dimensional model of greater complexity, the SEC algorithm reduces sensitivity to background localization and also produces better assimilation results.

If a specified GC localization is very similar to the optimal localization for an application, it is difficult for the SEC to produce assimilations with lower RMSE. Time mean localizations from the SEC are very similar in shape to a GC function in the one-dimensional Lorenz-96 model. In addition, there are only 40 variables in the model so that the number of remote observations for a given state variable is relatively small. The three-dimensional atmospheric dynamical core presents a better opportunity for the SEC algorithm to demonstrate improvement. The time mean values of SEC localization are not as similar to a GC function. There is evidence, both from the SEC results and from group filter results, that localization is a function not only of horizontal distance but also of the latitude of the observation and the latitude, longitude, and vertical displacements of the state variable from the observation. In some cases, the horizontal pattern of the SEC-derived localizations is not closely approximated by a Gaussian in the vicinity of the observation. In fact, even in this simple dynamical core, the structure of the localization suggested by the SEC is quite varied. Also, having three spatial dimensions means that the number of observations impacting a state variable increases as the cube of the localization cutoff. This implies that the number of weakly related observations at a distance is much greater in this model than in the Lorenz-96 model. The ability of the SEC algorithm to minimize the impact of weakly related observations is expected to be more important in three-dimensional models.

There are a number of geophysical applications where a standard GC localization is suboptimal. First, little is known about how to localize in the vertical in atmospheric or oceanic models, so any three-dimensional model may be helped by the SEC algorithm. There are also applications where the correlation of observations and state variables is expected to be heterogeneous in the horizontal. Localization of an observation in the eyewall of a hurricane might require a much stronger horizontal localization, possibly with a complicated vortex-oriented shape, than observations of the background flow outside of the storm. Similarly, observations of convective cells for a cloud-resolving assimilation of severe convection might require localization quite different from observations in a surrounding location without convection. In the ocean, localization of observations in a western boundary current like the Gulf Stream might be different from that needed in the middle of an ocean basin. Experiments applying the SEC to real-time hurricane prediction in the Atlantic basin (R. Torn 2010, personal communication) and to global ocean prediction are under way and will be described in subsequent reports. Finally, the ensemble Kalman smoother where observations impact state variables at other times also requires localization that is quite different from a GC function (Anderson 2007). Further testing of the SEC algorithm is required to see if it can stabilize and improve filter and smoother performance in these applications.

The SEC algorithm independently computes a localization for each observation and state variable pair during an assimilation. One could consider using additional information about other pairs to improve the localization. For instance, at a single time one could do spatial averaging of the localizations for similar pairs of observations and state variables. One could also use information from previous times to do temporal averages of SEC localization. In the most general case, one could build a statistical model of the localization expected for a given observation and state variable using all SEC localization information from previous steps in an assimilation. A relatively simple example of such a statistical model would be to use the time mean localization, like that shown in Figs. 10 and 11, from all previous assimilation steps.
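The time mean model mentioned above could be maintained incrementally during an assimilation. The following is a minimal sketch, assuming a fixed observation network so that the SEC localizations at each step can be stored as an array indexed by (observation, state variable). The class name `TimeMeanLocalization` and the array layout are hypothetical and are not part of the SEC algorithm as described here.

```python
import numpy as np


class TimeMeanLocalization:
    """Running time mean of localization for each (observation, state) pair.

    Hypothetical helper: assumes the same observations are available at
    every assimilation step, so pairs can be indexed by array position.
    """

    def __init__(self, n_obs, n_state):
        self.mean = np.zeros((n_obs, n_state))
        self.count = 0

    def update(self, alpha):
        """Fold the localizations from one assimilation step into the mean."""
        alpha = np.asarray(alpha, dtype=float)
        if alpha.shape != self.mean.shape:
            raise ValueError("alpha must have shape (n_obs, n_state)")
        self.count += 1
        # Incremental mean: avoids storing localizations from all past steps.
        self.mean += (alpha - self.mean) / self.count
        return self.mean


# Usage: two assimilation steps for 2 observations and 3 state variables.
tracker = TimeMeanLocalization(n_obs=2, n_state=3)
tracker.update(np.full((2, 3), 0.2))
tracker.update(np.full((2, 3), 0.6))
```

After each step, `tracker.mean` holds the mean of all localizations seen so far and could serve as the statistical model for the next step. Spatial averaging over similar pairs would need an additional definition of which pairs count as similar, which is not attempted here.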

The results discussed here all assumed a relatively uninformative uniform prior for the correlation of an observation and a state variable. It is possible to efficiently compute the SEC algorithm for more informative priors. For instance, for an observation and a state variable that are known a priori to be strongly positively correlated, one could specify a prior correlation distribution that is *U*(0.8, 1.0). For a pair that is known to be very weakly correlated, a prior of *U*(−0.5, 0.5) could be used. The additional prior information would result in more constrained values of SEC localization.
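The effect of a more informative prior can be illustrated with a small Monte Carlo experiment. The sketch below is not the SEC computation itself. It only estimates the mean and standard deviation of the true correlation, given that the sample correlation from an *N*-member ensemble falls near a target value, under two different uniform priors. The bivariate Gaussian sampling, the bin half-width, the number of draws, and the function name are all assumptions made for illustration.

```python
import numpy as np


def true_corr_given_sample(n, prior_low, prior_high, r_hat_target,
                           half_width=0.05, k=50_000, seed=0):
    """Mean and std of the true correlation for draws whose sample
    correlation lies within half_width of r_hat_target.

    Assumes a U(prior_low, prior_high) prior on the true correlation and
    bivariate Gaussian samples of size n (illustrative assumptions).
    """
    rng = np.random.default_rng(seed)
    r_true = rng.uniform(prior_low, prior_high, size=k)

    # Build k correlated samples of size n from independent standard normals.
    z1 = rng.standard_normal((k, n))
    z2 = rng.standard_normal((k, n))
    x = z1
    y = r_true[:, None] * z1 + np.sqrt(1.0 - r_true[:, None] ** 2) * z2

    # Sample correlation for each of the k samples.
    dx = x - x.mean(axis=1, keepdims=True)
    dy = y - y.mean(axis=1, keepdims=True)
    r_hat = (dx * dy).sum(axis=1) / np.sqrt(
        (dx ** 2).sum(axis=1) * (dy ** 2).sum(axis=1))

    keep = np.abs(r_hat - r_hat_target) < half_width
    if not keep.any():
        raise ValueError("no draws fell in the requested bin")
    return r_true[keep].mean(), r_true[keep].std()


# Usage: a 10-member ensemble with a sample correlation near 0.9.
uninformative = true_corr_given_sample(10, -1.0, 1.0, 0.9)
informative = true_corr_given_sample(10, 0.8, 1.0, 0.9)
print("U(-1.0, 1.0) prior: mean, std =", uninformative)
print("U(0.8, 1.0) prior:  mean, std =", informative)
```

Under the narrower prior, the spread of plausible true correlations for a given sample correlation is smaller, which is the sense in which the additional prior information constrains the result.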

The relation between localization and unbalanced analyses requires further consideration. Even for a large ensemble, the application of the SEC resulted in slightly more unbalanced solutions. It is possible that using more informed prior estimates of the correlations could reduce this imbalance.

The SEC assumes that the ensemble estimates of covariance are not systematically biased due to model error or nonlinearity. An algorithm like the SEC, or the group filter, that only examines the prior ensemble cannot correct for these errors. Adaptive inflation algorithms like the ones in Anderson (2009a) make use of both the prior ensemble and the observations when correcting for error. Algorithms that use observations to detect systematic model errors in prior covariance estimates could lead to improved filter performance.

Understanding of why ensembles of *O*(10) members work so effectively in enormous geophysical models is still lacking. For instance, it is unclear if localized filters are closer to the deterministic Kalman filter limit or to a true Monte Carlo algorithm. The SEC algorithm leads to degraded behavior in the Kalman filter limit while it is expected to be beneficial in the Monte Carlo limit. The low-order GCM results here are apparently not so close to the Kalman filter limit that the SEC degrades performance. Future research on this issue could lead to an ability to estimate a priori the expected performance of a given ensemble size for a given application.

Thanks to Kevin Raeder, Pavel Sakov, Peter Bickell, and Doug Nychka for comments that improved the clarity of the manuscript. Jim Hansen, Craig Bishop, and an anonymous reviewer helped to significantly increase the clarity and correctness of the manuscript. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of the National Science Foundation.

# APPENDIX

## Sources of Sampling Error

For a given ensemble size *N*, correlation *r*, and standard deviations $\sigma_x$ and $\sigma_y$, generate *K* (*K* is 10 million for the results here) random samples of size *N* from a distribution with covariance matrix

$$
\begin{pmatrix}
\sigma_x^2 & r\,\sigma_x\sigma_y \\
r\,\sigma_x\sigma_y & \sigma_y^2
\end{pmatrix}
$$

and, for each sample, compute the regression coefficient *b* and the correlation coefficient *r*. Then compute the mean and standard deviation of *b* and *r* over the *K* samples. One can then compute *Q*, the ratio of the mean to the standard deviation, for both the regression coefficient and the correlation coefficient and then use (9) to compute a localization for each. Define the relative error as [equation missing in source], where $\alpha_b$ is the localization for the regression and $\alpha_r$ is the localization for the correlation. The relative error in this case does not depend on $\sigma_x$ or $\sigma_y$.

This calculation was performed for correlation values ranging from 0.01 to 1.0 every 0.01. For correlations larger than 0.05, the maximum relative error was 14%, 7%, 3%, and 2% for ensemble sizes of 10, 20, 40, and 80, respectively. Somewhat larger errors were found for correlations less than 0.05, but the impact of these observations is already small. The fact that the sampling error is dominated by the correlation term in this case suggests that a similar result holds for the full sampling error problem described in section 3.
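The sampling experiment described in this appendix can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used for the results: it assumes bivariate Gaussian samples, takes *b* to be the regression of the first variable on the second, uses a far smaller *K* than 10 million, and assumes |*r*| < 1 so that the standard deviations are nonzero. It stops at *Q* and does not apply (9). The function name and the returned keys are hypothetical.

```python
import numpy as np


def sampling_stats(n, r, sigma_x=1.0, sigma_y=1.0, k=20_000, seed=0):
    """Monte Carlo statistics of the sample regression and correlation
    coefficients for k samples of size n with true correlation r.

    Assumptions (not from the source): Gaussian samples, and b defined as
    cov(x, y) / var(y), i.e. the regression of x on y.
    """
    rng = np.random.default_rng(seed)

    # Correlated pairs built from independent standard normals, so that
    # cov(x, y) = r * sigma_x * sigma_y.
    z1 = rng.standard_normal((k, n))
    z2 = rng.standard_normal((k, n))
    x = sigma_x * z1
    y = sigma_y * (r * z1 + np.sqrt(1.0 - r ** 2) * z2)

    dx = x - x.mean(axis=1, keepdims=True)
    dy = y - y.mean(axis=1, keepdims=True)
    sxy = (dx * dy).sum(axis=1)
    sxx = (dx ** 2).sum(axis=1)
    syy = (dy ** 2).sum(axis=1)

    b = sxy / syy                  # sample regression coefficient
    r_hat = sxy / np.sqrt(sxx * syy)  # sample correlation coefficient

    stats = {
        "b_mean": b.mean(), "b_std": b.std(),
        "r_mean": r_hat.mean(), "r_std": r_hat.std(),
    }
    # Q as defined in the text: ratio of the mean to the standard deviation.
    stats["q_b"] = stats["b_mean"] / stats["b_std"]
    stats["q_r"] = stats["r_mean"] / stats["r_std"]
    return stats


# Usage: the ratios Q are unchanged when the standard deviations change.
base = sampling_stats(n=20, r=0.5)
scaled = sampling_stats(n=20, r=0.5, sigma_x=3.0, sigma_y=0.5)
print(base["q_b"], scaled["q_b"])
print(base["q_r"], scaled["q_r"])
```

Because the same seed produces the same underlying draws, changing $\sigma_x$ and $\sigma_y$ only rescales *b* by a constant factor, so the mean and standard deviation of *b* scale together and *Q* is unchanged up to rounding. This is consistent with the statement that the relative error does not depend on the standard deviations.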

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Anderson, J. L., 2003: A local least squares framework for ensemble filtering. *Mon. Wea. Rev.*, **131**, 634–642.

Anderson, J. L., 2007: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111.

Anderson, J. L., 2009a: Spatially and temporally varying adaptive covariance inflation for ensemble filters. *Tellus*, **61A**, 72–83.

Anderson, J. L., 2009b: Ensemble Kalman filters for large geophysical applications. *IEEE Control Syst.*, **29**, 66–82.

Anderson, J. L., B. Wyman, S. Zhang, and T. Hoar, 2005: Assimilation of surface pressure observations using an ensemble filter in an idealized global atmospheric prediction system. *J. Atmos. Sci.*, **62**, 2925–2938.

Anderson, J. L., T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Arellano, 2009: The Data Assimilation Research Testbed. *Bull. Amer. Meteor. Soc.*, **90**, 1283–1296.

Bishop, C. H., and D. Hodyss, 2007: Flow adaptive moderation of spurious ensemble correlations and its use in ensemble based data assimilation. *Quart. J. Roy. Meteor. Soc.*, **133**, 2029–2044.

Bishop, C. H., and D. Hodyss, 2009a: Ensemble covariances adaptively localized with ECO-RAP. Part 1: Tests on simple error models. *Tellus*, **61A**, 84–96.

Bishop, C. H., and D. Hodyss, 2009b: Ensemble covariances adaptively localized with ECO-RAP. Part 2: A strategy for the atmosphere. *Tellus*, **61A**, 97–111.

Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. *Mon. Wea. Rev.*, **126**, 1719–1724.

Campbell, W. F., C. H. Bishop, and D. Hodyss, 2010: Vertical covariance localization for satellite radiances in ensemble Kalman filters. *Mon. Wea. Rev.*, **138**, 282–290.

Chen, Y., and D. S. Oliver, 2009: Cross-covariances and localization for EnKF in multiphase flow data assimilation. *Comput. Geosci.*, **14**, 579–601, doi:10.1007/s10596-009-9174-6.

Courtier, P., and Coauthors, 1998: The ECMWF implementation of three-dimensional variational assimilation (3D-Var). I: Formulation. *Quart. J. Roy. Meteor. Soc.*, **124**, 1783–1807.

Emerick, A., and A. Reynolds, 2010: Combining sensitivities and prior information for covariance localization in the ensemble Kalman filter for petroleum reservoir applications. *Comput. Geosci.*, **15**, 251–269, doi:10.1007/s10596-010-9198-y.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99** (C5), 10 143–10 162.

Fillion, L., H. L. Mitchell, H. Ritchie, and A. Staniforth, 1995: The impact of a digital filter finalization technique in a global data assimilation system. *Tellus*, **47A**, 304–323.

Furrer, R., and T. Bengtsson, 2007: Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. *J. Multivar. Anal.*, **98**, 227–255.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757.

GFDL Global Atmospheric Model Development Team, 2004: The new GFDL global atmosphere and land model AM2–LM2: Evaluation with prescribed SST simulations. *J. Climate*, **17**, 4641–4673.

Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. R. Hunt, 2011: Balance and ensemble Kalman filter localization techniques. *Mon. Wea. Rev.*, **139**, 511–522.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790.

Held, I. M., and M. J. Suarez, 1994: A proposal for the intercomparison of the dynamical cores of atmospheric general circulation models. *Bull. Amer. Meteor. Soc.*, **75**, 1825–1830.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811.

Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. *Quart. J. Roy. Meteor. Soc.*, **131**, 3269–3289.

Kepert, J. D., 2009: Covariance localisation and balance in an Ensemble Kalman Filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 1157–1176.

Li, H., E. Kalnay, and T. Miyoshi, 2009a: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 523–533.

Li, H., E. Kalnay, T. Miyoshi, and C. M. Danforth, 2009b: Accounting for model errors in ensemble data assimilation. *Mon. Wea. Rev.*, **137**, 3407–3419.

Liu, H., J. L. Anderson, Y.-H. Kuo, and K. Raeder, 2007: Importance of forecast error multivariate correlations in idealized assimilations of GPS radio occultation data with the ensemble adjustment filter. *Mon. Wea. Rev.*, **135**, 173–185.

Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. *J. Atmos. Sci.*, **55**, 399–414.

Lyster, P. M., S. E. Cohn, R. Menard, L.-P. Chang, S.-J. Lin, and R. G. Olsen, 1997: Parallel implementation of a Kalman filter for constituent data assimilation. *Mon. Wea. Rev.*, **125**, 1674–1686.

Mitchell, H. L., and P. L. Houtekamer, 2000: An adaptive ensemble Kalman filter. *Mon. Wea. Rev.*, **128**, 416–433.

Mitchell, H. L., P. L. Houtekamer, and G. Pellerin, 2002: Ensemble size, balance and model-error representation in an ensemble Kalman filter. *Mon. Wea. Rev.*, **130**, 2791–2808.

Miyoshi, T., S. Yamane, and T. Enomoto, 2007: Localizing the error covariance by physical distance within a local ensemble transform Kalman filter (LETKF). *Sci. Online Lett. Atmos.*, **3**, 89–92.

Oke, P. R., P. Sakov, and S. P. Corney, 2007: Impacts of localization in the EnKF and EnOI: Experiments with a small model. *Ocean Dyn.*, **57**, 32–45.

Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. *Tellus*, **56A**, 415–428.

Tong, M., and M. Xue, 2005: Ensemble Kalman filter assimilation of Doppler radar data with a compressible nonhydrostatic model: OSS experiments. *Mon. Wea. Rev.*, **133**, 1789–1807.

Whitaker, J. S., G. P. Compo, X. Wei, and T. M. Hamill, 2004: Reanalysis without radiosondes using ensemble data assimilation. *Mon. Wea. Rev.*, **132**, 1190–1200.

Zhang, Y., and D. S. Oliver, 2010: Improving the ensemble estimate of the Kalman gain by bootstrap sampling. *Math. Geosci.*, **42**, 327–345.