## 1. Introduction

Ensemble data assimilation strategies are now used frequently for state and parameter estimation in geoscience. Examples include the perturbed observation ensemble Kalman filter (Evensen 1994; Houtekamer and Mitchell 1998), deterministic ensemble square root filters (e.g., Anderson 2001; Bishop and Hodyss 2011; Whitaker and Hamill 2002), and ensemble/variational hybrids (e.g., Hamill and Snyder 2000; Lorenc 2003; Buehner 2005). These methods attempt to provide Monte Carlo estimates of a system’s probability density conditioned on observations, and are formulated by assuming errors for the model state and observations are independent and Gaussian. A major benefit of Gaussian approaches is that they can be made to work effectively using affordable ensemble sizes by accounting for sampling errors in ensemble-estimated covariances. Nevertheless, these methods may not be the best option as computational resources allow for larger ensembles that better resolve non-Gaussian errors.

A recent paper by Poterjoy (2016, hereafter P16) introduces a new localized particle filter (PF) for ensemble data assimilation and forecasting. The filter operates in a manner similar to traditional PFs in that it estimates posterior quantities by reweighting particles (ensemble members) based on the likelihood of observations. Particle filters do not assume a parametric probability density function (PDF) for errors in the model state estimate, which allows them to perform well for most applications when provided with a sufficiently large number of particles; see Doucet et al. (2001) for a review. Unlike conventional PF methods, the local PF reduces the influence of distant observations on the weight calculations via a localization function, similar to covariance localization in ensemble Kalman filters (Houtekamer and Mitchell 1998; Hamill and Whitaker 2001). The resulting filter does not require the number of particles to increase exponentially with the degrees of freedom in the system, as is the case for standard PFs (Bickel et al. 2008; Bengtsson et al. 2008; Snyder et al. 2008).

Localization has already been used in previously proposed PF variants, including the local-local ensemble filter (LLEF) of Bengtsson et al. (2003) and the moment matching filter of Lei and Bickel (2011). These studies present novel approaches of generating localized updates to particles that differ from the local PF described in P16. Nevertheless, their potential for data assimilation in large geophysical models has not yet been demonstrated. Another strategy for maintaining a filter’s stability when the state dimension is large is to sample particles from a proposal distribution, typically conditioned on the most recent observations (Doucet et al. 2001). Methods of this type include the equivalent-weights PF (van Leeuwen 2010) and the implicit PF (Chorin et al. 2010). More recently, Ades and van Leeuwen (2015) applied the equivalent-weights PF to a barotropic vorticity model with 65 500 state variables, demonstrating its ability to prevent filter divergence using only 32 particles for a large model. This method relies on modifying the transition density between observation times to draw particles closer to observations, then sampling from a proposal density that ensures particles have nearly equal weights.

The current study differs from previous work in that it uses a high-dimensional geophysical model to examine potential benefits of a PF-based method over two commonly used Gaussian filters: the perturbed observation ensemble Kalman filter (EnKF; Evensen 1994; Houtekamer and Mitchell 1998), and the ensemble adjustment Kalman filter (EAKF; Anderson 2001). P16 already performed extensive tests of the local PF using the 40-variable model of Lorenz (1996) and found the filter to be a robust method for various observation densities, observation types, and model system dynamics. P16 also show that the local PF is substantially better than Gaussian methods for assimilating dense observation networks that relate nonlinearly to the model state, like satellite and radar measurements collected routinely by environmental observing platforms. Most importantly, the local PF produces stable results using as few as 5 particles for the 40-variable Lorenz system. These results provide an incentive to explore the local PF’s potential for data assimilation applications that would require very large numbers of particles for standard PF methods.

Two aspects of the local PF that could not be investigated using the Lorenz system in P16 are explored here: the filter’s ability to produce physically consistent posterior particles and the scalability of the new method for high-dimensional systems. In geophysical models, state variables are often highly correlated due to physical balances created by the model dynamics. These correlations can exhibit complex structures that change rapidly with time (e.g., Poterjoy and Zhang 2011), which motivates the use of ensembles to approximate model forecast errors. Localization can modify or destroy these correlations (Mitchell et al. 2002; Greybush et al. 2011), so physical balances must be considered when applying localized ensemble filters. Despite this limitation, localized ensemble Kalman filters perform reasonably well for most geophysical applications. The local PF must be shown to be as good or better than Kalman filters at preserving balance to be considered an effective data assimilation method. The filter also needs to be practical for large models, where the number of particles is usually severely limited. For typical modeling applications in atmospheric and oceanic science, computational resources usually allow for particle numbers on the order of 100–1000, which is far from what is needed for standard PF methods. To test whether the local PF is a viable method for high-dimensional geophysical systems, the current study uses yearlong cycling data assimilation experiments with an idealized atmospheric general circulation model (GCM). Similar to P16, the local PF is compared with the EAKF and EnKF for cases in which the linear/Gaussian assumptions of these filters are appropriate and for cases in which they are inappropriate for the assimilation problem at hand.

The manuscript is organized in the following manner. Section 2 introduces the most basic form of particle filtering and presents the local PF algorithm. Section 3 discusses deterministic and probabilistic forecast results from yearlong cycling data assimilation experiments using the EAKF, EnKF, and local PF. The last section summarizes the study and provides a discussion of possible applications for the local PF.

## 2. A particle filter for high-dimensional systems

### a. Sequential importance resampling and its limitations

*H*is an operator that maps the model state into observation space and

*ϵ*is the error in the measurements. Starting from a sample of particles to approximate a prior PDF for

The bootstrap filter has a number of desirable properties; most notably, it is simple to implement and converges to the true Bayesian solution as

### b. The local particle filter

Ensemble Kalman filters rely on covariance localization and variance inflation methods to reduce the effects of sampling errors and prevent the ensemble variance from collapsing for small ensembles (Houtekamer and Mitchell 2001; Hamill and Whitaker 2001). Adopting similar strategies, P16 introduced the local PF, which limits the update of particles by the bootstrap filter to regions in the vicinity of observations. The local PF exploits the fact that the signal-to-noise ratio of ensemble correlations between variables in

*n*th set of column elements can be estimated sequentially for each observation using the likelihood of each observations given

*j*th state variable of the

*n*th particle given the

*i*th observation,

*n*and

*j*on

*x*and

*ω*indicate the particle index and state variable, respectively. In (4),

*r*. Each column of the weight matrix contains the normalized weights

^{1}The sampled particles will be denoted

*j*th elements of

After processing all observations, higher-order corrections are then made to the particles using a nonparametric kernel density mapping approach (McGinnis et al. 2015). This step updates the particles so that the marginal probability densities of each state variable match the marginal densities given by

Following the above update steps, the resulting filter provides only an approximation to standard PF methods, which converge to the Bayesian solution as

### c. Posterior inflation and observation quality control

For geophysical filtering applications, the number of particles used to approximate the state-space PDF is typically much smaller than the degrees of freedom in the model. Particles are also integrated forward in time using imperfect forecast models with unknown errors. Performing many successive filtering steps without considering these systematic errors can lead to the collapse of ensemble variance over time. Ensemble Kalman filters generally rely on prior or posterior variance inflation methods to compensate for these errors; for example, the EAKF and EnKF in this study use adaptive prior inflation (Anderson 2007). In this study, we stabilize the local PF by maintaining relatively uniform importance weights. A simple, yet effective means of preventing the weights from converging to a single particle is to multiply *α*, where *α* parameter reduces the correction made to particles during the update step in a manner similar to posterior inflation schemes used in some implementations of ensemble Kalman filters (e.g., Zhang et al. 2004; Whitaker and Hamill 2012). This modification affects the sampling of particles and the calculation of

The local PF algorithm applies a sampling step for each observation, thus making the treatment of sampling errors using *α* an important component of the filter. To demonstrate the impact of *α* on the calculation of posterior weights, we present three univariate examples using a simple Gaussian application. For the first case, we generate a sample of 20 particles from *α* from 1 to 0.99 has no major effect on the *α* coefficient has the same effect as choosing a *α* also acts as a quality control mechanism by reducing the effect of observations with low likelihoods.

## 3. Cycling data assimilation experiments

### a. Forecast model: An idealized GCM

To examine the potential of the local PF as a data assimilation method for geophysical models, we apply the filter for a coarse-resolution atmospheric GCM. The model has the same hydrostatic dynamical core as the Geophysical Fluid Dynamics Laboratory AM2 climate model (Anderson et al. 2004), but with no moisture and topography, and no parameterizations for radiation, turbulence, and convection. Instead, the model uses Held and Suarez (1994) forcing, which consists of Newtonian relaxation of the temperature field to a prescribed radiative equilibrium solution and Rayleigh damping of winds near the surface to represent friction. The prognostic variables are zonal and meridional winds (*u* and *υ*), temperature *T*, and surface pressure

### b. Experiment configuration

This section presents results from three sets of cycling data assimilation experiments comparing the performance of the local PF and two ensemble Kalman filters for the idealized GCM application. In the first of these experiments, the observations contain Gaussian errors and *H* is a linear operator. Prior errors also remain close to Gaussian for this model, providing a scenario in which the assumptions made by Kalman filtering methods are appropriate. To make the filtering problem more challenging, we use non-Gaussian measurement errors in the second experiment and a nonlinear operator for *H* in the third experiment. For each of these cases, we generate 300 synthetic sounding observations every 6 h from a yearlong “truth” simulation using the same resolution and configuration as the forecast model; that is, the model used to propagate particles forward in time between cycles is perfect. The soundings contain measurements of *u*, *υ*, and *T* at each model level, which are created from the truth simulation using *H* with random errors added [see (1)]. We choose the locations of these sounding randomly from a uniform distribution over the domain and maintain the same spatial network of observations for all times.

To initialize our experiments we add small, uncorrelated perturbations to a previously spunup model state, then integrate each perturbed state forward 30 days. The ensemble perturbations grow into large amplitude correlated errors over this period, providing a suitable approximation of a climatological prior probability density for the first data assimilation cycle. All three experiments use a relatively small number of particles (*r* (and *α* for the PF). Though not shown, simulations were performed using *r* values of 0.07, 0.09, 0.12, 0.15, 0.18, 0.21, 0.24, and 0.27 rad and *α* values of 0.8, 0.85, 0.9, and 0.99. The EAKF and EnKF do not rely on a relaxation parameter to maintain the filter’s stability, but instead use adaptive prior covariance inflation (Anderson 2007) with prior standard deviations for the inflation coefficient (*γ*) fixed at 0.6.

### c. Linear H, Gaussian likelihood experiment

The first test of the local PF is for a case in which *H* is an operator that performs a linear horizontal interpolation from model space to observation space. The simulated observations contain uncorrelated Gaussian errors, with standard deviations of 1 hPa, 1 K, and 1 m s^{−1} for *T*, and wind variables, respectively. The purpose of this experiment is to test whether the local PF can provide adequate filtering results for a realistic geophysical system using a small number of particles. We use the three filters to generate posterior particles and 6-h forecasts, which approximate the prior errors for each succeeding assimilation cycle. We then summarize prior error statistics from the EAKF, EnKF, and local PF cases by calculating domain averages of RMSE and ensemble spread for all cycles and variable types (Fig. 2). All three filters require about 30 days to produce time series of global mean errors that are near stationary, so the first 45 days are omitted from the verification. Table 1 also summarizes the results from this experiment (and the nonlinear/non-Gaussian experiments) using RMSEs and spread averaged over the period plotted in Fig. 2. For the linear/Gaussian experiment, all three filters produce the lowest prior RMSEs when the localization radius is set to *α* set to 0.9 for the PF. The resulting prior statistics show that the local PF provides stable results over the entire yearlong cycling period and produces forecast errors that are well below the observation errors. The EAKF and EnKF priors have lower RMSEs than the local PF because parametric methods are less sensitive to sampling errors. This experiment, however, demonstrates that the local PF can provide stable results for large applications using a small number of particles.

Domain-averaged prior mean RMSE every 6 h from the linear *H*, Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged prior mean RMSE every 6 h from the linear *H*, Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged prior mean RMSE every 6 h from the linear *H*, Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

EAKF, EnKF, and local PF prior RMSEs and spread (separated by commas) averaged over the last 320 days of each experiment.

Next, we explore how well the PF preserves balance when updating particles each assimilation cycle. Arriving at the same optimal *r* values for the three filters allows for a convenient comparison of the methods when considering factors such as initial condition imbalance, where the amount of localization can have a large influence (Anderson 2012). To examine potential imbalances between data assimilation cycles we adopt a commonly used metric, RMS surface pressure tendency (

Figure 3 shows global mean RMS

Domain-averaged pressure tendency for each cycle from the linear *H*, Gaussian likelihood experiment. Tendencies calculated from the truth simulation are plotted in black for reference.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged pressure tendency for each cycle from the linear *H*, Gaussian likelihood experiment. Tendencies calculated from the truth simulation are plotted in black for reference.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged pressure tendency for each cycle from the linear *H*, Gaussian likelihood experiment. Tendencies calculated from the truth simulation are plotted in black for reference.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

### d. Linear H, non-Gaussian likelihood experiment

One advantage of the PF over ensemble Kalman filters is flexibility in choosing the likelihood function for calculating weights. A number of observation types assimilated by atmospheric models contain non-Gaussian errors that are often skewed or bounded; precipitation data present one example. While methods exist for transforming observed data into Gaussian distributions (e.g., Lien et al. 2013), or reformulating the filter update equations to handle certain non-Gaussian PDFs (e.g., Fletcher and Zupanski 2006), the PF simply uses a likelihood function that best fits the measurement errors.^{2} The only requirement is that errors come from a known parametric distribution. Developing a filter that can handle measurement errors in a general manner may allow for a more efficient treatment of observations with known non-Gaussian errors.

*H*as the previous experiment, but use a skew normal distribution to generate observation errors. The skew normal PDF is given by

*ψ*,

*s*, and

*κ*are location, scale, and shape parameters, respectively; and

*μ*and variance

*ψ*and

*s*, then sample from (11) using these parameters. Figure 4 compares the skew normal PDF with a Gaussian PDF for the parameters specified here. The resulting samples are added to the simulated truth to form observations. In this experiment, the EAKF and EnKF correctly assume the observation errors have a mean and variance of 0 and 1, respectively, but have no knowledge of the skewness associated with these errors. On the other hand, the PF uses (11) directly in the weight equation, with location, scale, and shape parameters modified to fit the observation likelihoods.

Skew normal PDF (solid) and normal PDF (dashed) with mean 0 and variance 1. The skew normal PDF has a skewness (*κ*) of 10, as used in non-Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Skew normal PDF (solid) and normal PDF (dashed) with mean 0 and variance 1. The skew normal PDF has a skewness (*κ*) of 10, as used in non-Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Skew normal PDF (solid) and normal PDF (dashed) with mean 0 and variance 1. The skew normal PDF has a skewness (*κ*) of 10, as used in non-Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Figure 5 shows prior RMSEs and spread for the experiment using skew normal observation errors. Here, the best configurations of the EAKF, EnKF, and local PF contain the same localization radius as the previous experiment (*r* = 0.24), but with *α* reduced from 0.9 to 0.85 for the PF. Prior errors from the EAKF and EnKF simulations resemble results from the linear/Gaussian case discussed in section 3c. Because the observation errors are unbiased and have the correct error variance, the skewed observation errors lead to similar posterior mean and error variance estimates as the Gaussian likelihood experiment (see the appendix for more details). The relative PF performance, however, is improved over the linear/Gaussian case. This method has the advantage of using the correct PDF for the observation likelihoods, which has a higher peak probability density than a Gaussian PDF with the same variance. The larger maximum in peak probability density allows the local PF to move more particles toward the truth than in the previous Gaussian case, thus leading to lower RMSEs. The performance gains of the local PF are large enough to reduce most of the differences in RMSEs between the three filtering methods (Fig. 5).

Domain-averaged prior mean RMSE every 6 h from the linear *H*, non-Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged prior mean RMSE every 6 h from the linear *H*, non-Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged prior mean RMSE every 6 h from the linear *H*, non-Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Because the observation errors follow a non-Gaussian distribution, forecast errors between cycles exhibit more non-Gaussian behavior than the linear/Gaussian experiment, rendering RMSE and ensemble spread insufficient for quantifying the probabilistic skill of the filters. Therefore, we apply rank histograms to evaluate 6-h ensemble forecasts generated during experiments where the linear/Gaussian assumptions made by the EAKF and EnKF are violated. Rank histograms plot how often the true model state lands in discrete bins formed by ranking particles in ascending order (Anderson 1996; Hamill and Colucci 1996; Talagrand et al. 1997). A flat histogram is a necessary, but insufficient condition for particles to come from the same probability distribution as the true model state, and will be used here to determine potential shortcomings in the filters. The histograms are calculated independently for all model grid points from prior particles generated every 8 cycles (48 h) of the experiments. Using data every 48 h reduces the effects of temporal correlations when estimating whether each histogram exhibits a significant deviation from a uniform probability distribution.

After calculating histograms, we apply a chi-squared test to determine whether deviations from a uniform distribution can be detected at a 99% significance level. Our motivation for using hypothesis testing is to identify regions where nonuniform histograms may exist, before performing a visual inspection of histograms in suspect regions. Figures 6a and 6b show global maps indicating [with crisscrosses (×s)] model grid points where forecasts for midlevel *T* (vertical model level 3) likely contain nonuniform histograms.^{3} Histograms detected to be nonuniform often reveal large forecast biases relative to RMSEs; therefore, we plot time-averaged

Probabilistic verification of 6-h midlevel *T* forecasts for (a),(c) non-Gaussian likelihood experiments and (b),(d) nonlinear *H* experiments. The ×s mark grid points where rank histograms are nonuniform with 99% probability, shading indicates the ratio of bias to RMSE every 0.1 units, and dotted lines are latitude and longitude every 40°.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Probabilistic verification of 6-h midlevel *T* forecasts for (a),(c) non-Gaussian likelihood experiments and (b),(d) nonlinear *H* experiments. The ×s mark grid points where rank histograms are nonuniform with 99% probability, shading indicates the ratio of bias to RMSE every 0.1 units, and dotted lines are latitude and longitude every 40°.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Probabilistic verification of 6-h midlevel *T* forecasts for (a),(c) non-Gaussian likelihood experiments and (b),(d) nonlinear *H* experiments. The ×s mark grid points where rank histograms are nonuniform with 99% probability, shading indicates the ratio of bias to RMSE every 0.1 units, and dotted lines are latitude and longitude every 40°.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

### e. Nonlinear H, Gaussian likelihood experiment

As shown in P16, the benefit of PFs over ensemble Kalman filters is apparent only when ensemble statistics are sufficient for detecting non-Gaussian PDFs. Therefore, distributions that are close to Gaussian require a larger *H* comprising absolute value and natural log functions:

Figure 7 shows time series plots of domain-averaged prior RMSEs and spread for the filter experiments using the nonlinear *H*. All three filters produce higher errors than the previous experiment because the new observations provide only indirect measurements of the system state at each filter time. The new *H* causes the EAKF and EnKF updates to be suboptimal, leading to prior RMSEs that are almost twice the local PF RMSEs over the cycling period (Table 1). EAKF forecasts tend to exhibit the largest peaks in mean RMSE between cycles, and in some cases, yield ensemble spread that is much smaller than the true prior errors. One factor leading to elevated errors for the ensemble Kalman filters is the smaller localization half-width needed to maintain stable results; in this case, *r* must be reduced to 0.07 for both the EAKF and EnKF. Though not shown, using larger *r* leads to filter divergence partway through the experiment for both cases. At the same time, the optimal *r* and *α* parameters for the local PF remain unchanged from the previous experiment. P16 also found the local PF to produce reasonable results using the same *r* for linear and nonlinear *H*, because the filter does not need to compensate for systematic errors introduced when Gaussian assumptions are invalid.

Domain-averaged prior mean RMSE every 6 h from the nonlinear *H*, Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged prior mean RMSE every 6 h from the nonlinear *H*, Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Domain-averaged prior mean RMSE every 6 h from the nonlinear *H*, Gaussian likelihood experiment.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

We again use rank histograms to examine the probabilistic forecast performance of the three methods. As was the case for the non-Gaussian likelihood experiment, the EAKF and EnKF forecasts have a larger number of grid points with nonuniform histograms than the PF (Figs. 6b,d). For the nonlinear *H* experiment, the result is mostly due to a domainwide cold bias caused by assuming a Gaussian distribution for the observation-space prior PDF. Though not shown, ensemble forecasts (for all three filters) also tend to be underdispersed in the lower latitudes, which is a major factor contributing to the nonuniform histograms near the equator. This result is made worse in the EAKF and EnKF cases, which use a relatively short localization radius to maintain the filters’ stability. The small *r* required for the ensemble Kalman filters is far from optimal in low latitudes, where the Rossby radius of deformation is quite large. The local PF, however, benefits from maintaining a large localization radius for this observation network, which allows for significant improvements over these filters.

## 4. Conclusions

This study describes the first application of a localized PF for data assimilation in a high-dimensional geophysical model. As a benchmark, the local PF is compared with two commonly used deterministic and stochastic ensemble Kalman filters in the DART software package. Particle filters provide benefits over Gaussian filters when linear/Gaussian assumptions limit the effectiveness of Kalman filtering theory. Nevertheless, the number of particles required for standard PFs is known to scale exponentially with the system dimension. The local PF avoids this issue by limiting the influence of observations on the calculation of posterior quantities using a strategy similar to covariance localization in ensemble Kalman filters. To form posterior samples, the filter processes observations serially, updating particles via sampling and merging steps to approximate the posterior mean and covariance resulting from the local influence of observations. The algorithm then uses a probability mapping step to make higher-order corrections not considered during the previous steps. The resulting set of particles reflects samples from a posterior PDF that approximates the Bayesian solution near observations and the prior away from observations. The first step in the local PF algorithm has a computational cost that scales with the number of observations assimilated; likewise, the probability mapping step scales with the number of particles. In typical data assimilation applications, where the number of observations is much greater than the number of particles, the time required to run the filter is dominated by the first part of the algorithm.

Using a relatively small number of particles (

This study and P16 focus mostly on the potential of PFs for assimilating observations that have nonlinear operators or non-Gaussian measurement errors. Advantages demonstrated by the local PF thus far provide enough incentive to explore its potential in real geophysical modeling applications. Nevertheless, the theoretical benefits of the local PF for highly nonlinear dynamical models have not yet been tested. Future work will focus on applying the local PF in high-dimensional systems that do not require nonlinear measurement operators to yield non-Gaussian priors. One possible application is for mesoscale convective systems in weather models, which contain large gradients in wind, temperature, and moisture that can lead to non-Gaussian errors. Other conceivable uses include land surface models, ocean models, and coupled system models, all of which are governed by nonlinear dynamical systems that are not well constrained by observations.

Moreover, the serial processing of observations by the local PF and its inclusion in the DART software package means that it can be used together with ensemble Kalman filters to obtain a superior data assimilation system. We hypothesize that the best filtering option when small to moderate numbers of particles are available may be to use the PF for observation networks that are problematic for Gaussian filters, while applying a Kalman filter update when linear/Gaussian assumptions are appropriate. Nevertheless, the practical benefits of the local PF need to be explored for more realistic geophysical modeling applications.

## Acknowledgments

This research is sponsored by the NCAR Advanced Study Program. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. We would like to acknowledge high-performance computing support from Yellowstone (ark:/85065/d7wd3xhc) provided by NCAR’s Computational and Information Systems Laboratory, sponsored by the National Science Foundation.

## APPENDIX

### Univariate Skew Normal Experiments

Numerical simulations demonstrate the impact of using the PF and the EAKF to assimilate observations with skew normal errors for a univariate application. Here, we perform

Rank histograms from posterior particles generated using (a) the PF and (b) EAKF with skew normal observation errors, and (c) the EAKF with Gaussian observation errors. Average RMSE and ensemble spread are indicated in the upper-right corners.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Rank histograms from posterior particles generated using (a) the PF and (b) EAKF with skew normal observation errors, and (c) the EAKF with Gaussian observation errors. Average RMSE and ensemble spread are indicated in the upper-right corners.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

Rank histograms from posterior particles generated using (a) the PF and (b) EAKF with skew normal observation errors, and (c) the EAKF with Gaussian observation errors. Average RMSE and ensemble spread are indicated in the upper-right corners.

Citation: Monthly Weather Review 144, 5; 10.1175/MWR-D-15-0322.1

## REFERENCES

Ades, M., and P. J. van Leeuwen, 2015: The equivalent-weights particle filter in a high-dimensional system.

,*Quart. J. Roy. Meteor. Soc.***141**, 484–503, doi:10.1002/qj.2370.Anderson, J. L., 1996: A method for producing and evaluating probabilistic forecasts from ensemble model integrations.

,*J. Climate***9**, 1518–1530, doi:10.1175/1520-0442(1996)009<1518:AMFPAE>2.0.CO;2.Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation.

,*Mon. Wea. Rev.***129**, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.Anderson, J. L., 2007: An adaptive covariance inflation error correction algorithm for ensemble filters.

,*Tellus***59A**, 210–224, doi:10.1111/j.1600-0870.2006.00216.x.Anderson, J. L., 2010: A non-Gaussian ensemble filter update for data assimilation.

,*Mon. Wea. Rev.***138**, 4186–4198, doi:10.1175/2010MWR3253.1.Anderson, J. L., 2012: Localization and sampling error correction in ensemble Kalman filter data assimilation.

,*Mon. Wea. Rev.***140**, 2359–2371, doi:10.1175/MWR-D-11-00013.1.Anderson, J., and Coauthors, 2004: The new GFDL global atmosphere and land model AM2–LM2: Evaluation with prescribed SST simulations.

,*J. Climate***17**, 4641–4673, doi:10.1175/JCLI-3223.1.Arakawa, A., and V. R. Lamb, 1977: Computational design of the basic dynamical processes of the UCLA general circulation model.

*Methods in Computational Physics: Advances in Research and Applications*, J. Chang, Ed., Vol. 17, Elsevier, 173–265.Bengtsson, T., C. Snyder, and D. Nychka, 2003: Toward a nonlinear ensemble filter for high-dimensional systems.

,*J. Geophys. Res.***108**, 8775, doi:10.1029/2002JD002900.Bengtsson, T., P. Bickel, and B. Li, 2008: Curse-of-dimensionality revisited: Collapse of the particle filter in very large scale systems.

*Probability and Statistics: Essays in Honor of David A. Freedman*, D. Nolan and T. Speed, Eds., Vol. 2, Institute of Mathematical Statistics, 316–334.Bickel, P., B. Li, and T. Bengtsson, 2008: Sharp failure rates for the bootstrap particle filter in high dimensions.

*Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh*, B. Clarke and S. Ghosal, Eds., Vol. 3, Institute of Mathematical Statistics, 318–329.Bishop, C. H., and D. Hodyss, 2011: Adaptive ensemble covariance localization in ensemble 4D-VAR state estimation.

,*Mon. Wea. Rev.***139**, 1241–1255, doi:10.1175/2010MWR3403.1.Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background-error covariances: Evaluation in a quasi-operational NWP setting.

,*Quart. J. Roy. Meteor. Soc.***131**, 1013–1043, doi:10.1256/qj.04.15.Chorin, A., M. Morzfeld, and X. Tu, 2010: Implicit particle filters for data assimilation.

,*Comm. Appl. Math. Comput. Sci.***5**, 221–240, doi:10.2140/camcos.2010.5.221.Doucet, A., N. de Freitas, and N. Gordon, Eds., 2001: An introduction to sequential Monte Carlo methods.

*Sequential Monte Carlo Methods in Practice*, Springer-Verlag, 2–14.Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics.

,*J. Geophys. Res.***99C**, 10 143–10 162, doi:10.1029/94JC00572.Fletcher, S. J., and M. Zupanski, 2006: A data assimilation method for log-normally distributed observational errors.

,*Quart. J. Roy. Meteor. Soc.***132**, 2505–2519, doi:10.1256/qj.05.222.Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions.

,*Quart. J. Roy. Meteor. Soc.***125**, 723–757, doi:10.1002/qj.49712555417.Gordon, N. J., D. J. Salmond, and A. F. M. Smith, 1993: Novel approach to nonlinear/non-Gaussian state estimation.

,*IEE Proc., F, Radar Signal Process.***140**, 107–113, doi:10.1049/ip-f-2.1993.0015.Greybush, S. J., E. Kalnay, T. Miyoshi, K. Ide, and B. Hunt, 2011: Balance and ensemble Kalman filter localization techniques.

,*Mon. Wea. Rev.***139**, 511–522, doi:10.1175/2010MWR3328.1.Hamill, T. M., and S. J. Colucci, 1996: Random and systematic error in NMC’s short-range Eta ensembles. Preprints,

*13th Conf. on Probability and Statistics in the Atmospheric Sciences*, San Francisco, CA, Amer. Meteor. Soc., 51–56.Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter-3D variational analysis scheme.

,*Mon. Wea. Rev.***128**, 2905–2919, doi:10.1175/1520-0493(2000)128<2905:AHEKFV>2.0.CO;2.Hamill, T. H., and J. S. Whitaker, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter.

,*Mon. Wea. Rev.***129**, 2776–2790, doi:10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.Held, I. M., and M. J. Suarez, 1994: A proposal for the intercomparison of the dynamical cores of atmospheric general circulation models.

,*Bull. Amer. Meteor. Soc.***75**, 1825–1830, doi:10.1175/1520-0477(1994)075<1825:APFTIO>2.0.CO;2.Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique.

,*Mon. Wea. Rev.***126**, 796–811, doi:10.1175/1520-0493(1998)126<0796:DAUAEK>2.0.CO;2.Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation.

,*Mon. Wea. Rev.***129**, 123–137, doi:10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2.Kitagawa, G., 1996: Monte Carlo filter and smoother for non-Gaussian nonlinear state space models.

,*J. Comput. Graph. Stat.***5**, 1–25.Lei, J., and P. Bickel, 2011: A moment matching ensemble filter for nonlinear non-Gaussian data assimilation.

,*Mon. Wea. Rev.***139**, 3964–3973, doi:10.1175/2011MWR3553.1.Lien, G.-Y., E. Kalnay, and T. Miyoshi, 2013: Effective assimilation of global precipitation: Simulation experiments.

,*Tellus***65A**, 19915, doi:10.3402/tellusa.v65i0.19915.Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP: A comparison with 4D-Var.

,*Quart. J. Roy. Meteor. Soc.***129**, 3183–3203, doi:10.1256/qj.02.132.Lorenz, E. N., 1996: Predictability: A problem partly solved.

*Proc. Seminar on Predictability*, Vol. 1, Reading, United Kingdom, ECMWF, 1–18.McGinnis, S., D. Nychka, and L. O. Mearns, 2015: A new distribution mapping technique for bias correction of climate model output.

*Machine Learning and Data Mining Approaches to Climate Science*, V. Lakshmanan et al., Eds., Springer International Publishing, doi:10.1007/978-3-319-17220-0, in press.Meinhold, R. J., and N. D. Singpurwalla, 1989: Robustification of Kalman filter models.

,*J. Amer. Stat. Assoc.***84**, 479–486, doi:10.1080/01621459.1989.10478794.Mitchell, H. L., P. L. Houtekamer, and G. Pellerin, 2002: Ensemble size, balance, and model-error representation in an ensemble Kalman filter.

,*Mon. Wea. Rev.***130**, 2791–2808, doi:10.1175/1520-0493(2002)130<2791:ESBAME>2.0.CO;2.Poterjoy, J., 2016: A localized particle filter for high-dimensional nonlinear systems.

,*Mon. Wea. Rev.***144**, 59–76, doi:10.1175/MWR-D-15-0163.1.Poterjoy, J., and F. Zhang, 2011: Dynamics and structure of forecast error covariance in the core of a developing hurricane.

,*J. Atmos. Sci.***68**, 1586–1606, doi:10.1175/2011JAS3681.1.Slivinski, L., and C. Snyder, 2016: Exploring practical estimates of the ensemble size necessary for particle filters.

,*Mon. Wea. Rev.***144**, 861–875, doi:10.1175/MWR-D-14-00303.1.Snyder, C., T. Bengtsson, P. Bickel, and J. Anderson, 2008: Obstacles to high-dimensional particle filtering.

,*Mon. Wea. Rev.***136**, 4629–4640, doi:10.1175/2008MWR2529.1.Snyder, C., T. Bengtsson, and M. Morzfeld, 2015: Performance bounds for particle filters using the optimal proposal.

,*Mon. Wea. Rev.***143**, 4750–4761, doi:10.1175/MWR-D-15-0144.1.Talagrand, O., R. Vautard, and B. Strauss, 1997: Evaluation of probabilistic prediction systems.

*Proc. ECMWF Workshop on Predictability*, Reading, United Kingdom, ECMWF, 1–25.van Leeuwen, P. J., 2010: Nonlinear data assimilation in geosciences: An extremely efficient particle filter.

,*Quart. J. Roy. Meteor. Soc.***136**, 1991–1999, doi:10.1002/qj.699.Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations.

,*Mon. Wea. Rev.***130**, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation.

,*Mon. Wea. Rev.***140**, 3078–3089, doi:10.1175/MWR-D-11-00276.1.Zhang, F., C. Snyder, and J. Sun, 2004: Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter.

,*Mon. Wea. Rev.***132**, 1238–1253, doi:10.1175/1520-0493(2004)132<1238:IOIEAO>2.0.CO;2.

^{1}

We use systematic resampling (Kitagawa 1996) for this step, but any type of resampling method can be used.

^{2}

Like the PF, the rank histogram filter (Anderson 2010) also benefits from flexibility in choosing a likelihood function, except variables in state space are updated via a linear regression from the observation-space correction.

^{3}

We performed the same analysis for several other model levels and variables and find midlevel *T* to be representative of how forecasts are verified for most variables.