Search Results

You are looking at items 11–20 of 49 for

  • Author or Editor: Jeffrey L. Anderson
  • All content
Jeffrey L. Anderson

Abstract

Ensemble Kalman filters are widely used for data assimilation in large geophysical models. Good results with affordable ensemble sizes require enhancements to the basic algorithms to deal with insufficient ensemble variance and spurious ensemble correlations between observations and state variables. These challenges are often dealt with by using inflation and localization algorithms. A new method for understanding and reducing some ensemble filter errors is introduced and tested. The method assumes that sampling error due to small ensemble size is the primary source of error. Sampling error in the ensemble correlations between observations and state variables is reduced by estimating the distribution of correlations as part of the ensemble filter algorithm. This correlation error reduction (CER) algorithm can produce high-quality ensemble assimilations in low-order models without using any a priori localization like a specified localization function. The method is also applied in an observing system simulation experiment with a very coarse resolution dry atmospheric general circulation model. This demonstrates that the algorithm provides insight into the need for localization in large geophysical applications, suggesting that sampling error may be a primary cause in some cases.
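
As a brief illustration of the sampling error the abstract identifies (this is not the CER algorithm itself), the sketch below draws many small ensembles from a bivariate normal distribution with a known correlation and shows how widely the sample correlations between an "observation" and a "state variable" scatter. The true correlation, ensemble size, and number of trials are arbitrary illustrative choices.

```python
import numpy as np

# Illustration only: spread of sample correlations from small ensembles.
# Not the CER algorithm from the paper, just a demonstration of the
# sampling error it targets. true_corr and ens_size are arbitrary choices.
rng = np.random.default_rng(0)
true_corr = 0.3
ens_size = 20
cov = np.array([[1.0, true_corr], [true_corr, 1.0]])

sample_corrs = []
for _ in range(10000):
    # Draw a small "ensemble" of (observation, state) pairs.
    ens = rng.multivariate_normal([0.0, 0.0], cov, size=ens_size)
    sample_corrs.append(np.corrcoef(ens[:, 0], ens[:, 1])[0, 1])

sample_corrs = np.array(sample_corrs)
print(f"true correlation: {true_corr}")
print(f"mean sample correlation: {sample_corrs.mean():.3f}")
print(f"std of sample correlations: {sample_corrs.std():.3f}")
```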

Full access
Jeffrey L. Anderson

Abstract

It is possible to describe many variants of ensemble Kalman filters without loss of generality as the impact of a single observation on a single state variable. For most ensemble algorithms commonly applied to Earth system models, the computation of increments for the observation variable ensemble can be treated as a separate step from computing increments for the state variable ensemble. The state variable increments are normally computed from the observation increments by linear regression using the prior bivariate ensemble of the state and observation variable. Here, a new method that replaces the standard regression with a regression using the bivariate rank statistics is described. This rank regression is expected to be most effective when the relation between a state variable and an observation is nonlinear. The performance of standard versus rank regression is compared for both linear and nonlinear forward operators (also known as observation operators) using a low-order model. Rank regression in combination with a rank histogram filter in observation space produces better analyses than standard regression for cases with nonlinear forward operators and relatively large analysis error. Standard regression, in combination with either a rank histogram filter or an ensemble Kalman filter in observation space, produces the best results in other situations.
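
A minimal sketch of the contrast the abstract draws, assuming the usual two-step update in which state increments are obtained from observation increments. The first function is the conventional linear regression; the second is a hypothetical illustration of replacing the regression slope with one computed from bivariate ranks, and is not the paper's exact rank regression algorithm. All names and the rescaling back to physical units are assumptions.

```python
import numpy as np
from scipy.stats import rankdata

def standard_regression_increments(state_ens, obs_ens, obs_incs):
    """State increments via the usual linear regression of state on the observed variable."""
    cov = np.cov(state_ens, obs_ens)          # 2x2 sample covariance
    beta = cov[0, 1] / cov[1, 1]
    return beta * obs_incs

def rank_regression_increments(state_ens, obs_ens, obs_incs):
    """Hypothetical sketch: slope computed from bivariate ranks, then rescaled
    to physical units with a spread ratio. Not the paper's exact algorithm."""
    rs = rankdata(state_ens)
    ro = rankdata(obs_ens)
    beta_rank = np.cov(rs, ro)[0, 1] / np.var(ro, ddof=1)
    scale = np.std(state_ens, ddof=1) / np.std(obs_ens, ddof=1)
    return beta_rank * scale * obs_incs

# Tiny illustrative example with a nonlinear forward operator (made-up numbers).
rng = np.random.default_rng(0)
state = rng.standard_normal(20)
obs = np.exp(state) + 0.1 * rng.standard_normal(20)
obs_incs = 0.1 * rng.standard_normal(20)       # stand-in observation increments
print(standard_regression_increments(state, obs, obs_incs)[:3])
print(rank_regression_increments(state, obs, obs_incs)[:3])
```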

Full access
Jeffrey L. Anderson

Abstract

An extension to standard ensemble Kalman filter algorithms that can improve performance for non-Gaussian prior distributions, non-Gaussian likelihoods, and bounded state variables is described. The algorithm exploits the capability of the rank histogram filter (RHF) to represent arbitrary prior distributions for observed variables. The rank histogram algorithm can be applied directly to state variables to produce posterior marginal ensembles without the need for regression that is part of standard ensemble filters. These marginals are used to adjust the marginals obtained from a standard ensemble filter that uses regression to update state variables. The final posterior ensemble is obtained by doing an ordered replacement of the posterior marginal ensemble values from a standard ensemble filter with the values obtained from the rank histogram method applied directly to state variables; the algorithm is referred to as the marginal adjustment rank histogram filter (MARHF). Applications to idealized bivariate problems and low-order dynamical systems show that the MARHF can produce better results than standard ensemble methods for priors that are non-Gaussian. Like the original RHF, the MARHF can also make use of arbitrary non-Gaussian observation likelihoods. The MARHF also has advantages for problems with bounded state variables, for instance, the concentration of an atmospheric tracer. Bounds can be automatically respected in the posterior ensembles. With an efficient implementation of the MARHF, the additional cost has better scaling than the standard RHF.
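
A minimal sketch of the ordered-replacement step as described above, for a single state variable: the rank ordering of the regression-based posterior marginal is kept, while its values are replaced by the sorted values from the rank histogram posterior. Function and variable names are illustrative, and the full MARHF involves more than this step.

```python
import numpy as np

def marginal_adjustment(regression_posterior, rhf_posterior):
    """Ordered replacement sketch: keep the rank ordering of the regression-based
    posterior ensemble for one state variable, but substitute the sorted values
    of the posterior obtained by applying the rank histogram method directly to
    that variable. A sketch of the step described in the abstract, not a full
    MARHF implementation."""
    order = np.argsort(regression_posterior)       # ranks from the standard filter
    adjusted = np.empty_like(regression_posterior)
    adjusted[order] = np.sort(rhf_posterior)       # substitute RHF marginal values
    return adjusted

# Example: 5-member posterior marginals for one state variable (made-up numbers).
reg = np.array([1.2, 0.4, 2.0, 0.9, 1.5])
rhf = np.array([0.5, 0.8, 1.1, 1.6, 2.4])
print(marginal_adjustment(reg, rhf))
```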

Restricted access
Jeffrey L. Anderson

Abstract

An extremely simple chaotic model, the three-variable Lorenz convective model, is used in a perfect model setting to study the selection of initial conditions for ensemble forecasts. Observations with a known distribution of error are sampled from the “climate” of the simple model. Initial condition distributions that use only information about the observation and the observational error distribution (i.e., traditional Monte Carlo methods) are shown to differ from the correct initial condition distributions, which make use of additional information about the local structure of the model's attractor. Three relatively inexpensive algorithms for finding the local attractor structure in a simple model are examined; these make use of singular vectors, normal modes, and perturbed integrations. All of these are related to heuristic algorithms that have been applied to select ensemble members in operational forecast models. The method of perturbed integrations, which is somewhat similar to the “breeding” method used at the National Meteorological Center, is shown to be the most effective in this context. Validating the extension of such methods to realistic models is expected to be extremely difficult; however, it seems reasonable that utilizing all available information about the attractor structure of real forecast models when selecting ensemble initial conditions could improve the success of operational ensemble forecasts.
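
For readers unfamiliar with perturbed integrations, the following is a rough sketch of a breeding-like cycle on the three-variable Lorenz model, assuming a simple forward-Euler integration; the step size, cycle length, and rescaling amplitude are illustrative choices, and this is not the operational breeding method.

```python
import numpy as np

def lorenz63_step(x, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the three-variable Lorenz convective model."""
    dx = sigma * (x[1] - x[0])
    dy = x[0] * (rho - x[2]) - x[1]
    dz = x[0] * x[1] - beta * x[2]
    return x + dt * np.array([dx, dy, dz])

def breed_perturbation(x0, n_cycles=50, steps_per_cycle=8, size=0.1, seed=0):
    """Breeding-like cycle (illustrative, not the operational method): integrate
    a control and a perturbed state, then rescale the difference back to a fixed
    amplitude so it aligns with growing directions on the attractor."""
    rng = np.random.default_rng(seed)
    control = np.array(x0, dtype=float)
    perturbed = control + size * rng.standard_normal(3)
    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            control = lorenz63_step(control)
            perturbed = lorenz63_step(perturbed)
        diff = perturbed - control
        diff *= size / np.linalg.norm(diff)    # rescale to fixed amplitude
        perturbed = control + diff
    return control, diff

control, bred = breed_perturbation([1.0, 1.0, 20.0])
print("bred vector:", bred)
```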

Full access
Jeffrey L. Anderson and Stephen L. Anderson

Abstract

Knowledge of the probability distribution of initial conditions is central to almost all practical studies of predictability and to improvements in stochastic prediction of the atmosphere. Traditionally, data assimilation for atmospheric predictability or prediction experiments has attempted to find a single “best” estimate of the initial state. Additional information about the initial condition probability distribution is then obtained primarily through heuristic techniques that attempt to generate representative perturbations around the best estimate. However, a classical theory for generating an estimate of the complete probability distribution of an initial state given a set of observations exists. This nonlinear filtering theory can be applied to unify the data assimilation and ensemble generation problem and to produce superior estimates of the probability distribution of the initial state of the atmosphere (or ocean) on regional or global scales. A Monte Carlo implementation of the fully nonlinear filter has been developed and applied to several low-order models. The method is able to produce assimilations with small ensemble mean errors while also providing random samples of the initial condition probability distribution. The Monte Carlo method can be applied in models that traditionally require the application of initialization techniques without any explicit initialization. Initial application to larger models is promising, but a number of challenges remain before the method can be extended to large realistic forecast models.
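
A generic Monte Carlo Bayes update for a scalar observed variable is sketched below only to illustrate the nonlinear filtering idea: prior members are weighted by the observation likelihood and resampled. This is not the specific kernel-based implementation developed in the paper, and the ensemble size and observation values are arbitrary.

```python
import numpy as np

def monte_carlo_update(prior_ens, obs, obs_err_sd, rng):
    """Generic Monte Carlo Bayes update for a scalar observed variable:
    weight each prior member by a Gaussian observation likelihood and resample.
    Illustrates nonlinear filtering in general, not the paper's method."""
    weights = np.exp(-0.5 * ((prior_ens - obs) / obs_err_sd) ** 2)
    weights /= weights.sum()
    idx = rng.choice(len(prior_ens), size=len(prior_ens), p=weights)
    return prior_ens[idx]

rng = np.random.default_rng(2)
prior = rng.standard_normal(100) + 1.0     # 100-member prior ensemble (illustrative)
posterior = monte_carlo_update(prior, obs=0.2, obs_err_sd=0.5, rng=rng)
print(posterior.mean(), posterior.std())
```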

Full access
Sukyoung Lee and Jeffrey L. Anderson

Abstract

A forced, nonlinear barotropic model on the sphere is shown to simulate some of the structure of the observed Northern Hemisphere midlatitude storm tracks with reasonable accuracy. For the parameter range chosen, the model has no unstable modes with significant amplitude in the storm track regions; however, several decaying modes with structures similar to the storm track are discovered. The model's midlatitude storm tracks also coincide with the location of a waveguide that is obtained by assuming that the horizontal variation of the time-mean flow is small compared with the scale of the transient eddies. Since the model is able to mimic the behavior of the observed storm tracks without any baroclinic dynamics, it is argued that the barotropic waveguide effects of the time-mean background flow acting on individual eddies are partially responsible for the observed storm track structure.

Full access
Jonathan Poterjoy and Jeffrey L. Anderson

Abstract

This study presents the first application of a localized particle filter (PF) for data assimilation in a high-dimensional geophysical model. Particle filters form Monte Carlo approximations of model probability densities conditioned on observations, while making no assumptions about the underlying error distribution. Unlike standard PFs, the local PF uses a localization function to reduce the influence of distant observations on state variables, which significantly decreases the number of particles required to maintain the filter’s stability. Because the local PF operates effectively using small numbers of particles, it provides a possible alternative to Gaussian filters, such as ensemble Kalman filters, for large geophysical models. In the current study, the local PF is compared with stochastic and deterministic ensemble Kalman filters using a simplified atmospheric general circulation model. The local PF is found to provide stable filtering results over yearlong data assimilation experiments using only 25 particles. The local PF also outperforms the Gaussian filters when observation networks include measurements that have non-Gaussian errors or relate nonlinearly to the model state, like remotely sensed data used frequently in atmospheric analyses. Results from this study encourage further testing of the local PF on more complex geophysical systems, such as weather prediction models.
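
As a cartoon of the localization idea only (not the local PF update equations), the sketch below resamples particles from the observation likelihood and then blends the resampled and prior values for each state variable with a weight that falls to zero beyond an assumed localization radius; all names and the linear taper are illustrative assumptions.

```python
import numpy as np

def localized_pf_update(prior, likelihood, dist_to_obs, loc_radius, rng):
    """Cartoon of localizing a particle filter update (not the actual local PF):
    resample particles globally from the observation likelihood, then blend
    resampled and prior values per state variable so that variables far from
    the observation are barely changed.
    prior: (n_particles, n_state); likelihood: (n_particles,) weights;
    dist_to_obs: (n_state,) distances from each state variable to the observation."""
    n_particles, n_state = prior.shape
    w = likelihood / likelihood.sum()
    idx = rng.choice(n_particles, size=n_particles, p=w)
    sampled = prior[idx]
    # Simple compactly supported taper: 1 at the obs location, 0 beyond loc_radius.
    loc = np.clip(1.0 - dist_to_obs / loc_radius, 0.0, 1.0)
    return loc * sampled + (1.0 - loc) * prior
```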

Full access
Mahsa Mirzargar and Jeffrey L. Anderson

Abstract

Various generalizations of the univariate rank histogram have been proposed to inspect the reliability of an ensemble forecast or analysis in multidimensional spaces. Multivariate rank histograms provide insightful information about the misspecification of genuinely multivariate features such as the correlation between various variables in a multivariate ensemble. However, the interpretation of patterns in a multivariate rank histogram should be handled with care. The purpose of this paper is to focus on multivariate rank histograms designed based on the concept of data depth and outline some important considerations that should be accounted for when using such multivariate rank histograms. To generate correct multivariate rank histograms using the concept of data depth, the datatype of the ensemble should be taken into account to define a proper preranking function. This paper demonstrates how and why some preranking functions might not be suitable for multivariate or vector-valued ensembles and proposes preranking functions based on the concept of simplicial depth that are applicable to both multivariate points and vector-valued ensembles. In addition, there exists an inherent identifiability issue associated with center-outward preranking functions used to generate multivariate rank histograms. This problem can be alleviated by complementing the multivariate rank histogram with other well-known multivariate statistical inference tools based on rank statistics such as the depth-versus-depth (DD) plot. Using a synthetic example, it is shown that the DD plot is less sensitive to sample size compared to multivariate rank histograms.
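
A two-dimensional sketch of a simplicial-depth preranking function, assuming ensembles small enough that all member triples can be enumerated: the depth of a point is the fraction of triangles formed by ensemble-member triples that contain it. Real applications would need the higher-dimensional analogue and careful handling of ties; the names and sample values here are illustrative.

```python
import numpy as np
from itertools import combinations

def in_triangle(p, a, b, c):
    """Point-in-triangle test via signs of cross products (boundary counts as inside)."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = (d1 < 0) or (d2 < 0) or (d3 < 0)
    has_pos = (d1 > 0) or (d2 > 0) or (d3 > 0)
    return not (has_neg and has_pos)

def simplicial_depth(point, ensemble):
    """Fraction of triangles formed by ensemble-member triples that contain `point`.
    A 2-D illustration of a simplicial-depth preranking function."""
    triples = list(combinations(range(len(ensemble)), 3))
    count = sum(in_triangle(point, ensemble[i], ensemble[j], ensemble[k])
                for i, j, k in triples)
    return count / len(triples)

rng = np.random.default_rng(1)
ens = rng.standard_normal((10, 2))    # 10 ensemble members in 2-D
point = np.array([0.0, 0.0])
print("simplicial depth of verification point:", simplicial_depth(point, ens))
```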

Full access
Jeffrey L. Anderson and Nancy Collins

Abstract

A variant of a least squares ensemble (Kalman) filter that is suitable for implementation on parallel architectures is presented. This parallel ensemble filter produces results that are identical to those from sequential algorithms already described in the literature when forward observation operators that relate the model state vector to the expected value of observations are linear (although actual results may differ due to floating point arithmetic round-off error). For nonlinear forward observation operators, the sequential and parallel algorithms solve different linear approximations to the full problem but produce qualitatively similar results. The parallel algorithm can be implemented to produce identical answers with the state variable prior ensembles arbitrarily partitioned onto a set of processors for the assimilation step (no caveat on round-off is needed for this result).

Example implementations of the parallel algorithm are described for environments with low (high) communication latency and cost. Hybrids of these implementations and the traditional sequential ensemble filter can be designed to optimize performance for a variety of parallel computing environments. For large models on machines with good communications, it is possible to implement the parallel algorithm to scale efficiently to thousands of processors while bit-wise reproducing the results from a single processor implementation. Timing results on several Linux clusters are presented from an implementation appropriate for machines with low-latency communication.

Most ensemble Kalman filter variants that have appeared in the literature differ only in the details of how a prior ensemble estimate of a scalar observation is updated given an observed value and the observational error distribution. These details do not impact other parts of either the sequential or parallel filter algorithms here, so a variety of ensemble filters including ensemble square root and perturbed observations filters can be used with all the implementations described.
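
For concreteness, one common choice for this scalar observation-space update is the deterministic (ensemble adjustment, square root) form sketched below, which shifts the ensemble mean to the Gaussian posterior mean and rescales the spread; perturbed-observation filters would replace only this step. The example values are made up.

```python
import numpy as np

def scalar_obs_update(obs_prior_ens, obs_value, obs_err_var):
    """Deterministic (ensemble adjustment / square root) scalar update:
    shift the ensemble mean to the Gaussian posterior mean and rescale the
    deviations so the ensemble variance matches the posterior variance.
    Returns observation-space increments for each member."""
    prior_mean = obs_prior_ens.mean()
    prior_var = obs_prior_ens.var(ddof=1)
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_err_var)
    post_mean = post_var * (prior_mean / prior_var + obs_value / obs_err_var)
    updated = post_mean + np.sqrt(post_var / prior_var) * (obs_prior_ens - prior_mean)
    return updated - obs_prior_ens

rng = np.random.default_rng(3)
obs_prior = 2.0 + rng.standard_normal(40)   # 40-member prior in observation space
increments = scalar_obs_update(obs_prior, obs_value=1.5, obs_err_var=0.5)
print(increments[:5])
```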

Full access
Lili Lei and Jeffrey L. Anderson

Abstract

The empirical localization algorithm described here uses the output from an observing system simulation experiment (OSSE) and constructs localization functions that minimize the root-mean-square (RMS) difference between the truth and the posterior ensemble mean for state variables. This algorithm can automatically provide an estimate of the localization function and does not require empirical tuning of the localization scale. It can compute an appropriate localization function for any potential observation type and kind of state variable. The empirical localization algorithm is investigated in the Community Atmosphere Model, version 5 (CAM5). The empirical localization function (ELF) is computed for the horizontal and vertical separately so that the vertical localization is explored explicitly. The horizontal and vertical ELFs are also computed for different geographic regions. The ELFs varying with region have advantages over the single global ELF in the horizontal and vertical, because different localization functions are more effective in different regions. The ELFs computed from an OSSE can be used as the localization in a subsequent OSSE. After three iterations, the ELFs appear to have converged. When used as localization in an OSSE, the converged ELFs produce a significantly smaller RMS error of temperature and zonal and meridional winds than the best Gaspari–Cohn (GC) localization for a dependent verification period using the observations from the original OSSE, and a similar RMS error to the best GC for an independent verification period. The converged ELFs have a significantly smaller RMS error of surface pressure than the best GC for both dependent and independent verification periods.
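
A hedged sketch of the least squares idea behind an empirical localization function: for each separation bin, find the single multiplier that minimizes the squared difference between the localized increments and the increments that would move the prior mean to the truth. The variable names, binning, and synthetic inputs are illustrative assumptions, not the paper's CAM5 setup.

```python
import numpy as np

def empirical_localization(separation, increments, needed, bin_edges):
    """For each separation bin, the multiplier alpha minimizing
    sum((alpha * increment - needed)**2) is
    sum(increment * needed) / sum(increment**2).
    `increments` are unlocalized regression increments for the prior mean and
    `needed` is truth minus prior mean; both would come from OSSE output."""
    elf = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        mask = (separation >= bin_edges[b]) & (separation < bin_edges[b + 1])
        if mask.any() and np.sum(increments[mask] ** 2) > 0:
            elf[b] = np.sum(increments[mask] * needed[mask]) / np.sum(increments[mask] ** 2)
    return elf

# Synthetic, clearly made-up inputs just to exercise the function.
rng = np.random.default_rng(4)
separation = rng.uniform(0.0, 1000.0, 1000)                       # km, illustrative
needed = rng.standard_normal(1000)                                # truth minus prior mean
increments = needed * np.exp(-separation / 300.0) + 0.3 * rng.standard_normal(1000)
print(empirical_localization(separation, increments, needed, np.linspace(0.0, 1000.0, 6)))
```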

Full access