# Search Results

- Author or Editor: Jeffrey Anderson

## Abstract

An extension to standard ensemble Kalman filter algorithms that can improve performance for non-Gaussian prior distributions, non-Gaussian likelihoods, and bounded state variables is described. The algorithm exploits the capability of the rank histogram filter (RHF) to represent arbitrary prior distributions for observed variables. The rank histogram algorithm can be applied directly to state variables to produce posterior marginal ensembles without the need for regression that is part of standard ensemble filters. These marginals are used to adjust the marginals obtained from a standard ensemble filter that uses regression to update state variables. The final posterior ensemble is obtained by doing an ordered replacement of the posterior marginal ensemble values from a standard ensemble filter with the values obtained from the rank histogram method applied directly to state variables; the algorithm is referred to as the marginal adjustment rank histogram filter (MARHF). Applications to idealized bivariate problems and low-order dynamical systems show that the MARHF can produce better results than standard ensemble methods for priors that are non-Gaussian. Like the original RHF, the MARHF can also make use of arbitrary non-Gaussian observation likelihoods. The MARHF also has advantages for problems with bounded state variables, for instance, the concentration of an atmospheric tracer. Bounds can be automatically respected in the posterior ensembles. With an efficient implementation of the MARHF, the additional cost has better scaling than the standard RHF.
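
The ordered replacement step described above can be illustrated with a minimal sketch. It assumes the two posterior marginal ensembles for one state variable are already available as plain lists; the function name `ordered_replacement` and both argument names are hypothetical and are not taken from the paper. Each member keeps the rank it had in the regression-based posterior but takes the value of the same rank from the directly computed marginal.

```python
def ordered_replacement(regression_posterior, marginal_posterior):
    """Replace regression-based values with marginal values of equal rank.

    Both arguments are hypothetical inputs: one value per ensemble member
    for a single state variable.
    """
    n = len(regression_posterior)
    if len(marginal_posterior) != n:
        raise ValueError("both ensembles must have the same size")

    # Member indices ordered from smallest to largest regression posterior.
    order = sorted(range(n), key=lambda i: regression_posterior[i])
    # Values from the direct marginal update, smallest to largest.
    sorted_marginal = sorted(marginal_posterior)

    adjusted = [0.0] * n
    for rank, member in enumerate(order):
        # The member with this rank receives the marginal value of that rank.
        adjusted[member] = sorted_marginal[rank]
    return adjusted


adjusted = ordered_replacement([0.3, -0.1, 0.7], [0.0, 0.5, 0.2])
```

Because the output is a permutation of the marginal values, any bounds respected by the marginal ensemble carry over to the adjusted ensemble in this sketch.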

## Abstract

It is possible to describe many variants of ensemble Kalman filters without loss of generality as the impact of a single observation on a single state variable. For most ensemble algorithms commonly applied to Earth system models, the computation of increments for the observation variable ensemble can be treated as a separate step from computing increments for the state variable ensemble. The state variable increments are normally computed from the observation increments by linear regression using the prior bivariate ensemble of the state and observation variable. Here, a new method that replaces the standard regression with a regression using the bivariate rank statistics is described. This rank regression is expected to be most effective when the relation between a state variable and an observation is nonlinear. The performance of standard versus rank regression is compared for both linear and nonlinear forward operators (also known as observation operators) using a low-order model. Rank regression in combination with a rank histogram filter in observation space produces better analyses than standard regression for cases with nonlinear forward operators and relatively large analysis error. Standard regression, in combination with either a rank histogram filter or an ensemble Kalman filter in observation space, produces the best results in other situations.

## Abstract

Many methods using ensemble integrations of prediction models as integral parts of data assimilation have appeared in the atmospheric and oceanic literature. In general, these methods have been derived from the Kalman filter and have been known as ensemble Kalman filters. A more general class of methods including these ensemble Kalman filter methods is derived starting from the nonlinear filtering problem. When working in a joint state–observation space, many features of ensemble filtering algorithms are easier to derive and compare. The ensemble filter methods derived here make a (local) least squares assumption about the relation between prior distributions of an observation variable and model state variables. In this context, the update procedure applied when a new observation becomes available can be described in two parts. First, an update increment is computed for each prior ensemble estimate of the observation variable by applying a scalar ensemble filter. Second, a linear regression of the prior ensemble sample of each state variable on the observation variable is performed to compute update increments for each state variable ensemble member from corresponding observation variable increments. The regression can be applied globally or locally using Gaussian kernel methods.

Several previously documented ensemble Kalman filter methods, the perturbed observation ensemble Kalman filter and ensemble adjustment Kalman filter, are developed in this context. Some new ensemble filters that extend beyond the Kalman filter context are also discussed. The two-part method can provide a computationally efficient implementation of ensemble filters and allows more straightforward comparison of methods since they differ only in the solution of a scalar filtering problem.
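
The two-part update described above can be sketched for a single observation and a single state variable. This is an illustration under stated assumptions, not the paper's code: the scalar step uses the commonly quoted ensemble adjustment form with a Gaussian observation likelihood, the regression is global rather than local, and the function names `eakf_obs_increments` and `regress_increments` are hypothetical.

```python
from statistics import mean, variance  # variance() is the sample variance


def eakf_obs_increments(prior_obs, obs_value, obs_error_var):
    """Part 1: scalar update of the observation-variable ensemble.

    Assumes a Gaussian likelihood and a prior ensemble with nonzero variance.
    """
    prior_mean = mean(prior_obs)
    prior_var = variance(prior_obs)
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_error_var)
    post_mean = post_var * (prior_mean / prior_var + obs_value / obs_error_var)
    # Shift the mean and contract the spread deterministically.
    shrink = (post_var / prior_var) ** 0.5
    return [post_mean + shrink * (y - prior_mean) - y for y in prior_obs]


def regress_increments(prior_state, prior_obs, obs_increments):
    """Part 2: regress observation increments onto one state variable."""
    x_mean, y_mean = mean(prior_state), mean(prior_obs)
    n = len(prior_obs)
    cov = sum((x - x_mean) * (y - y_mean)
              for x, y in zip(prior_state, prior_obs)) / (n - 1)
    gain = cov / variance(prior_obs)  # regression coefficient
    return [gain * dy for dy in obs_increments]


prior_obs = [1.0, 2.0, 3.0]
prior_state = [2.0, 4.0, 6.0]
dy = eakf_obs_increments(prior_obs, obs_value=4.0, obs_error_var=1.0)
dx = regress_increments(prior_state, prior_obs, dy)
posterior_state = [x + d for x, d in zip(prior_state, dx)]
```

Swapping in a different scalar filter only changes the first function; the regression step is untouched, which is the sense in which the methods differ only in the solution of a scalar filtering problem.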

## Abstract

Ensemble Kalman filters are widely used for data assimilation in large geophysical models. Good results with affordable ensemble sizes require enhancements to the basic algorithms to deal with insufficient ensemble variance and spurious ensemble correlations between observations and state variables. These challenges are often dealt with by using inflation and localization algorithms. A new method for understanding and reducing some ensemble filter errors is introduced and tested. The method assumes that sampling error due to small ensemble size is the primary source of error. Sampling error in the ensemble correlations between observations and state variables is reduced by estimating the distribution of correlations as part of the ensemble filter algorithm. This correlation error reduction (CER) algorithm can produce high-quality ensemble assimilations in low-order models without using any a priori localization like a specified localization function. The method is also applied in an observing system simulation experiment with a very coarse resolution dry atmospheric general circulation model. This demonstrates that the algorithm provides insight into the need for localization in large geophysical applications, suggesting that sampling error may be a primary cause in some cases.

## Abstract

Localization is a method for reducing the impact of sampling errors in ensemble Kalman filters. Here, the regression coefficient, or gain, relating ensemble increments for observed quantity *y* to increments for state variable *x* is multiplied by a real number *α* defined as a localization. Localization of the impact of observations on model state variables is required for good performance when applying ensemble data assimilation to large atmospheric and oceanic problems. Localization also improves performance in idealized low-order ensemble assimilation applications. An algorithm that computes localization from the output of an ensemble observing system simulation experiment (OSSE) is described. The algorithm produces localizations for sets of pairs of observations and state variables: for instance, all state variables that are between 300- and 400-km horizontal distance from an observation. The algorithm is applied in a low-order model to produce localizations from the output of an OSSE and the computed localizations are then used in a new OSSE. Results are compared to assimilations using tuned localizations that are approximately Gaussian functions of the distance between an observation and a state variable. In most cases, the empirically computed localizations produce the lowest root-mean-square errors in subsequent OSSEs. Localizations derived from OSSE output can provide guidance for localization in real assimilation experiments. Applying the algorithm in large geophysical applications may help to tune localization for improved ensemble filter performance.
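
As a small illustration of how a localization *α* enters the update, the sketch below multiplies a regression coefficient by *α* before converting observation increments into state increments. The Gaussian shape, the `length_scale` parameter, and both function names are assumptions made for this example; the abstract only says the tuned localizations are approximately Gaussian functions of distance, and the empirically computed localizations need not follow any fixed shape.

```python
import math


def gaussian_localization(distance, length_scale):
    """Hypothetical localization: 1 at zero distance, decaying with distance."""
    return math.exp(-0.5 * (distance / length_scale) ** 2)


def localized_state_increments(gain, obs_increments, alpha):
    """Scale the regression coefficient (gain) by the localization alpha."""
    return [alpha * gain * dy for dy in obs_increments]


# A state variable 350 km from the observation, with an assumed 300 km scale.
alpha = gaussian_localization(distance=350.0, length_scale=300.0)
dx = localized_state_increments(gain=2.0, obs_increments=[1.0, -0.5], alpha=alpha)
```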

## Abstract

Knowledge of the probability distribution of initial conditions is central to almost all practical studies of predictability and to improvements in stochastic prediction of the atmosphere. Traditionally, data assimilation for atmospheric predictability or prediction experiments has attempted to find a single “best” estimate of the initial state. Additional information about the initial condition probability distribution is then obtained primarily through heuristic techniques that attempt to generate representative perturbations around the best estimate. However, a classical theory for generating an estimate of the complete probability distribution of an initial state given a set of observations exists. This nonlinear filtering theory can be applied to unify the data assimilation and ensemble generation problem and to produce superior estimates of the probability distribution of the initial state of the atmosphere (or ocean) on regional or global scales. A Monte Carlo implementation of the fully nonlinear filter has been developed and applied to several low-order models. The method is able to produce assimilations with small ensemble mean errors while also providing random samples of the initial condition probability distribution. The Monte Carlo method can be applied in models that traditionally require the application of initialization techniques without any explicit initialization. Initial application to larger models is promising, but a number of challenges remain before the method can be extended to large realistic forecast models.

## Abstract

Two techniques for estimating good localization functions for serial ensemble Kalman filters are compared in observing system simulation experiments (OSSEs) conducted with the dynamical core of an atmospheric general circulation model. The first technique, the global group filter (GGF), minimizes the root-mean-square (RMS) difference between the estimated regression coefficients using a hierarchical ensemble filter. The second, the empirical localization function (ELF), minimizes the RMS difference between the true values of the state variables and the posterior ensemble mean. Both techniques provide an estimate of the localization function for an observation’s impact on a state variable with few a priori assumptions about the localization function. The ELF localizations can have values larger than 1.0 at small distances, indicating that this technique addresses localization but also can correct the prior ensemble spread in the same way as a variance inflation when needed. OSSEs using ELF localizations generally have smaller root-mean-square error (RMSE) than the optimal Gaspari and Cohn (GC) localization function obtained by empirically tuning the GC width. The localization functions estimated by the GGF are broader than those from the ELF, and the OSSEs with the GGF localization generally have larger RMSE than the optimal GC localization function. The GGFs are too broad because of spurious correlation biases that occur in the OSSEs. These errors can be reduced by using a stochastic EnKF with perturbed observations instead of a deterministic EAKF.
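
For reference, the Gaspari and Cohn localization used as the baseline above is usually taken to be their fifth-order piecewise rational function. The sketch below implements that function under the assumption that the tunable width is expressed as a half-width, so the localization reaches zero at twice that distance; the function and parameter names are hypothetical.

```python
def gaspari_cohn(distance, half_width):
    """Fifth-order piecewise rational function with compact support.

    Returns 1 at zero distance and 0 at or beyond 2 * half_width.
    """
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")
    r = abs(distance) / half_width
    if r >= 2.0:
        return 0.0
    if r <= 1.0:
        # -r^5/4 + r^4/2 + 5r^3/8 - 5r^2/3 + 1
        return (((-0.25 * r + 0.5) * r + 0.625) * r - 5.0 / 3.0) * r * r + 1.0
    # r^5/12 - r^4/2 + 5r^3/8 + 5r^2/3 - 5r + 4 - 2/(3r)
    return (((((r / 12.0 - 0.5) * r + 0.625) * r + 5.0 / 3.0) * r - 5.0) * r
            + 4.0 - 2.0 / (3.0 * r))


weights = [gaspari_cohn(d, half_width=1.0) for d in (0.0, 0.5, 1.0, 1.5, 2.0)]
```

Unlike this function, which never exceeds 1, the ELF localizations described above can take values larger than 1.0 at small distances.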

## Abstract

A forced, nonlinear barotropic model on the sphere is shown to simulate some of the structure of the observed Northern Hemisphere midlatitude storm tracks with reasonable accuracy. For the parameter range chosen, the model has no unstable modes with significant amplitude in the storm track regions; however, several decaying modes with structures similar to the storm track are discovered. The model's midlatitude storm tracks also coincide with the location of a waveguide that is obtained by assuming that the horizontal variation of the time-mean flow is small compared with the scale of the transient eddies. Since the model is able to mimic the behavior of the observed storm tracks without any baroclinic dynamics, it is argued that the barotropic waveguide effects of the time-mean background flow acting on individual eddies are partially responsible for the observed storm track structure.

## Abstract

This study presents the first application of a localized particle filter (PF) for data assimilation in a high-dimensional geophysical model. Particle filters form Monte Carlo approximations of model probability densities conditioned on observations, while making no assumptions about the underlying error distribution. Unlike standard PFs, the local PF uses a localization function to reduce the influence of distant observations on state variables, which significantly decreases the number of particles required to maintain the filter’s stability. Because the local PF operates effectively using small numbers of particles, it provides a possible alternative to Gaussian filters, such as ensemble Kalman filters, for large geophysical models. In the current study, the local PF is compared with stochastic and deterministic ensemble Kalman filters using a simplified atmospheric general circulation model. The local PF is found to provide stable filtering results over yearlong data assimilation experiments using only 25 particles. The local PF also outperforms the Gaussian filters when observation networks include measurements that have non-Gaussian errors or relate nonlinearly to the model state, like remotely sensed data used frequently in atmospheric analyses. Results from this study encourage further testing of the local PF on more complex geophysical systems, such as weather prediction models.
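
To make the stability issue concrete, the sketch below shows the weighting step of a standard (non-localized) particle filter together with the effective sample size, a common diagnostic of weight collapse. It is not the local PF from the study: the function names are hypothetical, and the likelihood is passed in as an arbitrary callable so that no error distribution is assumed.

```python
def normalized_weights(particle_obs, likelihood):
    """Weight each particle by the likelihood of its predicted observation."""
    raw = [likelihood(y) for y in particle_obs]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("all particles have zero likelihood")
    return [w / total for w in raw]


def effective_sample_size(weights):
    """Equals the particle count for uniform weights, 1 for full collapse."""
    return 1.0 / sum(w * w for w in weights)


# Hypothetical example: a bounded (non-Gaussian) likelihood that only
# accepts predicted observations within 0.5 of the observed value 1.0.
particle_obs = [0.2, 0.8, 1.1, 3.0]
weights = normalized_weights(
    particle_obs, lambda y: 1.0 if abs(y - 1.0) <= 0.5 else 0.0)
ess = effective_sample_size(weights)
```

When many independent observations are assimilated at once, the weights of a standard PF tend to concentrate on very few particles; limiting each observation's influence to nearby state variables is the motivation for localization given above.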

## Abstract

Various generalizations of the univariate rank histogram have been proposed to inspect the reliability of an ensemble forecast or analysis in multidimensional spaces. Multivariate rank histograms provide insightful information about the misspecification of genuinely multivariate features such as the correlation between various variables in a multivariate ensemble. However, the interpretation of patterns in a multivariate rank histogram should be handled with care. The purpose of this paper is to focus on multivariate rank histograms designed based on the concept of data depth and outline some important considerations that should be accounted for when using such multivariate rank histograms. To generate correct multivariate rank histograms using the concept of data depth, the datatype of the ensemble should be taken into account to define a proper preranking function. This paper demonstrates how and why some preranking functions might not be suitable for multivariate or vector-valued ensembles and proposes preranking functions based on the concept of simplicial depth that are applicable to both multivariate points and vector-valued ensembles. In addition, there exists an inherent identifiability issue associated with center-outward preranking functions used to generate multivariate rank histograms. This problem can be alleviated by complementing the multivariate rank histogram with other well-known multivariate statistical inference tools based on rank statistics such as the depth-versus-depth (DD) plot. Using a synthetic example, it is shown that the DD plot is less sensitive to sample size compared to multivariate rank histograms.
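
As a rough illustration of the depth concept mentioned above, the sketch below computes the sample simplicial depth of a point in two dimensions: the fraction of triangles formed by triples of sample points that contain the point. It uses brute-force enumeration and closed triangles, and the function names are hypothetical; it is not the preranking function proposed in the paper, only the underlying depth for bivariate points.

```python
from itertools import combinations


def _cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle abc.

    Degenerate (collinear) triangles are not treated specially.
    """
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_negative and has_positive)


def simplicial_depth(p, points):
    """Fraction of triangles with vertices in `points` that contain p."""
    triangles = list(combinations(points, 3))
    if not triangles:
        raise ValueError("need at least three sample points")
    return sum(in_triangle(p, *t) for t in triangles) / len(triangles)


ensemble = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
depth_inner = simplicial_depth((1.0, 2.0), ensemble)
depth_outer = simplicial_depth((10.0, 10.0), ensemble)
```

Points near the center of the sample receive larger depth than outlying points, which is what makes depth usable as a center-outward ordering.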
