# Search Results

## You are looking at 1 - 10 of 36 items for :

- Author or Editor: Jeffrey Anderson x

- Monthly Weather Review x

- Refine by Access: All Content x

## Abstract

A general framework for deterministic univariate ensemble filtering is presented. The framework fits a continuous prior probability density function (PDF) to the prior ensemble. A functional representation for the observation likelihood is combined with the prior PDF to get a continuous analysis (posterior) PDF. Cumulative distribution functions for the prior and analysis are also required. The key innovation is that an analysis ensemble is computed so that the quantile of each ensemble member is the same as its prior quantile. Many choices for the prior PDF family and the likelihood function are described. A choice of normal prior with normal likelihood is equivalent to the ensemble adjustment Kalman filter. Some other choices for the prior include gamma, inverse gamma, beta, beta prime, lognormal, and exponential distributions. Both prior distributions and likelihoods can be defined over a set of intervals giving additional flexibility that can be used to implement methods like a Huber likelihood for observations with occasional outliers. Priors and likelihoods can also be defined as sums of distributions allowing choices like bivariate normals or kernel filters. Empirical distributions, for instance piecewise linear approximations to arbitrary PDFs and functions can be used. Another empirical choice leads to the rank histogram filter. Results here are univariate and can be used to compute increments for observed variables or marginal distributions for any variable for a reanalysis. Linear regression of increments can be used to update state variables in a serial filter to build a comprehensive data assimilation system. Part 2 will discuss other methods for extending the framework to multivariate data assimilation.

### Significance Statement

Data assimilation is used to combine information from model forecasts with subsequent observations to obtain better estimates of the current state of the atmosphere or other parts of the Earth system. Ensemble data assimilation uses a number of forecasts to get more information about uncertainty. A new method allows much more flexibility in the assumptions that must be made when doing ensemble data assimilation. As an example, the method can be better for quantities that are bounded like the amount of an atmospheric trace pollutant.

## Abstract

A general framework for deterministic univariate ensemble filtering is presented. The framework fits a continuous prior probability density function (PDF) to the prior ensemble. A functional representation for the observation likelihood is combined with the prior PDF to get a continuous analysis (posterior) PDF. Cumulative distribution functions for the prior and analysis are also required. The key innovation is that an analysis ensemble is computed so that the quantile of each ensemble member is the same as its prior quantile. Many choices for the prior PDF family and the likelihood function are described. A choice of normal prior with normal likelihood is equivalent to the ensemble adjustment Kalman filter. Some other choices for the prior include gamma, inverse gamma, beta, beta prime, lognormal, and exponential distributions. Both prior distributions and likelihoods can be defined over a set of intervals giving additional flexibility that can be used to implement methods like a Huber likelihood for observations with occasional outliers. Priors and likelihoods can also be defined as sums of distributions allowing choices like bivariate normals or kernel filters. Empirical distributions, for instance piecewise linear approximations to arbitrary PDFs and functions can be used. Another empirical choice leads to the rank histogram filter. Results here are univariate and can be used to compute increments for observed variables or marginal distributions for any variable for a reanalysis. Linear regression of increments can be used to update state variables in a serial filter to build a comprehensive data assimilation system. Part 2 will discuss other methods for extending the framework to multivariate data assimilation.

### Significance Statement

Data assimilation is used to combine information from model forecasts with subsequent observations to obtain better estimates of the current state of the atmosphere or other parts of the Earth system. Ensemble data assimilation uses a number of forecasts to get more information about uncertainty. A new method allows much more flexibility in the assumptions that must be made when doing ensemble data assimilation. As an example, the method can be better for quantities that are bounded like the amount of an atmospheric trace pollutant.

## Abstract

Many methods using ensemble integrations of prediction models as integral parts of data assimilation have appeared in the atmospheric and oceanic literature. In general, these methods have been derived from the Kalman filter and have been known as ensemble Kalman filters. A more general class of methods including these ensemble Kalman filter methods is derived starting from the nonlinear filtering problem. When working in a joint state–observation space, many features of ensemble filtering algorithms are easier to derive and compare. The ensemble filter methods derived here make a (local) least squares assumption about the relation between prior distributions of an observation variable and model state variables. In this context, the update procedure applied when a new observation becomes available can be described in two parts. First, an update increment is computed for each prior ensemble estimate of the observation variable by applying a scalar ensemble filter. Second, a linear regression of the prior ensemble sample of each state variable on the observation variable is performed to compute update increments for each state variable ensemble member from corresponding observation variable increments. The regression can be applied globally or locally using Gaussian kernel methods.

Several previously documented ensemble Kalman filter methods, the perturbed observation ensemble Kalman filter and ensemble adjustment Kalman filter, are developed in this context. Some new ensemble filters that extend beyond the Kalman filter context are also discussed. The two-part method can provide a computationally efficient implementation of ensemble filters and allows more straightforward comparison of methods since they differ only in the solution of a scalar filtering problem.

## Abstract

Many methods using ensemble integrations of prediction models as integral parts of data assimilation have appeared in the atmospheric and oceanic literature. In general, these methods have been derived from the Kalman filter and have been known as ensemble Kalman filters. A more general class of methods including these ensemble Kalman filter methods is derived starting from the nonlinear filtering problem. When working in a joint state–observation space, many features of ensemble filtering algorithms are easier to derive and compare. The ensemble filter methods derived here make a (local) least squares assumption about the relation between prior distributions of an observation variable and model state variables. In this context, the update procedure applied when a new observation becomes available can be described in two parts. First, an update increment is computed for each prior ensemble estimate of the observation variable by applying a scalar ensemble filter. Second, a linear regression of the prior ensemble sample of each state variable on the observation variable is performed to compute update increments for each state variable ensemble member from corresponding observation variable increments. The regression can be applied globally or locally using Gaussian kernel methods.

Several previously documented ensemble Kalman filter methods, the perturbed observation ensemble Kalman filter and ensemble adjustment Kalman filter, are developed in this context. Some new ensemble filters that extend beyond the Kalman filter context are also discussed. The two-part method can provide a computationally efficient implementation of ensemble filters and allows more straightforward comparison of methods since they differ only in the solution of a scalar filtering problem.

## Abstract

Ensemble Kalman filters are widely used for data assimilation in large geophysical models. Good results with affordable ensemble sizes require enhancements to the basic algorithms to deal with insufficient ensemble variance and spurious ensemble correlations between observations and state variables. These challenges are often dealt with by using inflation and localization algorithms. A new method for understanding and reducing some ensemble filter errors is introduced and tested. The method assumes that sampling error due to small ensemble size is the primary source of error. Sampling error in the ensemble correlations between observations and state variables is reduced by estimating the distribution of correlations as part of the ensemble filter algorithm. This correlation error reduction (CER) algorithm can produce high-quality ensemble assimilations in low-order models without using any a priori localization like a specified localization function. The method is also applied in an observing system simulation experiment with a very coarse resolution dry atmospheric general circulation model. This demonstrates that the algorithm provides insight into the need for localization in large geophysical applications, suggesting that sampling error may be a primary cause in some cases.

## Abstract

Ensemble Kalman filters are widely used for data assimilation in large geophysical models. Good results with affordable ensemble sizes require enhancements to the basic algorithms to deal with insufficient ensemble variance and spurious ensemble correlations between observations and state variables. These challenges are often dealt with by using inflation and localization algorithms. A new method for understanding and reducing some ensemble filter errors is introduced and tested. The method assumes that sampling error due to small ensemble size is the primary source of error. Sampling error in the ensemble correlations between observations and state variables is reduced by estimating the distribution of correlations as part of the ensemble filter algorithm. This correlation error reduction (CER) algorithm can produce high-quality ensemble assimilations in low-order models without using any a priori localization like a specified localization function. The method is also applied in an observing system simulation experiment with a very coarse resolution dry atmospheric general circulation model. This demonstrates that the algorithm provides insight into the need for localization in large geophysical applications, suggesting that sampling error may be a primary cause in some cases.

## Abstract

A number of operational atmospheric prediction centers now produce ensemble forecasts of the atmosphere. Because of the high-dimensional phase spaces associated with operational forecast models, many centers use constraints derived from the dynamics of the forecast model to define a greatly reduced subspace from which ensemble initial conditions are chosen. For instance, the European Centre for Medium-Range Weather Forecasts uses singular vectors of the forecast model and the National Centers for Environmental Prediction use the “breeding cycle” to determine a limited set of directions in phase space that are sampled by the ensemble forecast.

The use of dynamical constraints on the selection of initial conditions for ensemble forecasts is examined in a perfect model study using a pair of three-variable dynamical systems and a prescribed observational error distribution. For these systems, one can establish that the direct use of dynamical constraints has no impact on the error of the ensemble mean forecast and a negative impact on forecasts of higher-moment quantities such as forecast spread. Simple examples are presented to show that this is not a result of the use of low-order dynamical systems but is instead related to the fundamental nature of the dynamics of these particular low-order systems themselves. Unless operational prediction models have fundamentally different dynamics, this study suggests that the use of dynamically constrained ensembles may not be justified. Further studies with more realistic prediction models are needed to evaluate this possibility.

## Abstract

A number of operational atmospheric prediction centers now produce ensemble forecasts of the atmosphere. Because of the high-dimensional phase spaces associated with operational forecast models, many centers use constraints derived from the dynamics of the forecast model to define a greatly reduced subspace from which ensemble initial conditions are chosen. For instance, the European Centre for Medium-Range Weather Forecasts uses singular vectors of the forecast model and the National Centers for Environmental Prediction use the “breeding cycle” to determine a limited set of directions in phase space that are sampled by the ensemble forecast.

The use of dynamical constraints on the selection of initial conditions for ensemble forecasts is examined in a perfect model study using a pair of three-variable dynamical systems and a prescribed observational error distribution. For these systems, one can establish that the direct use of dynamical constraints has no impact on the error of the ensemble mean forecast and a negative impact on forecasts of higher-moment quantities such as forecast spread. Simple examples are presented to show that this is not a result of the use of low-order dynamical systems but is instead related to the fundamental nature of the dynamics of these particular low-order systems themselves. Unless operational prediction models have fundamentally different dynamics, this study suggests that the use of dynamically constrained ensembles may not be justified. Further studies with more realistic prediction models are needed to evaluate this possibility.

## Abstract

A deterministic square root ensemble Kalman filter and a stochastic perturbed observation ensemble Kalman filter are used for data assimilation in both linear and nonlinear single variable dynamical systems. For the linear system, the deterministic filter is simply a method for computing the Kalman filter and is optimal while the stochastic filter has suboptimal performance due to sampling error. For the nonlinear system, the deterministic filter has increasing error as ensemble size increases because all ensemble members but one become tightly clustered. In this case, the stochastic filter performs better for sufficiently large ensembles. A new method for computing ensemble increments in observation space is proposed that does not suffer from the pathological behavior of the deterministic filter while avoiding much of the sampling error of the stochastic filter. This filter uses the order statistics of the prior observation space ensemble to create an approximate continuous prior probability distribution in a fashion analogous to the use of rank histograms for ensemble forecast evaluation. This rank histogram filter can represent non-Gaussian observation space priors and posteriors and is shown to be competitive with existing filters for problems as large as global numerical weather prediction. The ability to represent non-Gaussian distributions is useful for a variety of applications such as convective-scale assimilation and assimilation of bounded quantities such as relative humidity.

## Abstract

A deterministic square root ensemble Kalman filter and a stochastic perturbed observation ensemble Kalman filter are used for data assimilation in both linear and nonlinear single variable dynamical systems. For the linear system, the deterministic filter is simply a method for computing the Kalman filter and is optimal while the stochastic filter has suboptimal performance due to sampling error. For the nonlinear system, the deterministic filter has increasing error as ensemble size increases because all ensemble members but one become tightly clustered. In this case, the stochastic filter performs better for sufficiently large ensembles. A new method for computing ensemble increments in observation space is proposed that does not suffer from the pathological behavior of the deterministic filter while avoiding much of the sampling error of the stochastic filter. This filter uses the order statistics of the prior observation space ensemble to create an approximate continuous prior probability distribution in a fashion analogous to the use of rank histograms for ensemble forecast evaluation. This rank histogram filter can represent non-Gaussian observation space priors and posteriors and is shown to be competitive with existing filters for problems as large as global numerical weather prediction. The ability to represent non-Gaussian distributions is useful for a variety of applications such as convective-scale assimilation and assimilation of bounded quantities such as relative humidity.

## Abstract

Localization is a method for reducing the impact of sampling errors in ensemble Kalman filters. Here, the regression coefficient, or gain, relating ensemble increments for observed quantity *y* to increments for state variable *x* is multiplied by a real number *α* defined as a localization. Localization of the impact of observations on model state variables is required for good performance when applying ensemble data assimilation to large atmospheric and oceanic problems. Localization also improves performance in idealized low-order ensemble assimilation applications. An algorithm that computes localization from the output of an ensemble observing system simulation experiment (OSSE) is described. The algorithm produces localizations for sets of pairs of observations and state variables: for instance, all state variables that are between 300- and 400-km horizontal distance from an observation. The algorithm is applied in a low-order model to produce localizations from the output of an OSSE and the computed localizations are then used in a new OSSE. Results are compared to assimilations using tuned localizations that are approximately Gaussian functions of the distance between an observation and a state variable. In most cases, the empirically computed localizations produce the lowest root-mean-square errors in subsequent OSSEs. Localizations derived from OSSE output can provide guidance for localization in real assimilation experiments. Applying the algorithm in large geophysical applications may help to tune localization for improved ensemble filter performance.

## Abstract

Localization is a method for reducing the impact of sampling errors in ensemble Kalman filters. Here, the regression coefficient, or gain, relating ensemble increments for observed quantity *y* to increments for state variable *x* is multiplied by a real number *α* defined as a localization. Localization of the impact of observations on model state variables is required for good performance when applying ensemble data assimilation to large atmospheric and oceanic problems. Localization also improves performance in idealized low-order ensemble assimilation applications. An algorithm that computes localization from the output of an ensemble observing system simulation experiment (OSSE) is described. The algorithm produces localizations for sets of pairs of observations and state variables: for instance, all state variables that are between 300- and 400-km horizontal distance from an observation. The algorithm is applied in a low-order model to produce localizations from the output of an OSSE and the computed localizations are then used in a new OSSE. Results are compared to assimilations using tuned localizations that are approximately Gaussian functions of the distance between an observation and a state variable. In most cases, the empirically computed localizations produce the lowest root-mean-square errors in subsequent OSSEs. Localizations derived from OSSE output can provide guidance for localization in real assimilation experiments. Applying the algorithm in large geophysical applications may help to tune localization for improved ensemble filter performance.

## Abstract

An extension to standard ensemble Kalman filter algorithms that can improve performance for non-Gaussian prior distributions, non-Gaussian likelihoods, and bounded state variables is described. The algorithm exploits the capability of the rank histogram filter (RHF) to represent arbitrary prior distributions for observed variables. The rank histogram algorithm can be applied directly to state variables to produce posterior marginal ensembles without the need for regression that is part of standard ensemble filters. These marginals are used to adjust the marginals obtained from a standard ensemble filter that uses regression to update state variables. The final posterior ensemble is obtained by doing an ordered replacement of the posterior marginal ensemble values from a standard ensemble filter with the values obtained from the rank histogram method applied directly to state variables; the algorithm is referred to as the marginal adjustment rank histogram filter (MARHF). Applications to idealized bivariate problems and low-order dynamical systems show that the MARHF can produce better results than standard ensemble methods for priors that are non-Gaussian. Like the original RHF, the MARHF can also make use of arbitrary non-Gaussian observation likelihoods. The MARHF also has advantages for problems with bounded state variables, for instance, the concentration of an atmospheric tracer. Bounds can be automatically respected in the posterior ensembles. With an efficient implementation of the MARHF, the additional cost has better scaling than the standard RHF.

## Abstract

An extension to standard ensemble Kalman filter algorithms that can improve performance for non-Gaussian prior distributions, non-Gaussian likelihoods, and bounded state variables is described. The algorithm exploits the capability of the rank histogram filter (RHF) to represent arbitrary prior distributions for observed variables. The rank histogram algorithm can be applied directly to state variables to produce posterior marginal ensembles without the need for regression that is part of standard ensemble filters. These marginals are used to adjust the marginals obtained from a standard ensemble filter that uses regression to update state variables. The final posterior ensemble is obtained by doing an ordered replacement of the posterior marginal ensemble values from a standard ensemble filter with the values obtained from the rank histogram method applied directly to state variables; the algorithm is referred to as the marginal adjustment rank histogram filter (MARHF). Applications to idealized bivariate problems and low-order dynamical systems show that the MARHF can produce better results than standard ensemble methods for priors that are non-Gaussian. Like the original RHF, the MARHF can also make use of arbitrary non-Gaussian observation likelihoods. The MARHF also has advantages for problems with bounded state variables, for instance, the concentration of an atmospheric tracer. Bounds can be automatically respected in the posterior ensembles. With an efficient implementation of the MARHF, the additional cost has better scaling than the standard RHF.

## Abstract

It is possible to describe many variants of ensemble Kalman filters without loss of generality as the impact of a single observation on a single state variable. For most ensemble algorithms commonly applied to Earth system models, the computation of increments for the observation variable ensemble can be treated as a separate step from computing increments for the state variable ensemble. The state variable increments are normally computed from the observation increments by linear regression using the prior bivariate ensemble of the state and observation variable. Here, a new method that replaces the standard regression with a regression using the bivariate rank statistics is described. This rank regression is expected to be most effective when the relation between a state variable and an observation is nonlinear. The performance of standard versus rank regression is compared for both linear and nonlinear forward operators (also known as observation operators) using a low-order model. Rank regression in combination with a rank histogram filter in observation space produces better analyses than standard regression for cases with nonlinear forward operators and relatively large analysis error. Standard regression, in combination with either a rank histogram filter or an ensemble Kalman filter in observation space, produces the best results in other situations.

## Abstract

It is possible to describe many variants of ensemble Kalman filters without loss of generality as the impact of a single observation on a single state variable. For most ensemble algorithms commonly applied to Earth system models, the computation of increments for the observation variable ensemble can be treated as a separate step from computing increments for the state variable ensemble. The state variable increments are normally computed from the observation increments by linear regression using the prior bivariate ensemble of the state and observation variable. Here, a new method that replaces the standard regression with a regression using the bivariate rank statistics is described. This rank regression is expected to be most effective when the relation between a state variable and an observation is nonlinear. The performance of standard versus rank regression is compared for both linear and nonlinear forward operators (also known as observation operators) using a low-order model. Rank regression in combination with a rank histogram filter in observation space produces better analyses than standard regression for cases with nonlinear forward operators and relatively large analysis error. Standard regression, in combination with either a rank histogram filter or an ensemble Kalman filter in observation space, produces the best results in other situations.

## Abstract

Ensemble Kalman filters use the sample covariance of an observation and a model state variable to update a prior estimate of the state variable. The sample covariance can be suboptimal as a result of small ensemble size, model error, model nonlinearity, and other factors. The most common algorithms for dealing with these deficiencies are inflation and covariance localization. A statistical model of errors in ensemble Kalman filter sample covariances is described and leads to an algorithm that reduces ensemble filter root-mean-square error for some applications. This sampling error correction algorithm uses prior information about the distribution of the correlation between an observation and a state variable. Offline Monte Carlo simulation is used to build a lookup table that contains a correction factor between 0 and 1 depending on the ensemble size and the ensemble sample correlation. Correction factors are applied like a traditional localization for each pair of observations and state variables during an ensemble assimilation. The algorithm is applied to two low-order models and reduces the sensitivity of the ensemble assimilation error to the strength of traditional localization. When tested in perfect model experiments in a larger model, the dynamical core of a general circulation model, the sampling error correction algorithm produces analyses that are closer to the truth and also reduces sensitivity to traditional localization strength.

## Abstract

Ensemble Kalman filters use the sample covariance of an observation and a model state variable to update a prior estimate of the state variable. The sample covariance can be suboptimal as a result of small ensemble size, model error, model nonlinearity, and other factors. The most common algorithms for dealing with these deficiencies are inflation and covariance localization. A statistical model of errors in ensemble Kalman filter sample covariances is described and leads to an algorithm that reduces ensemble filter root-mean-square error for some applications. This sampling error correction algorithm uses prior information about the distribution of the correlation between an observation and a state variable. Offline Monte Carlo simulation is used to build a lookup table that contains a correction factor between 0 and 1 depending on the ensemble size and the ensemble sample correlation. Correction factors are applied like a traditional localization for each pair of observations and state variables during an ensemble assimilation. The algorithm is applied to two low-order models and reduces the sensitivity of the ensemble assimilation error to the strength of traditional localization. When tested in perfect model experiments in a larger model, the dynamical core of a general circulation model, the sampling error correction algorithm produces analyses that are closer to the truth and also reduces sensitivity to traditional localization strength.

## Abstract

A theory for estimating the probability distribution of the state of a model given a set of observations exists. This nonlinear filtering theory unifies the data assimilation and ensemble generation problem that have been key foci of prediction and predictability research for numerical weather and ocean prediction applications. A new algorithm, referred to as an ensemble adjustment Kalman filter, and the more traditional implementation of the ensemble Kalman filter in which “perturbed observations” are used, are derived as Monte Carlo approximations to the nonlinear filter. Both ensemble Kalman filter methods produce assimilations with small ensemble mean errors while providing reasonable measures of uncertainty in the assimilated variables. The ensemble methods can assimilate observations with a nonlinear relation to model state variables and can also use observations to estimate the value of imprecisely known model parameters. These ensemble filter methods are shown to have significant advantages over four-dimensional variational assimilation in low-order models and scale easily to much larger applications. Heuristic modifications to the filtering algorithms allow them to be applied efficiently to very large models by sequentially processing observations and computing the impact of each observation on each state variable in an independent calculation. The ensemble adjustment Kalman filter is applied to a nondivergent barotropic model on the sphere to demonstrate the capabilities of the filters in models with state spaces that are much larger than the ensemble size.

When observations are assimilated in the traditional ensemble Kalman filter, the resulting updated ensemble has a mean that is consistent with the value given by filtering theory, but only the expected value of the covariance of the updated ensemble is consistent with the theory. The ensemble adjustment Kalman filter computes a linear operator that is applied to the prior ensemble estimate of the state, resulting in an updated ensemble whose mean and also covariance are consistent with the theory. In the cases compared here, the ensemble adjustment Kalman filter performs significantly better than the traditional ensemble Kalman filter, apparently because noise introduced into the assimilated ensemble through perturbed observations in the traditional filter limits its relative performance. This superior performance may not occur for all problems and is expected to be most notable for small ensembles. Still, the results suggest that careful study of the capabilities of different varieties of ensemble Kalman filters is appropriate when exploring new applications.

## Abstract

A theory for estimating the probability distribution of the state of a model given a set of observations exists. This nonlinear filtering theory unifies the data assimilation and ensemble generation problem that have been key foci of prediction and predictability research for numerical weather and ocean prediction applications. A new algorithm, referred to as an ensemble adjustment Kalman filter, and the more traditional implementation of the ensemble Kalman filter in which “perturbed observations” are used, are derived as Monte Carlo approximations to the nonlinear filter. Both ensemble Kalman filter methods produce assimilations with small ensemble mean errors while providing reasonable measures of uncertainty in the assimilated variables. The ensemble methods can assimilate observations with a nonlinear relation to model state variables and can also use observations to estimate the value of imprecisely known model parameters. These ensemble filter methods are shown to have significant advantages over four-dimensional variational assimilation in low-order models and scale easily to much larger applications. Heuristic modifications to the filtering algorithms allow them to be applied efficiently to very large models by sequentially processing observations and computing the impact of each observation on each state variable in an independent calculation. The ensemble adjustment Kalman filter is applied to a nondivergent barotropic model on the sphere to demonstrate the capabilities of the filters in models with state spaces that are much larger than the ensemble size.

When observations are assimilated in the traditional ensemble Kalman filter, the resulting updated ensemble has a mean that is consistent with the value given by filtering theory, but only the expected value of the covariance of the updated ensemble is consistent with the theory. The ensemble adjustment Kalman filter computes a linear operator that is applied to the prior ensemble estimate of the state, resulting in an updated ensemble whose mean and also covariance are consistent with the theory. In the cases compared here, the ensemble adjustment Kalman filter performs significantly better than the traditional ensemble Kalman filter, apparently because noise introduced into the assimilated ensemble through perturbed observations in the traditional filter limits its relative performance. This superior performance may not occur for all problems and is expected to be most notable for small ensembles. Still, the results suggest that careful study of the capabilities of different varieties of ensemble Kalman filters is appropriate when exploring new applications.