1. Introduction

















In order for
If we consider the true state to be a function of space and time evolving in an infinite-dimensional function space, then the truncated true state and the observation are essentially two different finite-dimensional projections of this infinite-dimensional space (Dee 1995; Janjić and Cohn 2006; Oke and Sakov 2008). In section 2 we will show that the errors between the exact projections and the finite-dimensional approximations are correlated for generic observations. Error correlations arising from model truncation have been previously observed (Mitchell and Daley 1997; Hamill and Whitaker 2005; Liu and Rabier 2002). In particular, models are often composed of discrete dynamics occurring at points of a two- or three-dimensional grid. Remote observations by satellite or radiosonde can be viewed as integrations over a region including several grid points. Here, we provide a general framework to explain correlations between these quantities. We also consider other sources of error such as model mismatch and instrument error, and we show that significant correlations persist except in the case where instrument error dominates (since this error will be modeled as white noise that is uncorrelated with the state).
In section 3, a correlated version of the EnKF is developed that takes the correlations in (3) into account, and recovers the Kalman equations for linear systems. In section 4, the correlated unscented Kalman filter (CUKF; an unscented version of the EnKF) is applied to examples from section 2. Using an appropriate
In section 5 we investigate the effect of correlations in greater detail. First, we show that in the linear case when
2. Correlation between system and observation errors
To understand how correlations arise in applied data assimilation, we must first leave behind the idealized scenario described in (1) and (2). Following Dee (1995) and Janjić and Cohn (2006), we describe the true evolution and observation processes by replacing the discrete solution

































a. Evaluation and averaging projections














































As a special case, consider the situation when both the system and observation errors are dominated by the same single variable (either time or one of the spatial variables). In this case, the leading-order terms would differ only by a constant, so that up to higher-order terms the system error and observation error would be multiples of one another. This not only implies that








Even for general nonlinear observation functions h and approximate observation functions
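Returning to the special case of a shared dominant variable, a schematic version of the argument is the following (the symbols a, b, g_k, and the order p below are illustrative placeholders, not the detailed expansion of this section). Suppose that at step k
\[
\epsilon^{\mathrm{sys}}_k = a\,g_k + O(h^{p+1}), \qquad \epsilon^{\mathrm{obs}}_k = b\,g_k + O(h^{p+1}),
\]
where g_k = O(h^p) is the common leading factor (for example, a higher derivative of the true solution at the kth grid point) and a and b are constants determined by the solver and the averaging kernel. Treating the errors as random through their variation along the true trajectory,
\[
\operatorname{cov}\!\big(\epsilon^{\mathrm{sys}}_k,\epsilon^{\mathrm{obs}}_k\big) = a b\,\operatorname{var}(g_k) + O(h^{2p+1}),
\]
so the correlation coefficient approaches \(\pm 1\) (the sign of \(ab\)) as \(h \to 0\), and the cross covariance approaches \((b/a)\) times the system error variance rather than zero.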
b. Time-averaged observations of an ODE






























c. Estimation of the full covariance matrix
The correlations described above will be illustrated in two simple examples. To show the effects most clearly, we assume a perfect model. We begin with a simple ODE solver.
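As a concrete illustration of the kind of experiment used here, the following minimal sketch (not the authors' code; the step size h = 0.05, the number of fine substeps, the backward-in-time averaging window, and observing all three coordinates are assumptions made for illustration) generates a fine-resolution Lorenz-63 trajectory, forms one-step forward Euler forecasts from the exact states, observes the time average of the state over each coarse step, and reports the sample correlation between the resulting system and observation errors.

import numpy as np

def lorenz63(u, sigma=10.0, rho=28.0, beta=8.0/3.0):
    x, y, z = u
    return np.array([sigma*(y - x), x*(rho - z), x*y - beta*z])

def rk4_step(f, u, dt):
    k1 = f(u); k2 = f(u + 0.5*dt*k1); k3 = f(u + 0.5*dt*k2); k4 = f(u + dt*k3)
    return u + dt*(k1 + 2.0*k2 + 2.0*k3 + k4)/6.0

h, n_fine = 0.05, 50                      # coarse step and fine substeps per step (assumed values)
dt = h/n_fine
u = np.array([1.0, 1.0, 1.0])
for _ in range(10000):                    # spin up toward the attractor
    u = rk4_step(lorenz63, u, dt)

sys_err, obs_err = [], []
for _ in range(5000):
    avg, v = np.zeros(3), u.copy()
    for _ in range(n_fine):               # high-resolution reference ("truth") over one coarse step
        avg += v*dt
        v = rk4_step(lorenz63, v, dt)
    avg /= h                              # time-averaged observation over the step
    euler = u + h*lorenz63(u)             # truncated model: one forward Euler step from the exact state
    sys_err.append(euler - v)             # system (truncation) error
    obs_err.append(avg - v)               # observation (representation) error
    u = v

sys_err, obs_err = np.array(sys_err), np.array(obs_err)
print([np.corrcoef(sys_err[:, i], obs_err[:, i])[0, 1] for i in range(3)])

Stacked error samples of this kind are also what the empirical covariance estimates discussed below are computed from.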








Demonstrating correlated noise in the truncated L63 system. (a) Comparing the true x coordinate of L63 (gray) to a one-step forecast using the forward Euler method (blue, solid curve) with
In Fig. 1c we show the estimated covariance matrix
The difference between the positive correlations in Figs. 1b and 1c and the negative correlations in Figs. 1e and 1f will have a noticeable effect on filter accuracy, as shown in section 4a. In section 5, we will establish a theory explaining this disparity in the linear case.
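For completeness, here is a minimal sketch of how such an empirical estimate can be formed from paired error samples (the function name and the synthetic errors in the usage example are illustrative and not part of the paper):

import numpy as np

def estimate_full_covariance(sys_err, obs_err):
    # estimate the blocks Q, S, R of the joint covariance of stacked
    # system-error samples (K x N) and observation-error samples (K x M)
    joint = np.hstack([sys_err, obs_err])
    C = np.cov(joint, rowvar=False)
    N = sys_err.shape[1]
    return C[:N, :N], C[:N, N:], C[N:, N:]          # Q, S, R

# illustrative usage with synthetic errors sharing a common leading-order factor
rng = np.random.default_rng(0)
g = rng.standard_normal((10000, 1))
w = 1.0*g + 0.1*rng.standard_normal((10000, 1))     # synthetic system errors
v = 0.5*g + 0.1*rng.standard_normal((10000, 1))     # synthetic observation errors
Q, S, R = estimate_full_covariance(w, v)
print(S / np.sqrt(Q * R))                           # correlation coefficient close to 1

The off-diagonal block S carries the cross correlations discussed above; setting it to zero recovers the standard uncorrelated treatment.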






















In Fig. 2 we compare the full-resolution and truncated solutions for 64 grid points and

(top) (left) Ground truth 512-gridpoint solution, (middle left) the same solution decimated to 64 grid points, (middle right) the observation, which integrates the leftmost solution over 9 grid points before truncating, and (right) the observation error, which is the difference between the middle two solutions. (bottom) (left),(middle left) As in (top). (middle right) The 1-step integrator output from the truncated model, using 64 grid points and
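The observation operator described in the caption above combines local integration with truncation. A minimal sketch of such an operator is given below (assuming a periodic fine grid of 512 points decimated 8:1 to 64 points, with a simple 9-point average standing in for the local integration; the details of the quadrature used in the paper may differ):

import numpy as np

def decimate(u_fine, n_coarse=64):
    # keep every (len(u_fine) // n_coarse)th value of the fine-grid field
    stride = len(u_fine) // n_coarse
    return u_fine[::stride]

def local_average_obs(u_fine, n_coarse=64, width=9):
    # average the fine-grid field over `width` points around each location
    # (periodic boundary), then truncate to the coarse grid
    half = width // 2
    smoothed = np.zeros_like(u_fine)
    for k in range(-half, half + 1):
        smoothed += np.roll(u_fine, k)
    smoothed /= width
    return decimate(smoothed, n_coarse)

u_fine = np.random.default_rng(1).standard_normal(512)   # stand-in for one fine-grid KS snapshot
obs_err = local_average_obs(u_fine) - decimate(u_fine)    # representation error of this observation

Applying this to snapshots of the fine-grid solution yields observation-error fields of the type shown in the top-right panel of Fig. 2.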
It is helpful to view the empirical full covariance matrix

For the Kuramoto–Sivashinsky model truncated onto 64 grid points with



In Fig. 3 we emphasize the presence of many small eigenvalues, indicating that the
3. Filtering in the presence of correlations
In this section we review versions of the Kalman filter for linear and nonlinear dynamics, which include full correlations of system and observation errors. We begin with the linear formulas, and then discuss the unscented version of the ensemble Kalman filter for nonlinear models.
a. The Kalman filter for correlated system and observation noise
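For reference, the standard Kalman update in the presence of a nonzero cross covariance S between the system noise entering the forecast and the observation noise (see, e.g., Simon 2006) can be summarized as follows; the notation and indexing conventions here are one common choice and are not necessarily identical to those used in the equations of this section. For the linear model x_k = A x_{k-1} + w_{k-1}, y_k = H x_k + v_k, with cov(w) = Q, cov(v) = R, and E[w_{k-1} v_k^T] = S,
\[
\begin{aligned}
x^f_k &= A\,x^a_{k-1}, \qquad P^f_k = A\,P^a_{k-1}A^{\mathsf T} + Q,\\
K_k &= \big(P^f_k H^{\mathsf T} + S\big)\big(H P^f_k H^{\mathsf T} + H S + S^{\mathsf T} H^{\mathsf T} + R\big)^{-1},\\
x^a_k &= x^f_k + K_k\big(y_k - H x^f_k\big),\\
P^a_k &= P^f_k - K_k\big(H P^f_k + S^{\mathsf T}\big).
\end{aligned}
\]
Setting S = 0 recovers the familiar Kalman filter, while a nonzero S modifies both the gain and the analysis covariance.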





























b. The correlated unscented Kalman filter (CUKF)
We now generalize the correlated system and observation noise filtering approach to nonlinear systems, and we show that for linear systems it recovers exactly the equations above.








































In appendix A we show the equivalence of the CUKF and the Kalman filter (KF) for linear problems with correlated noise, demonstrating that the CUKF is a natural generalization to nonlinear problems with correlated errors. We note that the generalization of the CUKF approach to an ensemble square root Kalman filter (EnSQKF) is a straightforward extension of the same Kalman update formulas. Integrating correlated noise into other Kalman filters such as the EnKF and the ensemble transform Kalman filter (ETKF) can also be achieved using the Kalman update for correlated noise. For large problems the covariance matrix would need to be localized in order for the method to be practical; for example, the localized ensemble transform Kalman filter (LETKF) can be adapted to use the unscented ensembles used here (Berry and Sauer 2013). A significant remaining task is generalizing the ensemble adjustment Kalman filter (EAKF) to additive system noise and to correlated system and observation noise. Serial filters such as the EAKF cannot currently accommodate even additive system noise that is uncorrelated with the observation noise; instead, these filters typically use inflation to try to account for system error. Generalizing the serial filtering approach to allow these more general noise models is a significant and important task and is beyond the scope of this article.
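The equivalence in appendix A rests in part on a property that is easy to check numerically: the unscented (sigma point) transform reproduces means and covariances exactly under linear maps. The following small check of that property (with arbitrary test matrices and the basic sigma-point weights of Julier and Uhlmann 2004; a sanity check, not the appendix's proof) illustrates why no approximation is introduced in the linear case.

import numpy as np

def sigma_points(m, P, kappa=1.0):
    # basic unscented sigma points and weights (Julier and Uhlmann 2004)
    n = len(m)
    L = np.linalg.cholesky((n + kappa) * P)
    pts = [m] + [m + L[:, i] for i in range(n)] + [m - L[:, i] for i in range(n)]
    w = np.full(2*n + 1, 1.0 / (2.0*(n + kappa)))
    w[0] = kappa / (n + kappa)
    return np.array(pts), w

rng = np.random.default_rng(2)
n = 4
m = rng.standard_normal(n)
B = rng.standard_normal((n, n)); P = B @ B.T + n*np.eye(n)   # arbitrary SPD covariance
A = rng.standard_normal((n, n)); b = rng.standard_normal(n)  # arbitrary linear map

pts, w = sigma_points(m, P)
mapped = pts @ A.T + b                                       # push sigma points through the map
mean_ut = w @ mapped
dev = mapped - mean_ut
cov_ut = dev.T @ (dev * w[:, None])

print(np.allclose(mean_ut, A @ m + b))    # True: the transform is exact for the mean
print(np.allclose(cov_ut, A @ P @ A.T))   # True: and for the covariance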
4. Filtering systems with truncation errors
In this section we apply the CUKF to truncated observations of the Lorenz-63 and Kuramoto–Sivashinsky systems as described in section 2. The dynamics and observations considered in this section have no added noise, so that the system errors arise only from truncation of the numerical solvers, and the observation errors arise only from local integration (representation error only). The CUKF will use the empirically estimated
a. Example: Lorenz equations
First we consider the Lorenz-63 system in (8) with the observation described in section 2b. Using the same data generated in that example, we applied the CUKF and UKF. The estimates produced by these filters are shown in Fig. 4a (for the same time interval shown in Fig. 1). In Fig. 4b we show the errors between each filter’s estimates and the truth, compared to the observation representation errors over the same time interval. The CUKF, which uses the full

(a) Comparison of the true solution,
We then repeated this experiment using the RK4 integrator instead of forward Euler with the same truncated time step of
b. Example: Kuramoto–Sivashinsky
Next we consider filtering the observations of the Kuramoto–Sivashinsky model in (9) introduced in section 2c. Using a ground truth integrated with 512 spatial grid points and

(a),(b) Comparison of filter results using the UKF without correlations to filtering with correlations (CUKF) on the Kuramoto–Sivashinsky model truncated in space to
The results of both UKFs are robust for large
Finally, in Fig. 5d we show the effect of inflation in the UKF by adding a constant multiple of the identity to either
5. Maximally correlated random variables and perfect recoverability
In the previous section, the importance of using the full correlation matrix
a. Maximum correlation
We begin by defining maximally correlated random variables.
Definition 5.1 (Maximally correlated random variables)
Let






















A simple example of maximal correlation is to consider the
It follows from Lemma B.1 that given random variables
Now consider the case when
b. Perfect recoverability in maximally correlated linear systems


























If (18) has a solution that is stabilizing, meaning that all the eigenvalues of
Theorem 5.2








Notice that the Kalman filter has an asymmetry between


















Mean-squared error of filter estimates for linear models























c. Examples of UKF and perfect recovery in nonlinear systems
In this section we will apply the UKF to synthetic datasets generated with nonlinear dynamics where the system and observation errors are Gaussian-distributed pseudorandom numbers. A surprising result is that despite the nonlinearity, we still obtain perfect recovery up to numerical precision for maximally correlated errors. Moreover, in analogy to the linear case, perfect recovery is not possible when the instabilities in the nonlinear dynamics become sufficiently strong.
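For concreteness, one simple way to generate jointly Gaussian system and observation errors whose joint covariance is degenerate, so that the observation error is a deterministic function of the system error, is sketched below for the square case (the covariance values are illustrative, and the construction is meant only to mirror the spirit of Definition 5.1 and Lemma B.1, which handle the general case):

import numpy as np

rng = np.random.default_rng(3)
N = 3
Q = 0.01*np.eye(N)                     # system error covariance (illustrative values)
R = 0.04*np.eye(N)                     # observation error covariance (illustrative values)
Lq, Lr = np.linalg.cholesky(Q), np.linalg.cholesky(R)

z = rng.standard_normal((N, 5000))     # one shared Gaussian draw per assimilation step
w = Lq @ z                             # system errors, Cov(w) = Lq Lq^T = Q
v = Lr @ z                             # observation errors, Cov(v) = R
S = Lq @ Lr.T                          # cross covariance E[w v^T]; the joint covariance has rank N

# sample cross covariance agrees with S up to sampling error
print(np.allclose(np.cov(np.vstack([w, v]))[:N, N:], S, atol=2e-3))

Driving the discrete dynamics and observations with w and v generated this way, and supplying the filter with the corresponding S, is the kind of maximally correlated setting referred to above.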
We first consider the Lorenz-63 system in (8) introduced above, with the discrete-time dynamics

Mean-squared error of filter estimates with positively correlated noise for (a) L63 in periodic and chaotic parameter regimes and (b) L96 dynamical systems for various values of the forcing parameter. Black curve is based on the filter using















6. Discussion
Approximating a dynamical system on a grid is pervasive in geophysical data assimilation applications. For dynamical processes, time is usually handled in a discrete fashion. We have shown that correlation between system and observation errors should be expected when the system errors derive from local truncation errors of differential equation solvers, both in discrete time and on a spatial grid, and when the observation error is dominated by either observation model error or representation error.
In section 3, we introduced an approach to the ensemble Kalman filter that accounts for the correlations between system and observation errors. In particular, we showed that for spatiotemporal problems, extending the covariance matrix to allow cross correlations can reduce filtering error as much as a significant increase in grid resolution. Of course, obtaining more precise estimates of the truth with much coarser discretization allows faster runtimes and/or larger ensembles to be used.
Correlations are most significant when other independent sources of observation and system error are small compared to the truncation error. Of course, other sources of error, such as model error, may influence both the state and the observations, leading to further significant correlations, but for simplicity we focus on correlations arising in the perfect-model scenario. It is reasonable to expect that in many physical systems, the noise affecting the state of the system would also affect the sensor or observation system.
The generalization of the CUKF to an ensemble square root Kalman filter (EnSQKF) is a straightforward extension. However, it remains to extend the ensemble adjustment Kalman filter (EAKF) for additive system noise to correlations between system and observation noise. An EAKF formulation is critical for situations in which the ensemble size is necessarily much smaller than either the state or observation dimensions (N and M, respectively). This situation is common when the covariance matrices, which are used explicitly in the UKF approach above, do not fit in memory. A significant challenge in this formulation is that we cannot appropriately inflate the ensemble since we assume the full correlation matrix
In this article, we have not dealt with the question of real-time estimation of the full covariance matrix
Acknowledgments
We thank three reviewers whose helpful suggestions led to a much improved paper. This research was partially supported by National Science Foundation Grant DMS1723175.
APPENDIX A
Equivalence of CUKF and KF for Linear Problems with Correlated Errors

















APPENDIX B
Maximal Correlation when 

In section 5 we showed how to define the maximal correlation matrix
Lemma B.1





If , let be the first M columns of and be the first block of and let and .
If , let and and let be any N columns of and the corresponding block of .









Proof








APPENDIX C
Proof of Theorem 5.2
Proof

























In the case when
REFERENCES
Bélanger, P. R., 1974: Estimation of noise covariance matrices for a linear time-varying stochastic process. Automatica, 10, 267–275, https://doi.org/10.1016/0005-1098(74)90037-5.
Berry, T., and T. Sauer, 2013: Adaptive ensemble Kalman filtering of non-linear systems. Tellus, 65A, 20331, https://doi.org/10.3402/tellusa.v65i0.20331.
Dee, D. P., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. Mon. Wea. Rev., 123, 1128–1145, https://doi.org/10.1175/1520-0493(1995)123<1128:OLEOEC>2.0.CO;2.
Guttman, L., 1946: Enlargement methods for computing the inverse matrix. Ann. Math. Stat., 17, 336–343, https://doi.org/10.1214/aoms/1177730946.
Hamill, T. M., and J. S. Whitaker, 2005: Accounting for the error due to unresolved scales in ensemble data assimilation: A comparison of different approaches. Mon. Wea. Rev., 133, 3132–3147, https://doi.org/10.1175/MWR3020.1.
Hodyss, D., and N. Nichols, 2015: The error of representation: Basic understanding. Tellus, 67A, 24822, https://doi.org/10.3402/tellusa.v67.24822.
Janjić, T., and S. E. Cohn, 2006: Treatment of observation error due to unresolved scales in atmospheric data assimilation. Mon. Wea. Rev., 134, 2900–2915, https://doi.org/10.1175/MWR3229.1.
Janjić, T., and Coauthors, 2018: On the representation error in data assimilation. Quart. J. Roy. Meteor. Soc., https://doi.org/10.1002/qj.3130, in press.
Julier, S. J., and J. K. Uhlmann, 2004: Unscented filtering and nonlinear estimation. Proc. IEEE, 92, 401–422, https://doi.org/10.1109/JPROC.2003.823141.
Kuramoto, Y., and T. Tsuzuki, 1976: Persistent propagation of concentration waves in dissipative media far from thermal equilibrium. Prog. Theor. Phys., 55, 356–369, https://doi.org/10.1143/PTP.55.356.
Liu, Z.-Q., and F. Rabier, 2002: The interaction between model resolution, observation resolution and observation density in data assimilation: A one-dimensional study. Quart. J. Roy. Meteor. Soc., 128, 1367–1386, https://doi.org/10.1256/003590002320373337.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141, https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
Lorenz, E. N., 1996: Predictability—A problem partly solved. Proc. Seminar on Predictability, Vol. 1, Reading, United Kingdom, ECMWF, 18 pp., https://www.ecmwf.int/sites/default/files/elibrary/1995/10829-predictability-problem-partly-solved.pdf.
Mehra, R., 1970: On the identification of variances and adaptive Kalman filtering. IEEE Trans. Autom. Control, 15, 175–184, https://doi.org/10.1109/TAC.1970.1099422.
Mehra, R., 1972: Approaches to adaptive filtering. IEEE Trans. Autom. Control, 17, 693–698, https://doi.org/10.1109/TAC.1972.1100100.
Mitchell, H. L., and R. Daley, 1997: Discretization error and signal/error correlation in atmospheric data assimilation. Tellus, 49A, 32–53, https://doi.org/10.3402/tellusa.v49i1.12210.
Oke, P. R., and P. Sakov, 2008: Representation error of oceanic observations for data assimilation. J. Atmos. Oceanic Technol., 25, 1004–1017, https://doi.org/10.1175/2007JTECHO558.1.
Ran, A., and R. Vreugdenhil, 1988: Existence and comparison theorems for algebraic Riccati equations for continuous- and discrete-time systems. Linear Algebra Appl., 99, 63–83, https://doi.org/10.1016/0024-3795(88)90125-5.
Satterfield, E., D. Hodyss, D. D. Kuhl, and C. H. Bishop, 2017: Investigating the use of ensemble variance to predict observation error of representation. Mon. Wea. Rev., 145, 653–667, https://doi.org/10.1175/MWR-D-16-0299.1.
Simon, D., 2006: Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. Wiley-Interscience, 552 pp.
Sivashinsky, G., 1977: Nonlinear analysis of hydrodynamic instability in laminar flames I. Derivation of basic equations. Acta Astronaut., 4, 1177–1206, https://doi.org/10.1016/0094-5765(77)90096-0.
Van Leeuwen, P. J., 2015: Representation errors and retrievals in linear and nonlinear data assimilation. Quart. J. Roy. Meteor. Soc., 141, 1612–1623, https://doi.org/10.1002/qj.2464.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, https://doi.org/10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.