1. Introduction
The purpose of data assimilation is to provide the most likely numerical representation xa (the analysis) of the atmosphere, or of the ocean, from known observations yo. However, observations are known to be noisy: if xt denotes the true numerical state, then yo = 𝗛(xt) + εo, where εo is the observation error and 𝗛 is the observation operator (linear in what follows) that maps the model space into the observation space. Moreover, the observational network is heterogeneous in space and in time and leaves large areas unobserved, which leads to a closure problem. A practical solution consists of adding a background xb (generally the most recent forecast) and computing the analysis state as a correction δxa of the background, so that xa = xb + δxa.







Several models exist for the correlation matrix. Among them is the diagonal assumption in spectral space (Courtier et al. 1998), which leads to homogeneous correlation functions. Among the heterogeneous correlation models, one can cite the diagonal assumption in wavelet space (Fisher 2003), the recursive filter approach (Purser et al. 2003a,b), and the diffusion equation (Weaver and Courtier 2001).
The formulation based on the diffusion equation is now considered. It appears that along the minimization process of the cost function
The note is organized as follows. Section 2 recalls the correlation model based on the diffusion equation and its link with probabilities. The relationship between stochastic differential equations (SDEs) and elliptic partial differential equations (EPDEs) is described in section 3, which also details the numerical stochastic resolution of SDEs for different boundary conditions. A one-dimensional illustration is then proposed in section 4, with emphasis on the technical implementation. The present work focuses on horizontal covariance modeling, which is justified by the fact that, in 3D models, the horizontal component is commonly represented separately from the vertical component. Details about 3D models can be found in Weaver and Courtier (2001), or in Pannekoucke (2009) for a nonseparable formulation.
2. Covariance modeling with the diffusion operator








A crucial aspect of this model is the choice of the local diffusion tensor ν(x). A solution to the estimation of these tensors has been proposed by Pannekoucke and Massart (2008). It relies on ensemble methods to compute ν(x) from the local length scale (Pannekoucke et al. 2008), under a local homogeneity assumption on the tensor field: the matrix Γ(x) is first estimated from an ensemble of forecasts, then the local diffusion tensor is set as ν(x) = Γ(x)/2. This is the diffusion tensor field used in the heterogeneous diffusion in Eq. (1).
Moreover, an efficient numerical implementation of the diffusion in Eq. (1) is needed to construct 𝗟^{1/2}. The numerical solution can be obtained with a time and space discretization of Eq. (1) (Weaver and Courtier 2001), or by using a spectral approach instead of a space discretization (Pannekoucke and Massart 2008).





Finally, the benefit is that the time integration is simply reduced to local interpolations of the initial condition at appropriate positions. This property is the key idea that motivates this note.
3. A stochastic integration scheme for EPDEs
The aim of this section is to recall how the solution of 1D EPDEs can be obtained as the expectation under a stochastic process (to be determined). A rigorous, but easily grasped, mathematical description can be found in Oksendal (2003), with an extension to n-dimensional cases. The required background on stochastic calculus is first presented; it is then illustrated, in the next section, within a simple one-dimensional test bed.
a. Link between stochastic calculus and EPDEs




















Conversely, as previously mentioned, the solution u(x, t) of the second-order partial differential equation in Eq. (10), with the initial condition u(x, 0), can be obtained as the expectation, at time t, of the random variable u(X_t^x, 0), where X_t is the stochastic process of generator
b. Numerical solution and convergence properties














In practice, the functions u(x, 0), a(x), and b(x) are known by their values on a grid. Off the grid, an interpolation method is required. In this paper, only two kinds of interpolation are considered: the nearest-neighbor and the linear interpolation. This is equivalent to assuming that the functions are piecewise constant in the former case, and continuous and piecewise linear in the latter. Of course, such interpolations introduce discrepancies with the continuous case, and the right-hand-side bound in Eq. (14) must be completed with a third term, which depends on the spatial discretization. In the nearest-neighbor case, this error is in O(δx), while it is in O(δx^2) in the linear case. As a consequence, the numerical heat kernel in the homogeneous case is no longer exactly Gaussian.
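The two interpolation orders quoted above can be checked on a toy problem. The following is a minimal sketch, assuming a periodic grid x_i = i δx and a smooth test field sin(x); the function names and the test field are illustrative choices, not taken from the paper.

```python
import math

def nearest_neighbor(values, x, dx):
    """Piecewise-constant interpolation on a periodic grid x_i = i*dx."""
    n = len(values)
    i = int(math.floor(x / dx + 0.5)) % n
    return values[i]

def linear(values, x, dx):
    """Continuous piecewise-linear interpolation on a periodic grid x_i = i*dx."""
    n = len(values)
    s = x / dx
    i = int(math.floor(s))
    w = s - i  # fractional position inside the cell, in [0, 1)
    return (1.0 - w) * values[i % n] + w * values[(i + 1) % n]

def max_error(interp, n):
    """Largest interpolation error for sin(x) sampled on n grid points."""
    dx = 2.0 * math.pi / n
    values = [math.sin(i * dx) for i in range(n)]
    probes = [2.0 * math.pi * k / 1000.0 for k in range(1000)]
    return max(abs(interp(values, x, dx) - math.sin(x)) for x in probes)

# Halving dx should roughly halve the nearest-neighbor error
# and divide the linear-interpolation error by about four.
for n in (20, 40, 80):
    print(n, max_error(nearest_neighbor, n), max_error(linear, n))
```

The printed errors only illustrate the interpolation term; they say nothing about the time-discretization and sampling terms of the bound.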
As suggested in the previous section, the solution u(x, t) simply results from an empirical average over local particles: in Eq. (1), the value u(x, t) can be estimated from an ensemble of particles whose paths remain within a neighborhood of x. The width of this area depends on the local diffusion magnitude. From standard results on the normal law, and considering a 1D homogeneous diffusion with diffusion coefficient ν = L^2/2, there is a probability greater than 99.99% that, at time t = 1, a particle is found inside the segment of radius 3.9L around its starting point. Similar thresholds can be found for n-dimensional cases thanks to the chi-square law (e.g., in 2D, the equivalent threshold is 4.3L). These bounds can serve high performance computing by distributing the computation of Eq. (11) over the global domain into several independent computations over local areas with an appropriate halo. For this purpose, the width of the halo can be fixed from a length scale L chosen as the largest length scale over the local area (including anisotropic effects). The numerical cost can be reduced by replacing the normal variable ζ with a random variable of the same mean and variance, such as the Bernoulli variable taking the values −1 and 1 with equal probability. The weak convergence is unchanged. Moreover, samples of the Bernoulli variable (a switch on a uniform law) are cheaper to obtain than samples of the normal law [often generated with a Box–Muller algorithm; Kloeden and Platen (1992)].
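The 3.9L and 4.3L thresholds can be recovered from the Gaussian and chi-square laws. The sketch below assumes that, at t = 1, the particle displacement is Gaussian with variance L^2 per direction (which follows from ν = L^2/2); the function names are illustrative.

```python
import math

def prob_inside_1d(radius_in_L):
    """P(|X| <= r L) for X ~ N(0, L^2), i.e., 1D diffusion with nu = L^2/2 at t = 1."""
    return math.erf(radius_in_L / math.sqrt(2.0))

def prob_inside_2d(radius_in_L):
    """P(|X| <= r L) for an isotropic 2D Gaussian with variance L^2 per direction.

    |X|^2 / L^2 then follows a chi-square law with 2 degrees of freedom,
    whose cumulative distribution is 1 - exp(-s / 2).
    """
    return 1.0 - math.exp(-0.5 * radius_in_L ** 2)

print(prob_inside_1d(3.9))  # slightly above 0.9999
print(prob_inside_2d(4.3))  # slightly above 0.9999
```

Both values exceed 0.9999, while a radius of 3.8L in 1D does not, which is consistent with the thresholds quoted in the text.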
c. Boundary conditions
The domain considered here is an open bounded set D ⊂ ℝ^2 with an external boundary ∂D. Periodic, homogeneous Neumann, and homogeneous Dirichlet boundary conditions are detailed, since they are used in data assimilation (Weaver and Courtier 2001).
The homogeneous Neumann condition corresponds to the case where the normal derivative is null on the boundary (or on part of the boundary). The Neumann conditions are also called “wall conditions.” This name is well illustrated in stochastic calculus: homogeneous Neumann conditions match the case where the particles are reflected on the boundary. The reflection 𝗫′(t + dt) of a particle 𝗫(t + dt) escaping the domain is computed as 𝗫′(t + dt) − 𝗢 = 𝗦[𝗫(t + dt) − 𝗢], where 𝗢 is the point of the boundary nearest to 𝗫(t + dt), and 𝗦 = 𝗜 − 2nn^T is the matrix of the local symmetry, with n the local unit vector normal to the boundary at position 𝗢 (Szymczak and Ladd 2003).
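The reflection formula can be written without forming the matrix 𝗦 explicitly. The following sketch assumes that the nearest boundary point 𝗢 and the unit normal n are already known; the half-plane example is hypothetical.

```python
def reflect(x, o, n):
    """Mirror point x across the tangent plane at boundary point o with unit normal n.

    Implements x' - o = (I - 2 n n^T)(x - o) component by component.
    """
    d = [xi - oi for xi, oi in zip(x, o)]
    dn = sum(di * ni for di, ni in zip(d, n))  # signed distance along the normal
    return [oi + di - 2.0 * dn * ni for oi, di, ni in zip(o, d, n)]

# Hypothetical example: the domain is the half-plane x1 < 1,
# so the boundary is the line x1 = 1 with outward unit normal (1, 0).
escaped = [1.2, 0.7]
nearest_boundary_point = [1.0, 0.7]
print(reflect(escaped, nearest_boundary_point, [1.0, 0.0]))
```

The tangential component is unchanged and the normal component changes sign, so the escaped particle is returned inside the domain at the same distance from the wall.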
The Dirichlet condition is related to cases where, on the boundary, the solution must satisfy u(x, t) = Φ∂D(x) for x ∈ ∂D, where Φ∂D is some arbitrary function. In stochastic calculus, this kind of condition is constructed as follows. If a particle crosses the boundary ∂D at a time τ (stopping time), then it is stopped on ∂D so that for all t > τ, 𝗫t = 𝗫τ. In fact, it can be shown (Oksendal 2003) that the solution of Eq. (1), with the Dirichlet condition u(x, t) = Φ∂D(x), is given by
Thus, after each time step in Eq. (12), a test is required to verify whether the particle is still inside the domain or has left it. An appropriate reflection or stopping is then applied, depending on the boundary condition. Of course, if the domain is periodic, the particle path always stays inside the domain and no test is necessary.
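The boundary test after each step can be sketched in 1D as follows. This is an illustration under stated assumptions: the update is taken to be a drift-free Euler–Maruyama step with constant diffusion coefficient ν (the general coefficients of Eq. (12) are not reproduced here), the domain is the interval [0, length], and the time step is small enough that a single reflection brings the particle back inside. All names are hypothetical.

```python
import math
import random

def apply_boundary(x_new, length, kind):
    """Return (position, stopped) after the boundary test on [0, length]."""
    if kind == "periodic":
        return x_new % length, False          # wrap around, no test needed
    if 0.0 <= x_new <= length:
        return x_new, False                   # still inside the domain
    if kind == "neumann":                     # reflection on the wall
        x_ref = -x_new if x_new < 0.0 else 2.0 * length - x_new
        return x_ref, False
    if kind == "dirichlet":                   # particle frozen on the boundary
        return (0.0 if x_new < 0.0 else length), True
    raise ValueError("unknown boundary condition: " + kind)

def integrate_path(x0, nu, t_end, n_steps, length, kind, rng):
    """Endpoint of one particle path for a constant diffusion coefficient nu."""
    dt = t_end / n_steps
    x, stopped = x0, False
    for _ in range(n_steps):
        if stopped:
            break
        x_new = x + math.sqrt(2.0 * nu * dt) * rng.gauss(0.0, 1.0)
        x, stopped = apply_boundary(x_new, length, kind)
    return x

rng = random.Random(0)
for kind in ("periodic", "neumann", "dirichlet"):
    print(kind, integrate_path(0.01, 0.02, 0.5, 100, 1.0, kind, rng))
```

In the Dirichlet case the stopped particle stays on the boundary, where the prescribed boundary value would be read instead of the interpolated initial condition.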
4. Applications in data assimilation
The previous formalism is now applied, for data assimilation purposes, in a simple one-dimensional test bed. The aim of this section is to document the use of the stochastic method for the time integration of the correlation model based on the diffusion operator.


First of all, the stochastic process with the generator of Eq. (15) is determined following section 3. Then, the stochastic time integration is tested under various forms and boundary conditions to construct 𝗕_N^{1/2} and its adjoint 𝗕_N^{T/2}. Of course, these matrices are not explicitly stored in computer memory; they are implemented as operators whose action on an input vector is equivalent to the corresponding matrix–vector product.
a. Stochastic process associated with the 1D diffusion






b. Implementation of 𝗕^{1/2} and 𝗕^{T/2}
In practice, 𝗕^{1/2} is not stored in computer memory but is evaluated on a vector as 𝗕^{1/2}v. This evaluation is achieved through the product in Eq. (3). Matrix–vector products with the diagonal matrices 𝗪^{−1/2}, Λ, and Σ reduce to element-wise products.
In the 1D framework, the diagonal values of 𝗪^{−1} are δx for points inside the domain. For points on the boundary, the diagonal value is δx for periodic conditions and δx/2 for other cases.
Under a local homogeneity assumption, a good approximation for the normalization Λ can be deduced from the local diffusion tensor (Pannekoucke and Massart 2008). In 2D, the normalization applied at position x is Λ(x)^2 = 2π
The principal difficulty rests with the computation of the operator 𝗟^{1/2}. In this paper, the operator is a half-time stochastic integration (from t = 0 to t = ½) of the heterogeneous diffusion equation. It requires the computation, at time t = ½, of the positions of N particles per grid point, each governed by an Itô diffusion. Each path is integrated according to Eq. (12), with appropriate boundary conditions as described in section 3c. The endpoints at half-time are computed once and for all and are stored in computer memory. They are then reused at each computation of 𝗟_N^{1/2}v in the empirical mean evaluation in Eq. (11), where the initial condition u(x, 0) is replaced by interpolations of v at the path endpoints.
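The "compute the endpoints once, then reuse them" strategy can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes a periodic 1D grid, a constant diffusion coefficient ν (so no drift term), a drift-free Euler–Maruyama update in place of Eq. (12), and nearest-neighbor interpolation, so that each endpoint is stored as a grid index. The names `compute_endpoints` and `apply_half_diffusion` are hypothetical.

```python
import math
import random

def compute_endpoints(nu, dx, n_grid, n_particles, n_steps, rng):
    """Integrate n_particles paths per grid point from t = 0 to t = 1/2.

    Returns, for each grid point, the list of grid indices nearest to the
    path endpoints (nearest-neighbor interpolation on a periodic domain).
    """
    length = n_grid * dx
    dt = 0.5 / n_steps
    step = math.sqrt(2.0 * nu * dt)
    endpoints = []
    for i in range(n_grid):
        indices = []
        for _ in range(n_particles):
            x = i * dx
            for _ in range(n_steps):
                x = (x + step * rng.gauss(0.0, 1.0)) % length
            indices.append(int(math.floor(x / dx + 0.5)) % n_grid)
        endpoints.append(indices)
    return endpoints

def apply_half_diffusion(endpoints, v):
    """Empirical mean of v over the stored endpoints of each grid point."""
    return [sum(v[j] for j in indices) / len(indices) for indices in endpoints]

rng = random.Random(1)
ends = compute_endpoints(nu=4.5, dx=1.0, n_grid=30, n_particles=100, n_steps=10, rng=rng)
delta = [0.0] * 30
delta[15] = 1.0
print(apply_half_diffusion(ends, delta))  # a noisy bump centered on index 15
```

Once `ends` is stored, each application to a new vector v costs only N lookups per grid point, with no further random draws.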
The adjoint 𝗕_N^{T/2} can be derived from the direct code of 𝗕_N^{1/2} by using classic adjoint coding rules (Giering and Kaminski 1998). The endpoints involved in the computation of 𝗕_N^{1/2} are reused in the computation of 𝗕_N^{T/2}. In this way, 𝗕_N^{T/2} is the exact transpose of the matrix 𝗕_N^{1/2}.
Note that computing the background-error covariance matrix as the product 𝗕_N = 𝗕_N^{1/2}𝗕_N^{T/2} ensures the symmetry and the positive semidefiniteness of the covariance model.
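The adjoint construction and the resulting symmetry can be illustrated on the averaging step alone. The sketch below assumes that the stored endpoints are available as lists of grid indices (nearest-neighbor interpolation); here they are drawn at random purely for illustration, and the diagonal factors of Eq. (3) are left out. The names `forward` and `adjoint` are hypothetical.

```python
import random

def forward(endpoints, v):
    """(L v)_i: mean of v over the stored endpoints of the particles started at i."""
    return [sum(v[j] for j in indices) / len(indices) for indices in endpoints]

def adjoint(endpoints, w):
    """Transpose of `forward`: each w_i is scattered back to its endpoints."""
    out = [0.0] * len(endpoints)
    for i, indices in enumerate(endpoints):
        share = w[i] / len(indices)
        for j in indices:
            out[j] += share
    return out

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Illustrative endpoints: 50 particles per grid point on a periodic grid of 20 points.
rng = random.Random(2)
n_grid = 20
endpoints = [[int(round(i + rng.gauss(0.0, 2.0))) % n_grid for _ in range(50)]
             for i in range(n_grid)]

v = [rng.gauss(0.0, 1.0) for _ in range(n_grid)]
w = [rng.gauss(0.0, 1.0) for _ in range(n_grid)]

# Adjoint (dot-product) test: <L v, w> equals <v, L^T w> up to rounding.
print(dot(forward(endpoints, v), w), dot(v, adjoint(endpoints, w)))

# The product C = L L^T is symmetric and v^T C v = |L^T v|^2 >= 0.
cv = forward(endpoints, adjoint(endpoints, v))
print(dot(v, cv))
```

Because both routines read the same stored endpoints, the adjoint is the exact transpose of the forward operator for any sampling noise, which is what guarantees symmetry of the product.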
c. Numerical experiments and results
Several points for the use of the stochastic approach need to be documented: the sensitivity to the interpolation method; the convergence versus the number of particles, the ratio Lh/δx, and the time step; the validity of the approximated normalization; and the use of Dirichlet–Neumann boundary conditions. These are now detailed.
1) Sensitivity with interpolation methods
As explained in section 3b, an interpolation is needed in the numerical resolution of the dynamics in Eq. (12). The discrepancies between the background error covariance matrix resulting from the linear interpolation and that resulting from the nearest-neighbor interpolation are illustrated in Fig. 1. In these simulations, the diffusion is heterogeneous with periodic boundary conditions and δt = 1/200. The number of particles (N = 10^5) is large enough to damp the sampling noise. The correlation functions obtained by using the exact normalization are presented in Fig. 1a (only 1 out of 10 is represented). Note that these correlation functions can be considered as analysis increments associated with single-observation assimilation experiments, in which a single observation is located at the position of the maximum of each correlation function. The correlation functions related to the positions 90° and 180° are in bold. It appears that the correlation functions are quasi-Gaussian. The difference between correlations obtained with nearest-neighbor interpolation minus those obtained with linear interpolation is represented in Fig. 1b (again only 1 out of 10 is represented). The differences related to the positions 90° and 180° are also in bold. The linear interpolation leads to greater correlation values than the nearest-neighbor interpolation. This is confirmed by the length scale diagnosis, reported in Fig. 1c, where the length scale of 𝗕_N in the linear interpolation case (dash–dotted line) is larger than in the nearest-neighbor case (dashed line). Note that both interpolations overestimate the true length scale field (solid line).
Results from this sensitivity test show that the correlation functions obtained with a nearest-neighbor interpolation are similar to those obtained with the linear interpolation. Moreover, the nearest-neighbor interpolation is less costly than the linear interpolation. Therefore, the nearest-neighbor interpolation appears to be a good compromise under data assimilation constraints.
2) Convergence versus the number of particles, the length scale magnitude, and the time step
The sampling noise effect, arising from the number of particles N, is investigated for the homogeneous diffusion with periodic boundary conditions. For such a case, the theoretical correlation function is known and equal to ρ(x) = exp[−x^2/(2Lh^2)]. The theoretical correlation matrix 𝗕ho is compared with simulated matrices 𝗕_N for N in (100, 400, 1600, 6400). The matrix 𝗕_N is computed with the nearest-neighbor interpolation and with the integration time step set to δt = 1/200. Moreover, an exact normalization is employed for Λ. The discrepancy between 𝗕ho and 𝗕_N is quantified by the relative error eN1 = 100(‖𝗕_N − 𝗕ho‖/‖𝗕ho‖), where ‖·‖ is the Hilbert norm and ‖𝗕‖ =
The convergence of 𝗕_N toward 𝗕_∞ is theoretically in O(N^{−1/2}), since the convergence of 𝗕_N^{1/2} to 𝗕_∞^{1/2} is in O(N^{−1/2}) [according to Eq. (13)]. Confirmation can be seen in Fig. 2, where the dash–dotted line represents the relative error eN2 = 100(‖𝗕_N − 𝗕_∞‖/‖𝗕_∞‖), where 𝗕_∞ is approximated by 𝗕_{N∞} for a large enough N∞ (here N∞ = 10^5). The slope of eN2 is asymptotically −½ (represented by the dashed line). However, the error increases with Lh/δx for a small ensemble of particles: for N = 100, the error is less than 10% for Lh = 250 km (see Fig. 2a), while it is close to 15% for Lh = 1000 km (see Fig. 2c). The same results are found for the heterogeneous case (not shown here).
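The −½ slope can be reproduced on a scalar toy problem. The sketch below is not the matrix-norm experiment of Fig. 2: it assumes a single Gaussian random variable X with unit variance and the test function cos(x), for which the expectation exp(−1/2) is known exactly, and it measures the root-mean-square error of the empirical mean over repeated trials. All names and parameter values are illustrative.

```python
import math
import random

def mc_estimate(sigma, n_particles, rng):
    """Empirical mean of cos(X) over n_particles draws of X ~ N(0, sigma^2)."""
    total = sum(math.cos(rng.gauss(0.0, sigma)) for _ in range(n_particles))
    return total / n_particles

def rms_error(sigma, n_particles, n_trials, rng):
    """Root-mean-square error of the estimate over independent trials."""
    exact = math.exp(-0.5 * sigma ** 2)  # E[cos(X)] for a centered Gaussian
    squares = [(mc_estimate(sigma, n_particles, rng) - exact) ** 2
               for _ in range(n_trials)]
    return math.sqrt(sum(squares) / n_trials)

rng = random.Random(0)
errors = {n: rms_error(1.0, n, 100, rng) for n in (100, 400, 1600)}

# Log-log slope between N = 100 and N = 1600; sampling theory predicts about -1/2.
slope = math.log(errors[1600] / errors[100]) / math.log(16.0)
print(errors, slope)
```

Each fourfold increase in N should roughly halve the error, which is the behavior described for eN2.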
Some sensitivity experiments (not shown here) on the time step δt = 1/(2n), where n is the number of subintervals discretizing [0, ½], with N = 400, indicate that correlations are overestimated for small n (of the order of 10) but converge rapidly for large n (of the order of 50). Note that in any case, the stochastic integration is stable, even in the extreme case δt = ½, and the resulting correlation functions are quasi-Gaussian.
With the previous results, a practical ensemble size of N = 100 (around 10% error) or N = 400 (around 5% error), that is, O(100), is accurate enough. However, this should be confirmed for real-case applications, since the error depends on the ratio Lh/δx. Small values of δt = 1/(2n) increase the cost of integrating the particle paths, but this is not limiting because these trajectories are computed only once and only the endpoints are retained; thus, a large n can be used. In most of the experiments performed in this note, n is set to 100.
3) Discussion on the normalization
As mentioned in section 4b, the normalization Λ can be approximated from the local diffusion tensor. Under this approximation, the local normalization at the point x is Λ(x)2 =
The approximated normalization is thus still efficient in this stochastic framework. However, some boundary effects can be encountered and are discussed hereafter.
4) Illustration with nonperiodic boundary conditions
In data assimilation, bounded domains are common (e.g., in local area models or in ocean modeling with coast treatment). Homogeneous Dirichlet and Neumann boundary conditions are classically considered in covariance modeling with the diffusion operator (Weaver and Courtier 2001). They are now tested for a homogeneous diffusion operator with N = 400 and δt = 1/200. The boundary conditions are set as follows: homogeneous Dirichlet at x = 0° and homogeneous Neumann at x = 360°. The correlation functions obtained with the approximated normalization Λ are reported in Figs. 4a,b. As mentioned in section 4c(1), each function can be viewed as an analysis increment obtained from a single-observation assimilation. It appears that these functions are not correlation functions, since the variance is not close to one. In Fig. 4a, the functions vanish as x decreases, which is consistent with the Dirichlet condition that forces the solution to be null at x = 0°. Conversely, in Fig. 4b, the functions increase as x increases, which is consistent with the Neumann condition that imposes, through reflections, an accumulation of particles at x = 180°. These understandable behaviors are not balanced by the normalization Λ, set as a constant
5. Conclusions
In this note, a stochastic integration of the diffusion equation has been proposed and tested for use in the covariance matrix model based on the diffusion operator. The idea is to replace the deterministic integration by local interpolations of the initial condition. This method is naturally suited to parallel computation, since the computation only depends on combinations of the initial condition taken over independent areas.
The stochastic integration is achieved thanks to a particle method whose path dynamics are described by a stochastic differential equation deduced from the heterogeneous diffusion equation.
It has been shown, within a one-dimensional test bed, that this approach can be considered in a data assimilation framework for background error covariance modeling. It rests on the expansion of the covariance matrix as the product of a matrix (the square root matrix) and its transpose. The transpose of the square root matrix is obtained through the adjoint of the direct code of the square root matrix. It can be inferred from the experiments that the nearest-neighbor interpolation (at a low cost compared with other interpolation methods) and a relatively small ensemble of particles, O(100), are enough to build quasi-Gaussian correlation functions.
However, there is still some work to achieve for the two-dimensional case, where the number of particles might be much larger than in the simple one-dimensional case. This study is considered as a preliminary change of perspective in the refinement of the diffusion formulation, and it requires further developments and optimizations to be ready to use.
Acknowledgments
Isabelle Mirouze, Sebastien Massart, and Jean-Antoine Maziejewski are warmly thanked for their careful reading of the manuscript. The authors are grateful to Christophe Baehr for fruitful discussions on stochastic calculus and for his numerous pieces of advice on this note.
REFERENCES
Courtier, P., and Coauthors, 1998: The ECMWF implementation of three-dimensional variational assimilation (3D-VAR). I: Formulation. Quart. J. Roy. Meteor. Soc., 124, 1783–1807.
Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 472 pp.
Fisher, M., 2003: Background error covariance modelling. Proc. ECMWF Seminar on Recent Developments in Data Assimilation for Atmosphere and Ocean, Reading, United Kingdom, ECMWF, 45–63.
Giering, R., and T. Kaminski, 1998: Recipes for adjoint code construction. ACM Trans. Math. Softw., 24 (4), 437–474. doi:10.1145/293686.293695.
Kloeden, P. E., and E. Platen, 1992: Numerical Solution of Stochastic Differential Equations. Springer, 636 pp.
Mirouze, I., and A. Weaver, 2010: Representation of correlation functions in variational assimilation using an implicit diffusion operator. Quart. J. Roy. Meteor. Soc., in press.
Oksendal, B., 2003: Stochastic Differential Equations: An Introduction with Applications. Springer, 374 pp.
Pannekoucke, O., 2009: Heterogeneous correlation modeling based on the wavelet diagonal assumption and on the diffusion operator. Mon. Wea. Rev., 137, 2995–3012.
Pannekoucke, O., and S. Massart, 2008: Estimation of the local diffusion tensor and normalization for heterogeneous correlation modelling using a diffusion equation. Quart. J. Roy. Meteor. Soc., 134, 1425–1438.
Pannekoucke, O., L. Berre, and G. Desroziers, 2008: Background error correlation length-scale estimates and their sampling statistics. Quart. J. Roy. Meteor. Soc., 134, 497–508.
Purser, R., W-S. Wu, D. Parrish, and N. Roberts, 2003a: Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances. Mon. Wea. Rev., 131, 1524–1535.
Purser, R., W-S. Wu, D. Parrish, and N. Roberts, 2003b: Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances. Mon. Wea. Rev., 131, 1536–1548.
Szymczak, P., and A. J. C. Ladd, 2003: Boundary conditions for stochastic solutions of the convection-diffusion equation. Phys. Rev. E, 68, 036704. doi:10.1103/PhysRevE.68.036704.
Weaver, A., and P. Courtier, 2001: Correlation modelling on the sphere using a generalized diffusion equation. Quart. J. Roy. Meteor. Soc., 127, 1815–1846.
APPENDIX
Stochastic Process for n-Dimensional Heterogeneous Diffusion





Fig. 1. (a) A few correlation functions computed with nearest-neighbor interpolation with a large N, (b) difference between correlation functions obtained with the linear and the nearest-neighbor interpolation, and (c) length scale fields for the two interpolation methods compared with the theoretical length scale. (See text for details).
Citation: Monthly Weather Review 138, 8; 10.1175/2010MWR3239.1


Fig. 2. Relative errors vs the number of particles per grid point for Lh = (a) 250, (b) 500, and (c) 1000 km. See text for details.

Fig. 3. Test of the normalization approximated from the local diffusion tensor in the heterogeneous case. See text for details.

Fig. 4. Illustration of the correlation functions in (a),(c) the homogeneous Dirichlet case and (b),(d) the homogeneous Neumann case, with (a),(b) an approximated normalization and (c),(d) an exact normalization. See text for details.