
On the Renormalization of the Covariance Operators

Naval Research Laboratory, Stennis Space Center, Mississippi

Abstract

Many background error correlation (BEC) models in data assimilation are formulated in terms of a smoothing operator, which simulates the action of the correlation matrix on a state vector normalized by the respective BE variances. Under such a formulation, the correlation operator has to have a unit diagonal and therefore requires appropriate renormalization by rescaling. The exact computation of the rescaling factors (the diagonal elements of the smoothing operator) is a computationally expensive procedure, which needs an efficient numerical approximation.

In this study approximate renormalization techniques based on the Monte Carlo (MC) and Hadamard matrix (HM) methods and on the analytic approximations derived under the assumption of local homogeneity (LHA) of the smoothing operator are compared using realistic BEC models designed for oceanographic applications. It is shown that although the accuracy of the MC and HM methods can be improved by additional smoothing, their computational cost remains significantly higher than that of the LHA method, which is shown to be effective even in the zeroth-order approximation. The next approximation improves the accuracy 1.5–2 times at a moderate increase of CPU time. A heuristic relationship for the smoothing scale in two and three dimensions is proposed for the first-order LHA approximation.

Corresponding author address: Max Yaremchuk, Naval Research Laboratory, Bldg. 1009, Stennis Space Center, MS 39529. E-mail: max.yaremchuk@nrlssc.navy.mil


1. Introduction

Modeling of the background error correlation (BEC) by smoothing operators in variational data assimilation has recently gained considerable attention, primarily due to computational efficiency of their implementation and versatility in approximating anisotropic and inhomogeneous BECs (e.g., Xu 2005; Pannekoucke and Massart 2008; Mirouze and Weaver 2010). In the framework of this approach, the correlation matrix of the BE field is approximated by a positive function of the diffusion operator:
e1
where ν is the diffusion tensor, whose components may depend on the spatial coordinates x. Attractive features of such models are their low computational cost and direct control of inhomogeneity and anisotropy by ν under the positive-definiteness constraint. Among the most popular functions of the diffusion operator used in practice are the exponential (yielding Gaussian-shaped correlations) and the inverse of a positive polynomial. Numerically, these functions are implemented by integrating the finite-difference diffusion equation on the model grid and have been used in many applications (e.g., Derber and Rosati 1989; Weaver et al. 2003; Di Lorenzo et al. 2007; Liu et al. 2009).

An important numerical issue arising with this type of covariance modeling is the necessity to find a rescaling transformation such that the diagonal elements of the covariance matrix are equal to the background error (BE) variances derived from the history of data assimilation into the numerical model. The direct way of finding the rescaling factors is to compute the inverse square root of the diagonal elements of the smoothing operator. The latter can be found by convolving the operator with δ functions at every point x of the model grid and taking the results of these convolutions at the same points. This procedure, however, is practically unfeasible, as the model grid size (the number of convolutions to be executed) may often exceed M ~ 10^6–10^7.

In this study we compare the numerical efficiency of several methods of estimating the diagonal elements of a symmetric positive-definite matrix defined by its action on an arbitrary vector, and propose an algorithm for estimating the diagonal under the locally homogeneous approximation (i.e., slowly varying smoothing scales). The manuscript is organized as follows. In section 2, we briefly overview the diagonal estimation methods with an emphasis on the possibility of exploiting the analytic formulas available for the considered functional forms under the assumption of spatial homogeneity. In particular, an extension of the method of Purser et al. (2003) for estimating the Gaussian kernel diagonal is proposed for higher dimensions. In section 3, we describe the setup of numerical experiments with an anisotropic inhomogeneous correlation model and document the results. A summary and conclusions complete the paper.

2. Renormalization methods

a. Monte Carlo technique

This method originates from a large family of stochastic algorithms used for estimating elements and traces of extra-large matrices (e.g., Girard 1989; Hutchison 1989; Dong and Liu 1994). Weaver and Courtier (2001) were among the first to use this approach in geophysical applications for estimating the diagonal of the Gaussian-shaped BEC operators.

The underlying idea is to define an ensemble of K random vectors sk on the model grid and perform componentwise averaging of the products of sk with their images under the BEC operator according to the following formula:
e2
where the overbar denotes averaging over the ensemble, and the two operations in (2) stand for the componentwise multiplication and division of the vectors, respectively. Simple considerations show that when all the components of s have identical δ-correlated distributions with zero mean, contributions from the off-diagonal elements tend to cancel out, and the estimate converges to the true diagonal as K → ∞. More precisely, the squared relative approximation error
e3
is inversely proportional to the ensemble size K. In other words, one may expect to achieve 10% accuracy at the expense of approximately 100 multiplications by the BEC operator if the first ensemble member gives a 100% error. This estimate may seem acceptable, since in geophysical applications the BE variances are usually known with limited precision, and approximating the diagonal with a 5%–10% error seems satisfactory.
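As an illustration of this estimator, the following sketch applies the averaging formula with Rademacher (±1) probe vectors to an operator available only through matrix-vector products; the function and argument names are illustrative, not taken from the paper.

```python
import numpy as np

def mc_diagonal_estimate(apply_c, m_dim, n_iter, seed=None):
    """Monte Carlo estimate of the diagonal of an operator C that is
    available only through matrix-vector products apply_c(v).

    A minimal sketch of the averaging formula (2) with Rademacher
    (+1/-1) probe vectors; all names here are illustrative."""
    rng = np.random.default_rng(seed)
    num = np.zeros(m_dim)   # running sum of s * (C s)
    den = np.zeros(m_dim)   # running sum of s * s
    for _ in range(n_iter):
        s = rng.choice([-1.0, 1.0], size=m_dim)
        num += s * apply_c(s)
        den += s * s
    return num / den        # componentwise ratio approximates diag(C)
```

With ±1 probes the denominator is simply n_iter in every component, so the estimate reduces to the plain ensemble average of s ∘ Cs; the uniform white-noise variant used in section 3b only changes how s is drawn.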

b. The Hadamard matrix method

The Monte Carlo (MC) technique was developed further by Bekas et al. (2007), who noticed that the method converges to d in a finite number of iterations M if the ensemble vectors are mutually orthogonal. An easy way to construct such an ensemble is to draw the vectors sk from the columns of the M × M Hadamard matrix (HM), which span the model's state space (see appendix A for more details on the HM theory).

Although it has not been proven that Hadamard matrices exist for every admissible M, very efficient recursive algorithms for generating HM columns do exist for values of M whose factorization involves only prime numbers not exceeding 100. Since the exact convergence (which can be achieved at M iterations) is never needed in practice, it is not even necessary to draw sk from a Hadamard matrix whose dimension exactly coincides with the state space dimension: if generation of sk is impossible because of an inconvenient value of M (e.g., 1004), M can always be replaced by a slightly larger number (e.g., 1008), and sk can be defined as the first 1004 components of the 1008-dimensional Hadamard vectors. In the numerical experiments described in the next section it is shown that such a modification does not affect the method's convergence during the first several hundred iterations.
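For the Sylvester (power-of-two) construction, a single Hadamard column can be generated on the fly and truncated to the state dimension, mirroring the padding trick just described. This is a sketch under the assumption that a power-of-two order is acceptable; the function name is illustrative.

```python
import numpy as np

def sylvester_hadamard_column(k, m_dim):
    """First m_dim components of the k-th column of the Sylvester Hadamard
    matrix of order 2**p, where 2**p is the smallest power of two >= m_dim.

    For the Sylvester construction, entry (i, k) equals (-1)**popcount(i & k),
    so any column can be generated without storing the full matrix."""
    i = np.arange(m_dim, dtype=np.int64)
    bits = np.bitwise_and(i, np.int64(k))
    parity = np.zeros(m_dim, dtype=np.int64)
    while np.any(bits):
        parity ^= bits & 1    # accumulate the parity of the shared bits
        bits >>= 1
    return (1 - 2 * parity).astype(float)
```

Feeding these columns into the same averaging formula as in the MC sketch gives the HM estimate; the truncated columns are only approximately orthogonal, which is the source of the small residual error discussed in section 3c.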

c. Locally homogeneous approximations

An alternative approach to the diagonal estimation employs a priori information on the structure of the BEC operator. Consider a homogeneous (ν = const) case and assume that the coordinate axes are aligned along the eigenvectors of the diffusion tensor, whose (positive) eigenvalues define the smoothing scales λi; here n is the number of dimensions of the physical space (1, 2, or 3). Then the matrix elements of the two types of BEC operators considered here can be written down explicitly (see appendix B) as
e4
e5
where
eq1
is the distance between the correlated points, measured in terms of the smoothing scales λi; d = (2π)^(n/2)Ω^(−1) are the (constant) diagonal elements; Ω is the diffusion volume element; I is the identity operator; K and Γ denote the modified Bessel function of the second kind and the gamma function, respectively; m is a positive integer (the order of the polynomial approximation to the Gaussian operator); and s = m − n/2. The parameter m can also be interpreted as the number of "time steps" used in the discrete-time integration of the corresponding diffusion equation by the implicit scheme (see appendix B).
When ν varies in space, (4)–(5) are no longer valid, and the diagonal elements d depend on x and on the type of the BEC operator. However, if we assume that ν is locally homogeneous (LH) (i.e., varies in space on a typical scale L that is much larger than λi), the diagonal elements d(x) can be expanded in powers of the small parameter ϵ, defined in terms of the mean eigenvalue of ν and the scale L. The zeroth-order LH approximation term (LH0) is apparently
e6
because for infinitely slow variations of ν (L → ∞) the normalization factors must converge to the above expression for the constant diagonal elements d. It is noteworthy that (6) was found to be useful even in the case of strong inhomogeneity ϵ ≥ 1. In particular, Mirouze and Weaver (2010) found that such an approximation provided 10% errors in a simplified 1D case.
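In practice, the LH0 field follows directly from the local smoothing scales: up to the constant prefactor fixed by (6), the zeroth-order factor is inversely proportional to the local diffusion volume element. A minimal sketch (names are illustrative):

```python
import numpy as np

def lh0_normalization(local_scales):
    """Zeroth-order (LH0) normalization factors, up to the constant
    prefactor fixed by Eq. (6).

    local_scales : array of shape (..., n) holding the smoothing scales
    lambda_i(x) along the principal axes of the diffusion tensor.
    The returned field is proportional to 1 / prod_i lambda_i(x), i.e. to
    the inverse of the local diffusion volume element."""
    return 1.0 / np.prod(local_scales, axis=-1)
```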
The accuracy of (6) can formally be increased by considering the next term in the expansion of the diagonal elements. The technique of such asymptotics has been well developed for the diagonal of the Gaussian kernel (4) in Riemannian spaces (e.g., Gusynin and Kushnir 1991; Avramidi 1999). More recently, the approach was considered by Purser (2008a,b) in the atmospheric data assimilation context. Application of this technique to the diffusion operator (1) in flat space yields the following asymptotic expression for the diagonal elements in the local coordinate system, where ν(x) is equal to the identity matrix and the diffusion operator takes the form of the Laplacian operator Δ:
e7
Here h is a small (|h| ~ ϵ) correction to ν in the vicinity of x. Note that the terms in the round brackets have the order O(ϵ^3) because each spatial differentiation adds an extra power of ϵ.
The asymptotic estimate (7) involves second derivatives that tend to amplify errors in practical applications when ϵ may not be small. Therefore, using (7) in its original form could be inaccurate even at a moderately small value of ϵ. To increase the computational efficiency, it is also desirable to formulate the first-order approximation as a linear operator, acting on d0(x). Keeping in mind that |h| ~ ϵ, we can utilize the following relationships:
e8
e9
and represent the second term in the round brackets of (7) as follows:
e10
where h′ is the traceless part of h. On the other hand, if we neglect the divergence of h′, (7) can be rewritten in the following form:
e11
where
e12
Taking (8)–(9) into account and replacing Δ by the full diffusion operator, we finally get the desired ansatz for the first-order approximation (LH1) of the diagonal:
e13
The relationship (13) was derived by Purser et al. (2003) for the one-dimensional case (γ1 = 0.5) and tested by Mirouze and Weaver (2010), who reported a significant (2–4 times) improvement of the accuracy in 1D simulations.
It is likely that an estimate similar to (13) can also be obtained for the polynomial-based BEC operator, possibly with a different coefficient. We assume, however, that this coefficient may not differ too much from γn, given the similarity in the shapes of the correlation functions (4)–(5) (Fig. 1). Furthermore, because of the approximate nature of (13), the best representation of d(x) in realistic applications may be achieved with a value of γn significantly different from the one given by (12). For that reason, in the numerical experiments we adopt a more general form of (13), assuming
e14
and investigate the dependence of the respective approximation errors 〈ε1,2〉 on the free parameter γ.
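Operationally, the LH1 ansatz amounts to smoothing the LH0 field with the BEC operator whose decorrelation radii are reduced by a factor of √γ (the implementation option described in section 3f). A hedged sketch, with apply_bec assumed to be supplied by the user:

```python
# Values of gamma_n quoted in the text: gamma_1 from Purser et al. (2003),
# gamma_2 and gamma_3 as given by Eq. (12).
GAMMA_N = {1: 0.5, 2: 1.0 / 3.0, 3: 5.0 / 18.0}

def lh1_normalization(d0, apply_bec, gamma):
    """First-order (LH1) normalization factors obtained by smoothing the
    LH0 field d0.

    apply_bec(field, scale_factor) is assumed to apply the BEC operator
    with all decorrelation radii multiplied by scale_factor; per section 3f,
    the LH1 smoothing uses radii reduced by the factor sqrt(gamma)."""
    return apply_bec(d0, scale_factor=gamma ** 0.5)
```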
Fig. 1. Correlation functions corresponding to the homogeneous operators (4)–(5) with identical decorrelation scales.

3. Numerical results

To assess the efficiency of the methods outlined in the previous section, two series of numerical experiments with realistically inhomogeneous BEC models are performed. In the first series we test the methods in the 2D case with a state vector dimension of several thousand. In the second series, the LH0 and LH1 techniques are examined in a realistic 3D setting with a state space dimension of ~10^6.

a. Experimental setting in 2D

The state space is described by scalar functions defined on the orthogonal curvilinear grid of the Navy Coastal Ocean Model (NCOM; Martin et al. 2009) set up in Monterey Bay (Fig. 2). The number M of grid points (the dimension of the state space) was 3438. A vector field v(x) was used to generate the diffusion tensor as follows. The smaller principal axis of the diffusion tensor is set to be orthogonal to v, with the corresponding "background" length scale λ2 = 3δ, where δ(x) is the spatially varying grid step. The length of the larger axis λ1 is set to a value controlled by the ratio |v|/υ, where υ is a prescribed threshold value of |v|. If v is a velocity field, then a structure like this simulates enhanced diffusive transport of model errors in the regions of strong currents on the background of isotropic error diffusion with the decorrelation scale λ2.
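The sketch below illustrates this construction; because the exact dependence of λ1 on |v|/υ is not reproduced in the text, the linear growth capped at the background value is an illustrative assumption, as is the convention that the eigenvalues of ν equal the squared scales.

```python
import numpy as np

def diffusion_tensor_2d(v, delta, lam_bg=3.0, v_threshold=1.0):
    """Anisotropic diffusion tensor built from a 2D vector field v.

    The short principal axis (scale lam_bg * delta) is orthogonal to v and
    the long axis grows with |v| / v_threshold.  Both the growth law and
    the eigenvalue convention (nu eigenvalues = lambda_i**2) are
    illustrative assumptions, not the paper's exact formulas."""
    speed = np.linalg.norm(v, axis=-1, keepdims=True)
    e1 = v / np.maximum(speed, 1e-12)                  # unit vector along v
    e2 = np.stack([-e1[..., 1], e1[..., 0]], axis=-1)  # orthogonal unit vector
    lam1 = lam_bg * delta * np.maximum(1.0, speed[..., 0] / v_threshold)
    lam2 = lam_bg * delta * np.ones_like(lam1)
    # nu = lam1^2 e1 e1^T + lam2^2 e2 e2^T (outer-product form, shape (..., 2, 2))
    nu = (lam1[..., None, None] ** 2) * e1[..., :, None] * e1[..., None, :] \
       + (lam2[..., None, None] ** 2) * e2[..., :, None] * e2[..., None, :]
    return nu
```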

Fig. 2. Five columns of the BEC operators used in the study: (a) Gaussian-shaped correlations and (b) their approximation by the inverse of the second-order polynomial in the diffusion operator. White circles denote locations of the diagonal elements of the corresponding correlation matrices. (c) The map of nonnormalized diagonal elements. Depth contours are in m.

In the 2D experiments, the vector field v is generated by treating bottom topography h(x) (Fig. 2) as a streamfunction. The threshold value υ was taken to be one-fifth of the rms variation of |h| over the domain.

All the experiments described in sections 3b–3e are performed using the BEC models (4)–(5) with the parameters n = m = 2. Composite maps of five columns of the corresponding BEC operators are shown in Figs. 2a,b. The diffusion operator (1) is constrained to have a zero normal derivative at the open and rigid boundaries of the domain in both the 2D and 3D experiments.

Numerically, the action of the Gaussian-shaped BEC operator on a state vector y0 was evaluated by explicitly integrating the corresponding diffusion equation for the virtual "time period" defined by ν, starting from the "initial condition" y0. The minimum number of "time steps" required for the scheme's stability in such a setting was 5256. The action of the polynomial-based operator was computed by solving the corresponding system of equations with a conjugate gradient method. The number of iterations required for obtaining a solution varied within 2000–2500. To make the shapes of the two correlation functions compatible (Fig. 1), the diffusion tensor in the polynomial-based model was multiplied by 8/π (Yaremchuk and Smith 2011).
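The two operator applications can be sketched as follows, assuming a sparse discretization D of the diffusion operator; the explicit loop approximates the exponential (Gaussian) model over a unit pseudo-time interval, while the implicit variant reproduces the m-step scheme of appendix B. The time scaling and names are assumptions for illustration.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def apply_gaussian_bec(D, y0, m):
    """exp(D) y0 approximated by m explicit pseudo-time steps of the
    diffusion equation (m must satisfy the explicit stability limit)."""
    y = y0.copy()
    dt = 1.0 / m
    for _ in range(m):
        y = y + dt * (D @ y)
    return y

def apply_polynomial_bec(D, y0, m=2):
    """(I - D/m)**(-m) y0, the m-step implicit integration of appendix B,
    with each step solved by conjugate gradients (D is symmetric and
    negative semidefinite, so I - D/m is positive definite)."""
    n = y0.size
    A = LinearOperator((n, n), matvec=lambda v: v - (D @ v) / m)
    y = y0.copy()
    for _ in range(m):
        y, _info = cg(A, y)
    return y
```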

The exact values d(x) of the diagonal elements are shown in Fig. 2 (right panel). Their magnitude appears to be lower in the regions of “strong currents” (large v), as the corresponding δ functions are dispersed over larger areas by diffusion. The d(x) are higher near the boundaries because part of the domain available for dispersion is screened by the condition prescribing zero flux across either the open or rigid boundary.

b. Monte Carlo technique

The MC method is implemented in two ways: in the first series of experiments, the components of sk are taken to be either 1 or −1 with equal probability; in the second series they are drawn from uniform white noise on the interval [−1, 1]. The residual error ε is computed using (3). In both series the rates of reduction of ε with the iteration number k were similar and closely followed the k^(−1/2) law (thin solid line in Fig. 3c).

Fig. 3. (a) Error distribution after 60 iterations of the MC method, (b) its optimally smoothed version, and (c) the respective dependences of the domain-averaged error 〈ε〉 and of the scale reduction factor κ^(−1) on the iteration number k. The dashed line is the approximation of κ^(−1) by the k^(−1/3) law.

Figure 3a shows the distribution of ε(x) after 60 iterations of the MC method with the BEC model. The estimate is apparently affected by sampling noise, which can be identified by fine structures at scales close to the grid spacing.

To improve the accuracy, the MC estimates are low-pass filtered with the corresponding BEC operators at every iteration. To optimize the filter, the diffusion operator in the filter is multiplied by a tunable parameter γ, which effectively reduces the mean decorrelation (smoothing) scale κ = γ^(−1/2) times. Figure 3b demonstrates the result of such optimal smoothing (κopt = 2.5), which enabled an almost fourfold reduction of the domain-averaged error 〈ε〉 to 0.15. The right panel in Fig. 3 shows the evolution of 〈ε〉 with iterations for this operator. The optimal smoothing scale reduction factor κ^(−1) appears to follow the law κ^(−1)(k) ~ k^(−1/3) (dashed line in Fig. 3c). It is remarkable that significant error reduction occurs even when the smoothing scale falls below the grid spacing (κ^(−1) < ⅓).

c. The Hadamard matrix method

Experiments with the Hadamard scheme are also done in two series. Since the value of M = 3438 is not divisible by 4, an HM of that dimension cannot be constructed. Instead, in the first series of experiments, sk were specified as the first 3438 components of the columns of the 3456-dimensional HM, which can be easily constructed from the 12-dimensional HMs.

To check the impact of the "nonexact" dimension on the convergence, we artificially increased M to 3456 by including 18 land points of the domain in the state vector (i.e., removing them from the land mask). Experiments with the HM of exact dimension show that differences in convergence between the nonexact and exact experiments start to be visible only after several hundred iterations. After M iterations the error of the exact-HM method is reduced to the machine precision, while the error of the first series of experiments stalled at approximately 10^(−3) after 1500 iterations. This is consistent with the 18/3438 ~ 0.5% degree of nonorthogonality of the sk drawn from the nonexact HM.

Similar to the MC method, the accuracy of the HM estimates is improved significantly after smoothing. In addition, it is found that the effect of smoothing can be enhanced if the mapping of the 2D model field onto the 1D state vector is randomized: Fig. 4a bears an apparent trace of the columnwise numbering of the model field, which remains visible even after applying the algorithm generating the HM columns. As a consequence, the error distribution in Fig. 4a contains large scales in the cross-shore direction, which tend to make the smoothing algorithm less effective (Fig. 4c, dashed curves). This drawback can be easily corrected by randomization of the above-mentioned mapping (the randomized HM, or RHM, method), which provides an error pattern similar to Fig. 3a but with a somewhat lower value of 〈ε〉, clearly visible in Fig. 4c, where the dashed lines show the evolution of 〈ε〉 for the straight HM method before (upper line) and after (lower line) smoothing, while the solid black lines show the same quantities for the RHM method.
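The randomization itself is a one-line change: a fixed random permutation of the grid-to-vector mapping is drawn once and applied to every Hadamard column before it is used as a probe. A sketch (the permutation handling is an illustrative assumption):

```python
import numpy as np

def rhm_probes(hadamard_columns, m_dim, seed=0):
    """Apply one fixed random permutation of the grid-to-vector mapping to
    every Hadamard column before it is used as a probe (RHM modification)."""
    perm = np.random.default_rng(seed).permutation(m_dim)
    for col in hadamard_columns:
        yield col[perm]
```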

Fig. 4. (a) Error distribution after 60 iterations of the HM method. (b) As in (a), but for the smoothed RHM method. (c) Reduction of the domain-averaged error 〈ε〉 with iterations for the MC (gray), RHM (solid), and HM (dashed) methods. The bottom graphs are obtained after optimal smoothing of the diagonal estimates. Thin horizontal lines show the error levels provided by the LH0 (〈ε〉 = 0.17) and LH1 (〈ε〉 = 0.10) methods.

Comparison with the MC method (gray curves in Fig. 4c) demonstrates a noticeable advantage of the HM technique (upper curves), which remains visible at higher iteration counts k > 100 even after smoothing (lower curves). This advantage increases with iterations for two reasons: the HM method converges faster than k^(−1/2) by its nature, whereas the efficiency of smoothing (targeted at removing the small-scale error constituents) degrades as the signal-to-noise ratio of the diagonal estimates increases with k.

From the practical point of view, it is not reasonable to do more than several hundred iterations, as 〈ε〉 drops to a few percent (Fig. 3c), which is much smaller than the accuracy in the determination of the background error variances. We may therefore conclude that it is advantageous to use the RHM technique when k ≃ 100. In this case (assuming that k ≪ M), utilization of an HM with the exact dimension is not necessary, as it does not affect the convergence.

d. Asymptotic expansion method

Since the principal axes of the diffusion tensor at every point are defined by construction, computation of the zeroth-order approximation (6) to the normalization factors is not expensive. Near the boundaries, however, the factors described by (6) have to be adjusted by taking into account the geometric constraints imposed on the diffusion. In this study, this adjustment was computed for points located closer than 3λ1 from the boundary. It is assumed that the boundary had negligible impact on the shape of the diffused δ function (Fig. 5), so the normalization factor near the boundary was computed by dividing the reference factor (obtained by convolving the BEC operator with the δ function in the “open sea”) by the respective integral over the land-free subdomain S′ shown in Fig. 5:
e15
Fig. 5. Adjustment of the normalization factors near the boundary. (a) Map of the xth column of the BEC operator obtained numerically by integrating (15) over the real domain S′ (the normalization factor is the map value at point x). (b) Approximation to (a) obtained by integrating in the infinite domain and renormalizing the result to have the same integral over S′ as in (a). (c) The difference between (a) and its approximation (b).

Integrals in the rhs of (15) have to be taken numerically for all the near-boundary points x. To speed up the computations, we adopted the LH assumption near the boundaries and replaced the convolutions in (15) by the respective analytical functions (4)–(5) with the fixed value ν(x). It is necessary to note that the assumption underlying (15) is not exact for the zero normal gradient condition in use (e.g., see Mirouze and Weaver 2010). However, the errors caused by neglecting the distortions introduced by the zero-flux conditions are significantly smaller (3%–7%, see Fig. 5) than the accuracy of the LH assumption itself. These errors could be removed, for example, by introducing "transparent" conditions (Mirouze and Weaver 2010).
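A sketch of this adjustment for a single near-boundary point in 2D follows: the analytic locally homogeneous Gaussian kernel is integrated over the ocean (land-free) cells, and the open-sea factor is divided by the retained fraction. The kernel form and all names are illustrative assumptions.

```python
import numpy as np

def boundary_adjusted_factor(d0_open_sea, x0, ocean_pts, scales, axes, cell_area):
    """Near-boundary adjustment of an LH0 factor in the spirit of Eq. (15).

    ocean_pts : (N, 2) physical coordinates of ocean grid cells around x0
    scales    : (lam1, lam2), local smoothing scales at x0
    axes      : 2 x 2 matrix whose rows are the unit principal axes of nu at x0
    cell_area : area of a grid cell (assumed uniform near x0)"""
    dx = ocean_pts - x0                        # offsets from the target point
    r = (dx @ axes.T) / np.asarray(scales)     # offsets in units of the scales
    kernel = np.exp(-0.5 * np.sum(r ** 2, axis=1))
    kernel /= 2.0 * np.pi * np.prod(scales)    # unit integral over an unbounded plane
    retained = np.sum(kernel) * cell_area      # fraction of the kernel mass inside S'
    return d0_open_sea / retained
```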

Figure 6 demonstrates the horizontal distribution of the error ε(x) obtained by approximating the diagonal elements of the Gaussian BEC operator with (6) (i.e., the zeroth-order LH method, or LH0) and with (13) (i.e., the first-order LH method, or LH1). Despite an apparent violation of the LH assumption in many regions (e.g., λ1 changes from 20δ to the background value of 3δ at distances L ~ 5–6δ < λ1 across the shelf break), the mean approximation error of the diagonal elements appears to be relatively small (19%) for the LH0 method, with most of the maxima confined to the regions of strong inhomogeneity (Fig. 6a). The next approximation (Fig. 6b) reduces 〈ε〉 to 9%. Numerical experiments with the polynomial-based model have shown similar results (16% and 10% errors).

Fig. 6. Diagonal approximation errors under the (a) zeroth-order and (b) first-order LH methods for the Gaussian BEC model.

Another series of experiments is performed with a varying scaling parameter γ to find an optimal fit to d. Computations were made for 0 ≤ γ ≤ 1. The best result for the Gaussian BEC was obtained for γ2 = 0.30, which is fairly consistent with the value (γ2 = 0.33) given by (12). In the case of the polynomial-based operator, the optimal value is γ2 = 0.24, still in reasonable agreement with (12), given the strong inhomogeneity of ν and the deviation of the BEC operator from the Gaussian form. The somewhat smaller optimal value of γ can be explained by the sharper shape of the respective correlation function at the origin (Fig. 1), which renders d0 less dependent on the inhomogeneities in the distribution of ν and, therefore, requires less smoothing in the next approximation.

e. Numerical efficiency

Table 1 provides an overview of the performance of the tested methods. For comparison purposes, we show the CPU requirements of the smoothed MC and RHM methods once they achieve the accuracies of the LH0 and LH1 methods. It is seen that both the MC and RHM methods are 300–1000 times more computationally expensive than the LH technique. In fact, for the 2D case considered, the computational cost of the stochastic methods is similar to that of the three-dimensional variational data assimilation (3DVAR) analysis itself.

Table 1. Relative CPU times required by the MC and RHM methods to achieve the accuracies 〈ε〉 of the LH0 and LH1 methods (shown in parentheses).

Comparison of the spatial distributions of the approximation error ε(x) favors the LH methods as well: they show significantly fewer small-scale variations and may have potential for further improvement.

Comparing Figs. 3b, 4b, and 6b shows that, in contrast to the MC and RHM methods, LH0 errors tend to increase in the regions of strong inhomogeneity, but they decrease substantially after smoothing by the LH1 algorithm. At the same time, the LH1 errors tend to have relatively higher values near the boundaries; the effect is less visible in the MC and RHM patterns (Figs. 3b and 4b). This feature can be partly attributed to certain inaccuracy in the algorithm for analytic estimation of the near-boundary elements (Fig. 5c).

Table 1 also shows that LH methods outperform both MC and HM techniques. Although considerable resources are required to compute near-boundary integrals for the adjustment factors (15), the overall CPU savings are quite remarkable.

f. LH experiments in the 3D setting

To further check the performance of the LH0 and LH1 methods, a larger 3D domain is taken from the NCOM setup in the Okinawa region south of Japan (Fig. 7), with a horizontal resolution of 10 km and 45 vertical levels. The state vector dimension M (the total number of grid points) in this setting was 862 992.

Fig. 7. Diagonal elements of the BEC operator in the Okinawa region at z = 20 m. The actual values are multiplied by 10^4.

Because of the large M, it is computationally unfeasible to directly compute all the diagonal elements of the BEC matrix. Therefore, accuracy checks are performed on a subset of 10 000 points randomly distributed over the domain, and the value of 〈ε〉 is estimated by averaging over these points.
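Such a check can be coded as follows: each exact diagonal element costs one application of the operator to a δ function, so only a random subset of points is evaluated. The error measure below (mean relative error) is an illustrative stand-in for the paper's definition (3).

```python
import numpy as np

def sampled_diagonal_error(apply_c, d_approx, m_dim, n_samples=10_000, seed=0):
    """Mean relative error of an approximate diagonal d_approx, estimated
    on a random subset of grid points.  Each exact element is obtained by
    applying the operator to a delta function at that point."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(m_dim, size=n_samples, replace=False)
    errors = []
    for i in idx:
        e = np.zeros(m_dim)
        e[i] = 1.0                    # delta function at grid point i
        d_exact = apply_c(e)[i]       # exact diagonal element C_ii
        errors.append(abs(d_approx[i] - d_exact) / d_exact)
    return np.mean(errors)
```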

The diffusion tensor is constructed in the way described in section 3a, but the generating field v(x) is taken to be the horizontal velocity field from an NCOM run. The value of λ3 (in the vertical direction) is independent of the horizontal coordinates but varies in the vertical as 3δz, where δz is the vertical grid step. Figure 7 illustrates the spatial variability of the diagonal elements at z = 20 m. The smallest values are observed in the regions of the Kuroshio and the North Equatorial Current, where the largest velocities are observed and the smoothing scale λ1 reaches its largest values [cf. (6)]. To better test the algorithm, a relatively small threshold value υ = 0.02 m s^(−1) is prescribed, so that diffusion is anisotropic in more than 90% of the grid points.

Figure 8 demonstrates the accuracy of the LH0 and LH1 methods in this setting: the LH0 method provides an accuracy of 9%, which is further improved to 6% by the LH1 scheme. The major improvement occurs in the regions where points with highly anisotropic ν neighbor isotropic points and reduce the diagonal elements in the latter. The effect is reflected by the negative bias of the scatterplot at high values of d0, which reach their maximum of 0.0237 at the points with isotropic ν (Fig. 8a).

Fig. 8. Scatterplots of the true diagonal elements of the BEC operator (vertical axis) vs their approximations by the (a) LH0 and (b) LH1 algorithms. The actual values are multiplied by 10^3. Near-boundary points are excluded. (c) Diagonal approximation errors as a function of γ for the two (black and gray) BEC models. The dashed line shows the value of γ3 given by (12).

Figure 8c shows the dependence of the approximation error ε on the value of γ3 for both correlation models. The best approximation is obtained at γ3 = 0.26, a value somewhat lower than that suggested by the heuristic formula (γ3 = 5/18 = 0.28). Similarly to the 2D case, the optimal value of γ for the polynomial-based model is less than that for the Gaussian model, in agreement with the more rapid off-diagonal decay of its matrix elements.

In general, it appears that the relationship (12) provides reasonable guidance for the estimation of the smoothing parameter in the LH1 method. For the Gaussian model, the operator acting on d0 can be implemented either by reducing the number of time steps in the integration of the diffusion equation γ^(−1) times, or by a γ^(−1/2)-fold reduction of the decorrelation radius. For the polynomial-based model only the second option is applicable; it would also reduce the number of iterations required for computing the action of the BEC operator.

4. Summary and discussion

In this study we examined the computational efficiency of several techniques used for estimating the diagonal elements of two types of BEC operators: one with the Gaussian-shaped kernel and one with the kernel generated by the second-order polynomial approximation to the Gaussian operator. The considered techniques include the "stochastic" MC and HM methods, which retrieve the diagonal only from the operator's action on a vector, and the "deterministic" scheme based on the analytic diagonal expansion under the assumption of local homogeneity of the diffusion tensor. The deterministic scheme was tested in two regimes: the zeroth-order (LH0) and the first-order (LH1) approximations.

Numerical experiments conducted with realistic diffusion tensor models have shown that (i) the HM technique is superior in efficiency to the MC technique when accuracies better than 10% (k > 100) are required; (ii) both stochastic methods require 300–1000 times more CPU time to achieve an accuracy comparable with that of the most efficient LH1 method; and (iii) with the Gaussian model, the LH1 method demonstrates the best performance, with a value of the smoothing parameter γ consistent with the one given by the relationship (12) derived from the asymptotic approximation of the Gaussian kernel diagonal.

In deriving the ansatz (13) for the LH1 model we followed the approach of Purser et al. (2003), who proposed to smooth the zeroth-order diagonal by the square root of the BEC operator in a one-dimensional case. Using the asymptotic technique for the heat kernel expansion, we obtained a formula for higher dimensions, and tested its validity by numerical experimentation.

It should be noted that the formal asymptotic expansion (7) is local by nature and tends to diverge in practical applications, where spatial variations of the diffusion tensor may occur at distances L comparable with the typical decorrelation scale. To effectively immunize the expansion from the ill effects of abrupt changes in ν, we utilized a nonlocal empirical modification that is still fully consistent with the original expansion in the limit of slowly varying ν, but is sufficiently robust with respect to the errors related to the high-order derivatives of ν. A similar technique was developed by Purser (2008a,b), who used empirical saturation functions to stabilize the higher-order approximations of the heat kernel diagonal.

In general, the results of our experiments show the high computational efficiency of the LH1 scheme, whose total CPU requirement is just a fraction of the CPU time required by the convolution with the BEC operator, a negligible amount compared to the cost of the 3DVAR analysis.

A separate question requiring further investigation is the accurate treatment of the boundary conditions. In the present study we assumed that the boundaries affect only the magnitude of the corresponding columns of the BEC operator, but not their structure. This approximation is only partly consistent with the zero normal flux conditions in use, but it can be avoided if one uses "transparent" boundary conditions (e.g., Mirouze and Weaver 2010), which do not require computation of the adjustment factors (15).

On the other hand, it might be beneficial to keep physical boundary conditions in the formulation of the BEC operator, as they are likely to bring more realism to the dynamics of the BE field. In the considered diffusion tensor model, anisotropic BE propagation is superimposed on the small-scale isotropic BE diffusion, which takes place at scales that are not well resolved by the grid (less than 3δ). This may give some grounds to employ an isotropic homogeneous model at distances d < 3δ from the boundary, because the deterministic part of the BE transport associated with the boundary effects is not well resolved at these scales anyway. Such a simplification greatly reduces the computational cost of the LH algorithm with zero-flux conditions because the region screened by the boundary is readily available from the land mask and needs no computation/rescaling at every point. In contrast to the 2D problem considered, such computations may be more costly in 3D problems, as the number of adjustment factors n ~ N^(2/3) may be quite large even with a moderate grid size N ~ 10^6–10^7. However, our preliminary experiments with inhomogeneous BEC models with larger N have shown that estimation of the diagonal with the LH1 scheme still remains two orders of magnitude less expensive than the 3DVAR analysis.

Results of this study indicate that LH1 approximations to the BEC diagonal may serve as an efficient tool for renormalization of the correlation operators in variational data assimilation, as they are capable of providing 3%–10% accuracy in realistically inhomogeneous BEC models.

Acknowledgments

This study was supported by the Office of Naval Research (Program Element 0602435N).

APPENDIX A

Hadamard Matrices

By definition, a Hadamard matrix (HM) is a square matrix whose entries are either 1 or −1 and whose columns are mutually orthogonal. The simplest way to construct HMs is the recursive Sylvester algorithm, which is based on the obvious property: if H is an M × M Hadamard matrix, then
$$\begin{pmatrix} \mathbf{H} & \mathbf{H} \\ \mathbf{H} & -\mathbf{H} \end{pmatrix}$$
is also HM. Starting from , the HMs with order M = 2n, n = 1, 2… can be easily constructed. The HMs with M = 12, 20 were constructed “manually” more than a century ago. A more general HM construction algorithm, which employs the Galois fields theory was found in 1933. However, it is still unknown if HMs exist for all M = 4n where n is a positive integer.

Hadamard matrices are widely used in many branches of applied mathematics and statistics (http://en.wikipedia.org/wiki/Hadamard_matrix). In the present study we used MATLAB code that handles only the cases when M/12 or M/20 is a power of 2. Despite this restriction, the available values of M were sufficient for our purposes.

APPENDIX B

Correlation Modeling with Diffusion Operator

Consider a family of linear operators in the space of sufficiently smooth square-integrable functions f(x), x ∈ ℝ^n:
eb1
Assume that ν = const in the definition of the diffusion operator (1) and perform the coordinate transformation to the principal axes of ν scaled by the respective smoothing scales. In the new coordinate system the diffusion tensor ν is the unit matrix, the diffusion operator takes the form of the Laplacian operator Δ, and the δ function in the rhs of (B1) is rescaled as a result of the contraction of the volume element. Equation (B1) takes the following form:
eb2
After transforming to wavenumber space with the Fourier transform, both Δ and the operator family are diagonalized. The diagonal elements are given by
eb3
It is easy to note that, by virtue of the textbook formula lim_{m→∞}(1 − x/m)^(−m) = exp(x), the Fourier representation (B3) converges at large m to
eb4
the well-known expression for the Green function of the diffusion equation. Since the diagonals given by (B3)–(B4) are strictly positive, the respective operators are positive definite and can be interpreted as correlation operators. This property has been widely used for background error covariance modeling in practical data assimilation.
Matrix elements of the two operators can be found via the inverse Fourier transforms of (B3)–(B4):
eb5
eb6
where s = m − n/2 and the argument is the square of the distance between the correlated points in the transformed coordinate system. Making the inverse coordinate substitution in the rhs of (B5)–(B6), we obtain the matrix elements (4)–(5) in the original coordinate system.
From the viewpoint of numerical applications it is instructive to connect the discretized operator equation (B1) with various time integration techniques for the discretized diffusion equation. The numerical approximation of the kernel in (B1) is never calculated in practice because of the immense cost of such a computation. Instead, the result of the action of the inverse-polynomial operator on a (discrete) model state vector is calculated by solving the system of equations
eb7
where the discretized diffusion operator replaces its continuous counterpart. If we now assume that the rhs vector represents the "initial state" and prescribe the "time step" δt such that the "virtual integration time" is mδt = 1, the action of the correlation operator (B1) can be identified as the result of discrete-time integration of the diffusion equation with the implicit scheme:
eb8
starting from that initial state.
Similarly, the action of the Gaussian-shaped operator is never computed by convolving a state vector with the discretized kernel (B5), but rather by the discrete-time integration of the diffusion equation with the explicit numerical scheme:
eb9
In contrast to the implicit scheme (B8), δt in this numerical method is limited from above by the stability condition requiring that the eigenvalues of the discrete evolution operator be less than 1 in absolute value. As a consequence, the minimum number of time steps m = 1/δt may be quite large, making the numerical implementation of the Gaussian-shaped operator by this scheme computationally impractical.

REFERENCES

  • Avramidi, I. G., 1999: Covariant techniques for computation of the heat kernel. Rev. Math. Phys., 11, 947–980.

  • Bekas, C. F., E. Kokiopoulou, and Y. Saad, 2007: An estimator for the diagonal of a matrix. Appl. Numer. Math., 57, 1214–1229.

  • Derber, J., and A. Rosati, 1989: A global ocean data assimilation system. J. Phys. Oceanogr., 19, 1333–1347.

  • Di Lorenzo, E., A. M. Moore, H. G. Arango, B. D. Cornuelle, A. J. Miller, B. S. Powell, B. S. Chua, and A. F. Bennett, 2007: Weak and strong constraint data assimilation in the Inverse Ocean Modelling System (ROMS): Development and application for a baroclinic coastal upwelling system. Ocean Modell., 16, 160–187.

  • Dong, S.-J., and K.-F. Liu, 1994: Stochastic estimation with Z2 noise. Phys. Lett., 328B, 130–136.

  • Girard, D. F., 1989: A fast Monte-Carlo cross-validation procedure for large least squares problems with noisy data. Numer. Math., 56, 1–23.

  • Gusynin, V. P., and V. A. Kushnir, 1991: On-diagonal heat kernel expansion in covariant derivatives in curved space. Class. Quantum Gravity, 8, 279–285.

  • Hutchison, M. F., 1989: A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. J. Commun. Stat. Simul., 18, 1059–1076.

  • Liu, Y., J. Zhu, J. She, S. Zhuang, W. Fu, and J. Gao, 2009: Assimilating temperature and salinity profile observations using an anisotropic recursive filter in a coastal ocean model. Ocean Modell., 30, 75–87.

  • Martin, P. J., C. N. Barron, L. F. Smedstad, T. J. Campbell, A. J. Wallcraft, R. C. Rhodes, C. Rowley, and T. L. Townsend, 2009: User's manual for the Navy Coastal Ocean Model Version 4.0. NRL/MR/7320-09-9151, Naval Research Laboratory, Stennis Space Center, MS, 75 pp.

  • Mirouze, I., and A. T. Weaver, 2010: Representation of correlation functions in variational data assimilation using an implicit diffusion operator. Quart. J. Roy. Meteor. Soc., 136, 1421–1443.

  • Pannekoucke, O., and S. Massart, 2008: Estimation of the local diffusion tensor and normalization for heterogeneous correlation modelling using a diffusion equation. Quart. J. Roy. Meteor. Soc., 134, 1425–1438.

  • Purser, R. J., 2008a: Normalization of the diffusive filters that represent the inhomogeneous covariance operators of variational assimilation, using asymptotic expansions and the techniques of non-Euclidean geometry. Part I: Analytic solutions for symmetrical configurations and the validation of practical algorithms. NOAA/NCEP Office Note 456, 48 pp.

  • Purser, R. J., 2008b: Normalization of the diffusive filters that represent the inhomogeneous covariance operators of variational assimilation, using asymptotic expansions and the techniques of non-Euclidean geometry. Part II: Riemannian geometry and the generic parametrix expansion method. NOAA/NCEP Office Note 457, 55 pp.

  • Purser, R. J., W. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances. Mon. Wea. Rev., 131, 1536–1548.

  • Weaver, A., and P. Courtier, 2001: Correlation modeling on a sphere using a generalized diffusion equation. Quart. J. Roy. Meteor. Soc., 127, 1815–1846.

  • Weaver, A., J. Vialard, and D. L. T. Anderson, 2003: Three- and four-dimensional variational assimilation with a general circulation model of the Tropical Pacific Ocean. Part I: Formulation, internal diagnostics, and consistency checks. Mon. Wea. Rev., 131, 1360–1378.

  • Xu, Q., 2005: Representations of inverse covariances by differential operators. Adv. Atmos. Sci., 22 (2), 181–198.

  • Yaremchuk, M., and S. Smith, 2011: On the correlation functions associated with polynomials of the diffusion operator. Quart. J. Roy. Meteor. Soc., 137, 1927–1932.