1. Introduction



An important numerical issue arising with this type of covariance modeling is the necessity to find a rescaling (renormalization) transformation that brings the diagonal elements of the modeled covariance operator to the prescribed values. In this study we compare the numerical efficiency of several methods of estimating the diagonal elements of a symmetric positive-definite matrix.
2. Renormalization methods
a. Monte Carlo technique
This method originates from a large family of stochastic algorithms used for estimating elements and traces of very large matrices (e.g., Girard 1989; Hutchinson 1989; Dong and Liu 1994). Weaver and Courtier (2001) were among the first to use this approach in geophysical applications for estimating the diagonal of the Gaussian-shaped BEC operators.
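As an illustration, a minimal NumPy sketch of this estimator is given below (not the code used in the study): apply_B stands for any routine applying the BEC operator to a state vector, the probing vectors s_k have random ±1 components, and the diagonal is approximated by the elementwise ratio of the accumulated s_k ⊙ (B s_k) and s_k ⊙ s_k.

```python
import numpy as np

def mc_diagonal_estimate(apply_B, m, n_samples, rng=None):
    """Stochastic estimate of diag(B): d ~ sum_k s_k*(B s_k) / sum_k s_k*s_k
    (elementwise), with random probing vectors s_k."""
    rng = np.random.default_rng(rng)
    num = np.zeros(m)
    den = np.zeros(m)
    for _ in range(n_samples):
        s = rng.choice([-1.0, 1.0], size=m)   # Rademacher (+/-1) probing vector
        num += s * apply_B(s)                 # accumulate s ⊙ (B s)
        den += s * s                          # accumulate s ⊙ s (equals 1 for ±1 entries)
    return num / den

# Toy usage with an explicit SPD matrix standing in for the BEC operator
m = 200
A = np.random.default_rng(0).standard_normal((m, m))
B = A @ A.T + m * np.eye(m)
d_est = mc_diagonal_estimate(lambda v: B @ v, m, n_samples=500)
print(np.max(np.abs(d_est - np.diag(B)) / np.diag(B)))   # max relative error
```

For ±1 probing vectors the denominator is simply the number of samples; it is kept explicit so that other probing distributions (e.g., the uniform noise used in section 3b) can be substituted without changing the estimator.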








b. The Hadamard matrix method
The Monte Carlo (MC) technique was developed further by Bekas et al. (2007), who noticed that the method may converge to d in a finite number of iterations M if the ensemble vectors are mutually orthogonal. An easy way to construct such an ensemble is to draw the vectors sk from the columns of the M × M Hadamard matrix (HM), which span the model's state space (see appendix A for more details on HM theory).
Although it has not yet been proven that Hadamard matrices can be constructed for an arbitrary M divisible by 4, very efficient recursive algorithms for generating HM columns do exist for values of M whose factorization involves only prime numbers not exceeding 100. Since exact convergence (which can be achieved at M iterations) is never needed in practice, it is not even necessary to draw sk from a Hadamard matrix whose dimension exactly coincides with the state space dimension: if generation of sk is impossible because of an inconvenient value of M (e.g., 1004), M can always be replaced by a slightly larger number (e.g., 1008), and sk can be defined as the first 1004 components of the 1008-dimensional Hadamard vectors. In the numerical experiments described in the next section it is shown that such a modification does not affect the method's convergence during the first several hundred iterations.
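The sketch below illustrates the same estimator driven by truncated Hadamard columns. For simplicity it builds the HM with scipy.linalg.hadamard, which supports only powers of two, whereas the experiments described later use HMs assembled from 12-dimensional blocks; the truncation of the columns to the first M components mirrors the workaround just described.

```python
import numpy as np
from scipy.linalg import hadamard

def hm_diagonal_estimate(apply_B, m, n_iter):
    """Deterministic variant of the diagonal estimator: the probing vectors
    are the first m components of the columns of a Hadamard matrix of
    dimension m_h >= m (here the next power of two, for simplicity)."""
    m_h = 1 << (m - 1).bit_length()        # next power of two >= m
    H = hadamard(m_h)
    num = np.zeros(m)
    den = np.zeros(m)
    for k in range(min(n_iter, m_h)):
        s = H[:m, k].astype(float)         # truncated Hadamard column
        num += s * apply_B(s)
        den += s * s
    return num / den
```

When m_h = m and all m columns are used, the probing vectors form an orthogonal basis and the estimate becomes exact, which is the finite-termination property noted by Bekas et al. (2007).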
c. Locally homogeneous approximations































Fig. 1. Correlation functions corresponding to the homogeneous operators (4)–(5) with identical decorrelation scales.
3. Numerical results
To assess the efficiency of the methods outlined in the previous section, two series of numerical experiments with realistically inhomogeneous BEC models are performed. In the first series we test the methods in the 2D case with a state vector dimension of several thousand. In the second series, the LH0 and LH1 techniques are examined in a realistic 3D setting with a state space dimension of ~10^6.
a. Experimental setting in 2D
The state space is described by scalar functions defined on the orthogonal curvilinear grid of the Navy Coastal Ocean Model (NCOM; Martin et al. 2009) set up in Monterey Bay (Fig. 2). The number M of grid points (the dimension of the state space) was 3438. A vector field v(x) was used to generate the diffusion tensor as follows. The smaller principal axis λ2 of

Fig. 2. Five columns of the BEC operators used in the study: (a) Gaussian-shaped correlations and (b) their approximation by the inverse of the second-order polynomial in the diffusion operator.
In the 2D experiments, the vector field v is generated by treating bottom topography h(x) (Fig. 2) as a streamfunction. The threshold value υ was taken to be one-fifth of the rms variation of |∇h| over the domain.
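As an illustration of this construction (the exact dependence of the principal axes on the generating field is given by the equations above and is not reproduced here), the sketch below builds a 2 × 2 diffusion tensor field whose principal axes are aligned with and across the "flow" obtained by treating the topography as a streamfunction; the function name, the isotropic fallback below the threshold, and the roles assigned to λ1 and λ2 are illustrative assumptions rather than the paper's exact recipe.

```python
import numpy as np

def diffusion_tensor_from_topography(h, dx, dy, lam1, lam2, frac=0.2):
    """Illustrative 2x2 diffusion tensor field with principal axes tied to
    the flow v = z_hat x grad(h) (topography as a streamfunction).  Where
    |v| exceeds the threshold, the major axis lam1 is oriented along v and
    the minor axis lam2 across it; elsewhere the tensor is isotropic."""
    dh_dy, dh_dx = np.gradient(h, dy, dx)        # gradients along y (rows) and x (columns)
    u, v = -dh_dy, dh_dx                         # v = z_hat x grad(h)
    speed = np.hypot(u, v)                       # |v| = |grad h|
    thr = frac * np.sqrt(np.mean(speed**2))      # one-fifth of the rms of |grad h| for frac = 0.2
    ny, nx = h.shape
    D = np.empty((ny, nx, 2, 2))
    for j in range(ny):
        for i in range(nx):
            if speed[j, i] > thr:
                e1 = np.array([u[j, i], v[j, i]]) / speed[j, i]   # unit vector along the flow
                e2 = np.array([-e1[1], e1[0]])                    # unit vector across the flow
                D[j, i] = lam1**2 * np.outer(e1, e1) + lam2**2 * np.outer(e2, e2)
            else:
                D[j, i] = lam1**2 * np.eye(2)                     # isotropic fallback
    return D
```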
All the experiments described in sections 3b–3e are performed using the BEC models (4)–(5) with the parameters n = m = 2. Composite maps of five columns of the corresponding BEC operators are shown in Figs. 2a,b. The diffusion operator (1) is constrained to have a zero normal derivative at the open and rigid boundaries of the domain in both 2D and 3D experiments.
Numerically, the action of the Gaussian-shaped BEC operator
The exact values d(x) of the diagonal elements are shown in Fig. 2 (right panel). Their magnitude appears to be lower in the regions of “strong currents” (large v), as the corresponding δ functions are dispersed over larger areas by diffusion. The d(x) are higher near the boundaries because part of the domain available for dispersion is screened by the condition prescribing zero flux across either the open or rigid boundary.
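For completeness, a minimal 1D sketch of how a Gaussian-shaped covariance operator of this kind can be applied, by integrating a pseudo-diffusion (heat) equation with zero-flux boundaries in the spirit of Weaver and Courtier (2001), and of how exact diagonal elements can be obtained by brute force, is given below. Amplitude normalization is omitted and the actual experiments use the 2D/3D operators (1) and (4)–(5) on the NCOM grid, so this is only a toy analogue.

```python
import numpy as np

def apply_gaussian_bec_1d(s, dx, lam, n_steps=None):
    """Apply a Gaussian-shaped covariance operator exp((lam^2/2) d^2/dx^2)
    to a 1D field s by explicit diffusion steps with zero-flux (Neumann)
    boundaries.  Only the shape of the operator is reproduced; amplitude
    normalization is left out."""
    if n_steps is None:
        n_steps = max(4, int(2 * lam**2 / dx**2))      # keeps the explicit step stable
    nu = lam**2 / (2.0 * n_steps)                      # pseudo-diffusivity x pseudo-time step
    x = np.asarray(s, dtype=float).copy()
    for _ in range(n_steps):
        xp = np.empty_like(x)
        xp[1:-1] = x[1:-1] + nu * (x[2:] - 2.0 * x[1:-1] + x[:-2]) / dx**2
        xp[0] = x[0] + nu * (x[1] - x[0]) / dx**2      # zero-flux at the left boundary
        xp[-1] = x[-1] + nu * (x[-2] - x[-1]) / dx**2  # zero-flux at the right boundary
        x = xp
    return x

# Exact diagonal elements by brute force: apply the operator to unit vectors.
n, dx, lam = 101, 1.0, 5.0
d_exact = np.array([apply_gaussian_bec_1d(np.eye(n)[:, i], dx, lam)[i] for i in range(n)])
# d_exact is larger near the boundaries, mirroring the behavior noted above for the 2D case.
```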
b. Monte Carlo technique
The MC method is implemented in two ways: in the first series of experiments, the components of sk are taken to be either 1 or −1 with equal probability; in the second series they are drawn from white noise uniformly distributed on the interval [−1, 1]. The residual error ε is computed using (3). In both series the rates of reduction of ε with the iteration number k were similar and closely followed the k^(-1/2) law.

Fig. 3. (a) Error distribution after 60 iterations of the MC method, (b) its optimally smoothed version, and (c) the respective dependences of the domain-averaged error 〈ε〉 and
Figure 3a shows the distribution of ε(x) after 60 iterations of the MC method with the
To improve the accuracy, the MC estimates are low-pass filtered with the corresponding BEC operators at every iteration. To optimize the filter, the diffusion operator in
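Continuing the 1D toy example (and reusing mc_diagonal_estimate and apply_gaussian_bec_1d from the sketches above), the smoothing step amounts to one extra application of a diffusion-based low-pass filter to the raw diagonal estimate. The reduced filter scale lam_s used here is an arbitrary illustrative choice; the optimization of the filter described in the paper is not reproduced.

```python
import numpy as np

# Low-pass filter the raw MC diagonal estimate with the diffusion-based
# smoother (functions defined in the earlier sketches).
n, dx, lam = 101, 1.0, 5.0
apply_B = lambda v: apply_gaussian_bec_1d(v, dx, lam)
d_raw = mc_diagonal_estimate(apply_B, n, n_samples=60)   # noisy estimate after 60 iterations
lam_s = 2.0                                              # illustrative (shorter) filter scale
d_smooth = apply_gaussian_bec_1d(d_raw, dx, lam_s)       # smoothed diagonal estimate
```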
c. The Hadamard matrix method
Experiments with the Hadamard scheme are also done in two series. Since the value M = 3438 is not divisible by 4, an HM of that dimension does not exist. Instead, in the first series of experiments, sk were specified as the first 3438 components of the columns of the 3456-dimensional HM, which can easily be constructed from the 12-dimensional HMs.
To check the impact of the "nonexact" dimension on the convergence, we artificially increased M to 3456 by removing 18 land points in the domain. Experiments with the HM of exact dimension show that differences in convergence between the nonexact and exact experiments become visible only after several hundred iterations. After M iterations the error of the exact-HM method is reduced to machine precision, while the error in the first series of experiments leveled off at approximately 10^-3 after 1500 iterations. This is consistent with the 18/3438 ≈ 0.5% degree of nonorthogonality of the sk drawn from the nonexact HM.
Similar to the MC method, the accuracy of the HM estimates is improved significantly after smoothing. In addition, it is found that the effect of smoothing can be enhanced if the mapping of the 2D model field onto the 1D state vector is randomized: Fig. 4a bears an apparent trace of the columnwise numbering of the model field, which remains visible even after application of the algorithm generating the HM columns. As a consequence, the error distribution in Fig. 4a contains large scales in the cross-shore direction, which tend to make the smoothing algorithm less effective (Fig. 4c, dashed curves). This drawback can easily be corrected by randomizing the above-mentioned mapping (the randomized HM, or RHM, method), which provides an error pattern similar to Fig. 3a but with a somewhat lower value of 〈ε〉, clearly visible in Fig. 4c: the dashed lines show the evolution of 〈ε〉 for the straight HM method before (upper line) and after smoothing, while the solid black lines show the same quantities for the RHM method.
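A sketch of the randomization step is given below: scrambling the mapping between grid points and state-vector components is equivalent to applying a fixed random permutation to the rows of the HM before its columns are truncated, which is how it is written here (again with a power-of-two HM for simplicity).

```python
import numpy as np
from scipy.linalg import hadamard

def rhm_probing_vectors(m, n_iter, seed=0):
    """Probing vectors for the randomized-HM (RHM) variant: a fixed random
    permutation of the HM rows plays the role of a randomized mapping
    between the 2D model field and the 1D state vector, destroying the
    large-scale structure of the estimation error."""
    m_h = 1 << (m - 1).bit_length()               # next power of two >= m (simplification)
    H = hadamard(m_h)
    perm = np.random.default_rng(seed).permutation(m_h)
    for k in range(min(n_iter, m_h)):
        yield H[perm, k][:m].astype(float)        # randomized, truncated HM column
```

The vectors generated this way can be fed to the same accumulation loop as in the MC sketch of section 2a.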

Fig. 4. (a) Error distribution after 60 iterations of the HM method. (b) As in (a), but for the smoothed RHM method. (c) Reduction of the domain-averaged error 〈ε〉 with iterations for the MC (gray), RHM (solid), and HM (dashed) methods. The bottom graphs are obtained after optimal smoothing of the diagonal estimates. Thin horizontal lines show the error levels provided by the LH0 (〈ε〉 = 0.17) and LH1 (〈ε〉 = 0.10) methods.
Comparison with the MC method (gray curves in Fig. 4c) demonstrates a noticeable advantage of the HM technique (upper curves), which remains visible at higher iteration numbers k > 100 even after smoothing (lower curves). This advantage increases with iterations for two reasons: the HM method converges faster than k^(-1/2) by its nature, whereas the efficiency of smoothing (targeted at removing the small-scale error constituents) degrades as the signal-to-noise ratio of the diagonal estimates increases with k.
From the practical point of view, it is not reasonable to do more than a few hundred iterations, as 〈ε〉 drops to a value of a few percent (Fig. 3c), which is much smaller than the accuracy with which the background error variances are known. We may therefore conclude that it is advantageous to use the RHM technique when k ≃ 100. In this case (assuming that k ≪ M), utilization of an HM with the exact dimension is not necessary, as it does not affect the convergence.
d. Asymptotic expansion method


Fig. 5. Adjustment of the normalization factors near the boundary. (a) Map of the xth column of
Integrals on the rhs of (15) have to be evaluated numerically for all near-boundary points x. To speed up the computations, we adopted the LH assumption near the boundaries and replaced the convolutions in (15) by the respective analytical functions (4)–(5) with the fixed value ν(x). It should be noted that the assumption underlying (15) is not exact for the zero normal gradient condition in use (e.g., see Mirouze and Weaver 2010). However, the errors caused by neglecting the distortions introduced by the zero-flux conditions are significantly smaller (3%–7%; see Fig. 5) than the accuracy of the LH assumption itself. These errors could be removed, for example, by introducing "transparent" boundary conditions (Mirouze and Weaver 2010).
Figure 6 shows the horizontal distribution of the error ε(x) obtained by approximating the diagonal elements of

Fig. 6. Diagonal approximation errors under the (a) zeroth-order and (b) first-order LH methods for the Gaussian BEC model.
Another series of experiments is performed with a varying scaling parameter γ to find an optimal fit to d. Computations were made for 0 ≤ γ ≤ 1. The best result for the Gaussian BEC was obtained for γ2 = 0.30, which is fairly consistent with the value (γ2 = 0.33) given by (12). In the case of
e. Numerical efficiency
Table 1 provides an overview of the performance of the tested methods. For comparison purposes we show the CPU requirements of the smoothed MC and RHM methods after they achieve the accuracies of the LH0 and LH1 methods. It is seen that both the MC and RHM methods are 300–1000 times more computationally expensive than the LH technique. In fact, for the 2D case considered, the computational cost of the stochastic methods is similar to that of the three-dimensional variational data assimilation (3DVAR) analysis itself.
Table 1. Relative CPU times required by the MC and RHM methods to achieve the accuracies 〈ε〉 of the LH0 and LH1 methods (shown in parentheses).

Comparison of the spatial distributions of the approximation error 〈ε〉(x) favors the LH methods as well: they show significantly less small-scale variation and may have potential for further improvement.
Comparing Figs. 3b, 4b, and 6b shows that, in contrast to the MC and RHM methods, the LH0 errors tend to increase in the regions of strong inhomogeneity, but they decrease substantially after smoothing by the LH1 algorithm. At the same time, the LH1 errors tend to have relatively higher values near the boundaries; this effect is less visible in the MC and RHM patterns (Figs. 3b and 4b). This feature can be partly attributed to a certain inaccuracy in the algorithm for analytic estimation of the near-boundary elements (Fig. 5c).
Table 1 also shows that the LH methods outperform both the MC and HM techniques. Although considerable resources are required to compute the near-boundary integrals for the adjustment factors (15), the overall CPU savings are quite remarkable.
f. LH experiments in the 3D setting
To further check the performance of the LH0 and LH1 methods, a larger 3D domain is taken from the NCOM setup in the Okinawa region south of Japan (Fig. 7), with a horizontal resolution of 10 km and 45 vertical levels. The state vector dimension M (the total number of grid points) in this setting was 862 992.

Fig. 7. Diagonal elements of
Because of the large M, it is computationally infeasible to directly compute all the diagonal elements of the BEC matrix. Therefore, accuracy checks are performed on a subset of 10 000 points randomly distributed over the domain, and the value of 〈ε〉 is estimated by averaging over these points.
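A minimal sketch of such a check is given below (an assumption about the bookkeeping rather than the authors' code; the pointwise error is taken to be the relative difference between the approximate and exact diagonal elements, consistent with the percentage figures quoted in this section).

```python
import numpy as np

def subset_error(apply_B, d_est, m, n_check=10_000, seed=0):
    """Domain-averaged relative error of a diagonal approximation d_est,
    evaluated on a random subset of points where the exact diagonal
    elements are computed by applying B to unit vectors."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(m, size=min(n_check, m), replace=False)
    eps = []
    for i in idx:
        e = np.zeros(m)
        e[i] = 1.0
        d_i = apply_B(e)[i]                 # exact diagonal element B_ii
        eps.append(abs(d_est[i] - d_i) / d_i)
    return float(np.mean(eps))
```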
The diffusion tensor is constructed in the way described in section 3a, but the generating field v(x) is taken to be the horizontal velocity field from an NCOM run. The value of λ3 (in the vertical direction) is independent of the horizontal coordinates but varies in the vertical as 3δz, where δz is the vertical grid step. Figure 7 illustrates the spatial variability of the
Figure 8 demonstrates the accuracy of the LH0 and LH1 methods in this setting: the LH0 method provides an accuracy of 9%, which is further improved to 6% by the LH1 scheme. The major improvement occurs in the regions where points with highly anisotropic ν neighbor isotropic points and reduce the diagonal elements in the latter. The effect is reflected in the negative bias of the scatterplot at high values of d0, which reach their maximum of 0.0237 at points with isotropic ν (Fig. 8a).

Fig. 8. Scatterplots of the true diagonal elements of
Figure 8c shows the dependence of the approximation error ε on the value of γ3 for both correlation models. The best approximation is obtained at γ3 = 0.26, a value somewhat lower than that suggested by the heuristic formula (γ3 = 5/18 ≈ 0.28). Similar to the 2D case, the optimal value of
In general, it appears that the relationship (12) provides reasonable guidance for the estimation of the smoothing parameter in the LH1 method. For the
4. Summary and discussion
In this study we examined the computational efficiency of several techniques used for estimating the diagonal elements of two types of BEC operators: with the Gaussian-shaped kernel
Numerical experiments conducted with realistic diffusion tensor models have shown that (i) the HM technique proves to be superior in efficiency to the MC technique when accuracies of better than 10% (k > 100) are required; (ii) both stochastic methods require 300–1000 times more CPU time to achieve an accuracy comparable with that of the most efficient LH1 method; and (iii) with the Gaussian model, the LH1 method demonstrates the best performance, with a value of the smoothing parameter γ consistent with the one given by the relationship (12) derived from the asymptotic approximation of the Gaussian kernel diagonal.
In deriving the ansatz (13) for the LH1 model we followed the approach of Purser et al. (2003), who proposed smoothing the zeroth-order diagonal estimate with the square root of the BEC operator in the one-dimensional case. Using the asymptotic technique for the heat kernel expansion, we obtained a formula for higher dimensions and tested its validity by numerical experimentation.
It should be noted that the formal asymptotic expansion (7) is local by nature and tends to diverge in practical applications, where spatial variations of the diffusion tensor may occur at distances L comparable with the typical decorrelation scale
In general, the results of our experiments demonstrate the high computational efficiency of the LH1 scheme, whose total CPU requirement is just a fraction of the CPU time needed for a single convolution with the BEC operator, a negligible amount compared to the cost of the 3DVAR analysis.
A separate question, requiring further investigation, is the accurate treatment of the boundary conditions. In the present study we assumed that boundaries affect only the magnitude of the corresponding columns of
On the other hand, it might be beneficial to keep physical boundary conditions in the formulation of
Results of this study indicate that LH1 approximations to the BEC diagonal may serve as an efficient tool for renormalization of the correlation operators in variational data assimilation, as they are capable of providing 3%–10% accuracy in realistically inhomogeneous BEC models.
Acknowledgments. This study was supported by the Office of Naval Research (Program Element 0602435N).
APPENDIX A
Hadamard Matrices



Hadamard matrices are widely used in many branches of applied mathematics and statistics (http://en.wikipedia.org/wiki/Hadamard_matrix). In the present study we used MATLAB code that handles only the cases when M/12 or M/20 is a power of 2. Despite this restriction, the available values of M were sufficient for our purposes.
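For readers without access to that code, the relevant orders can be reproduced with standard constructions: Sylvester's recursion (Kronecker products with the 2 × 2 HM) covers powers of two, a Paley type I construction provides the 12- and 20-dimensional blocks from the primes q = 11 and q = 19, and the Kronecker product of two Hadamard matrices is again a Hadamard matrix, so orders such as 12 × 2^k, 20 × 2^k, or the 3456 = 12 × 12 × 12 × 2 mentioned in section 3c follow directly. The sketch below illustrates these constructions; it is not the code used in the study.

```python
import numpy as np
from scipy.linalg import hadamard    # Sylvester construction, powers of two only

def paley_hadamard(q):
    """Paley type I Hadamard matrix of order q + 1 for a prime q = 3 (mod 4):
    H = I + C, where C is the bordered Jacobsthal (quadratic-residue) matrix."""
    chi = lambda a: 0 if a % q == 0 else (1 if pow(a, (q - 1) // 2, q) == 1 else -1)
    Q = np.array([[chi(i - j) for j in range(q)] for i in range(q)])
    C = np.zeros((q + 1, q + 1), dtype=int)
    C[0, 1:] = 1                      # border row of +1s
    C[1:, 0] = -1                     # border column of -1s
    C[1:, 1:] = Q
    return np.eye(q + 1, dtype=int) + C

H12, H20 = paley_hadamard(11), paley_hadamard(19)         # 12- and 20-dimensional blocks
assert np.array_equal(H12 @ H12.T, 12 * np.eye(12, dtype=int))

# Larger orders by Kronecker products, e.g., 3456 = 12 * 12 * 12 * 2:
H3456 = np.kron(np.kron(H12, H12), np.kron(H12, hadamard(2)))
```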
APPENDIX B
Correlation Modeling with Diffusion Operator






































REFERENCES
Avramidi, I. G., 1999: Covariant techniques for computation of the heat kernel. Rev. Math. Phys., 11, 947–980.
Bekas, C. F., E. Kokiopoulou, and Y. Saad, 2007: An estimator for the diagonal of a matrix. Appl. Numer. Math., 57, 1214–1229.
Derber, J., and A. Rosati, 1989: A global ocean data assimilation system. J. Phys. Oceanogr., 19, 1333–1347.
Di Lorenzo, E., A. M. Moore, H. G. Arango, B. D. Cornuelle, A. J. Miller, B. S. Powell, B. S. Chua, and A. F. Bennett, 2007: Weak and strong constraint data assimilation in the Inverse Ocean Modelling System (ROMS): Development and application for a baroclinic coastal upwelling system. Ocean Modell., 16, 160–187.
Dong, S.-J., and K.-F. Liu, 1994: Stochastic estimation with Z2 noise. Phys. Lett., 328B, 130–136.
Girard, D. F., 1989: A fast Monte-Carlo cross-validation procedure for large least squares problems with noisy data. Numer. Math., 56, 1–23.
Gusynin, V. P., and V. A. Kushnir, 1991: On-diagonal heat kernel expansion in covariant derivatives in curved space. Class. Quantum Gravity, 8, 279–285.
Hutchinson, M. F., 1989: A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines. Commun. Stat. Simul. Comput., 18, 1059–1076.
Liu, Y., J. Zhu, J. She, S. Zhuang, W. Fu, and J. Gao, 2009: Assimilating temperature and salinity profile observations using an anisotropic recursive filter in a coastal ocean model. Ocean Modell., 30, 75–87.
Martin, P. J., C. N. Barron, L. F. Smedstad, T. J. Campbell, A. J. Wallcraft, R. C. Rhodes, C. Rowley, and T. L. Townsend, 2009: User's manual for the Navy Coastal Ocean Model Version 4.0. NRL/MR/7320-09-9151, Naval Research Laboratory, Stennis Space Center, MS, 75 pp.
Mirouze, I., and A. T. Weaver, 2010: Representation of correlation functions in variational data assimilation using an implicit diffusion operator. Quart. J. Roy. Meteor. Soc., 136, 1421–1443.
Pannekoucke, O., and S. Massart, 2008: Estimation of the local diffusion tensor and normalization for heterogeneous correlation modelling using a diffusion equation. Quart. J. Roy. Meteor. Soc., 134, 1425–1438.
Purser, R. J., 2008a: Normalization of the diffusive filters that represent the inhomogeneous covariance operators of variational assimilation, using asymptotic expansions and the techniques of non-Euclidean geometry. Part I: Analytic solutions for symmetrical configurations and the validation of practical algorithms. NOAA/NCEP Office Note 456, 48 pp.
Purser, R. J., 2008b: Normalization of the diffusive filters that represent the inhomogeneous covariance operators of variational assimilation, using asymptotic expansions and the techniques of non-Euclidean geometry. Part II: Riemannian geometry and the generic parametrix expansion method. NOAA/NCEP Office Note 457, 55 pp.
Purser, R. J., W. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances. Mon. Wea. Rev., 131, 1536–1548.
Weaver, A., and P. Courtier, 2001: Correlation modeling on a sphere using a generalized diffusion equation. Quart. J. Roy. Meteor. Soc., 127, 1815–1846.
Weaver, A., J. Vialard, and D. L. T. Anderson, 2003: Three- and four-dimensional variational assimilation with a general circulation model of the tropical Pacific Ocean. Part I: Formulation, internal diagnostics, and consistency checks. Mon. Wea. Rev., 131, 1360–1378.
Xu, Q., 2005: Representations of inverse covariances by differential operators. Adv. Atmos. Sci., 22 (2), 181–198.
Yaremchuk, M., , and S. Smith, 2011: On the correlation functions associated with polynomials of the diffusion operator. Quart. J. Roy. Meteor. Soc., 137, 1927–1932.