• Abramowitz, M., and I. A. Stegun, 1970: Handbook of Mathematical Functions. 9th ed. Dover Publications, 1046 pp.

  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642.

  • Barnier, B., and Coauthors, 2006: Impact of partial steps and momentum advection schemes in a global ocean circulation model at eddy permitting resolution. Ocean Dyn., 56, 543–567.

  • Bateman, H., and A. Erdelyi, 1954: Tables of Integral Transforms. Vols. 1 and 2. McGraw-Hill Book Company, 835 pp.

  • Brankart, J-M., and P. Brasseur, 1996: Optimal analysis of in situ data in the western Mediterranean using statistics and cross-validation. J. Atmos. Oceanic Technol., 13, 477–491.

  • Brankart, J-M., C-E. Testut, P. Brasseur, and J. Verron, 2003: Implementation of a multivariate data assimilation scheme for isopycnic coordinate ocean models: Application to a 1993–96 hindcast of the North Atlantic Ocean circulation. J. Geophys. Res., 108 (C3), 3074, doi:10.1029/2001JC001198.

  • Cohn, S. E., 1997: An introduction to estimation theory. J. Meteor. Soc. Japan, 75, 257–288.

  • Da Silveira, I., L. Miranda, and W. Brown, 1994: On the origins of the North Brazil Current. J. Geophys. Res., 99 (C11), 22501–22512.

  • Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85–96.

  • Fratantoni, D., W. Johns, and T. Townsend, 1995: Rings of the North Brazil Current: Their structure and behavior inferred from observations and a numerical simulation. J. Geophys. Res., 100 (C6), 10633–10654.

  • Fukumori, I., 2002: A partitioned Kalman filter and smoother. Mon. Wea. Rev., 130, 1370–1383.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, 341 pp.

  • Kimeldorf, G., and G. Wahba, 1970: A correspondence between Bayesian estimation of stochastic processes and smoothing by splines. Ann. Math. Stat., 41, 495–502.

  • Landau, L., and E. Lifshitz, 1951: Statistical Physics: Course of Theoretical Physics. Vol. 5. Butterworth-Heinemann, 592 pp.

  • Liu, Z., and F. Rabier, 2002: The interaction between model resolution and observation resolution and density in data assimilation. Quart. J. Roy. Meteor. Soc., 128, 1367–1386.

  • Liu, Z., and F. Rabier, 2003: The potential of high density observations for numerical weather prediction: A study with simulated observations. Quart. J. Roy. Meteor. Soc., 129, 3013–3035.

  • McIntosh, P. C., 1990: Oceanographic data interpolation: Objective analysis and splines. J. Geophys. Res., 95 (C8), 13529–13541.

  • Morse, P. M., and H. Feshbach, 1953: Methods of Theoretical Physics. Parts I and II. McGraw-Hill, 1978 pp.

  • Ott, E., B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Penduff, T., J. Le Sommer, B. Barnier, A-M. Treguier, J-M. Molines, and G. Madec, 2007: Influence of numerical schemes on current-topography interactions in 1/4° global ocean simulations. Ocean Sci., 3, 509–524.

  • Pham, D. T., J. Verron, and M. C. Roubaud, 1998: Singular evolutive extended Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

  • Rabier, F., 2006: Importance of data: A meteorological perspective. Ocean Weather Forecasting: An Integrated View of Oceanography, E. P. Chassignet and J. Verron, Eds., Springer, 343–360.

  • Reif, F., 1965: Fundamentals of Statistical and Thermal Physics. McGraw-Hill, 651 pp.

  • Testut, C., P. Brasseur, J. Brankart, and J. Verron, 2003: Assimilation of sea-surface temperature and altimetric observations during 1992–1993 into an eddy permitting primitive equation model of the North Atlantic Ocean. J. Mar. Syst., 40–41, 291–316.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.

  • Fig. 1. Observation error covariance as a function of the distance ρ, as obtained numerically by inversion of the tridiagonal matrix given by (39) (for σ0 = 1 and different values of ℓ/Δξ). The solution is drawn (dotted curves) (left) for ℓ = 1 and decreasing Δξ = 2, 1, and 0.5 and (right) for Δξ = 1 and decreasing ℓ = 2, 1, 0.5, and 0.1. Larger bullets correspond to smaller ℓ/Δξ. In the left panel, the discrete solutions are multiplied by the factor ℓ/Δξ, to show the convergence to the continuous solution given by (35) (solid curve).

  • Fig. 2. Observation error covariance as a function of the distance ρ (along the grid lines), as obtained numerically for regular and isotropic grid spacings (for σ0 = 1 and different values of ℓ/Δξ). The solution is drawn (dotted curves) (left) for ℓ = 1 and decreasing Δξ = 1, 0.5, and 0.2 and (right) for Δξ = 1 and decreasing ℓ = 2, 1, 0.5, and 0.1. Larger bullets correspond to smaller ℓ/Δξ. In the left panel, the discrete solutions are multiplied by the factor ℓ²/Δξ², to show the convergence to the continuous solution given by (44) (solid curve).

  • Fig. 3. Snapshots of the circulation in the region of the North Brazil Current, as simulated by the model for (top) 2 and (bottom) 14 Dec of the first year. (left) The sea surface height (m), (middle) the magnitude of its gradient (meters per grid point), and (right) sea surface velocity (m s−1).

  • Fig. 4. As in Fig. 3, but for the (top) means and (bottom) standard deviations of the 5-yr simulation.

  • Fig. 5. Simulated observational noise on sea surface elevation for three correlation lengths: (from left to right) ℓ = 0, 5, and 15 grid points. (top) The random noise (with variance equal to 1) and (bottom) the corresponding gradient, using the grid spacing as the length unit.

  • Fig. 6. Observational update increment on (left) sea surface height (m), (middle) zonal velocity (m s−1), and (right) meridional velocity (m s−1), that would result from one single observation (with 0.04-m error standard deviation) of altimetry located at 9°N, 54°W (in the middle of the area traversed by the mesoscale rings). This illustrates the size of the domain of influence of the observations.

  • Fig. 7. Error standard deviation corresponding to experiment 1, (top) as measured by the ensemble of differences with respect to the true states, and (bottom) as estimated by the scheme (the square root of the diagonal of 𝗣a). It is shown (left) for altimetry (m), (middle) for its gradient (m per grid point), and (right) for velocity (m s−1).

  • Fig. 8. As in Fig. 7, but for experiment 2.

  • Fig. 9. As in Fig. 7, but for experiment 3.

  • Fig. 10. This figure generalizes the results of Figs. 7, 8, and 9 (here averaged over the domain), by showing them as a function of the observation error correlation length scales (in grid points) ℓo (x axis), characterizing the simulated noise, and ℓp (y axis), which is used to parameterize the observation error covariance matrix 𝗥. Shown are results for (left two panels) sea surface height and (right two panels) velocity. Within each variable pair, the left panel shows the true error standard deviation (as measured by the ensemble of differences with respect to the true states), and the right panel shows the ratio between estimated and measured error standard deviations.


Efficient Parameterization of the Observation Error Covariance Matrix for Square Root or Ensemble Kalman Filters: Application to Ocean Altimetry

  • 1 LEGI/CNRS-Grenoble Universités, Grenoble, France

Abstract

In the Kalman filter standard algorithm, the computational complexity of the observational update is proportional to the cube of the number y of observations (leading behavior for large y). In realistic atmospheric or oceanic applications, involving an increasing quantity of available observations, this often leads to a prohibitive cost and to the necessity of simplifying the problem by aggregating or dropping observations. If the filter error covariance matrices are in square root form, as in square root or ensemble Kalman filters, the standard algorithm can be transformed to be linear in y, provided that the observation error covariance matrix is diagonal. This diagonality requirement is a significant drawback of the transformed algorithm and often leads to an assumption of uncorrelated observation errors for the sake of numerical efficiency. In this paper, it is shown that the linearity of the transformed algorithm in y can be preserved for other forms of the observation error covariance matrix. In particular, quite general correlation structures (with analytic asymptotic expressions) can be simulated simply by augmenting the observation vector with differences of the original observations, such as their discrete gradients. Errors in ocean altimetric observations are spatially correlated, as for instance orbit or atmospheric errors along the satellite track. Adequately parameterizing these correlations can directly improve the quality of observational updates and the accuracy of the associated error estimates. In this paper, the example of the North Brazil Current circulation is used to demonstrate the importance of this effect, which is especially significant in that region, where the ratio between signal amplitude and observation noise is moderate, and to show that the efficient parameterization that is proposed for the observation error correlations is appropriate for taking it into account. A physical justification is also given for adding explicit gradient observations.
This parameterization is thus shown to be useful for ocean data assimilation systems based on square root or ensemble Kalman filters whenever the number of observations becomes penalizing and a sophisticated parameterization of the observation error correlations is required.

Corresponding author address: Jean-Michel Brankart, LEGI/CNRS, BP53X, 38041 Grenoble CEDEX, France. Email: Jean-Michel.Brankart@hmg.inpg.fr


1. Introduction

In atmospheric or oceanic applications of Kalman filters, the growing number of available observations often leads to a prohibitive cost of the observational update (analysis step), and to the necessity of simplifying the problem. Ad hoc solutions must be found to make the problem numerically tractable. A first option is to condense the observational information by aggregating observations into superobservations, or even by dropping the least useful or most redundant measurements (data thinning). Another option is to transform the original algorithm and reduce its computational complexity by taking advantage of prior hypotheses on the error statistics (i.e., on the shape of the state and observation error covariance matrices). Simplifications are thus applied to the second-order statistical moments of the errors (which are only approximately known anyway) rather than to the observations themselves. Of course, these two options are not mutually exclusive; they can interact with and complement each other. As explained in Rabier (2006), the need for data thinning can also result from oversimplistic assumptions in the parameterization of the observation error covariance matrix. For instance, with a suboptimal scheme neglecting observation error correlations, decreasing the observation density can help improve the accuracy of the estimation (Liu and Rabier 2002, 2003). In this paper, we propose to reduce the numerical cost of the observational update by using simplified (but rather general) parameterizations of the observation error covariance matrix. The expected consequence is that, with improved efficiency, together with sufficient accuracy and robustness in the representation of the observation error covariance matrix, this method can substantially reduce the need for data thinning.

If the forecast error covariance matrix is available in square root form, as in square root or ensemble Kalman filters, it is possible to use a modified observational update algorithm (proposed by Pham et al. 1998), whose computational complexity is linear in the number of observations (instead of cubic, as in the standard formula), provided that the observation error covariance matrix can be inverted at low cost, as for instance if it is diagonal. The purpose of this paper is to introduce specific parameterizations of the observation error correlations that preserve the numerical efficiency of that modified algorithm. This can be done (i) by expressing the observation error covariance matrix as the sum of a diagonal and a low rank matrix, or (ii) by applying a linear transformation to the observation vector (and assuming uncorrelated observations in the transformed space). It is interesting to note that, in parameterization (ii), nonsquare transformation matrices are possible, which means that the observation vector can be augmented with new observations that are linear combinations of the original observations. Both parameterizations are presented in section 2 of this paper. In section 3, a specific choice of linear transformation, consisting of augmenting the observation vector with gradients of the original observations, is studied in more detail.

In section 4, the algorithm is applied to ocean altimetric observations, as simulated by a 1/4° model of the tropical Atlantic Ocean, focusing on the North Brazil Current. Indeed, it is known that altimetric observation errors are spatially correlated, because, for example, of orbit errors or atmospheric correction errors. Moreover, it is important to take these correlations into account, because doing so can directly improve the quality of the observational update (especially for the dynamic height gradient, and thus for velocities) and the accuracy of the associated error estimates. In the North Brazil Current, the ratio between signal amplitude (about 5 cm) and typical observational noise (about 4 cm) remains moderate: the signal is only marginally observed. This example is thus appropriate to show the importance of accounting for error correlations when reconstructing the ocean circulation, and to check the validity of our simplified parameterizations.

2. Parameterization of the observation error covariance matrix

In data assimilation problems, the observation error ϵ is defined as the difference between the observation vector y (size y) and the observation counterpart of the true state xt (size x):
ϵ = y − 𝗛xt,   (1)
where 𝗛 (y × x) is the observation operator. The specification of the observation error statistics thus always requires defining properly the truth of the problem (Cohn 1997; Kalnay 2003), which generally amounts to identifying the exact scope of the estimation problem. In atmospheric or oceanic applications, this is usually done by restricting the range of resolved scales in space and time, using for instance a filtering or averaging operator acting on the continuous state of the atmospheric or oceanic system. Observation error thus not only includes a measurement error, but also a representation error that results from this restriction in the scope of the problem. In this paper, it is assumed that the total observation error ϵ is characterized by a zero mean ⟨ϵ⟩ = 0 (unbiased observations) and a known covariance matrix 𝗥 = ⟨ϵϵT⟩. Our purpose is to introduce efficient approximate parameterizations of this known observation error covariance matrix for use in square root or ensemble Kalman filters.

a. Observational update in square root or ensemble Kalman filters

In Kalman filters, the standard formula to compute the observational update increment δx of the model state vector is
δx = 𝗣f𝗛T(𝗛𝗣f𝗛T + 𝗥)−1δy,   (2)
where δy = y − 𝗛xf is the innovation vector, representing the difference between the observation vector y (size y) and the model equivalent of the observation in the forecast state vector xf (size x), and 𝗣f (x × x) is the forecast error covariance matrix. The computational complexity (leading behavior for large x and y) of this standard formula is
C0 = y³/6 + xy.   (3)
In the computation of C0, it is assumed that a linear system is solved to compute (𝗛𝗣f𝗛T + 𝗥)−1δy, with asymptotic complexity y³/6 (for a symmetric matrix). The second term in C0 corresponds to the left multiplication by (𝗛𝗣f)T. In addition, the cost of application of the observation operator 𝗛 is assumed negligible throughout this discussion. It is negligible for instance if every observation is related to a small number of state variables. If 𝗛 is more complex, it is straightforward to add the cost of 𝗛 to the computational complexity formulas and transform the conclusions accordingly.
If the forecast error covariance matrix is available in square root form:
𝗣f = 𝗦f𝗦fT,   (4)
as in square root or ensemble Kalman filters, the standard formula (2) can be transformed into
δx = 𝗦f[𝗜 + (𝗛𝗦f)T𝗥−1𝗛𝗦f]−1(𝗛𝗦f)T𝗥−1δy,   (5)
as suggested by Pham et al. (1998), using the Sherman–Morrison–Woodbury formula:
(𝗛𝗣f𝗛T + 𝗥)−1 = 𝗥−1 − 𝗥−1𝗛𝗦f[𝗜 + (𝗛𝗦f)T𝗥−1𝗛𝗦f]−1(𝗛𝗦f)T𝗥−1.   (6)
Formula (5) is advantageous with respect to formula (2) only if 𝗥 can be inverted at low cost, for instance if it is diagonal. If 𝗥 is diagonal, the asymptotic computational complexity of formula (5) (leading behavior for large x, y, and r) is
C1 = yr² + r³/6 + xr,   (7)
where r is the number of columns in 𝗦f (the maximum rank of 𝗣f). The first term corresponds to the computation of the r × r matrix between brackets; the second term, to the solution of the linear system; and the last term, to the left multiplication by 𝗦f. The main difference between formula (5) and formula (2) is that the linear system to solve is of size r (complexity r³/6) instead of y (complexity y³/6).
The key advantage of formula (5) with respect to formula (2) is that the computational complexity C1 is linear in the number of observations y (a property that disappears if a general matrix 𝗥 is inverted). With formula (5), larger observation vectors become numerically tractable. Asymptotically, for large values of x, y, and r, with fixed ratios y/x and r/x (this means in practice that each of these numbers is small with respect to the product of the other two: x ≪ yr, y ≪ xr, r ≪ yx), the gain factor that is obtained by using formula (5) instead of formula (2) simplifies (only the cubic terms remain) to
C1/C0 ≃ 6(r/y)² + (r/y)³.   (8)
Even in the full rank problem (r ≥ x), formula (5) is (asymptotically) cheaper than formula (2) as soon as y/r > 2.53. But the benefit of formula (5) becomes really clear in small rank problems (r ≪ x), which result from the application of reduced rank or ensemble Kalman filters. In these problems, it is often useful to reach very small r/y ratios, for which formula (5) is by far preferable. Nevertheless, the main drawback of using formula (5) is that it leads to assuming a diagonal observation error covariance matrix. It is the purpose of this paper to show how it is possible to introduce parameterizations of the observation error correlations that preserve the numerical efficiency of formula (5).
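As a concrete check of the equivalence between formulas (2) and (5), the following NumPy sketch (toy dimensions and random matrices chosen for illustration, not taken from the paper) computes the observational update increment both ways for a diagonal 𝗥 and verifies that they agree:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y, r = 50, 30, 10                     # toy state, observation, and rank sizes

Sf = rng.standard_normal((x, r))         # square root of Pf: Pf = Sf Sf^T
H = rng.standard_normal((y, x))          # observation operator
Rdiag = rng.uniform(0.5, 2.0, y)         # diagonal observation error covariance
dy = rng.standard_normal(y)              # innovation vector

Pf = Sf @ Sf.T

# Standard formula (2): one linear solve of size y (cost ~ y^3)
dx_std = Pf @ H.T @ np.linalg.solve(H @ Pf @ H.T + np.diag(Rdiag), dy)

# Transformed formula (5): one linear solve of size r, cost linear in y
HS = H @ Sf                                    # y x r
M = np.eye(r) + HS.T @ (HS / Rdiag[:, None])   # r x r matrix between brackets
dx_sqrt = Sf @ np.linalg.solve(M, HS.T @ (dy / Rdiag))

print(np.allclose(dx_std, dx_sqrt))      # the two increments agree
```

Only divisions by the diagonal of 𝗥 and an r × r solve are needed in the second form, which is the source of the linearity in y.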

It is worth noting here that, in realistic applications, the observational update is often performed locally by dividing the full model state into subdomains, and by performing a separate observational update for every subdomain using a subset of the global observation dataset (see, e.g., Anderson 2003; Houtekamer and Mitchell 1998; Ott et al. 2004; Tippett et al. 2003). With local methods, the size of the observation vector can be significantly reduced with respect to a global observational update, thus modifying the computational complexities (3) and (7) of the algorithms and the gain factor (8) that is obtained by using formula (5) instead of formula (2). The same expressions can, however, still be applied, provided that x and y are defined as the sizes of the local state and observation vectors. In addition, if r is still the number of columns in 𝗦f, it can usually be set smaller with local methods. The use of low-rank 𝗣f parameterizations (or small size ensembles) is indeed one important reason for which local methods are required (Houtekamer and Mitchell 1998).

b. Observational update of the error covariance in square root or ensemble Kalman filters

In this section, we examine how the conclusions of the previous section must be modified if the problem requires the observational update of the error covariance. (This is always the case if the Kalman filter is not approximated by an optimal interpolation scheme.) In Kalman filters, the standard formula to compute the observational update of the error covariance, corresponding to formula (2), is
𝗣a = 𝗣f − 𝗣f𝗛T(𝗛𝗣f𝗛T + 𝗥)−1𝗛𝗣f,   (9)
where 𝗣a is the updated (analysis) error covariance matrix. The computational complexity (leading behavior for large x and y) of this standard formula is
C0P = y³/2 + xy² + x²y/2.   (10)
The first term corresponds to the symmetric matrix inversion, and the last two terms to matrix multiplications. This complexity includes complexity C0 if formulas (2) and (9) are applied together, because most operations of formula (2) are included in formula (9).
If the forecast error covariance matrix is available in square root form (4), it is shown in Pham et al. (1998) that the updated matrix corresponding to formula (5) can be obtained in square root form (𝗣a = 𝗦a𝗦aT) using the formula
𝗦a = 𝗦f[𝗜 + (𝗛𝗦f)T𝗥−1𝗛𝗦f]−1/2.   (11)
If π—₯ is diagonal, the computational complexity (leading behavior for large x, y, and r) of formula (11) is
C1P = yr² + 2r³/3 + xr².   (12)
The first term corresponds to the computation of the r × r matrix between brackets, the second term includes the computation of the inverse matrix and the Cholesky decomposition of the inverse (the cheapest square root), and the last term corresponds to the left multiplication by 𝗦f. This complexity includes complexity C1 if formulas (5) and (11) are applied together, because most operations of formula (5) are included in formula (11).
Again, the key advantage of formula (11) over formula (9) is that the computational complexity C1P is linear in the number of observations y. However, new cubic terms appear, so that the (asymptotic) gain factor is not as simple as (8):
C1P/C0P = (yr² + 2r³/3 + xr²)/(y³/2 + xy² + x²y/2).   (13)
However, as long as x/y remains moderate, the conclusions of section 2a remain valid: the gain behaves proportionally to (r/y)² for small r/y. And if x/y is large, formula (11) is even more favorable, since the gain behaves like (r/y)²y/x for small r/y. It is also worth noting that with formula (11), the additional cost of computing the update of the error covariance, with respect to formula (5), is usually moderate, behaving at most (for small r/y) like C1P/C1 ∼ 1 + x/y.
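The equivalence between (9) and (11) can be illustrated in the same spirit. In this hedged sketch (toy dimensions, random matrices, assumed only for illustration), the square root of the bracketed inverse is taken as its Cholesky factor, as in the text, and 𝗦a𝗦aT is compared with the standard analysis covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, r = 40, 25, 8
Sf = rng.standard_normal((x, r))
H = rng.standard_normal((y, x))
Rdiag = rng.uniform(0.5, 2.0, y)

Pf = Sf @ Sf.T
HS = H @ Sf

# Standard covariance update (9)
G = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + np.diag(Rdiag))
Pa_std = Pf - G @ H @ Pf

# Square root update (11): the Cholesky factor of the inverse of the
# r x r matrix between brackets serves as its (cheapest) square root
M = np.eye(r) + HS.T @ (HS / Rdiag[:, None])
L = np.linalg.cholesky(np.linalg.inv(M))
Sa = Sf @ L

print(np.allclose(Pa_std, Sa @ Sa.T))    # Sa Sa^T reproduces Pa
```

All operations on y-sized objects here are elementwise divisions by the diagonal of 𝗥, consistent with the linearity in y of C1P.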
In the ensemble Kalman filter (Evensen and van Leeuwen 1996), the update of the error covariance is performed differently: the ensemble forecast (describing the error covariance) is updated by the application of formula (2) to an ensemble of innovation vectors (representing the difference between the observation vector and each member of the ensemble forecast, perturbed by a random vector of covariance 𝗥). However, the complexities C0 and C1 are not simply multiplied by the size r of the ensemble, because it is cheaper here to explicitly invert the matrix rather than solving the r linear systems, so that (3) transforms to
C0E = y³/2 + 3y²r/2 + xyr.   (14)
The first term corresponds to the inversion of the matrix 𝗛𝗣f𝗛T + 𝗥, the second term includes the computation of the matrix 𝗛𝗣f𝗛T from the square root representation and the application of the inverse matrix to the ensemble innovations, and the third term corresponds to the left multiplications by (𝗛𝗣f)T to obtain the ensemble corrections. On the other hand, (7) transforms to
C1E = 3yr²/2 + r³/2 + xr².   (15)
The first term includes the computation of the r × r matrix between brackets and the application of the inverse matrix to the ensemble innovations, the second term corresponds to the inversion of the matrix between brackets, and the last term to the left multiplication by 𝗦f to obtain the ensemble corrections. New cubic terms appear in (14) and (15), so that the gain factor is analogous to (13):
C1E/C0E = (3yr²/2 + r³/2 + xr²)/(y³/2 + 3y²r/2 + xyr).   (16)
However, as long as x/y remains moderate [x/y ≪ (y/r)²], the conclusions of section 2a remain qualitatively the same: the computational complexity C1E is linear in y, and for small r/y, the gain (16) behaves proportionally to (r/y)². [If x/y is large, the leading behavior of C1E/C0E for small r/y is proportional to r/y instead of (r/y)², because the leading terms in (14) and (15) become the last terms, proportional to x, whose ratio is equal to r/y.]
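A minimal perturbed-observation ensemble update along these lines can be sketched as follows (toy dimensions and random fields, assumed for illustration). The matrix 𝗛𝗣f𝗛T + 𝗥 is inverted once and applied to the whole ensemble of innovations; since each member update is affine, the analysis ensemble mean equals the forecast mean updated with the mean innovation:

```python
import numpy as np

rng = np.random.default_rng(2)
x, y, r = 30, 20, 12
X = rng.standard_normal((x, r))          # forecast ensemble (columns = members)
H = rng.standard_normal((y, x))
Rdiag = rng.uniform(0.5, 2.0, y)
yobs = rng.standard_normal(y)

xm = X.mean(axis=1)
S = (X - xm[:, None]) / np.sqrt(r - 1)   # ensemble square root: Pf = S S^T
Pf = S @ S.T

# Each member assimilates a perturbed observation y + eps_i, eps_i ~ N(0, R)
eps = np.sqrt(Rdiag)[:, None] * rng.standard_normal((y, r))
D = yobs[:, None] + eps - H @ X          # ensemble of innovation vectors
K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + np.diag(Rdiag))
Xa = X + K @ D                           # one explicit inverse for all members

# Affine member updates: the analysis mean is the forecast mean
# updated with the mean innovation
xa_mean = xm + K @ (yobs - H @ xm + eps.mean(axis=1))
print(np.allclose(Xa.mean(axis=1), xa_mean))
```

The single explicit inverse applied to all r innovation vectors is the reason the complexities (14) and (15) are not simply r times (3) and (7).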

c. Modal parameterization of the observation error covariance matrix

Any observation error covariance matrix can be written in the form:
𝗥 = 𝗗1/2(𝗜 + ΘΘT)𝗗1/2,   (17)
where 𝗗 is a positive definite diagonal matrix such that 𝗥 − 𝗗 remains positive definite. There indeed always exists a real square root Θ (y × q) of the positive definite symmetric matrix 𝗗−1/2𝗥𝗗−1/2 − 𝗜. Using (6), the inverse of 𝗥 can be written
𝗥−1 = 𝗗−1/2[𝗜 − Θ(𝗜 + ΘTΘ)−1ΘT]𝗗−1/2.   (18)
The application of the transformed algorithm, using formula (5) or (11), with the expression (18) of 𝗥−1 requires the computation of
(𝗛𝗦f)T𝗥−1𝗛𝗦f = (𝗛𝗦f)T𝗗−1𝗛𝗦f − [(𝗛𝗦f)T𝗗−1/2Θ𝗟][(𝗛𝗦f)T𝗗−1/2Θ𝗟]T   (19)
and
(𝗛𝗦f)T𝗥−1δy = (𝗛𝗦f)T𝗗−1δy − [(𝗛𝗦f)T𝗗−1/2Θ𝗟][𝗟TΘT𝗗−1/2δy],   (20)
where π—Ÿπ—ŸT is the Cholesky decomposition of (π—œ + ΘTΘ)βˆ’1. The leading behavior of the additional computational complexity comes from the second term of (19):
C1R = yq² + yrq + 2q³/3 + rq² + r²q.   (21)
The first term corresponds to the computation of the matrix (𝗜 + ΘTΘ), the second term to the multiplication 𝗔 = (𝗛𝗦f)T𝗗−1/2Θ, the third term includes the inversion of (𝗜 + ΘTΘ) and the Cholesky decomposition of the inverse, and the last two terms are matrix multiplications (𝗕 = 𝗔𝗟 and 𝗕𝗕T).

Formula (5) or (11) with parameterization (17) for 𝗥 can only be advantageous with respect to formula (2) or (9) (i.e., C1 + C1R < C0, C1P + C1R < C0P, or C1E + C1R < C0E) if the number of columns q of Θ is small with respect to the number of observations (q ≪ y); that is, if the observation error covariance matrix 𝗥 is the sum of a diagonal matrix and a low rank matrix. [This is why expression (17) is chosen: the diagonal term is necessary to make the matrix regular.] If this can be done, the computational complexity remains linear in the number of observations y and the numerical efficiency of formulas (5) and (11) is preserved.

A further difficulty is that Θ needs to be computed. Obviously, it cannot be computed by decomposition of a full size 𝗥 matrix (followed by rank reduction), because the computational complexity of such an operation is again proportional to y³. A possibility (for spatially distributed observations) is to define the correlated part of 𝗥 at the nodes of a regular grid (𝗥g), compute the square root 𝗥g = ΘgΘgT (once and for all) on that grid (with rank reduction if possible), and interpolate the modes Θg at the observation locations (for every spatial distribution of the observations) to obtain Θ = 𝗛gΘg. Such an approximation is valid if the error modes Θg contain only scales that are large compared with the regular grid resolution; that is, if the 𝗥 matrix can be represented by the superposition of a white noise (the diagonal part 𝗗) and a large-scale red noise (the correlated part 𝗗1/2ΘΘT𝗗1/2). In that case, two observations that are close together (much closer than the red noise correlation scales) are assumed fully independent with respect to the white noise, and fully dependent with respect to the red noise.

This parameterization is thus particularly efficient if the typical distance between observations is small with respect to the correlation scales, because then the number q of error modes can be made small with respect to the number of observations y (q ≪ y), and the additional cost C1R, given by (21), remains tractable (asymptotically for large y): the linear term in y is only increased to y(r² + q² + rq) instead of yr². In other situations, this parameterization cannot preserve the efficiency of formulas (5) and (11) and other solutions must be found (see next sections).
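A small sketch (with hypothetical sizes y = 200 observations and q = 5 error modes, chosen only for illustration) shows why parameterization (17) is cheap: applying 𝗥−1 through (18) requires only divisions by the diagonal and the solution of a q × q system, so the cost stays linear in y:

```python
import numpy as np

rng = np.random.default_rng(3)
y, q = 200, 5                            # many observations, few error modes
Ddiag = rng.uniform(0.5, 2.0, y)         # diagonal (white noise) part
Theta = rng.standard_normal((y, q))      # large-scale error modes (y x q)

# R = D^{1/2} (I + Theta Theta^T) D^{1/2}, as in (17)
Dh = np.sqrt(Ddiag)
R = np.diag(Ddiag) + (Dh[:, None] * Theta) @ (Theta.T * Dh[None, :])

# Apply R^{-1} through (18): only a q x q linear system is solved
v = rng.standard_normal(y)
w = v / Dh
w = w - Theta @ np.linalg.solve(np.eye(q) + Theta.T @ Theta, Theta.T @ w)
Rinv_v = w / Dh

print(np.allclose(Rinv_v, np.linalg.solve(R, v)))   # matches the full inverse
```

The full y × y solve shown in the last line is only there as a check; in practice only the cheap path is ever executed.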

An even more efficient parameterization can be built by directly using a reduced rank parameterization of the inverse observation error covariance matrix, 𝗥−1 = ΘΘT, with square root Θ (y × q), q ≪ y. With respect to parameterization (17), the linear term in y is reduced to y(r² + rq) instead of y(r² + q² + rq). Such a simplified parameterization is used in the oceanographic applications described in Brankart et al. (2003) and Testut et al. (2003). However, singular parameterizations of 𝗥−1 are dangerous because they imply that the null space of 𝗥−1 is assumed unobserved (infinite observation error), which may lead to neglecting important observational information. Again, this amounts to building superobservations (presumably the most useful ones) by projecting the original observations on the error modes (the columns of Θ), and dropping everything that is orthogonal to them. In this paper, we prefer to follow our original plan to keep all observations, and thus only propose regular parameterizations of 𝗥.

d. Simulating correlations by linear transformation of the observation vector

The observational update given by formula (2) or (5) also minimizes
J(δx) = δxᵀ(𝗣f)−1δx + (δy − 𝗛δx)ᵀ𝗥−1(δy − 𝗛δx).   (22)
In this equation [as well as in (25) below], it is the Moore–Penrose pseudoinverse of 𝗣f that must be used if the matrix is rank deficient. In this cost function, we can transform the observation vector by a regular (rank equal to y) linear transformation operator 𝗧: δy+ = 𝗧δy, 𝗛+ = 𝗧𝗛, in such a way that J remains unchanged, provided that the observation error covariance matrix is also transformed according to
𝗥+ = 𝗧𝗥𝗧ᵀ.   (23)
It can be easily verified that the same transformation also leaves formulas (2), (5), (9), and (11) unchanged. It follows that any observation error covariance matrix 𝗥 can be simulated by a diagonal matrix 𝗥+ in a transformed observation space. An immediate solution is to choose 𝗥+ as the matrix of eigenvalues of 𝗥 and 𝗧 as the matrix of the corresponding normalized eigenvectors (so that 𝗥 = 𝗧ᵀ𝗥+𝗧, with 𝗧 unitary and 𝗥+ diagonal). Obviously, this is not the solution that we are looking for, since the computational complexity of the eigenvalue problem is again proportional to y³.

Moreover, for a general linear operator 𝗧, the computational complexity of applying the operator (e.g., to compute δy+ = 𝗧δy) is equal to yy+, where y+ is the size of the transformed observation vector (y+ ≥ y for a regular transformation). Hence, this complexity can only be linear in y if the structure of 𝗧 is simple. It can even become negligible (asymptotically) if every transformed observation (in the vector y+) is related to a small number of original observations. (It is the same argument that leads to neglecting the cost of 𝗛; see section 2a.) On the other hand, because the cost of the observational update is linear in y in formulas (5) and (11), we have the freedom to imagine a transformation 𝗧 that increases the number of observations (y+ > y) without prohibitive consequences for the numerical cost. Essentially, as soon as 𝗧 is known and is simple enough, the same computational complexity as (7), (12), and (15) applies, with y replaced by y+. (Thus the relative cost is multiplied by y+/y.) An example of such a simple transformation, consisting of adding gradient observations to the original observation vector, is examined in section 3.

In addition, it is interesting to point out that, with uncorrelated observation errors in the transformed observation space, the observational update described by formulas (2) and (9) can be replaced by a repeated application of these formulas, using the observations in y+ one by one. The updated xa, 𝗣a obtained at each stage of the sequence are used as background state and background error covariance (xf and 𝗣f) for the next update. This is the serial processing algorithm that is also often used in ensemble filtering to reduce the numerical cost (at the expense of the assumption that observation errors are independent). By constructing an augmented observation vector with a diagonal error covariance matrix, the transformation method proposed in this paper thus also allows the application of this serial algorithm in the presence of observation error correlations. On the other hand, in many applications, there can be several observation datasets with independent errors (e.g., if they originate from different instruments), so that the matrix 𝗥 is block diagonal. Such observation error covariance matrices can also be easily simulated by this method, applying separate transformations to the corresponding segments of the observation vector, for instance by augmenting the observation vector with discrete gradients computed inside each observation dataset.
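A minimal sketch of this serial processing, for a scalar state observed directly (all numbers illustrative): processing the observations one by one, with each updated state reused as background for the next observation, reproduces the batch least squares answer when the observation errors are independent:

```python
def serial_update(x, p, obs):
    """Scalar-state Kalman update, processing observations one at a time.
    Each element of obs is (y, r): a direct observation of x with error variance r."""
    for y, r in obs:
        s = p + r              # innovation variance (H = 1)
        k = p / s              # Kalman gain
        x = x + k * (y - x)    # updated state, reused as background for the next obs
        p = (1 - k) * p        # updated error variance
    return x, p

xf, pf = 0.0, 1.0
obs = [(1.0, 0.5), (2.0, 0.25)]
xa, pa = serial_update(xf, pf, obs)

# batch answer from the information (precision) form, for comparison
prec = 1 / pf + sum(1 / r for _, r in obs)
x_batch = (xf / pf + sum(y / r for y, r in obs)) / prec
```

The agreement between `xa` and `x_batch` is exact (up to rounding) precisely because the errors of the two observations are independent; with correlated errors the serial algorithm is only legitimate after the augmenting transformation described above.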

Finally, in order to prepare some of the developments of section 3, it is useful to present how the transformation problem must be reformulated if the system is continuous instead of discrete. The continuous model state x(ξ′) is assumed observed by a continuous observation y(ξ) through a general linear observation operator H:
y(ξ) = ∫ H(ξ, ξ′) x(ξ′) dξ′ + ε(ξ),   (24)
where ε(ξ) is the observational noise. Then (22) becomes:
J = ∫∫ δx(ξ′) Pf(−1)(ξ′, η′) δx(η′) dξ′ dη′ + ∫∫ δw(ξ) R(−1)(ξ, η) δw(η) dξ dη,   (25)
where δw(ξ) is the observation residual:
δw(ξ) = δy(ξ) − ∫ H(ξ, ξ′) δx(ξ′) dξ′,   (26)
and R(−1)(ξ, η) is the inverse observation error covariance:
∫ R(ξ, η) R(−1)(η, ζ) dη = δ(ξ − ζ),   (27)
and Pf(−1)(ξ′, η′) is the inverse (or pseudoinverse) forecast error covariance. If the observation y(ξ) is transformed by a general linear transformation T:
δy+(ξ+) = ∫ T(ξ+, ξ) δy(ξ) dξ,   (28)
J remains unchanged if
R(−1)(ξ, η) = ∫∫ T(ξ+, ξ) R+(−1)(ξ+, η+) T(η+, η) dξ+ dη+.   (29)
If the observation error is assumed spatially uncorrelated, R+(ξ+, η+) = R+(ξ+)δ(ξ+ − η+), this last formula simplifies to
R(−1)(ξ, η) = ∫ [T(ξ+, ξ) T(ξ+, η)/R+(ξ+)] dξ+.   (30)
Given the transformation operator T(ξ+, ξ) and the observation error variance in the transformed space R+(ξ+), the corresponding observation error covariance in the original space R(ξ, η) can be computed using (30) together with (27).

3. Simulating correlations by adding gradient observations

a. One-dimensional problem

If we assume that the observations are distributed spatially along a one-dimensional line, we can think of simulating error correlations along the line by adding gradient observations to the observation vector. Starting with the continuous problem, if y(ξ) is the original observation (where ξ is a curvilinear abscissa along the line), the transformed observation vector is then composed of the original function together with its first derivative:
y+ = (T1y, T2y) = (y(ξ), ∂y(ξ)/∂ξ),   (31)
where T1(ξ+, ξ) = δ(ξ+ − ξ) is the identity operator and T2(ξ+, ξ) is the derivative operator. Assuming that R+(ξ+) is spatially homogeneous:
R+(ξ+) = ℓσ0² on the original-observation segment and R+(ξ+) = ℓσ1² on the gradient segment,   (32)
where σ0 is the observation error standard deviation and σ1 is the gradient error standard deviation, (30) can be rewritten as
R(−1)(ξ, η) = [δ(ξ − η) − ℓ²δ″(ξ − η)]/(ℓσ0²), where ℓ = σ0/σ1.   (33)
Using (33), the definition of T1 and T2, and taking advantage of the homogeneity of the solution R(ξ, η) = R(ρ) with ρ = ξ − η, (27) transforms to
R(ρ) − ℓ² d²R/dρ² = ℓσ0² δ(ρ),   (34)
whose solution is
R(ρ) = (σ0²/2) exp(−|ρ|/ℓ).   (35)
Adding gradient observations to the observation vector is thus equivalent to assuming that the observation error correlation decreases exponentially with the distance |ρ|, with a decorrelation length ℓ equal to the ratio between the observation error and gradient error standard deviations (while the observation error variance is divided by 2).
The discrete problem is similar to the continuous problem, except that no explicit solution can be found analytically. Assume that observations yi are available along the line, at abscissas ξi, i = 1, …, y, and that we add to this observation vector observations of the discrete gradient (left difference):
yi+ = (yi − yi−1)/(ξi − ξi−1), i = 2, …, y.   (36)
The size of the new observation vector is thus y+ = 2y − 1, and the transformation is
y+ = 𝗧y, with 𝗧 = (𝗧1; 𝗧2),   (37)
where 𝗧1 is the identity matrix, and T2,ij = (δij − δi−1,j)/(ξi − ξi−1). From this, it is easy to compute the observation error covariance matrix 𝗥 on y corresponding to a diagonal observation error covariance matrix 𝗥+ on y+ using (23). If 𝗥+ is homogeneous,
𝗥+ = diag(σ0², …, σ0², σ1², …, σ1²),   (38)
and if the observations are regularly distributed (ξi − ξi−1 = Δξ ∀i), it follows that
(𝗥−1)ij = (1/σ0²)[δij + (ℓ²/Δξ²)(2δij − δi+1,j − δi−1,j)]   (39)
(with the diagonal factor 2 replaced by 1 in the first and last rows).
Equation (39) is a consistent discretization of (33) [on a limited domain of size (y − 1)Δξ, with homogeneous Neumann boundary conditions], except for a factor Δξ/ℓ in the discretization of the Dirac function, so that (35) provides the asymptotic solution of (39) (multiplied by ℓ/Δξ) as Δξ → 0 and yΔξ → ∞.

Figure 1 shows the solution of (39) computed numerically (by inversion of a tridiagonal matrix) for σ0 = 1 and different values of ℓ/Δξ, as compared with the continuous solution (35). The solution is drawn for ℓ = 1 and decreasing Δξ (left panel), showing the convergence toward the exponential decorrelation as Δξ → 0; and for Δξ = 1 and decreasing ℓ, showing how small correlation length scales (smaller than the observation resolution Δξ) are parameterized with this approach.

Anticipating possible mistakes in applications, it is useful to examine the problems that occur if we replace the original observations by gradient observations (instead of adding gradient observations, as suggested in this paper), and assume a diagonal error covariance matrix in this transformed space. To make the transformation regular, we keep the first observation of y as the first element of y+: y1+ = y1 (with error variance σ0²), and then use the observation differences as the next elements: yi+ = yi − yi−1, i = 2, …, y (with error variance σ1²Δξ², assuming a regular distribution). The transformation matrix is thus square (y+ = y) and regular, and can be inverted:
(𝗧−1)ij = 1 if j ≤ i, and 0 otherwise.   (40)
(Each original observation yi is the sum of the first i elements of y+: the first observation plus the i − 1 first differences until yi is reached.) Since 𝗧 is square and regular, (23) can be inverted explicitly:
𝗥 = 𝗧−1𝗥+(𝗧−1)ᵀ,   (41)
so that the elements of π—₯ are
Rij = σ0² + [min(i, j) − 1]σ1²Δξ².   (42)
It follows that the resulting error variance Rii = σ0² + (i − 1)σ1²Δξ² increases linearly with the distance to the reference observation. (Of course, the increase can be reduced by placing the reference observation in the middle of the line, or by using the mean of the observations, but the effect remains essentially the same.) Hence, assuming independent errors on the observation differences means that their error variances (σ1²Δξ²) add up to form the error variances on the original observations yi. Even if the errors on the gradient are assumed small, such a transformation is inappropriate because it leads to large errors on the original variable.
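A quick Monte Carlo check of this variance growth (all parameters illustrative): drawing independent errors on the transformed vector (first observation plus differences) and reconstructing the original observation errors by cumulative sums, as in (40), makes the variance of the last observation grow to about σ0² + (y − 1)σ1²Δξ²:

```python
import itertools
import random

random.seed(0)
sigma0, sigma1, dxi, n, trials = 1.0, 0.5, 1.0, 20, 20_000
last = []
for _ in range(trials):
    # independent errors on the transformed vector: first observation + n-1 differences
    e_plus = [random.gauss(0.0, sigma0)] + \
             [random.gauss(0.0, sigma1 * dxi) for _ in range(n - 1)]
    # invert the transformation (40): each original error is a cumulative sum
    e = list(itertools.accumulate(e_plus))
    last.append(e[-1])
mean = sum(last) / trials
var_last = sum((v - mean) ** 2 for v in last) / trials
# expected: sigma0**2 + (n - 1)*(sigma1*dxi)**2 = 1 + 19*0.25 = 5.75
```

The simulated variance of the last observation is far larger than σ0², illustrating why the differences must be added to, not substituted for, the original observations.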

b. Two-dimensional problem

In the same way, if we assume that the observations are distributed spatially over an n-dimensional manifold, error correlations along the manifold can be simulated by adding gradient observations to the observation vector. The continuous problem is formally identical in several dimensions, except that ξ is an n-dimensional vector of curvilinear coordinates and T2 is the n-dimensional gradient, so that (34) becomes
R(ρ) − ℓ²ΔR(ρ) = ℓⁿσ0² δ(ξ − η),   (43)
where Δ is the n-dimensional Laplacian operator and ρ = ||ξ − η|| is the Euclidean distance. Equation (43) stands for the homogeneous and isotropic problem, but it is straightforward to introduce inhomogeneity or anisotropy by a nonlinear change of the ξ coordinates. In two dimensions, the solution of (43) is
R(ρ) = (σ0²/2π) K0(ρ/ℓ),   (44)
where K0 is the modified Bessel function of the second kind of order 0. It can indeed be easily verified [using the properties of K0 in Abramowitz and Stegun (1970)] that (44) is the solution of the homogeneous equation in (43) [i.e., δ(ρ) replaced by 0] everywhere except at the origin, and that the coefficient σ0²/2π is scaled so that the logarithmic singularity of (44) at ρ = 0 has the right amplitude for (44) to be the solution of (43) [viewed as a Green equation; see Morse and Feshbach (1953), chapter 7].
The two-dimensional discrete problem is also similar to the one-dimensional version. We assume that the observations yij are available on the two-dimensional surface at the nodes of a grid, with coordinates ξij, ηij, i = 1, …, y1, j = 1, …, y2 (where y1 and y2 are the numbers of rows and columns of the grid), and that we add to the observation vector observations of the two components of the discrete gradient (left difference):
(yij − yi−1,j)/(ξij − ξi−1,j), i = 2, …, y1;  (yij − yi,j−1)/(ηij − ηi,j−1), j = 2, …, y2.   (45)
The size of the new observation vector is thus y+ = 3y1y2 − (y1 + y2); that is, almost a factor of 3 with respect to the number of original observations (y = y1y2). The transformation is
y+ = 𝗧y, with 𝗧 = (𝗧1; 𝗧2,1; 𝗧2,2),   (46)
where 𝗧1 is the identity matrix and 𝗧2,1, 𝗧2,2 are the discrete gradient operators. Note that each line of the 𝗧 operator combines only one or two observations, so that the cost of applying 𝗧 always remains negligible (the computational complexity is equal to 2y). The application of (23) with 𝗥+ homogeneous leads to an expression of 𝗥−1 similar to (39). However, the matrix is no longer tridiagonal because each element is combined with its four neighbors in the grid, which cannot always be consecutive in the y vector. It can be easily seen that this provides a consistent discretization of (43) in two dimensions (except for a factor Δξ²/ℓ² in the discretization of the two-dimensional Dirac function), so that (44) is the asymptotic solution of the discrete problem (with a scale factor ℓ²/Δξ²) as the grid steps tend to zero. Figure 2 presents the same information as Fig. 1 for the two-dimensional problem (with regular and isotropic grid spacings), illustrating the shape of the simulated covariance for σ0 = 1 and various values of ℓ/Δξ, and showing the convergence toward the analytical solution (44) as Δξ → 0.
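The structure of 𝗧 and 𝗥−1 in two dimensions can be checked on a toy grid (pure Python; grid size and variances are illustrative): the transformed vector has 3y1y2 − (y1 + y2) rows, and 𝗧ᵀ(𝗥+)−1𝗧 couples each point only with itself and its four grid neighbors:

```python
n1 = n2 = 4
y = n1 * n2
sigma0, sigma1, dxi = 1.0, 0.5, 1.0
idx = lambda i, j: i * n2 + j       # row-major storage of the grid in the y vector

rows, rvar = [], []
for i in range(n1):                 # T1: the original observations
    for j in range(n2):
        r = [0.0] * y
        r[idx(i, j)] = 1.0
        rows.append(r)
        rvar.append(sigma0 ** 2)
for i in range(1, n1):              # T2,1: first component of the discrete gradient
    for j in range(n2):
        r = [0.0] * y
        r[idx(i, j)], r[idx(i - 1, j)] = 1 / dxi, -1 / dxi
        rows.append(r)
        rvar.append(sigma1 ** 2)
for i in range(n1):                 # T2,2: second component of the discrete gradient
    for j in range(1, n2):
        r = [0.0] * y
        r[idx(i, j)], r[idx(i, j - 1)] = 1 / dxi, -1 / dxi
        rows.append(r)
        rvar.append(sigma1 ** 2)

# R^{-1} = T^T (R+)^{-1} T: five-point stencil on the grid
Rinv = [[sum(row[k] * row[l] / v for row, v in zip(rows, rvar)) for l in range(y)]
        for k in range(y)]
```

Because the stencil mixes rows of the grid, the nonzero entries of 𝗥−1 are not confined to a tridiagonal band of the y vector, which is the point made above.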

c. Higher order derivatives

The solution of (43), which is valid for homogeneous and isotropic problems, can also be found in the spectral domain; the solution is then the error power spectrum S(κ), which is the Fourier transform of the covariance function. In isotropic problems, it only depends on the modulus of the wave vector (κ = ||k||), and the solution is
S(κ) = ℓⁿσ0²/(1 + ℓ²κ²).   (47)
Equation (47) gives the spectral distribution of the observational error: no error at small scales (κ ≫ 1/ℓ) and a constant spectral distribution at large scales (κ ≪ 1/ℓ). The corresponding isotropic covariance function R(ρ) given by (35) and (44) (valid for ℓ ≠ 0) can then be found as the inverse Fourier transform (for the one-dimensional function) or the inverse Hankel transform (for the two-dimensional isotropic function) of (47) [see the general formulas (1.0) and (2.0) in Table 1].
More complex observation error power spectra can be simulated by adding p successive derivatives of the observations to the observation vector (with error standard deviations σi, i = 1, …, p). Equation (47) then generalizes to
S(κ) = ℓⁿ/(1/σ0² + κ²/σ1² + … + κ^(2p)/σp²).   (48)
Adding constraints on the successive observation derivatives (e.g., gradient, curvature) is thus equivalent to assuming a specific shape of the observation error power spectrum, or of the observation error covariance function. This equivalence is similar in nature to the one-to-one correspondence (Kimeldorf and Wahba 1970; McIntosh 1990; Brankart and Brasseur 1996) between spline analysis (minimizing curvature and gradient) and statistical analysis (with specific shapes of the background error covariance). There, the correspondence is on the background constraint in the cost function (25) (first term) instead of the observation constraint (second term of the cost function).

Table 1 provides explicit expressions of the covariance function R(ρ) that can be obtained from (48) for some specific values of the parameters σi. However, with expression (48), virtually any shape of the observation error spectrum (provided that it is indefinitely continuously differentiable) can be simulated; even negative σi² are possible (for 0 < i < p), provided that the resulting spectrum (48) remains positive for all κ. For instance, simulating a Gaussian observation error spectrum (corresponding to a Gaussian covariance function, whatever the number of dimensions) requires an infinite derivative sequence σi = ℓi, i = 1, …, ∞ [functions (1.7) and (2.5) in Table 1], but can be approximated by a truncated sequence. Going to a second-order derivative is nevertheless always necessary to simulate a correlation function with zero derivative at ρ = 0 [as functions (1.2)–(1.6) and (2.2)–(2.4) in Table 1].

It is interesting to note that, in one dimension (n = 1) and for one derivative included in the observation vector (p = 1), the error power spectrum given by (47) is characteristic of a random function ε(ξ) that is governed by the differential equation:
ℓ dε/dξ + ε(ξ) = σ0w(ξ),   (49)
where σ0w(ξ) is a white noise with standard deviation σ0. Equation (49) is the Langevin equation, which is used in statistical physics to describe the time evolution of particle velocities in Brownian motion (Reif 1965) or the behavior of random fluctuations in thermodynamical systems (Landau and Lifshitz 1951, chapter 12). The corresponding correlation model given by (35) thus also describes the time correlation of these important physical processes. More generally, any observational noise that is related to a white noise by such a linear differential equation (written here in one dimension):
a0ε + a1 dε/dξ + … + ap dᵖε/dξᵖ = w(ξ)   (50)
is characterized by a power spectrum given by (48), with a number of derivatives p equal to the degree of the differential equation in (50). The parameters σi of the power spectrum can be easily deduced from the coefficients ai by transforming (50) to the spectral domain. For instance, for p = 2, an observational noise governed by
ℓ² d²ε/dξ² + λℓ dε/dξ + ε(ξ) = σ0w(ξ)   (51)
is characterized by a correlation function that can be simulated by adding the first and second derivatives to the observation vector, with associated error variances σ1² = σ0²ℓ−2/(λ² − 2) and σ2² = σ0²ℓ−4 [correlation functions (1.2), (1.3), and (1.4) in Table 1]. Such relationships can help to determine the appropriate parameterization of the error power spectrum as soon as it is possible to find approximate linear differential equations governing the observational noise (e.g., if it is due to unresolved physical processes).
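A rough numerical illustration of the link between (49) and the exponential correlation (35) (a left-difference discretization of the Langevin equation, i.e., an AR(1) recursion; all parameters are illustrative assumptions):

```python
import random

random.seed(1)
ell, dxi, sigma0, n = 10.0, 1.0, 1.0, 200_000
phi = 1.0 - dxi / ell            # AR(1) coefficient of the discretized Eq. (49)
amp = sigma0 * dxi / ell         # amplitude of the discrete white-noise term
eps = [0.0] * n
for k in range(1, n):
    # ell*(eps[k] - eps[k-1])/dxi + eps[k-1] = sigma0*w[k]  ->  AR(1) recursion
    eps[k] = phi * eps[k - 1] + amp * random.gauss(0.0, 1.0)

# sample autocorrelation at lag ell/dxi: ~ phi**lag, i.e. ~ exp(-lag*dxi/ell)
lag = int(ell / dxi)
mean = sum(eps) / n
var = sum((e - mean) ** 2 for e in eps) / n
cov = sum((eps[k] - mean) * (eps[k + lag] - mean) for k in range(n - lag)) / (n - lag)
corr = cov / var
```

The sample autocorrelation decays geometrically with the lag, converging to the exponential model exp(−|ρ|/ℓ) as Δξ/ℓ → 0.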

The generality of the method is most directly obvious for discrete problems, since any transformation 𝗧 can be obtained by adding finite differences of successive orders. However, increasing the number p of derivatives added to the observation vector also increases the numerical cost, so that the most effective parameterization always results from a compromise between a fine representation of the target observation error spectrum and the numerical efficiency of the observational update. In this respect, two critical elements are always the identification of an accurate prior model for the observation error correlations and the validation of this model using the observed information.

4. Application to altimetry in the North Brazil Current

Ocean altimetric observations are distributed along lines (the satellite ground track) or, in the future, also along two-dimensional ribbons (wide-swath altimeters). And it is known that altimetric observation errors (due to the altimetric measurement itself, orbit errors, or atmospheric correction errors) are spatially correlated along the ground track (or across the swath). The purpose of this section is to demonstrate the sensitivity of the observational update to these observation error correlations, and to check whether the parameterization proposed in this paper is appropriate to take these errors into account. This is done on the particular example of the North Brazil Current circulation.

a. Description of the experiment

The North Brazil Current is a surface western boundary current flowing westward along the north Brazilian coast. It is fed from the southeast by the tropical surface current, and brings the water northwestward into the Caribbean Sea. The current sheds large anticyclonic rings (with diameters of about 200 km) that are also transported toward the Caribbean Sea, covering the 2000 km in about 3 months [see Fratantoni et al. (1995) for more details]. The total transport of the mean current is about 21 Sv (1 Sv ≡ 10⁶ m³ s−1; da Silveira et al. 1994), with typical surface velocities of 1 m s−1 for the main current and for the rings, corresponding to dynamic height differences of about 0.2 m.

A reference simulation of the circulation is computed using a primitive equation model covering the tropical Atlantic between 15°S and 20°N. It is a subregion of the Drakkar global ocean configuration at 1/4° resolution of the Nucleus for European Modelling of the Ocean model (NEMO; Barnier et al. 2006; Penduff et al. 2007), using boundary conditions extracted from a global simulation. The model atmospheric forcing is computed from European Centre for Medium-Range Weather Forecasts 40-yr Re-Analysis (ERA-40) atmospheric data using bulk aerodynamic formulas. A 5-yr reference simulation of the tropical Atlantic model (computed by repeating the 2002 atmospheric data 5 times) is illustrated in Figs. 3 and 4. In this study, we focus on the results obtained in the region of the North Brazil Current (between 4.5° and 12.5°N, and between 6° and 46.5°W) that is shown in the figures. Figure 3 presents two snapshots of the sea surface height, together with its gradient and the surface velocity, for 2 and 14 December of the first year, showing the rings moving westward and illustrating the close relation between altimetry and surface velocity. Figure 4 shows the mean circulation (sea surface height, gradient, and surface velocity) averaged over the 5 yr of the simulation, together with the corresponding standard deviation. The order of magnitude of the sea surface height variability is similar to the bulk error standard deviation of satellite altimetric measurements, which is presently about 0.04 m. This variability is thus only marginally observed by such satellites, so that a fine tuning of the statistical parameters is particularly needed.

To test the observational update with different kinds of observation error parameterization, we need to define (i) the background (or forecast) state xf and (ii) the true state xt, from which the observations y are sampled and to which the estimation must be compared. As background state, we use the mean circulation (shown in Fig. 4, top panels); as true state, we use one of the model snapshots (illustrated in Fig. 3). And as observations, we assume that altimetry is observed over the full domain, with a 4-cm error standard deviation, at every node of the model grid. To test the sensitivity of the solution to the kind of observation error, two observation vectors are generated from the true state xt: a first one by adding uncorrelated observation noise, and a second one by adding correlated observation noise, with a covariance matrix given by (23), with transformation (46) (for various values of ℓ = σ0/σ1). The noise is scaled to have a uniform standard deviation σ = 0.04 m. To randomly draw Gaussian noise vectors with known covariance 𝗥, we use the method described in the appendix of Fukumori (2002). Figure 5 (top panels) shows an example of such noise vectors, generated for three correlation lengths: ℓ = 0, 5, and 15 grid points. (In this section, ℓ = 0 stands for uncorrelated noise.) The figure also shows the corresponding error on the right difference between adjacent grid points, illustrating how the observational error on the discrete gradient decreases with ℓ.
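As a generic stand-in for this sampling step (an assumption for illustration, not Fukumori's exact procedure, which is decomposition based), Gaussian vectors with a prescribed covariance 𝗥 can be drawn by applying a Cholesky factor of 𝗥 to independent unit normals:

```python
import math
import random

def cholesky(C):
    """Lower-triangular square root: C = L L^T (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(C[i][i] - s) if i == j else (C[i][j] - s) / L[j][j]
    return L

random.seed(2)
sigma = 0.04
# target correlated observation-error covariance (3 points, exponential correlation)
R = [[sigma**2 * math.exp(-abs(i - j) / 5.0) for j in range(3)] for i in range(3)]
L = cholesky(R)

# eps = L z with z ~ N(0, I), so that cov(eps) = L L^T = R
samples = []
for _ in range(50_000):
    z = [random.gauss(0.0, 1.0) for _ in range(3)]
    samples.append([sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(3)])
cov01 = sum(s[0] * s[1] for s in samples) / len(samples)
var0 = sum(s[0] ** 2 for s in samples) / len(samples)
```

The Monte Carlo covariance of the drawn vectors converges to the prescribed 𝗥, which is all the experiment requires of the noise generator.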

It is interesting to make the link between this simulated observational noise and the characteristics of observation error in real altimetry data. As explained at the beginning of section 2, observation error is always the sum of a measurement error and a representation error. On the one hand, the altimetric measurement is affected by several kinds of error (altimetric measure, orbit error, atmospheric correction error), with a bulk standard deviation of about 3–5 cm and horizontal correlation patterns that can depend on the satellite orbit and on the state of the atmosphere. On the other hand, altimetric data are actually spatial averages over about 5–10 km along track, which is about a factor of 3 smaller than the resolution of our model. The resulting representation error, which corresponds to this limited range of spatial scales in the continuous ocean system, is thus likely to remain small with respect to measurement errors here. Consequently, the properties of our randomly simulated observational noise (4-cm standard deviation, with various correlation length scales) are chosen quite adequately to be in the range of what can be expected for real altimetric data in this region and for this kind of ocean model.

In addition, in order to increase the robustness of the test, each experiment is repeated using, as true state, every snapshot of the sequence (one every 6 days for 5 yr), and the results are averaged over that ensemble of experiments. There is thus an ensemble of true states xit, i = 1, …, N (with N = 300), and the corresponding ensemble of observations yi sampled from them. (Observational errors are drawn independently for every member of the sample.) Hence, as soon as the ensemble of true states can be viewed as representative of all possible states of the system, our indicator gives the average error that is committed using the observation error parameterization that is being tested (starting from the mean as background state).

To parameterize the background (or forecast) error covariance matrix 𝗣if, we use the covariance of all snapshots of the model simulation (sampled every month over the 5 yr of the simulation), except those that are less than 1 month away from the true state (longer than the typical decorrelation time scale), in order to avoid any influence of the true state on the input error covariance matrix (𝗣if is thus recomputed for every member i = 1, …, N). With an ensemble of about 60 independent realizations (one per month), it is only for correlations >0.26 that the 95% confidence interval for correlations (assuming normal pdfs) does not include zero. Correlations of <0.26 are thus not significant. Thus, if the size of the region is much larger than the spatial decorrelation scale, the integrated influence of distant observations with nonsignificant correlations can be as large as that of close observations with significant correlations. To avoid the spurious effect of these inaccurate long-range correlations (resulting from the use of a small-size ensemble), we perform a separate local observational update [as in Brankart et al. (2003) or Testut et al. (2003)] for each water column, with an additional weight on the observations decreasing with the distance r as exp(−r²/d²), with d = 200 km (the typical distance at which the correlation ceases to be significant). Figure 6 illustrates the resulting local structure of the background covariance that is used to perform the observational update. The figure shows the observational update increment that would result from one single observation (with a 0.04-m error standard deviation) located at 9°N, 54°W (in the middle of the area traversed by the mesoscale rings). The long-range (nonsignificant) influence is effectively set to zero, without much affecting the local covariance structure described by the ensemble.

b. Uncorrelated errors

In experiment 1 (see Table 2), we use the observation vector that is perturbed by a white noise (observation errors are thus spatially uncorrelated), and the observation error covariance is parameterized using a diagonal matrix (𝗥 = σ²𝗜). The parameterization is thus fully consistent with the simulated errors. Figure 7 (top panels) shows a map of the ensemble standard deviation of errors (difference with respect to the true state) after the observational update corresponding to experiment 1. It is shown for altimetry (ϵζ), for the altimetric gradient, and for velocity (ϵυ). It is computed as (the formula for the gradient is similar to the formula used for velocity)
ϵζ = [(1/N) Σi (ζit − ζf − δζi)²]^1/2,  ϵυ = [(1/N) Σi (uit − uf − δui)² + (υit − υf − δυi)²]^1/2,   (52)
where ζit, uit, υit is the ith true state (corresponding to the ith snapshot of the model sequence); ζf, uf, υf is the forecast state (corresponding to the mean of the model sequence); and δζi, δui, δυi is the ith observational update. This result can be directly compared with Fig. 4 (bottom panels), which represents the same quantity before the observational update. Indeed, since the background state is the mean state and since the ensemble of true states is the ensemble of all snapshots of the sequence, the standard deviation of the sequence is equal to the ensemble standard deviation of errors before the observational update; that is, (52) with δζi = 0, δui = 0, and δυi = 0. The comparison shows that the error on altimetry is significantly reduced by the observational update, becoming much smaller than both the background error standard deviation (Fig. 4, bottom panels) and the observational error standard deviation. This is because background errors are correlated over an area of about L × L, with L ∼ 125 km (see Fig. 6), including about L²/Δξ² ∼ 25 observations with uncorrelated errors. The resulting errors are thus about 1/ϵζ² ∼ 1/σf² + 25/σ². Observations are dense and very accurate, so the background has only little influence and the resulting error is ϵζ ∼ σ/5 = 0.008 m, a rough estimate that is quite consistent with the results observed in Fig. 7. Filtering off a white noise is easy if the background error correlation scales (about L) are large with respect to the typical data spacing (Δξ).
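These back-of-envelope estimates can be written out explicitly (taking the background error standard deviation as 0.04 m, an assumption based on the variability quoted above; the correlated case with ∼1 independent observation corresponds to experiment 2 of section 4c):

```python
import math

sigma = 0.04        # observation error standard deviation (m)
sigma_f = 0.04      # background error standard deviation (m), assumed ~ SSH variability
n_indep = 25        # ~ (L/dxi)^2 independent observations in the L x L patch

# uncorrelated-noise case: 25 independent observations per correlation patch
eps = math.sqrt(1.0 / (1.0 / sigma_f**2 + n_indep / sigma**2))
# correlated-noise case (l = 5 grid points): only ~ (L/l)^2 ~ 1 independent observation
eps_corr = math.sqrt(1.0 / (1.0 / sigma_f**2 + 1.0 / sigma**2))
print(round(eps, 4), round(eps_corr, 4))   # ~0.008 m and ~0.03 m
```

The two numbers recover the ϵζ ∼ 0.008 m and ϵζ ∼ 0.03 m orders of magnitude quoted for experiments 1 and 2.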

However, the error reduction factor (with respect to the background error) is less favorable for the gradient of altimetry. This is because computing the altimetric difference Δζ between adjacent model cells amplifies the relative errors. Relative errors on velocities are again slightly worse because the relation between surface velocity and altimetry is not perfectly geostrophic (and thus not perfectly linear).

On the other hand, the observational update of the error covariance is illustrated in Fig. 7 (bottom panels), showing the estimated error standard deviation (the square root of the diagonal of 𝗣a). This estimation is quite consistent in amplitude and structure with the ensemble standard deviation of the error (measured by difference with respect to the true state), also shown in Fig. 7 (top panels). The good quality of the error estimate (for all variables) is the consequence of the consistent parameterization of the observation error covariance matrix; it also indicates that the background error covariance matrix (𝗣f) is quite accurately parameterized.

c. Correlated errors, with diagonal π—₯ parameterization

In experiment 2, we use the observation vector that is perturbed by the correlated noise (with ℓ = 5 grid points), but keep the same diagonal parameterization of the observation error covariance matrix (𝗥 = σ²𝗜). The parameterization is thus inconsistent with the simulated errors, which are assumed uncorrelated even though they are not. Figure 8 shows the corresponding error maps, to be compared with Fig. 7. The comparison shows that the error on altimetry is significantly larger than in experiment 1. This is because the number of independent observations in the L × L area is reduced to about (L/ℓ)² ∼ 1. Here, the background error keeps an influence, and the typical error is about ϵζ ∼ 0.03 m, a rough estimate that is again quite consistent with the results observed in Fig. 8.

However, larger errors on altimetry do not necessarily mean larger errors on the altimetry gradient or on velocity. In experiment 2, the error increase on the gradient with respect to experiment 1 is actually smaller than the error increase on altimetry. This is because the gradient is better observed through correlated observations than through uncorrelated observations (see Fig. 5). Fitting the solution to correlated observations (even with an inappropriate diagonal error parameterization, as in experiment 2) thus partly compensates for the ease of filtering off a white noise from a large-scale (L = 125 km) signal (with an optimal parameterization, as in experiment 1). This better observation of the gradient is the reason why keeping all available observations in the observation vector (even if their errors are strongly correlated) is always a better solution than subsampling the observations.

On the other hand, the observational update of the error covariance is illustrated in Fig. 8 (bottom panels), showing the estimated error standard deviation (the square root of the diagonal of 𝗣a). This estimate is identical to that of experiment 1 (Fig. 7, bottom panels), since all statistical parameters (𝗣f, 𝗥) are kept identical. It largely underestimates the standard deviation of the true error (measured using the ensemble of differences with respect to the true state), which is shown in the top panels of Fig. 8. The estimate is about a factor of 3 below reality. This situation is the consequence of the inconsistent parameterization of the observation error covariance matrix. The diagonal 𝗥 parameterization lets the scheme believe that the data are more accurate than they are, so that it underestimates the error that is effectively in the system.

d. Correlated errors, with consistent π—₯ parameterization

In experiment 3, we use the same observation vector as in experiment 2 (perturbed by the correlated noise), but add gradient observations to simulate correlations in the observation error covariance matrix, with the diagonal covariance matrix (38). By choosing Οƒ0 = 0.275 m and Οƒ1 = Οƒ0/β„“ (for β„“ = 5 grid points; see Table 3), this observation error parameterization is perfectly consistent with the simulated errors. According to the theory presented in section 3, these observations with correlated errors are thus equivalent to much less accurate observations (Οƒ0 = 0.275 m), together with accurate observations of the gradient (Οƒ1 = 0.055 m per grid point). This is consistent with the idea suggested above that increasing β„“ reduces the number of independent observations. Figure 9 shows the corresponding error maps, to be compared with Figs. 7 and 8. The comparison shows that the errors on altimetry are still larger than in experiment 1 (Fig. 7), because the quantity of information in the observation vector is the same as in experiment 2, but they are smaller than in experiment 2 (Fig. 8), because the observation error parameterization is now consistent, so that the observational update is closer to optimality. (𝗣f is still an approximation: it cannot be assumed that the background error is drawn randomly from a pdf of covariance 𝗣f.)
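The equivalence invoked here can be verified numerically: parameterizing π—₯ so that its inverse is the tridiagonal matrix built from Οƒ0 and Οƒ1 gives exactly the same analysis as augmenting the observation vector with discrete gradients and using a diagonal error covariance in the augmented space. The background covariance and data below are hypothetical placeholders (H is taken as the identity):

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma0, ell = 30, 0.275, 5.0
sigma1 = sigma0 / ell  # gradient error std (per grid point)

# Discrete gradient operator (first differences, grid spacing = 1)
D = np.diff(np.eye(n), axis=0)

# Hypothetical background covariance and data
i = np.arange(n)
Pf = 0.05**2 * np.exp(-0.5 * (i[:, None] - i[None, :]) ** 2 / 8.0**2)
xb = np.zeros(n)
y = 0.05 * rng.standard_normal(n)

# (a) Nondiagonal R whose inverse is tridiagonal: R^-1 = I/s0^2 + D^T D/s1^2
R = np.linalg.inv(np.eye(n) / sigma0**2 + D.T @ D / sigma1**2)
xa_corr = xb + Pf @ np.linalg.solve(Pf + R, y - xb)

# (b) Same update via the augmented vector [y; Dy] with *diagonal* errors
H = np.vstack([np.eye(n), D])
R_diag = np.r_[np.full(n, sigma0**2), np.full(n - 1, sigma1**2)]
S = H @ Pf @ H.T + np.diag(R_diag)
xa_aug = xb + Pf @ H.T @ np.linalg.solve(S, np.r_[y, D @ y] - H @ xb)

print(np.allclose(xa_corr, xa_aug))  # -> True: the two updates are identical
```

The identity follows because H_augα΅€ R_diag⁻¹ H_aug = 𝗜/Οƒ0Β² + Dα΅€D/Οƒ1Β² is exactly the inverse of the nondiagonal π—₯, so the information-form analyses coincide term by term.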

However, the error reduction (with respect to experiment 2) on the altimetry gradient and on velocity is significantly larger, because the greater confidence that must be given to the gradient is now explicitly taken into account in the observational update, through the nondiagonal parameterization of the observation error covariance matrix π—₯. Moreover, this parameterization is effectively (and equivalently) applied in practice by adding gradient observations to the observation vector. (This addition of gradient observations can only bring the estimated gradient closer to the observed gradient, which is more accurate if the observation errors are spatially correlated.) The improvement of the gradient resulting from the parameterization of error correlations (when they exist) is thus clearly demonstrated by this experiment.

On the other hand, the observational update of the error covariance is illustrated in Fig. 9 (bottom panels), showing the estimated error standard deviation (the square root of the diagonal of 𝗣a). As in experiment 1, it is consistent with the standard deviation of the true error (measured using the ensemble of differences with respect to the true state), which is shown in the top panels of Fig. 9. Again, this is due to the consistent parameterization of the observation error covariance matrix, which has been restored by the addition of gradient observations (with adequate values for Οƒ0 and Οƒ1).

e. Sensitivity to the correlation scale

In this last section, we examine how the results presented above depend on the observation error correlation length. For that purpose, the same experiment is repeated for various simulated observation noises, with a correlation length β„“o ranging from 0 to 10 grid points. For each of these simulated noises, several parameterizations of the observation error covariance matrix are tested, using a correlation length β„“p also ranging from 0 to 10 grid points. Figure 10 shows the resulting error standard deviation for sea surface height and velocity (as measured by the ensemble of differences with respect to the true states), averaged over the domain of interest, as a function of β„“o and β„“p. The figure also shows the ratio between the averaged estimated error and the averaged measured error. Only if β„“o = β„“p is the parameterization consistent with the simulated errors: it is thus along that line that the measured error should be minimum (for a given β„“o) and that the ratio between estimated and measured errors should be equal to 1.

The results show that underestimating the observation error correlation length scale (β„“p β‰ͺ β„“o) leads to a moderate increase of the error on sea surface height, but to a very significant increase of the error on velocity. The estimate of the error standard deviation is also well below reality. This situation indeed corresponds to giving too much importance to the observations and to imposing too weak a constraint on the gradient. Conversely, overestimating the observation error correlation length scale (β„“p ≫ β„“o) leads to a moderate increase of the error on velocity, but to a significant increase of the error on sea surface height. The estimate of the error standard deviation is also well above reality for sea surface height, whereas no sensitivity can be observed for velocity. A correct tuning of β„“p is thus required to estimate both variables accurately, with consistent error estimates.

However, it must be noted that in these experiments the optimal parameterization does not lie on the line β„“o = β„“p as it should, but noticeably below it, especially for large values of β„“o. The benefit obtained by giving the observations more credit than they deserve (by using β„“p < β„“o) can only be explained by inaccuracies in the parameterization of the 𝗣f matrix, which is here approximated by a limited-size ensemble. Overestimating the confidence in the observations is thus somewhat useful here to compensate for suboptimalities in the statistical parameterization of the scheme.

5. Conclusions

Classical algorithms to compute the observational update in Kalman filters are penalized by a computational complexity proportional to the cube of the number of observations. In square root or ensemble Kalman filters, this algorithm can be modified [as proposed by Pham et al. (1998)] to become linear in the number of observations if the observation error covariance matrix is diagonal. In this paper, it has been demonstrated that these benefits can be preserved with two nondiagonal parameterizations of the observation error covariance matrix π—₯. The first method, parameterizing π—₯ as the sum of a diagonal and a low-rank matrix, is especially efficient if the typical distance between observations is small with respect to the correlation scales. The second method, simulating correlations by applying a linear transformation to the observation vector (with diagonal π—₯ in the transformed space), is more generic. It is shown to be especially efficient at describing simple correlation structures if gradient observations can be added to the observation vector. This is possible, for instance, if the observations are distributed along lines or at the nodes of two-dimensional grids, so that discrete gradients can be computed by subtracting successive observations. This has been shown to be equivalent to assuming a specific form of the observation error covariance matrix, with a correlation function and power spectrum that have been computed analytically in the asymptotic limit of dense (continuous) observations. The correlation scale is then the ratio of the observation error standard deviations assumed for the original observations and for the gradient observations.
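For the first method, the efficiency comes from the Sherman–Morrison–Woodbury identity, which lets π—₯⁻¹ be applied to a vector without ever forming or inverting the full p Γ— p matrix. The sketch below uses arbitrary numbers and a hypothetical low-rank factor, purely to illustrate the cost structure (O(prΒ²) instead of O(pΒ³)):

```python
import numpy as np

rng = np.random.default_rng(2)
p, r = 2000, 10  # many observations, small rank r

d = np.full(p, 0.04**2)                       # diagonal part D (variances)
U = rng.standard_normal((p, r)) / np.sqrt(p)  # low-rank factor: R = D + U U^T
v = rng.standard_normal(p)

# Woodbury: R^-1 v = D^-1 v - D^-1 U (I_r + U^T D^-1 U)^-1 U^T D^-1 v
# Only an r x r system is solved; no p x p inversion is needed.
Dinv_v = v / d
Dinv_U = U / d[:, None]
w = Dinv_v - Dinv_U @ np.linalg.solve(np.eye(r) + U.T @ Dinv_U, U.T @ Dinv_v)

# Check against the direct O(p^3) inversion
R = np.diag(d) + U @ U.T
print(np.allclose(w, np.linalg.solve(R, v)))  # -> True
```

In an ensemble filter the same trick is applied inside the update, so the cost stays linear in p as long as the correlated part of π—₯ remains low rank.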

Test experiments have been performed with the aim of reconstructing the circulation of the North Brazil Current, as simulated by a 1/4Β° model of the tropical Atlantic Ocean, using synthetic altimetric observations. Various observation datasets were generated by perturbation with uncorrelated and correlated noise, for several correlation scales. For each dataset, diagonal and nondiagonal parameterizations of the observation error covariance matrix were used to perform the observational update of altimetry together with surface velocities. The results show, first, that the more the observations are correlated, the less information they contain about altimetry. This is also true for velocity (to a lesser degree), although the gradient of altimetry is better observed through correlated observations. Second, assuming a diagonal observation error covariance matrix in the presence of correlated noise leads to a nonoptimal solution that mainly penalizes the reconstruction of surface velocities, and underestimates the error variance (by a factor of 3 in our experiments). Third, optimal parameterizations of the observation error covariance matrix usually produce solutions that are close to minimizing the resulting error, although an artificial increase of the confidence given to the observations (e.g., using a smaller correlation length) can lead to smaller errors (by compensating for misspecifications of the forecast error covariance). Fourth, the experiments also suggest that adding gradient observations is an efficient and adequate way to parameterize observation error correlations. Adding explicit gradient observations can even be useful to compensate for deficiencies in the forecast error statistics, by ensuring a direct control of velocities through gradient data.
It must be stressed, however, that these conclusions may be sensitive to the region of interest: a fine tuning of the observation error correlations may be less critical in regions where the noise-to-signal ratio is much smaller (as in the Gulf Stream region), since a high relative accuracy is always obtained there.

In ocean data assimilation applications, altimetry is always a key element of the observation system. However, the growing number of available observations (not only altimetric) often leads to a prohibitive numerical cost, and to the temptation of simplifying the problem by aggregating (or even dropping) observations, or by making simplistic assumptions about the statistics (such as uncorrelated observation errors). These simplifications are always made at the expense of an optimal use of the observations, and particularly of altimetry, which is sensitive to that kind of approximation. The scheme proposed in this paper is a response to that problem: analyzing more observations at lower cost becomes possible, with a realistic and robust parameterization of the observation error correlations. Being closer to statistical optimality, the scheme can thus make better use of the observational information (especially about velocity), and be of direct benefit to ocean data assimilation systems.

Incidentally, our results also suggest a possible way of improving data thinning strategies. For instance, if the density of altimetric observations along the ground track is reduced, critical information about the gradient (and thus about velocity) is likely to be lost, especially if the observation errors are correlated. A better strategy is certainly to transform the original observation vector by adding gradient observations, and to parameterize a diagonal observation error covariance matrix in the transformed space, as explained in this paper. The data thinning can then be performed on the transformed observation vector (aggregating observations and rescaling error variances) as if the data were independent. In that way, it becomes possible to give a reasonable weight to the gradient information in the reduced observation vector.
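Such a thinning strategy might be sketched as follows. The along-track data and the error values Οƒ0 and Οƒ1 below are hypothetical; the point is only that, once in the transformed space, pairs of observations can be averaged and their error variances rescaled as if the errors were independent:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
ssh = np.cumsum(rng.standard_normal(n)) * 0.01  # hypothetical along-track SSH
s0, s1 = 0.275, 0.055  # error stds in the transformed (independent) space

# Transformed observation vector: values plus discrete gradients
grad = np.diff(ssh)

# Thin by averaging non-overlapping pairs of values and pairs of gradients;
# for independent errors of equal variance s^2, a k-point mean has variance s^2/k
ssh_thin = ssh.reshape(-1, 2).mean(axis=1)             # 50 aggregated values
grad_thin = grad[:98].reshape(-1, 2).mean(axis=1)      # 49 aggregated gradients
var_thin = (s0**2 / 2, s1**2 / 2)                      # rescaled error variances

print(ssh_thin.size, grad_thin.size)  # -> 50 49
```

Because the gradient observations survive the thinning with their own (small) error variance, the reduced observation vector still carries the velocity information that naive along-track subsampling would destroy.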

Acknowledgments

This work was conducted as part of the MERSEA project funded by the European Union (Contract AIP3-CT-2003-502885), with additional support from CNES. We also thank the anonymous reviewers for their useful comments and suggestions. The calculations were performed with the support of IDRIS/CNRS.

REFERENCES

  • Abramowitz, M., and I. A. Stegun, 1970: Handbook of Mathematical Functions. 9th ed. Dover Publications, 1046 pp.

  • Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642.

  • Barnier, B., and Coauthors, 2006: Impact of partial steps and momentum advection schemes in a global ocean circulation model at eddy permitting resolution. Ocean Dyn., 56, 543–567.

  • Bateman, H., and A. Erdelyi, 1954: Tables of Integral Transforms. Vols. 1 and 2. McGraw-Hill Book Company, 835 pp.

  • Brankart, J-M., and P. Brasseur, 1996: Optimal analysis of in situ data in the western Mediterranean using statistics and cross-validation. J. Atmos. Oceanic Technol., 13, 477–491.

  • Brankart, J-M., C-E. Testut, P. Brasseur, and J. Verron, 2003: Implementation of a multivariate data assimilation scheme for isopycnic coordinate ocean models: Application to a 1993–96 hindcast of the North Atlantic Ocean circulation. J. Geophys. Res., 108 (C3), 3074, doi:10.1029/2001JC001198.

  • Cohn, S. E., 1997: An introduction to estimation theory. J. Meteor. Soc. Japan, 75, 257–288.

  • Da Silveira, I., L. Miranda, and W. Brown, 1994: On the origins of the North Brazil Current. J. Geophys. Res., 99 (C11), 22501–22512.

  • Evensen, G., and P. J. van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85–96.

  • Fratantoni, D., W. Johns, and T. Townsend, 1995: Rings of the North Brazil Current: Their structure and behavior inferred from observations and a numerical simulation. J. Geophys. Res., 100 (C6), 10633–10654.

  • Fukumori, I., 2002: A partitioned Kalman filter and smoother. Mon. Wea. Rev., 130, 1370–1383.

  • Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.

  • Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, 341 pp.

  • Kimeldorf, G., and G. Wahba, 1970: A correspondence between Bayesian estimation of stochastic processes and smoothing by splines. Ann. Math. Stat., 41, 495–502.

  • Landau, L., and E. Lifshitz, 1951: Statistical Physics: Course of Theoretical Physics. Vol. 5. Butterworth-Heinemann, 592 pp.

  • Liu, Z., and F. Rabier, 2002: The interaction between model resolution and observation resolution and density in data assimilation. Quart. J. Roy. Meteor. Soc., 128, 1367–1386.

  • Liu, Z., and F. Rabier, 2003: The potential of high density observations for numerical weather prediction: A study with simulated observations. Quart. J. Roy. Meteor. Soc., 129, 3013–3035.

  • McIntosh, P. C., 1990: Oceanographic data interpolation: Objective analysis and splines. J. Geophys. Res., 95 (C8), 13529–13541.

  • Morse, P. M., and H. Feshbach, 1953: Methods of Theoretical Physics. Parts I and II. McGraw-Hill, 1978 pp.

  • Ott, E., B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke, 2004: A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 56A, 415–428.

  • Penduff, T., J. Le Sommer, B. Barnier, A-M. Treguier, J-M. Molines, and G. Madec, 2007: Influence of numerical schemes on current-topography interactions in 1/4Β° global ocean simulations. Ocean Sci., 3, 509–524.

  • Pham, D. T., J. Verron, and M. C. Roubaud, 1998: Singular evolutive extended Kalman filter with EOF initialization for data assimilation in oceanography. J. Mar. Syst., 16, 323–340.

  • Rabier, F., 2006: Importance of data: A meteorological perspective. Ocean Weather Forecasting: An Integrated View of Oceanography, E. P. Chassignet and J. Verron, Eds., Springer, 343–360.

  • Reif, F., 1965: Fundamentals of Statistical and Thermal Physics. McGraw-Hill, 651 pp.

  • Testut, C., P. Brasseur, J. Brankart, and J. Verron, 2003: Assimilation of sea-surface temperature and altimetric observations during 1992–1993 into an eddy permitting primitive equation model of the North Atlantic Ocean. J. Mar. Syst., 40–41, 291–316.

  • Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490.
Fig. 1.

Observation error covariance as a function of the distance ρ, as obtained numerically by inversion of the tridiagonal matrix given by (39) (for Οƒ0 = 1 and different values of β„“/Δξ). The solution is drawn (dotted curves) (left) for β„“ = 1 and decreasing Δξ = 2, 1, and 0.5 and (right) for Δξ = 1 and decreasing β„“ = 2, 1, 0.5, and 0.1. Larger bullets correspond to smaller β„“/Δξ. In the left panel, the discrete solutions are multiplied by the factor β„“/Δξ, to show the convergence to the continuous solution given by (35) (solid curve).

Citation: Monthly Weather Review 137, 6; 10.1175/2008MWR2693.1

Fig. 2.

Observation error covariance as a function of the distance ρ (along the grid lines), as obtained numerically for regular and isotropic grid spacings (for Οƒ0 = 1 and different values of β„“/Δξ). The solution is drawn (dotted curves) (left) for β„“ = 1 and decreasing Δξ = 1, 0.5, and 0.2 and (right) for Δξ = 1 and decreasing β„“ = 2, 1, 0.5, and 0.1. Larger bullets correspond to smaller β„“/Δξ. In the left panel, the discrete solutions are multiplied by the factor β„“2/Δξ2, to show the convergence to the continuous solution given by (44) (solid curve).


Fig. 3.

Snapshots of the circulation in the region of the North Brazil Current, as simulated by the model for (top) 2 and (bottom) 14 Dec of the first year. (left) The sea surface height (m), (middle) the magnitude of its gradient (meters per grid point), and (right) sea surface velocity (m sβˆ’1).


Fig. 4.

As in Fig. 3, but for the (top) means and (bottom) standard deviations of the 5-yr simulation.


Fig. 5.

Simulated observational noise on sea surface elevation for three correlation lengths: (from left to right) β„“ = 0, 5, and 15 grid points. (top) The random noise (with variance equal to 1) and (bottom) the corresponding gradient, using the grid spacing as the length unit.


Fig. 6.

Observational update increment on (left) sea surface height (m), (middle) zonal velocity (m sβˆ’1), and (right) meridional velocity (m sβˆ’1), that would result from one single observation (with 0.04-m error standard deviation) of altimetry located at 9Β°N, 54Β°W (in the middle of the area traversed by the mesoscale rings). This illustrates the size of the domain of influence of the observations.


Fig. 7.

Error standard deviation corresponding to experiment 1, (top) as measured by the ensemble of differences with respect to the true states, and (bottom) as estimated by the scheme (the square root of the diagonal of 𝗣a). It is shown (left) for altimetry (m), (middle) for its gradient (m per grid point), and (right) for velocity (m sβˆ’1).


Fig. 8.

As in Fig. 7, but for experiment 2.


Fig. 9.

As in Fig. 7, but for experiment 3.


Fig. 10.

This figure generalizes the results of Figs. 7, 8, and 9 (here averaged over the domain), by showing them as a function of the observation error correlation length scales (in grid points) β„“o (x axis), characterizing the simulated noise, and β„“p (y axis), which is used to parameterize the observation error covariance matrix π—₯. Shown are results for (left two panels) sea surface height and (right two panels) velocity. Within each variable pair, the left panel shows the true error standard deviation (as measured by the ensemble of differences with respect to the true states), and the right panel shows the ratio between estimated and measured error standard deviations.


Table 1.

Observation error power spectra and associated covariance functions. All spectra have the form of (48), so that they can be directly simulated by adding successive derivatives of the observations to the observation vector. The corresponding covariance functions have been derived from the tables of integral transforms compiled by Bateman and Erdelyi (1954); Jp is the Bessel function of the first kind of order p, Kp is the modified Bessel function of the second kind of order p, and kei0 is a Kelvin function (see Abramowitz and Stegun 1970). In functions (1.4) and (1.5), the parameters ΞΈ and Ξ± are such that βˆ’Ο€/2 < ΞΈ < Ο€/2 and 0 < Ξ± < 1. Some particular cases are included separately: (1.1) is (1.6) with p = 0; (1.2) is (1.4) with ΞΈ = Ο€/4; (1.3) is (1.4) with ΞΈ = 0 and (1.6) with p = 1; (2.1) is (2.4) with p = 0; (2.3) is (2.4) with p = 1.

Table 2.

The three experiments described in this paper differ only in the simulated observation error or in the parameterization of the observation error. The observation error standard deviation is always set to 0.04 m, with a consistent parameterization. The difference lies only in the correlation: in experiment 1, the observation error is simulated by a white noise; in experiments 2 and 3, it is a correlated noise with correlation scale β„“ = 5 grid points. In experiments 1 and 2, the observation errors are parameterized using a diagonal π—₯ matrix (i.e., assuming uncorrelated errors); in experiment 3, gradient observations are added to simulate correlations. Hence, only experiments 1 and 3 use a parameterization that is consistent with the simulated errors. (Οƒ1 = ∞ means that no gradient observations are used.)

Table 3.

Values of the observation error standard deviation (Οƒ0, m) and gradient error standard deviation (Οƒ1, m per grid point) to be used for parameterizing observation errors with standard deviation Οƒ = 0.04 m and correlation length β„“ (in observation grid points). The correspondence is established using (23), with transformation (46).


* Current affiliation: MERCATOR-Ocean, Toulouse, France.
