## Abstract

Most data assimilation algorithms require the inverse of the covariance matrix of the observation errors. In practical applications, the cost of computing this inverse matrix with spatially correlated observation errors is prohibitive. Common practices are therefore to subsample or combine the observations so that the errors of the assimilated observations can be considered uncorrelated. As a consequence, a large fraction of the available observational information is not used in practical applications. In this study, a method is developed to account for the correlations of the errors that will be present in the wide-swath sea surface height measurements, for example, the Surface Water and Ocean Topography (SWOT) mission. It basically consists of the transformation of the observation vector so that the inverse of the corresponding covariance matrix can be replaced by a diagonal matrix, thus allowing to genuinely take into account errors that are spatially correlated in physical space. Numerical experiments of ensemble Kalman filter analysis of SWOT-like observations are conducted with three different observation error covariance matrices. Results suggest that the proposed method provides an effective way to account for error correlations in the assimilation of the future SWOT data. The transformation of the observation vector proposed herein yields both a significant reduction of the root-mean-square errors and a good consistency between the filter analysis error statistics and the true error statistics.

## 1. Introduction

Future wide-swath radar altimeter missions [e.g., Surface Water and Ocean Topography (SWOT); Durand et al. 2010] offer an unprecedented opportunity for observing ocean dynamics at scales smaller than 100 km. These observations will indeed provide global measurements of sea surface height (SSH) at kilometer scale with centimeter precision. If used adequately in operational centers, this information could eventually provide direct estimates of surface currents at scales down to 10 km. Reaching this objective could yield a breakthrough in our understanding of energy cascades and tracer transport in the surface ocean (Fu et al. 2012).

From a physical perspective, the potential of wide-swath altimetric observations of SSH depends on our ability to map surface pressure gradients at high resolution (<10 km). SSH measurements over two-dimensional swaths indeed provide an instantaneous piece of information on the near-surface pressure field at high resolution that should allow for estimating surface pressure gradients more accurately than with conventional altimetry (Fu et al. 2012). Accurate estimates of near-surface pressure gradients then provide key information on the forces that constrain the time evolution of oceanic surface flows.

But it is unclear whether current implementations of data assimilation methods used in operational centers are adapted to benefit from the full potential of wide-swath altimetric data for estimating surface pressure fields. These implementations either subsample or combine altimetric observations so that the errors of SSH measurements can be considered uncorrelated (Oke et al. 2008). This is because of the prohibitive cost of explicitly computing the inverse of the observation error covariance matrix. This cost is proportional to in general (*m* is the total number of observations), whereas it is only linear in *m* when the covariance matrix is diagonal.

The assimilation of wide-swath SSH measurements may be a challenge for the algorithms currently used by the prediction centers to assimilate altimetric observation because the errors associated with these measurements are expected to be highly correlated in space. On top of this, the number of available observations will significantly increase. Therefore, it is desirable to still rely on the algorithms for which the complexity is linear in the number of observations. At the same time, to obtain high-quality information about the finescale structure of the oceanic flow, it will be necessary to use all observations and to take into account the error correlations to extract as much information as possible from the observations.

The problem of dealing with correlated observation errors in data assimilation is not new. Liu and Rabier (2002) suggest that data assimilation schemes commonly used in operational (meteorological) centers are not suited to correctly deal with dense observations with correlated errors. Studies conducted by Stewart et al. (2008, 2013), Miyoshi et al. (2013), and Waller et al. (2014) all highlight the clear benefit of accounting for the observation error correlations in data assimilation. These authors also propose methods to estimate and account for the observation error correlations in data assimilation, but they do not show the possible implementation with high-dimensional systems.

In this article we present a method to account for the correlations of the errors expected in the SWOT measurements in data assimilation with high-dimensional systems while still using a diagonal observation error covariance matrix. The general idea is to apply a local transformation to the original observations. In this work, the transformation consists of augmenting the observation vector with the first- and second-order spatial derivatives of the original observations. This local transformation has an order *m* complexity and is fully compatible with spatial localization in the analysis step. This way we fulfill both conditions exposed above; that is, we still rely on an algorithm for which the complexity is linear in the number of observations, and we account for the error covariances in the assimilation. The idea of applying local transformations to the observations is not new and is presented in the work of Brankart et al. (2009). The novelty of this study is the fitting of a parametric covariance matrix to the “observed” SWOT covariance matrix by solving an optimization problem.

The objective of this article is twofold: (i) to present a simple and cheap method to efficiently account for the spatial covariances of the SWOT observation errors in linear least squares data assimilation methods, and (ii) to test and illustrate the method for the assimilation of the future SWOT data. Section 2 briefly presents the challenge of accounting for observation error correlations from the technical viewpoint. In section 3 the software used to model the SWOT error budget is presented, and the error is characterized in terms of its probability distribution and statistical moments. In section 4a the methodology used to fit a parametric form to the SWOT error covariance matrix is described, and in section 4b the obtained results are presented. Section 5a presents the configuration of the conducted numerical experiments. The results of these experiments are presented in section 5b. A final discussion and concluding remarks are presented in section 6.

## 2. Observation error correlations in data assimilation

Least squares data assimilation methods aim to find a state vector that minimizes the cost function given by

where and is the canonical inner product. In the cost function [Eq. (1)] the first term measures the distance between the observations and the corresponding state vector , where maps the state vector from the model space to the observation space, weighted by the inverse of the covariance matrix of the observation error ; and the second term measures the distance between the background state and the state vector , weighted by the inverse of the covariance matrix of the background error . Therefore, it is readily seen that the solution of this problem depends on the specification of the matrices and .

The minimum of the cost function [Eq. (1)] can be efficiently calculated when the matrix is diagonal, that is, when the observation errors are not correlated. For reduced-order methods based on the Kalman filter, with *N* directions represented in the state space (the ensemble size in ensemble Kalman filters), the computational cost of minimizing Eq. (1) is of order , when both and are full matrices. When is diagonal, the Kalman gain may be rewritten to provide a computational cost of order (Brankart et al. 2009). Since for reduced-order methods *N* is usually much smaller than *m*, and the covariance structures presented in are in general poorly known, is very often approximated by a diagonal matrix and the algorithms with complexity cost of order are usually used. Common methods to minimize the negative consequences of such an approximation are data thinning (Järvinen and Undén 1997), data combination (Oke et al. 2008), and the inflation of the observation error variances (e.g., Li et al. 2009; Brankart et al. 2010).

## 3. Simulation of SWOT measurement errors

SWOT measurement errors are modeled using the “SWOT simulator” software developed at JPL (Ubelmann et al. 2016; Gaultier et al. 2016). In the first step, the simulator constructs a regular grid based on the baseline orbit parameters of the satellite (20.86-day repeat cycle, inclination of 77.6°, and altitude of 891 km) and the characteristics of the radar interferometer on board (120-km wide swath with a middle gap of 20 km). The grid resolution is adjustable. In this work, we choose 9 km to reduce the computational burden and to remove small-scale noise. In the second step, the simulator reads from files SSH data simulated with an ocean circulation model and interpolates these data on the SWOT grid. In the third and last step, it produces random fields of SWOT-like errors according to spectral power density functions determined by the SWOT scientific team (Esteban-Fernandez 2013) and adds them to the SSH data. Details about the generation of errors can be found in Ubelmann et al. (2016) and Gaultier et al. (2016).

The SWOT simulator models six types of errors: Ka-band Radar Interferometer (KaRIn) error (due to thermal noise in the interferometer channel), roll error (due to oscillations of the platform), timing error (due to the precision of the radar timing system), phase error (due to roughness of the sea surface at the scale of the radar pulse), baseline dilatation error (due to the variation of the length of the baseline), and wet troposphere error (due to the path delay of the radar pulse, ascribable to tropospheric humidity). In the SWOT simulator, individual realizations of errors associated with each of the abovementioned sources are drawn from statistical distributions of errors specified according to the current knowledge of the SWOT error budget. The results presented in the following sections are therefore valid according to the current knowledge of the SWOT error budget.

Figure 1 illustrates one sample of each component of the error listed above at the 9-km resolution. The errors with the largest amplitudes are the roll, phase, and wet troposphere errors, while the least important sources are the timing and baseline errors. KaRIn errors are of the same order of magnitude as the first three at high resolution; here, they are smoothed by the grid coarsening. The cross-track distribution of the error amplitude is heterogeneous for the roll, phase, and baseline errors, with larger errors close to the outer boundary of each swath. All types of error except the KaRIn error exhibit significant spatial correlations in both cross- and along-track directions; between the two half-tracks, roll errors are anticorrelated by nature, while baseline errors are fully correlated.

To quantify the error statistics and get an estimate of the SWOT error covariance matrix, 5000 realizations of the error are computed with the simulator at a spatial resolution of 9 km. We choose this resolution to filter out the spatially uncorrelated KaRIn noise present at the nominal SWOT resolution of 1 km. This reduces the number of realizations necessary to characterize the error statistics and makes the implementation of the method described in the next section simpler. The treatment of the SWOT data at full resolution will be addressed in future works.

Figure 2 shows the total error distribution for a point at the outer boundary of the left swath and the analytical Gaussian function with the same sample mean and covariance. It is readily seen that the error is Gaussian distributed with mean and standard deviation and , respectively. This is an expected result, since each error component is, by construction, an independent realization of a Gaussian random variable; it is well known that a finite sum of independent Gaussian-distributed variables is also Gaussian distributed.

Figure 3 shows the spatial correlation field for a point at the outer boundary of the left swath. The correlation length is slightly anisotropic with the preferred orientation pointing in the along-track direction. The correlation values are very high and decrease to 0.5 (black line in the figure) at around 388 km from the reference point. The large-scale correlations reveal the need for a nondiagonal parameterization of the covariance matrix, otherwise most information contained in these correlations would be lost.

In the next section the sample covariance matrix diagnosed from the 5000 realizations of the SWOT simulator is used as the known covariance matrix for which a parametric form is searched.

## 4. Parameterization of the covariance matrix

### a. Theoretical aspects

Our objective is to assimilate a set of *m* observations that are known to have correlated errors by evaluating the observation term of the cost function [Eq. (1)], in a transformed space where the observation error covariance matrix is diagonal.

To evaluate the observation term in a transformed space, it is first needed to define a linear transformation , with for a transformed space size , and an appropriate diagonal matrix , which is associated with the transformed observation vector . Then, one may see that the observation term in the original space is equal to the observation term in the transformed space,

if

Therefore, our problem is to find an appropriate once we are given and .

The principal axis theorem guarantees the existence of , which applied to produces . However, this is obtained through the eigendecomposition of , which has a computational complexity of order . In addition, in this case has a complex structure and each new observation is a linear combination of all original observations . This is an important point, since in the cases in which domain localization is used in the assimilation procedure, it requires the computation of the eigendecomposition of the associated local covariance matrix at each analysis step (see section 5a for a more complete description of the domain localization technique).

The main interest of the presented method is to use simple local linear transformations for which the cost of calculating remains linear in the number of observations. As it is shown in Brankart et al. (2009), when has some specific form, such as isotropic exponential correlations for instance, or is characteristic of a correlation function governed by a differential equation, there exists and such that exactly matches .

### b. Implementation for SWOT observations

Since from the SWOT simulator does not have any analytic form and is not a solution of a known differential equation, our approach is to choose a simple linear transformation and then find an that minimizes the square of the residual *r* measured by the Frobenius norm ,

where is the identity matrix. We recall that the Frobenius norm is induced by the inner product of two matrices from which we may write the norm as . The advantage of using a norm induced by an inner product is the possibility of using orthogonal projections to solve our problem. Moreover, the chosen cost function has the advantage of avoiding the computation of the inverse of .

Given the nature of the observations this study is dealing with, the chosen transformation operator is composed of the identity and the first two derivatives of the observations,

where is the first-order spatial derivative in the along- and cross-track directions and is the second-order spatial derivative in the along- and cross-track directions. The choice of this particular linear transformation is motivated by the work of Brankart et al. (2009). Brankart et al. (2009) have indeed shown how the addition of gradient observations to the observation vector can be equivalent to assuming a specific form of the observation error covariance matrix. With this transformation, the extra observations added to the observation vector have a straightforward physical interpretation. The first derivative of SSH is indeed related to the geostrophic velocity, and the second derivative of the SSH is related to the geostrophic vorticity. However, the method performance is not necessarily dependent on how dominant the geostrophic dynamics actually is. Note although that the smooth shape of the SWOT error correlation function (see Fig. 3) at the origin requires at least the inclusion of the second derivative.

Experiments not shown in this article indicate that decreases with the addition of higher-order derivatives in . However, one should keep in mind that the size of the observation vector also increases, which is why we kept only the first two derivatives. For the transformation in Eq. (5), the size of the new observation vector for an original 2D observation of size and for forward discrete operators is ; that is, the size of the observation vector is increased by almost a factor of 5.

At this point all necessary ingredients to solve the proposed problem are known: the covariance matrix from the SWOT simulator and a simple linear transformation . Therefore, given in the form of a block matrix,

with diagonal blocks _{0}, _{1a}, _{1c}, _{2a}, and _{2c}, we can use the definition of the operator and Eq. (3) to calculate the inverse of in the space of observations ,

where and are the discrete forward first- and second-order derivative operators in the along- and cross-track directions, respectively.

Thus, the problem is to find the matrices _{0}, _{1a}, _{1c}, _{2a}, and _{2c} that minimize the cost function [Eq. (4)].

To reduce the size of the minimization problem and to make it more regular, we use the simulator to gain background information about by calculating an ensemble of realization of the SWOT errors , and assuming that is proportional by block to the variance of the direct transformation of the sample covariance matrix ,

Therefore, the problem is to find five scalar parameters that minimize Eq. (4). The minimization is here performed with a Gauss–Newton method.

Figure 4 shows _{0}, _{1a}, _{1c}, _{2a}, and _{2c} obtained by the minimization of Eq. (4). Matrix _{0} displays variances much larger than the original observation error variances, by a factor between 100 and 1000. These high variances are coherent with the long-range correlations observed in the observation errors. The observation for which more weight is given is the along-track second-order derivative of the original observations. This may happen due to a more regular error in the along-track direction when compared to the cross-track direction.

Figure 5 shows the covariance fields from the reference, SWOT simulator–based covariance matrix, and from the identified covariance matrix, with respect to the same grid point. The identified matrix appears as a reliable approximation of the SWOT error covariance matrix, especially for the near field. Some small differences are observed for distances larger than 500 km. This is not a big issue, since in general the analysis step conducted by prediction centers is performed with a typical domain localization radius of a few hundred kilometers (e.g., Oke et al. 2008), which makes the impact of these distant covariances negligible.

Finally, the spectral characteristics of the covariance matrices are analyzed. Figure 6 shows the singular values of both matrices, the one issued from the SWOT simulator and the matrix identified by our method in the original observation space. It is seen that the spectra are very close for the first 2440 singular values, which is quite a good result since the estimated rank of is 3015.

In this section we have shown that with an extended observation space, obtained by adding successive derivatives of the original SWOT observations to the observation vector, it is possible to model a quite general covariance function. The benefit of using is illustrated in the following section.

## 5. Numerical experiments

### a. Experiment configuration

The Sea Box for Assimilation (SEABASS) configuration of the NEMO model (Madec 2008) at 1/12° resolution is used to simulate the true state and the background state. This configuration simulates an idealized double-gyre circulation forced by a stationary analytical zonal wind. For a more complete description of the SeaBass setup, we refer to Cosme et al. (2010) and Bouttier et al. (2012).

The matrix is built using an ensemble of 181 perturbations calculated from regularly spaced outputs of a 10-yr simulation.

The true state is drawn after an additional 2-yr simulation; that is, it is not included in the sequence of states used to produce . The observations are obtained by interpolating the SSH field from the true state onto the SWOT swath and by adding one realization of the error from the SWOT simulator.

An ensemble of 100 backgrounds states is drawn from a Gaussian distribution , where is the true system state and *α* is a factor used to reduce the spread of the background perturbations. Drawing the background states this way ensures that the background error is in the space spanned by the perturbations, which means that given enough and noise-free observations one can perfectly correct any background state with a linear combination of the ensemble of perturbations. The *α* parameter is used to make the variance level of the background error closer to the variance of the observation error. This situation is expected in an assimilation system after it has reached its asymptotic error level and for which the model error is reasonably small compared to the uncertainties in the initial condition.

Because of the limited size of the ensemble used to construct the background covariance matrix, domain localization (Janjić et al. 2011) is used to cut off unrealistic long-range correlations due to sampling errors. In this method each model grid point is analyzed independently from the others, and only observations lying within a distance of 600 km from the analyzed point are taken into account. In addition, the inverse of the observation error covariance matrix is Schur multiplied by a Gaussian function to ensure that the observations that are closer to the analyzed point have greater weights.

An experiment is defined by a set of 100 Kalman filter analyses performed using the 100 background states. Three experiments using different approximations of are conducted: (i) using a diagonal matrix containing only the diagonal terms of the original covariance matrix : ; (ii) using the diagonal matrix corresponding to the first block of our transformed matrix , which is in fact an inflated matrix and corresponds to a very common solution used to assimilate correlated observations (Miyoshi et al. 2013); and (iii) using our parametric matrix ; we recall that in this case the assimilated observations are SSH and its first and second derivatives.

For the analysis we use the local ensemble transform Kalman filter (LETKF). For more details about the algorithm used in the analysis step, we refer to Candille et al. (2015).

### b. Data assimilation results

This section aims to demonstrate the validity of the approach presented in section 4 through its application in a data assimilation analysis step. The results are presented in terms of the root-mean-square error (rmse), for which the mean is taken over the ensemble of analyses, and two aspects are analyzed: (i) which approximation of produces the smallest rmse, and (ii) which approximation produces the best consistency between the analysis covariance matrix computed by the assimilation algorithm and the actual analysis error variance. When both matrices and are correctly specified, the rmse calculated from the ensemble of the analyses should be equal to the square root of the diagonal of .

When is used—that is, when the observation error correlation structure is ignored—too much confidence is given to the observations and this has a double effect: (i) the analysis field is strongly contaminated with the measurement errors, resulting in high rmse; and (ii) the analysis error covariance matrix does not represent the actual uncertainty of the analysis. This is shown in Fig. 7, which displays the rmse and the square root of the diagonal of the posterior matrix calculated by the filter for the observed variable SSH, the meridional and zonal surface velocities, and the surface temperature. The filter produces rmse values larger than the background rmse values for SSH, meridional velocity, and temperature. Errors in zonal velocity are reduced, but there are some regions with larger errors, particularly at the southern portion of the observed region. Concerning consistency, the term for SSH has collapsed, that is, it has a very low value. Therefore, there is no consistency between the statistics produced by the filter and the true statistics. For velocities and temperature, the values of the standard deviation produced by the filter are not as low as for SSH, but there is still no consistency between the true error statistics and the error statistics produced by the filter.

When is used for the analysis—that is, when observation error covariances are ignored but variances are inflated to account for this ignorance—the error statistics produced by the filter are consistent with the true error statistics, but the analysis reduces the errors very poorly on the SWOT track. This is shown in Fig. 8. In comparison with the experiment using , the filter analysis produces smaller rmse values for SSH, meridional velocity, and temperature. For zonal velocity, errors are larger in the central region, where the dynamics are more intense—that is, in the area under the influence of the main wind jet. In terms of consistency, this experiment qualitatively outperforms the experiment using for all variables. However, as shown by , errors are poorly reduced in comparison with the previous case, because observation error variances are here considered very large.

Using in the Kalman filter analysis leads to a significant reduction of the errors for all variables and, simultaneously, a good consistency between the true error statistics and the error statistics produced by the filter. This is shown in Fig. 9. Definitely, including additional information about the first and second derivatives of the observation in the observation vector to parameterize the error spatial covariances has a strong and beneficial effect on the analyzed field, especially for the observed variable (SSH). In particular, it makes the filter capture the true uncertainty of the ocean state estimation.

## 6. Discussion and concluding remarks

This article explored the idea of extending the observation vector with a simple local transformation of the original observations vector for the assimilation of these observations with a least squares–based data assimilation method such as the Kalman filter. The technique is designed to simulate spatial correlations present in the observation errors. The chosen transformation is composed of the identity matrix and the first and second derivatives of the original observation. A diagonal matrix, whose entries are obtained through the solution of an optimization problem, is associated with this new set of observations.

The obtained results prove that the linear transformation introduced here is appropriate for the assimilation of SWOT-like observations at a 9-km resolution. The effectiveness of the proposed methodology was analyzed in terms of (i) the rmse values of Kalman filter analyses; and (ii) the consistency between the error standard deviation of the Kalman filter analysis, computed from the analysis covariance matrix, and the error standard deviation of an ensemble of Kalman filter analyses. Both metrics are improved when observation error covariances are accounted for through the observation vector transformation. Ignoring the covariances leads to an inconsistency between the true error statistics and the error statistics produced by the Kalman filter. Compensating this ignorance with an inflation of the observation error variances leads to better consistency but a reduced impact in terms of rmse values.

Again, the technique presented in this work results in the use of a diagonal form of the observation error covariance matrix. Therefore, it enables the assimilation of dense and structured satellite observations of the atmosphere or the ocean, such as sea surface temperature or ocean color. Here, the focus was made on the future wide-swath altimetry mission SWOT. Regarding SWOT, we face two specific challenges: (i) transposing the technique to SWOT observations at their nominal resolution of 1 km, and (ii) operating the technique in a full data assimilation system. At the 1-km resolution, the data are heavily affected by the spatially uncorrelated noise from the KaRIn instrument, what makes the application of the technique more difficult. To address the first point, we plan to investigate the use of advanced image restoration techniques to filter out KaRIn noise and make the method applicable to SWOT data at the kilometer resolution. The second point is challenging for two reasons: First, because it will be difficult to implement advanced data assimilation techniques with models at SWOT-like resolutions of a few kilometers at short and middle terms; second, because the simultaneous assimilation of SWOT with other observations of different resolutions (conventional altimetry, typically), a practice commonly known as multiscale data assimilation (e.g., Fieguth et al. 1995), is challenging in itself. These challenges will be addressed in future works.

## Acknowledgments

The research presented in the paper was supported by the Centre National d’Études Spatiales (CNES) and the Seventh Framework Programme FP7/2007–2013 of the European Commission through the Stochastic Assimilation for the Next Generation Ocean Model Applications (SANGOMA) project (Grant Agreement 283580). Computations were carried out using high performance computing resources from Grand équipement national de calcul intensif–Institut du développement et des ressources en informatique scientifique (GENCI-IDRIS; Grant 2014-0111279).

## REFERENCES

## Footnotes

^{a}

Current affiliation: Mercator Océan, Toulouse, France.

^{b}

Current affiliation: CLS, Toulouse, France.