## 1. Introduction

Future wide-swath radar altimeter missions [e.g., Surface Water and Ocean Topography (SWOT); Durand et al. 2010] offer an unprecedented opportunity for observing ocean dynamics at scales smaller than 100 km. These observations will indeed provide global measurements of sea surface height (SSH) at kilometer scale with centimeter precision. If used adequately in operational centers, this information could eventually provide direct estimates of surface currents at scales down to 10 km. Reaching this objective could yield a breakthrough in our understanding of energy cascades and tracer transport in the surface ocean (Fu et al. 2012).

From a physical perspective, the potential of wide-swath altimetric observations of SSH depends on our ability to map surface pressure gradients at high resolution (<10 km). SSH measurements over two-dimensional swaths indeed provide an instantaneous piece of information on the near-surface pressure field at high resolution that should allow for estimating surface pressure gradients more accurately than with conventional altimetry (Fu et al. 2012). Accurate estimates of near-surface pressure gradients then provide key information on the forces that constrain the time evolution of oceanic surface flows.

It is unclear, however, whether current implementations of data assimilation methods used in operational centers are adapted to benefit from the full potential of wide-swath altimetric data for estimating surface pressure fields. These implementations either subsample or combine altimetric observations so that the errors of SSH measurements can be considered uncorrelated (Oke et al. 2008). This is because of the prohibitive cost of explicitly computing the inverse of the observation error covariance matrix. This cost is proportional to *m*³ (where *m* is the total number of observations), whereas it is only linear in *m* when the covariance matrix is diagonal.
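The cost argument can be illustrated with a minimal Python sketch. It assumes that the observation term has the usual quadratic form, written here as d^T R^{-1} d; the names `d`, `r_var`, and `R` and the synthetic values are illustrative assumptions, not quantities taken from this article. With a diagonal matrix the term reduces to a weighted sum of squares, whereas a dense matrix first requires a factorization whose cost grows as the cube of *m*.

```python
import numpy as np

def obs_term_diagonal(d, r_var):
    """Evaluate d^T R^{-1} d when R is diagonal with variances r_var: O(m) work."""
    return float(np.sum(d * d / r_var))

def obs_term_dense(d, R):
    """Evaluate d^T R^{-1} d for a dense R via a Cholesky factorization: O(m^3) work."""
    chol = np.linalg.cholesky(R)      # R = chol @ chol.T
    z = np.linalg.solve(chol, d)      # z = chol^{-1} d
    return float(z @ z)               # d^T R^{-1} d = z^T z

rng = np.random.default_rng(0)
m = 200
d = rng.standard_normal(m)                # hypothetical innovation vector
r_var = rng.uniform(0.5, 2.0, size=m)     # hypothetical error variances

j_diag = obs_term_diagonal(d, r_var)
j_dense = obs_term_dense(d, np.diag(r_var))   # same matrix, stored densely
```

Both functions return the same value for this diagonal test case; only the amount of work differs.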

The assimilation of wide-swath SSH measurements may be a challenge for the algorithms currently used by the prediction centers to assimilate altimetric observations because the errors associated with these measurements are expected to be highly correlated in space. On top of this, the number of available observations will significantly increase. Therefore, it is desirable to still rely on algorithms for which the complexity is linear in the number of observations. At the same time, to obtain high-quality information about the finescale structure of the oceanic flow, it will be necessary to use all observations and to take into account the error correlations to extract as much information as possible from the observations.

The problem of dealing with correlated observation errors in data assimilation is not new. Liu and Rabier (2002) suggest that data assimilation schemes commonly used in operational (meteorological) centers are not suited to correctly deal with dense observations with correlated errors. Studies conducted by Stewart et al. (2008, 2013), Miyoshi et al. (2013), and Waller et al. (2014) all highlight the clear benefit of accounting for the observation error correlations in data assimilation. These authors also propose methods to estimate and account for the observation error correlations in data assimilation, but they do not show the possible implementation with high-dimensional systems.

In this article we present a method to account for the correlations of the errors expected in the SWOT measurements in data assimilation with high-dimensional systems while still using a diagonal observation error covariance matrix. The general idea is to apply a local transformation to the original observations. In this work, the transformation consists of augmenting the observation vector with the first- and second-order spatial derivatives of the original observations. This local transformation has a complexity of order *m* and is fully compatible with spatial localization in the analysis step. This way we fulfill both conditions stated above; that is, we still rely on an algorithm for which the complexity is linear in the number of observations, and we account for the error covariances in the assimilation. The idea of applying local transformations to the observations is not new and is presented in the work of Brankart et al. (2009). The novelty of this study is the fitting of a parametric covariance matrix to the “observed” SWOT covariance matrix by solving an optimization problem.
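The augmentation idea can be sketched in one dimension with a few lines of Python. This is an illustration under stated assumptions, not the implementation used in this study: the operator name `T`, the finite-difference stencils, the 50-point grid, and the variances in `r_plus` are all hypothetical. The sketch shows that a diagonal error covariance in the augmented space is equivalent to a nondiagonal covariance in the original observation space.

```python
import numpy as np

def augmentation_operator(m, dx=1.0):
    """Stack identity, first-difference, and second-difference operators."""
    identity = np.eye(m)
    d1 = (np.eye(m, k=1) - np.eye(m))[:-1] / dx                                  # (m-1) x m
    d2 = (np.eye(m, k=1) - 2.0 * np.eye(m) + np.eye(m, k=-1))[1:-1] / dx**2      # (m-2) x m
    return np.vstack([identity, d1, d2])

m = 50
T = augmentation_operator(m, dx=9.0)      # 9-km spacing, as in the grid used here

# Hypothetical diagonal variances for the values and their two derivatives.
r_plus = np.concatenate([np.full(m, 100.0),
                         np.full(m - 1, 1e-2),
                         np.full(m - 2, 1e-4)])

# Equivalent precision and covariance in the original observation space.
precision = T.T @ (T / r_plus[:, None])
R_equiv = np.linalg.inv(precision)
std = np.sqrt(np.diag(R_equiv))
corr = R_equiv / np.outer(std, std)

# The observation term can be evaluated without ever forming R_equiv.
d = np.random.default_rng(0).standard_normal(m)
j_augmented = float(np.sum((T @ d) ** 2 / r_plus))
j_original = float(d @ precision @ d)
```

Because each row of `T` touches at most three neighboring points, applying it with sparse stencils costs a number of operations proportional to *m*; the dense matrices above are used only to keep the sketch short.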

The objective of this article is twofold: (i) to present a simple and cheap method to efficiently account for the spatial covariances of the SWOT observation errors in linear least squares data assimilation methods, and (ii) to test and illustrate the method for the assimilation of the future SWOT data. Section 2 briefly presents the challenge of accounting for observation error correlations from the technical viewpoint. In section 3 the software used to model the SWOT error budget is presented, and the error is characterized in terms of its probability distribution and statistical moments. In section 4a the methodology used to fit a parametric form to the SWOT error covariance matrix is described, and in section 4b the obtained results are presented. Section 5a presents the configuration of the conducted numerical experiments. The results of these experiments are presented in section 5b. A final discussion and concluding remarks are presented in section 6.

## 2. Observation error correlations in data assimilation

The minimum of the cost function [Eq. (1)] can be efficiently calculated when the matrix *N* directions represented in the state space (the ensemble size in ensemble Kalman filters), the computational cost of minimizing Eq. (1) is of order *N* is usually much smaller than *m*, and the covariance structures presented in

## 3. Simulation of SWOT measurement errors

SWOT measurement errors are modeled using the “SWOT simulator” software developed at JPL (Ubelmann et al. 2016; Gaultier et al. 2016). In the first step, the simulator constructs a regular grid based on the baseline orbit parameters of the satellite (20.86-day repeat cycle, inclination of 77.6°, and altitude of 891 km) and the characteristics of the radar interferometer on board (120-km wide swath with a middle gap of 20 km). The grid resolution is adjustable. In this work, we choose 9 km to reduce the computational burden and to remove small-scale noise. In the second step, the simulator reads from files SSH data simulated with an ocean circulation model and interpolates these data on the SWOT grid. In the third and last step, it produces random fields of SWOT-like errors according to spectral power density functions determined by the SWOT scientific team (Esteban-Fernandez 2013) and adds them to the SSH data. Details about the generation of errors can be found in Ubelmann et al. (2016) and Gaultier et al. (2016).

The SWOT simulator models six types of errors: Ka-band Radar Interferometer (KaRIn) error (due to thermal noise in the interferometer channel), roll error (due to oscillations of the platform), timing error (due to the precision of the radar timing system), phase error (due to roughness of the sea surface at the scale of the radar pulse), baseline dilatation error (due to the variation of the length of the baseline), and wet troposphere error (due to the path delay of the radar pulse, ascribable to tropospheric humidity). In the SWOT simulator, individual realizations of errors associated with each of the abovementioned sources are drawn from statistical distributions of errors specified according to the current knowledge of the SWOT error budget. The results presented in the following sections are therefore valid according to the current knowledge of the SWOT error budget.

Figure 1 illustrates one sample of each component of the error listed above at the 9-km resolution. The errors with the largest amplitudes are the roll, phase, and wet troposphere errors, while the least important sources are the timing and baseline errors. KaRIn errors are of the same order of magnitude as the first three at high resolution; here, they are smoothed by the grid coarsening. The cross-track distribution of the error amplitude is heterogeneous for the roll, phase, and baseline errors, with larger errors close to the outer boundary of each swath. All types of error except the KaRIn error exhibit significant spatial correlations in both cross- and along-track directions; between the two half-tracks, roll errors are anticorrelated by nature, while baseline errors are fully correlated.

To quantify the error statistics and get an estimate of the SWOT error covariance matrix, 5000 realizations of the error are computed with the simulator at a spatial resolution of 9 km. We choose this resolution to filter out the spatially uncorrelated KaRIn noise present at the nominal SWOT resolution of 1 km. This reduces the number of realizations necessary to characterize the error statistics and makes the implementation of the method described in the next section simpler. The treatment of the SWOT data at full resolution will be addressed in future works.

Figure 2 shows the total error distribution for a point at the outer boundary of the left swath and the analytical Gaussian function with the same sample mean and covariance. It is readily seen that the error is Gaussian distributed with mean and standard deviation

Histogram of the total SWOT error distribution for a point at the outer boundary of the left swath. The dashed line represents a Gaussian distribution with the same mean and standard deviation as the ensemble of error realizations used to build the histogram.

Citation: Journal of Atmospheric and Oceanic Technology 33, 12; 10.1175/JTECH-D-16-0048.1

Figure 3 shows the spatial correlation field for a point at the outer boundary of the left swath. The correlation length is slightly anisotropic with the preferred orientation pointing in the along-track direction. The correlation values are very high and decrease to 0.5 (black line in the figure) at around 388 km from the reference point. The large-scale correlations reveal the need for a nondiagonal parameterization of the covariance matrix, otherwise most information contained in these correlations would be lost.

Error correlations for a point at the outer boundary of the left swath. Colors are in nondimensional units. The black line represents a correlation of 0.5.

In the next section the sample covariance matrix diagnosed from the 5000 realizations of the SWOT simulator is used as the known covariance matrix for which a parametric form is searched.
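As a minimal sketch of this diagnostic step, the sample covariance and the correlation field of one reference point can be computed from an array of error realizations as follows. The synthetic `errors` array is only a stand-in for simulator output, and the function names are illustrative assumptions.

```python
import numpy as np

def sample_covariance(errors):
    """Unbiased sample covariance of an (n_realizations, m) array of errors."""
    anomalies = errors - errors.mean(axis=0)
    return anomalies.T @ anomalies / (errors.shape[0] - 1)

def correlation_with_point(cov, i):
    """Correlation between grid point i and every grid point."""
    std = np.sqrt(np.diag(cov))
    return cov[i] / (std[i] * std)

rng = np.random.default_rng(1)
n_realizations, m = 5000, 30
# Stand-in for simulator output: spatially correlated synthetic errors
# built as cumulative sums of white noise along the grid.
errors = np.cumsum(rng.standard_normal((n_realizations, m)), axis=1)

cov = sample_covariance(errors)
corr0 = correlation_with_point(cov, 0)    # correlation field for point 0
```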

## 4. Parameterization of the covariance matrix

### a. Theoretical aspects

Our objective is to assimilate a set of *m* observations that are known to have correlated errors by evaluating the observation term

The principal axis theorem guarantees the existence of

The main interest of the presented method is to use simple local linear transformations for which the cost of calculating

### b. Implementation for SWOT observations

*r*measured by the Frobenius norm

Experiments not shown in this article indicate that

$\mathbf{R}_{0}$, $\mathbf{R}_{1a}$, $\mathbf{R}_{1c}$, $\mathbf{R}_{2a}$, and $\mathbf{R}_{2c}$, we can use the definition of the operator

Thus, the problem is to find the matrices $\mathbf{R}_{0}$, $\mathbf{R}_{1a}$, $\mathbf{R}_{1c}$, $\mathbf{R}_{2a}$, and $\mathbf{R}_{2c}$ that minimize the cost function [Eq. (4)].

Figure 4 shows $\mathbf{R}_{0}$, $\mathbf{R}_{1a}$, $\mathbf{R}_{1c}$, $\mathbf{R}_{2a}$, and $\mathbf{R}_{2c}$ obtained by the minimization of Eq. (4). Matrix $\mathbf{R}_{0}$ displays variances much larger than the original observation error variances, by a factor between 100 and 1000. These high variances are consistent with the long-range correlations observed in the observation errors. The observation given the most weight is the along-track second-order derivative of the original observations. This may be due to the error being more regular in the along-track direction than in the cross-track direction.
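The structure of such a fitting problem can be sketched in Python on a small one-dimensional example. Note the assumptions: this sketch does not reproduce Eq. (4). It swaps in a simplified variant that fits the inverse variances in precision space, where the problem is linear and can be solved by ordinary least squares in the Frobenius norm. The operator `T`, the synthetic target, and all names are hypothetical, and no positivity constraint is enforced, so this variant may return negative values for a target that the augmented diagonal form cannot represent.

```python
import numpy as np

def augmentation_operator(m, dx=1.0):
    """Stack identity, first-difference, and second-difference operators."""
    identity = np.eye(m)
    d1 = (np.eye(m, k=1) - np.eye(m))[:-1] / dx
    d2 = (np.eye(m, k=1) - 2.0 * np.eye(m) + np.eye(m, k=-1))[1:-1] / dx**2
    return np.vstack([identity, d1, d2])

def fit_inverse_variances(T, precision_target):
    """Least-squares fit of w such that T^T diag(w) T approximates the target precision."""
    # Each column is the flattened rank-one contribution of one augmented observation.
    A = np.stack([np.outer(t, t).ravel() for t in T], axis=1)
    w, *_ = np.linalg.lstsq(A, precision_target.ravel(), rcond=None)
    return w

rng = np.random.default_rng(2)
m = 12
T = augmentation_operator(m)

# Synthetic target covariance that is exactly representable (hypothetical values).
w_true = rng.uniform(0.5, 2.0, size=T.shape[0])
R_target = np.linalg.inv(T.T @ (w_true[:, None] * T))

w_fit = fit_inverse_variances(T, np.linalg.inv(R_target))
R_fit = np.linalg.inv(T.T @ (w_fit[:, None] * T))
misfit = np.linalg.norm(R_fit - R_target, ord="fro")   # Frobenius-norm misfit
```

The fitted augmented variances are `1.0 / w_fit`. For a target estimated from a finite sample, a constrained or iterative solver working directly on the covariance misfit would be the more natural choice.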

Identified diagonal blocks that compose

Figure 5 shows the covariance fields from the reference, SWOT simulator–based covariance matrix, and from the identified covariance matrix, with respect to the same grid point. The identified matrix appears as a reliable approximation of the SWOT error covariance matrix, especially for the near field. Some small differences are observed for distances larger than 500 km. This is not a big issue, since in general the analysis step conducted by prediction centers is performed with a typical domain localization radius of a few hundred kilometers (e.g., Oke et al. 2008), which makes the impact of these distant covariances negligible.

Covariances calculated for a point at the outer boundary of the left swath: (left) calculated from the SWOT simulator–based covariance matrix

Finally, the spectral characteristics of the covariance matrices are analyzed. Figure 6 shows the singular values of both matrices, the one issued from the SWOT simulator and the matrix identified by our method in the original observation space. It is seen that the spectra are very close for the first 2440 singular values, which is quite a good result since the estimated rank of

Logarithm of the singular values calculated from

In this section we have shown that with an extended observation space, obtained by adding successive derivatives of the original SWOT observations to the observation vector, it is possible to model a quite general covariance function. The benefit of using

## 5. Numerical experiments

### a. Experiment configuration

The Sea Box for Assimilation (SEABASS) configuration of the NEMO model (Madec 2008) at 1/12° resolution is used to simulate the true state and the background state. This configuration simulates an idealized double-gyre circulation forced by a stationary analytical zonal wind. For a more complete description of the SEABASS setup, we refer to Cosme et al. (2010) and Bouttier et al. (2012).

The matrix

The true state

An ensemble of 100 background states *α* is a factor used to reduce the spread of the background perturbations. Drawing the background states this way ensures that the background error *α* parameter is used to make the variance level of the background error closer to the variance of the observation error. This situation is expected in an assimilation system after it has reached its asymptotic error level and for which the model error is reasonably small compared to the uncertainties in the initial condition.

Because of the limited size of the ensemble used to construct the background covariance matrix, domain localization (Janjić et al. 2011) is used to cut off unrealistic long-range correlations due to sampling errors. In this method each model grid point is analyzed independently from the others, and only observations lying within a distance of 600 km from the analyzed point are taken into account. In addition, the inverse of the observation error covariance matrix is Schur multiplied by a Gaussian function to ensure that the observations that are closer to the analyzed point have greater weights.
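For a diagonal observation error covariance matrix, this localization amounts to tapering the inverse variances of the observations around each analyzed grid point. The sketch below illustrates this under stated assumptions: the 600-km cutoff comes from the text, while the Gaussian length scale `length_km`, the function name, and the sample values are hypothetical.

```python
import numpy as np

def localized_inverse_variances(dist_km, r_var, radius_km=600.0, length_km=200.0):
    """Gaussian-tapered inverse error variances; observations beyond radius_km get zero weight."""
    taper = np.exp(-0.5 * (dist_km / length_km) ** 2)
    taper = np.where(dist_km <= radius_km, taper, 0.0)
    return taper / r_var

# Distances from one analyzed grid point to four observations (hypothetical values).
dist_km = np.array([0.0, 300.0, 600.0, 601.0])
r_var = np.full(4, 4.0)                   # hypothetical error variances
w_local = localized_inverse_variances(dist_km, r_var)
```

Because the taper acts element by element, it preserves the diagonal structure of the matrix and therefore the linear cost of the analysis.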

An experiment is defined by a set of 100 Kalman filter analyses performed using the 100 background states. Three experiments using different approximations of

For the analysis we use the local ensemble transform Kalman filter (LETKF). For more details about the algorithm used in the analysis step, we refer to Candille et al. (2015).

### b. Data assimilation results

This section aims to demonstrate the validity of the approach presented in section 4 through its application in a data assimilation analysis step. The results are presented in terms of the root-mean-square error (rmse), for which the mean is taken over the ensemble of analyses, and two aspects are analyzed: (i) which approximation of

When

(left) The rmse calculated from the ensemble of analyses. (right) Error standard deviation calculated by the filter, resulting from the experiment that uses *V*, surface zonal velocity *U*, and surface temperature.

When

As in Fig. 7, but for results with

Using

As in Fig. 7, but for results with

## 6. Discussion and concluding remarks

This article explored the idea of extending the observation vector with a simple local transformation of the original observation vector for the assimilation of these observations with a least squares–based data assimilation method such as the Kalman filter. The technique is designed to simulate spatial correlations present in the observation errors. The chosen transformation is composed of the identity matrix and the first and second derivatives of the original observations. A diagonal matrix, whose entries are obtained through the solution of an optimization problem, is associated with this new set of observations.

The results show that the linear transformation introduced here is appropriate for the assimilation of SWOT-like observations at a 9-km resolution. The effectiveness of the proposed methodology was analyzed in terms of (i) the rmse values of Kalman filter analyses; and (ii) the consistency between the error standard deviation of the Kalman filter analysis, computed from the analysis covariance matrix, and the error standard deviation of an ensemble of Kalman filter analyses. Both metrics are improved when observation error covariances are accounted for through the observation vector transformation. Ignoring the covariances leads to an inconsistency between the true error statistics and the error statistics produced by the Kalman filter. Compensating for this omission by inflating the observation error variances leads to better consistency but a reduced impact in terms of rmse values.

Again, the technique presented in this work results in the use of a diagonal form of the observation error covariance matrix. Therefore, it enables the assimilation of dense and structured satellite observations of the atmosphere or the ocean, such as sea surface temperature or ocean color. Here, the focus was on the future wide-swath altimetry mission SWOT. Regarding SWOT, we face two specific challenges: (i) transposing the technique to SWOT observations at their nominal resolution of 1 km, and (ii) operating the technique in a full data assimilation system. At the 1-km resolution, the data are heavily affected by the spatially uncorrelated noise from the KaRIn instrument, which makes the application of the technique more difficult. To address the first point, we plan to investigate the use of advanced image restoration techniques to filter out KaRIn noise and make the method applicable to SWOT data at the kilometer resolution. The second point is challenging for two reasons: first, because it will be difficult to implement advanced data assimilation techniques with models at SWOT-like resolutions of a few kilometers in the short and medium term; second, because the simultaneous assimilation of SWOT with other observations of different resolutions (conventional altimetry, typically), a practice commonly known as multiscale data assimilation (e.g., Fieguth et al. 1995), is challenging in itself. These challenges will be addressed in future works.

## Acknowledgments

The research presented in the paper was supported by the Centre National d’Études Spatiales (CNES) and the Seventh Framework Programme FP7/2007–2013 of the European Commission through the Stochastic Assimilation for the Next Generation Ocean Model Applications (SANGOMA) project (Grant Agreement 283580). Computations were carried out using high performance computing resources from Grand équipement national de calcul intensif–Institut du développement et des ressources en informatique scientifique (GENCI-IDRIS; Grant 2014-0111279).

## REFERENCES

Bouttier, P.-A., Blayo E., Brankart J. M., Brasseur P., Cosme E., Verron J., and Vidard A., 2012: Toward a data assimilation system for NEMO. Mercator Ocean Quarterly Newsletter, No. 46, Ramonville Saint-Agne France, 24–30.

Brankart, J., Ubelmann C., Testut C., Cosme E., Brasseur P., and Verron J., 2009: Efficient parameterization of the observation error covariance matrix for square root or ensemble Kalman filters: Application to ocean altimetry. *Mon. Wea. Rev.*, **137**, 1908–1927, doi:10.1175/2008MWR2693.1.

Brankart, J., Cosme E., Testut C., Brasseur P., and Verron J., 2010: Efficient adaptive error parameterizations for square root or ensemble Kalman filters: Application to the control of ocean mesoscale signals. *Mon. Wea. Rev.*, **138**, 932–950, doi:10.1175/2009MWR3085.1.

Candille, G., Brankart J. M., and Brasseur P., 2015: Assessment of an ensemble system that assimilates *Jason-1*/*Envisat* altimeter data in a probabilistic model of the North Atlantic Ocean circulation. *Ocean Sci.*, **11**, 2647–2690, doi:10.5194/os-11-425-2015.

Cosme, E., Brankart J.-M., Verron J., Brasseur P., and Krysta M., 2010: Implementation of a reduced rank square-root smoother for high resolution ocean data assimilation. *Ocean Modell.*, **33**, 87–100, doi:10.1016/j.ocemod.2009.12.004.

Durand, M., Fu L.-L., Lettenmaier D., Alsdorf D., Rodriguez E., and Esteban-Fernandez D., 2010: The Surface Water and Ocean Topography mission: Observing terrestrial surface water and oceanic submesoscale eddies. *Proc. IEEE*, **98**, 766–779, doi:10.1109/JPROC.2010.2043031.

Esteban-Fernandez, D., 2013: SWOT project: Mission performance and error budget. Revision A, NASA/JPL Tech. Rep. JPL D-79084, 83 pp. [Available online at http://swot.jpl.nasa.gov/files/SWOT_D-79084_v5h6_SDT.pdf.]

Fieguth, P. W., Karl W. C., Willsky A. S., and Wunsch C., 1995: Multiresolution optimal interpolation and statistical analysis of TOPEX/Poseidon satellite altimetry. *IEEE Trans. Geosci. Remote Sens.*, **33**, 280–292, doi:10.1109/36.377928.

Fu, L.-L., Alsdorf D., Morrow R., Rodriguez E., and Mognard N., Eds., 2012: SWOT: The Surface Water and Ocean Topography mission; wide-swath altimetric measurement of water elevation on Earth. Jet Propulsion Laboratory Publ. 12-05, 228 pp. [Available online at http://trs-new.jpl.nasa.gov/dspace/bitstream/2014/41996/3/JPL%20Pub%2012-5.pdf.]

Gaultier, L., Ubelmann C., and Fu L.-L., 2016: The challenge of using future SWOT data for oceanic field reconstruction. *J. Atmos. Oceanic Technol.*, **33**, 119–126, doi:10.1175/JTECH-D-15-0160.1.

Janjić, T., Nerger L., Albertella A., Schröter J., and Skachko S., 2011: On domain localization in ensemble-based Kalman filter algorithms. *Mon. Wea. Rev.*, **139**, 2046–2060, doi:10.1175/2011MWR3552.1.

Järvinen, H., and Undén P., 1997: Observation screening and background quality control in the ECMWF 3D-Var data assimilation system. ECMWF Tech. Memo. 236, 33 pp.

Li, H., Kalnay E., and Miyoshi T., 2009: Simultaneous estimation of covariance inflation and observation errors within an ensemble Kalman filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 523–533, doi:10.1002/qj.371.

Liu, Z.-Q., and Rabier F., 2002: The interaction between model resolution, observation resolution and observation density in data assimilation: A one-dimensional study. *Quart. J. Roy. Meteor. Soc.*, **128B**, 1367–1386, doi:10.1256/003590002320373337.

Madec, G., 2008: NEMO ocean engine. IPSL Note du Pôle de modélisation de l’Institut Pierre-Simon Laplace 27, 209 pp. [Available online at http://www.nemo-ocean.eu/content/download/5302/31828/file/NEMO_book.pdf.]

Miyoshi, T., Kalnay E., and Li H., 2013: Estimating and including observation-error correlations in data assimilation. *Inverse Probl. Sci. Eng.*, **21**, 387–398, doi:10.1080/17415977.2012.712527.

Oke, P. R., Brassington G. B., Griffin D. A., and Schiller A., 2008: The Bluelink Ocean Data Assimilation System (BODAS). *Ocean Modell.*, **21**, 46–70, doi:10.1016/j.ocemod.2007.11.002.

Stewart, L. M., Dance S. L., and Nichols N. K., 2008: Correlated observation errors in data assimilation. *Int. J. Numer. Methods Fluids*, **56**, 1521–1527, doi:10.1002/fld.1636.

Stewart, L. M., Dance S. L., and Nichols N. K., 2013: Data assimilation with correlated observation errors: Experiments with a 1-D shallow water model. *Tellus*, **65A**, 19546, doi:10.3402/tellusa.v65i0.19546.

Ubelmann, C., Gaultier L., and Fu L.-L., 2016: SWOT simulator for ocean science. Accessed 30 November 2016. [Available online at https://github.com/SWOTsimulator/swotsimulator/blob/master/doc/source/science.rst.]

Waller, J. A., Dance S. L., Lawless A. S., and Nichols N. K., 2014: Estimating correlated observation error statistics using an ensemble transform Kalman filter. *Tellus*, **66A**, 23294, doi:10.3402/tellusa.v66.23294.