## Abstract

Argo floats have significantly improved the observation of the global ocean interior, but as the size of the database increases, so does the need for efficient tools to perform reliable quality control. It is shown here how the classical method of optimal analysis can be used to validate very large datasets before operational or scientific use. The analysis system employed is the one implemented at the Coriolis data center to produce the weekly fields of temperature and salinity, and the key data are the analysis residuals. The impacts of the various sensor errors are evaluated, and twin experiments are performed to measure the system's capacity to identify these errors. It appears that for a typical data distribution, the analysis residuals extract two-thirds of the sensor error after a single analysis. The method was applied to the full Argo Atlantic real-time dataset for the 2000–04 period (482 floats), and 15% of the floats were detected as having salinity drifts or offsets. A second test was performed on the delayed-mode dataset (120 floats) to check its overall consistency; except for a few isolated anomalous profiles, the corrected datasets were found to be of good overall quality. The last experiment, performed on the Coriolis real-time products, takes into account the recently discovered problem in the pressure labeling. For this experiment, a sample of 36 floats of the 2003–06 period, mixing well-behaved and anomalous instruments, was considered, and a simple test designed to detect the most common systematic anomalies successfully identified the deficient floats.

## 1. Introduction

The Argo program, with a fleet of more than 3000 floats deployed over the World Ocean, delivers to the scientific community a tremendous real-time dataset on the temperature and salinity properties of the upper 2000 m. These data are now widely distributed geographically and uniform across seasons; they constitute the in situ counterpart of the global observation of the ocean surface by satellite altimetry and temperature remote sensing.

The Argo data assembly centers (DACs) collect the float data in real time and apply a series of standard automatic quality control (QC) tests defined by an international data management group (Argo Data Management 2005) to set the QC flag values. These data are then transmitted to the global data assembly centers (GDACs) in charge of the distribution to the Argo users and to the Global Telecommunication System (GTS). A second level of processing, called “delayed mode,” is performed by the principal investigators (PIs). At the present time, three levels of corrections are defined and applied when appropriate: 1) thermal mass correction (Johnson et al. 2007), 2) pressure adjustment if not done in situ, and 3) salinity correction. For this last correction, the method recommended by Argo (Argo Data Management 2005) is derived from the Wong et al. (2003) method, later adapted by Boehme and Send (2005) to the North Atlantic Ocean.

The Argo regional centers (ARCs) have been designed to collect the delayed-mode datasets relevant to their area and check their overall consistency. These centers are still in development and, in the meantime, data downloaded from the Argo data centers need complementary validation. The real-time datasets contain a percentage of erroneous data that escaped the automatic quality controls. The delayed-mode datasets are of higher quality, but the correction applied to the salinity depends on the reference database used and on a subjective evaluation by the PI. Thus, scientists as well as operational systems are faced with the problem of validating very large datasets. Undetected biases, in particular, may have dramatic consequences when analyzing long-term changes.

We show here how the in situ analysis system (ISAS) run by Coriolis (one of the Argo GDACs) to produce the real-time weekly analysis of temperature and salinity (see www.coriolis.eu.org) can be used to perform the validation. ISAS uses estimation theory to combine information from previous knowledge of the ocean with all synoptic measurements, taking advantage of the relatively dense Argo coverage and of any other measurements. We propose here to use the analysis residuals to detect systematic errors. To be consistent with the hypothesis implicit in the method, the distribution of the residuals should not differ too much from a Gaussian with zero mean and a standard deviation similar to the a priori error. Any bias or trend in the residuals would indicate a sensor offset or drift. The analysis of their statistical and long-term behavior thus provides a method to check the consistency of each measurement a) with the nearby measurements in time and space, b) with the climatology, and c) with the a priori statistics expressed in the covariance scales and variance amplitude.

The analysis method is presented in section 2. In section 3 the effect of sensor drift on the data is simulated and the sensitivity of the analysis residuals is explored. In section 4 the method is applied to the Atlantic 2000–04 real-time dataset to perform global detection of sensor drifts and offsets. Section 5 deals with the consistency check of the first delayed-mode Argo dataset over the Atlantic. Finally, section 6 proposes systematic tests, taking into account the pressure problem recently identified in the Sounding Oceanographic Lagrangian Observer (SOLO) floats, and applies them to a subset of floats using the real-time analysis results made available by Coriolis.

## 2. The in situ analysis system

The in situ analysis system implemented at the Coriolis data center produces gridded fields of temperature and salinity based on all data transmitted to the center at the date of analysis. Until the beginning of 2005, the analyses were limited to the Atlantic (Fig. 1); later they were extended to the global ocean. ISAS is univariate, which means that temperature and salinity are estimated independently. It is based on optimal interpolation; the estimated quantity is the anomaly in depth levels relative to a reference monthly climatology. The analyzed temperature and salinity fields are obtained by adding the estimated anomaly to the reference climatology. Analyzing temperature and salinity separately may in some cases create anomalous water masses (Lozier et al. 1994). Although we cannot totally rule out this effect, we have checked the analyzed temperature/salinity diagrams in the most sensitive areas (fronts) and did not detect such anomalous water masses. We attribute this to the fact that we are analyzing anomalies and also that, thanks to Argo, the temperature and salinity sampling are similar.

### a. Optimal interpolation method

ISAS uses estimation theory to map a scalar field on a regular grid from sparse and irregular data [see Liebelt (1967) for the basic theory and Bretherton et al. (1976) and Kaplan et al. (1997) for applications to the ocean]. The interpolated field, represented by the state vector **x**, is constructed as the anomaly relative to a reference field at the grid points **x**^{f}. This reference is derived from previous knowledge (climatology or forecast). Only the unpredicted part of the observation vector **y**^{o} is used: the innovation **d**, i.e., the anomaly relative to the reference field at the data points **y**^{f}. The analyzed field **x**^{a} is obtained as a linear combination of the innovation and is associated with a covariance matrix 𝗣^{a}. The error on the estimation is given by the diagonal of this matrix:
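In standard optimal-interpolation form (a reconstruction from the surrounding definitions, not the paper's original display equations), these relations read:

```latex
\mathbf{x}^{a} = \mathbf{x}^{f} + \mathsf{K}^{OI}\,\mathbf{d},
\qquad
\mathsf{K}^{OI} = \mathsf{C}_{ao}\left(\mathsf{C}_{oo} + \mathsf{R}\right)^{-1},
\qquad
\mathsf{P}^{a} = \mathsf{P} - \mathsf{C}_{ao}\left(\mathsf{C}_{oo} + \mathsf{R}\right)^{-1}\mathsf{C}_{ao}^{T},
```

with **d** = **y**^{o} − **y**^{f} the innovation.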

In the objective analysis formalism, the 𝗞^{OI} matrix is built from the matrices that express the covariances of the field, from grid point to data point (𝗖_{ao}) and from data point to data point (𝗖_{oo}) and the observation noise covariance matrix 𝗥; 𝗣 is the covariance of the field at grid points. This solution makes implicit use of an observation matrix 𝗛, such that **y**^{o} = 𝗛𝘅 + **ε**, which by analogy with the Ide et al. (1997) formalism can be expressed as 𝗛^{T} = 𝗣^{−1}𝗖_{ao}.

It should be noticed that this formalism provides at the same time an estimate of the misfit *δ* between observations and analysis, also called analysis residuals:
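With the same notation (again a standard reconstruction consistent with the gain and observation matrix defined above), the residuals follow directly from the analysis:

```latex
\boldsymbol{\delta}
= \mathbf{y}^{o} - \mathsf{H}\,\mathbf{x}^{a}
= \mathbf{d} - \mathsf{H}\,\mathsf{K}^{OI}\,\mathbf{d}
= \left(\mathsf{I} - \mathsf{H}\,\mathsf{K}^{OI}\right)\mathbf{d}.
```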

We will base our study on the analysis of these residuals to detect erroneous data: outliers, biases, or drifts. The advantage of such a method is that the residuals are computed with the correct mapping matrix 𝗛. Moreover, it is not necessary to perform an analysis at each data point to obtain the residuals: they are obtained at once for the whole dataset. ISAS thus appears to be an efficient tool for dealing with a large and diverse database.
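The "all residuals at once" property can be sketched with a toy one-dimensional example; the Gaussian covariance form matches the structure functions used later in section 2b, but all positions, scales, and variances here are illustrative, not the ISAS settings:

```python
import numpy as np

# Toy 1-D illustration of the residual computation described above:
# with data-data covariance C_oo and diagonal noise covariance R, the
# residuals for the whole dataset come from a single linear solve,
#   delta = d - C_oo (C_oo + R)^{-1} d = R (C_oo + R)^{-1} d,
# so no grid-point analysis is needed. All numbers are illustrative.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1000.0, size=40)     # data positions (km)
L = 300.0                                 # Gaussian e-folding scale (km)
sig2, noise2 = 1.0, 0.25                  # signal and noise variances

C_oo = sig2 * np.exp(-((x[:, None] - x[None, :]) / L) ** 2)
R = noise2 * np.eye(x.size)
d = rng.standard_normal(x.size)           # innovation (data minus reference)

delta = d - C_oo @ np.linalg.solve(C_oo + R, d)   # residuals, all at once
```

An erroneous datum inflates its own residual relative to its neighbors; in the sections that follow, it is the per-float time series of such residuals that is inspected.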

### b. Implementation

#### 1) Dataset

Most of the data used here come from Argo floats that provide temperature and salinity in the upper 2000 m with a nominal accuracy of 0.01°C and 0.01 (PSS-78 scale), respectively. The second most numerous data come from expendable bathythermographs (XBTs), which measure temperature to within 0.03–0.1°C. A few expendable conductivity–temperature–depth (XCTD) instruments provide salinity measurements to within 0.03–0.1, and some conductivity–temperature–depth (CTD) instruments give better accuracy (on the order of 0.001°C for temperature and 0.001 for salinity after calibration). Finally, some moorings, most of them located along the equator, transmit high-frequency temperature measurements at a few levels with an accuracy similar to that of the Argo floats. The transmission system imposes another limitation on the measurement accuracy. The data collected directly by the Argo DACs are transmitted with full resolution. Data transmitted through the GTS (WMO 2004) have been truncated: temperature, salinity, and current (TESAC)-type data report two decimal places and BATHY-type data only one. We have used only the data considered “good” (1) or “probably good” (2). The core of the dataset is the raw (uncorrected) data flagged as 1 or 2 by the Argo automatic QC. In section 5, delayed-mode (corrected) data from Argo floats, flagged 1 or 2 by the PI, are also taken into account.

Prior to the analysis, the raw and delayed-mode data files are converted into standard data files. Pressure is converted to depth using the local density profile given by the climatology, and then all profiles are interpolated on standard levels. We defined 250 levels between 0 and 2200 m; the vertical spacing is 5 m down to 400 m, 10 m down to 2000 m, and 20 m below. The error introduced by the interpolation is taken into account by multiplying the measurement error by a factor increasing from 1 to 2 as the distance between measurement and interpolation levels increases. Only 59 of these levels spread over the depth range are analyzed. Finally, to avoid oversampling by the high-frequency profiles and ill conditioning of the matrix to be inverted, data obtained from the same platform (moorings in particular) within less than 24 h and in a range of 24 km (about half the grid resolution) are averaged.
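The vertical grid described above can be reconstructed as follows. The exact segment endpoints are assumptions (with the choices below the grid has 251 levels, close to the roughly 250 quoted), and the linear ramp is one plausible reading of the error factor increasing from 1 to 2:

```python
import numpy as np

# Assumed reconstruction of the standard-level grid: 5 m spacing down to
# 400 m, 10 m down to 2000 m, 20 m below, spanning 0-2200 m.
levels = np.concatenate([
    np.arange(0, 400, 5),       # 5 m spacing above 400 m
    np.arange(400, 2000, 10),   # 10 m spacing between 400 and 2000 m
    np.arange(2000, 2201, 20),  # 20 m spacing below 2000 m
])                              # 251 levels with these (assumed) endpoints

def interp_error_factor(distance_m, half_spacing_m):
    """Assumed linear ramp: the measurement error is multiplied by a factor
    growing from 1 (measurement on a level) to 2 (half a grid cell away)."""
    return 1.0 + min(distance_m / half_spacing_m, 1.0)
```

For example, with 5 m spacing a measurement 2.5 m from the nearest level would see its error doubled under this assumed ramp.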

#### 2) Configurations

The work presented in sections 3 to 5 is based on the Atlantic configuration on a grid with 1/3° resolution (Mercator, varying as the cosine of latitude). The reference climatology is derived from the Reynaud et al. (1998) Atlantic seasonal climatology defined on a 1° square grid with 35 levels between 0 and 2300 m. This seasonal climatology uses bottles and CTD data from the twentieth century and is heavily weighted toward the period 1960–90. It has been interpolated horizontally and vertically on our analysis grid. The monthly fields are obtained by linear interpolation of the seasonal fields.

The analysis CORAAT-01 presented in section 4 is based on the Atlantic data from the period 2000–04. It uses all real-time data (flags 1 and 2) present in the database at the beginning of 2005. This analysis aims at detecting errors missed by the real-time QC, such as slow salinity drifts, before performing a scientific study of the variability over this period.

The consistency of delayed-mode corrected datasets is explored with CORAAT-03 analysis. In that case, real-time and delayed-mode data flagged 1 and 2 are used. In case of duplicates, the delayed-mode profile replaces the real-time one. This analysis spans a longer time period (2000–05) but the domain is restricted to the area 20°–70°N, where a large number of delayed-mode data were available at the time of the analysis. More data are taken into account in CORAAT-03 than in CORAAT-01 because some data with a real-time QC flag 3 were corrected and are now available in delayed mode with a QC flag set to 1 or 2. CORAAT-03 is presented in section 5.

The results presented in section 6 are deduced from the real-time analysis products made available by Coriolis. This analysis, called CORTGL01, covers the global ocean with 1/2° resolution. The reference climatology is based on the *World Ocean Atlas 1998* (Antonov et al. 1998; Boyer et al. 1998).

#### 3) A priori statistics

Statistical information on the field and data noise is introduced through the covariance matrices (𝗣, 𝗖_{ao}, and 𝗖_{oo}) that appear in Eq. (2). We assume that the covariances of the analyzed field can be specified by a structure function modeled as the sum of two Gaussian functions, each associated with specific time and space scales:

*C*(*dx*, *dy*, *dt*) = Σ_{*i*=1,2} *σ*_{i}^{2} exp[−(*dx*/*L*_{ix})^{2} − (*dy*/*L*_{iy})^{2} − (*dt*/*L*_{iT})^{2}],

where *dx*, *dy*, and *dt* are the space and time separations and *L*_{ix}, *L*_{iy}, and *L*_{iT} the corresponding *e*-folding scales. The first scale length is assumed to be isotropic and equal to 300 km, the target Argo resolution. The corresponding time scale *L*_{1T} is set to 3 weeks. The second length is set equal to 4 times the average Rossby radius of the area, as computed from the annual climatology. In the equatorial band, this value is bounded by the large-scale length in the zonal direction and by the length scale of the adjacent zones in the meridional direction, introducing some anisotropy in this region. At high latitudes, it is bounded by the resolution of the estimation grid. The time scale *L*_{2T} is set to 1 week. It should be said that the choices for the time and space scales result from a compromise between what is known of ocean time and space scales and what can actually be resolved with the Argo array (one profiler every 3°, every 10 days).

The variances *σ*_{i}^{2} control the weight given to each ocean scale. The total variance of the anomaly is deduced from the 2000–04 database. To reach statistical significance, which requires a large number of data representing all types of anomalies, the basin has been divided into areas (approximately 5° latitude by 10° to 40° longitude) over which the statistics are assumed to be homogeneous. For each area, and at each standard level, the total variance is computed as the variance of the anomaly relative to the monthly reference field. It is estimated as being the sum of four terms:

*σ*_{tot}^{2} = *σ*_{1}^{2} + *σ*_{2}^{2} + *σ*_{UR}^{2} + *σ*_{ME}^{2}.

Here, *σ*_{1}^{2} and *σ*_{2}^{2} are the variances of the two Gaussian terms of the structure function, and the remaining sum *σ*_{UR}^{2} + *σ*_{ME}^{2} is the total error variance: *σ*_{ME}^{2} corresponds to the instrumental errors and *σ*_{UR}^{2} represents small scales unresolved by the analysis and considered as noise, sometimes called representativity errors. A single *σ*_{ME}^{2} profile has been computed from the measurement errors of the standard database and subtracted from the total variance *σ*_{tot}^{2} to obtain the ocean variance *σ*^{2} (the first three terms of the sum). The ocean variance is adjusted to remain larger than *σ*_{ME}^{2}. To take into account the fact that the dataset might not be sufficient to describe the variability, the variance obtained has been multiplied by a factor of 1.2. Examples of the final profiles of *σ*_{ME}^{2} and ocean variance *σ*^{2} are given in Fig. 2 for the area centered on 55°N, 25°W. We express the variance associated with each scale as a function of the ocean variance by introducing normalized weights:

*σ*_{1}^{2} = *w*_{1}*σ*^{2}, *σ*_{2}^{2} = *w*_{2}*σ*^{2}, *σ*_{UR}^{2} = *w*_{UR}*σ*^{2}, with *w*_{1} + *w*_{2} + *w*_{UR} = 1.

The free parameters of the system are the weights (*w*_{1}, *w*_{2}, *w*_{UR}) that define the distribution of variance over the different scales. The chosen values result from two considerations: first, statistical knowledge of ocean variability based on the space–time spectrum, and second, the limited resolution of the Argo array, which imposes some a priori filtering. A large part of the mesoscale field is thus rejected into the subgrid error. For this analysis, the weights are set to *w*_{1} = 1/6, *w*_{2} = 2/6, and *w*_{UR} = 3/6 over the whole domain. The error matrix 𝗥 combines the measurement error and the representativity error due to unresolved scales; it is assumed to be diagonal, although this is only a crude approximation because both errors are likely to be correlated for measurements obtained with the same instrument or within the same area and time period. In fact, the delayed-mode QC uses these correlations to diagnose the biases, as will be seen in the following section.

## 3. Identification of sensor errors in the residuals

### a. Simulating sensor errors

Argo floats are equipped with sensors that measure pressure, temperature, and conductivity. To better identify a possible drift in any of these sensors, errors in each measured parameter are simulated to compute the impact on the final data output. The most common measurement error observed is a depth-independent negative drift in salinity, likely due to an error in the conductivity measurement. On Sea-Bird sensors it can be explained by biological fouling, which changes the effective volume of the conductivity cell. The simulated errors are a 0.05% increase in the conductivity, a temperature offset of +0.02°C, and a pressure offset of +50 dbar. These errors were added separately to high-quality CTD data from the Observatoire de la variabilité interannuelle et décennale en Atlantique Nord (OVIDE; Lherminier et al. 2007) repeat section performed in the subpolar gyre as a contribution to the Climate Variability and Predictability Project (CLIVAR). To simulate the typical Argo float sampling, the original conductivity and temperature profiles sampled at 1 dbar were averaged over standard levels.

The results are illustrated (Fig. 3) by the temperature and salinity profiles of OVIDE station 57 (measured on 20 June 2004 near 55°N, 26°W). The 0.05% increase in conductivity leads to a nearly constant +0.039 salinity error, whereas the opposite effect is obtained with the +0.02°C temperature offset, resulting in a total salinity error of −0.041. The same conductivity, temperature, and pressure perturbations applied on all profiles of the section give similar errors. The major effect of the pressure sensor offset is the apparent shallowing of isotherms and isohalines, which introduces an error proportional to the vertical gradient (Fig. 3). This error is superimposed on the nearly constant error introduced by the pressure error in the equation relating conductivity to salinity. A pressure error will be more easily identified on temperature because of the stronger temperature gradients. Scatterplots of temperature error as a function of the vertical gradient of temperature will show strong correlation.
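The proportionality between the pressure-induced error and the vertical gradient can be checked on a synthetic profile; an exponential thermocline stands in for the OVIDE data here, and all numbers are illustrative:

```python
import numpy as np

# A +50 dbar pressure offset means the float samples at z + dp while
# reporting depth z, so the apparent temperature error is roughly
# dp * dT/dz (negative where temperature decreases with depth).
# Synthetic exponential thermocline; depth in m (~dbar), temperature in degC.
z = np.arange(0.0, 2000.0, 10.0)
T = 4.0 + 16.0 * np.exp(-z / 500.0)

dp = 50.0                                  # pressure offset (~50 m)
T_offset = np.interp(z + dp, z, T)         # value sampled at z + dp, labeled z
err = T_offset - T                         # apparent temperature error
approx = dp * np.gradient(T, z)            # first-order prediction dp * dT/dz

valid = z + dp <= z[-1]                    # drop points pushed off the profile
```

On this profile `err[valid]` is negative everywhere (isotherms appear to shallow) and closely tracks `approx[valid]`, which is the correlation with the vertical gradient exploited by the scatterplot diagnostic.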

### b. Sensitivity test 1: Single profile

Sensitivity tests were performed with simulated salinity offsets, as produced by an error factor on conductivity. The temperature and salinity fields over the North Atlantic on 20 June 2004 that include the OVIDE CTD section were selected to perform twin experiments to evaluate how errors in the sensors can be identified with the analysis system. The reference experiment was performed with Coriolis data flagged 1 and 2 by the automatic QC between 31 May and 20 June, as would be done for a real-time analysis. The perturbed experiment is based on the same dataset except for one test profile (OVIDE station 57). To test the limits of detectability, a small 0.02 salinity offset was added to the original salinity data.

The innovation vector (anomaly relative to the monthly climatology) is consistent with the a priori information (Fig. 4). It lies within the a priori standard deviation of the signal, except for a layer between 250 and 350 m. This layer corresponds to a strong vertical gradient of salinity where moderate isopycnal displacement is likely to produce strong anomalies. The residuals also fit within the a priori error and do not show strong bias. The correlation of the residuals with the innovation profile reflects some lack of resolution of the system due to limited data coverage. The residuals obtained with the perturbed experiment are similar but show a depth-independent increase of 0.013, indicating that 66% of the error is correctly resolved (not shown).

### c. Sensitivity test 2: Time series

A single erroneous profile will be detected only in cases in which the offset is significantly larger than the a priori error. In general, the statistical behavior of the residuals from a particular instrument must be used to discriminate between unresolved ocean variability and real error. In this second sensitivity test, we will follow a specific float over 3 yr. The reference experiment is a series of weekly analyses of the North Atlantic between 3 February 2002 and 29 December 2004. Float 6900177, which drifted around 48°N, 20°W and showed no clear sensor drift or offset, was selected as a test case for a second analysis in which the salinity from this float was perturbed by a constant −0.04 offset. When comparing the scatterplots of temperature and salinity residuals obtained for this float in the reference and perturbed analysis, the salinity shift is clearly seen at depths where the ocean variability is smaller (Fig. 5). The mean value of the salinity residuals calculated for the layer 700–2000 m is −0.022. The process has been iterated by correcting the float with the estimation of the offset before performing a new analysis, and the correct offset was recovered to better than 0.01 after two iterations.
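The iteration used above can be sketched abstractly. The 2/3 recovery fraction is taken from the single-profile twin experiment of section 3b, and the loop below is a stand-in for re-running the full analysis after each correction, not an implementation of ISAS:

```python
# Sketch of the iterative offset recovery of section 3c. Each "analysis"
# is replaced by the stand-in assumption that the deep-layer mean residual
# captures about 2/3 of the offset still present in the data.
RECOVERY = 2.0 / 3.0      # fraction of the sensor error seen in the residuals
true_offset = -0.04       # simulated salinity offset (PSS-78)

correction = 0.0
remaining = true_offset
for _ in range(2):                           # two analysis passes
    mean_deep_residual = RECOVERY * remaining
    correction += mean_deep_residual         # correct the float's data
    remaining = true_offset - correction     # offset left after correction
```

Under these assumptions the estimate lands within 0.01 of the true offset after two passes, consistent with the convergence reported above.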

## 4. Validating the Atlantic 2000–04 real-time dataset

### a. Salinity offset and drift detection

The global analysis of the Argo floats in the Atlantic for 2000–04 has been performed by analyzing the residuals produced by the CORAAT-01 reanalysis. The time series of residuals associated with floats that transmitted more than 16 temperature and salinity profiles were reviewed. A total of 482 floats, most of them located in the Northern Hemisphere, have thus been processed (Fig. 6). Residuals represent the misfit between nearby data or a discrepancy between the data and reference climatology that is larger than specified by the a priori data–data or data–field covariances. In some poorly sampled regions of the world, the departure from climatology, represented by the a priori field variances, is underestimated or the climatology does not represent the mean for the time of the analysis. In that case, the residuals are expected to be geographically correlated because we expect climatological changes to be large scale, and the residuals from the same area will tend to have the same value over time. Errors due to sensor problems behave differently because they tend to be correlated along the sensor trajectory over its lifetime. To evaluate the bias due to the climatology and a priori variances, we averaged the residuals in time over 3° squares, keeping only the squares with more than 16 profiles. We observe a large-scale structure with negative anomalies over most of the North Atlantic in the map of the mean residuals for the layer 1000–1600 m (Fig. 7). Although instrumental bias cannot be excluded, it might correspond to the decadal freshening of the deep ocean properties noticed by Dickson et al. (2002). The bias remains rather weak, however, barely exceeding −0.01, and thus should not mask the instrumental drift.

For each float the trajectory, the residuals diagram, and the residuals time series have been reviewed. The anomalous behaviors were sorted into three categories: offset, drift, and combined offset plus drift. It appears that despite the problems in the conductivity sensors that occurred on some floats during the early years, the Argo fleet behaved well (Table 1). Only 70 floats (15%) were flagged as anomalous, with 27 of them showing markedly strong drift or offset. Most of the time, large drifts are positive whereas large offsets are negative. In addition to these 70 bad floats, 43 floats showed anomalies that were not clearly offset or drift. After this first pass, we have not been able to attribute these anomalies to sensor problems because they could also be due to strong mesoscale variability or changes in the float position relative to a front.

At the beginning of 2007, a problem was identified in the pressure labeling of some types of SOLO floats (Schiermeier 2007). Most of these floats were deployed in 2005–06, but 76 of them, now clearly identified, were part of our dataset. It is interesting to see in hindsight how these data were qualified in the analysis described above. It must first be noticed that these floats represent only a small percentage of the total number of profiles because they were launched only at the end of the period. Moreover, because the focus was on deep salinity residuals and many of them did not go deeper than 1000 m, they received little attention. Nevertheless, the outcome of the screening is as follows: 30 floats were considered good, 13 were detected as having a salinity drift or bias, 4 displayed excessive spikes, and 29 showed no clear bias or drift but had anomalous variability. Many of these 29 floats were among the 43 ambiguous floats mentioned in the previous paragraph. Taken alone, the SOLO floats thus show a 61% anomaly rate; but because the screening focused on the salinity drift problem and the float type was not considered, the pressure problem was not identified at that time.

### b. Comparison with Argo sensor drift estimate

The method based on residual diagnostics as presented here is not meant to replace the Argo correction method because it applies only to the floats showing no or moderate drift and does not propose a correction. It is interesting, however, to compare both methods on a test case. Profiler 4900214, which traveled off Newfoundland for more than 2 yr, started to drift at the end of its first year. The salinity residuals were averaged over the layer 700–2000 m (the depth range used by the Argo correction method) and compared with the salinity anomalies estimated by the Argo method over the same layer (Fig. 8). The regional negative bias due to the climatology is seen on ISAS residuals of the neighboring data over the whole period and in profiler 4900214 residuals until September 2003. The analysis residuals grow linearly from September 2003 to February 2004, until they become limited by the a priori error variance. The Argo delayed-mode processing anomalies show similar behavior except that they are not bounded, and the linear drift continues until the end of the period. The salinity correction proposed by the Argo method is obtained as a piecewise linear fit to this curve. We see here that ISAS cannot correctly estimate the sensor offset when it becomes larger than the a priori measurement error, defined for floats that perform correctly. The a priori measurement error—the sum of the instrumental and subgrid errors (*σ*_{UR}^{2} + *σ*_{ME}^{2})—normalizes the residuals in the cost function minimized by the optimal method; thus, the method strongly penalizes solutions that lead to normalized residuals larger than 1. On the other hand, ISAS appears to perform better than the Argo method in discriminating ocean mesoscale variability from sensor errors, and we note that short-timescale variability is smaller in ISAS residuals.

## 5. Consistency of Argo delayed-mode datasets

The consistency of the delayed-mode Argo datasets over the North Atlantic for the period 2000–05 is now considered by analyzing the residuals produced by the CORAAT-03 reanalysis. The time series of residuals associated with floats that transmitted more than 16 profiles and containing delayed-mode salinity data were studied. A total of 120 floats, leading to 8847 profiles, were retained. Because the salinity shifts are more easily detected in layers where the ocean variability is low (section 3c) we focus the analysis on the time series of residuals averaged over the deep layers (700–1000, 1000–1600, and 1600–1950 m).

The signal we try to detect here is more subtle than in the previous sections. Large salinity drifts and offsets existing in the real-time data are already corrected. To allow the identification of suspicious cases while limiting the number of profiles and floats that need individual control, three criteria have been defined. The threshold values have been adjusted after a careful inspection of the residuals. Greater attention is thus dedicated to 1) floats for which the mean residual, averaged over the float lifetime in the deepest layer (1600–1950 m), is greater than 0.0075; 2) profiles for which salinity residuals are greater than 3 times the standard deviation of the residuals from the deepest layer; and 3) floats for which the QC flag for the adjusted salinity field is equal to 2. Such flags are assigned by the PI when he or she considers that the estimated correction is based on insufficient information, such as when the sensor is unstable or when the float exhibits problems with pressure measurements. The first criterion is meant to detect floats with biased residuals, not to measure salinity offsets of this order of magnitude: residuals underestimate the real offset (section 3b), and an offset or drift might concern only part of the time series.
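The three criteria can be condensed into a simple screening function. The thresholds follow the text, while the argument structure (per-float summary statistics plus the PI-assigned flag) is an assumption made for illustration:

```python
import numpy as np

def flag_suspect(mean_deep_residual, profile_residuals, deep_std, qc_flag):
    """Hedged sketch of the three delayed-mode screening criteria.

    mean_deep_residual : lifetime mean salinity residual, 1600-1950 m layer
    profile_residuals  : per-profile deep-layer salinity residuals (array)
    deep_std           : standard deviation of the deepest-layer residuals
    qc_flag            : PI-assigned QC flag for the adjusted salinity
    """
    reasons = []
    if abs(mean_deep_residual) > 0.0075:                     # criterion 1
        reasons.append("biased deep residuals")
    if np.any(np.abs(profile_residuals) > 3.0 * deep_std):   # criterion 2
        reasons.append("outlier profile(s)")
    if qc_flag == 2:                                         # criterion 3
        reasons.append("adjusted salinity flagged 2 by PI")
    return reasons
```

For instance, a float with a 0.01 lifetime mean deep residual but unremarkable individual profiles and a QC flag of 1 would be retained for inspection under criterion 1 only.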

Among the 120 Argo floats containing delayed-mode data, 10 floats show a mean residual in the deepest layers greater than 0.0075 (Fig. 9). For most of the floats, salinity residuals are correlated with temperature residuals (Fig. 10), suggesting an oceanic mesoscale signal rather than a sensor problem. For example, the mean residuals obtained for float 6900181 in the layers 700–1000, 1000–1600, and 1600–1950 m are 0.148, 0.138, and 0.029, respectively. Those large values (largest positive peak in Fig. 9) are due to a Mediterranean eddy (Meddy), identified on a potential temperature versus salinity (*θ*–*S*) diagram by a salinity greater than 36 between the 11° and 13°C isotherms, in which the float remained trapped during most of its lifetime (Fig. 10).

In two cases (floats 4900133 and 4900136), the negative salinity residuals are not associated with negative temperature residuals. The residual time series for float 4900133 are displayed in Fig. 11. This raises a question: is this bias real, consistent with the other datasets and due to changes relative to the climatology, or is it due to a sensor problem that was not correctly taken into account by the PI of the float? Both floats drifted near the eastern U.S. coastline and their behavior is consistent with that of three other floats found in the same area. All of them exhibit negative residuals of the same order of magnitude that are consistent with the negative bias observed in the CORAAT-01 reanalysis (Fig. 7). It is thus likely that the bias is real and due to changes in the ocean properties compared to the reference climatology. Nevertheless, the PI responsible for those floats was asked to verify the proposed correction.

The second criterion (deep residuals greater than three standard deviations) flagged nine profiles (belonging to seven floats), only five of which are considered problematic. For instance, the peak observed at all depths at the beginning of November 2003 in the residual time series of float 4900130 is due to two bad profiles (Fig. 11) for which the delayed-mode QC flag was not set correctly to 4. Although suspicious profiles represent a very low percentage of the whole fleet (5 out of 8847 profiles), it is necessary to be able to detect them to guarantee the quality of the Argo dataset. Finally, none of the floats with a QC flag set to 2 for the delayed-mode salinity data was found suspicious by the residual analysis.

The delayed-mode Argo datasets considered here show an overall good quality. Two floats out of the 120 considered were identified as having a suspect negative offset, and 5 out of 8847 profiles were considered erroneous. In all cases, the PIs were warned and the delayed-mode profile flags have now been corrected. Although the number of erroneous or suspicious profiles is small, we believe this does not call into question the need for checking the consistency of the Argo dataset at basin scale, nor the implementation of this procedure at the ARC level. Indeed, the analysis was performed when only a few delayed-mode Argo profiles were available; as the number of delayed-mode Argo profiles grows, we expect to find a larger number of erroneous or inconsistent profiles in the database.

## 6. Application to the Coriolis real-time analysis products

The results of the global analysis for the temperature and salinity fields are made available by the Coriolis data center as NetCDF-formatted files containing the gridded fields, the error maps, the data, and the corresponding residuals. We have downloaded the global 2006 dataset release, covering the 2000–06 period. The gridded fields can be used directly for studying ocean variability, or the data can be introduced into a specific analysis or assimilation. Because this dataset is the result of real-time processing, it is absolutely necessary to perform some type of systematic control before any scientific interpretation. The SOLO floats with pressure data problems, for example, are still part of this dataset because most of them remained undetected until the beginning of 2007. This control can be done by reviewing the analysis residuals as described in the previous sections, but we tried here to define more quantitative tests to allow for systematic processing. We proceeded as follows: 36 floats, all launched in the Atlantic, were selected. They represent the main instrument types (8 SOLO, 15 Apex, 13 Provor) and include both well-behaved instruments and floats showing problems in the salinity sensor or pressure data. The time series of data and residuals of these 36 floats were carefully screened; we illustrate here with two examples the typical features of the two types of failure. A quantitative test to detect each failure is then applied to the 36 floats, and we compare the results of the automatic and detailed controls.

### a. Salinity drift

Float 4900216 was launched west of the Azores in May 2002 and transmitted data until April 2006. The temperature/salinity diagram (Fig. 12) shows a drift in salinity toward higher values. A similar diagram constructed with the salinity residuals (*δ _{S}*) in layers deeper than 1000 m clearly shows that the center of gravity moves away from the origin along the salinity axis. The time series of salinity residuals averaged over a layer excluding the highly variable upper 400 m, hereafter ⟨*δ _{S}*⟩, has a clear trend not seen in temperature (Fig. 13). Given the high ocean variability due to mesoscale ocean changes along the float trajectory, and depending on the form of the drift, it is not obvious that a linear fit would express the trend. It was thus preferred to apply a nonparametric evaluation of the trend known as a "reverse arrangement test" (Bendat and Piersol 2000). The number of arrangements *A* of the series is given by

$$A = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} h_{ij},$$

where *N* is the number of values in the series, *h _{ij}* = 1 if ⟨*δ _{S}*⟩_{i} > ⟨*δ _{S}*⟩_{j}, and *h _{ij}* = 0 otherwise.

A float will be considered as having a drift when the number of arrangements *A* of the series ⟨*δ _{S}*⟩ falls outside the interval defined by ±2.7 std dev of the distribution obtained for a random series. When applied to the 36 selected floats, this test detected six floats as having a salinity drift (1 SOLO, 3 Apex, 2 Provor), which corresponds to the detailed screening diagnostic. An additional test was performed on salinity to detect a salinity offset: the sensor is assumed to have an offset if the absolute value of the time mean of ⟨*δ _{S}*⟩ is larger than 0.02. One Provor float was found to have an offset (which was confirmed by a later detailed analysis).
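A minimal implementation of the reverse arrangement test might look like the sketch below. The null-distribution mean *N*(*N*−1)/4 and variance *N*(*N*−1)(2*N*+5)/72 for a random series are the classical results given by Bendat and Piersol (2000); the series, function names, and noise levels are our own illustrations.

```python
import numpy as np

def reverse_arrangements(x):
    """Number of arrangements A: pairs (i, j) with i < j and x[i] > x[j]."""
    x = np.asarray(x)
    n = len(x)
    return int(sum(np.sum(x[i] > x[i + 1:]) for i in range(n - 1)))

def drift_detected(x, n_sigma=2.7):
    """Flag a trend when A falls outside +/- n_sigma std dev of the
    distribution for a random (trend-free) series: mean N(N-1)/4,
    variance N(N-1)(2N+5)/72 (Bendat and Piersol 2000)."""
    n = len(x)
    a = reverse_arrangements(x)
    mean_a = n * (n - 1) / 4.0
    var_a = n * (n - 1) * (2 * n + 5) / 72.0
    return abs(a - mean_a) > n_sigma * np.sqrt(var_a)

def offset_detected(x, threshold=0.02):
    """Offset test from the text: |time mean| of the residual series > 0.02."""
    return abs(np.mean(x)) > threshold

rng = np.random.default_rng(2)
t = np.arange(100)
steady = rng.normal(0.0, 0.01, size=100)   # trend-free residual series
drifting = steady + 5e-4 * t               # slow synthetic salinity drift

print(drift_detected(steady), drift_detected(drifting))
```

Because the test uses only the ordering of the values, it is insensitive to the shape of the drift (linear or not), which is precisely why it was preferred to a linear fit.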

### b. Error on pressure

As pointed out in section 3a, an error in the pressure sensor leads to errors in temperature and salinity proportional to the vertical gradients of these properties. Because the temperature gradient is usually stronger, we focus on this variable to detect errors in the pressure data. The measurements given by the SOLO float 1900360, launched in the South Atlantic in March 2004 and still transmitting data at the end of 2006, are shown here to illustrate the method. The profiles of innovation and residual show anomalous variations over time (see Fig. 14): deeper than 400 m, the level that corresponds to a change in the vertical sampling of the float, the sign of the residuals tends to alternate, with shallower profiles corresponding to strong negative anomalies. This behavior is related to the software error mentioned by the Argo centers. To define a quantitative measure to detect this type of error, we computed the pressure error *δ _{P}* equivalent to the temperature residual *δ _{T}* using the relation

$$\delta_P = \delta_T \Big/ \frac{\partial T}{\partial P}.$$

For each profile, this error is averaged over all levels deeper than 400 m, and the mean value is compared to the corresponding vertical standard deviation. In the case of float 1900360, most of the mean pressure errors are larger than the standard deviation, and the larger errors tend to be negative (Fig. 15). In general, a negative pressure error leads to a positive density residual, but the natural variability of density may hide the systematic error, as seen in Fig. 15. The pressure test is based on this pressure error series: the pressure sensor is assumed to have an offset over a given depth range when more than 70% of the mean pressure errors are larger than the vertical standard deviation and the RMS value of the mean pressure error over time is larger than 20 dbar. This test successfully detected the four SOLO floats that had been identified as problematic by the detailed analysis.
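Applying this relation level by level, the pressure test can be sketched as follows. The gradient profile, the −25-dbar offset, and the noise level are invented for illustration; only the 400-m cutoff, the 70% fraction, and the 20-dbar RMS limit come from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
depths = np.arange(400, 2000, 50.0)      # deep levels only (dbar)
n_prof = 80

# Assumed temperature gradient dT/dP (degC per dbar), decaying with depth,
# and a float whose reported pressures carry a -25 dbar offset.
dTdP = -2e-3 * np.exp(-depths / 800.0)
true_offset = -25.0                      # dbar (illustrative)
noise = rng.normal(0.0, 0.005, size=(n_prof, len(depths)))
delta_T = true_offset * dTdP + noise     # resulting temperature residuals

def pressure_error_test(delta_T, dTdP, frac=0.7, rms_limit=20.0):
    """Convert temperature residuals to equivalent pressure errors
    delta_P = delta_T / (dT/dP), average over the deep levels of each
    profile, and flag the float when more than `frac` of the mean errors
    exceed the vertical std and their RMS over time exceeds `rms_limit`."""
    delta_P = delta_T / dTdP             # equivalent pressure error
    mean_err = delta_P.mean(axis=1)      # per-profile mean below 400 m
    vert_std = delta_P.std(axis=1)       # per-profile vertical spread
    exceed = np.mean(np.abs(mean_err) > vert_std)
    rms = np.sqrt(np.mean(mean_err**2))
    return exceed > frac and rms > rms_limit

print(pressure_error_test(delta_T, dTdP))
```

A float with no pressure problem produces mean errors scattered around zero and well inside the vertical standard deviation, so both criteria fail and the float is not flagged.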

## 7. Summary and conclusions

The Argo array makes possible the monitoring of the ocean interior and thus expands considerably the range of scientific analysis. This new potential generates the need for validation tools adapted to those large real-time datasets. We have shown here how analysis residuals, an often neglected byproduct of the optimal estimation method used to produce gridded fields, can be efficiently used for detecting instrumental drifts and errors. This variable belongs to the data space and is a natural measure when the focus is on data validation.

The study relies on ISAS, the optimal analysis system used at the Coriolis data center. We have simulated the impact of the various sensor errors and evaluated the system capacity in identifying these errors. We showed that for a typical data distribution, the analysis residuals extract 2/3 of the sensor error after a single analysis. The screening of salinity residuals appears to be an interesting complement to the Argo automatic QC. It efficiently identifies data that deviate significantly from the climatology and the neighboring observations. This processing is now applied daily by Coriolis.

The long-term behavior of the residuals gives access to smaller offsets and drifts. Diagnostic tools that consider temperature and salinity residuals simultaneously, such as time series plots and *T*–*S* diagrams, allow us to inspect a large number of floats. The method has been applied to the full array of Atlantic Argo floats (482 floats), of which about 15% were identified as showing a drift or offset. A similar analysis was performed to check the consistency of the Argo delayed-mode datasets issued by different data centers. The experiment was conducted on 120 floats in the North Atlantic. The analysis of the residuals shows that offsets and drifts had been successfully corrected by the delayed-mode procedure. No systematic biases were found in the dataset, and only 5 profiles of 8847 were identified as problematic.

To deal with very large global datasets, such as are now delivered by the Argo centers, we have designed automatic tests to detect salinity offset, drift, and pressure data error. This last experiment was conducted on 36 floats and used the Coriolis real-time 2000–06 products. The detection of salinity drift relies on a reverse arrangement test whereas the pressure test is based on the pressure offset deduced from the temperature residuals and the temperature vertical gradient. These simple tests have successfully identified the defective floats.

Although they have demonstrated their potential efficiency, these new quality control tools still need improvements. The first task would be to update the reference climatology, including variances and covariance scales. As the Argo array reaches completion, it becomes possible to define a climatology that would be closer to the present mean state in terms of water mass properties and resolution. Enough data are now available to evaluate reliable variances over the 3° Argo scales. Because they rely only on the results of the analysis tool presently used in operational mode by Coriolis, we suggest that the diagnostics presented here be implemented by this data center. The results of the diagnostics could be made available to the different DACs, ARCs, PIs, and scientific users to help in the data qualification.

## Acknowledgments

This work was supported by EC FP6 IP project MERSEA (SIP3-CT-2003-502885). Numerous remarks and suggestions by anonymous reviewers helped improve the manuscript and led to a new analysis of recent datasets.


## Footnotes

*Corresponding author address:* Fabienne Gaillard, Laboratoire de physique des océans, IFREMER, BP 70, 29280 Plouzané, France. Email: fabienne.gaillard@ifremer.fr