Satellite altimeter measurements are used to check the quality of Argo profiling float time series. The method compares collocated sea level anomalies from altimeter measurements and dynamic height anomalies calculated from Argo temperature and salinity profiles for each Argo float time series. Different kinds of anomalies (sensor drift, bias, spikes, etc.) have been identified in some real-time and also delayed-mode Argo floats. About 4% of the floats should probably not be used until they are carefully checked and reprocessed by the principal investigators (PIs). The method appears to be highly complementary to the existing quality control checks performed in real time or delayed mode. It could also be used to quantify the impact of the adjustments made in delayed mode on the pressure, temperature, and salinity fields.
In November 2007, the global Argo array of profiling floats reached its initial target of 3000 operating floats worldwide. The array now provides, for the first time, global monitoring of ocean temperature and salinity in real time (Gould et al. 2004). In recent years, these new datasets have led to a series of new scientific investigations. Argo data are of particular interest for studies of climate variability and global sea level rise. They have been used together with historical datasets and altimeter measurements in an attempt to quantify the causes of global sea level rise: ocean thermal expansion, glacier and ice cap/sheet melting, and snowpack reduction (Lombard et al. 2007; Willis et al. 2008). These studies require very high-quality data since the signals of interest are of very low amplitude and are highly sensitive to biases or errors present in the dataset.
One thus needs to be very careful in using real-time Argo datasets for such studies since they are only subject to simple automated quality checks. A recent example showed that some signals were misinterpreted as climate signals when they were in fact due to errors in the Argo datasets (Lyman et al. 2006; Willis et al. 2007, 2009). The best-quality Argo data for climate research applications are only available in delayed mode, but to date, only half of the profiles older than one year have been delayed-mode controlled. Delayed-mode Argo quality control is also a challenging task as it requires high-quality CTD data in the float vicinity. We propose here a complementary approach based on the analysis of the consistency between Argo and satellite altimeter data. This is a fully independent quality check of the data that can thus be used as a verification tool. Such an analysis is also needed before Argo and altimeter data are jointly assimilated in ocean models. Our methods were initially developed as part of the French Mercator-Ocean data assimilation system (Drevillon et al. 2008).
Sea level anomalies (SLAs) from altimeter measurements and dynamic height anomalies (DHAs) calculated from in situ temperature and salinity profiles are strongly correlated (Gilson et al. 1998; McCarthy et al. 2000; Guinehut et al. 2006). By exploiting this correlation along with mean representative statistical differences between the two datasets, it is thus possible to use the altimeter measurements to extract random or systematic errors in Argo float time series. Furthermore, the use of contemporaneous altimeter measurements is very powerful since these satellite data grant access to the mesoscale and interannual variability of the ocean, which more classical validation methods based on comparisons with climatological fields do not.
2. Data and method
The full Argo dataset, as of February 2008, was downloaded from the Coriolis Global Data Assembly Center (http://www.coriolis.eu.org). The full dataset has passed through the real-time quality control procedures applied by each Data Assembly Center, and about half of the profiles older than one year have passed through delayed-mode procedures (see the Argo quality control manual for more details: Wong et al. 2008). For this study, when available, delayed-mode fields are preferred to real-time ones, and only measurements having pressure, temperature, and salinity observations considered “good” (i.e., with a quality flag numerical grade of “1”) are used.
The altimeter data used are Archiving, Validation, and Interpretation of Satellite Oceanographic data (AVISO) combined products, which provide maps of SLA obtained from an optimal combination of all available satellite altimeters (AVISO 2008). The delayed-mode version of the product is used until September 2007 and the real-time one up to the present. The maps are available every 7 days on a 1/3° × 1/3° Mercator grid. As this product provides SLA relative to a 7-yr time mean from 1993 to 1999, to be fully consistent with the Argo dataset sampling period, the SLA maps are recalculated using a 5-yr time mean from 2003 to 2007.
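The change of reference period described above amounts to subtracting, at each grid point, the 2003 to 2007 time mean of the original anomalies. The following is a minimal sketch of that step for a single grid point. The function name and the data layout (a list of year and SLA pairs) are assumptions made for illustration and are not part of the AVISO product or of the processing chain used in this study.

```python
from statistics import fmean


def rereference_sla(series, start_year=2003, end_year=2007):
    """Re-reference an SLA series (cm) at one grid point to a new time mean.

    `series` is a list of (year, sla_cm) pairs; this layout is an
    assumption made for the sketch.
    """
    in_period = [sla for year, sla in series if start_year <= year <= end_year]
    if not in_period:
        raise ValueError("no samples fall in the new reference period")
    new_mean = fmean(in_period)
    # Subtracting the new-period mean makes the anomalies average to zero
    # over start_year..end_year instead of over the original period.
    return [(year, sla - new_mean) for year, sla in series]
```

By construction, the re-referenced anomalies average to zero over the new period, whatever reference period the input series used.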
Dynamic heights relative to 900-m depth are first calculated from Argo pressure, temperature, and salinity profiles. The 900-m depth is chosen to retain as many profiles as possible in the open ocean, as many floats do not profile deeper than 1000 m, particularly at low latitudes, because of technical limitations. If the reference depth is taken to be 1000 m, the number of profiles is reduced by 16.6%. Furthermore, the method could easily be extended to other reference levels in order to check the quality of the subset of floats that are drifting at lower levels, like the Mediterranean ones. Next, to calculate dynamic height anomalies consistent with altimeter SLA, a contemporaneous Argo climatology is used as the mean dynamic height. To compare SLA and DHA, SLA maps are then interpolated to the time and location of each in situ DHA measurement using a linear space/time interpolation. Finally, general statistics between the two datasets (correlation coefficient, rms of the differences) are generated for each Argo float time series. They are then compared to a priori knowledge of these statistics.
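The collocation step can be illustrated with a trilinear (time, latitude, longitude) interpolation of gridded maps to a profile position. This is only a sketch under simplifying assumptions: the axes are given as ascending lists, longitude wrap-around and land masking are ignored, and the names `interp_sla` and `_bracket` are hypothetical. The interpolation actually used in this study may differ in its details.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return (i, w) such that x = (1 - w) * axis[i] + w * axis[i + 1]."""
    if not axis[0] <= x <= axis[-1]:
        raise ValueError("point lies outside the grid")
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    w = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, w


def interp_sla(times, lats, lons, maps, t, lat, lon):
    """Linearly interpolate maps[time][lat][lon] to the point (t, lat, lon)."""
    k, wt = _bracket(times, t)
    j, wy = _bracket(lats, lat)
    i, wx = _bracket(lons, lon)
    value = 0.0
    # Weighted sum over the 8 surrounding grid values (2 times x 2 lats x 2 lons).
    for dk, a in ((0, 1.0 - wt), (1, wt)):
        for dj, b in ((0, 1.0 - wy), (1, wy)):
            for di, c in ((0, 1.0 - wx), (1, wx)):
                value += a * b * c * maps[k + dk][j + dj][i + di]
    return value
```

Here `times` could hold the dates of the weekly maps in days, so that a profile taken between two maps receives a weighted average of both.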
The choice of the mean dynamic height is very important for SLA/DHA comparison studies since it can be a source of observed differences. In this study, we have calculated an Argo climatology using the same dataset but discarding questionable floats. As part of an iterative process, floats were first flagged as questionable and removed from the full dataset if the rms of the differences between collocated SLA and DHA for the float time series was greater than the rms of SLA, with DHA calculated at this stage using the annual mean World Ocean Atlas 2005 (WOA05) as the mean dynamic height (Locarnini et al. 2006; Antonov et al. 2006). The Argo mean dynamic height provides improved results when altimeter data are compared with Argo data, allowing easier detection of errors in Argo profiles. In particular, the mean global bias of −1.1 cm that exists using the WOA05 mean dynamic height is reduced to zero with the Argo climatology. Regionally, the mean corrected bias can be on the order of 4–6 cm.
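The first-pass screening described above can be sketched as follows. The criterion (rms of the SLA minus DHA differences greater than the rms of SLA) is taken from the text. The data layout, the function names, and the use of a plain root-mean-square without removing the mean are assumptions made for illustration.

```python
from math import sqrt


def rms(values):
    """Root-mean-square of a non-empty sequence of numbers."""
    return sqrt(sum(v * v for v in values) / len(values))


def is_questionable(sla, dha):
    """True if rms(SLA - DHA) exceeds rms(SLA) for one float time series."""
    if len(sla) != len(dha) or not sla:
        raise ValueError("SLA and DHA must be non-empty and of equal length")
    diffs = [s - d for s, d in zip(sla, dha)]
    return rms(diffs) > rms(sla)


def screen_floats(floats):
    """Split floats into (kept, questionable) lists of identifiers.

    `floats` maps a float identifier to a (sla_cm, dha_cm) pair of lists;
    this layout is hypothetical.
    """
    kept, questionable = [], []
    for float_id in sorted(floats):
        sla, dha = floats[float_id]
        (questionable if is_questionable(sla, dha) else kept).append(float_id)
    return kept, questionable
```

Only the floats in the first list would then contribute to the Argo climatology, which is in turn used to recompute the anomalies.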
3. Consistencies between altimeter and Argo SLA
An example of an SLA/DHA time series is given in Fig. 1 for the World Meteorological Organization (WMO) 5900026 float traveling from September 2003 up to the present from east to west, south of the island of Java in the Indian Ocean. The different DHA time series correspond to the one calculated from the real-time measurements (DHA-Real), from the measurements adjusted in real time (DHA-Adjusted), and from the one adjusted in delayed mode (DHA-Delayed) [see the Argo quality control manual (Wong et al. 2008) for more details]. General statistics between the two time series are calculated using values adjusted in delayed mode if available, values adjusted in real time if not, and real-time values otherwise. This means that different types of DHA can be used successively within a single float time series, as is the case for the WMO 5900026 float. For this particular example, the two time series are highly consistent over the more than 4-yr lifetime of the float. Mesoscale structures of up to 25 cm are well represented in both time series, and the impact of the delayed-mode and real-time adjustments is clearly visible. General statistics between the two time series calculated over the 197 cycles of this float show a correlation of 0.88, a mean difference of −1.9 cm, and an rms of the differences of 5.3 cm, corresponding to about 27% of the variance of the altimeter signal in the area.
Since altimeter data include both steric and nonsteric contributions to sea level (e.g., Pattullo et al. 1955; Gill and Niiler 1973; Stammer 1997; Fukumori et al. 1998; Guinehut et al. 2006) and dynamic height anomalies calculated from temperature and salinity profile data represent the steric contribution between the surface and the reference level (i.e., 900-m depth in this study), small differences between the two datasets are expected. These differences are due to nonsteric contributions to sea level and to temperature and salinity changes below 900-m depth. An a priori statistical regional characterization of these differences is therefore required. Two key statistical parameters are used for the validation. The first one is the correlation coefficient between the two time series and the second one is the rms of the differences between the two time series expressed as a percentage of the variance of the altimeter signal.
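The two parameters can be computed for one float time series as in the sketch below. Several choices here are assumptions rather than statements of the method: differences are taken as DHA minus SLA, the mean difference is not removed before the rms is computed, and the percentage is formed as the mean-square difference divided by the variance of SLA. The function name and the returned dictionary are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def float_statistics(sla, dha):
    """Correlation, mean difference, and rms difference for one float.

    `sla` and `dha` are collocated series in cm. The sign convention
    (DHA minus SLA) and the normalization are assumptions of this sketch.
    """
    if len(sla) != len(dha) or len(sla) < 2:
        raise ValueError("need two series of equal length with >= 2 points")
    sla = [float(v) for v in sla]
    dha = [float(v) for v in dha]
    var_sla, var_dha = pvariance(sla), pvariance(dha)
    if var_sla == 0.0 or var_dha == 0.0:
        raise ValueError("a constant series has no defined correlation")
    mean_sla, mean_dha = fmean(sla), fmean(dha)
    cov = fmean([(s - mean_sla) * (d - mean_dha) for s, d in zip(sla, dha)])
    diffs = [d - s for s, d in zip(sla, dha)]
    mean_square_diff = fmean([x * x for x in diffs])
    return {
        "correlation": cov / sqrt(var_sla * var_dha),
        "mean_diff_cm": fmean(diffs),
        "rms_diff_cm": sqrt(mean_square_diff),
        # Mean-square difference as a percentage of the SLA variance.
        "rms_diff_pct": 100.0 * mean_square_diff / var_sla,
    }
```

Under this reading, an rms difference of 5.3 cm amounting to about 27% of the altimeter variance would imply an SLA standard deviation of roughly 10 cm along the float trajectory.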
Reference values for these two fields are calculated using the same datasets but discarding the questionable floats already detected for the calculation of the Argo climatology (see section 2). The references are computed on a 1° × 1° horizontal grid using a large number of observations, namely all those available within a 2° latitude by 10° longitude radius of influence around each grid point. This choice reduces the problems due to the nonuniform temporal and spatial distribution of the in situ measurements and takes into account the latitudinal structure of the signal.
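As an illustration of how a reference value might be formed at one grid point, the sketch below pools all collocated observations within the area of influence and computes the rms of the differences as a percentage of the SLA variance. The area of influence is treated here as a rectangular box of ±2° latitude by ±10° longitude, which is an assumption, as are the tuple layout, the function names, and the normalization.

```python
from statistics import fmean, pvariance


def lon_distance(lon_a, lon_b):
    """Absolute longitude separation in degrees, allowing for wrap-around."""
    return abs((lon_a - lon_b + 180.0) % 360.0 - 180.0)


def select_for_grid_point(observations, lat0, lon0, half_lat=2.0, half_lon=10.0):
    """Keep observations (lat, lon, sla_cm, dha_cm) inside the box of influence."""
    return [
        obs for obs in observations
        if abs(obs[0] - lat0) <= half_lat
        and lon_distance(obs[1], lon0) <= half_lon
    ]


def reference_rms_percent(observations, lat0, lon0):
    """Pooled rms difference as a percentage of SLA variance, or None."""
    selected = select_for_grid_point(observations, lat0, lon0)
    if len(selected) < 2:
        return None  # too few data to define a reference at this point
    sla = [obs[2] for obs in selected]
    var_sla = pvariance(sla)
    if var_sla == 0.0:
        return None
    mean_square_diff = fmean([(obs[3] - obs[2]) ** 2 for obs in selected])
    return 100.0 * mean_square_diff / var_sla
```

The elongated zonal extent of the box means that a grid point draws on many more observations from its own latitude band than from neighboring ones.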
Correlation coefficients between the two time series are almost everywhere greater than 0.5 and even greater than 0.7 in most parts of the oceans (Fig. 2). Lower values are found in the high-latitude regions where nonsteric contributions to sea level are expected to be larger. These results complement the study of Guinehut et al. (2006), which uses mainly XBT measurements and thus salinity calculated from climatological fields. The use of in situ temperature and salinity profiles in the present study yields correlation coefficients that are higher by about 0.1 almost everywhere and also a much better description of the Southern Ocean. The rms of the differences between the two time series shows values lower than 50% in most parts of the Indian, Pacific, and Atlantic Oceans north of 30°S (Fig. 3). The Antarctic Circumpolar region has values on the order of 50%–70% or greater. The Atlantic and South Pacific Oceans also show several regions with high values (>80%), which correspond to regions of very low variability of the altimeter signal (<4-cm rms) and to regions with lower correlation coefficients (<0.6). The purpose of this study is not to give an exhaustive explanation of the physical meaning of these statistics; a full investigation of these signals is left for future study.
4. Altimeter SLA as a validation tool for Argo time series
The two key statistical parameters (correlation coefficient and rms of the differences) are computed for each Argo float time series and are compared to the reference values described in the previous section and represented in Figs. 2 and 3. If they fall outside the given values, the Argo float time series is suspected to have a problem (drift, bias, spike, etc.). Because the references have been computed using a large number of observations and so as to obtain relatively large-scale and uniform maps, a float is suspected to have a problem if the rms of the differences between the DHA and SLA time series is 3 times greater than the reference values. In addition, in order to obtain representative statistics, floats having very short time series of only a few cycles (i.e., floats presenting a very quick failure or floats younger than one year) are not considered in this study. Finally, altimeter and in situ errors are not expected to be less than 1–2 cm in terms of sea level. Consequently, if the variability of the altimeter SLA along the float time series is lower than 2 cm, automatic checking is very difficult and the float is set aside and controlled by hand (fewer than 10 floats fall into this category).
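The decision rule can be summarized by the sketch below. The factor of 3 and the 2-cm variability threshold are taken from the text. The minimum number of cycles (`min_cycles=10`) is a hypothetical placeholder, since the text only states that floats with a few cycles are excluded, and the function name and return labels are likewise illustrative. The correlation coefficient is not tested here because no numerical threshold is given for it.

```python
def classify_float(n_cycles, rms_pct, ref_rms_pct, sla_std_cm,
                   min_cycles=10, rms_factor=3.0, min_sla_std_cm=2.0):
    """Classify one float time series against the regional reference.

    rms_pct      rms of SLA/DHA differences for the float (% of SLA variance)
    ref_rms_pct  reference value at the float mean position (same unit)
    sla_std_cm   variability of altimeter SLA along the float trajectory
    min_cycles   hypothetical cutoff; the study does not state a number
    """
    if n_cycles < min_cycles:
        return "skipped"   # too short for representative statistics
    if sla_std_cm < min_sla_std_cm:
        return "manual"    # signal within the expected error level
    if rms_pct > rms_factor * ref_rms_pct:
        return "suspect"   # candidate for drift, bias, or spike
    return "ok"
```

For example, with a reference value of 30%, a float would be marked as suspect only if its own value exceeded 90%, which reflects the conservative nature of the criterion.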
Results are summarized in Fig. 4; one point represents the value for one time series at its mean position. For most of the floats (more than 3900), the rms of the differences between SLA and DHA is on the order of the reference values. The 160 anomalous floats with much higher values (Fig. 4b) are found throughout the different oceans. These higher values are mainly the result of errors in the float time series due to a systematic offset, a very large spike, or the drift of a sensor (salinity or pressure). About two-thirds of these 160 floats additionally present much lower correlation coefficients between SLA and DHA than the reference values, indicating very questionable behavior of the float.
Some examples of anomalous floats are given in Fig. 5 together with the float position and its mean statistics. The WMO 1900581 float, traveling in the South Atlantic Ocean since January 2006, exhibits a large constant negative bias on the order of 15 cm relative to the altimeter data (Fig. 5a). In cases of bias, the validity of the mean dynamic height used to calculate anomalies from the in situ T/S profiles can be questioned. In this study, we have calculated a dedicated Argo mean dynamic height that is fully consistent with the Argo dataset, so this hypothesis should be rejected. Moreover, as the value of the bias is very high, it most probably reflects a problem in the salinity or pressure sensor of the float. The delayed-mode procedure that aims at correcting salinity drift or offset would probably confirm the result. The second example (WMO 1900249 float; Fig. 5b) shows a progressive drift of the DHA time series relative to the SLA time series as the float travels from east to west in the tropical Atlantic Ocean. Additionally, the correlation between the two time series is null, while it is expected to be greater than 0.5, showing a clear malfunction of one of the sensors. The last example, for the WMO 3900225 float in the South Pacific Ocean (Fig. 5c), shows that where the data have been delayed-mode controlled, the SLA and DHA time series match each other very well. At the end of the time series, when values adjusted in real time are available, they show a constant offset of about 10 cm relative to the altimeter data. This offset seems to be due to the salinity offset value of 0.092 applied in real time, which is clearly overestimated compared to the 0.015 value applied in delayed mode.
These three examples are only given as illustrations since all the other anomalous floats fall within one of these three categories. They are a reminder that the real-time Argo datasets are only subjected to simple automated quality checks. Furthermore, the values adjusted in real time and in delayed mode are not subjected to any independent validation. Finally, as with any dataset, Argo data are prone to errors coming from handling mistakes by the operators.
5. Discussion and conclusions
Different kinds of anomalies (drift, offset, spike, etc.) have been identified in some real-time and also delayed-mode Argo profiles. About 4% of the floats should be put aside until they are carefully checked and reprocessed by the PIs. As an ongoing collaborative effort, the list of these floats is published on a regular basis on the Coriolis global data assembly center (GDAC) Web site. Since part of the results have been provided to some PIs since the beginning of this study, some float time series might already have been corrected or set aside in the GDAC database.
The proposed method should be considered as an iterative process since some parameters, like the Argo mean climatology and the reference statistics, should be refined as more delayed-mode Argo floats are made available. The south tropical Atlantic Ocean shows, for example, a lack of reference data due to a large number of questionable floats in the area. Other data sources like high-quality CTD measurements should also be included in the process, as in the Argo delayed-mode quality control procedures (Wong et al. 2003; Böhme and Send 2005).
Despite the very conservative criteria used here, the method is very effective in identifying anomalous floats. We also found that working on a vertically integrated field, namely dynamic height, is very instructive and complementary to more classical validation methods. It gives a very quick idea of the behavior of the time series of the float. It should be used as a validation/verification tool but also to quantify the real impact of the adjusted fields (salinity, temperature, pressure). Additionally, the use of contemporaneous altimeter measurements is very powerful since altimeter measurements allow access to the mesoscale and interannual variability of the ocean, which more classical methods based on comparisons with climatological fields do not.
Finally, as Argo in situ temperature and salinity profile measurements and altimeter SLA data are the two most important complementary components of the ocean observing system used together in climate and operational oceanography applications, the consistency between the two datasets should be verified on a regular basis. Consistency tests should be performed in near–real time to detect suspicious floats earlier than in delayed mode. Efforts to develop cross-validation and cross-calibration tools between these two datasets, and also between the different in situ instruments (Argo, XBT, CTD), should thus continue.
The Argo data were collected and made freely available by the international Argo project (a pilot program of the Global Ocean Observing System) and the national programs that contribute to it (http://www.argo.ucsd.edu; http://argo.jcommops.org). The altimeter products were produced by SSALTO/DUACS and distributed by AVISO with support from CNES. The study was carried out under a contract with the Coriolis data center and also with support from Mercator-Ocean. We thank H. Freeland and another anonymous reviewer for comments.
Corresponding author address: Stephanie Guinehut, CLS Space Oceanography Division, 8-10 rue Hermès, 31520 Ramonville Saint-Agne, France. Email: email@example.com