## 1. Introduction

Autonomous CTD profiling floats are instruments that move freely with the ocean current at fixed parking depths and cycle from a profiling depth to the sea surface at regular time intervals. While rising to the surface, these autonomous floats take profiles of conductivity (*C*) and temperature (*T*) versus pressure through the water column. From these variables, depth (*D*), salinity, density, and other derived quantities can be calculated. The data are sent to various data centers via satellites, before the floats sink back to their prescribed parking depths to continue their drifts. The Argo program (information available online at http://argo.jcommops.org) plans to deploy 3000 such autonomous CTD profiling floats with a target profiling depth of 2000 m to observe temperature and salinity within the upper layers of the global ocean, and currents at the parking depths.

These profiling floats have an expected mean lifespan of about 4 yr at present, and are anticipated to give good measurements of temperature and pressure over this span. However, salinity measurements may experience sensor drifts owing to biofouling and a variety of other problems. Unlike traditional CTD casts, where in situ bottle data standardized to the International Association for the Physical Sciences of the Ocean's (IAPSO) standard seawater are obtained for salinity calibration, “ground truth” salinity data are not usually available for these floats. The moving nature of these floats also means that only a few can be retrieved for examination and postdeployment laboratory salinity calibrations. We here discuss a system for calibrating the salinity measurements from these autonomous CTD profiling floats with regional temperature–salinity relationships, by using nearby historical hydrographic data.

## 2. Salinity calibration by *θ*–*S* climatology

The two main state variables of the ocean, potential temperature, *θ,* and salinity, *S,* are related to each other by definite patterns that represent the mean characteristics of a region (e.g., Worthington 1981; Emery and Dewar 1982). These climatological *θ*–*S* relationships are influenced by seasonal and decadal variations, and by strong isolated vortices, such as the Kuroshio rings or lenses of Mediterranean water (so-called Meddies). There can also be high variability in the vicinity of strong fronts between water masses, for example, across the Gulf Stream in the Atlantic, or the South Equatorial Current in the Indian Ocean. However, for most of the global ocean, mean *θ*–*S* relationships can be used to estimate salinity from measurements of temperature and pressure. The certainty of the estimation will depend on the degree of spatial and temporal variability in the region. Estimates of these climatological *θ*–*S* relationships and their variability are here used to calibrate the salinity measurements from the CTD profiling floats.

### a. A world θ–S climatology database

To establish a *θ*–*S* climatology for the World Ocean, historical salinity measurements from a selected subset of both CTD and bottle data from the World Ocean Database (Conkright et al. 1998, hereafter WOD98) have been assembled and interpolated onto a set of potential isotherms, or *θ* surfaces. The float salinity measurements are later compared to the historical salinity measurements on this set of *θ* surfaces. Potential temperature surfaces are more appropriate than traditional isobaric surfaces as the coordinate system for calculations, because isobaric calculations can produce “anomalous anomalies” in areas with *θ*–*S* curvature and vertical excursions (spatial and/or temporal) of density surfaces (Lozier et al. 1994). For the purpose of this calibration, *θ* surfaces are also superior to potential density surfaces because calculation of density is sensitive to salinity errors. In other words, the two state variables *θ* and *S* are kept separate. The float salinity measurements are essentially calibrated using the more accurate float measurement, temperature, as the independent variable. Temperature and pressure measurements from the CTD sensors on these floats are, in general, accurate to 0.002°C and 2.4 db, respectively, with expected temperature sensor drift of 0.0005°C yr^{−1}, although at present limited telecommunications bandwidth keeps temperature resolution to about 0.005°C (R. Davis 2002, personal communication).

To capture most of the water column for the World Ocean, 54 standard *θ* surfaces have been selected between −1° and 30°C. A shape-preserving spline (Akima 1970) is used to vertically interpolate the historical bottle salinity data to the standard *θ* surfaces, while the historical CTD salinity data are subsampled at the standard *θ* levels. All interpolated salinity data have been visually inspected for extreme outliers, which have subsequently been removed. Interpolation of salinity data has been done from the deepest *θ* surface to the shallowest. In cases of *θ* inversions, only salinity on the deepest instance of each isotherm is used. For most of the world's oceans, this method of interpolation will retain the larger and more stable part of the water column below any shallow temperature inversion layers. The exceptions are on the continental shelves of Antarctica, Labrador, Greenland, and in the Arctic Circle, where the water column is weakly stratified with multiple temperature inversions. However, these exceptions compose a very small percentage of the world's oceans.
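The deepest-instance rule can be illustrated with a minimal Python sketch. It is not the procedure used to build the database: linear interpolation stands in for the Akima spline, and the function name `salinity_on_theta` and the sample profile are hypothetical.

```python
def salinity_on_theta(theta, salinity, theta_levels):
    """Interpolate salinity onto theta levels, keeping the deepest crossing.

    theta and salinity are ordered from the shallowest to the deepest sample.
    """
    result = {}
    for level in theta_levels:
        # Walk upward from the deepest sample pair; stop at the first crossing.
        for k in range(len(theta) - 1, 0, -1):
            t_deep, t_shallow = theta[k], theta[k - 1]
            if t_deep == t_shallow:
                continue
            if min(t_deep, t_shallow) <= level <= max(t_deep, t_shallow):
                w = (level - t_deep) / (t_shallow - t_deep)
                result[level] = salinity[k] + w * (salinity[k - 1] - salinity[k])
                break
    return result


# Hypothetical profile with a shallow inversion: 10 degC is crossed three times.
theta = [12.0, 9.0, 11.0, 8.0, 4.0, 2.0]
salt = [35.2, 34.9, 35.0, 34.8, 34.5, 34.6]
mapped = salinity_on_theta(theta, salt, [3.0, 10.0])
print(mapped)
```

Only the crossing between the 11° and 8°C samples contributes to the 10°C value here; the two shallower crossings inside the inversion layer are ignored.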

### b. Objective estimates of climatological θ–S relationships at float locations

Climatological values of salinity at the location of the float profiles are estimated by using the vertically interpolated historical salinity data and an objective mapping method. The objective method is based on the Gauss–Markov theorem. It gives a pointwise estimate that is linear and unbiased, is optimal in the least squares sense, and also returns an estimate of the uncertainty (error variance) that takes into account the distribution of the data used (Bretherton et al. 1976; McIntosh 1990). Our procedure, described below, accounts for both the spatial and temporal variations in the climatological *θ*–*S* relationships.

The covariance of the data is assumed to be Gaussian, with the decay scale determined by three scale parameters: a longitudinal scale, Lx; a latitudinal scale, Ly; and a temporal scale, *τ.* The spatial scales are anisotropic, with Lx greater than Ly to reflect the predominantly zonal currents in the ocean interior. We use two sets of spatial scales, a set of large scales (Lx_{1}, Ly_{1}) and a set of small scales (Lx_{2}, Ly_{2}), to estimate the large-scale field and the small-scale field. Presently, they have been somewhat arbitrarily set at Lx_{1} = 20°, Ly_{1} = 10° and Lx_{2} = 8°, Ly_{2} = 4°, based on regional water mass variability scales. The temporal scale is estimated by the ventilation timescale, which in turn is estimated from apparent ages based on the partial pressure of chlorofluorocarbon, CFC-12 (e.g., Doney and Bullister 1992). A global CFC-12 dataset, obtained from J. Bullister (2001, personal communication), provides the temporal scale *τ* for the various *θ* surfaces. The bulk of these data are now available publicly from the World Ocean Circulation Experiment (WOCE) Hydrographic Program Office (http://whpo.ucsd.edu). The production and release of CFC into the atmosphere began in the 1930s; hence, the maximum CFC apparent age is about 50 yr. Thus, for deep layers in which the actual residence times are significantly longer, using CFC age will tend to lower the weights for old historical data compared to those using the real ventilation timescales. However, most of the older historical data also have larger measurement errors, so these lower weights are appropriate.

For each float profile at (*x*_{0}, *y*_{0}, *t*_{0}) and on every standard *θ* surface, we select WOD98 data points from an area enclosed by an ellipse with radii Lx_{1} and Ly_{1} (the large spatial scales), with (*x*_{0}, *y*_{0}) as the center. From this initial set, we select 600 “best” historical data points for objective mapping based on three criteria. First, we randomly select 200 data points from the initial elliptical area. This ensures that the large-scale mean is well represented by measurements around the float profile. Second, from the remaining data points, we select 200 historical points (*x*_{i}, *y*_{i}, *t*_{i}) with the shortest spatial separation factor relative to the large length scales, (*x*_{i} − *x*_{0})^{2}/Lx^{2}_{1} + (*y*_{i} − *y*_{0})^{2}/Ly^{2}_{1}. Third, from the remaining data points, we select 200 historical points with the shortest spatial and temporal separation factor relative to the small length scales and the temporal scale, (*x*_{i} − *x*_{0})^{2}/Lx^{2}_{2} + (*y*_{i} − *y*_{0})^{2}/Ly^{2}_{2} + (*t*_{i} − *t*_{0})^{2}/*τ*^{2}. This step ensures that more contemporaneous and close-by historical data are included. Choosing data based on these three criteria means that the choice of historical data is not spatially biased toward hydrographic lines that have dense station spacings, while at the same time guaranteeing that the nearest measurements (in time and space) are included. The objective map will therefore contain a good estimate of the mean *θ*–*S* relationship and its spatial and temporal variability in the region. If fewer than 600 historical points are available within the ellipse, then all available points are used.
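The three selection criteria can be sketched as follows. The function name, the argument layout, the fixed random seed, and the synthetic candidate points are assumptions made for illustration; only the 200 + 200 + 200 split and the two separation factors come from the text.

```python
import random


def select_historical(points, x0, y0, t0, large, small, tau, n_each=200, seed=0):
    """Return indices of up to 3 * n_each points chosen by the three criteria.

    points: (x, y, t) tuples assumed to lie inside the large ellipse.
    large, small: (Lx, Ly) pairs in the same units as x and y.
    """
    if len(points) <= 3 * n_each:
        return list(range(len(points)))  # too few points: use all of them

    def large_factor(i):
        x, y, _ = points[i]
        return (x - x0) ** 2 / large[0] ** 2 + (y - y0) ** 2 / large[1] ** 2

    def small_factor(i):
        x, y, t = points[i]
        return ((x - x0) ** 2 / small[0] ** 2 + (y - y0) ** 2 / small[1] ** 2
                + (t - t0) ** 2 / tau ** 2)

    remaining = list(range(len(points)))
    first = random.Random(seed).sample(remaining, n_each)    # 1) random draw
    remaining = sorted(set(remaining) - set(first), key=large_factor)
    second = remaining[:n_each]                               # 2) nearest in space
    remaining = sorted(remaining[n_each:], key=small_factor)
    third = remaining[:n_each]                                # 3) nearest in space and time
    return first + second + third


# Synthetic candidates around a float at (0, 0) in year 2000
# (drawn from a box rather than an ellipse, for brevity).
rng = random.Random(1)
candidates = [(rng.uniform(-20, 20), rng.uniform(-10, 10), rng.uniform(1950, 2000))
              for _ in range(1500)]
chosen = select_historical(candidates, 0.0, 0.0, 2000.0, (20.0, 10.0), (8.0, 4.0), 10.0)
```

Drawing the random subset first keeps the later distance-ranked picks from being dominated by a single densely sampled hydrographic line.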

As the float drifts toward the coast, the elliptical area determined by Lx_{1} and Ly_{1} would enclose landmasses, thus decreasing the area from which historical data can be chosen. To maximize the choice of data, and to take into account the shift from interior flow regimes with predominantly zonal orientation to the coastal regime where currents tend to parallel the shore, the ellipse is stretched in the north–south direction to avoid enclosing any landmasses when a float drifts toward coastal bathymetry that has a north–south component. This is done by shortening the longitudinal scale Lx_{1} and lengthening the latitudinal scale Ly_{1}, but maintaining the same area as that enclosed by the original ellipse. In other words, the area from which the “best” spatial historical points are selected is preserved while the ellipse is deformed. When the ellipse degenerates into a circle, Lx_{1} and Ly_{1} return to their original values, but the longitudinal–latitudinal axes are rotated so that the longer axis becomes parallel to the continental slope.
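As a small illustration of the area-preserving deformation, the sketch below scales the two radii by reciprocal factors. The stretch factor and the function name `stretch_ellipse` are hypothetical; the text does not state how the amount of stretching is chosen.

```python
def stretch_ellipse(lx, ly, factor):
    """Stretch the ellipse north-south by `factor` while keeping its area.

    The area is pi * lx * ly, so dividing lx and multiplying ly by the same
    factor leaves it unchanged.
    """
    return lx / factor, ly * factor


lx, ly = stretch_ellipse(20.0, 10.0, 1.25)
print(lx, ly, lx * ly)  # the product lx * ly stays at 200
```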

The objective estimate of salinity, *S*′, at each location and on each standard *θ* surface is given by

$$S' = \langle \mathbf{d} \rangle + \boldsymbol{\omega} \cdot (\mathbf{d} - \langle \mathbf{d} \rangle), \tag{1}$$

where **d** = [*d*_{1}, … , *d*_{m}] denotes the set of selected historical data for that standard *θ* surface and 〈**d**〉 denotes the mean value of the set **d**. In other words, the a priori estimate is assumed to be 〈**d**〉, the mean value of **d**. For each historical datum *d*_{i} at (*x*_{i}, *y*_{i}, *t*_{i}), there is a true signal *s*_{i}, and some random noise *η*_{i}, that includes measurement errors and the random processes and natural variability in the ocean that cause deviations from the climatology. From the relationship *d*_{i} = *s*_{i} + *η*_{i}, the signal variance and the noise variance of the data can be estimated, and are incorporated into the coefficient matrix **ω**. The signal variance is approximated by (1/*m*) Σ_{i}(*d*_{i} − 〈**d**〉)^{2}, where *m* is the number of data points on each *θ* surface. The noise variance is estimated by (1/2*m*) Σ_{i}(*d*_{i} − *d*_{j})^{2}, where *d*_{j} is the data point that has the shortest distance from *d*_{i} on each *θ* surface. This method of estimating the noise variance assumes that the noise is uncorrelated over the distance, that it has uniform variance, and that the signal has a longer correlation distance than the data separation (Fukumori and Wunsch 1991).
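The two variance estimates can be written out in a few lines. This sketch assumes plain Euclidean distance in (*x*, *y*) for the "shortest distance" and uses made-up salinity values; the function names are illustrative only.

```python
def signal_variance(d):
    """Mean squared deviation of the data from their mean."""
    mean = sum(d) / len(d)
    return sum((v - mean) ** 2 for v in d) / len(d)


def noise_variance(d, xy):
    """Half the mean squared difference between each datum and its nearest neighbour."""
    total = 0.0
    for i, (xi, yi) in enumerate(xy):
        j = min((k for k in range(len(d)) if k != i),
                key=lambda k: (xy[k][0] - xi) ** 2 + (xy[k][1] - yi) ** 2)
        total += (d[i] - d[j]) ** 2
    return total / (2 * len(d))


# Two tight pairs of stations, far apart from each other (hypothetical values).
d = [34.50, 34.52, 34.60, 34.58]
xy = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
s2 = signal_variance(d)
n2 = noise_variance(d, xy)
```

In this example the differences within each close pair set the noise estimate, while the offset between the two pairs dominates the signal variance.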

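The coefficient matrix **ω** in (1) takes the form **ω** = 𝗖*dg* · (𝗖*dd*)^{−1}, where 𝗖*dg* denotes the data–grid covariance matrix and 𝗖*dd* denotes the data–data covariance matrix. As mentioned previously, the covariance function is assumed to be Gaussian. Building on Roemmich (1983), a two-stage mapping is employed. In the first stage, the covariance is a function of the large-scale spatial separation only, and the Gaussian decay scale is determined by the large spatial scales Lx_{1} and Ly_{1}:

$$\mathsf{C}dd_{ij}(x, y) = \exp\left\{-\left[\frac{(x_i - x_j)^2}{\mathrm{Lx}_1^2} + \frac{(y_i - y_j)^2}{\mathrm{Ly}_1^2}\right]\right\}. \tag{2a}$$

By using (1) and (2a), the historical data are mapped to the location of the float profile, as well as to the selected historical data points themselves. The differences between the original values and the estimated values at the historical data points are called the residuals. The first-stage estimate at the location of the float profile, *S*′_{1}, is retained. In the second stage, the residuals are mapped to the location of the float profile with a covariance that is a function of both the small-scale spatial separation and the temporal separation; the Gaussian decay scale is determined by the small spatial scales Lx_{2} and Ly_{2}, as well as the temporal scale *τ*:

$$\mathsf{C}dd_{ij}(x, y, t) = \exp\left\{-\left[\frac{(x_i - x_j)^2}{\mathrm{Lx}_2^2} + \frac{(y_i - y_j)^2}{\mathrm{Ly}_2^2} + \frac{(t_i - t_j)^2}{\tau^2}\right]\right\}. \tag{2b}$$

The second-stage estimate, *S*′_{2}, is the mapped residual at the location of the float profile.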

The final objective estimate at the float profile location, *S*′_{f}, is the sum of the estimates from the two stages, *S*′_{f} = *S*′_{1} + *S*′_{2}. This two-stage mapping favors historical data that are not only close to the float profile location in space (relative to Lx_{1}, Ly_{1} and Lx_{2}, Ly_{2}), but are also close to the float profile location in time (relative to *τ*). When there are historical data nearby in space and time, the objective estimate will reflect their values and have small errors. If the time differences between the historical data and the float measurements exceed *τ,* the second-stage contribution will be small. In this case, the final estimate will relax back toward the first-stage map, or the large-scale time-mean climatological field, and the errors will be larger. If the float drifts into areas with no chlorofluorocarbon (CFC) apparent age estimates, such as some of the marginal seas, the second stage maps the residuals with Lx_{2} and Ly_{2} only.

In the first stage of mapping, 𝗖*dg* and 𝗖*dd* are scaled by the signal variance of the historical data, while in the second stage of mapping, 𝗖*dg* and 𝗖*dd* are scaled by the signal variance of the residuals. In addition, the noise variance of the historical data is added to the main diagonal of 𝗖*dd* in both stages of mapping. The same noise variance is used in both stages of mapping because, as discussed previously, the noise in the data represents the random oceanic processes that cannot be measured and so does not change with scales.
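A compact sketch of the two-stage mapping, under stated assumptions: the covariance is the Gaussian form described above scaled by the signal variance, the noise variance is supplied by the caller, the a priori value in each stage is the mean of the data being mapped, and the small linear solver is a generic Gaussian elimination. All function names and test values are hypothetical; this is not the authors' code.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def variance(v):
    mean = sum(v) / len(v)
    return sum((u - mean) ** 2 for u in v) / len(v)


def gauss(p, q, lx, ly, tau=None):
    """Gaussian covariance shape between points p = (x, y, t) and q."""
    e = (p[0] - q[0]) ** 2 / lx ** 2 + (p[1] - q[1]) ** 2 / ly ** 2
    if tau is not None:
        e += (p[2] - q[2]) ** 2 / tau ** 2
    return math.exp(-e)


def map_stage(pos, d, targets, s2, n2, lx, ly, tau=None):
    """Estimate <d> + Cdg (Cdd + noise I)^-1 (d - <d>) at each target point."""
    mean = sum(d) / len(d)
    cdd = [[s2 * gauss(p, q, lx, ly, tau) + (n2 if i == j else 0.0)
            for j, q in enumerate(pos)] for i, p in enumerate(pos)]
    w = solve(cdd, [v - mean for v in d])
    return [mean + sum(s2 * gauss(g, p, lx, ly, tau) * wi for p, wi in zip(pos, w))
            for g in targets]


def two_stage(pos, d, float_pos, n2, large, small, tau):
    """Large-scale map plus small-scale space-time map of the residuals."""
    stage1 = map_stage(pos, d, pos + [float_pos], variance(d), n2, large[0], large[1])
    resid = [v - e for v, e in zip(d, stage1[:-1])]
    stage2 = map_stage(pos, resid, [float_pos], variance(resid), n2,
                       small[0], small[1], tau)
    return stage1[-1] + stage2[0]


pos = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 5.0, 0.0), (-10.0, 0.0, 0.0)]
d = [34.60, 34.50, 34.55, 34.52]
estimate = two_stage(pos, d, (0.0, 0.0, 0.0), 1e-8, (20.0, 10.0), (8.0, 4.0), 10.0)
```

With a very small noise variance the map nearly interpolates the data, so the estimate at a location that coincides with a historical station stays close to that station's value.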

### c. Weighted least squares fit for a time-varying slope in potential conductivity space

Corrections to the float salinity data are obtained by fitting to the objectively estimated climatological salinity field on the standard *θ* surfaces by weighted least squares. Sensor calibrations are best applied to measured quantities, which for the floats is conductivity. However, direct comparison of conductivity is not ideal, because conductivity depends on pressure (as well as salinity and temperature), and the pressures of the historical *θ* surfaces will not necessarily match those of the floats. A more suitable parameter is a derived quantity, potential conductivity, defined as *C*_{θ} = *C*(*S,* *θ,* *P* = 0). In other words, it is the conductivity calculated from the equation of state (Fofonoff and Millard 1983) using the observed salinity, potential temperature (instead of in situ temperature) and a pressure of zero instead of the actual pressure (J. Toole 2000, personal communication). By using a reference pressure of zero, potential conductivity eliminates the differences in the pressures of the standard *θ* surfaces between climatological and float data. All salinity values from the floats and climatology (and their errors) on the standard *θ* surfaces are therefore converted to potential conductivity.

Calibration drift of the conductivity sensors on the floats is mainly due to either biological fouling or ablation of a biocide used to prevent biofouling. The dimensional variation from such processes causes the cell geometry to change and alters the effective volume over which the conductivity is measured. This in turn causes the ratio of the measured to true conductivity to change. Thus, the correction to the conductivities is assumed to be a multiplicative factor (or a slope term). With traditional CTD casts that are accompanied by in situ bottle data, it is often empirically necessary to fit both a slope and a bias to conductivity in order to obtain a calibration with small residuals throughout the water column. In those cases, the presence of accurate in situ bottle data from the shallow layers to the deeper layers means that a wide range of conductivity values are available for the least squares method to obtain a good fitting for a slope and a bias (an overdetermined system). However, in our case, estimation of climatology is only accurate in the deeper layers, which, by using the inverse of the error variance as weights, effectively means that only a narrow range of conductivity values are available for least squares fitting (an underdetermined system). This situation is crudely analogous to the example of fitting a straight line through a single data point, where an infinite number of solutions are possible. To constrain the solution, we assume a priori that there is no bias (i.e., the straight line passes through the origin). The resulting slope term is the preferable choice of model parameter because its physical interpretation is more akin to the expected behavior of the conductivity cell as its geometry changes.

With accurate contemporary shipboard salinity measurements, Bacon et al. (2001) have demonstrated a method for calibrating profiling autonomous Lagrangian circulation explorer (PALACE) float data by using an additive correction only in salinity space. An additive salinity correction is roughly equivalent to a multiplicative conductivity correction. Thus both our method and that of Bacon et al. (2001) have effectively chosen to model only a conductivity slope. However, if sufficiently accurate data were available over a range of conductivities, it would be preferable to correct for both a conductivity slope and a conductivity bias.

The correction model for the *i*th profile from a float then takes the form

$$C'_i = r_i C_i + \varepsilon_i, \tag{4}$$

where *C*_{i} is the float potential conductivity, *C*′_{i} is the estimated climatological potential conductivity, *r*_{i} is the multiplicative correction term, and ɛ_{i} is the model error. An estimate of *r*_{i} is found by using standard weighted least squares minimization between the potential conductivities from the float and from climatology. A multiplicative correction term *r*_{i} is solved for each individual float profile *F*_{i}, because a profile-varying (time-varying) correction will take into account the gradual evolution of the changes in the sensor cell geometry. We further assume that the rate of cell geometry change in a float is relatively constant over a number of profiles. In other words, the conductivity cell changes slowly over time instead of in sudden jumps. Hence for every float profile *F*_{i}, *r*_{i} is found by minimizing a 2*k* + 1 profile series of differences between the float potential conductivities and those from the climatology, where *k* > 0. The series comprises *k* profiles prior to *F*_{i} (*F*_{i−k}, … , *F*_{i−1}) and *k* profiles after *F*_{i} (*F*_{i+1}, … , *F*_{i+k}), with *F*_{i} itself at the origin, or the center of the profile series. The inclusion of multiple profiles in the least squares fit serves to smooth out some of the transient oceanic noise sampled by individual float profiles, thus giving a more stable calibration.

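In matrix form, the least squares problem is

$$\mathsf{G} \cdot \mathbf{m} + \boldsymbol{\varepsilon} = \mathsf{D},$$

where 𝗚 is the matrix consisting of the time series of float potential conductivities *C*_{i}, 𝗗 is the data matrix consisting of the time series of corresponding estimated climatological potential conductivities *C*′_{i}, **m** are the model parameters, and ɛ are the model errors. Two parameters have been built into the system: the multiplicative correction term *r*_{i}, and the time derivative of the multiplicative correction term ∂*r*_{i}. Hence, for a float profile *F*_{i} with *n*_{i} *θ* levels, the linear system is of the form

$$\begin{bmatrix} C_{i-k} & -k\,C_{i-k} \\ \vdots & \vdots \\ C_{i} & 0 \\ \vdots & \vdots \\ C_{i+k} & k\,C_{i+k} \end{bmatrix} \begin{bmatrix} r_i \\ \partial r_i \end{bmatrix} + \boldsymbol{\varepsilon} = \begin{bmatrix} C'_{i-k} \\ \vdots \\ C'_{i} \\ \vdots \\ C'_{i+k} \end{bmatrix},$$

where each *C*_{j} and *C*′_{j} stands for the column of *n*_{j} values from profile *F*_{j}, and the second column carries the profile offset *j* − *i* from the origin of the series.

For profiles taken shortly after the float was deployed or near the end of the available profiles, the system of equations is truncated appropriately. Typically we use *k* = 10. So, for example, for the third profile of a float, the system would include the first 3 + *k* = 13 profiles, assuming that many profiles exist. This system has Σ *n*_{i} simultaneous equations and two unknowns. Hence, except for the extreme case where only one profile with only one *θ* level is available, this is formally an overdetermined system.

Since the climatological estimates have varying uncertainties, they will not provide equal constraints on the calibration constants. For example, the objective estimates of salinity (and then potential conductivity) for the deeper *θ* surfaces usually have significantly smaller errors than those at shallower depths. We follow the standard practice of defining a diagonal weighting matrix, 𝗪, of dimension Σ *n*_{i} × Σ *n*_{i}, where the diagonal elements are chosen to be the reciprocal of the mapping error variance corresponding to the potential conductivities in 𝗗. That is, 𝗪 = diag[*σ*^{2}_{map}(*C*′)]^{−1}, where *σ*^{2}_{map}(*C*′) is calculated from *σ*^{2}_{map}(*S*′_{f}). In this way, the deeper *θ* surfaces where the *θ*–*S* relationships are more stable are used dominantly in the calibration.

The weighted problem is

$$\mathsf{W}^{1/2} \cdot \mathsf{G} \cdot \mathbf{m} = \mathsf{W}^{1/2} \cdot \mathsf{D}. \tag{6}$$

It is solved by singular value decomposition, 𝗪^{1/2} · 𝗚 = 𝗨 · Λ · 𝗩^{T}, retaining two eigenvalues (Menke 1989). The least squares solution to the weighted problem in (6) then is

$$\mathbf{m} = \mathsf{V} \cdot \Lambda^{-1} \cdot \mathsf{U}^{T} \cdot \mathsf{W}^{1/2} \cdot \mathsf{D}. \tag{7}$$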

The calibrated salinity values are obtained from the corrected potential conductivities in (4) with *r*_{i} in (7). There is no significant difference between final values as to whether one corrects potential conductivity or conductivity before converting back to salinity.
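The fit for one profile can be sketched as below. Assumptions: the drift term enters through the profile offset *j* − *i*, the weights are the reciprocal mapping error variances, and the two-parameter problem is solved with the 2 × 2 weighted normal equations rather than the singular value decomposition cited in the text (the two agree when both parameters are resolved). Names and synthetic data are hypothetical.

```python
def fit_slope(float_c, clim_c, clim_var, i, k=10):
    """Weighted fit of (r_i, dr_i) for 0-based profile index i.

    float_c[j], clim_c[j], clim_var[j] list the float potential conductivity,
    the climatological potential conductivity, and its mapping error variance
    on the theta levels of profile j.
    """
    first, last = max(0, i - k), min(len(float_c) - 1, i + k)  # truncated at the ends
    a11 = a12 = a22 = b1 = b2 = 0.0
    for j in range(first, last + 1):
        offset = j - i  # zero at the profile of interest
        for c, c_clim, var in zip(float_c[j], clim_c[j], clim_var[j]):
            w = 1.0 / var             # diagonal element of the weighting matrix
            g1, g2 = c, offset * c    # one row of the two-column system
            a11 += w * g1 * g1
            a12 += w * g1 * g2
            a22 += w * g2 * g2
            b1 += w * g1 * c_clim
            b2 += w * g2 * c_clim
    det = a11 * a22 - a12 * a12
    if a22 == 0.0 or det <= 1e-12 * a11 * a22:
        return b1 / a11, 0.0  # a single profile cannot resolve the trend
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Synthetic float whose required correction grows by 0.0002 per profile.
levels_c = [3.0, 3.2, 3.5]
float_c = [levels_c[:] for _ in range(11)]
clim_c = [[(1.0 + 0.0002 * j) * c for c in levels_c] for j in range(11)]
clim_var = [[4e-6, 2e-6, 1e-6] for _ in range(11)]
r5, dr5 = fit_slope(float_c, clim_c, clim_var, i=5)
r0, dr0 = fit_slope(float_c, clim_c, clim_var, i=0)
```

Calling the function with `i=0` exercises the truncated case: the window then holds only the profile of interest and the profiles after it.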

The error variances of the model parameters follow from propagating the data errors through the solution in (7):

$$[\sigma^2_r,\ \sigma^2_{\partial r}] = \mathrm{diag}\left(\mathsf{A} \cdot \mathsf{R}^2 \cdot \mathsf{A}^{T}\right), \tag{8}$$

where 𝗔 is the matrix that multiplies 𝗗 in the solution (7), *σ*^{2}_{r} and *σ*^{2}_{∂r} are the error variances of *r*_{i} and ∂*r*_{i}, respectively, and 𝗥^{2} is the error covariance of the data matrix 𝗗. In the case where all the climatological profiles in the time series were independent, the error covariance matrix 𝗥^{2} would simply be a diagonal matrix, where the diagonal elements are the mapping errors associated with the climatological potential conductivity estimates, *σ*^{2}_{map}(*C*′) as derived from (3), and zeros as the off-diagonal elements. Such off-diagonal zeros (which represent the independence) are, of course, unrealistic, as there is vertical dependence between the various *θ* levels and lateral dependence between adjacent climatological profiles in the time series. A data covariance matrix, 𝗰𝗼𝘃 𝗗, therefore needs to be constructed to give a realistic error estimation.

For the vertical dependence between *θ* levels, the vertical extents of water masses are used to provide a measure of the vertical scales. An oceanic water mass is a body of water with a common formation history, hence a characteristic *θ*–*S* combination (Tomczak and Godfrey 1994). As the water masses spread they mix, and so at any given point in the ocean, the depth range that a water mass occupies gives some indication as to the degree of mixing, or vertical coherence, of the water column. For example, in the Pacific Ocean, the water column from the shallow depth to the abyssal layer is typically occupied, in turn, by surface waters, subtropical waters, central waters, mode waters, intermediate waters, deep waters, and bottom waters. Surface waters are separated from bottom waters by about 4000 m in the vertical and about 4000 km in distance between their respective formation regions, and so these two water masses obviously are independent from each other. To estimate the vertical covariance between *θ* levels, a set of *θ* boundaries is established to delimit the generic vertical water mass structure of a typical ocean basin. The *θ* boundaries are set at 30°, 24°, 18°, 12°, 8°, 4°, 2.5°, 1°, and 0°C. The vertical covariance function between two levels *θ*_{p} and *θ*_{q} is given by

$$C\theta_{pq} = \exp\left[-\frac{(\theta_p - \theta_q)^2}{L\theta^2}\right],$$

where *Lθ* is the vertical water mass scale determined by the differences between *θ* boundaries.

Similarly, the lateral covariance between climatological profiles in the data matrix is estimated using the Gaussian function 𝗖*dd*_{ij}(*x,* *y*) from (2b), with the small spatial scales but with *t*_{i} − *t*_{j} = 0. This means that the lateral covariance between profiles depends only on their spatial separation relative to the small spatial scales. The mean ages of the climatological profiles are assumed to be similar; hence, their temporal separations are not taken into account and are therefore set to zero.

A data covariance matrix 𝗰𝗼𝘃 𝗗 is then constructed using the vertical covariance matrix 𝗖*θ* and the lateral covariance function 𝗖*dd*_{ij}; 𝗰𝗼𝘃 𝗗 consists of (2*k* + 1) × (2*k* + 1) tilings of copies of 𝗖*θ* (less than 2*k* + 1 for a truncated profile series). Each tile 𝗖*θ*_{ij} is of dimension *n*_{i} × *n*_{j}, where *n*_{i} is the number of available *θ* levels for the *i*th profile in the 2*k* + 1 (or the truncated) profile series. Each tile 𝗖*θ*_{ij} is then scaled by 𝗖*dd*_{ij}, the lateral covariance between the *i*th and *j*th profiles. Hence, for example, the tiles along the main diagonal of 𝗰𝗼𝘃 𝗗 are simply 𝗖*θ*_{ii}, because 𝗖*dd*_{ii} = 1. That is, the diagonal tiles of 𝗰𝗼𝘃 𝗗 represent the covariance of each profile with itself, so they simply have 1 as their diagonal elements, and *Cθ*_{pq} as their off-diagonal elements.
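A minimal sketch of the tiling, assuming a single vertical scale `l_theta` for all levels (the text derives the scale from the *θ* boundaries) and a Gaussian form for both the vertical and lateral factors. Function names and sample values are hypothetical.

```python
import math


def c_theta(theta_p, theta_q, l_theta):
    """Vertical covariance between two theta levels."""
    return math.exp(-((theta_p - theta_q) ** 2) / l_theta ** 2)


def lateral(p, q, lx2, ly2):
    """Lateral covariance between two profile positions (no time separation)."""
    return math.exp(-((p[0] - q[0]) ** 2 / lx2 ** 2 + (p[1] - q[1]) ** 2 / ly2 ** 2))


def build_cov_d(levels, positions, l_theta, lx2=8.0, ly2=4.0):
    """Tile the vertical covariance, scaling tile (i, j) by the lateral covariance.

    levels[i]: theta levels available for profile i.
    positions[i]: (longitude, latitude) of profile i.
    """
    rows = []
    for i, levels_i in enumerate(levels):
        for theta_p in levels_i:
            row = []
            for j, levels_j in enumerate(levels):
                scale = lateral(positions[i], positions[j], lx2, ly2)
                row.extend(scale * c_theta(theta_p, theta_q, l_theta)
                           for theta_q in levels_j)
            rows.append(row)
    return rows


# Two profiles with 2 and 3 available theta levels give a 5 x 5 matrix.
levels = [[2.0, 3.0], [2.0, 3.0, 4.0]]
positions = [(0.0, 0.0), (1.0, 0.5)]
cov_d = build_cov_d(levels, positions, l_theta=1.5)
```

Because profiles may have different numbers of available *θ* levels, the off-diagonal tiles are rectangular, while the assembled matrix stays square and symmetric.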

The error covariance 𝗥^{2} is then calculated as

$$\mathsf{R}^2 = \sigma^2_{\mathrm{map}}(C') \cdot \mathsf{cov}\,\mathsf{D}. \tag{9}$$

If an error *κ* associated with a float is known, it can be incorporated into 𝗥^{2} by adding diag(*κ*^{2}) to (9). The error variance [*σ*^{2}_{r}(*r*_{i}), *σ*^{2}_{∂r}(∂*r*_{i})] for the *i*th profile can then be estimated by substituting 𝗥^{2} into (8). In this way, the errors associated with estimating the background climatology are carried through to the weighted least squares calculations. Note that (8) assumes that the *i*th profile is at the center of the 2*k* + 1 profile series. At the beginning and end of a float's lifetime, the series will be truncated, and so the *i*th profile will not be at the center. In those cases, *σ*^{2}_{r}(*r*_{i}) needs to be increased by adding *δ*^{2} × *σ*^{2}_{∂r}, where *δ* is the distance between *i* and the midpoint of the truncated series (e.g., for *i* = 2 in a seven-profile series, *δ* = 3.5 − 2 = 1.5), and *σ*^{2}_{∂r} is the error variance of ∂*r*_{i}.

Due to the need to accumulate a time series of float profiles to calculate a stable time-varying slope correction term, this is a delayed-mode calibration system (*k* > 0). The second parameter, ∂*r*_{i}, is not used explicitly in the calibration procedure since the parameters are calculated using each profile as the origin, and *r*_{i} gives the correction for the profile of interest. However, ∂*r*_{i} can be used to project the correction trend, so that the corrected salinity values for a profile can be estimated in real time. In other words, ∂*r*_{i} can be used for a suboptimal real-time salinity adjustment estimation in advance of the delayed-mode procedure described here. The main feature of this calibration system is that the uncertainties from each stage of the process are propagated to the end. The result is a set of calibrated float salinity data with rigorous error estimates. In the following section, we provide several examples to illustrate the workings of this calibration system.
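One way to read the real-time use of ∂*r*_{i} is a linear extrapolation in profile offset, with the slope variance inflated in the same *δ*^{2} manner that the text applies to truncated series. That pairing is an assumption of this sketch (it ignores any covariance between the two parameters), as are the function name and the numbers.

```python
def project_correction(r_i, dr_i, var_r, var_dr, delta):
    """Extrapolate the slope `delta` profiles away from the series origin.

    Returns the projected multiplicative correction and its error variance.
    """
    r = r_i + delta * dr_i
    var = var_r + delta ** 2 * var_dr
    return r, var


# Project three profiles beyond the last delayed-mode estimate (made-up values).
r_proj, var_proj = project_correction(1.0010, 0.0002, 1e-8, 4e-10, 3)
```

The projected variance grows quadratically with the offset, which matches the text's description of the real-time adjustment as suboptimal relative to the delayed-mode procedure.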

## 3. Examples

Biological fouling has a large potential to affect conductivity measurement stability in autonomous CTD profiling floats. As examples we present two types of conductivity sensors, both of which are designed to minimize the effects of biofouling, but which still experience calibration drifts. The Falmouth Scientific Instruments (FSI) conductivity sensors use inductive cells with external electrical fields, which can be distorted by marine growth. To minimize biological fouling, the cells are sometimes coated with a toxic antifouling agent, but the agent itself distorts the external fields and causes changes in the cell geometry. As the antifouling agent ablates, the cell size is altered and the salinity measurements tend to drift toward artificially high values. The Sea-Bird Electronics (SBE) sensors use electrode cells with internal fields. Small amounts of antifouling material placed at the external ends of the cell (entrance and exit) work to minimize internal biofouling without altering the cell geometry. The SBE sensor is housed inside a pumped system, so that when not sampling, the cell plumbing keeps the cell filled with poisoned water. The antifouling agents are expected to remain effective over several years. Some SBE sensors have apparently had biocide leakage into the conductivity cells after laboratory calibrations had been performed that altered cell dimensions and led to artificially low salinity measurements at the beginning of the float lifetimes. However, the biocide apparently washed off and the salinity measurements returned nearer to expected values after a few months.

Ten months of data from three floats with different sensors are presented here to show the different behavior of the two kinds of conductivity sensors, and to illustrate the workings of our calibration system. The first float, Consortium on the Ocean's Role in Climate (CORC) float 1118 (R. Davis 2000, personal communication), was equipped with an antifouling-coated FSI sensor. This float was deployed in the eastern tropical Pacific, in a region with a fair amount of historical data from WOD98 (Fig. 1a). Between November 1998 and August 1999, it moved westward from about 110° to 115°W between 8° and 10°N. The second float, Cooperative Ocean Observing Experiment (COOE) float 21070 (S. Wijffels 2001, personal communication), was equipped with an SBE sensor. This float was deployed in the eastern tropical Indian Ocean northwest of Australia, in a region with relatively sparse historical data (Fig. 1b). It moved southeastward from about 12°S, 106°E to 14°S, 108°E, between October 1999 and July 2000. The third float, University of Washington (UW) float 453 (S. Riser 2002, personal communication), was also equipped with an SBE sensor. This float was deployed in the equatorial Atlantic where historical data distribution is dense. From January 2001 to October 2001, it moved from near the equator northwestward toward 3°N (Fig. 1c).

Over the same amount of time, salinity measurements from the three floats changed in different ways. For CORC 1118, which used an FSI sensor that was coated with an antifouling agent, its salinity measurements started in reasonable agreement with the climatology, but steadily drifted toward higher values relative to the estimated background salinity field (Figs. 2a,b). In the eastern tropical Pacific, the water has a tight *θ*–*S* relationship. For example, at 7°C, objective estimates of background salinity along the float trajectory fall within the narrow range of 34.56–34.57, with the mapping errors in the range of 0.003–0.007 (Fig. 3a). The CORC 1118 salinity measurements, however, drifted from 34.58 (profile 1) to 34.65 (profile 21) over 10 months. The wide range of salinity measurements obtained by this float along its trajectory is therefore not due to the different water masses sampled, but is the result of sensor drift. Additional support for the sensor drift is found in the systematic displacements between float measurements and climatology over a temperature range encompassing at least two water masses (Fig. 2b). In the case of this float, sensor drift toward salty values is due to the ablation of the antifouling coating on the conductivity cell. This ablation changes the cell geometry, leading to salinity values that are higher later in the float's life.

For COOE 21070, which used an SBE sensor, its salinity measurements were fresher than the climatology at the beginning of the float's life, but in time drifted closer to the estimated background salinity values (Figs. 2c,d). The eastern tropical Indian Ocean is more variable than the eastern tropical Pacific, but the climatological salinity spread at depth along the float trajectory is still less than the float measurement range. For example at 2.4°C, objective salinity estimates along the float trajectory decrease from 34.72 to 34.70, with the mapping uncertainty at 0.005–0.006 (Fig. 3b). At the same temperature, the float salinity measurements increased from 34.64 (profile 1) to 34.73 (profile 28) in 10 months. This wide range of float measurements is therefore likely the result of instrument drift. In this case, the salinity disparity of nearly 0.1 at the beginning of the float's lifetime is attributed to biocide leakage into the conductivity cell before deployment. After deployment the biocide washed off relatively quickly and the sensor returned to giving more normal salinity measurements. Again, the difference between float measurements and climatology is consistent over a few water masses (Fig. 2c), indicating that the displacement is due to sensor drift.

The UW 453 float also employed an SBE sensor. These sensors are often quite stable over periods of up to 4 yr (Riser and Swift 2001, manuscript submitted to *J. Atmos. Oceanic Technol.,* hereafter RS). UW 453, however, is an exception. Unlike COOE 21070, this float started in good agreement with the climatological estimates, but its measurements drifted toward fresher values after 10 months (Figs. 2e,f). For example, at 3.6°C, historical data show salinity between 34.97 and 34.98 along the float trajectory, with uncertainty in the range 0.003–0.006 (Fig. 3c). Float measurements started at 34.97 (profile 1) and decreased to 34.89 (profile 29), which was fresher than the objective estimate by 0.08. Again, the consistent shift of the measured *θ*–*S* curve relative to the climatology over the entire water column indicates that the fresher measurements are due to sensor drift (Fig. 2f). Perhaps due to some unusual failure of the biocide system, biological growth began to accumulate over the conductivity cell, thus decreasing its effective volume and so leading to the fresher salinity values.
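The arithmetic behind the three drift arguments above can be checked with a short sketch. It uses only the salinity values quoted in the text. The comparison rule is an assumption made for illustration and is not part of the calibration model: the change in float salinity is compared with the climatological spread along the trajectory plus twice the largest quoted mapping error.

```python
# Salinity values quoted in the text for each float at one deep theta level.
FLOATS = {
    "CORC 1118": {"clim": (34.56, 34.57), "map_err": 0.007,
                  "first": 34.58, "last": 34.65},
    "COOE 21070": {"clim": (34.70, 34.72), "map_err": 0.006,
                   "first": 34.64, "last": 34.73},
    "UW 453": {"clim": (34.97, 34.98), "map_err": 0.006,
               "first": 34.97, "last": 34.89},
}


def drift_vs_background(rec):
    """Return (float salinity change, background allowance).

    The allowance (climatological spread plus twice the largest mapping
    error) is an illustrative assumption, not a rule from the paper.
    """
    float_change = abs(rec["last"] - rec["first"])
    clim_lo, clim_hi = rec["clim"]
    background = (clim_hi - clim_lo) + 2.0 * rec["map_err"]
    return float_change, background


for name, rec in FLOATS.items():
    change, background = drift_vs_background(rec)
    print(f"{name}: float change {change:.3f}, "
          f"background allowance {background:.3f}")
```

For all three floats the change in float salinity is roughly three times this allowance, which is consistent with attributing the change to the sensor rather than to the water masses sampled.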

Note that the *θ*–*S* variability of the region and the availability of historical data for calibration are reflected in the salinity mapping errors (Fig. 2). For example, CORC 1118 and UW 453 profiled in regions that are *θ*–*S* stable and densely sampled historically relative to the locations of COOE 21070. Hence the salinity mapping errors for CORC 1118 and UW 453 are less than those for COOE 21070 at the corresponding *θ* levels. In addition, at each location the salinity mapping errors naturally increase from the deeper layers to the shallower layers, thus reflecting the greater *θ*–*S* variability at the shallow depths. Generally deeper profiles sample more temporally stable and spatially uniform *θ*–*S* relationships, which is one reason for a 2000-m target profiling depth for the Argo floats.

All three floats have been put through our routine using a 21-profile time series for calibration, that is, with *k* = 10 (Figs. 2 and 3). For CORC 1118, the calibration procedure made only a slight adjustment to profile 1 but a significant adjustment to profile 21, displacing the *θ*–*S* curve toward lower salinity values closer to the climatology. The opposite result is obtained for COOE 21070: the procedure made a significant adjustment to profile 1, displacing the *θ*–*S* curve toward higher salinity values closer to the climatology, while only a slight adjustment was made to profile 28. For UW 453, the calibration made almost no adjustment to profile 1, but displaced profile 29 toward higher salinity values. In all cases, the estimated salinity calibration errors are small (less than 0.01). This is because the salinity calibration errors are essentially the salinity objective estimate uncertainties from the deepest *θ* level, which are small in these regions.

The effect of using different lengths of profile series in the calibration can be seen in the time evolution of the potential conductivity slope correction term *r.* Figure 4 shows the evolution of *r* using *k* = 1 and *k* = 10. A shorter time series (small *k*) will better fit fluctuations over a shorter timescale, such as the rapid changes due to biocide wash-off, but a longer time series (large *k*) will average over the effects of the float variability and ocean variability sampled by the float, thus giving a more stable calibration in the long term.

As *k* decreases, the solution approaches the limiting case *k* = 0, in which one profile is fitted to one objectively estimated cast. A slope term can still be determined, but the estimate will be sensitive to transient oceanic noise sampled by individual float profiles. Ideally, the length of the time series should span several eddy scales (temporal and/or spatial), so that the effects of variability are averaged out over many samples, thus giving a more stable and robust calibration. For most of the world's oceans, this would necessitate using float measurements over several months. However, where rapid biocide wash-off is suspected, as in the case of COOE 21070, the calibration could start out with a small *k* during the wash-off, and transition to a larger *k* later in the float's profile series. Some further exploration to determine optimal *k* is warranted.
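The role of *k* can be illustrated with a minimal sketch of a windowed weighted least squares fit. It assumes a simplified form of the model: within a window of up to 2*k* + 1 profiles, a single multiplicative term *r* scales the float potential conductivities toward the climatological estimates, with weights given by the inverse squared mapping errors. The function names, array layout, and the constant-*r*-per-window simplification are assumptions made here and are not taken from the paper.

```python
import numpy as np


def window_slope(float_cond, clim_cond, clim_err, center, k):
    """Weighted least squares slope r for the profiles within k of `center`.

    All three inputs have shape (n_profiles, n_theta_levels). The fit
    minimises sum(((r * float_cond - clim_cond) / clim_err) ** 2) over
    the window, which has the closed-form solution used below.
    """
    n_profiles = float_cond.shape[0]
    lo = max(0, center - k)                # window is truncated at the
    hi = min(n_profiles, center + k + 1)   # ends of the profile series
    c = float_cond[lo:hi].ravel()
    c_hat = clim_cond[lo:hi].ravel()
    w = 1.0 / clim_err[lo:hi].ravel() ** 2
    return np.sum(w * c * c_hat) / np.sum(w * c * c)


def slope_series(float_cond, clim_cond, clim_err, k):
    """Evaluate the windowed slope at every profile in the series."""
    n_profiles = float_cond.shape[0]
    return np.array([
        window_slope(float_cond, clim_cond, clim_err, i, k)
        for i in range(n_profiles)
    ])


if __name__ == "__main__":
    # Synthetic series: 30 profiles, 5 theta levels, linear sensor drift.
    rng = np.random.default_rng(0)
    r_true = np.linspace(1.000, 0.998, 30)
    clim = np.tile(np.linspace(3.2, 3.6, 5), (30, 1))
    err = np.full_like(clim, 0.001)
    noisy_float = clim / r_true[:, None] + rng.normal(0.0, 0.001, clim.shape)
    for k in (0, 1, 10):
        r = slope_series(noisy_float, clim, err, k)
        print(f"k={k:2d}: rms error in r = {np.sqrt(np.mean((r - r_true) ** 2)):.2e}")
```

With *k* = 0 each profile is fitted on its own, so the estimate follows every fluctuation; larger *k* averages over more profiles at the cost of smoothing rapid changes such as biocide wash-off.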

Note that the least squares fit is also fairly sensitive to the presence of wild outliers. These situations can be caused by transient biofouling that often affects only one profile. In those cases, setting *k* to a small number is not a good solution. A better method is to manually remove the wild outliers before the calibration is run.
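A simple screening step can help identify candidate outliers for manual inspection. The sketch below is not part of the calibration model; it is one possible aid, assuming that a per-profile salinity offset (float minus climatology at a deep *θ* level) is available. Each offset is compared with the median of its neighbours, using the median absolute deviation as a robust scale. The window half-width, threshold, and scale floor are arbitrary illustrative choices.

```python
import numpy as np


def flag_outliers(offsets, half_width=5, threshold=5.0, min_scale=1e-3):
    """Flag profiles whose offset departs sharply from neighbouring profiles.

    offsets    : per-profile salinity offset (float minus climatology).
    half_width : number of neighbouring profiles used on each side.
    threshold  : number of robust standard deviations that counts as wild.
    min_scale  : floor on the robust scale, to avoid division-like blow-ups
                 when the neighbours are nearly identical (assumed value).
    """
    offsets = np.asarray(offsets, dtype=float)
    n = offsets.size
    flags = np.zeros(n, dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        # Exclude the profile under test from its own reference set.
        neighbours = np.delete(offsets[lo:hi], i - lo)
        med = np.median(neighbours)
        mad = np.median(np.abs(neighbours - med))
        scale = max(1.4826 * mad, min_scale)  # 1.4826 scales MAD to sigma
        flags[i] = abs(offsets[i] - med) > threshold * scale
    return flags


if __name__ == "__main__":
    drift = np.linspace(0.0, 0.07, 21)   # slow sensor drift
    drift[8] += 0.2                      # one transient spike
    print(np.flatnonzero(flag_outliers(drift)))
```

Because the reference is a running median rather than a global one, a slow drift is not itself flagged; only profiles that stand apart from their neighbours are marked for review before the calibration is run.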

## 4. Discussion

This calibration model assumes that the pressure and temperature measurements from the autonomous CTD profiling floats are accurate and that only the salinity measurements drift slowly over time. To correct the salinity drift, the model makes use of adjacent profiles (a time series) to estimate a time-varying multiplicative correction term *r* by fitting to the estimated climatological potential conductivities on *θ* surfaces. The objective mapping technique provides an error estimate associated with the climatological estimate.

Due to the need to accumulate a time series, this is a delayed-mode system (*k* > 0). Stable calibrations are expected to take a few months after the float is deployed. After this initial period, calibration estimates will still be best from the middle of the profile series, but least squares estimates of a calibration will be possible for the ends of the series, and the time derivative of the slope correction term, ∂*r*/∂*t*, can be used to predict suboptimal calibrations for upcoming profiles. The statistical uncertainty associated with estimating the background climatology from historical data is carried through in the weighted least squares calculations.
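The prediction step mentioned above can be sketched as a linear extrapolation of *r*. This is a minimal illustration and not the procedure used in the paper: it assumes that the time derivative is estimated from a straight-line fit to the most recent calibrated values, and the function name and the `n_fit` parameter are hypothetical.

```python
import numpy as np


def predict_slope(times, r_values, t_new, n_fit=5):
    """Extrapolate the slope correction term r to a future profile time.

    times, r_values : times (e.g. days since launch) and calibrated r
                      for the profiles processed so far.
    t_new           : time of the upcoming profile.
    n_fit           : number of most recent values used to estimate dr/dt.
    """
    t = np.asarray(times, dtype=float)[-n_fit:]
    r = np.asarray(r_values, dtype=float)[-n_fit:]
    drdt, _intercept = np.polyfit(t, r, 1)   # slope of a straight-line fit
    return r[-1] + drdt * (t_new - t[-1])


if __name__ == "__main__":
    days = np.arange(10) * 10.0              # a profile every 10 days
    r_hist = 1.0 - 1.0e-5 * days             # synthetic linear drift
    print(predict_slope(days, r_hist, t_new=100.0))
```

Such a prediction is only as good as the assumption that the drift rate stays constant, so it would be replaced by the delayed-mode estimate once enough later profiles have accumulated.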

Since *θ*–*S* relationships vary over time even at great depths (e.g., Dickson et al. 2001), this system relies heavily on the availability of a global hydrographic dataset with dense and recent coverage for a good representation of the temporal climate regime contemporary with the float measurements. The WOD98 is a good starting point for a global database, but it needs to be augmented by recent hydrographic data as they become available. These should include shipboard CTD measurements taken at the launching of the floats, or any other shipboard CTD profiles taken near the floats. The inclusion of contemporary high quality calibrated hydrographic data will help to determine whether a measured trend is due to sensor drift or due to natural variability.

In the absence of contemporary hydrographic data, calibration by *θ*–*S* climatology works best for the parts of the ocean where the water has a stable *θ*–*S* relationship with little natural variability. The deep oceans away from the bottom boundary are generally areas that have a long ventilation timescale. Hence sampling to 2000 m or to where the *θ*–*S* curve is stable will help the calibration in most areas. In areas with great spatial and temporal variability, where ventilation timescales are short and ventilation is deep, such as frontal regions, the subpolar North Atlantic (e.g., Dickson et al. 1996), and perhaps the Southern Ocean (e.g., Gordon 1982), statistical estimates are more uncertain, and so give rise to uncertain calibrations. An alternative method for the high-variability areas would be to compare floats where they approach each other to build an internally consistent dataset within the float measurements, perhaps using available ship-based hydrographic data as calibration anchor points (e.g., Bacon et al. 2001). Such a system could best be constructed after study of the behavior of many floats with respect to historical data in a region where stable *θ*–*S* relationships exist, such as the Pacific Ocean.

This model thus raises the issue that until accurate and stable conductivity measurements are achievable for these floats, in situ hydrographic observations at intervals less than the ventilation timescale to the bottom of the float profiling depths are needed in order to accurately calibrate profiling float salinity data. Rapid progress has been made in achieving stable salinity measurements over periods of approximately 4 yr. Float technology is still evolving, and the newer sensors have demonstrated the ability to maintain the Argo salinity accuracy target of 0.01 over longer periods (RS). Accompanied by contemporary ship-based measurements, a high quality integrated global ocean observation system is achievable based on these autonomous CTD profiling floats.

## Acknowledgments

We would like to thank Russ Davis, Susan Wijffels, and Steve Riser for making the float data available and for providing information on their floats. The data from the Australian float COOE 21070 are from the Cooperative Ocean Observing Experiment funded by a CSIRO Special Executive Grant. Special thanks are due to John Bullister and the CFC measuring community for providing the CFC apparent ages. Dean Roemmich and John Gilson are gratefully acknowledged for comments on the objective mapping technique. John Toole provided valuable discussions on potential conductivity and its application to sensor calibration. Comments from Ray Schmitt and three anonymous reviewers greatly improved the manuscript. Don Denbo, Willa Zhu, Jason Fabritz, and Kristy McTaggart are acknowledged for their assistance with data and programming. This project was supported by the National Ocean Partnership Program and is a contribution to the Argo project. This publication was supported by JISAO under the NOAA Cooperative Agreement NA67RJ0155.

## REFERENCES

Akima, H., 1970: A new method of interpolation and smooth curve fitting based on local procedures. *J. Assoc. Comput. Mach.*, **17**, 589–602.

Bacon, S., L. Centurioni, and W. Gould, 2001: The evaluation of salinity measurements from PALACE floats. *J. Atmos. Oceanic Technol.*, **18**, 1258–1266.

Bretherton, F., R. Davis, and C. Fandry, 1976: A technique for objective analysis and design of oceanographic experiments applied to MODE-73. *Deep-Sea Res.*, **23**, 559–582.

Conkright, M. E., and Coauthors, 1998: World Ocean Database 1998. Version 1.2, National Oceanographic Data Center Internal Rep. 14, Ocean Climate Laboratory, National Oceanographic Data Center, Silver Spring, MD, 114 pp.

Dickson, R., J. Lazier, J. Meincke, P. Rhines, and J. Swift, 1996: Long-term coordinated changes in the convective activity of the North Atlantic. *Progress in Oceanography*, Vol. 38, Pergamon, 241–295.

Dickson, R., J. Hurrell, N. Bindoff, A. Wong, B. Arbic, B. Owens, S. Imawaki, and I. Yashayaev, 2001: The world during WOCE. *Ocean Circulation and Climate—Observing and Modelling the Global Ocean*, G. Siedler, J. Church, and J. Gould, Eds., Academic Press, 557–583.

Doney, S., and J. Bullister, 1992: A chlorofluorocarbon section in the eastern North Atlantic. *Deep-Sea Res.*, **39**, 1857–1883.

Emery, W., and J. Dewar, 1982: Mean temperature–salinity, salinity–depth and temperature–depth curves for the North Atlantic and the North Pacific. *Progress in Oceanography*, Vol. 11, Pergamon, 219–305.

Fofonoff, N., and R. Millard, 1983: Algorithms for computation of fundamental properties of seawater. UNESCO Technical Papers in Marine Science, No. 44, 53 pp.

Fukumori, I., and C. Wunsch, 1991: Efficient representation of the North Atlantic hydrographic and chemical distributions. *Progress in Oceanography*, Vol. 27, Pergamon, 111–195.

Gordon, A., 1982: Weddell Deep Water variability. *J. Mar. Res.*, **40** (Suppl.), 199–217.

Lozier, M., M. McCartney, and W. Owens, 1994: Anomalous anomalies in averaged hydrographic data. *J. Phys. Oceanogr.*, **24**, 2624–2638.

McIntosh, P., 1990: Oceanographic data interpolation: Objective analysis and splines. *J. Geophys. Res.*, **95** (C8), 13529–13541.

Menke, W., 1989: *Geophysical Data Analysis: Discrete Inverse Theory*. Rev. ed. Academic Press, 289 pp.

Roemmich, D., 1983: Optimal estimation of hydrographic station data and derived fields. *J. Phys. Oceanogr.*, **13**, 1544–1549.

Tomczak, M., and J. Godfrey, 1994: *Regional Oceanography: An Introduction*. Pergamon, 422 pp.

Worthington, L., 1981: The water masses of the World Ocean: Some results of a fine-scale census. *Evolution of Physical Oceanography: Scientific Surveys in Honor of Henry Stommel*, B. A. Warren and C. Wunsch, Eds., The MIT Press, 42–69.

Wunsch, C., 1996: *The Ocean Circulation Inverse Problem*. Cambridge University Press, 442 pp.

^{*} Joint Institute for the Study of the Atmosphere and Ocean Contribution Number 816, Pacific Marine Environmental Laboratory Contribution Number 2314, and Woods Hole Oceanographic Institution Contribution Number 10525.