## Abstract

This study proposes a method for the cross calibration of tide gauges. Based on the combination of at least three collocated sea level time series, it takes advantage of the least squares variance component estimation (LS-VCE) method to assess both sea level biases and uncertainties in real conditions. The method was applied to a multi-instrument experiment carried out on Aix Island, France, in 2016. Six tide gauges were deployed to carry out simultaneous sea level recordings for 11 h. The best results were obtained with an electrical contact probe, which reaches a 3-mm uncertainty. The method allows us to assess both the biases and the precision—that is, the full accuracy—for each instrument. The results obtained with the proposed combination method have been compared to that of a buddy-checking method. It showed that the combination of all the time series also provides more precise bias estimates.

## 1. Introduction

Tide gauges aim at measuring the vertical distance between the sea level and a reference level (or datum). Historically, tide gauges were first used for tide prediction and navigation (Cartwright 2000); today, their applications have been extended (Pugh and Woodworth 2014). Clustered into networks of continuously operating stations, they are the key components of storm surge or tsunami warning systems and climate-related monitoring programs, such as the Global Sea Level Observing System (GLOSS; IOC 2012).

A wide range of distance meter technologies can serve to implement a tide gauge, as long as it can resolve both sea level and datum along the vertical. The datum of a sea level station is a local and conventional reference level, independent from any instruments. It enables the construction of long time series with successive or overlapping tide gauges. The datum is defined through a network of benchmarks grounded around the sea level station; some of them can be benchmarks from leveling networks (IOC 1985; Pugh and Woodworth 2014). Thus, a preliminary step in field calibrations consists of tying the reference gauge to the station datum or controlling whether it is properly tied.

The simplest and oldest types of tide gauge are graduated poles or tide poles placed against a vertical structure at the coast (Cartwright 2000). Tide poles requiring human-made measurements are still in use, along with electric tape probes for on-site field calibration of more elaborated self-recording tide gauges. Since 1985, the manuals of the Intergovernmental Oceanographic Commission (IOC) have covered the basic principles of the main types of tide gauges in use across the world, ranging from mechanical float gauges (IOC 1985) to radar technologies (IOC 2016), including pressure and acoustic gauges (IOC 2002, 2006).

Over the past decade, radar-based technologies appeared as the preferred ones (IOC 2016). However, new technologies are emerging, based on Global Navigation Satellite System (GNSS) buoys (André et al. 2013), GNSS reflectometry (GNSS-R; Larson et al. 2017), or laser distance measurement (MacAulay et al. 2008). A tide gauge complying with GLOSS standards should be capable of measuring instantaneous sea level with an accuracy better than 1 cm, in all conditions of tide, waves, currents, and weather (IOC 2016). As laboratory testings do not ensure those performances, the practice has evolved toward field experiments (Martín Míguez et al. 2008a,b; Park et al. 2014; Pérez et al. 2014).

When dealing with accuracy requirements, it is useful to distinguish random and systematic errors. The random error is the error component that, in replicate measurements, varies in an unpredictable manner, whereas the systematic error is the error component that, in replicate measurements, changes in a predictable manner (BIPM et al. 2008).

Given the crucial role of tide gauges in coastal sea level observation, the increasing number of available technologies and the evolution of accuracy requirements, this study aims at providing a cross-calibration method that quantifies both systematic errors (the biases) and random errors (the uncertainties) of sea level time series.

Determining the errors of given time series can be achieved through three approaches: 1) the observed time series can be compared with that from a more precise instrument, 2) it can be compared with theory in cases where the observed phenomena can be very precisely modeled, and 3) observed time series of three or more instruments can be analyzed to obtain enough information to determine the uncertainty of each.

The first approach, also known as buddy-checking, is routinely used during calibration campaigns where a pair of tide gauges are compared over a tidal cycle, sometimes with the help of the so-called Van de Casteele (VdC) diagram (Lennon 1968; IOC 1985). During the last decade, several studies have investigated the performances of radar gauges, pressures gauges, GNSS buoys, or GNSS reflectometry based on this approach (Watson et al. 2008; Martín Míguez et al. 2008a, 2012; Pérez et al. 2014; Larson et al. 2017; Pytharouli et al. 2018). Even if this approach can provide bias estimates and general accuracy metrics, such as mean error or root-mean-square error (RMSE), it cannot rigorously separate the uncertainties of each gauge.

The second approach would correspond to removing a tide model from the measured sea level time series. But, because of the complexity of meteorological and ocean dynamics involved in sea level fluctuations, these models are not precise enough to assess the performance of tide gauges at the targeted centimeter level.

The third approach is classically used in metrology (Pálinkáš et al. 2017) and has often been used in geodesy through the three-cornered-hat (TCH) estimation method (Gray and Allan 1974), for example, to determine the stability of reference station positions (Feissel-Vernier et al. 2007; Abbondanza et al. 2015) or the precision of space gravity model (de Viron et al. 2008; Valty et al. 2013). The TCH is not the only possible implementation of the third approach: the more general framework of variance component estimation (VCE) can similarly address this problem, as shown by the theoretical example (4.10) of Amiri-Simkooei (2007). The TCH and VCE examples can separate the uncertainty of each gauge, but assume the absence of sea level biases.

To take advantage of both the first and third approaches, this study proposes a combination model that extends the use of the third approach to the analysis of potentially biased time series. Assessing the tide gauge uncertainties in addition to the sea level bias parameters is made possible by the use the least squares variance component estimation (LS-VCE) method (Teunissen and Amiri-Simkooei 2008). As the model can handle an arbitrary number of time series, it is suited for multi-instrument experiments.

The method is applied to an on-site field calibration experiment carried out at Aix Island off the mid-Atlantic coast of France, where a permanent radar gauge has operated for several years (Gouriou et al. 2013), and various types of tide gauges (including some emerging technologies) were temporarily deployed during the experiment within meters from each other over a tidal cycle in 2016.

## 2. The Aix Island experiment

This experiment was carried out on 7 June 2016, by a team of scientists (see acknowledgments). For 11 h, they recorded one semidiurnal spring tidal cycle with an amplitude of 5.22 m using six different instruments.

The six tide gauges were a permanent radar gauge (RADAR), a permanent tide pole (POLE), an electrical contact probe (PROBE), two GNSS buoys (BUOY1 and BUOY2), and a laser distance meter (LASER). RADAR, POLE, PROBE, and LASER are shown in Fig. 1 and the two GNSS buoys in Fig. 2. All tide gauges and the reference GNSS station were referenced to the station datum by leveling. Each gauge record is defined as an average over a 2-min acquisition window every 10 min.

The radar gauge (RADAR) is the primary tide gauge of the permanent sea level observatory of Aix Island. This station contributes to the French sea level observation network (RONIM) operated by the French hydrographic service (SHOM). It is a Krohne Optiwave 7300C gauge that measures the air range between the transmitter fixed above the sea surface and the sea surface with a sampling frequency of 1 Hz using a frequency-modulated continuous wave technology (IOC 2016).

The tide pole (POLE) is a permanent instrument made of a plastic staff with graduations every 10 cm fixed vertically by a stainless steel structure (Fig. 1). The operator estimates the sea level visually over the predefined 2-min acquisition period.

The electrical contact probe (PROBE) is a measuring tape with millimeter graduations ended by an electrical device that emits a short signal when detecting the seawater surface. We used a Schill probe installed within a stilling pipe anchored along the tide pole (Fig. 1). A sea level record from PROBE is an average over the 2 min of human-made readings every 15 s. Electric probes are typically used as the reference gauge in tide gauge calibrations, so was it in our study. The stilling pipe was too short to allow measurements at the lowest sea levels, which resulted in a gap between 1000 and 1210 UTC.

The first GNSS buoy (BUOY1), designed at the Institut de Physique du Globe de Paris (IPGP), is a GNSS antenna installed above a lifebuoy and protected from the water by a radome (Fig. 2). The second buoy (BUOY2), designed by the Division Technique de l’Institut National des Sciences de l’Univers (DT INSU), is a GNSS antenna housed in the center of a tripod floating structure (Fig. 2). The receivers and batteries of the buoys are located inside a metallic cylinder under each antenna. These two buoys (BUOY1 and BUOY2) were already used in previous campaigns (André et al. 2013). The heights between their phase centers and the water surface are known at the subcentimeter level thanks to previous testing carried out under calm conditions.

The buoy vertical positions, that is, ellipsoidal heights, from GPS were assessed by postprocessing, using a double-differences strategy with a baseline of about 300 m from the ILDX GNSS reference station. Only satellites with elevation angles above 15° were used, with a combination of both L1 and L2 frequencies. The centimeter level accuracy was achieved, using full ambiguity resolution with the RTKlib software suite with RTKPOST v2.4.2 program (Takasu 2013).

LASER is a reflector-free distance-meter Leica DISTO A6. This type of instrument is built for solid surface ranging but showed fair to good performances during this experiment. This instrument uses an optical laser beam with a wavelength of 635 nm. Each LASER record corresponds to an average of measurements done every 4 s.

All instruments time series are presented in Fig. 3. Due to data transmission loss and GNSS recording issues during the experiment, some records from the LASER, BUOY1, and BUOY2 instruments are missing.

## 3. Calibration methods

This study proposes a combination method to go beyond the classical difference methods, allowing a better determination of the biases and their uncertainties. For comparison, we processed the time series using both the combination method and the classical difference method used by the hydrography community, the Van de Casteele diagram (Lennon 1968).

### a. Sea level error model

Due to the short recording period, this study only considers the influence of the three most common types of range measurement biases on the resulting sea level time series, namely, height reference, scale, and clock synchronization errors (Watson et al. 2008; Martín Míguez et al. 2008b).

While converting original range measurements into sea level time series, range biases turn into sea level biases that must be quantified and removed. This study proposes a linear sea level bias model, which expresses the sea level bias as a function of the measured sea level itself. More precisely, the model links the *i*th sea level time series *y*_{i}(*t*) to the real sea level *h*(*t*) through

where *β*_{i}*y*_{i}(*t*) + *α*_{i} is the linear sea level bias model, and *e*_{i}(*t*) is a random error modeled by a centered normal distribution of unknown variance $\sigma i2$.

In Eq. (1), *α*_{i} corresponds to the intercept: a constant term representing the sea level bias when *y*_{i}(*t*) = 0. It may result from a height reference error, but also from the influence of a scale error, as mentioned by (Pérez et al. 2014). Parameter *β*_{i} corresponds to the scale error: a multiplying factor that causes a sea level bias proportional to the tidal range. It can result from both instrument and installation defaults. Finally, *τ*_{i} is the time delay between different tide gauges: it results from clock synchronization issues.

The measured sea level *y*_{i}(*t*) depends nonlinearly on the time delay *τ*_{i}, which makes linear determinations, like the one proposed in this paper, impossible. However, it can be corrected before the other bias estimations, for example, by computing the delay that maximizes the cross correlation between a tested signal and a reference signal. Obtaining *τ*_{i} by cross correlation avoids any assumptions on the periodicity of the measured signal. In our case, the time delay estimation showed that the best correlation was achieved with no delays added, that is, *τ*_{i} = 0, ∀*i*.

The sea level bias model directly quantifies the amplitude of the bias associated with the measurement *y*_{i}(*t*). The correction of the sea level time series can be done after the calibration experiment by subtracting the estimated bias model from the measurements. This linear model can be adapted to other types of biases. For example, longer time series analysis (several days, months, or years) may require to consider time-dependent biases such as trends and jumps (Pytharouli et al. 2018).

### b. The difference-based calibration method

Difference-based methods (DIFF) consist in analyzing the differences $\Delta yi\u2061(t)=yi\u2061(t)\u2212yref\u2061(t)$ between the time series of a tested instrument *y*_{i}(*t*) = *h*(*t*) + *β*_{i}*y*_{i}(*t*) + *α*_{i} + *e*_{i}(*t*) and that of a reference instrument *y*_{ref}(*t*) = *h*(*t*) + *e*_{ref}(*t*).

A commonly used tool for DIFF methods is the VdC diagram, which represents the sea level difference $\Delta yi\u2061(t)$ as a function of *y*_{i}(*t*). Initially developed in 1962, for mechanical tide gauges (IOC 1985), the VdC diagram is nonetheless still applicable for modern sea level measurement technologies (Martín Míguez et al. 2008b). The most attractive feature of this diagram is a fast, visual, detection of possible biases with only one tidal cycle. Figure 4 shows the sea level error patterns resulting from the most common range measurement errors (IOC 1985).

In the presence of the linear biases mentioned before, $\Delta yi\u2061(t)$ follows

In other words, estimates of the sea level bias parameters *α*_{i} and *β*_{i} of Eq. (1) can be obtained by a linear regression of $\Delta yi\u2061(t)$ on *y*_{i}(*t*), which corresponds to fitting a line on a VdC diagram.

Assuming that both random errors *e*_{i}(*t*) and *e*_{ref}(*t*) are uncorrelated, the term *e*_{i}(*t*) − *e*_{ref}(*t*) in Eq. (2) follows a centered normal distribution with an unknown variance $\sigma i2+\sigma ref2$. The merge of the random errors *e*_{i}(*t*) and *e*_{ref}(*t*) in the differences $\Delta yi\u2061(t)$ implies that, without assumption, the DIFF methods can only assess the variance $\sigma i2+\sigma ref2$, which is just an upper bound to the tested gauge variance $\sigma i2$ (Lentz 1993; Martín Míguez et al. 2008b; Pytharouli et al. 2018). To separate $\sigma i2$ and $\sigma ref2$, an additional piece of information is needed: a third time series.

### c. The combination-based calibration method (COMB)

When more than two time series are available, it becomes possible to assess the uncertainties and biases from each tide gauge by estimating a weighted combination of all the time series, using a variance component estimation method. In the following, the acronym COMB refers to the combination method.

#### 1) Functional model

Noting **y**_{i} the *i*th gauge *k* × 1 observation vector (or time series), the full *pk* × 1 stacked vector **y**, containing all observations from the *p* instruments, can be written as

The functional model links the expectation *E*(⋅) of the *pk* × 1 observations vector **y** to *q* unknown parameters using a model of observation equations. When there is no theoretical model for the observed signal, one can estimate a *k* × 1 combined solution **h** from the *p* measured time series, so that

In the case of unbiased gauges, the functional model would be *E*(**y**_{i}) = **h** for each gauge. In the case of the cross calibration of possibly biased time series, the functional model should also account for the biases:

Because biases are always defined with respect to a conventional reference, at least one time series must be considered as conventionally unbiased to avoid an ill-posed equation system. Hence, in the following, the first time series **y**_{1} will be considered as conventionally unbiased.

This functional model can be written using matrix algebra, so that

where **h** = [*h*_{1} ⋅⋅⋅ *h*_{k}]^{T} is the combined solution vector, ** α** = [

*α*

_{2}⋅⋅⋅

*α*

_{p}]

^{T}is the intercept parameter vector, and

**= [**

*β**β*

_{2}⋅⋅⋅

*β*

_{p}]

^{T}is the scale error parameter vector.

In Eq. (4), the combination design *pk* × *k* matrix $Ah$ corresponds to *p* stacked identity matrices $Ik\xd7k$ such as

and both the intercept design *pk* × (*p* − 1) matrix $A\alpha $ and the scale error design *pk* × (*p* − 1) matrix $A\beta $ are constituted with block nonzero vectors so that

and

where **0**_{k×1} and **1**_{k×1} refer to *k* × 1 vectors respectively filled with zeros and ones.

#### 2) Stochastic model

The stochastic model describes the variance var(⋅) of the observation vector **y**. Considering that all measurements are statistically independent and that the uncertainty of the *i*th instrument follows a multivariate normal distribution with a variance $\sigma i2$, the *pk* × *pk* covariance matrix of the observations var(**y**) = $Qy$ reads

where $Ik\xd7k$ and $0k\xd7k$ are respectively the *k* × *k* identity and null matrices.

To use the LS-VCE method, $Qy$ needs to be expressed as a linear combination of cofactor matrices $Qi$ such as

where $\sigma i2$ are referred to as variance components, and correspond to the instrument uncertainties.

In this study, the $Qi$ are known *pk* × *pk* cofactor matrices that follow

#### 3) Least squares estimation

According to the least squares estimation theory (Caspary 1987; Teunissen 2000), for normally distributed observations, an unbiased and minimum variance estimation of the *q* × 1 parameter vector **x** can be achieved by solving a normal equation system $N$**x** = **c**, where $N$ is the normal *q* × *q* matrix defined by $N=ATQy\u22121A$ and **c** is a *q* × 1 vector defined by $c=ATQy\u22121y$. Hence, the estimator of the functional parameter vector $x^$ is given by

and its covariance matrix $Qx^$ follows

In the case of a lack of knowledge on the on-site precision of the tide gauges, that is, on $Qy$, a variance component estimation method can be used to assess the uncertainty of each gauge. As the minimum variance property of least squares estimates requires a realistic weighting between sea level time series, the use of a variance component estimation method also allows for more realistic estimates of the parameter vector $x^$ and its covariance matrix $Qx^$.

#### 4) Least squares variance component estimation

A review of most variance component estimation methods can be found in Fotopoulos (2003) and Amiri-Simkooei (2007). Here, we consider the application of the LS-VCE, which is based on the same least squares estimation principles used in section 3. LS-VCE was first introduced in 1988 by Teunissen (1988) and further developed by Amiri-Simkooei (2007) and Teunissen and Amiri-Simkooei (2008). Under the assumption of the multivariate normal distribution considered in section 2, the method provides an unbiased and minimum variance estimator of the variance components. It can also be shown that the LS-VCE estimates maximize the restricted likelihood function of the considered normal distribution (Amiri-Simkooei 2007). This property is common to most rigorous VCE methods. However, the LS-VCE is more generally applicable and offers additional features, including a direct derivation of the uncertainty of each variance component estimate (Teunissen and Amiri-Simkooei 2008).

The LS-VCE method consists in using the redundancy of information of a system to infer the variance of the observations. In the case of a linear parametric functional model, one can compute a residual *pk* × 1 vector $e^$ such as

where $PA\u22a5$ is a projector matrix defined by

The residual vector $e^$ gives pieces of information about observation uncertainties, potential model misspecifications, and the presence of outliers. By assuming the absence of outliers and model misspecifications, the LS-VCE provides an estimator of the observation uncertainties using $e^$ and $PA\u22a5$.

As for the standard least squares estimation, the LS-VCE method estimates the unknown variance components *p* × 1 vector $\sigma ^2=\u2061[\sigma 12^\cdots \sigma p2^]T$ by solving a normal equations system:

where the normal matrix $N\xaf$ and the vector $c\xaf$ are specific to the stochastic model, and thus different from the normal matrix $N$ and vector **c** in Eq. (7).

For the stochastic model defined in section 2, for which all variance components are to be estimated, the elements $n\xafij$ and $c\xafi$ of $N\xaf$ and $c\xaf$ are defined by (Amiri-Simkooei 2007)

where tr(⋅) stands for the trace operator.

Note that $\sigma ^2$ is involved in the definition of $n\xafij$ and $c\xafi$ through $Qy\u22121$. Hence, Eq. (11) expresses $\sigma ^2$ as a function of $Qy$, which is already a function of $\sigma ^2$ in Eq. (6). Such system of equations, where the equations for the unknowns include functions of the unknowns, can be numerically solved using an iterative procedure starting with an initial guess on the unknowns: the prior variance component vector $\sigma 02$.

The first iteration consists in using the prior vector $\sigma 02$ and cofactor matrices $Qi$ to compute $Qy$ and $PA\u22a5$, which are necessary to build the normal equations system (11). Solving this normal equations system (11) leads to the estimation of an updated variance component vector $\sigma 12$. The next *n* iterations consist in successively updating the variance component vector $\sigma ^n2$ by solving the normal equations system (11) built using the previously estimated variance component vector $\sigma ^n\u221212$. The iterations stop when the difference between two estimated variance component vectors becomes negligible. To obtain more details on the implementation of the LS-VCE method, a symbolic algorithm can be found in Fig. 4.2 of Amiri-Simkooei (2007).

When encountering convergence issues with an arbitrary prior variance component vector, using more realistic prior tide gauge uncertainties may be necessary. One could, for example, use the information provided by the tide gauge manufacturers. However, in the case of convergence, changes in the prior variance components should not change the final LS-VCE results as the method should reach the restricted maximum of likelihood regardless of the starting point.

Once convergence is achieved, an insight into the quality the variance component estimates $\sigma ^2$—the covariance matrix of the variance component estimates—can be obtained by inverting the normal matrix $N\xaf$:

The *i*th diagonal element of $Q\sigma ^2$ corresponds to the variance of the *i*th variance component $\sigma \sigma ^i22$. As for $Qx^$, the uncertainties of variance component estimates depend on the system redundancy and the precision of the observations.

To get interpretable variance component estimates, one can change variance components $\sigma ^i2$ into standard deviation components $\sigma ^i=\sigma ^i2$. To obtain variance component uncertainties with interpretable units, one can follow Amiri-Simkooei et al. (2009), and approximate the new variance of the standard deviation component $\sigma \sigma ^i2$ by applying variance propagation law through the linearized square root function:

The more interpretable standard deviation of the standard deviation component $\sigma \sigma ^i=\sigma \sigma ^i2$ can then be derived by taking the square root of both sides of Eq. (15), which gives

where $\sigma \sigma ^i2$ is the standard deviation of the *i*th variance component $\sigma \sigma ^i2=\sigma \sigma ^i22$.

Hence, one can express the uncertainty estimate of the *i*th tide gauge as $\sigma ^i\xb1\sigma \sigma ^i$ (cm).

## 4. Results

To compare COMB and DIFF methods on a similar basis, the PROBE time series has been considered conventionally unbiased for both methods.

To remove the influence of potential outliers, residuals time series were computed using Eq. (9) before the actual processing of both methods. The functional model (4) and the covariance matrix $Qy$ = $I$ were considered in Eq. (10). Observations that showed residuals above 5 times the median absolute deviation of the gauge residual time series were removed from the dataset. In practice, it concerned less than two observations by time series.

### a. Calibration with the COMB method

Before the assessment of the unknown bias parameters and the combined solution, a realistic covariance matrix $Qy$ was first computed using LS-VCE. An arbitrary standard deviation of 0.8 cm for all the time series was used to build the prior variance component vector. Starting with $\sigma 02$, the iterative procedure, summarized in section 4 and fully described in Amiri-Simkooei (2007), provided the final variance components vector estimate $\sigma ^2$ and its covariance matrix $Q\sigma ^2$. As the elements of both $\sigma ^2$ and $Q\sigma ^2$ are not directly interpretable, the Eq. (16) was used to express each tide gauge uncertainty estimate as $\sigma ^i\xb1\sigma \sigma ^i$ (cm).

The bias parameters and the combined solution were estimated by solving the functional model (4) using the final variance component estimates: $\sigma ^2$ was substituted in Eqs. (7) and (8) through Eq. (6) to obtain the vector $x^$ and its covariance matrix $Qx^$.

Both estimated sea level bias parameters and uncertainties for 10 min records are given, in centimeter, in Table 1. The electrical PROBE is found to be the most precise gauge in this experiment, with an uncertainty of 0.3 cm. The least precise gauges are the tide pole POLE (1.23 cm) and the BUOY1 (1.25 cm). BUOY1 is nearly 2 times less precise than BUOY2 (0.74 cm).

In Table 1, four time series (RADAR, LASER, BUOY1, and BUOY2) show intercept estimates $\alpha i^$ significant at the 3$\sigma \alpha i^$—or 99%—confidence level. Their amplitudes range from −1.87 cm (RADAR) to −4.30 cm (BUOY1). For the scale errors $\beta i^$, only RADAR and POLE show estimates above $3\sigma \beta i^$, with about 0.5 and −0.3 cm m^{−1}, respectively.

The residual time series of each tide gauge are presented in Fig. 5. BUOY1 exhibits a mean shift of about −2 cm between 0720 and 0940 UTC. This artifact appears in the residual time series because it cannot result from the combination model. It means that the other gauges did not observe such a shift, otherwise, it would have been modeled by the combined solution. The presence of this artifact in the BUOY1’s residual time series lowers its precision in Table 1. For the other gauges, no clear pattern appears in the residual time series, which suggests that their biases are correctly modeled.

The combined solution $h^$ and its uncertainty $\sigma h^$ are presented in Fig. 6. Each missing value in one of the time series increases the uncertainty of the combined solution to an extent proportional to its precision. The available measurements are displayed for each tide gauge, in the bottom of Fig. 6. When the most precise tide gauge (PROBE) is not recording, between 1000 and 1210 UTC, the uncertainty $\sigma h^$ of the combined solution increases by almost a factor of two. Despite the missing values of PROBE, the combined solution is estimated for the entire experiment period because all available observations are taken into account.

To investigate whether PROBE is found to be the most precise gauge because it is the conventionally unbiased gauge, the calibration has been reprocessed by instead considering BUOY1 as conventionally unbiased. The alternative calibration results are presented in Table 2. The choice of another conventionally unbiased gauge does not change uncertainty estimates but changes bias parameter estimates and their uncertainties. Bias parameters are the most affected because they intrinsically depend on the definition of a convention. As BUOY1 does not exhibit any scale error in Table 1, the changes in scale error estimates in Table 2 are not dramatic. The sea level time series uncertainty estimates are identical in both alternatives because all biases are considered in each case. An alternative functional model ignoring an existing bias would not have provided identical results.

### b. Comparison with the DIFF method

Using PROBE as the reference gauge, we plotted the VdC diagram for RADAR, POLE, LASER, BUOY1, and BUOY2. A linear regression on each diagram provided intercept and scale error estimates for each gauge. The DIFF method estimates are presented in Table 3. The differences with the COMB method estimates are summarized in Table 4.

The discrepancies between COMB and DIFF methods reach 0.75 cm for the intercepts (BUOY1) and 0.15 cm m^{−1} for the scale errors (BUOY1). In Table 5, changes in bias uncertainties between methods are expressed in terms of uncertainty reduction percentages. The DIFF method provides slightly different results from the COMB method because it only considers a smaller subset of the dataset for each pair of gauge and because it does not take into account the precision of each time series. In this study, the DIFF method can only take into account the overlapping observations between PROBE and the tested gauges. Given that PROBE has no observation between 1000 and 1210 UTC, the DIFF method ignores several observations, which deteriorates the precision of bias estimates. As a consequence, Table 5 shows that the COMB method provides 30%–55% smaller uncertainties than the DIFF method for bias parameter estimates.

The presence of the scale error induces a height dependency of the sea level bias models and their confidence intervals. To illustrate this, Fig. 7 displays the estimated sea level bias models and their uncertainties, obtained with both methods, on the VdC diagram for BUOY1, which is the time series with the most substantial differences between the two models. At the lowest tide, sea level bias models obtained with COMB and DIFF method differs of about 3 mm. Besides, both sea level bias models are more precise around the mean tide than near the tidal extrema. As a consequence, the combined solution of the COMB method is also less precise near the tidal extrema, which results in the few millimeter changes for $\sigma h^$ that appears in Fig. 6 at lowest tide: between 1000 and 1210 UTC.

A representation of all bias estimates obtained with both DIFF and COMB methods is given in Fig. 8. Bias estimates are shown as points in the bias parameter space–intercept versus scale error. Their uncertainties appear as 1*σ* confidence ellipses. The correlations between bias parameters, always around −0.9, induce an inclination of the ellipses. As the cause of the correlation is the same—same signal and same bias model—for every time series, so are the inclinations in Fig. 8. Figure 8 also shows that, while providing more precise estimates, the COMB method still globally agrees with the DIFF method for bias detection.

## 5. Discussion

### a. Performance of the tide gauges

The PROBE time series is twice more precise than that of the next most precise tide gauge. Its good performance results probably from the use of the stilling pipe, which stabilizes the water level and allows accurate readings on the measuring tape. This result comforts the use of electrical probes as references in tide gauge calibration campaigns. The results also show that RADAR, LASER, and BUOY2 uncertainty estimates are below the centimeter level, which confirms that they could provide sea level records with the level of accuracy specified by the IOC with a confidence level of more than 67% if they were not affected by biases.

Among the six tested gauges in this work, only two, of which one automatic gauge, present an uncertainty above 1 cm: POLE (1.23 cm) and BUOY1 (1.25 cm). The 1.23-cm uncertainty of POLE might result from the limitation of human eye reading on the 10-cm graduations. The lower performance of BUOY1 compared to BUOY2 is assigned to the presence of the artifact between 0720 and 0940 UTC. Considering its floating structure is less stable than the more recent model BUOY2, this artifact could be due to the buoy instability in the presence of currents during the ebb tide. BUOY2 did not measure when BUOY1 observed the artifact; one cannot exclude that the artifact is due to a mismodeling of the GNSS data.

### b. Nature of the biases

Separating instrumental and environmental contributions in bias estimates is difficult, especially when the gauges are not fully collocated. We can nonetheless draw some hypotheses for bias attribution.

Usually, significant intercept estimates are due to instrumental height errors. But in this experiment, other explanations are plausible for BUOY1, BUOY2, LASER, and RADAR.

BUOY1 and BUOY2 show similar intercept estimates while being deployed a few tens of meters away from the ground-based instruments. Hence, changes in the dynamic topography due to currents likely impacted their intercept estimates (Pérez et al. 2014). In that case, an environmental effect is detected, not an instrumental bias.

As LASER is not dedicated to water surface measurements, the intercept estimate is certainly due to the penetration of the laser beam into the water. More appropriate laser systems have already been developed, using floating mirrors (MacAulay et al. 2008).

For RADAR, the significant intercept estimate is influenced by the strong scale error. Theoretically, LASER, RADAR, and POLE could show scale error in the case of vertical alignment defaults. This cause is plausible for RADAR and LASER. However, the vertical alignment of POLE can be considered as reliable and the human reading is the most likely source of its scale error.

Even though the nature of significant bias parameters *α*_{i} and *β*_{i} could remain unclear, one can still obtain corrected sea level time series by subtracting the bias model *β*_{i}*y*_{i}(*t*) + *α*_{i} to the measured sea level *y*_{i}(*t*).

### c. Improvement over difference based methods

The proposed calibration method provides an unbiased and minimum variance estimate of the tide gauge uncertainties, their sea level biases, and the combined solution from all the time series. The variance of all estimates, including tide gauge uncertainties, is also determined. Thus, the COMB method leads to a more complete tide gauge calibration than the DIFF method.

The application to the Aix Island experiment revealed that the proposed methodology also leads to more precise bias estimates. This improvement is attributed to the combination of all available observations along with the realistic weighting between each gauge. The drastic precision improvement, from 30% to 55% on the uncertainty of the bias parameters, mostly shows that this method is more robust to the missing values of the most precise time series (PROBE), which is used as a reference to build the VdC diagrams.

For comparison purposes, this study considers only one conventionally unbiased time series. However, the COMB method allows using several unbiased time series and partially unbiased time series at the same time, which is not possible with the DIFF method. Adding unbiased time series should further improve the results of the COMB method.

## 6. Conclusions

The present contribution proposes a method for the cross calibration of tide gauges. Based on the combination of multiple collocated time series, it takes advantage of the least squares variance component estimation (LS-VCE) method to assess both instrumental biases and measurement uncertainties in real conditions. The method was applied to a multi-instrument experiment carried out at Aix Island in 2016. Six instruments were deployed and performed simultaneous sea level recordings for 11 h, with a 10-min sampling.

The electrical probe was found to be 2–4 times more precise than the other gauges. RADAR, LASER, and BUOY2 uncertainty estimates are below the centimeter level, which confirms that, in those conditions, they could provide sea level records with the level of accuracy specified by the IOC if they were not affected by biases. We showed that, within our time series, significant bias parameters were found for all the tested gauges. Hence, this study shows that it is possible to assess both the biases and the precision—that is, the full accuracy—for each gauge.

The results obtained with the combination method have been compared to that of a difference based method. It showed that the combination of all the time series also provides more precise bias estimates.

Because this study is based on an 11-h experiment, time-dependent biases and random errors have not been considered. Further studies using the COMB methods are necessary to investigate the time dependency of sea level bias parameters and tide gauge precision.

## Acknowledgments

This study has been financially supported by the Direction Générale de l’Armement (DGA), the Nouvelle-Aquitaine region, and by the Centre National des Études Spatiales (CNES) as an application of the geodesy missions. The authors are thankful to the other participants of the Aix island experiment: Laurent Testut (CNAP-LEGOS), Médéric Gravelle (CNRS-LIENSs), Álvaro Santamaría-Gómez (CNAP-OMP), Mikaël Guichard (ULR-LIENSs), Elizabeth Prouteau (CNRS-LIENSs), Pierre Chasseloup (SONEL-LIENSs), Cyril Poitevin (LIENSs), Fabien Durand (IRD-LEGOS), Ronan Le Gall (SHOM), and Pascal Le Dû (SHOM). The authors gratefully acknowledge the three anonymous reviewers, whose comments helped improve and clarify this paper.

## REFERENCES

*Concepts of Network and Deformation Analysis*.

*School of Surveying Monogr.*, Vol. 11, University of New South Wales, 183 pp.

## Footnotes

© 2019 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).