## 1. Introduction

Satellite altimetry (SA) is considered one of the promising oceanographic tools for long-term monitoring of the ocean state. The arrival of massive satellite datasets in oceanographic practice has fundamentally changed the observational context of oceanography and greatly stimulated research in data assimilation. Although SA provides the densest data available in terms of space–time resolution, it still poses problems for data assimilation, since these data provide only *surface* observations for the reconstruction of the three-dimensional (3D) evolution of the ocean state.

In recent years the problem of vertical propagation of the SA information has been addressed by many authors, who utilized a wide spectrum of tools ranging from purely statistical ones (Mellor and Ezer 1991; Cooper and Haines 1996) to those employing dynamical constraints of various complexity (Schröter et al. 1993; Cane et al. 1996; Brasseur et al. 1999; Verron et al. 1999; Yaremchuk et al. 2001). Another fundamental issue with altimetry is the imperfect knowledge of the geoid, implying relatively large error bars for the absolute values of sea surface height (SSH) at scales of *O*(100 km). This means that ocean observability by SA data alone is relatively poor, even if the accumulation of data reduces the degree of uncertainty in the model control. To improve the situation, a complementary massive data source on the oceanic interior is required. One possible solution is the automatic profilers to be launched in the forthcoming years by the ARGO Programme (Wilson 2000). These are expected to be suitable for the global monitoring of large-scale ocean currents.

Acoustic tomography (AT), although somewhat more expensive than ARGO drifters, may provide another massive datastream complementary to satellite observations. Owing to the relative paucity of AT observations in the ocean, much less attention has been paid to processing them with tools that involve both statistical and dynamical constraints. Most of the AT data inversions present in the literature are done by 3D statistical methods (Howe et al. 1987; Yuan et al. 1999). Applications of dynamics to the inversion of AT data are rather sparse. As a few examples, Sheinbaum (1995) examined the applicability of the adjoint technique to inverting simulated acoustic travel time data within the linear 2D advection–diffusion equation of a passive tracer in a closed rectangular domain. Cornuelle and Worcester (1996) discussed application of the approximate Kalman filter to AT data inversion within the framework of simple advective constraints. Menemenlis and Wunsch (1997) studied the impact of simulated large-scale AT observations on the dynamics of a linearized general circulation model with a reduced state space. Attempts have been made recently to use a primitive equation model for AT data assimilation in the Mediterranean Basin (Remy and Gaillard 1999). In this study we report on constraining AT–SA data by quasigeostrophic dynamics and investigate the possibility of monitoring the evolution of mesoscale structures by means of a model of intermediate complexity that is controlled by initial and boundary conditions. In what follows we consider two types of acoustic tomography data: observations of travel times between the transceivers, hereafter referred to as TT data, and observations of the difference in travel times between the pairs of transceivers, referred to as “differential tomography” or DT data.

The paper is organized as follows. In section 2 we present a 4D variational data assimilation scheme for simultaneous handling of different types of data. The algorithm is based upon nonlinear quasigeostrophic dynamical constraints formulated as a finite-difference numerical model. The model's fit to the data is controlled by both initial and open lateral boundary conditions that evolve with time. Regularization of the problem is introduced by imposing smoothness on the model fields rather than attracting them to a prior “background state.” This is done to avoid uncertainties associated with the poorly known statistics of the background control fields, especially of those at the open lateral boundaries. In formulating twin data experiments (selecting AT transceivers' positions, altimeter tracks, oceanic stratification, and bottom topography) we tried to be as close as possible to the baseline observational network and oceanic parameters of the 2-month medium-range tomography experiment in the Kuroshio Extension region (Yuan et al. 1999).

Results of twin data experiments are reported in section 3. We investigate the quality of assimilation with respect to the presence of various types of data (TT, DT, and/or SA), their noise levels [which include motions unresolved by quasigeostrophic (QG) dynamics], and geometrical configuration of the AT array. The quality of assimilation is gauged by a set of global indices that are proportional to space–time-averaged differences between the assimilated and reference solutions.

In section 4 we analyze the spatiotemporal structure of the error fields by means of statistical analysis of the solutions. Knowledge of the “true” fields, which is very useful in estimating the quality of assimilation, is one of the advantages of the synthetic (model generated) data that underlie the concept of twin experiments. In a real experiment, however, those true fields are never known. In such a case, error analysis is more complicated and requires computation of the posterior error covariance matrix. An example of such a computation is discussed in section 4. Finally, conclusions of the present work are given in section 5.

## 2. Description of the method

### a. The model

The model is formulated as a system of equations [Eqs. (1)–(6)] for the perturbation of density *ρ* from the background state *ρ*_{b}(*z*) and for the vertical component of the depth-averaged relative vorticity $\bar{\omega}$, together with the relative vorticity *ω*; the streamfunctions *ψ* and $\bar{\psi}$; the vertical velocity *W*; and density. In these equations *N*(*z*) = [−(*g*/*ρ*_{0}) ∂_{z}*ρ*_{b}]^{1/2} is the Brunt–Väisälä frequency; *f* is the Coriolis parameter; *β* is its meridional gradient; *ρ*_{0} is the mean density of seawater; *H*_{0} and *h*(*x*, *y*) denote the mean ocean depth and deviations from that mean, respectively; while *a* = *g*/(*f* *ρ*_{0}) and *b* = −∂_{z}*ρ*_{b}. Also, Δ and *J* are the horizontal Laplacian and Jacobian operators, subscripts *t* and *z* denote differentiation with respect to time and depth, and an overbar stands for the vertical average. The elliptic equation for vertical velocity is solved with boundary conditions in which *F*_{τ} is an unknown error in the observations of wind stress curl divided by *f*, and ∂Ω is the lateral boundary of the 3D domain Ω within which the assimilation is performed.

The model fields are discretized on Arakawa's A grid with homogeneous horizontal stepping and variable spacing in the vertical. Central differences in space and time are used to approximate differential operators. The Jacobian is approximated by the standard Arakawa scheme. The major difference from the previous formulation by Nechaev and Yaremchuk (1994) is in improved numerics [approximation of the nonlinear terms, spectral solvers for Eqs. (3)–(4)].
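As an illustration of the Arakawa scheme mentioned above, the following is a minimal NumPy sketch of the classical nine-point Arakawa (1966) Jacobian. It assumes a doubly periodic grid with a uniform step `d`, which is a simplification: the model described here has open lateral boundaries. The function name `arakawa_jacobian` and the helper `sh` are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def arakawa_jacobian(a, b, d):
    """Nine-point Arakawa discretization of J(a, b) = a_x b_y - a_y b_x.

    a, b : 2D arrays indexed [i, j] ~ (x, y) on a doubly periodic grid
           (periodicity is an assumption of this sketch)
    d    : uniform grid step, identical in x and y
    """
    def sh(f, di, dj):
        # Value of f at grid point (i + di, j + dj) with periodic wrap-around.
        return np.roll(np.roll(f, -di, axis=0), -dj, axis=1)

    # J++ : centered differences of both fields
    j1 = ((sh(a, 1, 0) - sh(a, -1, 0)) * (sh(b, 0, 1) - sh(b, 0, -1))
          - (sh(a, 0, 1) - sh(a, 0, -1)) * (sh(b, 1, 0) - sh(b, -1, 0)))
    # J+x : flux form d/dx(a b_y) - d/dy(a b_x)
    j2 = (sh(a, 1, 0) * (sh(b, 1, 1) - sh(b, 1, -1))
          - sh(a, -1, 0) * (sh(b, -1, 1) - sh(b, -1, -1))
          - sh(a, 0, 1) * (sh(b, 1, 1) - sh(b, -1, 1))
          + sh(a, 0, -1) * (sh(b, 1, -1) - sh(b, -1, -1)))
    # Jx+ : flux form d/dy(b a_x) - d/dx(b a_y)
    j3 = (sh(b, 0, 1) * (sh(a, 1, 1) - sh(a, -1, 1))
          - sh(b, 0, -1) * (sh(a, 1, -1) - sh(a, -1, -1))
          - sh(b, 1, 0) * (sh(a, 1, 1) - sh(a, 1, -1))
          + sh(b, -1, 0) * (sh(a, -1, 1) - sh(a, -1, -1)))
    # Each part carries a factor 1/(4 d^2); the average of three gives 1/(12 d^2).
    return (j1 + j2 + j3) / (12.0 * d * d)
```

The appeal of this form is that, on a periodic grid, the domain sums of *J*, *aJ*, and *bJ* vanish to rounding error, which is the discrete analogue of the conservation properties of the continuous Jacobian.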

The system of equations (1)–(6) requires setting the initial conditions for $\bar{\omega}$ and *ρ* [Eq. (2)], and the lateral boundary conditions for $\bar{\omega}$ [Eq. (1)], *ρ* [Eq. (2)], and *ψ*. These are not independent, since Δ*ρ* = −*ω*_{z}/*a*, while Δ*ψ* = *ω* and *ψ*, *ω* are known on ∂Ω. Therefore, the problem is entirely controlled by the initial (*t* = 0) values of the fields $\bar{\omega}$ and *ρ*, the values of *ψ*, *ω* (or the equivalent values of $\bar{\psi}$, *ψ*_{z} = −*aρ*, $\bar{\omega}$, *ω*_{z}) at the lateral boundary, and the wind forcing error *F*_{τ}.

It should be noted that the system (1)–(9) is equivalent to the standard formulation for the QG dynamics in terms of 3D potential vorticity and streamfunction *ψ.* Our specific choice of variables in formulation of the model was made for two reasons. First, the state vector contains density perturbations instead of the hardly measurable potential vorticity. This requires less-complicated model–data projection operators, resulting in a better conditioning of the inverse formulation of the model. Second, the diagnosed model fields include vertical velocity, which may be important for applications.

The values of physical parameters not controlled by the assimilation scheme were chosen to resemble conditions in the Kuroshio Extension region, where an AT array of five acoustic transceivers was deployed and maintained for 2 months in 1997 (Yuan et al. 1999). The background distributions of the Brunt–Väisälä frequency and the reciprocal sound speed *σ*_{b} for that region are shown in Fig. 1. The area is characterized by relatively smooth bottom topography with *H*_{0} = 5300 m and *h*(*x*, *y*) not exceeding 550 m (Fig. 2). The function *h* has been smoothed with the third-order Shapiro filter to remove spatial variability on scales below two horizontal grid steps. Numerical values for the rest of the model parameters are *f* = 7.7 × 10^{−5} s^{−1}, *β* = 1.93 × 10^{−11} s^{−1} m^{−1}, *ρ*_{0} = 1026.5 kg m^{−3}, and *ν* = 500 m^{2} s^{−1}. The model was configured on a rectangular grid with horizontal spacing of 55 km (Fig. 2) and variable resolution in the vertical. The vertical grid levels were optimized to minimize the condition number of the vertical differentiation operator present in (4). The grid had *n* = 21 points in each of the three spatial dimensions (Figs. 1, 2). In the time direction there are 120 intervals, giving *T* = 60 days of integration with a time step of 0.5 days.
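The quoted parameters can be checked for mutual consistency with a few lines of arithmetic. The sketch below assumes the standard β-plane relations *f* = 2Ω sin *φ* and *β* = 2Ω cos *φ*/*R*; the values of Earth's rotation rate and radius are standard constants that are not given in the text.

```python
import math

OMEGA = 7.2921e-5      # Earth's rotation rate, rad/s (standard value, not from the text)
R_EARTH = 6.371e6      # mean Earth radius, m (standard value, not from the text)

f = 7.7e-5             # Coriolis parameter quoted in the text, 1/s
beta_text = 1.93e-11   # meridional gradient quoted in the text, 1/(s m)

# Latitude implied by f, and the beta that follows from that latitude.
lat = math.asin(f / (2.0 * OMEGA))
beta_implied = 2.0 * OMEGA * math.cos(lat) / R_EARTH

# Time grid: 120 intervals of 0.5 days each.
n_intervals, dt_days = 120, 0.5
T_days = n_intervals * dt_days

print(f"implied latitude: {math.degrees(lat):.1f} deg N")
print(f"implied beta: {beta_implied:.3e}  quoted beta: {beta_text:.3e}")
print(f"integration period: {T_days:.0f} days")
```

Under these assumptions the implied latitude falls near 32°N and the implied *β* agrees with the quoted one to within about 1%, consistent with a β-plane centered on the Kuroshio Extension array.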

### b. Inversion scheme

The assimilation problem is formulated as the minimization of a cost function of the model trajectory **X**(*t*):

$$\mathcal{J}[\mathbf{X}(t)] = \mathcal{J}_d[\mathbf{X}(t)] + \mathcal{J}_r[\mathbf{X}(t)].$$

The ocean state **X** at time *t* is represented in the model by all the gridpoint values of the fields $\bar{\omega}$ and *ρ*. Given the grid dimensions, the state vector dimension is 9181. Here $\mathcal{J}_d$ and $\mathcal{J}_r$ denote the terms corresponding to the real and bogus (Thacker 1988) data, respectively. In the reported experiments “real” data come from three different sources: satellite altimetry measuring SSH anomalies *ζ*^{*}; acoustic tomography measuring travel time perturbations *δτ*_{ρ} along the eigenray paths *γ*_{k}; and differential acoustic tomography, measuring the differences of travel times in opposite directions *δτ*_{ψ} for a given eigenray path. Numerically the ray paths were computed by solving eigenray problems (Dushaw and Colosi 1998) for each pair of transceivers under the assumption of horizontal homogeneity of the background sound slowness distribution *σ*_{b}(*z*) (Fig. 1).

The term $\mathcal{J}_d$ has the form (10), in which *T* is the time of model integration and the linear operators $\mathcal{P}_{\zeta}$, $\mathcal{P}_{\rho}$, and $\mathcal{P}_{\psi}$ map the model fields of SSH anomalies, *ρ*, and *ψ* onto the measurement points *q* along satellite tracks and onto the tomographic data, respectively. Operator $\mathcal{P}^{q}_{\zeta}$ locally interpolates *ψ* at *z* = 0 onto the altimeter measurement points. Tomographic operators $\mathcal{P}_{\rho}$ and $\mathcal{P}_{\psi}$ perform the same kind of local interpolation as well as integration along the ray paths. Let us consider the form of these operators in more detail.

In terms of the sound slowness *σ*, the acoustic signal travel time is expressed via the integral over the unperturbed eigenray path *γ*_{k} [Eq. (11)], where *ds* is the element of a ray path. In order to relate the model state vector to *σ*, we have utilized the empirical equations relating density (Brydon et al. 1999) and the inverse speed of sound to potential temperature *θ*, salinity *S*, and pressure *p*: *ρ* = *ρ*(*θ*, *S*, *p*) and *σ* = *σ*(*θ*, *S*, *p*). These were supplemented by a *θ*/*S* relationship in which *δS* is proportional to *δθ* with a coefficient depending on (**x**, *z*), relating temperature and salinity perturbations. The values of this coefficient function range up to 0.086 psu (°C)^{−1} within Ω. The above assumptions enabled us to derive the relationship between *σ* and *ρ*, with a coefficient that is a function of (**x**, *z*).

The coefficient function of (**x**, *z*) in this relationship varied from 3.2 × 10^{−6} m^{4} kg^{−1} s^{−1} in the upper layers of the northern part of the domain to 15.7 × 10^{−6} m^{4} kg^{−1} s^{−1} in the intermediate layers. The integral (11) was evaluated numerically by local trilinear interpolation of the *ρ* values from the vertices of the grid cell enveloping the ray element *ds* onto the ends of that element. A second-order integration scheme along the ray path was used. The grid representation of $\mathcal{P}^{k}_{\rho}$ is an *n*^{2}(*n* − 1)-dimensional vector, whose elements *x*^{k}_{p} satisfy Σ_{Ω} *x*^{k}_{p}(*ρ*)_{p} = *δτ*^{k}_{ρ}, with the index *p* = 1, … , *n*^{2}(*n* − 1) running over the *ρ* grid.
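The numerical evaluation described above (trilinear interpolation onto the ends of each ray element, followed by second-order integration along the path) can be sketched as follows. This is an illustrative NumPy sketch and not the authors' implementation: the function names `trilinear` and `path_integral`, the straight test ray, and the test field are all hypothetical, and the coefficient relating *ρ* to the slowness perturbation is omitted.

```python
import numpy as np

def trilinear(field, axes, pts):
    """Trilinear interpolation of field[ix, iy, iz] at points pts of shape (m, 3).

    axes is a tuple of three increasing 1D coordinate arrays (x, y, z);
    the vertical axis may be non-uniform.
    """
    idx, w = [], []
    for d in range(3):
        ax = axes[d]
        # Index of the lower vertex of the enclosing cell along dimension d.
        i = np.clip(np.searchsorted(ax, pts[:, d]) - 1, 0, len(ax) - 2)
        idx.append(i)
        w.append((pts[:, d] - ax[i]) / (ax[i + 1] - ax[i]))
    out = np.zeros(len(pts))
    # Accumulate contributions from the eight vertices of each cell.
    for cx in (0, 1):
        for cy in (0, 1):
            for cz in (0, 1):
                wx = w[0] if cx else 1.0 - w[0]
                wy = w[1] if cy else 1.0 - w[1]
                wz = w[2] if cz else 1.0 - w[2]
                out += wx * wy * wz * field[idx[0] + cx, idx[1] + cy, idx[2] + cz]
    return out

def path_integral(field, axes, ray):
    """Trapezoidal (second-order) integral of the interpolated field along a
    ray given as an (m, 3) array of consecutive points."""
    values = trilinear(field, axes, ray)
    ds = np.linalg.norm(np.diff(ray, axis=0), axis=1)   # element lengths
    return np.sum(0.5 * (values[:-1] + values[1:]) * ds)

# Hypothetical test case: 21 x 21 x 21 grid, 55-km spacing, stretched vertical axis.
x = np.linspace(0.0, 1.1e6, 21)
y = np.linspace(0.0, 1.1e6, 21)
z = 5300.0 * np.linspace(0.0, 1.0, 21) ** 2
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
field = 1.0 + 2.0e-6 * X - 1.0e-4 * Z                   # linear test field

start = np.array([1.0e5, 2.0e5, 800.0])
end = np.array([9.0e5, 6.0e5, 1300.0])
ray = start + np.linspace(0.0, 1.0, 40)[:, None] * (end - start)
integral = path_integral(field, (x, y, z), ray)
```

Because each row of the operator is a fixed set of interpolation and quadrature weights, the same loop could be used to assemble the vector of elements *x*^{k}_{p} once per ray rather than re-evaluating the integral at every iteration.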

The differential travel time *δτ*_{ψ} is expressed in terms of the horizontal velocity field **u** = **k** × ∇*ψ* (e.g., Worcester et al. 1991) as an integral along the ray path [Eq. (12)], where **s** is the unit vector tangent to the path. Being also approximate, this formula holds to the precision of differential tomography measurements under the assumption that rays do not deviate too far from the paths defined by the background distribution *σ*_{b}(*z*). Evaluating the integral (12) numerically in the same manner as (11), we obtained the representation of $\mathcal{P}^{k}_{\psi}$ on the *ψ* grid, which is staggered in the vertical with respect to the *ρ* grid.
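For orientation, the standard linearized forms of the two tomographic observables are recalled here. This is a sketch of the textbook reciprocal-transmission relations and not necessarily the exact normalization or sign convention of Eqs. (11) and (12). The symbol *δσ* for the slowness perturbation, the travel times *t*^{±} in the two opposite directions, and the expansion 1/(*c* + **u**·**s**) ≈ *σ* − *σ*^{2} **u**·**s** are assumptions of this sketch.

$$
t^{\pm} \simeq \int_{\gamma_k} \left[\sigma \mp \sigma_b^{2}\,(\mathbf{u}\cdot\mathbf{s})\right] ds,
\qquad
\delta\tau^{k}_{\rho} \simeq \int_{\gamma_k} \delta\sigma \, ds,
\qquad
t^{+} - t^{-} \simeq -2\int_{\gamma_k} \sigma_b^{2}\,(\mathbf{u}\cdot\mathbf{s})\, ds,
\qquad
\mathbf{u} = \mathbf{k}\times\nabla\psi .
$$

In this form the sum of reciprocal travel times isolates the slowness (density) signal while their difference isolates the along-ray current, which is why the two data types constrain different model fields.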

The regularization term $\mathcal{J}_r$ [Eq. (13)] penalizes the amplitude of the wind stress curl error and enforces smoothness in the model fields by penalizing horizontal grid-scale components of the solution.

Numerically the assimilation problem is controlled by the open boundary values of *ψ*(∂Ω, *t*) and *ρ*(∂Ω, *t*) at two external gridpoint layers, the initial conditions *ρ*(Ω, 0) and $\bar{\omega}$(Ω, 0), and the wind forcing error *F*_{τ}(**x**, *t*). The total number of grid points occupied by the control fields is *D* = 435 902 (the dimension of the control space). The major part of *D* is given by the subspace of the lateral boundary conditions on *ψ* and *ρ*, which occupy 383 040 grid points. Wind forcing error control and initial conditions contribute 43 681 and 9181 dimensions, respectively.
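The quoted dimensions can be cross-checked with simple bookkeeping. The decomposition used below is an inference that happens to reproduce the quoted numbers and is not stated in this form in the text: the *ρ* grid is taken to hold *n*^{2}(*n* − 1) points, and the depth-averaged vorticity and the wind forcing error are assumed to occupy the (*n* − 2)^{2} interior horizontal points. The boundary-condition count is taken from the text as given.

```python
n = 21          # grid points per spatial dimension (from the text)
n_t = 121       # time levels: 120 intervals plus the initial one

rho_points = n * n * (n - 1)            # rho grid, n^2 (n - 1)
omega_bar_points = (n - 2) ** 2         # ASSUMPTION: interior 2D points only
state_dim = rho_points + omega_bar_points

# ASSUMPTION: the forcing error lives on the same 2D points at every time level.
forcing_dim = omega_bar_points * n_t

boundary_dim = 383_040                  # quoted in the text, not re-derived here
D = boundary_dim + forcing_dim + state_dim

print(state_dim, forcing_dim, D)
```

Under these assumptions the three contributions sum exactly to the quoted control-space dimension, with the lateral boundary values accounting for roughly 88% of it.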

The weighting functions present in (10) and (13) have the sense of inverse variances for the real and “bogus” data. The inverse values of the weights *W* in (13) were estimated as horizontal mean square variations of the corresponding bogus data fields Δ*ω*, Δ*ψ*, Δ*ρ*, and Δ*W* and depended only on *z*. The bogus data fields were picked from the first-guess model trajectory described in section 3a. The prior variance of the SA data, *W*^{−1}_{ζ}, was set in correspondence with the typical error level for SSH anomalies of the TOPEX/Poseidon altimeter (Tapley et al. 1994). The variance of the wind stress curl error was taken to be three times smaller than the spatiotemporal variation of the first-guess field of curl **τ**.

Variances of the AT measurements were assumed to consist of two parts. The first part is due to instrumentation errors, which were set to 5 ms for the TT and DT measurements. The second constituent of the error is associated with the influence of internal waves and other processes unresolved by the model physics. A simple statistical model was used to estimate that influence. We assumed that unresolved processes contributed to density and velocity perturbations as white noise whose amplitude was four times smaller than the rms horizontal variation of these fields in the first-guess solution at a given level. Travel time variations were then computed using the linear expressions for the operators $\mathcal{P}_{\rho}$ and $\mathcal{P}_{\psi}$. It is noteworthy that these error contributions to travel times were an order of magnitude smaller than the corresponding noise level ɛ because of the averaging nature of AT observations.

The number of real data points *N*_{d} was varied in the experiments reported below. Acoustic data were always measured at each of the 121 time steps, whereas the total number *K* of rays *γ*_{k} was changed. In the maximum configuration we monitored *K* = 250 acoustic rays, giving 121 × 250 = 30 250 observations for TT and DT datasets. The number of SSH data points along satellite tracks shown in Fig. 2 was 1139. This gives 61 639 real observations in total.

Bogus data regularize the problem, providing additional prior information on the structure of the control fields and the model trajectory. The total number of bogus “observations” is equal to the number of the grid points where squared quantities in (13) are evaluated. This number is approximately 52 times larger than *N*_{d}, increasing the total amount of observations to approximately 3.3 million, or 7.6 data points per degree of freedom of the model. An alternative way to regularize the problem is to attract the control fields or the model trajectory to a background state by explicitly specifying the latter in the prior probability distribution. In that case the condition number of the assimilation problem is under control since it heavily depends on the covariance of the background state. On the other hand, both the background state and its statistics are rather poorly known, reducing the reliability of the cost function formulation. Therefore, we chose to regularize the problem by smoothness constraints, which, in our opinion, are more reliable statistically: any numerical model has large errors on spatial scales close to grid spacing; therefore, it is natural to treat grid-scale components of the model fields as *δ*-correlated noise.
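The observation counts quoted above follow from elementary arithmetic, sketched below. The factor of 52 is the rounded ratio given in the text, so the derived totals are approximate; all other numbers are taken directly from the text.

```python
n_t = 121                 # time levels at which acoustic data are sampled
K = 250                   # rays in the maximum configuration

tt_obs = n_t * K          # travel time (TT) observations
dt_obs = n_t * K          # differential travel time (DT) observations
ssh_obs = 1139            # SSH points along the satellite tracks
N_d = tt_obs + dt_obs + ssh_obs

bogus_obs = 52 * N_d      # "approximately 52 times larger" (rounded factor)
total_obs = N_d + bogus_obs

D = 435_902               # control-space dimension quoted in the text
obs_per_dof = total_obs / D

print(N_d, total_obs, round(obs_per_dof, 2))
```

With the rounded factor the total comes to about 3.3 million observations and roughly 7.5 observations per degree of freedom, close to the quoted 7.6; the small difference is consistent with the factor of 52 being approximate.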

The metric in the control space is specified by a *D* × *D* matrix, whose first 9181 diagonal elements are the horizontally averaged squared values of the cost function gradient with respect to *ρ*|_{t=0} and $\bar{\omega}$|_{t=0}, which are followed by 43 681 elements representing the mean squared value of the gradient with respect to *F*_{τ}, etc. With this metric specification the gradient norm can be defined.

To gauge the convergence properties of the assimilation scheme and optimize the amplitude of regularization terms we conducted a series of preliminary experiments varying the relative weights of regularization terms and the type of regularization. Figure 3 shows evolution of the cost function value and the norm of the gradient with iteration number in a typical assimilation experiment with squared Laplacian regularization. The gradient norm is reduced to machine precision (10^{−6}) in approximately 2500 iterations, indicating that the condition number of the problem is about 6 × 10^{6}. This is consistent with an estimate of the condition number for the biharmonic operator associated with regularization terms. Experiments with regularization terms penalizing the squared gradients of the fields have shown much faster convergence but worse performance of the assimilation scheme, primarily due to oversmoothing at the horizontal scales of three to five grid steps. Therefore, we chose to penalize the Laplacians since they provide a sharper cutoff in the horizontal wavenumber spectrum.
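The statement that Laplacian penalties give a sharper spectral cutoff than gradient penalties can be illustrated with the Fourier symbol of the discrete operators. The sketch below is a one-dimensional idealization on a uniform grid with unit step and is not the authors' diagnostic. For a harmonic of wavenumber *k*, the squared first difference scales as 2 − 2 cos *k* and the squared second difference as its square.

```python
import numpy as np

def second_difference_symbol(wavelength_in_steps):
    """Magnitude of the eigenvalue of the 1D second-difference operator
    (unit grid step) for a harmonic of the given wavelength in grid steps."""
    k = 2.0 * np.pi / np.asarray(wavelength_in_steps, dtype=float)
    return 2.0 - 2.0 * np.cos(k)

wavelengths = np.array([2.0, 3.0, 4.0, 5.0, 10.0])   # in grid steps
lam = second_difference_symbol(wavelengths)

# Penalties normalized by their value at the grid scale (2-step wave).
grad_penalty = lam / lam[0]           # squared-gradient regularization
lap_penalty = (lam / lam[0]) ** 2     # squared-Laplacian regularization

for w, g, l in zip(wavelengths, grad_penalty, lap_penalty):
    print(f"wavelength {w:4.0f} steps: gradient {g:.3f}, Laplacian {l:.3f}")
```

Relative to the grid scale, the Laplacian penalty falls off as the square of the gradient penalty, so structures spanning three to five grid steps are damped much less for the same suppression of two-step noise. The price is a larger spread of eigenvalues and therefore slower convergence.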

Experiments with varying the relative weights of regularization terms have shown that the best performance is achieved when, initially, $\mathcal{J}_d \sim 1000\,\mathcal{J}_r$. Typical evolution of the cost function constituents for that case is shown in Fig. 4. The relative contribution of the real data terms proportional to *W*_{ζ}, *W*^{ρ}_{tom}, and *W*^{ψ}_{tom} …

On the whole the assimilation scheme has demonstrated a reasonable performance despite a fairly large dimension of the control space. After this preliminary study we conducted a set of twin data experiments with a goal to investigate the roles of different data types and observational properties of the AT measurements.

## 3. Twin data experiments

The concept of twin data experiments allows us to study interpolation properties of any data assimilation scheme by means of numerical computations. The model trajectory is reconstructed using certain amounts of data picked from that trajectory and then contaminated by noise. In this paper we report the results of more than 150 experiments focused on the ability of the assimilation scheme to retrieve oceanic state from a synthesis of altimetric and tomographic measurements.

### a. The reference and the first-guess solutions

To mimic nature, we generated a reference model solution by specifying the control vector as follows. Initial conditions were generated by adding a number of mesoscale disturbances to the climatological values of *ρ* (Fig. 2) taken from the World Ocean Circulation Experiment climatology. The mesoscale disturbances *ρ*_{meso} simulate the ones obtained from a 3D inversion of the real travel times observed on 20 July 1997 at the transceivers deployed in the Kuroshio Extension region (Yuan et al. 1999). The reference initial conditions for density *ρ*_{ref} were obtained as the sum of the climatology and the mesoscale disturbances. The typical horizontal scale of the disturbances was close to 150 km (Fig. 5), while in the vertical their amplitude decayed by a factor of 10 at 1500 m. After specifying the reference initial conditions for density we defined the barotropic component of the flow $\bar{\omega}$_{ref} at *t* = 0 by assuming no motion at *z* = 2000 m. To produce the reference ocean trajectory **X**(*t*) the model was integrated for 2 months with a version of free radiation conditions (e.g., Orlanski 1976) at the open boundaries. The wind stress forcing curl **τ** mimicked several atmospheric cyclonic and anticyclonic disturbances passing over the basin in the northeast (NE) direction. Their typical size was *O*(1000 km) with an amplitude of |**τ**| ∼ 0.2–0.5 dyn cm^{−2}. Evolution of the density *ρ*_{ref} and streamfunction *ψ*_{ref} of the true ocean state is shown in Fig. 5.

The synthetic data were extracted from the reference trajectory and contaminated by noise. Here *K* is the number of rays, which were kept fixed during the integration period; tildes denote random fields with amplitudes equal to those of *ψ*, *ρ*, and *ζ*; while ɛ have the sense of noise levels of the corresponding data. These data were then used to reconstruct the reference state.

Since the model is nonlinear, the result of assimilation may depend on the first-guess solution, which initiates the optimization procedure. In the experiments we used two types of the first-guess solution. Both of them are defined by specification of the control vector, that is, the set of the nodal values of the functions specifying initial, lateral, and surface boundary conditions of the model. The surface boundary condition in both cases was the same as for the reference solution (*F*_{τ} = 0).

For the first-guess solutions, lateral boundary values of *ψ* and *ω* were functions of *ρ*|_{t=0}, $\bar{\omega}$|_{t=0}, and *F*_{τ} because they were defined by the same free radiation scheme as for the reference state. This scheme can be viewed as a kind of extrapolation of the field values from the interior to the boundaries of the domain. Therefore, all the difference between the first-guess model trajectories was due to the difference in the initial value of the state vector and, as a consequence, in the values of *ψ* and *ω* at the open lateral boundaries.

The first type of initial condition that we used was the pure climatological state underlying the reference solution. Such a state is always available as an approximation of nature. In reality, however, a researcher always has data that can often provide a more realistic zeroth approximation. To mimic this situation, we used the “tomography data” *δτ*^{k}_{ρ} and *δτ*^{k}_{ψ} at *t* = 0 to improve the first guess of the initial fields *ρ*|_{t=0}, $\bar{\omega}$|_{t=0}. In the cost function (10), (13), we took into account only the terms corresponding to the time layer *t* = 0 and found the optimal fields *ρ*|_{t=0}, $\bar{\omega}$|_{t=0} by minimizing such a truncated cost function, which is not constrained by dynamics. These initial conditions “improved by data” were subsequently used to generate the data-improved first-guess model trajectory.

At this point it is necessary to mention that in all of the assimilation runs we did not use the free radiation scheme for estimating the fields' values at the lateral boundaries but instead treated the boundary values of the fields as unknown free model parameters (383 040 in number), whose values were retrieved from the data. The radiation boundary conditions were used only for the production of the reference state and estimation of the first-guess values of the control fields at the lateral boundaries.

The quality of assimilation *e* is defined in terms of the departure of the assimilated model trajectory from the reference one [Eq. (15)], where *ϕ* is a state vector component of the model (*ψ* or *ρ*), while overbar and angular brackets stand for the vertical and horizontal means, respectively. The above-described first-guess solutions were characterized by the errors *e*^{clim}_{ψ}, *e*^{clim}_{ρ}, *e*^{data}_{ψ}, and *e*^{data}_{ρ} … *in the time average* owing to the absence of the control of the model trajectory via boundary conditions. Inevitable errors in the approximation of the initial state are especially large at the boundaries that are not affected by the AT data in the 3D case. These errors tend to amplify when the model is integrated with free radiation conditions, resulting in a strong deviation of the model trajectory from the reference one. Data assimilation provides boundary conditions with information from the interior of the domain and drives the model trajectory closer to the reference solution.
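A global index of this kind can be sketched as a normalized rms departure from the reference trajectory. The exact form of Eq. (15) is not reproduced here: the normalization by the variance of the reference field, the equal weighting of all grid points (a variable vertical spacing would call for weights), and the function name `error_index` are assumptions of this sketch.

```python
import numpy as np

def error_index(assimilated, reference):
    """Normalized rms departure of `assimilated` from `reference`.

    Both arrays are indexed [t, z, y, x]. The means over all axes stand in
    for the time, vertical, and horizontal averages; the normalization by
    the reference variance is an ASSUMPTION, not the paper's Eq. (15).
    """
    departure = np.mean((assimilated - reference) ** 2)
    variance = np.mean((reference - reference.mean()) ** 2)
    return np.sqrt(departure / variance)

# Hypothetical usage with a synthetic "reference" trajectory.
rng = np.random.default_rng(1)
reference = rng.standard_normal((5, 4, 6, 6))
perfect = error_index(reference, reference)
trivial = error_index(np.full_like(reference, reference.mean()), reference)
```

With this normalization a perfect reconstruction scores zero, and a featureless estimate equal to the reference mean scores one, which makes indices for *ψ* and *ρ* directly comparable.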

### b. Sensitivity to the ray structure

In this section we describe the results of twin data experiments with TT and DT acoustic data. One of the major parameters of an AT observation system is the structure of monitored rays. The ray paths are usually assumed to be given through the analysis of acoustic properties of the background state of an ocean region under consideration. A difficult task is to relate the travel time data to the ray paths of the background state. We shall not consider the problem of ray identification but shall assume that the correspondence between the ray paths and travel times is established.

Each pair of transceivers can be characterized by a limited number of rays, *κ* ∼ 10–15, whose travel times are observed. These rays are characterized by the launch angles (LAs) between the horizontal plane and the unit vector tangent to the ray at the transceiver's location. Usually rays with larger LAs sample a greater thickness of the ocean, but this is not a rule, since ray trajectories depend on the transceiver's position with respect to the main oceanic sound channel, which is located at the depth of maximum *σ*_{b}. The typical ray paths corresponding to the pair of transceivers located at 30.75°N (Fig. 5) are shown in Fig. 6.

As can be seen, the largest sampling depth is achieved when transceivers are located near the sound channel axis (1050 m). Shallower or deeper positioning of the transceivers also results in lower vertical resolution in the depth range 500–2000 m around the main sound channel, because ray amplitudes are limited from below by the difference in depth between the channel's axis and the transceiver. In that case even the rays with the smallest launch angles sample a relatively thick depth interval of 500–2000 m, compared to 800–1300 m for the case when transceivers are positioned at the axis of the sound channel.

The operators $\mathcal{P}_{\rho}$ and $\mathcal{P}_{\psi}$ have nonzero components at all the grid points within the sampling depth range on the vertical plane connecting the pair of transceivers. We shall characterize observational properties of a ray system, {$\mathcal{P}^{k}_{\rho}$, *k* = 1, … , *K*}, by the corresponding Gram matrix *G*, computed on the grid of the initial density field *ρ*|_{t=0}.

The system would have been perfect if the operators were mutually orthogonal (for instance, each ray would have sampled the ocean at a prescribed depth range that does not intersect with the depth ranges of other rays). A “bad” system is composed of “parallel” operators sampling the ocean along the ray paths that are very close to each other. The corresponding Gram matrices will have different spectral properties: for an “ideal” system all eigenvalues are unity, while for the worst conditioned system the eigenvalues are zero except the largest one *g*_{max}. To characterize the amount of “independent measurements” acquired by an AT system, we introduce the parameter *M* = Tr*G*/*g*_{max}. “Observational efficiency” can be characterized by the ratio *η* = (*M* − 1)/(*K* − 1), which gauges the mean contribution of a ray to the total amount of information provided by the system of *K* rays. For an ideal system, *η* is unity, and it is zero for the worst conditioned one. Of course, the above-mentioned criterion of independence describes the property of observational settings alone and does not take into account the dynamical value of the observations. Nevertheless, we assume that these “static” parameters can still give us some idea of the properties of the ray system in the dynamically constrained 4D case.
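The two diagnostics *M* and *η* are straightforward to compute once the ray operators are available as rows of a matrix. The sketch below assumes a plain Euclidean inner product and rows normalized to unit length, so that an ideal system has unit eigenvalues; the Gram matrix used in the paper may include an additional weighting, and the function name `ray_system_efficiency` is hypothetical.

```python
import numpy as np

def ray_system_efficiency(P):
    """Return (M, eta) for a ray system.

    P : (K, n) array whose k-th row is the grid representation of the
        k-th ray operator. Rows are normalized to unit length and a plain
        Euclidean inner product is used (both are ASSUMPTIONS of this sketch).
    """
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    G = P @ P.T                            # K x K Gram matrix
    g_max = np.linalg.eigvalsh(G)[-1]      # largest eigenvalue (ascending order)
    M = np.trace(G) / g_max                # amount of "independent measurements"
    eta = (M - 1.0) / (P.shape[0] - 1.0)   # observational efficiency
    return M, eta

# Two limiting cases with K = 4 hypothetical rays on a 10-point grid.
M_ideal, eta_ideal = ray_system_efficiency(np.eye(4, 10))        # orthogonal rays
M_worst, eta_worst = ray_system_efficiency(np.ones((4, 10)))     # parallel rays
```

The two limiting cases reproduce the behaviour described in the text: mutually orthogonal operators give *M* = *K* and *η* = 1, whereas identical ("parallel") operators give *M* = 1 and *η* = 0.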

In the experiments reported below we have varied the ray systems and checked the quality of assimilation (15) as a function of *M* and *η.* Rays interacting with the ocean surface and bottom were excluded from consideration, because in reality it may be hard to identify them from the travel time data. Horizontal positions of the transceivers were fixed throughout all the experiments.

The first ray system considered takes into account all of the eigenrays extracted from the background profile of *σ*_{b}(*z*). Transceivers for that system were located at the axis of the main sound channel (*h* = 1050 m). We identified *K* = 250 rays. For the second system we took only 96 rays by eliminating the rays with small launch angles *α* < 4°. The third system had 95 rays, but they were evenly distributed with *α.* The fourth ray system was also characterized by even distribution of rays in *α,* but the total number of rays was much smaller (*K* = 40). Three additional experiments were made by varying the vertical position of the transceivers while keeping the homogeneous ray distribution with *α.* Results of these experiments are assembled in Table 1.

Although the maximum assimilation quality is observed for AT with *K* = 250 (first row), this case should not be considered as realistic, because in real-life experiments the number of monitored rays per station pair usually varies between 10 and 15 (Yuan et al. 1999). The maximum observational efficiency *η* is achieved, on the other hand, at *K* = 40, but in that case the quality of assimilation is reduced considerably because of the smaller absolute amount of independent measurements *M* compared to the cases with the larger number of rays. It is also noteworthy that removal of the rays with small launch angles, which sample a limited depth range around the sound channel, does not increase the efficiency of the system and produces larger assimilation errors (cf. lines 2 and 3 in Table 1).

Experiments with varying the depths of the transceivers show that for a given number of rays (95) the optimum efficiency and, correspondingly, the largest error reduction is achieved when *h* coincides with the depth of the sound channel (cf. line 3 and lines 5–7). When the transceivers are displaced from the axis of the sound channel, ray paths exhibit more similarity to each other and the observational performance of the ray system decreases. Another feature of the ray systems in lines 5 and 7 of Table 1 is the stronger clustering of the ray amplitudes and launch angles, which makes it difficult to separate the rays in practice. Only a small fraction of these rays penetrate into the layers above 200 m and below 2500 m. For the case *h* = 1050 m the number of such “large-amplitude” rays is almost two times larger than for *h* = 2100 m and *h* = 500 m. Besides, for *h* = 1050 m, rays with small launch angles probe the ocean state at a narrower depth range [Fig. 6 (top)], also contributing to the better conditioning of the ray system.

We conclude that within the framework of the proposed assimilation algorithm the ray system can be improved by positioning the transceivers in the main sound channel and taking into account the rays with the smallest possible launch angles. It should be noted, however, that practical utilization of these rays may prove challenging because the corresponding acoustic signals are usually more difficult to identify.

### c. Noise sensitivity

One can expect that AT observations should not be sensitive to high spatial frequency noise because of their integrating nature. To quantify this idea we have conducted a series of experiments with various values of ɛ_{ψ}, ɛ_{ρ}, and ɛ_{ζ}. The velocity and density fields extracted from the reference state trajectory were contaminated by white noise whose amplitude was proportional to the rms variation of the corresponding field, with proportionality coefficient ɛ. The value ɛ = 0.25 is assumed to be typical for oceanic conditions: depending on depth, it corresponds to noise amplitudes of 5–15 cm s^{−1} and 0.01–0.3 kg m^{−3} in the fields of velocity and density, respectively. For the measurements of SSH anomalies we took ɛ_{ζ} = 0.05 (2.5 cm) in correspondence with the accuracy of the TOPEX/Poseidon altimeter (Tapley et al. 1994).
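The contamination procedure described above can be sketched as follows. This is a minimal illustration, not the code used in the experiments: it assumes Gaussian white noise, takes the "rms variation" to be the standard deviation of the field about its mean, and uses a hypothetical analytic field on a 64 × 64 grid; the function name `add_white_noise` is invented for the example.

```python
import numpy as np

def add_white_noise(field, eps, rng):
    """Return field plus white noise of amplitude eps * (rms variation of field).

    Assumption: 'rms variation' is taken as the standard deviation of the
    field about its own mean; the noise is Gaussian and spatially uncorrelated.
    """
    rms_variation = np.sqrt(np.mean((field - field.mean()) ** 2))
    noise = rng.standard_normal(field.shape)
    return field + eps * rms_variation * noise

rng = np.random.default_rng(0)

# Hypothetical smooth field standing in for a density or velocity level.
x = np.linspace(0.0, 2.0 * np.pi, 64)
field = np.sin(x)[:, None] * np.cos(x)[None, :]

noisy = add_white_noise(field, eps=0.25, rng=rng)

# Ratio of noise rms to signal rms; close to eps by construction.
ratio = np.std(noisy - field) / np.std(field)
print(f"noise-to-signal rms ratio: {ratio:.3f}")
```

In a layered model the normalization would presumably be applied level by level, which is consistent with the depth-dependent noise amplitudes quoted above.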

The experiments were conducted for all the ray structures outlined in the previous section and for five values of the noise level. The ray patterns corresponding to *h* = 1050 m proved to be the most efficient in the presence of noise as well. Table 2 shows the results for the first and third ray patterns. As can be seen, the quality of assimilation is weakly sensitive to the noise level: *e*^{*}_{ρ}_{ρ} > 0.25. Even when ɛ_{ρ} = 1 the quality of assimilation decreases by only 20%.

Such an extremely low sensitivity is due to two factors. First, as noted earlier, tomographic data are weakly affected by noise because contributions of small-scale fluctuations of the inverse speed of sound to the travel time tend to cancel along the ray paths. The second reason is the smoothing property of the assimilation scheme, which heavily penalizes grid-scale structures in the solution.
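The first of these factors can be illustrated with a toy calculation. The sketch below is an assumption-laden simplification: each ray is replaced by a sum over independent, equally weighted path segments carrying unit-variance white noise, and the numbers of segments and trials are hypothetical. Under those assumptions the noise in the path-averaged quantity falls off as the inverse square root of the number of segments.

```python
import numpy as np

rng = np.random.default_rng(2)

n_trials = 2000     # hypothetical number of independent noise realizations
n_segments = 400    # hypothetical number of uncorrelated segments along a ray

# Unit-variance white-noise perturbation of the inverse sound speed per segment.
noise = rng.standard_normal((n_trials, n_segments))

# Path-averaged perturbation "seen" by a travel-time measurement.
path_mean = noise.mean(axis=1)

reduction = path_mean.std()              # empirical rms of the path average
expected = 1.0 / np.sqrt(n_segments)     # theoretical value for white noise
print(f"empirical {reduction:.4f}, expected {expected:.4f}")
```

Real ray paths weight the segments unevenly and oceanic noise is not perfectly white, so the actual reduction will differ; the sketch only shows the direction and rough size of the effect.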

### d. The role of differential tomography and altimetry

Differential tomography provides information on the along-ray velocity component averaged over the ray path. In the framework of quasigeostrophic constraints velocity (DT) data are somewhat complementary to density (TT) observations, since the velocity is connected to density via the geostrophic relationships.

Experiments on separate assimilation of the DT and/or TT data have shown that DT observations provide less information on the ocean state than do TT data (cf. columns 6 and 8 in Table 1). We assume that two reasons may be responsible for that. The first one is the result of a lower signal-to-noise ratio attributed to DT and quantitatively expressed by the relatively lower inverse variance *W*^{ψ}_{tom}*ρ,* *ω*

However, DT data, when assimilated jointly with the TT component, provide a large contribution to the quality of assimilation (cf. columns 4–5 and 7, Table 3). Additional experiments have shown that this contribution becomes more significant when the ray system does not include high-amplitude rays, which probe the entire depth of the water column. At the same time it was observed that the cost function gradients with respect to the barotropic control variables *ω**ψ**W*^{ψ}_{tom}*W*^{ρ}_{tom}

Altimetric data provide information with much lower temporal resolution than TT/DT observations, which typically sample the ocean once every 10–15 min. In our experiments we simulated the TOPEX/Poseidon satellite tracks over the experimental domain (Fig. 2). The SSH anomaly was measured along each of these tracks, which had a repeat cycle of 9.9 days. Consequently, our observation system acquired SSH data along a track once every 20 h on average. The sampling interval along a track was set to 30 km. The number of SSH observations collected from six cycles spanning the “measurement period” of 2 months totalled 1139.
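As a rough consistency check on these figures, the following back-of-the-envelope sketch relates the repeat cycle, the mean revisit interval, and the total number of SSH observations. The number of track passes per cycle and the observations per pass are inferred here from the quoted values, not taken from the text, and should be read as approximate.

```python
# Values quoted in the text.
repeat_cycle_days = 9.9
mean_interval_hours = 20.0
n_cycles = 6
n_ssh_obs = 1139

# Inferred quantities (assumptions of this sketch, not stated in the source).
passes_per_cycle = repeat_cycle_days * 24.0 / mean_interval_hours
total_passes = n_cycles * round(passes_per_cycle)
obs_per_pass = n_ssh_obs / total_passes
period_days = n_cycles * repeat_cycle_days

print(f"passes per cycle  ~ {passes_per_cycle:.1f}")
print(f"obs per pass      ~ {obs_per_pass:.1f}")
print(f"measurement period ~ {period_days:.1f} days")
```

Six cycles of 9.9 days give about 59 days, in line with the 2-month measurement period.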

Experiments have shown that when assimilated alone, SSH measurements provide approximately equal error reductions in the baroclinic and barotropic components of the ocean state. When combined with tomography data, the SSH signal gives a larger contribution to the improvement of the density field. This indicates that SSH observations contain information on both the barotropic and baroclinic modes. As seen from a comparison of columns 4–8 and 7–9 in Table 3, SSH data provide an important contribution to tomography despite their much lower temporal resolution. To investigate the impact of higher spatiotemporal resolution of satellite altimetry, we conducted a number of additional experiments. In the limiting case of SSH data available at all the grid points, *e** increased to 0.67.

Figure 7 gives a comparison of the vertical structures of the error fields characterizing assimilations of AT data with and without altimetry (columns 4, 7–9 in Table 3). It can be seen that SSH data provide major improvement in the uppermost layer of the ocean, which is poorly sampled by the acoustic rays. Minor improvements are also seen in the deep layers, indicating better coverage of the barotropic mode in the presence of SA data. Figure 7 also shows that relative errors are much larger in the deeper layers and their reduction is not so large as in the upper 1000 m of the water column. That can be partly explained by much lower spatial variability of the fields in the deep ocean and their weak dependence on the processes occurring above. Another possible explanation is in the difference in timescales for the upper and lower layers. Effective determination of the 3D ocean structure from the AT data (which are integral by nature) is possible because of the use of data from different times in combination with dynamical constraints that are able to redistribute information in space–time. However, if time evolution is slow compared to the assimilation period, as one would expect for baroclinic modes in deep layers, the system cannot properly determine the 3D ocean structure at those depths, whereas resolution is much better where the ocean changes more rapidly.

## 4. Error analysis

### a. Comparison with the reference fields

One of the advantages of the twin data assimilation experiments is availability of the true fields from which the data were sampled. Therefore, one can make accurate and explicit estimates of the errors in the analysis.

To give a first impression of the assimilation quality in the horizontal, in Fig. 8a,b we compare the true (reference) fields at *t* = 0 with their first-guess values and with the best estimate obtained under realistic assumptions on the noise level (ɛ_{ψ} = ɛ_{ρ} = 0.25, ɛ_{ζ} = 0.05). As can be seen, tomography data constrained by dynamics are capable of reproducing eddy features at scales several times smaller than the distance between the transceivers. These features were also captured by the tomography data alone (without SA data). In the deep layers spatial variability of the ocean state is much lower since the reference state density field is dominated by climatology (Fig. 8b). This causes low first-guess errors in density and a relatively small value of *e*^{clim}_{ρ}*e**. Nevertheless, one can observe that smaller-scale eddylike features in the streamfunction field are reproduced reasonably well even at these levels.

Evolution of the relative errors with time at 769 m is shown in Fig. 9. This and all other examples below are given for the case of combined TT–DT–SA data assimilation with *K* = 250, ɛ_{ψ,ρ} = 0.25, and ɛ_{ζ} = 0.05. The error curves in Fig. 9 are fairly flat, with weak minima in the middle of the assimilation period for both *e*_{ρ} and *e*_{ψ}. That is consistent with Derber's (1987) conclusion that analysis error should be lowest at the middle of the assimilation period, where information from both past and future has the strongest impact on the optimized solution. Fairly good observability of the mesoscale features is demonstrated by Fig. 10, which shows considerable error reduction and the loss of mesoscale structure in the error fields after assimilation. The only exception is observed in the NE corner of the array, where at some times the initial error is even increased. To some extent the effect can be explained by the sparsity of real measurements in the NE corner and by specific features of the reconstructed flow in that region. The geometry of the current in the north and NE of the domain indicates (Fig. 8a) that streamfunction contours in the area may not carry enough real information from AT, which is mainly substituted by bogus data (not counting occasional and sparse SA observations). The bogus data (smoothness constraints), on the other hand, are not consistent with the true frontal structure in the area, resulting in large assimilation errors.

Normalized error fields for three different levels are presented in Fig. 11. Spatial error patterns exhibit a noticeable change at 1075 m: a region of large errors in the northwest disappears and converts into a more isotropic error distribution along the boundaries. This may be attributed to the general flow structure in the deep layers: a southwestward current inflowing from the northeastern corner (Fig. 8b) is well monitored by the acoustic rays downstream, which in turn identify the upstream conditions, causing a significant error reduction in the NE corner.

Inspection of Figs. 10b and 11 also shows that the assimilation quality is much higher in the pentagonal region (Fig. 12) whose vertices are defined by the acoustic transceivers (ATPs). Computations show that the residual errors *e*_{ψ,ρ} are reduced by more than a factor of two (down to 0.2–0.25) if the horizontal average in (14) is performed within the ATP region.

### b. Estimates of the error variances

One can estimate the error variance var[*y*] of any observable *y* related to the control vector **c** through a linear relationship *y* = **Lc** by solving the system of equations

𝗛**x** = **L**, (16)

where 𝗛 is the Hessian matrix, and then taking the inner product of **x** with **L**: var[*y*] = **L**^{T}**x**. The system of *D* equations (16) can be solved iteratively as soon as a code is available for multiplication of an arbitrary vector by 𝗛. Selecting *y* to be a field value at a certain point in space and time, and applying the adjoint code to compute the representation of **L** in the space of control variables, one can obtain error maps by varying the location of the observable(s) within a selected surface. This is an expensive computation, because the error estimate at any point of the map requires an iterative inversion of the Hessian matrix [i.e., solution of the system (16)]. On the other hand, these computations are easily parallelized, since the error estimates at different points in physical space are computationally independent of each other.
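The iterative solution of system (16) can be sketched with a matrix-free conjugate gradient loop. The choice of conjugate gradients is an assumption of this example (the text only states that the system is solved iteratively), and the small dense matrix `H` below is a hypothetical stand-in for the Hessian, which in practice would be available only through a routine that multiplies a vector by it. The names `conjugate_gradient`, `error_variance`, and `hess_vec` are invented for the illustration.

```python
import numpy as np

def conjugate_gradient(hess_vec, rhs, tol=1e-10, max_iter=500):
    """Solve H x = rhs using only products hess_vec(v) = H v.

    Assumes H is symmetric positive definite.
    """
    x = np.zeros_like(rhs)
    r = rhs.copy()                 # residual rhs - H x for the initial x = 0
    p = r.copy()                   # first search direction
    rr = r @ r
    rhs_norm = np.sqrt(rhs @ rhs)
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol * rhs_norm:
            break
        hp = hess_vec(p)
        alpha = rr / (p @ hp)
        x += alpha * p
        r -= alpha * hp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

def error_variance(hess_vec, L):
    """Return var[y] = L^T x, where x solves H x = L."""
    return L @ conjugate_gradient(hess_vec, L)

# Toy stand-in for the Hessian (hypothetical, D = 50).
rng = np.random.default_rng(1)
D = 50
A = rng.standard_normal((D, D))
H = A @ A.T + D * np.eye(D)        # symmetric positive definite by construction

def hess_vec(v):
    return H @ v

# Observable: a single control variable, so L is a unit vector.
L = np.zeros(D)
L[7] = 1.0

var_y = error_variance(hess_vec, L)
var_direct = np.linalg.inv(H)[7, 7]   # direct check, feasible only for small D
print(var_y, var_direct)
```

An error map is then obtained by repeating `error_variance` for each location of the observable; since each call is independent, the loop over locations can be distributed across processors.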

As an example, we present two error maps for *ρ* at *z* = 414 m and *ψ* at 2145 m taken at time *t* = 10 days (Fig. 12). The patterns are normalized by the corresponding horizontal mean errors of the first-guess solution. The underlying posterior error distributions var[*ρ*](**x**) = **L**^{T}_{ρ(x)}𝗛^{−1}**L**_{ρ(x)} and var[*ψ*](**x**) = **L**^{T}_{ψ(x)}𝗛^{−1}**L**_{ψ(x)} were calculated by solving (16) at every point **x** in the horizontal, with **L**_{ρ(x)} and **L**_{ψ(x)} being the *D*-dimensional vectors projecting **c** onto the values of *ρ* and *ψ* at that point. Both maps show considerable error reduction within the ATP region, indicating that a great deal of information has been gained from the assimilation in that area. At the same time these posterior error maps are highly consistent with the results obtained by simple comparison of the reference and assimilated fields (Figs. 10b, 11).

One should note that the above-described error analysis is approximate, since the model is nonlinear and we have to consider only small deviations from the optimal state to keep the Gaussian assumption valid. As a consequence, the tangent linear model code is used for implicit multiplication of the Hessian by a vector and for the computation of the components of **L** (e.g., Yaremchuk et al. 1998). Despite its approximate nature we can, however, conclude that such a type of analysis may be useful in handling the real data and estimating their assimilation quality.

## 5. Discussion and conclusions

Although the numerical model used in this study has limitations, it does illustrate the possibility of assimilating AT data in combination with satellite altimetry. The results of numerical simulations presented in this article demonstrate that quasigeostrophic currents can be monitored in the open ocean by means of dynamical interpolation of acoustic tomography and satellite altimetry data. It is shown that the initial error level can be reduced by more than a factor of two by assimilating data from five acoustic transceivers in combination with SA and under realistic assumptions on the observational noise level. The error reduction appears to be almost twice as large (up to a factor of four) if estimated only within the subdomain covered by acoustic transceivers.

In formulating the inverse model we tried to be as close as possible to a typical medium-range tomography experiment in the open ocean where lateral boundary conditions are poorly known. Prescribing freedom at the open boundaries greatly increases the number of unknown model parameters so that the objective of the study may seem too ambitious: to retrieve almost half a million unknowns from fewer than 70 000 observations. The task may seem even more complicated since we deliberately selected the type of regularization by imposing smoothness constraints rather than attracting the solution to a background field with poorly known statistics. This choice of regularization resulted in somewhat slower convergence than one may expect from the background state regularization combined with the choice of a well-conditioned covariance matrix for that state. However, the problem appears to be computationally feasible for moderate facilities. It is also free from uncertainties associated with the necessity to choose the background state and its covariance. A posteriori error analysis has shown better performance of the smoothness regularization compared to the technique of attracting the model to the background state generated by the first-guess solution. In this respect, the use of smoothness constraints is an important feature of the presented assimilation algorithm.

Owing to the large dimensions of the control and data spaces it is impractical to conduct rigorous error analysis, which requires computation of the second moments. As an example, in order to assess the number of independent observations one has to study the system of operators that projects the 435 902-dimensional control space onto the entire dataset (including bogus data points). We provided an approximate solution to the problem by treating the projection operators in a simplified 3D case. As seen from Table 1, observational properties of the AT systems estimated in that way correlate well with the quality of assimilation. Approximate error analysis has also been performed by means of implicit inversions of the Hessian matrix. The results appear to be consistent with the error estimates derived from simple comparison of the reference and assimilated solutions. Both kinds of error analysis revealed poorly defined components of the assimilated solution in the northeastern corner of the domain and along the boundaries. These components can be treated as null space constituents if we bound the Hessian spectrum from below by interrupting the iteration procedure before full convergence is achieved. In this respect it is also worthwhile to note that the result of assimilation was fairly independent of the first guess. In the experiments starting from the data-generated first guess (section 3a), the optimized states deviated by only a few percent from those obtained by using climatology as a first guess.

One can note that a significant unobserved portion of the ocean state still remains in the assimilation results. As seen in Table 3, approximately half of the initial error variance is unseen by our measurement system. This number is primarily due to the corner regions of the domain, covered only by smoothness constraints but not by real data. If calculated over the ATP subdomain, which is better covered by AT measurements, the relative value of unresolved variance decreases by more than a factor of two, to 0.2–0.25. From the formal point of view only a small fraction of the model's degrees of freedom are directly observed. Even for the experiments with *K* = 250 we had one real measurement point per eight control variables. Taking into account the observational properties of the ray patterns discussed in section 3b, this ratio should be increased by a factor of two to three, to 20–25. It is noteworthy that a twofold reduction of this parameter can be achieved at the expense of adding only two transceivers to the array, since the number of AT data points increases quadratically with the number of stations. Another point is the poor coverage of the uppermost layers by the tomography. Sea surface altimetry tends to close that gap in observability. That was clearly seen in the experiment with enhanced spatiotemporal resolution of SSH data and from the vertical patterns in Fig. 7. It is also worthwhile to note that a large fraction of the ocean state variation has already been explained by the climatology, which captures the large-scale frontal zone in the northern part of the domain dominating the spatial variability pattern. Hence, our measurement system has shown its performance in improving the pattern at scales smaller than the separation between the stations. Taking all that into account, the value of 0.5 (0.2–0.25 within the ATP subdomain) for the quality of assimilation does not look discouraging.
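The quadratic growth of the data volume with the number of stations can be checked with a short calculation. The sketch assumes that every pair of transceivers is monitored and that the number of rays per pair and the sampling rate stay fixed, so that the number of AT data points is proportional to the number of station pairs; the function name `station_pairs` is invented for the example.

```python
from math import comb

def station_pairs(n_stations):
    """Number of distinct transceiver pairs, n(n - 1)/2."""
    return comb(n_stations, 2)

pairs_5 = station_pairs(5)   # array of five transceivers considered in the study
pairs_7 = station_pairs(7)   # the same array with two transceivers added

ratio = pairs_7 / pairs_5
print(pairs_5, pairs_7, ratio)
```

Going from five to seven stations raises the number of pairs from 10 to 21, roughly doubling the number of AT data points, which is the basis of the twofold reduction mentioned above.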

The major question answered by the present study was about the practical feasibility of monitoring mesoscale features in the open ocean by a limited number of AT transceivers. We assume that our investigation gave a positive answer to that question. Experiments with varying the positions of acoustic transceivers in the vertical have shown that the best observability is achieved when the travel time measurements are performed at the axis of the main sound channel. For a given number of rays, measuring properties of the AT system are optimal when the rays are evenly distributed with the launch angle and small-amplitude rays are taken into account. The three data components considered in this study complement each other in three aspects: TT data provide information on the baroclinic structure of the ocean, DT contains a considerable barotropic signal, while altimetry provides additional description of the upper layer, poorly observed by the ray patterns. One can anticipate that additional in situ measurements like expendable bathythermographs may strongly improve the situation, especially in the upper layer where the empirical formula used for relating sound speed and density fluctuations via *θ*/*S* relationship may fail because of strong mixing.

The system under consideration mimics the dynamics of the Kuroshio extension region, which is characterized by strong perturbations of the temperature and salinity fields. Those perturbations may cause considerable deviation of the ray patterns from those in the unperturbed medium. Consequently, the exploited linear model for the AT observation operators may not be valid in the regions of strong meandering currents (Yaremchuk and Yaremchuk 2001). Therefore, the results obtained in this article would be better viewed as an indication of the possibility to monitor quasigeostrophic currents in typical open ocean regions located far from strong meandering jets and eddies, where linear tomography should work more accurately. In the regions of strong currents one should probably use primitive equations models in combination with nonlinear ray models for AT observations (Wunsch 1987). Further improvement of the observation system can be done by including remote sensing data on sea surface temperature and salinity. Combined with acoustic tomography, these data can effectively constrain more complex models that are capable of resolving the timescales of several hours and take full advantage of high temporal resolution of the AT observations.

## Acknowledgments

This study was supported by the Frontier Research System for Global Change through its sponsorship of the International Pacific Research Center (IPRC). Helpful discussions with Dr. H. Mitsudera and Dr. G. Yuan are acknowledged. Dr. G. Yuan kindly provided the parameters of the background state for the 1997 AT array. Finally, two anonymous reviewers were extremely helpful with corrections and suggestions to the manuscript.

## REFERENCES

Brasseur, P., J. Ballabrera, and J. Verron, 1999: Assimilation of altimetric data in a primitive equation model of the Gulf Stream using the steady state limit of a singular evolutive extended Kalman filter. *J. Mar. Syst.*, **22**, 269–294.

Brydon, D., S. Sun, and R. Bleck, 1999: A new approximation of the equation of state for seawater, suitable for numerical ocean models. *J. Geophys. Res.*, **104** (C1), 1537–1540.

Cane, M. A., A. Kaplan, R. N. Miller, B. Tang, E. C. Hackert, and A. J. Busalacchi, 1996: Mapping tropical Pacific sea level: Data assimilation via a reduced state Kalman filter. *J. Geophys. Res.*, **101**, 22599–22617.

Cooper, M., and K. Haines, 1996: Altimetric assimilation with water property conservation. *J. Geophys. Res.*, **101**, 1059–1077.

Cornuelle, B. D., and P. F. Worcester, 1996: Ocean acoustic tomography: Integral data and ocean models. *Modern Approaches to Data Assimilation in Ocean Modeling*, P. Malanotte-Rizzoli, Ed., Elsevier, 97–115.

Derber, J. C., 1987: Variational four-dimensional analysis using quasi-geostrophic constraints. *Mon. Wea. Rev.*, **115**, 998–1008.

Dushaw, B. D., and J. A. Colosi, 1998: Ray tracing for ocean acoustic tomography. Applied Physics Laboratory, University of Washington, Tech. Rep. TM 3-98, 31 pp.

Gilbert, J. Ch., and C. Lemarechal, 1989: Some numerical experiments with variable-storage quasi-Newton algorithms. *Math. Program.*, **45**, 407–435.

Howe, B. M., P. F. Worcester, and R. C. Spindel, 1987: Ocean acoustic tomography: Mesoscale velocity. *J. Geophys. Res.*, **92**, 3785–3805.

Mellor, G., and T. Ezer, 1991: A Gulf Stream model and an altimetry assimilation scheme. *J. Geophys. Res.*, **96**, 8779–8795.

Menemenlis, D., and C. Wunsch, 1997: Linearization of an oceanic general circulation model for data assimilation and climate studies. *J. Atmos. Oceanic Technol.*, **14**, 1420–1443.

Munk, W., P. F. Worcester, and C. Wunsch, 1995: *Ocean Acoustic Tomography*. Cambridge University Press, 434 pp.

Nechaev, D. A., and M. I. Yaremchuk, 1994: Conductivity-temperature-depth data assimilation into a three-dimensional quasigeostrophic open ocean model. *Dyn. Atmos. Oceans*, **21**, 137–165.

Orlanski, I., 1976: A simple boundary condition for unbounded hyperbolic flows. *J. Comput. Phys.*, **21**, 251–269.

Remy, E., and F. Gaillard, 1999: Assimilation of simulated acoustic tomography data. *Proc. WMO Symp. on Assimilation of Observations in Meteorology and Oceanography*, Quebec City, QC, Canada, World Meteorological Organization, 38–39.

Schröter, J., U. Seiler, and M. Wenzel, 1993: Variational assimilation of the Geosat data into an eddy-resolving model of the Gulf Stream area. *J. Phys. Oceanogr.*, **23**, 925–953.

Sheinbaum, J., 1995: Variational assimilation of simulated acoustic tomography data and point observations: A comparative study. *J. Geophys. Res.*, **100** (C10), 20745–20761.

Tapley, B. D., D. P. Chambers, C. K. Shum, R. J. Eanes, and J. C. Ries, 1994: Accuracy assessment of the large-scale dynamic ocean topography from Topex/Poseidon altimetry. *J. Geophys. Res.*, **99**, 24605–24617.

Thacker, W. C., 1988: Fitting models to inadequate data by enforcing spatial and temporal smoothness. *J. Geophys. Res.*, **93**, 10556–10566.

Thacker, W. C., 1989: On the role of Hessian matrix in fitting models to data. *J. Geophys. Res.*, **94**, 6177–6196.

Verron, J., L. Gourdeau, D. T. Pham, R. Murtugudde, and A. J. Busalacchi, 1999: An extended Kalman filter to assimilate satellite altimeter data into a non-linear numerical model of the tropical Pacific Ocean: Method and validation. *J. Geophys. Res.*, **104**, 5441–5458.

Wilson, S., 2000: Launching the ARGO armada. *Oceanus*, **42**, 17–19.

Worcester, P. F., B. Dushaw, and B. M. Howe, 1991: Gyre-scale reciprocal acoustic transmissions. *Ocean Variability and Acoustic Propagation*, J. Potter and A. Warn-Varnas, Eds., Vol. 1, Kluwer Academic, 119–134.

Wunsch, C., 1987: Acoustic tomography by Hamiltonian methods including the adiabatic approximation. *Rev. Geophys.*, **25**, 41–53.

Yaremchuk, M., and A. Yaremchuk, 2001: Variational inversion of the ocean acoustic tomography data using quadratic approximation to travel times. *Geophys. Res. Lett.*, **28**, 1767–1770.

Yaremchuk, M., D. A. Nechaev, J. Schröter, and E. Fahrbach, 1998: A dynamically consistent analysis of circulation and transports in the southwestern Weddell Sea. *Ann. Geophys.*, **16**, 1024–1038.

Yaremchuk, M., D. A. Nechaev, and K. R. Thompson, 2001: Seasonal variation of the North Atlantic Current. *J. Geophys. Res.*, **106** (C4), 6835–6852.

Yuan, G., I. Nakano, H. Fujimori, T. Nakamura, T. Kamoshida, and A. Kaya, 1999: Tomographic measurements of the Kuroshio extension meander and associated eddies. *Geophys. Res. Lett.*, **26**, 79–82.

Table 1. Assimilation quality *e*^{*}_{ψ}, *e*^{*}_{ρ}, computed for the ray systems with different number of rays *K* and vertical position *h* of the transceivers. Superscripts (TT or DT) denote data types used in the experiments.

Table 2. Results of the noise sensitivity experiments. Superscripts (TT or DT) denote the data types used in addition to the altimetry.

Table 3. Quality of assimilation for different combinations of data.


* IPRC/SOEST Contribution Number 92/5561.