## 1. Introduction

Directly observed flow at the depths of the North Atlantic Deep Water in the South Atlantic Ocean shows a system of alternating zonal jets (Hogg and Owens 1999). Deep zonal flow has been explained, for example, using a coarse wind-driven circulation model in the Pacific Ocean (Nakano and Suginohara 2002). Several numerical models of varying resolution of the South Atlantic Ocean have been used to study the origin of the zonal flows, leading to the conclusion that wind is the primary force that generates zonal flow in the Brazil Basin (Treguier et al. 2003). Whether these zonal jets are wind driven or not, it has become clear that a better knowledge and understanding of their horizontal and vertical structure is important to an explanation of their origins and dynamics. Recently, Maximenko et al. (2005) and Richards et al. (2006) have used satellite data and higher-resolution numerical models to investigate the distribution and persistence of zonal flow structures in the Pacific and World Oceans.

The present paper takes an indirect path to the study of the horizontal and vertical structure of ocean circulation. Rather than using numerical models based on primitive equations (i.e., GCMs), we start with observed tracer concentrations and first-principle dynamics to find an estimate of the steady-state circulation. We attempt to distinguish the turbulent component (random mixing) from the advective component. We recognize that this distinction may be dependent on the particular inversion model used and its resolution, that is, that it depends on the method itself. However, we are able to explore this dependency and proceed on the principle that the observed basin-scale tracer distributions operationally define our scale separation. Our results will show that with this definition, a system of zonal jets appears with significant vertical structure, whose horizontal and vertical scales are controlled by the tracer distributions.

In an attempt to extract the signal associated with the mean circulation, oceanographers have found that spatial averaging over scales exceeding the mesoscale yields tracer concentration maps thought to be representative of the mean state. The scarcity of the data, though, usually requires some type of interpolation over the domain of interest. This gives only crude estimates for tracer concentrations in regions where no data are available, and the resulting maps are not necessarily consistent with physical principles.

McKeague et al. (2005) introduced a statistical approach to the estimation of quasi-horizontal flow in an abyssal neutral-density layer based on tracer concentration data. Throughout this paper we will refer to this approach as the *single-layer inversion method* or *2D inversion.* The estimated horizontal flow in this case was found to be primarily zonal, with alternating east–west jets, mainly in the western South Atlantic. The present work extends the single-layer methodology by adding vertical structure. We develop a quasi-3D Bayesian model to reconstruct vertical flow, as well as horizontal flow, and provide improved maps for tracer concentrations. A dynamical analysis of the model as well as uniqueness and accuracy issues are also discussed. The Bayesian approach allows us to construct a probability density (up to a normalizing constant) over the space of all possible solutions. This high-dimensional density provides information regarding the parameters to be estimated. Point estimates as well as uncertainty measures could be easily constructed (in principle) using this distribution. However, the normalizing constant is analytically intractable and numerical simulations are required to produce samples (“snapshots”) from the posterior distribution. We present the posterior mean as the “solution” and posterior standard deviation as “uncertainty,” although many other choices are possible (quantiles, modes, etc.).

*γ*(*x*, *y*, *z*) denotes the neutral density (McDougall 1987). The black dots indicate sites where tracer concentration measurements are available from ships. These bias-corrected hydrographic data [corrected for cruise-dependent standards and offsets following Thurnherr and Speer (2004)] are processed in a similar way to what is described in the 2D inversion method, but this time we are considering several neutral-density layers. Initially our main focus is on the *γ* = 28.00 isopycnal (which we will call the “active” layer) and we use several layers above and below the active layer to specify the vertical structure. We will refer to this approach as a “quasi-3D” inversion since results will be available for a single isopycnal, but we show results for several different choices of targeted layers (e.g., Fig. 19). We then extend the methodology to a multilayer inversion model.

The domain 𝒮 is discretized using a regular lattice 𝒮_{G} consisting of three 37 × 19 grids ℒ_{G} stacked on top of one another, with each layer representing an isopycnal. The three neutral-density layers are one above the active layer (*γ* = 27.96), the active layer itself (*γ* = 28.00), and one below it (*γ* = 28.02), roughly corresponding to depths of 1750, 2000, and 2250 m, respectively. Figure 2 displays oxygen and salinity concentrations obtained by interpolating (using the MATLAB function griddata with option “nearest”) the data onto the grid ℒ_{G} for each layer. There is significant variation in the spatial distribution of the tracer concentration among these layers, indicating that adding vertical structure to the single-layer method would yield a better understanding of the abyssal circulation in this region. Objective mapping could be used to obtain more reliable maps than those in Fig. 2, but our aim is to relate estimates of tracer concentrations to basic physical principles as well. In Fig. 3, vertical profiles for oxygen and salinity measurements at two different locations in the inversion domain are displayed. The vertical structure around 2000-m depth varies among different tracers and across the domain as well, with local extrema in various places and significant concentration gradients (of the first and second order) in other locations, suggesting that vertical mixing terms may be significant. In our inversion we also include silica (SiO_{2}), potential temperature, and large-scale potential vorticity tracers.

Our findings reveal horizontal flow with zonal jets extending across the entire domain, a feature that is not clearly apparent using the single-layer inversion method, and that we attribute to intensified mixing over the Mid-Atlantic Ridge. This allows the zonal jets to maintain tracer balance across the ridge and to persist in the eastern basin. By taking the analysis further with a multilayer model, with each layer corresponding to an isopycnal, the vertical link between layers will be constructed using thermal wind relations. Rather than integrating thermal wind vertically over the entire water column, we impose it weakly and locally. An estimate of the vertical structure of these zonal flows is then available and we compare it to the flow simply derived by integrating thermal wind relations by using quasi-3D estimates as initial values.

The paper is organized as follows: in section 2 we develop a quasi-3D model involving a single active layer with vertical structure determined by boundary conditions above and below the active layer, along with the proposed Bayesian inversion technique and the Markov chain Monte Carlo (MCMC) computational procedure. Inversion results and their analysis are presented in section 3, along with a discussion of the uniqueness and accuracy of the inversion. In section 4, the proposed approach is extended to multiple active layers. Concluding remarks are presented in section 5 and technical details regarding the sampler used to perform the numerical simulations are in the appendix.

## 2. Quasi-3D inversion

### a. Geostrophic flow model

In this section we build the quasi-3D flow model starting from a single horizontal active layer onto which we add vertical advection and diffusion, with boundary conditions for the tracers in layers above and below the layer of interest.

**u** = (*u*, *υ*, *w*), in situ density *ρ*, pressure *p*, *f* denotes the Coriolis parameter, and *β* = ∂*f*/∂*y*. The geostrophic flow model in the 2D inversion is based on exact geostrophic balance

$$\rho f \upsilon = \frac{\partial p}{\partial x}, \qquad \rho f u = -\frac{\partial p}{\partial y}.$$

Cross-differentiating and subtracting these two equations, we get

$$\frac{\partial (\rho f u)}{\partial x} + \frac{\partial (\rho f \upsilon)}{\partial y} = 0.$$

Neglecting density variations (using the Boussinesq approximation) and following the reasoning from the single-layer inversion, under the assumption that zonal flow vanishes along the eastern boundary [*u*(*x*_{e}, *y*) = 0], we can express *u* as

$$u(x, y) = \int_{x}^{x_e} \frac{1}{f}\,\frac{\partial (f \upsilon)}{\partial y}\,dx' \tag{1}$$

where *x*_{e} is the longitude of the eastern boundary of the inversion domain.

*γ*, obtained by neglecting the nonlinear terms (cabbeling and thermobaricity) in Eq. (12) from McDougall (1991),

$$w\,\frac{\partial \gamma}{\partial z} = \kappa^{(z)}\,\frac{\partial^{2} \gamma}{\partial z^{2}} + \frac{\partial \kappa^{(z)}}{\partial z}\,\frac{\partial \gamma}{\partial z} \tag{2}$$

where *κ*^{(z)} denotes the vertical diffusivity. McDougall (1991) notes that cabbeling and thermobaricity are likely to be important in the Antarctic Circumpolar Current. Our domain does not include polar regions; thus, we neglect these terms and restrict the model to the mechanical mixing framework. Vertical variations of *κ*^{(z)} have been hypothesized to be important within 500–1000 m of the ridge flank (Toole et al. 1994; Polzin et al. 1997). At middepth, however, vertical variations of *κ*^{(z)} are thought to be negligible, so we set the term ∂*κ*^{(z)}/∂*z* in (2) equal to zero. Note that although we include vertical structure, we ignore vertical variations in *u*, *υ*, *w*, and *κ*^{(z)} because our inversion has only one active layer and two boundary layers. Later, we relax this assumption and investigate a topographically driven *κ*^{(z)} as well as a model with multiple active layers.
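As an illustration of how the zonal velocity follows from the meridional velocity, the sketch below integrates ∂*u*/∂*x* = −(∂*υ*/∂*y* + *βυ*/*f*) westward from *u* = 0 at the eastern boundary. The function names, the regular grid in metres, the spherical-Earth formulas for *f* and *β*, and the trapezoidal rule are all assumptions made for this example; it is not the discretization used in the inversion.

```python
import math

OMEGA = 7.2921e-5   # Earth's rotation rate (rad/s)
R_EARTH = 6.371e6   # Earth's radius (m)

def coriolis(lat_deg):
    """Return the Coriolis parameter f and its meridional gradient beta."""
    phi = math.radians(lat_deg)
    f = 2.0 * OMEGA * math.sin(phi)
    beta = 2.0 * OMEGA * math.cos(phi) / R_EARTH
    return f, beta

def zonal_velocity(v, lats_deg, dx, dy):
    """Integrate u_x = -(v_y + beta*v/f) westward from u = 0 at the east.

    v[j][i] is the meridional velocity at latitude row j and longitude
    column i (increasing eastward); dx and dy are grid spacings in metres.
    Needs at least two rows and no row on the equator (f = 0 there).
    """
    ny, nx = len(v), len(v[0])
    if ny < 2:
        raise ValueError("need at least two latitude rows")
    u = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        f, beta = coriolis(lats_deg[j])
        # Centered differences inside, one-sided differences at the edges.
        jm, jp = max(j - 1, 0), min(j + 1, ny - 1)

        def integrand(i):
            dvdy = (v[jp][i] - v[jm][i]) / ((jp - jm) * dy)
            return dvdy + beta * v[j][i] / f

        for i in range(nx - 2, -1, -1):
            # Trapezoidal rule for the integral from x_i to x_{i+1}.
            u[j][i] = u[j][i + 1] + 0.5 * dx * (integrand(i) + integrand(i + 1))
    return u
```

For a uniform northward flow in the Southern Hemisphere (*f* < 0), the sketch returns westward flow that strengthens away from the eastern boundary, as expected from the *βυ*/*f* term.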

*γ* is computed using the equations of state for seawater. Vertical gradients of *γ* are approximated using centered differences. Figure 4 presents the first and second vertical derivatives of the neutral density, *γ*_{z} ≡ ∂*γ*/∂*z* and *γ*_{zz} ≡ ∂^{2}*γ*/∂*z*^{2}, at *z*_{0} = −2000 m, and the ratio *r*(*x*, *y*) = *γ*_{zz}/*γ*_{z}, which measures the vertical variation of the abyssal neutral density field and is referred to as the *inverse density scale.* At least five layers (two above and two below the active layer) of data are needed to approximate these vertical gradients. Ignoring vertical variations in *κ*^{(z)} and *w*, we can express the vertical velocity as

$$w(x, y) = \kappa^{(z)}(x, y)\, r(x, y) \tag{3}$$

and thus, given *κ*^{(z)}, see that *w* is also determined. The inverse density scale *r*(*x*, *y*) presented in the lower panel of Fig. 4 plays a very important role in the inversion. Because *κ*^{(z)} is positive, *w* is forced to have the same sign as *r*(*x*, *y*). Moreover, if the cross-isopycnal mixing field is fairly smooth, then the vertical velocity field will resemble *r*(*x*, *y*). Although we assume that *r* is known, it is an inherently noisy quantity (being the ratio of two derivatives). To investigate whether perturbations in *r* significantly affect the results of the inversion, Eq. (3) could be extended to *w*(*x*, *y*) = *κ*^{(z)}(*x*, *y*)[*r*(*x*, *y*) + *ζ*(*x*, *y*)], where *ζ* is a zero-mean 2D Gaussian Markov random field as described in the next section. In the present paper, however, we assume that *ζ* is negligible.
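The relation *w* = *κ*^{(z)}*r* can be illustrated with a short sketch that estimates *γ*_{z}, *γ*_{zz}, and *r* from five levels and then evaluates *w*. The standard fourth-order centered stencils and the equal level spacing are assumptions chosen for the example (the neutral-density layers in the inversion are not equally spaced in depth), and the function names are hypothetical.

```python
def inverse_density_scale(gamma, dz):
    """Return (gamma_z, gamma_zz, r) at the middle of five levels.

    gamma lists neutral density at z0-2dz, z0-dz, z0, z0+dz, z0+2dz,
    with z positive upward and equal spacing dz (an assumption).
    """
    g_m2, g_m1, g_0, g_p1, g_p2 = gamma
    # Fourth-order centered first and second derivatives.
    gamma_z = (g_m2 - 8.0 * g_m1 + 8.0 * g_p1 - g_p2) / (12.0 * dz)
    gamma_zz = (-g_m2 + 16.0 * g_m1 - 30.0 * g_0
                + 16.0 * g_p1 - g_p2) / (12.0 * dz ** 2)
    return gamma_z, gamma_zz, gamma_zz / gamma_z

def vertical_velocity(kappa_z, r):
    """Vertical velocity w = kappa_z * r for a positive diffusivity."""
    if kappa_z <= 0.0:
        raise ValueError("the vertical diffusivity must be positive")
    return kappa_z * r
```

Because the diffusivity is positive, the sign of the returned *w* is set entirely by *r*, which is the constraint discussed above.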

**u**(*x*, *y*, *z*) = (*u*, *υ*, *w*), diffusion coefficients *K* = diag[*κ*^{(x)}, *κ*^{(y)}, *κ*^{(z)}], and tracer concentrations *C* = *C*(*x*, *y*, *z*) is made via a steady-state 3D advection–diffusion equation

$$\mathbf{u} \cdot \nabla C = \nabla \cdot (K \nabla C) + Q_C \tag{4}$$

with Dirichlet boundary conditions *C*(*x*, *y*, *z*) = *C*_{∂𝒮} for (*x*, *y*, *z*) ∈ ∂𝒮. The source term is *Q*_{C} = −*λC* for oxygen and *Q*_{C} = 0 for conservative tracers; we specify *λ* = 10^{−10} s^{−1}. We refer the reader to section 3c for an in-depth discussion regarding our choice for *λ*. The vertical diffusion coefficient *κ*^{(z)} is equivalent to the one used in (2), since neutral density diffuses with the same coefficient as temperature and salinity. We aim in this study to arrive at a solution with no eddy advection terms involving gradients of *κ*^{(x)} and *κ*^{(y)}. Thus, the horizontal diffusion coefficients are assumed to be spatially invariant. Variations in mixing can be investigated in our framework, but this extension is left for future work. Given a set of parameters Φ = (**u**, *K*, *C*_{∂𝒮}), the advection–diffusion equation may be solved numerically to reconstruct the tracer concentration (the forward problem).
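To make the forward problem concrete, the sketch below solves a one-dimensional analog, *u* d*C*/d*x* = *κ* d^{2}*C*/d*x*^{2} − *λC*, with Dirichlet values at both ends. The reduction to one dimension, the centered differences, the direct tridiagonal (Thomas) solve, and the function name are all assumptions made for illustration; the inversion itself relies on a 3D multigrid solver.

```python
def solve_forward_1d(u, kappa, lam, c_west, c_east, length, n):
    """Solve u C' = kappa C'' - lam C on [0, length] with Dirichlet ends.

    Uses centered differences on n interior points and the Thomas
    algorithm. The scheme is free of oscillations only when the cell
    Peclet number |u| h / (2 kappa) does not exceed 1.
    Returns the n + 2 grid values, boundaries included.
    """
    h = length / (n + 1)
    lower = kappa / h ** 2 + u / (2.0 * h)   # coefficient of C[i-1]
    diag = -2.0 * kappa / h ** 2 - lam       # coefficient of C[i]
    upper = kappa / h ** 2 - u / (2.0 * h)   # coefficient of C[i+1]

    # Known boundary values move to the right-hand side.
    rhs = [0.0] * n
    rhs[0] -= lower * c_west
    rhs[-1] -= upper * c_east

    # Forward elimination.
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = upper / diag
    dp[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - lower * cp[i - 1]
        cp[i] = upper / denom
        dp[i] = (rhs[i] - lower * dp[i - 1]) / denom

    # Back substitution.
    c = [0.0] * n
    c[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        c[i] = dp[i] - cp[i] * c[i + 1]
    return [c_west] + c + [c_east]
```

With no flow and no sink the solution is the straight line between the two boundary values; switching on the sink or an eastward flow pulls the interior values below that line when the western value is the smaller one.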

### b. Bayesian inversion

Beyond the physical model, we need the observational model to connect the observed tracer concentrations to the underlying steady-state tracer fields. We assume that the measured concentrations *C* ≡ [*C*^{(j)}_{obs,i}] satisfy *C*^{(j)}_{obs,i} ∼ *N*(*C*^{(j)}_{i}, *σ*^{2}_{(j)}), where *i* = 1, . . . , *n*_{D} indexes a site on the lattice ℒ_{G} where data are available; *j* = 1, . . . , *n*_{C} indexes a tracer; and *C*^{(j)}_{i} denotes the unknown, true, steady-state tracer concentration. The observations *C*^{(j)}_{obs,i} are assumed to be independent random variables with constant (and prespecified) variance *σ*^{2}_{(j)} for each tracer *j* (see the end of section 3d for further details). It is important to keep in mind that the underlying tracer concentration *C*^{(j)}_{i}(*x*, *y*, *z*) = *C*^{(j)}_{i}(*x*, *y*, *z*|Φ) is obtained as the solution of the advection–diffusion Eq. (4), given the parameters Φ = (**u**, *K*, *C*_{∂𝒮}). This solution is numerically approximated using a 3D multigrid iteration routine available in MUDPACK (Adams 1991). In the solver, we use a finer grid created by adding extra grid points between all neighboring points of 𝒮_{G} to improve the accuracy of the solution to the forward problem. The boundary values on the finer grid are determined by linearly interpolating their values on 𝒮_{G}.

*π*(Φ), for the parameters to be estimated. Using the data model described above, the Bayes formula gives the probability density function (pdf) of Φ conditionally on the tracer data, up to a normalizing constant,

$$\pi(\Phi \mid C) \propto L(C \mid \Phi)\,\pi(\Phi),$$

where *L* denotes the likelihood function,

$$L(C \mid \Phi) \propto \exp\left\{-\sum_{j=1}^{n_C}\sum_{i=1}^{n_D} \frac{\left[C^{(j)}_{\mathrm{obs},i} - C^{(j)}_{i}(\Phi)\right]^{2}}{2\sigma^{2}_{(j)}}\right\}.$$

We specify *υ*, *κ*^{(x)}, *κ*^{(y)}, *κ*^{(z)}, and *C*_{∂𝒮} to be independent under the prior distribution. The remaining components *u* and *w* are determined via Eqs. (1) and (3). We choose to model *υ* using a 2D Gaussian Markov random field (GMRF), having pdf

$$\pi(\upsilon) \propto \exp\left\{-\delta_{1}\sum_{s \sim s'} (\upsilon_{s} - \upsilon_{s'})^{2} - \delta_{2}\sum_{s} \upsilon_{s}^{2}\right\},$$

where the first sum runs over all neighboring sites (*s* ∼ *s*′ if *s* and *s*′ are adjacent sites on the lattice ℒ_{G}). The positive tuning parameters *δ*_{1} and *δ*_{2} control the neighbor interaction and the variance in the prior distributions. We specify these parameters in Table 1. The values presented are selected based on inspection of several simulations from the prior distribution, leading to reasonable prior information. We find that random draws from this prior distribution display the features that we expect (see Fig. 15, upper left). As in the single-layer approach, exponential distributions (with mean 1000 m^{2} s^{−1}) are selected for the horizontal diffusion coefficients *κ*^{(x)} and *κ*^{(y)}. As discussed before, we ignore vertical variations in *κ*^{(z)} and restrict it to positive values; hence, the absolute value of a 2D GMRF is an appropriate choice as a prior distribution for the vertical diffusion coefficient. For each tracer *C*^{(j)}, the boundary conditions *C*^{(j)}_{∂𝒮} are now represented by two 2D GMRFs (for the layer above and below the layer of interest) and a 1D GMRF (for the edge of the middle layer), thereby allowing for interaction between adjacent sites on the lateral surfaces of the 3D inversion domain.
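The pieces of the posterior can be sketched in a few lines: an unnormalized GMRF log prior with a neighbor-difference penalty weighted by *δ*_{1} and a site penalty weighted by *δ*_{2}, and a Gaussian log likelihood for independent observations. The exact scaling of the two prior sums, the four-neighbor lattice, the dictionary layout, and the function names are assumptions for this illustration only.

```python
def log_gmrf_prior(field, delta1, delta2):
    """Unnormalized log density of a 2D GMRF on a rectangular lattice.

    Assumed form: -delta1 * sum over adjacent pairs of squared
    differences, minus delta2 * sum over sites of squared values.
    """
    ny, nx = len(field), len(field[0])
    pair_sum = 0.0
    site_sum = 0.0
    for j in range(ny):
        for i in range(nx):
            site_sum += field[j][i] ** 2
            # Count each horizontal and vertical neighbor pair once.
            if i + 1 < nx:
                pair_sum += (field[j][i] - field[j][i + 1]) ** 2
            if j + 1 < ny:
                pair_sum += (field[j][i] - field[j + 1][i]) ** 2
    return -delta1 * pair_sum - delta2 * site_sum

def log_likelihood(observed, modeled, sigma2):
    """Gaussian log likelihood, up to a constant, summed over tracers.

    observed and modeled map a tracer name to values at the data sites;
    sigma2 maps a tracer name to its error variance.
    """
    total = 0.0
    for tracer, obs in observed.items():
        sq = sum((o - m) ** 2 for o, m in zip(obs, modeled[tracer]))
        total -= sq / (2.0 * sigma2[tracer])
    return total

def log_posterior(observed, modeled, sigma2, v_field, delta1, delta2):
    """Unnormalized log posterior: log likelihood plus log prior on v."""
    return (log_likelihood(observed, modeled, sigma2)
            + log_gmrf_prior(v_field, delta1, delta2))
```

Only the prior on *υ* is included here; in a fuller sketch the priors on the diffusion coefficients and boundary values would add further terms to the same sum.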

#### 1) Extracting information from the posterior distribution

*π*(Φ|*C*) is not possible because of the intractable normalizing constant. Using MCMC techniques, we are able to explore the posterior state space in a manner that allows us to draw samples from it. This exploration is performed by moving from one state (or possible solution) to another such that the amount of time the Markov chain spends in a particular region of the state space (quantified as the number of states from that region visited by the chain) is proportional to the probability mass the posterior distribution puts on that region. Having drawn samples from the posterior distribution, the mean and standard deviations are then estimated using the sample statistics. For more background on MCMC we refer the reader to Liu (2001). Following discussions from the single-layer inversion method, the posterior state space needs to be explored adequately in order to assess the posterior uncertainty accurately. Standard random walk Metropolis moves and column moves as described in McKeague et al. (2005) are used. In addition, we introduce new moves (scaling moves) to improve the mixing of the chain and achieve fast decorrelation in the posterior sample. These moves are described and discussed in the appendix. A complete iteration in the MCMC scheme we use consists of 1000 cycles; each cycle consists of (i) a scaling move; (ii) a random walk move on *κ*^{(x)}, *κ*^{(y)} (one chosen at random); (iii) a random walk move on one randomly selected component of *υ*; (iv) a random walk move on one randomly selected component of *κ*^{(z)}; (v) a random walk move on one randomly selected component of *C*_{∂𝒮}; and (vi) a column move.
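The random walk moves in the cycle can be illustrated with a generic single-component Metropolis sampler. This is a minimal sketch on a toy target, not the sampler used for the inversion: the scaling and column moves are omitted, and the function name, Gaussian proposal, and fixed step size are assumptions.

```python
import math
import random

def random_walk_metropolis(log_target, start, step, n_iter, seed=0):
    """Sample from exp(log_target) by perturbing one component at a time.

    Returns the list of visited states and the acceptance rate.
    """
    rng = random.Random(seed)
    state = list(start)
    log_p = log_target(state)
    samples = []
    accepted = 0
    for _ in range(n_iter):
        k = rng.randrange(len(state))        # one randomly selected component
        proposal = list(state)
        proposal[k] += rng.gauss(0.0, step)  # symmetric Gaussian proposal
        log_p_new = log_target(proposal)
        # Metropolis rule: always accept uphill moves, otherwise accept
        # with probability equal to the density ratio.
        if log_p_new >= log_p or rng.random() < math.exp(log_p_new - log_p):
            state, log_p = proposal, log_p_new
            accepted += 1
        samples.append(list(state))
    return samples, accepted / n_iter

def posterior_mean_and_sd(samples, component, burn_in):
    """Sample mean and standard deviation of one component after burn-in."""
    values = [s[component] for s in samples[burn_in:]]
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    return mean, math.sqrt(var)
```

The time the chain spends in a region is proportional to the target mass there, so the sample mean and standard deviation after a burn-in period play the roles of the posterior “solution” and “uncertainty.”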

## 3. Inversion results

### a. Flow estimates

The posterior mean horizontal flow shows predominant zonal flow with alternating jets associated with tracer tongues, narrower in the vicinity of the equator and broader farther south (Fig. 5). Quasi-3D inversion velocities significantly different from zero by at least one and two posterior standard deviations are displayed in the left panel of Fig. 6. In comparison to the solution of the single-layer inversion method (Fig. 6, right panel), we observe more significant zonal flow and less meridional flow with only a few significant values along the east side of the Mid-Atlantic Ridge in the southern part of the domain. There is evidence of alternating zonal jets that extend across the entire domain. To get an adequate comparison of the widths of the alternating zonal jets to the Rossby radius, we would need to obtain sufficient dynamic range, which would require a larger inversion region.

Our solution includes a large-scale potential vorticity (PV = *f* /*H*) estimate (Fig. 7) because PV is one of the tracers. This estimate enables gradients of PV to be calculated consistently with the other tracers and with the velocity field (this does not mean that the relative vorticity is important). We approximate the meridional gradient of PV and display it in the right panel of Fig. 7, with the mean flow field superimposed. We note the zonally banded structure of ∂(PV)/∂*y* and observe that westward zonal jets are generally associated with high positive meridional gradients. The meridional scale of the jets is about 2°–4°, similar to the findings of Treguier et al. (2003).

*x* axis is

In Fig. 8 we detail this analysis for oxygen, showing the RMS for each term as a function of iteration in an MCMC simulation. We find that our solution is primarily advective with vertical components that have an overall small influence, presumably due to the coarse vertical resolution in our model. The vertical terms *κ*^{(z)}*O*_{zz} and *wO*_{z} are one order of magnitude smaller than the horizontal ones and exhibit variation that is too small to be observed on the scale of the plot.

### b. Vertical structure

Our approach provides estimates and uncertainties of the vertical diffusion coefficient *κ*^{(z)} and velocity *w* (Fig. 9). We observe upwelling throughout the domain. This is mainly a result of the simplified model and the constraints imposed from Eq. (3) [note that *κ*^{(z)} and *w* must have the same sign because the inverse density scale *r*(*x*, *y*) is positive throughout the active layer].

The center panels of Fig. 9 present the posterior mean (left) and standard deviations (right) for the vertical mixing coefficient. Increased vertical mixing is usually associated with higher uncertainty in the posterior distribution. Although the average field is fairly smooth across the entire domain, it is interesting, even though we do not incorporate topography into our model, to notice an association between *κ*^{(z)} and the bottom topography, with higher values over the midocean ridge.

*z*(*x*, *y*) denotes the depth of the layer of interest, *h*(*x*, *y*) is the depth of the ocean, *α* denotes an exponential decay scale, and *L*(*x*, *y*) is an arbitrary (smooth) level function. We treat *h* and *z* as fixed (nonrandom) functions of location and *L* and *α* as random positive quantities. Given an absence of prior knowledge about the level function, a 2D GMRF is suitable as a prior distribution for *L*. For the decay scale, we use an exponential prior with mean 500 m. Huang and Jin (2002) specify the vertical diffusivity similarly, using a quadratic exponential decay to study deep circulation induced by bottom intensified mixing. At middepths shallower than the tip of the midocean ridge, the appropriateness of such a model is not clear.
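As a rough illustration of bottom-intensified mixing with an exponential decay scale, the sketch below assumes the form *κ*^{(z)} = *L* exp[−(*h* − *z*)/*α*], where *h* − *z* is the height of the layer above the sea floor. This specific functional form and the function name are assumptions made for the example, built only from the quantities defined above.

```python
import math

def topographic_kappa(level, ocean_depth, layer_depth, alpha):
    """Vertical diffusivity decaying exponentially with height above bottom.

    level is the value of the level function L at this location,
    ocean_depth (h) and layer_depth (z) are positive depths in metres,
    and alpha is the decay scale in metres.
    Assumed form: level * exp(-(ocean_depth - layer_depth) / alpha).
    """
    if alpha <= 0.0 or level <= 0.0:
        raise ValueError("level and alpha must be positive")
    if layer_depth > ocean_depth:
        raise ValueError("the layer lies below the sea floor here")
    height_above_bottom = ocean_depth - layer_depth
    return level * math.exp(-height_above_bottom / alpha)
```

Under this assumed form, a layer at 2000 m over a 4000-m-deep basin with *α* = 500 m sees the level function reduced by a factor exp(−4), while over a ridge crest near 2500 m the reduction is only exp(−1).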

Our results from this approach (Fig. 9, lower panels) show higher posterior variability and overall reduced vertical diffusivity values, becoming insignificant in various places in the eastern and western basin. The posterior mean exponential decay scale of 800 m has very high uncertainty (posterior standard deviation ∼500 m), suggesting that the inversion is trying to introduce more variability than model (7) can accommodate.

*C*^{(j)}_{obs,i} and the tracer concentration *C*(Φ) determined via the advection–diffusion equations is quantified using a cost function (or equivalently, minus the log-likelihood), which is defined as the error sum of squares normalized by twice the error variance and takes the form

$$\sum_{j=1}^{n_C}\sum_{i=1}^{n_D} \frac{\left[C^{(j)}_{\mathrm{obs},i} - C^{(j)}_{i}(\Phi)\right]^{2}}{2\sigma^{2}_{(j)}}.$$

We use this cost function to compare the fit to the oxygen data produced by our model with topographic mixing (7) and our earlier model [GMRF directly on *κ*^{(z)}; see Fig. 10]. Note that there is a higher error associated with topographic mixing, which we attribute to a less realistic horizontal velocity field. This is due to a strong control exerted over ∂*w*/∂*z*, hence over *υ* and *u* as well.

### c. Oxygen consumption rate and uniqueness

*λ* appearing in the oxygen equation plays an important role in our setting. Although it varies near regions with high productivity, *λ* is thought to be nearly constant over the broad interior of the ocean (Feely et al. 2004). For simplicity we set *λ* to be a constant; this provides a fixed clock rate in the advection–diffusion equation and avoids the ill-posedness of the inverse problem. The magnitude of *λ* can be estimated from a simple analysis of the terms in the advection–diffusion equation for oxygen,

$$\mathbf{u} \cdot \nabla \mathrm{O} = \nabla \cdot (K \nabla \mathrm{O}) - \lambda \mathrm{O}.$$

A typical deep ocean is characterized in the interior (outside the western boundary current) by length scales of order *L* ∼ 10^{6} m, velocities of about 10^{−2} m s^{−1}, and diffusion coefficients of 10^{3} m^{2} s^{−1}. Thus, the advection term in the equation above is estimated to be of the order 10^{−2} m s^{−1} × 10^{−5} μmol kg^{−1} m^{−1} = 10^{−7} μmol kg^{−1} s^{−1}. Similarly, the diffusion term is roughly 10^{3} m^{2} s^{−1} × 10^{−11} μmol kg^{−1} m^{−2} = 10^{−8} μmol kg^{−1} s^{−1}. This leads to *λ*O ∼ 10^{−7} μmol kg^{−1} s^{−1}, or equivalently, *λ* ∼ 5 × 10^{−10} s^{−1}. A similar value can be found by a different scaling argument. The time scale of our system is of the order 10^{8} s (3 yr); thus, changes in the oxygen concentration are ΔO ∼ *λ* × 10^{8} × O, hence *λ* ∼ 5 × 10^{−10} s^{−1}. Also, Feely et al. (2004) report an oxygen utilization rate (*λ*O) of 0.1 × 10^{−9} μmol kg^{−1} yr^{−1}, which leads to *λ* ∼ 2 × 10^{−11} s^{−1}. Thus, we settled on the average value *λ* = 10^{−10} s^{−1}.
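The two scaling arguments can be checked with a few lines of arithmetic. The velocity, diffusivity, length, time, and gradient scales are those quoted above; the typical oxygen concentration of 200 μmol kg^{−1} and the reading of ΔO as the gradient scale times the length scale are assumptions introduced here to close the estimate.

```python
# Scales quoted in the text.
U = 1.0e-2        # velocity (m/s)
KAPPA = 1.0e3     # lateral diffusivity (m^2/s)
LENGTH = 1.0e6    # length scale (m)
GRAD_O = 1.0e-5   # oxygen gradient (umol/kg/m)
CURV_O = 1.0e-11  # oxygen second derivative (umol/kg/m^2)
TIME = 1.0e8      # time scale of the system (s)

# Assumption, not stated in the text: a typical deep oxygen level.
OXYGEN = 200.0    # umol/kg

def consumption_rate_estimates():
    """Return the advection and diffusion scales and two estimates of lambda."""
    advection = U * GRAD_O                  # about 1e-7 umol/kg/s
    diffusion = KAPPA * CURV_O              # about 1e-8 umol/kg/s
    # First argument: the sink balances the dominant (advective) term.
    lam_balance = advection / OXYGEN
    # Second argument: the oxygen change over the system time scale,
    # taking the change as gradient times length (an assumption).
    delta_o = GRAD_O * LENGTH
    lam_clock = delta_o / (TIME * OXYGEN)
    return advection, diffusion, lam_balance, lam_clock
```

Both routes give a rate of about 5 × 10^{−10} s^{−1} under these assumptions, and the advective term exceeds the diffusive one by an order of magnitude.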

Higher values for *λ* would increase the “sink” term in the oxygen equation, requiring greater diffusion near the equator where advection is small. Indeed, this is shown by trace plots for *κ*^{(x)} and *κ*^{(y)} in a simulation run using only oxygen and *λ* = 10^{−8} s^{−1} (Fig. 11, lower right panel). The lateral diffusion coefficients are of order 10^{5} m^{2} s^{−1}. This would throw off the balances in the advection–diffusion equation for the other tracers (the diffusion term dominates), which is physically unrealistic. It would also cause the Markov chain to get stuck. To illustrate this, for the same value of *λ*, a simulation run using all the tracers was performed. Examining the trace plots for the lateral diffusion coefficients (Fig. 11, lower left panel) we find that these parameters do not “move” (nearly flat lines) for many iterations.

Lower values of *λ* are not acceptable either. Reducing the oxygen consumption rate by two orders of magnitude, to 10^{−12} s^{−1}, lowers the velocities and diffusion coefficients by the same factor (in order to maintain balance). In particular *κ*^{(y)} becomes almost negligible (about 10 m^{2} s^{−1}; see Fig. 11, upper right panel), even though the Markov chain is not getting stuck in this case. The choice of *λ* = 10^{−10} s^{−1} appears to be satisfactory (Fig. 11, upper left panel) in that it produces plausible diffusion coefficients and adequate convergence of the Markov chain.

We further examine the behavior of the chain by characterizing each accepted solution by a pair of nondimensional quantities (*λ̃*, Pe), where *λ̃* = *λL*/*U*. The velocity component *U* is estimated using the spatial RMS described above and varies as the chain is exploring the posterior distribution. The Peclet number is computed as Pe = *UL*/*κ*^{(x)}. The length scale *L* and the clock rate *λ* are held constant: *L* = 10^{6} m, *λ* = 10^{−10} s^{−1}. In Fig. 12 we display visited states in the (Pe, *λ̃*) space. We notice that the Markov chain does converge (the gray dots tend to stay in a relatively small region of this space). Once the chain has reached stationarity, the possible solutions are characterized by Pe ∼ 6. The same value for the Peclet number can also be achieved if diffusion coefficients and velocities are simultaneously very large or very small, which could happen under the prior model, yet in that case *λ̃* would be very small or very large. The sink term in the oxygen advection–diffusion equation is thus selecting the solution.

Tonguelike distributions of tracer concentration data are usually interpreted as advective effects. As Armi and Haidvogel (1982) point out, for a steady-state model, variable and anisotropic diffusion coefficients can generate similar distributions. Floats deployed in the Brazil Basin (Hogg and Owens 1999) have shown a flow dominated by zonal motions. Our results show a significant anisotropy of the diffusion coefficients (see Fig. 13), but we find that the tracer concentration tongues are dominantly advective (see, e.g., Fig. 8). We estimate a mean meridional component of roughly 220 m^{2} s^{−1}, consistent with low eddy kinetic energy levels, of the order 1 cm^{2} s^{−2} with a time scale of 20–30 days. Based on the balance between zonal advection and meridional eddy diffusion and on estimates of tracer gradients, Thurnherr and Speer (2004) come to a similar value for *κ*^{(y)}. Our estimates are also consistent with the large-scale eddy diffusivities [*κ*^{(x)}, *κ*^{(y)}] = (1.5 ± 0.7, 0.7 ± 0.4) × 10^{3} m^{2} s^{−1} estimated using float and tracer observations from the North Atlantic Tracer Release Experiment (Sundermeyer and Price 1998). Simulations done with imposed isotropic diffusivities have shown significantly higher cost functions, as displayed in Fig. 14.

### d. Accuracy of the inversion: Simulation example

In this subsection we examine the question of whether our approach provides an accurate solution in the setting of a synthetic inversion problem. We create a simple example with simulated data and known target parameters to assess the performance of the inversion method proposed in sections 2a and 2b.

We prespecify a “realistic” meridional velocity field *υ* and vertical diffusion coefficient field *κ*^{(z)}, representing the main targets of the inversion (see the first panels of Figs. 15 and 17, respectively). Each field is a random draw from its prior distribution described in section 2b. The horizontal diffusivities are assumed to be *κ*^{(x)} = 1000 m^{2} s^{−1} and *κ*^{(y)} = 500 m^{2} s^{−1}. For simplicity we only use a single tracer (oxygen) with constant boundary conditions on the upper and lower layers, and linear interpolation to the boundary of the middle layer. To generate the oxygen tracer data we numerically solve the forward problem (4), yielding the tracer concentration over the entire grid 𝒮_{G}, discard all but the values at observation sites, and add independent N(0, 1) measurement error. To compensate for only using a single tracer we have reduced the measurement error variance from 2.46 to 1.0. We use the same priors as before.

The results are displayed in Figs. 15–17. The posterior mean flow field (Fig. 15, top right panel) is in close agreement with its target (top left panel) in that the main features are reproduced (e.g., strong zonal flow in the northwest part of the domain). Note, however, that the posterior mean flow is considerably smoother than its target, which is to be expected because it does not reflect any variability. To get a sense of the variability in the inversion, it is helpful to inspect draws of the posterior flow field (Fig. 15, bottom panels); these also exhibit the same features as the target. Posterior histograms of the horizontal diffusion coefficients (Fig. 16) are quite tightly concentrated around the target values, with posterior means of 1031 m^{2} s^{−1} for *κ*^{(x)} and 508 m^{2} s^{−1} for *κ*^{(y)}. The posterior standard deviations are 95 and 69 m^{2} s^{−1}, respectively. These posterior distributions are approximately Gaussian and are significantly different from their exponential priors, showing that the data have had a large influence. The posterior mean of the vertical diffusivity *κ*^{(z)} (Fig. 17, middle panel) again reproduces the main features of its target (first panel), at least up to a degree of variability given by the posterior standard deviation (last panel).

Two other important factors are influencing the posterior distribution: the error variance and the number of data sites. There are 239 data locations (in both the real and artificial problems) on a 37 × 19 lattice. Our experiments have shown that we can reduce the number of data points by as much as 10% without significantly altering the resulting posterior mean flow and mixing values. It would require separate (time consuming) simulation runs to determine the influence of each data point individually. Both the error variance and the number of data locations are controlling the “power of the data,” by either strengthening or relaxing the likelihood. Correspondingly, varying these parameters will either reduce or enlarge the posterior variance, assuming the prior does not change. In the real problem, the number of data locations is determined by the grid size and the original hydrographic stations where data were collected. The error variance *σ*^{2}_{(j)} for each tracer is estimated by computing a local sample variance from tracer measurements in a small box around an observation site, and then taking a weighted average of these sample variances, with the weights being reciprocals of the local sample sizes. This is the procedure used in McKeague et al. (2005), resulting in the values given in Table 1 of that paper.
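The error variance estimate described above can be sketched as follows: a sample variance is computed in each box of measurements around an observation site, and the box variances are averaged with weights equal to the reciprocals of the box sample sizes. The function name, the list-of-lists input, and the rule of skipping boxes with fewer than two measurements are assumptions for this illustration.

```python
from statistics import variance

def pooled_error_variance(boxes):
    """Weighted average of local sample variances.

    boxes is a list of lists; each inner list holds the tracer
    measurements found in a small box around one observation site.
    Each box variance is weighted by the reciprocal of its sample size.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for values in boxes:
        n = len(values)
        if n < 2:
            continue  # a sample variance needs at least two values
        weight = 1.0 / n
        weighted_sum += weight * variance(values)
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no box contains two or more measurements")
    return weighted_sum / weight_total
```

For example, boxes with values (1, 2, 3) and (2, 4, 6, 8) have sample variances 1 and 20/3 and weights 1/3 and 1/4, giving a pooled value of 24/7.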

## 4. Thermal wind and a multilayer inversion

[…] *γ* = 27.63 and *γ* = 27.15 isopycnals. One of the deficiencies of our model is that these estimated velocities and tracer concentrations are not related in any way to each other or to the *γ* = 28.00 layer. A natural way to link these layers vertically would be to use thermal wind relations.
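For reference, the thermal wind relations in their standard Boussinesq form are written out below. The symbols *f* (Coriolis parameter), *g* (gravitational acceleration), and *ρ*_{0} (reference density) are notation assumed here, and *z* is taken positive upward; the layout may differ from the displayed equations of the original article.

$$
f\,\frac{\partial u}{\partial z} = \frac{g}{\rho_0}\,\frac{\partial \rho}{\partial y},
\qquad
f\,\frac{\partial \upsilon}{\partial z} = -\frac{g}{\rho_0}\,\frac{\partial \rho}{\partial x}.
$$

These relations determine only the vertical shear of the horizontal velocity from the horizontal density gradients, so a reference velocity on some level is still required to recover the flow itself.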

We note that since tracers are advected by the total velocity, our approach will estimate the total field, barotropic plus baroclinic. Using velocity estimates from the *γ* = 28.00 layer, one can integrate thermal wind equations vertically and obtain velocities across the entire water column. The lower panels in Figs. 18 and 19 present velocities obtained in this fashion on top of interpolated tracer concentration data. Neither of these approaches produces satisfactory results. The quasi-3D methodology lacks a vertical link between layers. Flow derived by integrating thermal wind relations is turbulent, includes eddies, and may not be consistent with observed tracer concentrations. We choose to combine the two methods in one unified model that uses thermal wind to connect layers and the advection–diffusion equation to link tracer concentration data to velocities and diffusion coefficients.

[…] *γ* = 28.00 and *γ* = 27.15. We discretize *S* using seven 37 × 19 grids ℒ_{G} stacked on top of one another, with each layer representing an isopycnal (*γ* = 28.00, 27.96, 27.89, 27.78, 27.63, 27.45, and 27.15, roughly corresponding to depths of 2000 to 500 m in steps of 250 m). Using hydrographic data, we compute the in situ density *ρ* and approximate its horizontal gradients ∂*ρ*/∂*x* and ∂*ρ*/∂*y*. With *k* = 1, . . . , 7 indexing the layers, let *G*^{(k)}_{u,obs} and *G*^{(k)}_{υ,obs} denote the vertical shears of *u* and *υ* on layer *k* implied by these density gradients through the thermal wind relations.

We view *G* ≡ [*G*^{(k)}_{u,obs}, *G*^{(k)}_{υ,obs}] as additional data, supplementing observed tracer concentrations. Together with the observational model described before, we also assume that for each layer *k* and each grid site *s* ∈ 𝒮_{G}, *G*^{(k)}_{u,obs}(*s*) and *G*^{(k)}_{υ,obs}(*s*) are independent Gaussian variables having means (∂*u*/∂*z*)^{(k)}(*s*) and (∂*υ*/∂*z*)^{(k)}(*s*), respectively, and constant variances *σ*^{2}_{u} and *σ*^{2}_{υ}. For every hydrographic station where two ship tracks intersect at a right angle, the zonal gradient of *ρ* will be estimated using data from one cruise while the meridional gradient will be estimated using data from the other cruise. As a result, the two gradients can be assumed to be statistically independent. This argument can be extended to every grid site *s* ∈ 𝒮_{G}.

Following the Bayesian approach, the likelihood function then becomes the product of the tracer-data likelihood, in which *i* = 1, . . . , *n*′_{D} now indexes a site on one of the five interior lattices where data are available, and the Gaussian densities of the thermal wind data. The parameters to be estimated are now *υ*, *w*, *κ*^{(x)}, *κ*^{(y)}, *κ*^{(z)}, and *C*_{∂𝒮}, which are again specified to be independent under the prior distribution. The prior distributions for *υ* and *w* in this case are constructed by stacking seven independent 2D GMRFs on top of each other; the distribution for *κ*^{(z)} is constructed similarly except that we use the absolute value of each GMRF. The zonal velocity *u* is then determined via Eq. (1), which we now write for every layer. Horizontal diffusivities are again assumed to have prior exponential distributions (with a mean of 1000 m^{2} s^{−1}), and for each tracer, the same prior (as in the quasi-3D case) is selected for values on the boundary of the inversion domain. We have also experimented with specifying a 2D prior distribution for the meridional velocity on the deepest layer and a 3D prior distribution for the vertical gradient ∂*υ*/∂*z*, which will consequently determine *υ* on every layer. Our simulations in this case have shown an unrealistic flow field and higher cost functions when compared to those resulting from the 3D prior described above.

Using Bayes’s formula, we derive the conditional distribution of Φ ≡ (*υ*, *w*, *κ*^{(x)}, *κ*^{(y)}, *κ*^{(z)}, *C*_{∂𝒮}), given tracer data and thermal wind data. The vertical gradients ∂*u*/∂*z* and ∂*υ*/∂*z* are approximated using centered differences for the interior layers and one-sided differences at the upper and lower boundaries. Markov chain Monte Carlo methods are again used to sample from this unnormalized distribution.
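The finite-difference shear and the Gaussian model for the thermal wind data can be illustrated with a short sketch. It assumes uniformly spaced layers (the text quotes steps of roughly 250 m) and arrays ordered as (layer, *y*, *x*); the function names are hypothetical and the log-likelihood is given only up to an additive constant.

```python
import numpy as np

def vertical_shear(field, dz):
    """Vertical derivative of a (layers, ny, nx) array with uniform spacing dz.

    np.gradient applies centered differences on interior layers and
    first-order one-sided differences on the first and last layers.
    """
    return np.gradient(field, dz, axis=0)

def thermal_wind_loglik(u, v, G_u_obs, G_v_obs, dz, sigma_u, sigma_v):
    """Gaussian log-likelihood of the thermal wind data, up to a constant.

    G_u_obs and G_v_obs are treated as independent Gaussian observations
    with means du/dz and dv/dz and constant variances sigma_u**2, sigma_v**2.
    """
    res_u = G_u_obs - vertical_shear(u, dz)
    res_v = G_v_obs - vertical_shear(v, dz)
    return float(-0.5 * np.sum(res_u ** 2) / sigma_u ** 2
                 - 0.5 * np.sum(res_v ** 2) / sigma_v ** 2)
```

In a multilayer sampler, a term of this kind would simply be added to the tracer log-likelihood, which is how the thermal wind constraint is imposed weakly and not exactly.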

For each neutral-density layer specified above, the posterior mean flow and oxygen concentration are displayed in Fig. 20. Several features of the 3D velocity field stand out: flow in the deep layers is mostly zonal, whereas meridional flow becomes significant in the shallower layers. For comparison, we also derive flow by integrating thermal wind relations with respect to *z* (Fig. 21), using quasi-3D results as reference values. Analyzing each layer individually, we observe that the horizontal structure of the two fields differs, especially in the central and northern regions of our domain. However, in every layer, the westward flow near 28°S is present in both fields. In addition, floats deployed to study circulation of the South Atlantic Antarctic Intermediate Water (Núñez-Riboni et al. 2005) show the same westward flow between 20° and 30°S as part of the northern branch of the South Atlantic subtropical gyre. A quantitative comparison shows that the flow derived from float data has a mean speed of 4.7 ± 3.3 cm s^{−1}, while that estimated from the inversion has a posterior mean speed of 3.3 cm s^{−1} and a posterior standard deviation of 0.6 cm s^{−1}.

Using data compiled from Núñez-Riboni et al. (2005), we converted the mean zonal transport across the South Atlantic Ocean into averaged mean velocities for the *γ* = 27.45 isopycnal and compared them to multilayer posterior mean zonally averaged *x* velocities (Fig. 22) to find consistent agreement between the two. A significant disagreement appears near 11°S where posterior mean flow is eastward and near 13°S where posterior mean flow is westward. Float sampling was inadequate in this region to observe this flow, and the comparison is invalid there. These are the latitudes of the South Equatorial Countercurrent (SECC) and the South Equatorial Current, and our results are consistent qualitatively with the two currents. The SECC is thought to feed a small gyre near the eastern boundary called the Angola Dome. This gyre is not evident in the estimated mean flow because it lies mainly above the upper boundary of our inversion domain at roughly 500-m depth.

Another interesting feature of the estimated flow field is that zonal jets extend vertically across several layers. Figure 23 displays the vertical structure of the zonal velocity *u* for the 20°W meridian. In the left panel, flow estimated using the multilayer model is used, whereas in the right panel velocities are determined by integrating thermal wind equations. In comparison to the thermal wind flow, we find that the multilayer estimate shows a stronger depth-dependent structure. We also notice smaller horizontal-scale zonal jets at lower latitudes versus broader jets farther south.

The eastward jet at 20°S that we find significant at 2000-m depth extends up to 1000-m depth, while the westward flow at 30°S is present across the entire water column. A relatively strong westward flow at 16°S present in the thermal wind flow does not appear in the multilayer estimate. The solution does not extend close enough to the equator to compare to the much smaller vertical scales found there by Treguier et al. (2003).

## 5. Concluding remarks

We develop a Bayesian approach to inverting tracer concentration measurements. Following the single-layer approach we add vertical structure, initially building a quasi-3D model and then extending this to a full multilayer setup. What sets this method apart from various other approaches is that the term “solution to the inverse problem” (which we define as the collection of velocities, diffusion coefficients, and tracer boundary values to be estimated) may be ambiguous in this view. Traditional inverse methods in data assimilation provide a “unique” or “best” solution, generally computed by minimizing some objective function. Measures of uncertainty in this solution are also provided. We obtain more than that. Consider the set of all possible solutions, given the data measurements and the physical constraints required by the dynamical model. Some of these solutions are more probable than others. Loosely speaking, “probable” solutions can be characterized by a realistic flow field resulting in concentration maps that are close to the observed ones. Our Bayesian approach provides us with the necessary tools to quantify “probable.” A full probability distribution is obtained to characterize the state space of all solutions to the inverse problem [i.e., the parameters *u*, *υ*, *w*, *κ*^{(x)}, *κ*^{(y)}, *κ*^{(z)}, and *C*_{∂𝒮}]. Hence, when inquiring about a particular variable, instead of providing a value, we are providing a distribution (including mean, variances, and so on), which is a balance between information in the data and information in the prior. There are three key ingredients in our approach: prior, dynamics, and data. All play an important role and it is impossible to uniquely determine the “source” of information. Our prior assumptions are generally mild. We only impose a certain degree of smoothness for the 2D fields and restrict diffusion parameters to positive values.
We have experimented (in a limited way) with different choices for the hyperparameters that specify the prior distribution and found only minor changes in the results. When comparing posterior distributions to the prior ones, we do notice that they are quite different (many velocities now have a posterior mean that is different from zero, and horizontal diffusion coefficients appear to be Gaussian rather than exponential). We conclude that the data have swamped the prior in terms of information. Markov chain Monte Carlo techniques make our approach computationally feasible, allowing us to sample this high-dimensional space. Summary statistics (sample averages, most likely values, and standard deviations) may be further used to select a representative solution and to quantify uncertainty. We are able to reconstruct tracer concentration maps by numerically solving the advection–diffusion equation for a selected set of inputs (coefficients and boundary values). All the computations were carried out on a dual Power Mac G5 workstation. All the results were based on five million cycles of the MCMC sampler as described in section 2b.
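As an illustration of the summary statistics mentioned above, the sketch below reduces the stored MCMC output for one scalar parameter to a posterior mean, a standard deviation, and a histogram-based estimate of the most likely value. The function name, the burn-in argument, and the choice of a histogram mode are assumptions made for this example only.

```python
import numpy as np

def summarize_chain(chain, burn_in, bins=50):
    """Posterior mean, standard deviation, and histogram mode of one parameter.

    chain: 1D sequence of sampled values for a single scalar parameter.
    burn_in: number of initial samples to discard before summarizing.
    """
    kept = np.asarray(chain, dtype=float)[burn_in:]
    counts, edges = np.histogram(kept, bins=bins)
    j = int(np.argmax(counts))
    # Midpoint of the most populated bin serves as a crude mode estimate.
    mode = 0.5 * (edges[j] + edges[j + 1])
    return {"mean": float(kept.mean()),
            "std": float(kept.std(ddof=1)),
            "mode": float(mode)}
```

Applied site by site to the sampled velocities, summaries of this kind yield maps of a posterior mean and a posterior standard deviation.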

We estimate abyssal horizontal flow that is dominated by alternating zonal jets consistent with observed tracer concentration maps and direct measurements of the flow at the depths of the North Atlantic Deep Water. Our results show intensified diapycnal mixing over the Mid-Atlantic Ridge (as observed by Polzin et al. 1997), which allows jets to extend over the entire domain; however, vertical diffusion driven directly by bottom topography results in a less realistic flow field. We have chosen a model configuration with a given resolution of about 1°, amounting to an operational definition of the separation of scales between subgrid-scale diffusion and mean advection. Moreover, by choosing *K* to be uniform over the inversion domain, we neglect its likely spatial variability, which can enter the tracer equation like advection (tracer attracted to high-energy high-*K* regions). We also make the choice to model tracer advection and diffusion with a mean flow and *K*. There is a missing nonlocal effect (not representable by *K*) due to the tracer being bodily translated by large eddies and deposited somewhere else. This might be represented by random eddy flux vectors but it is not implemented in the current configuration. Thus, there are several open questions regarding parameterization of tracer transport in the ocean (or atmosphere, or many other systems) that might be investigated. Our study provides the answer to a simple version of the problem, and a basis for further investigation.

We extend the approach and use thermal wind relations to construct a multilayer model. Integrating thermal wind equations with respect to *z* over the entire water column produces a flow field that contains eddies and is not necessarily consistent with observed tracer maps. We avoid these problems by imposing thermal wind weakly in a local fashion. As a consequence, the vertical structure of the flow departs from the one derived by integrating thermal wind relations, revealing a strongly depth-dependent flow field. Alternating zonal jets extending vertically over 1000 m or more are still the main feature of the flow field. We understand that we make a subjective choice to impose thermal wind relations weakly to provide the vertical structure of the flow. We do account for thermal wind error in this manner, but in our approach we assume that the depths of each isopycnal are known. A further stage in our analysis will relax this assumption, building thermal wind into the dynamical model and working with an updated density field.

This research was supported in part by NSF Grants ATM-0222244 and OCE-0336697.

## REFERENCES

Adams, J. 1991. Multigrid software for elliptic partial differential equations: MUDPACK. NCAR Tech. Note NCAR/TN-357+STR, 51 pp.

Armi, L., and D. B. Haidvogel. 1982. Effects of variable and anisotropic diffusivities in a steady-state diffusion model. *J. Phys. Oceanogr.* 12:785–794.

Feely, R. A., C. L. Sabine, R. Schlitzer, J. L. Bullister, S. Mecking, and D. Greeley. 2004. Oxygen utilization and organic carbon remineralization in the upper water column of the Pacific Ocean. *J. Oceanogr.* 60:45–52.

Green, P. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. *Biometrika* 82:711–732.

Hogg, N. G., and W. B. Owens. 1999. Direct measurement of the deep circulation within the Brazil Basin. *Deep-Sea Res. II* 46:335–353.

Huang, R. X., and X. Jin. 2002. Deep circulation in the South Atlantic induced by bottom-intensified mixing over the midocean ridge. *J. Phys. Oceanogr.* 32:1150–1164.

Liu, J. S. 2001. *Monte Carlo Strategies in Scientific Computing*. Springer, 343 pp.

Maximenko, N. A., G. Bang, and H. Sasaki. 2005. Observational evidence of alternating zonal jets in the world ocean. *Geophys. Res. Lett.* 32, L12607, doi:10.1029/2005GL022728.

McDougall, T. J. 1987. Neutral surfaces. *J. Phys. Oceanogr.* 17:1950–1964.

McDougall, T. J. 1991. Parameterizing mixing in inverse models. *Dynamics of Oceanic Internal Gravity Waves: Proc. ‘Aha Huliko‘a Hawaiian Winter Workshop,* Honolulu, HI, University of Hawaii at Manoa, 355–386.

McKeague, I. W., G. K. Nicholls, K. G. Speer, and R. Herbei. 2005. Statistical inversion of South Atlantic circulation in an abyssal neutral density layer. *J. Mar. Res.* 63:683–704.

Nakano, H., and N. Suginohara. 2002. A series of mid-depth zonal flows in the Pacific driven by winds. *J. Phys. Oceanogr.* 32:161–176.

Núñez-Riboni, I., O. Boebel, M. Ollitrault, Y. You, P. Richardson, and R. Davis. 2005. Lagrangian circulation of Antarctic Intermediate Water in the subtropical South Atlantic. *Deep-Sea Res. II* 52:545–564.

Polzin, K. L., J. M. Toole, J. R. Ledwell, and R. W. Schmitt. 1997. Spatial variability of turbulent mixing in the abyssal ocean. *Science* 276:93–96.

Richards, K. J., N. A. Maximenko, F. O. Bryan, and H. Sasaki. 2006. Zonal jets in the Pacific Ocean. *Geophys. Res. Lett.* 33, L03605, doi:10.1029/2005GL024645.

Sundermeyer, M. A., and J. F. Price. 1998. Lateral mixing and the North Atlantic Tracer Release Experiment: Observations and numerical simulations of Lagrangian particles and a passive tracer. *J. Geophys. Res.* 103 (C10):21481–21497.

Thurnherr, A. M., and K. G. Speer. 2004. Representativeness of meridional hydrographic sections in the western South Atlantic. *J. Mar. Res.* 62:37–65.

Tierney, L. 1994. Markov chains for exploring posterior distributions. *Ann. Stat.* 22:1701–1728.

Toole, J. M., K. L. Polzin, and R. W. Schmitt. 1994. Estimates of diapycnal mixing in the abyssal ocean. *Science* 264:1120–1123.

Treguier, A. M., N. G. Hogg, M. Maltrud, K. G. Speer, and V. Thierry. 2003. The origin of deep zonal flows in the Brazil Basin. *J. Phys. Oceanogr.* 33:580–599.

## APPENDIX: Bayesian Approach, MCMC Moves, and Acceptance Probabilities

The aim of our paper is to estimate Φ = (**u**, *K*, *C*_{∂𝒮}), given measurements of tracer concentration *C*. We are pursuing a Bayesian approach that consists of two major steps. The first one is the model specification, which has two components: a prior model for the parameters Φ and a model for the data *C* (conditional on Φ). The data model (or observation model) has the general form Data = *f*(Φ, error). Assumptions about the distributions of errors lead to the likelihood function *L*(*C*|Φ). The prior model describes all the (subjective) information the user has about the parameters (prior knowledge). Formally, this amounts to specifying the prior density *π*(Φ). Acquisition of information (the data) will modify the prior model into the posterior *π*(Φ|*C*), according to Bayes’s formula, Eq. (5). The posterior density is the proposed solution: a probability measure on the space of the parameters Φ. It allows us to quantify which sets of parameters are more or less likely and to compute a mean value, a variance, or other measures of uncertainty. The drawback is that *π*(Φ|*C*) is not normalized, and in most cases, such as the one presented in this paper, the normalizing constant is intractable. Thus, formally summarizing *π*(Φ|*C*) is impossible; however, using MCMC techniques, we can numerically draw samples from this distribution.
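In the notation of this appendix, Bayes's formula can be written as below. This is the generic textbook form, given as a reminder; it is not claimed to reproduce the exact layout of Eq. (5).

$$
\pi(\Phi \mid C) \;=\; \frac{L(C \mid \Phi)\,\pi(\Phi)}{\int L(C \mid \Phi')\,\pi(\Phi')\,d\Phi'} \;\propto\; L(C \mid \Phi)\,\pi(\Phi).
$$

The integral in the denominator is the intractable normalizing constant. MCMC methods require only the unnormalized product on the right, because acceptance probabilities depend on ratios of posterior densities in which the constant cancels.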

The second step is to construct a Markov chain whose stationary distribution is *π*(Φ|*C*). A very popular method to do so is a Metropolis random walk. The chain is started in a state Φ_{0}, selected arbitrarily. From the current state Φ_{j} the chain moves to Φ_{j+1} = Φ_{j} + *r* with probability

$$
\alpha(\Phi_{j+1} \mid \Phi_j) = \min\left\{1,\; \frac{\pi(\Phi_j + r \mid C)}{\pi(\Phi_j \mid C)}\right\}
$$

or stays put, Φ_{j+1} = Φ_{j}, with probability 1 − *α*(Φ_{j+1}|Φ_{j}). The sequence Φ_{j}, *j* = 0, 1, 2, . . . constitutes a realization of the Markov chain. Here *r* is a uniform (zero mean) variable, and *α*(Φ_{j+1}|Φ_{j}) is called the acceptance probability. Standard Markov chain theory (Tierney 1994) provides sufficient conditions for the random walk Metropolis algorithm to converge to the target distribution *π*(Φ|*C*). Once the chain has converged, which we determine by examining trace plots (as in Fig. 11), subsequent output is used to obtain inversion results.

In practice, however, several other types of moves (rather than simple random walk) are designed to improve the quality of the posterior sample. We describe all these moves formally as reversible jumps (Green 1995). From the current state Φ we propose to move to Φ′ = *F*(Φ, *r*), *F* being a nonrandom operator and *r* having a given density *q*. The new state Φ′ is accepted with probability

$$
\alpha(\Phi' \mid \Phi) = \min\left\{1,\; \frac{L(\mathrm{Data} \mid \Phi')\,\pi(\Phi')\,q(r')}{L(\mathrm{Data} \mid \Phi)\,\pi(\Phi)\,q(r)} \left|\frac{\partial(\Phi', r')}{\partial(\Phi, r)}\right|\right\},
$$

where *r*′ is the unique value such that Φ = *F*(Φ′, *r*′) and Data is a generic notation for tracer data *C* in the quasi-3D setup or tracer data combined with thermal wind data *G* for the multilayer setup. We combine classical random walk–type moves with several new scaling moves such as […]. For such moves, the Peclet number will vary strongly. These moves allow the Markov chain to jump quickly between states for which the advective–diffusive balances are different.
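The random walk Metropolis scheme described in this appendix can be sketched as follows. This is a toy illustration under stated assumptions, not the sampler used for the inversion: the function names, the fixed proposal half-width, and the Gaussian target `toy_log_post` are all hypothetical, and only the plain random walk move is shown (no scaling moves).

```python
import numpy as np

def random_walk_metropolis(log_post, phi0, half_width, n_steps, seed=0):
    """Random walk Metropolis with uniform, zero-mean increments.

    log_post returns the log of the unnormalized posterior density, so the
    intractable normalizing constant is never needed.
    """
    rng = np.random.default_rng(seed)
    phi = np.atleast_1d(np.asarray(phi0, dtype=float))
    current = log_post(phi)
    chain = np.empty((n_steps, phi.size))
    accepted = 0
    for j in range(n_steps):
        r = rng.uniform(-half_width, half_width, size=phi.size)
        proposal = phi + r
        candidate = log_post(proposal)
        # Accept with probability min(1, pi(proposal) / pi(current)).
        if np.log(rng.uniform()) < candidate - current:
            phi, current = proposal, candidate
            accepted += 1
        chain[j] = phi  # a rejected proposal repeats the current state
    return chain, accepted / n_steps

def toy_log_post(phi):
    # Hypothetical target: unnormalized Gaussian with mean 2 and unit variance.
    return float(-0.5 * np.sum((phi - 2.0) ** 2))
```

For example, `chain, rate = random_walk_metropolis(toy_log_post, [0.0], 1.5, 20000)` returns the sampled states and the fraction of accepted proposals; the early part of `chain` would be discarded after inspecting a trace plot.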

Values of the tuning parameters *δ*_{1} and *δ*_{2} that appear in the prior distributions of *υ*, *κ*^{(z)}, and *C*^{(j)}_{∂𝒮}.