## Abstract

A hierarchy of inhomogeneous, nonstationary stochastic models of material transport is formulated, and its properties are described. The transport models from the hierarchy sequence provide progressively more skillful simulations of the subgrid-scale transport by mesoscale eddies, which are typically not resolved in coarse-grid representations of the ocean circulation. The stochastic transport models yield random motion of individual passive particles, and the probability density function of the particle population can be interpreted as the concentration of a passive tracer.

Performance of the models is evaluated by (a) estimating their parameters from Eulerian and Lagrangian statistics of a fluid-dynamic reference solution, (b) solving for the transport, and (c) comparing the stochastic and fluid-dynamic transports. The reference solution represents midlatitude oceanic gyres, and it is found by solving steadily forced, quasigeostrophic equations of motion at large Reynolds number. The gyres are characterized by abundant coherent structures, such as swift, meandering currents, strong vortices, eddies, and planetary waves. The common, nondiffusive spreading of material (i.e., single-particle dispersion that is a nonlinear function of time) is induced by all these structures on intermediate times and by inhomogeneity and lateral boundaries on longer times. The higher-order members of the hierarchy are developed specially for simulating nondiffusive transports by turbulence in the presence of organized fluid patterns.

The simplest, but least skillful, member of the hierarchy is the commonly used diffusion model. In terms of the random particle motion, the diffusion is equivalent to the random walk (Markov-0) process for particle positions. The higher-order members of the hierarchy are the Markov-1 (a.k.a. Langevin or random acceleration), Markov-2, and Markov-3 models, which are jointly Markovian for particle position and its time derivatives. Each model in the hierarchy incorporates all features of the models below it. The Markov-1 model simulates short-time ballistic behavior associated with exponentially decaying Lagrangian velocity correlations, but on large times it is overly dispersive because it does not account for trapping of material by the coherent structures. The Markov-2 model brings in the capability to simulate intermediate-time, subdiffusive (slow) spreading associated with such trappings and with both decaying and oscillating Lagrangian velocity correlations. The Markov-3 model is also capable of simulating intermediate-time, superdiffusive (fast) spreading associated with sustained particle drifts combined with the trapping phenomenon and with the related asymmetry of the decaying and oscillating Lagrangian velocity correlations.

## 1. Introduction

The ocean circulation consists of time-mean currents and transient variability on a very broad range of scales. The latter may be divided into low-frequency, large-scale variability patterns; a variety of coherent mesoscale structures, such as jets, intense vortices, eddies, and planetary (Rossby) waves; and submesoscale fluctuations. All these components transport (i.e., spread and mix) the fluid and its material properties in a complex way. There are two strategic pathways toward understanding the transport processes. First, their rates have to be measured in the ocean. For meso- and larger scales this is accomplished by releasing surface drifters and neutrally buoyant floats and tracking their trajectories and by measuring the distributions of various chemical tracers. Second, simple mathematical models with clear physics have to be developed for simulating the observed transport. Such a transport model is a necessary part of coarse-resolution, ocean circulation simulations as a parameterization of transport by unresolved eddies. This paper focuses on the stochastic transport models that are capable of simulating turbulent transport in the presence of organized fluid patterns.

In the introduction we pose the problem and describe the background. In section 2 we provide a general formalism for the stochastic transport model hierarchy. In section 3 we summarize the transport properties of a fluid-dynamic, eddy-resolving ocean circulation, which is used as the reference solution for evaluating the stochastic transport models. In section 4 we formulate and evaluate the first-order Markov-1 model, and the higher-order Markov-2 and -3 models are similarly analyzed in sections 5 and 6, respectively. Conclusions and discussion follow in section 7.

### a. Statement of the problem

We view our results as part of a long-term strategy aimed at solving the oceanic mesoscale transport problem. The strategy is a sequence of goals that must be achieved in turn:

to create skillful transport models and to examine their performance by comparing them to ocean observations and to eddy-resolving computer simulations;

to implement these models in ocean general circulation models (OGCMs) as efficient subgrid-scale parameterizations;

to find closures that relate the transport model parameters simply to the coarse-grid dynamic fields, which are explicitly resolved in OGCMs.

This paper is a modest contribution to (1), and its purpose is to formulate and test a hierarchy of simple, physically consistent and mathematically rigorous, stochastic transport models capable of simulating some transport properties of the oceanic general circulation, in particular, and of other turbulent flows in the presence of coherent dynamic patterns, in general. The stochastic transport models use a few gross characteristics of the turbulent circulation as internal parameters, and we show that they are not difficult to implement and solve.

A commonly used transport model for representing and parameterizing the passive-tracer, mesoscale, eddy-induced transport in coarse-grid OGCMs is diffusion (a.k.a turbulent eddy diffusion; Taylor 1921). This model represents the large-time asymptotic behavior of single-particle dispersion in homogeneous and stationary turbulence in an unbounded domain. The corresponding evolution of the tracer concentration, *c*(*t, ***x**), is governed by the classical advection-diffusion equation

where **u** is a large-scale advective velocity, and K is the diffusivity tensor coefficient. In the ocean and atmosphere, at the spatial scales with the greatest energy, the magnitude of K is larger by many orders of magnitude than the molecular diffusivity. The widespread use of (1) is due to its simplicity, elegance, and capability of simulating transport characterized by single-particle dispersion that is a linear function of time. Equation (1) is based on the assumption of rapid Lagrangian velocity decorrelation (i.e., a rapid memory loss following Lagrangian particles). Although this assumption is true for many physical systems, such as simple molecular motion, it is often not true for oceanic mesoscale eddies.

We focus on midlatitude oceanic gyres, such as in the North Pacific and North and South Atlantic, but the presented transport theory can be used for other aspects of the ocean circulation and for a broad variety of turbulent flows in the presence of organized patterns. Performance of the stochastic transport models is tested against the fluid-dynamic transport phenomenology from Berloff et al. (2002, hereafter BMB). We focus on the most turbulent circulation from BMB (computed on 7.5-km horizontal grid), and refer to it as the fluid-dynamic solution. The transport phenomenology has the following physical assumptions (also see the end of section 7).

The motion of Lagrangian particles is strictly two-dimensional and nondivergent within each isopycnal fluid layer, consistent with the quasigeostrophic (QG) approximation.

^{1}The mesoscale eddies resolved in the fluid-dynamic model dominate the tracer transport so that no explicit representation is required for transport contributed by submesoscale fluctuations [Armenio et al. (1999) argue that this contribution is small].

The tracer evolution is adiabatic (i.e., without sources and sinks) and dynamically passive (i.e., the tracer is a marker that does not influence the fluid motion).

^{2}The hierarchy of stochastic transport models is based on random Markov processes, which govern the motion of Lagrangian particles (e.g., Gardiner 1983). The motion consists of the time-mean and random components. The locally averaged concentration of particles [the probability density function (PDF) of particle positions] is proportional to a passive tracer concentration. Each model satisfies the physical, well-mixedness constraint in inhomogeneous, nonstationary situations and has its parameters estimatable from statistics of Lagrangian float trajectories. The stochastic Lagrangian particles are passive [i.e., they satisfy (iii)], and we use the transport models in accordance with (1) and (2). The models have the following mathematical assumptions (also see the end of section 7).

Model parameters are deterministic functions varying in space and time (the simplest situation is when each parameter is a universal constant).

Spatial correlations of Lagrangian velocities are completely neglected, hence the particles are not correlated with each other (i.e., they are individual).

Random forcing is Gaussian (a.k.a. normal).

The simplest member of the hierarchy, equivalent to the advection-diffusion process (1), is referred to as the Markov-0 model (a.k.a. the random walk process), and it yields random displacements of particle positions:

where *d***W** is the random increment vector defined below.^{3} The underlying assumption is that observed velocities very rapidly decorrelate in time and may be represented as completely random (as a consequence, the eddies are assumed to have infinitesimal spatial and temporal correlation scales). This model is so classical and well studied [see Griffa (1996) for the oceanographic context], that in this paper we do not consider it explicitly. The higher members of the hierarchy are the Markov-1 (our notation for the random acceleration model), Markov-2, and Markov-3 models, each of which, in a homogeneous, stationary situation without lateral boundaries, is an autoregressive process of the corresponding order for Lagrangian velocity fluctuations. Each of these models has nontrivial velocity correlations that allow simulation of more complex transport processes than diffusion. Each model completely incorporates all features of the models below it in the hierarchy. The Markov-1, Markov-2, and Markov-3 models are jointly Markovian for (**x**, **u**′), (**x**, **u**′, **g**), and (**x**, **u**′, **g**, **p**), respectively; where **x**, **u**′, **g**, and **p** are position, velocity, acceleration, and hyperacceleration, respectively. We do not examine even higher-order Markov models because we have not identified an important transport behavior that requires us to do so. In choosing a particular model from the hierarchy, it is important to follow principles of parsimony—using the fewest parameters necessary for simulating a given phenomenon—and computational efficiency.

### b. Background

Ocean measurements with Lagrangian floats (Freeland et al. 1975; Krauss and Boning 1987; Rupolo et al. 1996), theoretical arguments (Davis 1987; Griffa 1996), and solutions of idealized ocean circulation models (Figueroa and Olson 1994; Figueroa 1994; Bower and Lozier 1994) lead to the conclusion that the diffusion process is often not an accurate model for real oceanic transport by mesoscale eddies. This failure is caused by the presence of coherent structures, which induce Lagrangian velocity correlations over substantial time, and it is further aggravated by flow inhomogeneity, nonstationarity, and intermittency. Nevertheless, K is often estimated from Lagrangian float trajectories, either from the linear fit of the single-particle dispersion at large time, or as K = I*σT*_{L} [where *σ* is fluctuation velocity variance and *T*_{L} is the integral timescale of the Lagrangian velocity autocorrelation function, *R*(*τ*), of the time lag, *τ*]. Both approaches rely on the asymptotic, large-time information [in many parts of the ocean *R*(*τ*) decays slowly], and in the inhomogeneous, nonstationary situation that introduces the substantial nonlocality of the K estimate; hence, its usefulness becomes problematic.

Diffusion (i.e., a random walk) has the underlying assumptions that the velocity fluctuations have a Gaussian distribution and that eddies have infinitesimal scales. Most theories rely on the first assumption, although velocity fluctuations in idealized kinematic flows, 2D turbulence, and oceanic measurements often have strongly non-Gaussian statistical distributions (del Castillo-Negrete 1998; Bracco et al. 2000). Some progress in relaxing this assumption is achieved in stochastic transport models of the 1D atmospheric convective boundary layer (Luhar et al. 1996; Maurizi and Lorenzani 2002), and 2D homogeneous and isotropic turbulence (Pasquero et al. 2001). An approach for relaxing the second assumption is to use a generalized advection-diffusion model that relates the transport to the time-lagged, concentration gradient (Davis 1987). Another approach is to use stochastic models for the particle position and its time derivatives (e.g., Griffa 1996). Transport models based on the use of stochastic equations for the motion of individual particles have an extensive history [see the reviews by Pope (1994) and Rodean (1996)], and the ideas are developed far beyond the random walk process, which is just the simplest stochastic model. Each stochastic particle model with its boundary conditions has an equivalent description of the phase-space PDF in terms of a partial differential equation called the forward Kolmogorov (or Fokker–Planck) equation (Risken 1989). For example, (1) is the Fokker–Planck equation for (2). The stochastic transport models are either straightforwardly integrated in time (Dutkiewicz et al. 1993; Zambianchi and Griffa 1994) or they are used “inversely” for extracting information about certain flow statistics (Griffa et al. 1995).

A hierarchy of Markovian stochastic transport models contains the random walk model on its lowest level. Based on measured Lagrangian velocity spectra, Griffa (1996) argues that the next model after the random walk in the hierarchy—the Langevin or random acceleration model—can be accurate of simulating transport in the upper, but not the deep, ocean. In an application of the random acceleration model, Dutkiewicz et al. (1993) show that small-scale turbulence substantially enhances and modifies the material transport across a weakly meandering, kinematic (i.e., with an ad hoc velocity field rather than a dynamically derived one) zonal jet. As a result the random acceleration model may overestimate the cross-jet transport when the meandering has a period shorter than the velocity correlation time (Cencini et al. 1999). This suggests that for skillful transport modeling one sometimes has to go not only beyond the random walk but also beyond the random acceleration model.

The next model in the hierarchy beyond random acceleration is jointly Markovian for the position, velocity, and acceleration. In this model (in addition to *T*_{L}, which is usually related to the most energetic fluctuations) there is another, shorter than *T*_{L}, timescale. Sawford (1991) argues that this model improves the simulation of homogeneous, isotropic 3D turbulent transport at large Re, where the new timescale is related to the Kolmogorov scale. Although Sawford's model is capable of simulating many features of direct numerical solutions (Yeung and Pope 1989), it remains to be seen whether this model is relevant to oceanic transport induced by mesoscale coherent structures.^{4}

Most geophysical applications of stochastic transport models make simplifying assumptions of homogeneity and stationarity, although these are often not true. An exception is made for vertical inhomogeneity in the atmospheric boundary layer (Rodean 1996), where the constraint of well-mixedness must be satisfied to avoid a variety of unphysical behaviors (Thomson 1987). The well-mixedness constraint—a uniform distribution of particles or tracer concentration remains uniform under evolution—ensures the correctness of the small-time behavior of the velocity distribution of particles from a localized source, the compatibility of the stochastic model with Eulerian equations of motion, and the compatibility of forward and backward in time formulations of the model.

Since Lagrangian observations in the ocean are scarce, dynamic, eddy-resolving simulations of Lagrangian trajectories provide a useful alternative. In particular, simulations may be used for (a) testing the dynamic models; (b) testing transport models; (c) estimating transport parameters, such as eddy diffusivity (Rhines and Schopp 1991); and (d) estimating statistical requirements and optimizing sampling strategies for Lagrangian float measurements. The double-gyre dynamic model is a popular paradigm for the midlatitude ocean circulation, but it is rarely looked at from the transport point of view. Figueroa and Olson (1994) estimate K(*x, **y*) in a double-gyre, primitive-equation model, but the use of an advection-diffusion transport model with this K(*x, **y*) on a coarse grid does not simulate well the actual tracer evolution by the resolved eddies (Figueroa 1994). This suggests that the transport induced by mesoscale oceanic eddies requires a better parameterization than the diffusion model.

BMB analyze the transport in double-gyre, flat-bottom, QG midlatitude circulation at several values of Re. It is shown that tracer transport by mesoscale eddies occurs much differently than in the commonly used model of homogeneous and isotropic eddy diffusion. In most of the basin, and especially in the deep layers, subdiffusive (slow), single-particle dispersion occurs due to long-time trapping of material by coherent structures such as vortices near the strong currents and planetary waves in the eastern part of the gyres. Superdiffusive (fast), single-particle dispersion behavior is found in the western part of the subtropical gyre and in fluctuating jetlike flows near the boundaries. Sub- and superdiffusion are associated with a strong first negative and second positive lobe, respectively, in local *R*(*τ*). Also, persistent transport barriers (e.g., across the eastward jet of the subpolar gyre) are identified and studied, and the meridional material fluxes are measured. These results are used here for testing the skills of the stochastic model hierarchy.

## 2. General view on Markov transport models

In this section we present the general ideas and formalism of the stochastic transport model hierarchy. The specific forms of Markov-1, -2, and -3 models are in sections 4, 5, and 6, respectively. The central idea is the following. Transport of a turbulent flow regime is simulated with a set of stochastic differential equations (SDEs) that govern random motions of individual particles transported by both the time-mean and fluctuating currents. A set of SDEs, together with internal parameters, boundary, and initial conditions, and a time integration rule, constitute a stochastic transport model. No such model can simulate all aspects of the dynamic fluid motion, but the goal is to simulate only certain statistical Lagrangian properties of the flow regime (e.g., single-particle dispersion). The transport model parameters are to be statistically estimated from Eulerian (i.e., at a given location) and/or Lagrangian (i.e., from float trajectories) observations. The idea here is to make the parameter estimate as local, both in space and time, as possible.

We consider stochastic models, which are Markovian (i.e., with a random component that is independent of any previous time and state). The general form of a model from the hierarchy is

where (**f**_{1}, **f**_{2}, … , **f**_{n}, … , **f**_{N}) is the state vector of the system, and each element of it, **f**_{n}, is a 2D physical-space vector (*i* = 1, 2 and *j* = 1, 2 are spatial coordinate indices), and summation is implied over a repeated index. The variables **f**_{n} correspond to **x**, **u**, **u̇**, etc. The **Δ**_{n} are deterministic functions; *d***W**(*t*) is a random increment vector; and *b*_{ij} is the tensor amplitude multiplying the random increment vector. Random forcing enters only the last equation in the sequence (i.e., the one with the highest time derivative of the particle position) and all the lower time derivatives and the position itself are obtained by integration in time. The probability that the system is in a certain state is given by the corresponding conditional (i.e., in Lagrangian phase space) PDF, *P*_{L}(0, **x** | *t, ***x**), and each model has an equivalent Fokker–Planck equation for the phase-space evolution of *P*_{L} (Risken 1989). The tracer concentration is the projection of *P*_{L} obtained by integrating over all the state variables except **x**.

We define *W*_{j}(*t*) as an independent incremental Wiener process in each coordinate direction:

This process is a continuous but nondifferentiable integral of a zero-mean, discontinuous, stationary, Gaussian, white noise process, *ξ*(*t*). Its variance is equal to *dt.* The relationship (4) is interpreted as *dW*(*t*) = *ξ*(*t*)*dt* in (3).

Boundary conditions are required for (3) in a bounded domain, such as an oceanic basin. The most common choices are that either particles do not reach the boundary because the velocity variance vanishes there or that particles reach the boundary and then are either reflected or absorbed (Durbin 1983). For an oceanic application the physically correct choice of the boundary condition is uncertain beyond the integral constraint of the tracer conservation. For simplicity pro tem, we choose perfect reflection: the normal component of the velocity, *u*^{(n)}, and all its time derivatives, *u̇*^{(n)}, *ü*^{(n)}, … , change signs when a particle hits the boundary. A more realistic alternative might be a reflection coefficient that smoothly decays away from the boundary. Initial conditions are required by (3) for the **f**_{n}, and, except for position, they are chosen randomly from the corresponding normal distributions (appendixes B, C, and D).

For integrating (3) in time, we use an Ito (rather than Stratonovich) calculus, both of which yield the same result in homogeneous, stationary situation but otherwise require different formulations of the **Δ**_{n} (Rodean 1996). For simulating transport in the oceanic gyres, we use a 3-h time step, and the results are insensitive to further reductions of *dt.* The time-mean advection term in **Δ**_{1} is integrated with a fourth-order Runge–Kutta scheme. The parameter coefficients in **Δ**_{n} and *b*_{ij} are calculated only once on a uniform 30-km grid,^{5} and at each time step they are interpolated bicubically to the instantaneous particle position. The accuracy is such that in the absence of fluctuations particles closely follow time-mean streamlines for several gyre circuits.

Solutions of (3) are characterized by the single-particle dispersion tensor,

where the overline indicates an ensemble average over many realizations generated by (3) with the same initial position, **x**(0). In the absence of boundaries and in the large-time limit, all the homogeneous and stationary Markov models have linearly growing *D*(*t*) (i.e., they are asymptotically diffusive) but they have distinctive behaviors at finite time. In both oceanic float observations (e.g., Krauss and Boning 1987; Sundermeyer and Price 1998) and numerical solutions (BMB), *D*(*t*) evolves in a complicated, nonlinear way. Complex *D*(*t*) may be described by intermediate-time power laws,

along principal directions of the velocity variance (BMB). This relation implicitly assumes *local homogeneity* approximation, that is, during the time period of interest the spreading length scale, *L*_{Dii}(*t*) = *D*_{ii}(*t*), remains less than the inhomogeneity scale, *L*_{Ii}, over which the Lagrangian statistical properties vary^{6} in the *i*th coordinate direction, including proximity to a boundary (*L*_{Ii} ∼ *x*_{i} as *x*_{i} → 0). This assumption is most doubtful where *L*_{I} is relatively small and velocity fluctuations are large (e.g., *L*_{I} near the western boundary current). Other measures that characterize (3) are the autocorrelation function,

of Lagrangian velocity time series, the corresponding power spectrum, and evolving particle PDF from localized particle releases.

In a homogeneous, stationary situation and in an unbounded domain, each Markov model from the hierarchy corresponds to a linear, stationary autoregressive process (AP) for the velocity fluctuation (i.e., **f**_{2}) in the limit *dt* → 0 (Box et al. 1994). By transforming each Markov model to the corresponding linear AP, some of its properties become clear and parameter estimating procedures become easier. We provide some basic information on AP in appendix A.

## 3. Fluid-dynamic solution

The fluid-dynamic Lagrangian behavior is analyzed extensively in BMB, and here we briefly summarize some of the results needed for evaluating performance of the Markov transport models. The midlatitude ocean is represented by QG dynamics at large Re in a flat-bottom, square basin with size *L* = 3840 km and a no-slip velocity condition on lateral walls. The circulation is driven by asymmetric zonal wind stress. The ocean is discretized vertically in three isopycnal layers with depths *H*_{i}, and the presentation is focused on only the upper and middle layers because the Lagrangian properties in the bottom and middle layers are qualitatively similar. After the initial spinup process, the turbulent-equilibrium solution is computed for 10^{4} days and stored for analysis. As the primary evaluation criteria for the transport models, we use the fluid-dynamic, large-scale PDFs of localized tracer releases and the meridional, Lagrangian, time-mean intergyre fluxes.^{7} The localized release is initialized in the subtropical western boundary current (WBC), but other release sites lead to similar conclusions.

The time-mean, upper-ocean velocity streamfunction has an asymmetric, double-gyre structure with two WBCs and their associated eastward jet extensions and recirculation zones (Figs. 1a,c). The asymmetry of the gyres is due to asymmetry of the wind forcing. The intergyre boundary between the time-mean gyres is located slightly to the south of the subpolar eastward jet. The eastward jets are separated from each other by a moderately active, in terms of both mean and fluctuating currents, zone. The time-mean flow in the deep ocean is weak except near the subtropical WBC and its eastward jet extension, where it is predominantly anticyclonic. The fluctuations are characterized by the variance tensors for the velocity,

acceleration,

and hyperacceleration,

where components of the hyperacceleration vector are defined as

and operator *f* is used as an infinite-time average so that all the stochastic models are evaluated in the stationary regime. In the homogeneous and isotropic situation, each of (8), (9), and (10) corresponds to single value. The physical meaning of *p*_{i} becomes simple in the Markov-3 model: it is the linear combination of the rate of change of the acceleration and the average velocity fluctuation in its direction. At all depths the fluctuations are intense near the main currents with typical velocity variances of 0.5 in the upper and 0.2 m s^{−1} in the deep ocean (Figs. 1b,d). The flow is substantially anisotropic away from the main currents: *σ*_{22}/*σ*_{11} ≈ 2 in the interior of the basin, and near the lateral boundaries (except the western one) the anisotropy is even more pronounced. The *α*_{ii}(*x, **y, **z*) strongly varies over the basin, and we refer to a location as subdiffusive if *α*_{ii}(*x, **y*) < 0.8, superdiffusive if *α*_{ii}(*x, **y*) > 1.2, and approximately diffusive if 0.8 < *α*_{ii}(*x, **y*) < 1.2.

Ratios of the variance tensors (8)–(10) yield two fundamental parameters of the problem: the first kinematic timescale is found from the relationship

and the second kinematic timescale is found from

These parameters enter the Markov-2 and -3 models. The kinematic timescales, *T*^{(1)}(**x**) and *T*^{(2)}(**x**), are straightforwardly estimated from either Eulerian or Lagrangian (float) measurements, hence their estimates are not constrained by the local homogeneity approximation (section 2). For simplicity pro tem, we neglect the nondiagonal elements of all the tensors that appear in the stochastic models.^{8} The first kinematic time is typically larger in zonal direction and in the upper ocean, and it has smaller values near and along the boundaries (Fig. 2). Away from the main currents, typical values of *T*^{(1)} are 7–10 days in the upper, and 11–13 days in the deep ocean; they are 3–7 days in the subtropical WBC and its eastward extension at all depths. The second kinematic time is about 2 days in the subtropical WBC and its eastward extension, and it is 1–2 weeks away from the swift currents; the differences between the upper and deep oceans are small away from the swift currents; otherwise *T*^{(2)} is larger in the deep ocean (Fig. 3).

Distributions of particles released in the subtropical WBC are shown in Figs. 4 and 5. In the upper ocean the particles spread throughout the subtropical gyre over about 1000 days, and the subpolar gyre remains only weakly invaded over that time because the subpolar eastward jet is a strong barrier to the transport (BMB). In the eastern part of the basin, the particles remain concentrated in a propagating PDF front associated with local subdiffusive behavior. This behavior is more evident in the deep ocean where the time-mean flow is weak in the interior of the basin, and the PDF propagation into the eastern basin is very slow.

Across the intergyre boundary, by its definition as the time-mean streamline running from one boundary to the other, there is no time-average Eulerian flux of material, but there is time-average Lagrangian flux. In the *i*th layer, the total, time-average, Lagrangian intergyre flux is

where *L* is size of the basin; *V*_{i} = *L*^{2}*H*_{i}/*N* is the fluid volume corresponding to each of *N* particles; *N*^{(n,s)}_{i} (*t, **x*) is the probability density of first-time, intergyre boundary crossing; and the superscripts indicate whether the crossing is in the northward or southward direction. Because of integral mass conservation: *F*^{(n)}(*t*) = *F*^{(s)}(*t*), but *N*^{(n)}(*t, **x*) and *N*^{(s)}(*t, **x*) are generally not the same (Fig. 13a), indicating that locally nonzero net Lagrangian fluxes occur between the gyres. The fluid-dynamic *F*_{i}(*t*)^{(n)} is listed in Tables 1, 2, and 3, and the *N*^{(n,s)}_{i}(*t, **x*) is plotted in Fig. 13a. Initial and later positions of crossing particles are another measure of the large-scale transport (Fig. 6) used together with the particle PDFs and fluxes for evaluating the stochastic transport models.

## 4. The Markov-1 Model

### a. Formulation and properties

The Markov-1 (i.e., Langevin or random acceleration) model is the most commonly used stochastic transport model after diffusion. It is derived in appendix B for an inhomogeneous and nonstationary situation in 2D. The governing equations are

where the first rhs term in the second equation represents a fading memory for velocity fluctuations; *θ*^{(1)}_{ij}(**x**) is the (nonsingular) Markov-1 fading-memory time tensor; the drift correction term (for well-mixedness constraint) is

and the random forcing amplitude is defined by

The drift correction term is zero when the turbulence is both stationary and homogeneous (i.e., *σ*_{ij} is a constant tensor). These criteria do not determine *ã*_{i} uniquely (appendix B). This is a weakness of the present stochastic transport theories (Rodean 1996). Our choice of *ã*_{i} is the same as that of Thomson (1987), and it is shown to perform better than an alternative one in a uniform shear flow (Sawford and Yeung 2000). In general, however, this remains an important open issue.

In a homogeneous, stationary situation without boundaries, the elements of *θ*^{(1)}_{ij} are found by integrating *R*_{ij}(*τ*) over *τ*:

In this case *θ*^{(1)}_{ij} is called the Lagrangian integral time, *T*_{L}. Yet beyond the simplest situation, this relationship is not valid. In a general situation, given the assumption of local homogeneity (section 2), the elements of *θ*_{ij} are to be found locally along the principal directions of *σ*_{ij} (in this coordinate system both tensors are diagonal). As explained in section 3, we ignore the off-diagonal terms and use the common geographical coordinates (which in our fluid-dynamic solutions are approximately aligned with the boundary currents, eastward jets, and the gradient of the Coriolis frequency). We define the components of diagonalized *θ*_{ij}(*x, **y*) as the time *τ* at which the corresponding components of a locally stationary *R*_{ij}(*τ, **x, **y*) approach the first zero (BMB).

The mathematical formalism of autoregressive processes is used in the following way (N.B., appendix A). For each coordinate component we rewrite the first equation in (15) as

where *δ* ≡ *dt.* In a homogeneous and stationary situation without boundaries, the associated AP-1 in (A1) has a coefficient

where from (A2):

It follows from (20) and *θ*^{(1)} > 0 that (21) is satisfied for small *δ* and *ϕ*_{1} > 0. From (A3) and (20), the AP-1 autocorrelation function satisfies

After expanding in a Taylor series for *δ* → 0, we find

hence the fundamental Markov-1 parameter, *θ*^{(1)}, can be interpreted as the exponential timescale of the exponentially decaying correlation memory. By (A7), the AP-1 frequency power spectrum is

With (20) and in the limit *δ* → 0, we obtain

for the spectral power density normalized by the condition:

Thus, the AP-1 spectrum decays as ∼*f*^{−2} at high frequencies, and it is flat at low frequencies.

Lateral boundaries change the system by the reflection condition (section 2), which introduces random sign changes in *u*′(*t*). An otherwise homogeneous, stationary, linear random process ceases to be autoregressive. Instead of (A1) the governing equation becomes

where *Z*(*t*) = +1 unless the implied *x*(*t*) reaches the boundary, in which case *Z*(*t*) = −1. From the velocity definition:

and therefore

By integrating over time we obtain

Since in a bounded domain *D*(*t*) reaches a finite global maximum at *t* = ∞, (31) yields two kinematic constraints for *R*(*τ*):

From (32) it follows that the Lagrangian integral time, *T*_{L}, is zero in a bounded domain and that *R*(*τ*) must decay faster than ∼*τ*^{−2} at large *τ* for convergence of the integral.

We illustrate the effect of boundaries on Markov-1 *R*(*τ*), *P̃*(*f*), and *D*(*t*) by calculating the trajectories of 500 particles, randomly placed in a square basin of width *L,* for a time much longer than *θ*^{(1)}. Ballistic length scale is defined as

and the relative importance of boundaries is determined by parameter

For *σ* = 0.4 m s^{−1}, *θ*^{(1)} = 10^{6} s ≈ 10 days, and *L* = 4 × 10^{6} m, the value of *α* is 10. The numbers roughly correspond to the upper ocean away from the boundaries. In the deep ocean *σ* is smaller, hence *α* is larger. Near the boundary or in a marginal basin, such as the Mediterranean or Caribbean seas, *L* is smaller, and so *α* is smaller (e.g., for *L* = 0.4 × 10^{6} m: *α* = 1). Figure 7 shows *R*(*τ*) and *P̃*(*f*) for several values of *α* in the relevant range, as well as for an unbounded domain (i.e., with *α* = ∞). Here *R*(*τ*) has negative lobe due to (32), and the amplitude of the lobe increases as *α* decreases. The negative lobe implies that the power spectrum has a maximum at an intermediate frequency and is shifted from low to high frequencies compared to an unbounded domain; these differences increase as *α* decreases. The power maximum shifts toward higher frequencies, and the negative autocorrelation lobe shifts toward shorter time lags as the boundary influence increases. Even for *α* as large as 10, the spectral slope (25) at high frequencies is about −1.7 compared to the asymptotic value, −2. The unbounded Markov-1 model has ballistic and diffusive asymptotic single-particle dispersion regimes (Taylor 1921),

### b. Application to oceanic gyres

The Markov-1 model accounts for the short-time ballistic spreading, and, therefore, it is a better alternative to the diffusion model, although the overall improvement is not so drastic as in the Markov-2 model. In agreement with both the local homogeneity assumption and the implied exponential decay of *R*(*τ*), *θ*^{(1)}_{ii} is estimated as the time at which *R*_{ii}(*τ*) decays to 0.3. It is approximately isotropic and its magnitude is about 10 days away from the swift currents and 3–5 days otherwise (Fig. 9).

From the localized particle releases, we find that the Markov-1 model is biased toward excessive spreading. In the fluid-dynamic solution (Fig. 4), the particle PDF has a persistent propagating front, but in the Markov-1 simulation the front is broader and less pronounced, and it completely disappears by 2000 days (Fig. 10). The discrepancies are even larger in the deep layers (Fig. 11). Another evident weakness of the Markov-1 model is its failure to simulate the intergyre transport barrier: after 1000 days, the Markov-1 simulation yields about 20% of the particles in the subpolar gyre, as opposed to the fluid dynamic of 5%. In the deep ocean the analogous simulation also yields excessive spreading with the corresponding leakages of 9% and 1%, respectively. The Markov-1 intergyre fluxes, *F*_{i}(*t*), are three to four times larger than the fluid-dynamic ones in the upper ocean, and they are five times larger in the deep ocean (Table 1). To assess the consequences of uncertainty in estimating *θ*^{(1)}, we vary it by ±20% and find that at all depths the intergyre flux varies proportionally to *θ* and by approximately the same amount, and the large-scale spreading pattern does not change qualitatively.

Although the Markov-1 model is capable of simulating short-time ballistic spreading, it is not capable of simulating near-circular or oscillatory motion of particles in coherent structures such as eddies, intense vortices, jet meanders, and planetary waves (BMB). Such motions yield decaying and oscillating *R*(*τ*) [unlike in (23)] and associated with them intermediate-time, sub- and superdiffusive single-particle dispersion behaviors. The missing physics of coherent structures is the main reason for the excessive spreading and overestimation of intergyre fluxes in the Markov-1 model. The Markov-1 deficiencies are substantially corrected in the Markov-2 and -3 models (sections 5 and 6), and further improvements are straightforward (section 7).

The Markov-1 model yields *N*^{(n,s)}_{i}(*t, **x*), which is smaller than the fluid-dynamic *N*^{(n,s)}_{i} in the subpolar eastward jet and larger in the eastern basin (Figs. 13a,b). Also, the fluid-dynamic intergyre flux has its maximum closer to the western boundary than any of the Markov models, perhaps due to the local inaccuracy introduced by the local homogeneity assumption and the deficiency of the present lateral boundary condition of perfect particle reflection. In the Markov-1 model (Fig. 12), many of the crossing particles originate and end up farther away from both the intergyre boundary and WBCs than in the fluid-dynamic solution (Fig. 6). Again this indicates excessive spreading process.

## 5. The Markov-2 model

### a. Formulation and properties

The most important property of the Markov-2 model (derived in appendix C) is its capability of simulating intermediate-time, subdiffusive behavior of *D*(*t*). In addition to the continuous velocity, the Markov-2 model yields a continuous acceleration, **u̇**. Variable **g** is called pseudoacceleration, and it is only equal to **u̇** in homogeneous and stationary situations (i.e., when **ã** = 0). The Markov-2 governing equations are

where *θ*^{(2)}_{ij} is the Markov-2 fading-memory time tensor; the first drift-correction term, *ã*_{i}, is defined in (16); and the second drift-correction term is

The random forcing amplitude is defined by

In a homogeneous and stationary situation without boundaries, (36) can be written as

for each spatial direction (dropping primes), where

For a second-order, finite-difference stencil with time step *δ* and centered at *t* − *δ, *(39) is equivalent to an AP-2 with coefficients

The inverse of (41) is

The sufficient conditions for AP-2 to be stationary (appendix A) are

A severer condition,

restricts the AP-2 so that *u*(*t*) does not change sign at each time step. Roots of the characteristic equation (A2) are real when

The parameters of AP-2 can be interpreted as two fundamental timescales of the Markov-2 model,

which are the Markov-2 fading-memory and the first kinematic (12) timescales. The former describes monotonic decay of correlation memory, and the latter describes the average circular motion of stochastic particles that occurs when acceleration and velocity vectors are not aligned with each other. In Figs. 14 and 15 we show how Markov-2 properties vary with the ratio of the timescales,

equal to 1, 10, and 100. [Note that *β*^{(1)} → 0 recovers the Markov-1 behavior, and even *β*^{(1)} = 1 in Fig. 14b shows a similar *D*(*t*) to that in Fig. 8.] From (A7) the AP-2 spectrum is

It has a maximum at an intermediate frequency, *f*_{max}, and it is ∼*f*^{−4} at frequencies higher than 1/*T*^{(1)} (Fig. 14a). From (A5) with *p* = 2, the AP-2 velocity autocorrelation function (A3) has its first two coefficients satisfying

The AP-2 *R*(*τ*) oscillates on one timescale *T*^{(1)}, and the envelope of oscillation decays on the timescale *θ*^{(2)} (Fig. 15). Therefore, for *β*^{(1)} ≥ 1, the technique of measuring *θ*^{(2)} as a timescale over which *R*(*τ*) approaches zero (a common practice and a safe one for Markov-1 model), instead provides an estimate of *T*^{(1)} in Markov-2. The intermediate-time, Markov-2 *D*(*t*) is subdiffusive when *β*^{(1)} is large (Fig. 14b), and that is qualitatively more like the fluid-dynamic *D*(*t*) in most parts of the gyres rather than in the Markov-1 model (Fig. 8). Superdiffusive, single-particle dispersion, common in the gyres, is related to the asymmetry of *R*(*τ*) and is therefore precluded in the Markov-2 model.

For the Markov-2 model in a bounded domain, the kinematic constraints (32) are valid. So the growth in *D*(*t*) slows down and saturates at late time; the spectral power in *P*(*f*) shifts away from low frequencies, and its slope becomes shallower at very high frequencies, with *P*(*f*) ∼ *f*^{−2} as a result of increased velocity randomness—both as in the bounded-domain Markov-1 model—and *P*(*f*) develops additional peaks around 2^{n}*f*_{max}, *n* ≥ 1.

### b. Application to oceanic gyres

The Markov-2 model is the simplest member of the hierarchy that accounts for the intermediate-time, subdiffusive spreading (a common property of the fluid-dynamic solution) due to the presence of coherent structures. Given *A*_{i}(*x, **y*) calculated with (40) from the observed local variances, we use (42) to determine the local relationship between *ϕ*_{1} and *ϕ*_{2} as

Using this we calculate *D*(*t*) for AP-2 and determine its intermediate-time (from 20 to 200 days) slope, *α,* as a function of *ϕ*_{1}. We match these to fluid-dynamic values of *α*_{ii}(*x, **y, **z*) (BMB) to determine *ϕ*_{1}(*x, **y, **z*) and *ϕ*_{2}(*x, **y, **z*) for each coordinate component. We then calculate *θ*^{(2)}_{ii}(*x, **y, **z*) from (42) and (46); hence *β*^{(1)}_{i}(*x, **y*) is found from (47). The *β*^{(1)}_{i}(*x, **y*) (Fig. 16) is larger than unity almost everywhere, except near and along those boundaries where it is very small because *θ*^{(2)} is small. In the upper-ocean eastern basin, *β*^{(1)}_{i} varies from 1 to 30, and it is typically larger than in the deep ocean. The limit *β*^{(1)} ≪ 1 (or *T*^{(1)} ≫ *θ*^{(2)}) corresponds to degeneration of the Markov-2 to -1 model, and

Hence from (20) it follows that

The globally averaged *P*(*f*) and *R*(*τ*) calculated from the Markov-2 simulation of the transport in gyres are qualitatively similar to those from the Markov-3 model (Fig. 26), because of the dominant contribution from the subdiffusive regions where the Markov-2 model performs relatively well. The intergyre fluxes (Table 2) and evolving particle PDFs (Fig. 17) from localized releases suggest that the Markov-2 simulation is much closer to the fluid-dynamic solution than the Markov-1, mainly because it does not strongly overestimate the spreading rates. The deep-ocean intergyre fluxes are three to four times smaller than in the Markov-1 simulation (Tables 1 and 2) and much closer to the fluid-dynamic solution; the upper-ocean fluxes also improve. The large-scale particle distributions are similarly improved, but more so in the deep ocean, where super-diffusive regions (not simulated by the Markov-2 model) are not abundant. The same conclusion is made from comparing the origins and destinations of the particles participating in the intergyre transport.

## 6. The Markov-3 model

### a. Formulation and properties

The Markov-3 model is the most general and powerful transport model considered here. In addition to retaining all capabilities of the Markov-2 model, it also can simulate intermediate-time, superdiffusive spreading process. The derivation of the Markov-3 model for an inhomogeneous and nonstationary situation in 2D is in appendix D. The governing equations are

where *ã*_{i} and *c̃*_{i} are defined in (16) and (37), and the third drift-correction term is

The random forcing amplitude is defined by

and *θ*^{(3)}_{ij} is the Markov-3 fading-memory time tensor.

In a homogeneous and stationary situation without boundaries, and for each coordinate direction, (53) may be written as

where the primes and tensor notations are dropped for convenience, *A* is given by (40a), and the other coefficients are

The expression for *C* from (57) is rearranged so that it involves *ü*^{2} rather than *p*^{2}: since

then

The system (56) can be reduced to the single equation for velocity fluctuation:

We define second-order, finite-difference stencils with time step *δ* and centered at (*u*_{k−1} + *u*_{k−2})/2, for the terms in (60),

and obtain an AP-3 with the coefficients

The inverse of (62) is

For stationarity of AP-3, the roots of (A2) are constrained by the conditions,

The roots may be written as

and it follows that they are located outside the unit circle if

Another condition is that the real parts of the roots are positive,

therefore, *u*_{k} does not change sign at each time step. From (66) we find that

The AP-3 parameter subspace defined by (66)–(69) has a wedgelike shape (Fig. 18), and as *ϕ*_{3} → 0, it approaches the analogous AP-2 parameter subspace (section 5a). The *A* is related to *T*^{(1)} by (12), and the parameters *B* and *C* correspond to the fundamental timescales,

The former is the Markov-3 fading-memory time, and the latter is the second kinematic time (13), which describes the average rate of deviations from the average circular motion of stochastic particles (e.g., when particles circulate inside a drifting eddy). The nondimensional parameters are

According to (68):

and it follows from (63) and (70) that this condition is violated if

This is a stability criterion that limits the time step; it does not arise in the lower-order Markov models.

Given *A* and *C* through *T*^{(1)} and *T*^{(2)} in (46) and (70), the problem of estimating the Markov-3 parameters from the fluid-dynamic solution becomes that of finding *B* to fit the intermediate-time, single-particle dispersion power law (6). The relationship between *ϕ*_{1} and *ϕ*_{3} is obtained from (63):

We introduce a variable transformation by

where *L̃* is a shift of *ϕ*_{1}, and find that

The power law exponent, *α,* is fitted over the time interval from 20 to 200 days on the line (76), which gives *ϕ̃*_{1} and *ϕ*_{3}. Then *ϕ*_{1}, *ϕ*_{2}, and *B* are given by (75) and (63). As an illustration of the parameter estimating algorithm, we show *α*(*ϕ̃*_{1}, *ϕ*_{3}) with constant *A,* such that *T*^{(1)} ≈ 15.5 days (Fig. 19).

The AP-3 power spectrum from (A7) is

The *P̃*(*f*) is ∼*f*^{−6} at very high frequencies, and it is flat at the low-frequency end (Fig. 20). At intermediate frequencies *P̃*(*f*) has one maximum and, unlike AP-2, one minimum. The minimum becomes less pronounced than the maximum and eventually disappears when *β*^{(2)} → 0. The AP-3 power spectrum at low and intermediate frequencies approaches AP-2 (Fig. 14a) when *T*^{(1)} and *T*^{(2)} are large compared to *θ*^{(3)}. The AP-3 *R*(*τ*) is given by (A3) with

obtained by substituting *p* = 3 in (A5). The degree of asymmetry in *R*(*τ*) increases with decreasing *β*^{(21)}, and the period of oscillation increases with *β*^{(2)} (Figs. 20 and 21). If *β*^{(21)} < 1, the single-particle dispersion behavior (Fig. 22) is approximately ballistic over a much longer time interval than *θ*^{(3)}, and variations of *T*^{(1)} have only a weak effect on the value of *α.* With *β*^{(21)} > 1, the single-particle dispersion behavior is subdiffusive over an intermediate time interval, and *α* is sensitive to *T*^{(1)}. The *R*(*τ*) in superdiffusive regions of the gyres (BMB) has pronounced oscillations and a moderate degree of asymmetry between positive and negative lobes, suggesting that *T*^{(1)} ≈ *T*^{(2)} ≤ *θ*^{(3)}. The influences of domain boundaries are similar in the Markov-2 and -3 models (section 5a).

According to (62), the situation when *T*^{(2)} ∼ *T*^{(1)} ≫ *θ*^{(3)} corresponds to *ϕ*_{3} ≪ 1, and AP-3 degenerates to AP-2 (section 5a). If we consider the limit

then (63) implies that

and *B*^{(3)} is proportional to *δ.* If *ϕ*_{1} and *ϕ*_{2} are considered as belonging to the AP-2, then from (41) as *δ* → 0,

### b. Application to oceanic gyres

In addition to all transport properties simulated by the models below in the hierarchy, the Markov-3 model is capable of simulating intermediate-time, superdiffusive spreading (a common property of the fluid-dynamic solution) associated with persistent propagations of coherent structures combined with the circular or oscillatory motion of particles. In most of the basin, *β*^{(2)}_{i}(*x, **y, **z*) is small (Fig. 23), and even more so in the deep layers; thus, the degeneration of the Markov-3 to the Markov-2 model (section 6a) is common. Large values of *θ*^{(3)} (1–2 days, typically) correspond to superdiffusive regions in the central part of the subtropical gyre and near and along the lateral boundaries. At all depths the Markov-3 particle PDFs from the localized releases show overall improvement toward the fluid-dynamic PDFs (cf. Figs. 4 and 24) from the Markov-1 and -2 simulations (sections 4b and 5b): the propagating PDF front in the subtropical gyre retains a sharper gradient for a longer time; the rate of particle penetration into the subpolar gyre decreases; and after crossing into the subpolar gyre, the particle spreads more eastward, due to the time-mean flow, rather than northward due to the fluctuations. In the deep ocean, the Markov-3 and -2 particle PDFs are qualitatively similar (Fig. 17), so the former is not shown here. The less dispersive character of the Markov-3 model is evident in its intergyre fluxes (Table 3), which are much closer to the fluid-dynamic fluxes than in any lower-order model (Tables 1 and 2). The upper-ocean, intergyre boundary is not strongly superdiffusive in meridional direction (BMB), but neither do its *β*^{(2)} values indicate degeneration to a lower-order model (Figs. 23a,b), hence the Markov-3 improvements are not limited to the pronounced superdiffusive regions. The upper-ocean distributions of initial and later positions of particles participating in intergyre transport (Fig. 25) and the corresponding *N*^{(n,s)}(*t, **x*) (Fig. 13b) are also improved (see also sections 3, 4b, and 5b). The *N*^{(n,s)}(*t, **x*) in each Markov model underestimates the northward intergyre flux in the ∼200-km-wide zone near the western boundary. This might be due to a failure of the local homogeneity assumption or a deficiency of the simple boundary condition (section 2). In general, the Markov-3 over -2 improvements are more substantial in the upper ocean where the superdiffusive behavior is more pronounced.

The Markov-3 and fluid-dynamic globally averaged frequency spectra, 〈*P*(*f*)〉, are each calculated from 400 randomly initialized trajectories (Fig. 26). The spectral power at low frequencies is due to circular particle motion around the time-mean gyres. At intermediate frequencies (periods of 30–40 days), there is a distinct maximum of the Markov-3 〈*P*(*f*)〉 that is more pronounced in the meridional rather than zonal component (only their sum is shown in Fig. 26). These features are in qualitative agreement with the fluid-dynamic solution (Fig. 26), but the Markov-3 〈*P*(*f*)〉 maxima are much sharper. (A similar remark can be made about the relatively larger oscillations in the globally averaged, Markov-3 velocity autocorrelation function, 〈*R*(*τ*)〉.) There are additional peaks at 2^{n }*f*_{max} due to perfect reflections from the boundaries (sections 2 and 5a); these are absent in the fluid-dynamic spectra, indicating a more complex near-boundary behavior. At high frequencies (up to a period of five days), the Markov-3 spectra have an exponent close to -4, indicating the dominant contribution of subdiffusive regions that is also found in Markov-2 〈*P*(*f*)〉 (section 5). The fluid-dynamic power spectrum is less steep in the upper ocean, though it steepens near the analysis limit in *f*; in the deep ocean it has an exponent close to -4, thus confirming the Markov-3 result. The Markov-3 model differs from the fluid-dynamic solution by generating fluctuations in a narrower range of intermediate timescales, hence it simulates more oscillatory *R*(*τ*) and *D*(*t*) (the oscillations are weakened by spatial averaging over different regions). This type of bias seems to us inherent in the Markov model hierarchy and eliminating it requires a generalization of the form of the stochasticity (e.g., by introducing realistic probability distributions instead of fixed values of the fundamental parameters).

## 7. Conclusions and discussion

The immediate goal of this paper is to formulate and test a hierarchy of stochastic transport models for simulating material transport in the presence of coherent structures. In the present application the hierarchy simulates eddy-induced, passive-tracer transport in oceanic gyres. As a long-term strategy, the models from the hierarchy can be implemented in coarse-grid GCMs as parameterizations of transport induced by unresolved eddies (i.e., they are potential replacements for the less skillful but commonly used eddy-diffusion model). Each stochastic model has its partial differential equivalent (Fokker–Planck equation), but beyond the zeroth hierarchical level it is, in general, technically easier to simulate the transport with the stochastic formulation. In this case the tracer density is obtained locally by simple coarse-graining procedure applied to the particle population. An alternative, less radical, and less accurate approach, is to use the large-time asymptotic behaviors of the stochastic models with locally estimated (from observations) parameters for calculating the associated diffusivity coefficients. Such diffusivity estimates would account, to a certain degree, for the transport effects introduced by coherent structures.

Performance of the hierarchy is tested against the fluid-dynamic transport in an eddy-resolving, oceanic-gyre solution at large Re (BMB). This implies that parameters of the hierarchy are estimated from the observed statistical properties of the turbulent solution. The fluid-dynamic transport is dominated by mesoscale coherent structures, such as swift meandering currents, intense vortices, eddies, and planetary waves. The transport is characterized by the spreading process that is ballistic on short times and predominantly sub- or superdiffusive on intermediate times (up to few months). On longer times this causes persistent transport barriers between different parts of the gyres and, in particular, a strong barrier to meridional transport between the gyres.

The transport models are stochastic Markov processes governing random motion of individual Lagrangian particles. The simplest member of the hierarchy is the random-walk (Markov-0) model, which is equivalent to the commonly used diffusion model. It is based on the idea of instantaneous velocity decorrelations and therefore is not capable of simulating nondiffusive transport associated with coherent structures. We focus on higher-order models (Markov-1, -2, and -3) with the stochastic forcing acting on successively higher time derivatives of particle positions. These models simulate some nondiffusive transport behaviors on short and intermediate times. For each model we derive correction terms that guarantee important physical constraint of well-mixedness, and that generalizes them for inhomogeneous and nonstationary situations as in the ocean.

The hierarchy is formulated very generally with 2D tensor coefficients, but for initial simplicity it is evaluated assuming tensor diagonality and stationarity but retaining the inhomogeneity. In addition to the time-mean (or coarse-grid) velocity field, assumed to be known, each subsequent level of the hierarchy introduces one additional parameter, which in a homogeneous, stationary, and isotropic situation reduces to a single value. The model parameters are estimated locally from dynamical properties of the turbulence, such as distributions of fluctuation velocity and its time derivatives. In addition to the fluctuation velocity variance, which in the Markov-1 and higher-order models may be roughly considered as the scaling factor, the fundamental parameters are the fading-memory time describing monotonic loss of correlation memory and the kinematic times. So far, unlike the kinematic times estimated directly from the Eulerian observations, the fading-memory time is estimated specifically for each transport model and from the Lagrangian particle (float) data constrained by the local homogeneity assumption. This does not mean, of course, that the fading-memory time can not be estimated from the Eulerian observations (e.g., as an envelope decay of the Eulerian velocity autocorrelation function), rather it means that here we implement a shortcut that is to be reconsidered when the models are further improved. Also, to reduce the information required for estimating the parameters, one has to look for a closure that relates them to the time-mean (or coarse-grid) currents (analogous to a nonlinear eddy diffusivity).

The Markov-1 model is capable of simulating short-time ballistic behavior associated with exponentially decaying *R*(*τ*). With parameters estimated from the fluid-dynamic solution, it is overly dispersive at all depths, and it overestimates the intergyre (meridional) transport by factor of 5. The fundamental reason for this failure is the inability of the model to simulate intermediate-time nondiffusive behaviors associated with coherent structures. The Markov-2 model is capable of simulating intermediate-time, subdiffusive spreading processes associated with oscillatory *R*(*τ*) (i.e., with circular or oscillatory motion of Lagrangian particles in coherent structures); therefore, it shows a strong improvement over the Markov-1 model, especially in the deep layers. The first kinematic time contains the information about subdiffusive behavior; it describes the average circular motion of Lagrangian particles (floats) that occurs when acceleration and velocity vectors are not aligned with each other. In addition to all capabilities of the models below it in the hierarchy, the Markov-3 model simulates the intermediate-time, superdiffusive behavior associated with decaying, oscillatory, and asymmetric *R*(*τ*); that is, it simulates transport with sustained correlations superimposed on near-circular behavior, like in drifting eddies. The second kinematic time contains the information about superdiffusive behavior; it describes the average rate of change from the average circular motion of Lagrangian particles. The Markov-3 shows substantial improvement from the Markov-2 model, which is not limited to the regions with pronounced superdiffusive behavior.

The results presented here clearly demonstrate that higher-order stochastic transport models can represent the tracer distributions in ocean gyres much better than commonly used diffusion model. However, we see many issues that need further investigation before implementing the transport models in coarse-grid GCMs. The physical assumptions (section 1a) of the two-dimensionality and tracer adiabaticity can be straightforwardly relaxed by extending the stochastic formalism to the third dimension and by including sinks and sources of the tracer. The assumption of the dynamical passiveness is the most fundamental and difficult to relax, and substantially more progress is needed before we know whether ideas presented here can be applied to dynamically active tracers. Some other issues that arise from shortcuts we have taken are the following: including nondiagonal terms in parameter tensors; the neglect of large-scale, low-frequency variability (i.e., nonstationarity); the nonuniqueness of the drift corrections; and use of adaptive time intervals for estimating the fading-memory parameter from intermediate-time, single-particle dispersion behavior in the higher-order models. Other potentially important stochastic modeling issues are more formidable to address: the use of Gaussian noise in the stochastic models, which limits their intermittency; the excessive peakedness of velocity spectra; the realistic boundary condition; the realistic probability distributions instead of fixed values of the fundamental parameters; simulation of two-particle dispersion, which describes the mixing rather than spreading process (Batchelor 1952; Sawford 2001), and introduces the associated spatial correlations between neighboring particles; optimal measurement strategy for estimating the parameters; and, finally, computational feasibility. We must leave these issues for the future.

## Acknowledgments

We acknowledge I. Kamenkovich for many interesting discussions. Funding for this research was provided by NSF Grant OCE 96-33681 and by Office of Naval Research Grant N00014-98-1-0165.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

### APPENDIX A

#### Some Properties of Autoregressive Process

The general form of AP-*p* is

where *p* is the order, *ϕ*_{n} are constants, and velocities are found at discrete successive times. The AP-p is stationary when the roots of the characteristic equation for *B,*

are outside the unit circle.

Important diagnostic quantities of an AP are its variance, *σ*_{u′}, autocorrelation function, and power spectrum. In the discrete formulation, the autocorrelation function (7) at time lag *kδ* is

and it is found as a linear combination,

of the inverse roots of (A2). The coefficients *A*_{j} are found with the normalization condition *ρ*_{0} = 1 and the Yule–Walker equation,

where *P*_{ij} = *ρ*_{i−j}, and *ρ *_{−k} = *ρ*_{k}. (The Yule–Walker equation is obtained by successively substituting *k* = 1, 2, … , *p* into (A3).) The AP random forcing variance is

The AP frequency power spectrum is

where the frequency range is 0 ≤ *f* ≤ 1/2.

### APPENDIX B

#### Derivation of the Markov-1 Model

Deriving stochastic equations governing motion of passive particles, we follow Thomson (1987) and Rodean (1996). For 2D, Gaussian fluctuations, the Eulerian, unconditional PDF of the velocity fluctuation, **u**′, is

where the determinant of a matrix is indicated by surrounding vertical lines, and the velocity variance tensor (8) depends on *t* and **x**. The inverse velocity variance tensor is defined away from the lateral boundaries (where *σ*_{ij} = 0 due to the no-slip condition) as

and (B1) is rewritten as

The “well-mixed state” of the tracer is the situation when the conditional PDF of the Lagrangian tracer, *P*_{L}[*t, ***x**, **u**′ | **x**(0), **u**′(0)], is proportional to *P*_{E}(*t, ***x**, **u**′). Physically, this means that once the tracer is uniformly distributed over the domain it stays that way. The “well-mixed condition” is a criterion that constrains the choice of the governing SDE. If the criterion is satisfied, then the well-mixed state is one of the possible solutions of the SDE.

The Fokker–Planck equation corresponding to the first-order Markov model is

and it governs the evolution of the probability density. The boundary condition corresponding to the perfect reflection of particles is ∂*P*(*t, **x*_{n})/∂*x*_{n} = 0, where *x*_{n} is direction transverse to the boundary. In the stationary situation, ∂*P*/∂*t* = 0, and when the tracer is well mixed, *P* ∼ *P*_{E} ∼ *P*_{L}. The goal is to find *a*_{i} and *b*_{ij} such that (B4) is satisfied when *P* = *P*_{E}; that is, the well-mixed state is a solution. Thus,

where

and *ϕ*_{i} → 0 as |**u**′| → ∞. We define the stochastic forcing amplitude as

where *θ*_{ij} is the nonsingular, first-order, fading-memory tensor. We differentiate (B3) and obtain

Given (B5) and (B10), we find that

where the first term is the fading memory of Lagrangian particles and the second term is the drift correction needed to guarantee the well-mixed condition. Using the fact that the flow is incompressible (∂*u*_{i}/∂*x*_{i} = 0), we rewrite (B6) as

and substitute expressions from (B8) and (B9) to obtain

Next, we assume that the drift term is a quadratic function of **u**′,

From (B14) and (B10) it follows that

The terms of the same order of **u**′ in (B13) and (B15) are equated to obtain the following equations for *α*_{i}, *β*_{ij}, and *γ*_{ijk}:

These equations yield multiple solutions for *α*_{i}, *β*_{ij}, and *γ*_{ijk}. Equation (B19) gives two solutions for *γ*_{ijk}:

and the third solution, *γ*^{(3)}_{ijk} = −(*σ*_{im}/2) ∂*λ*_{km}/∂*x*_{j}, is equivalent to *γ*^{(2)}_{ijk}. The single solution for *β*_{ij} that satisfies (B16) and (B18) is

In (B17) we change the summation indices (i.e., in the lhs, *m* → *j* and *k* → *j,* and in the rhs *i* → *j*) to obtain

Thus, with (B20) and (B21) there are two solutions for *α*_{i}:

We use (8), the fact that *σ*_{ij} = *σ*_{ji}, and the relationships

to obtain

In the end, we have two solutions, and any linear combination of them is also acceptable (Reynolds 1998). Other physical and mathematical conditions, in addition to the well-mixed one, are needed to discriminate between the alternatives, but such conditions are not known. With the assumption that the distinctions among the alternative solutions may not be large^{9}, we choose *α*^{(2)}_{i} and *γ*^{(2)}_{ijk}. The corresponding drift correction term becomes

After collecting (B11) and (B30), and changing from *λ*_{ij} to *σ*_{ij}, we find

and, finally, with the additional assumption of stationarity, we obtain the Markov-1 model, (15) and (16).

### APPENDIX C

#### Derivation of the Markov-2 Model

The Eulerian, unconditional PDF of **g** and **u**′ is

where the velocity and pseudoacceleration variance tensors, defined in (8) and (9), depend on *t* and **x**. The inverse pseudoacceleration variance tensor is

and (C1) is rewritten as

The Fokker–Planck equation corresponding to the second-order Markov model is

We define

The constraint that the well-mixed state is a solution of the Markov-2 model is

The stochastic forcing amplitude is defined as

where the fading-memory tensor, *θ*_{ij}, is introduced analogously with appendix B.

We differentiate (C3) and find the following relationships:

Given (C6) and (C12), it follows that

We rewrite (C6) as

substitute (C9)–(C12), and obtain

We assume that the first drift correction, *ã*_{i} = *ϕ*_{i}/*P*_{E}, is given by (B14) and the second drift correction is a linear function of **g**,

Using (C9)–(C12), we obtain

We equate terms of the same order of **p**, **g**, and **u**′ in (C15), (C17), and (C18) in such a way that the results of appendix B are recovered, and there are additional equations:

The only solution of (C19)–(C21) is

The first drift correction term remains as in the Markov-1 model, and the second drift correction term is

With the additional assumption of stationarity, we obtain (36) and (37).

In a homogeneous, stationary, and isotropic situation, Sawford (1991) considers the fading-memory time in the form of

where *τ*_{2} = *σμ**τ*^{−1}_{1}. As a result,

has no oscillations, and the *τ*-scales are related to the Lagrangian integral time by equation

A relationship used for finding *τ*_{1} and *τ*_{2} is the quadratic behavior of *R*(*τ*) at small *τ*:

In the case of a strong scale separation (i.e., *τ*_{2} ≪ *τ*_{1}): *τ*_{1} ≈ *T*_{L}, and *τ*_{2} may be interpreted as the Lagrangian acceleration correlation time related to the Kolmogorov timescale (Sawford 1991; Pope 1994). In oceanic gyres, *R*(*τ*) is rarely nonoscillatory, and where it is, the *τ*_{i} are not very different; therefore, this interpretation is not appropriate in gyres.

### APPENDIX D

#### Derivation of the Markov-3 Model

The Eulerian, unconditional PDF of **p**, **g**, and **u**′ is

where the velocity, pseudoacceleration, and pseudo-hyper-acceleration variance tensors are defined in (8), (9), and (10) and depend on *t* and **x**. The inverse pseudo-hyper-acceleration variance tensor is

and (D1) is rewritten as

The Fokker–Planck equation corresponding to the third-order Markov model is

We define

The well-mixed constraint yields

The stochastic forcing amplitude is defined as

where *θ*_{ij} is the corresponding fading-memory time tensor.

We differentiate (D3) and find

Given (D7) and (D14), it follows that

We rewrite (D8) as

substitute the relationships (D10)–(D14), and obtain

We assume that the third drift correction is a linear function of **p**,

and the other drift correction terms are given by (B14) and (C24). From (D10)–(D14) it follows that

We equate terms of the same order of **p**, **g**, and **u**′ in (D17) and (D19)–(D21) in such a way that the results of appendixes B and C are recovered, and

The only solution of (D22)–(D24) is

The corresponding drift correction term becomes

With the assumption of stationarity, we obtain (53), (54), and (55).

## Footnotes

*Corresponding author address:* Pavel S. Berloff, Institute of Geophysics and Planetary Physics, University of California, Los Angeles, Los Angeles, CA 90095-1567. Email: pavel@atmos.ucla.edu.

^{1}

The QG dynamic model has important advantages, in comparison with more dynamically complete OGCMs, because solutions can be calculated for the long time needed for reliable transport statistics and/or with the fine spatial resolution needed for achieving turbulent regimes with large Re [see Siegel et al. (2001) for even larger Re than analyzed here]. The latter is important for resolving the spectrum of mesoscale fluctuations contributing to transport.

^{2}

Many tracers, such as industrial pollution and phytoplankton, evolve by both passive physical transport and reactive chemical and biological mechanisms. Even dynamically active tracers—heat, salt, and potential vorticity—are sometimes transported approximately passively. In OGCMs the common practice is to represent the subgrid-scale transport of all tracers as if they are passive and to apply active dynamics only for resolved-scale fields. Stochastic transport models are usable in the same conceptual framework.

^{4}

We show that recirculating particle motion, as in mesoscale eddies, can be simulated by Sawford's model, but with different parameter choices.

^{6}

There is analogous statement about the nonstationarity timescale.

^{7}

In general global and single-basin GCMs are intended to simulate the large-scale spreading process and meridional fluxes more than the mesoscale properties of transport; therefore, we emphasize the former, but the goal is to create skillful transport models that operate equally well on all scales.

^{8}

This simplification cannot be justified by the smallness of these elements (BMB). However, when we compare the Markov-1 model solutions with and without these elements, we find that they make a hardly detectable difference in the particle PDFs from localized releases (N.B., Figs. 10–11), and the resulting changes in the intergyre flux are only about 2% in the upper and less than 1% in the deep ocean (N.B., Table 1). Since these are the primary evaluation criteria for the transport models, and the inclusion of nondiagonal elements substantially complicates the parameter estimating procedures for the higher-order models, we make this simplification for the initial study, recognizing that the issue is to be addressed in the future.

^{9}

Experiments with *α*^{(1)}_{i} and *γ*^{(1)}_{ijk} show less than 1% difference in terms of the intergyre transport and no substantial differences in terms of the evolving tracer PDF from localized releases, so we do not explore this alternative further.