Lagrangian stochastic (LS) modeling is a common technique in atmospheric boundary layer modeling but is relatively new in coastal oceanography. This paper presents some fundamental aspects of LS modeling as they pertain to oceanography. The theory behind LS modeling is reviewed and an introduction to the substantial atmospheric literature on the subject is provided.
One of the most important properties of an LS model is that it maintains an initially uniform distribution of particles uniform for all time—the well-mixed condition (WMC). Turbulent data for use in an oceanic LS model (LSM) are typically output at discrete positions by a general circulation model. Tests for the WMC are devised, and it is shown that for inhomogeneous turbulence the data output by an oceanic general circulation model are such that the WMC cannot be demonstrated. It is hypothesized that this is due to data resolution problems. To test this hypothesis, analytical turbulence data are constructed and output at various resolutions to show that the WMC can only be demonstrated if the resolution is high enough (the required resolution depending on the inhomogeneity of the turbulence data). The output of an LSM represents one trial of a possible ensemble, and this paper seeks to learn the ensemble average properties of the dispersion; this relates to the number of particles or trials that are performed. Methods for determining the number of particles required to have statistical certainty in one's results are demonstrated, and two possible errors that can occur when using too few particles are shown.
Lagrangian stochastic modeling of the advection and dispersion of clusters of particles has been a common technique in atmospheric boundary layer modeling for close to 20 years. The situation in oceanography is somewhat different, where the use of stochastic techniques has predominantly been applied to the analysis and modeling of drifter dispersion in turbulence assumed to be homogeneous (see, e.g., Dutkiewicz et al. 1993; Griffa et al. 1995; Buffoni et al. 1997; Falco et al. 2000). However, recently, the use of Lagrangian stochastic models (LSMs) in oceanography has increased primarily due to the use of individual-based models in biophysical modeling of the early life history of fish eggs and larvae (Sclafani et al. 1993; Werner et al. 1996, 2001; Hannah et al. 1998). Due to this difference in operational prerogative, many of the basic principles of LS modeling, especially in inhomogeneous turbulence, are not well known to oceanographers. One of the purposes of this paper is to provide a review of LS techniques as they pertain to oceanography.
In general there are three main reasons why LS modeling in the coastal ocean will differ from modeling in the atmosphere. First of all, the atmosphere has a turbulent boundary layer at its bottom surface, while the ocean has a turbulent boundary layer at its top (due to wind stress) and bottom surfaces. The result is that the ocean's vertical turbulent structure is typically more inhomogeneous than the atmosphere's, a fact that can make modeling more difficult.
Second, LSMs of the lower atmosphere typically use a surface turbulent variable (e.g., the friction velocity u∗) and analytical vertical turbulent structure functions [e.g., velocity variance σ2 = σ2(z; u∗)] to describe turbulence properties. In oceanography, the use of high-resolution circulation models with embedded turbulence closure schemes (e.g., Mellor and Yamada 1974) means that turbulent quantities for use in an LSM are output at discrete levels in the vertical (as well as horizontal position and time). We will see that this seemingly innocent difference can make accurate LS modeling virtually impossible.
Third, LS modeling in the atmosphere tends to be in the category of short-range boundary layer transport studies where high accuracy in time and space is important (considered to be the main virtues of LS modeling), and simplifying assumptions regarding dimension (2D vs 3D) and turbulent correlations are justifiable. Ocean particle tracking tends to fall into the long-range transport category, with (usually) lower demands on accuracy but (possibly) requiring true 3D stochastic modeling. This requires use of a different type of LSM, a random displacement model (RDM), the properties of which will be described in section 2.
A key property of a correct LSM is that it maintains an initially uniform concentration of particles uniform for all time. This is called maintaining the well-mixed condition (WMC). A major focus of this work is demonstrating the WMC for typical turbulent conditions in the coastal ocean, as this indicates the principles and caution that must be taken in such modeling. This is treated in section 3.
One run of an LSM provides one trial of a possible ensemble of trials. However, we need to know the ensemble average dispersive properties of the turbulent flow. This leads to the classic problem of stochastic modeling: how many realizations (or particles) are needed to have statistical confidence in one's results? We will see (section 4) that the answer to the question of sufficient realizations or number of particles is specific to the problem under consideration.
The outline of this paper is as follows. In section 2 we review the theory behind Lagrangian stochastic models and discuss concepts such as the Lagrangian timescale, the well-mixed condition, the diffusion limit of an LSM, and the inhomogeneity index for 1D turbulence. Section 3 deals with demonstration of the WMC and possible problems in doing so. Section 4 looks at the question of sufficient realizations, and shows how this can be determined, how it is specific to the question being asked, and what errors are possible when underseeding the source region. The last section is a summary and discussion.
2. Review of theory
The theory behind Lagrangian stochastic modeling is well represented in the literature, and the aim of this section is not to explain the complicated details of the derivation, but rather to present the key aspects relevant to an operational understanding of the model(s). The reader is referred to the excellent monograph by H. C. Rodean (Rodean 1996) for a clear, expanded version of the derivation of LSMs. The section begins with a review of the derivation, then moves on to topics related to LSMs and their use.
a. Model derivation
The starting point for LS modeling is the 3D Langevin equation [see chapter 1 of Gardiner (1983) for an interesting discussion of the historical setting]

dui = ai(x, u, t) dt + bij(x, u, t) d𝒲j.     (1)
To simplify discussion, consider the 1D version of (1) for the vertical velocity w:

dw = a(z, w, t) dt + b(z, w, t) d𝒲,  dz = w dt.     (2)
In the above, the velocity u = (u, υ, w), the velocity increment dui = (du, dυ, dw), the position x = (x, y, z), t is time, dt is the time step, dx = (dx, dy, dz) is the position increment, and d𝒲 is the incremental “Wiener process” with variance dt [𝒲(t) = ∫0t ζ(s) ds, d𝒲 = ζ dt, where ζ(t) is a “white noise” random forcing].
The first term on the right-hand side of (1) is a deterministic term called the drift term. The second term is a stochastic or diffusion term. The goal is to derive physically meaningful forms for the functions a and b.
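Numerically, an equation of this form is advanced with a forward (Euler-Maruyama) step using a Gaussian random increment of variance dt. The following minimal Python sketch is our own illustration: the functions a and b and all parameter values are hypothetical placeholders (here chosen as a simple relaxation drift and constant diffusion), not the physically derived forms discussed below.

```python
import math
import random

def langevin_step(z, w, a, b, dt, rng):
    """One Euler-Maruyama step of dw = a dt + b dW, dz = w dt,
    where dW is a Gaussian increment with variance dt."""
    dW = rng.gauss(0.0, math.sqrt(dt))
    return z + w * dt, w + a(z, w) * dt + b(z) * dW

# Toy case (illustrative values only): a relaxation drift a = -w/TL and a
# constant diffusion coefficient b = sqrt(2*sigma2_w/TL)
sigma2_w, TL, dt = 1.0e-4, 100.0, 1.0   # m^2 s^-2, s, s
a = lambda z, w: -w / TL
b = lambda z: math.sqrt(2.0 * sigma2_w / TL)

rng = random.Random(0)
z, w = 0.0, 0.0
for _ in range(1000):
    z, w = langevin_step(z, w, a, b, dt, rng)
```

One such step produces one Markovian transition: the new (z, w) depends only on the current state plus the random increment.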
Equation (2) is an example of a stochastic differential equation, and associated with it is a probability density function (pdf) for w, P(z, w, t). The evolution of P is described by a Fokker–Planck equation,
which states that the evolution of P(z, w, t) in phase space is due to 1) vertical advection of P, plus 2) the divergence in phase space of aP (the drift of probability), plus 3) the diffusion of P with diffusion coefficient b2, that is, P evolves via an advection–diffusion equation.
The Fokker–Planck equation is a differential form of the Chapman–Kolmogorov equation for a Markov process and required this assumption in its derivation [see Rodean (1996, ch. 5) and Gardiner (1983)]. Simply put, a Markov process is one in which the probability of a future state is independent of the past, depending only on the present plus a transition rule (pdf) that takes us from the present to the future. Strictly speaking, (3) describes the evolution of the transitional (or conditional) pdf P(z, w, t | z0, w0, t0), the probability that a particle initially at z0 with velocity w0 at t0 is observed in the dz, dw neighborhood of z, w at time t, which by the Markov assumption is related to the unconditional pdf P(z, w, t).
1) Solving for a and b
The Fokker–Planck equation is the key to finding expressions for a and b because it provides a link between the probability density function of the stochastic differential equation and statistical properties that can be derived from the governing Eulerian equations.
Two methods of solution are discussed in the literature: the moment method [Sawford (1986); van Dop et al. (1985); see also Du et al. (1994) for a recent view] and direct solution of the Fokker–Planck equation (Thomson 1987, hereafter Thomson). They both involve the fundamental (and intuitive) constraint that an initially well-mixed (uniform) distribution of particles remains uniform in unsteady, inhomogeneous turbulence—the WMC. The second requirement is called Eulerian consistency, which means that the statistics of the flow described by the LS model must be equivalent to the statistics derived from the governing Eulerian equations. It was Thomson who showed that these two requirements are, in fact, identical.
Thomson's method requires two important ideas.
- b has a universal form. For time increments within the inertial subrange the random part of the LSM should be determined from similarity theory considerations (see Wilson and Sawford 1996). This yields b = (C0ε)1/2, where C0 is a universal constant and ε is the turbulent kinetic energy dissipation rate.
- The WMC requires, formally, that P = PE; that is, the pdf of the LSM is the same as the pdf for the governing Eulerian field (Thomson 1987). Furthermore, the only consistent pdf is one that has vanishing odd moments (Sawford 1986); that is, PE(z, w, t) is Gaussian with variance σ2w = σ2w(z, t): the pdf is everywhere Gaussian, but its variance is a function of z and possibly t.
b. The solution of the 1D equation is unique
Using the above, Thomson found a unique solution to the 1D problem. Writing it in a form relevant to coastal oceanography, that is, for a model where σ2w = σ2w(z, t) and where a mean W exists (e.g., tidal regime), the full solution from Thomson with ŵ = W + w is
where TL is the Lagrangian timescale, and ζ dt is a random number from a Gaussian distribution with variance dt. A Lagrangian stochastic model is considered valid for timescales smaller than TL but longer than the Kolmogorov timescale tη = (ν/ε)1/2, where ν is the kinematic viscosity and ε is the turbulent kinetic energy dissipation. A more familiar form for (4) is obtained with ∂σ2w/∂t = 0 (i.e., stationary turbulence) and no mean W:

dw = −(w/TL) dt + (1/2)(1 + w2/σ2w)(∂σ2w/∂z) dt + (2σ2w/TL)1/2 d𝒲.     (5)
This is the equation derived by Wilson et al. (1981). Legg and Raupach (1982) derived a similar equation but without the w2/σ2w term (therefore not a well-mixed model). (The difference between the Wilson et al. and Legg and Raupach models is relevant in highly inhomogeneous turbulence.) The interpretation of (5) is that spatial gradients in the magnitude of turbulent kicks (the random term) lead to accumulation of particles in regions of low turbulence, a tendency that is countered by the ∂σ2w/∂z factor in the drift term.
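As a concrete illustration, a single step of the stationary 1D well-mixed model of Wilson et al. (1981) can be coded as below. This is our own sketch: the function names and the homogeneous smoke run (where the gradient term vanishes) are illustrative, and profiles are passed as callables so that gridded data could be substituted.

```python
import math
import random

def wm_lsm_step(z, w, sigma2_w, dsigma2_dz, TL, dt, rng):
    """One step of the stationary 1D well-mixed LSM (Wilson et al. 1981 form):
      dw = [-w/TL + 0.5*(1 + w^2/sigma2_w) * dsigma2_w/dz] dt
           + sqrt(2*sigma2_w/TL) dW,   dz = w dt.
    The 0.5*(1 + w^2/sigma2_w)*dsigma2_w/dz drift term is what counters
    particle accumulation in regions of low turbulence."""
    s2 = sigma2_w(z)
    tl = TL(z)
    drift = -w / tl + 0.5 * (1.0 + w * w / s2) * dsigma2_dz(z)
    dW = rng.gauss(0.0, math.sqrt(dt))
    return z + w * dt, w + drift * dt + math.sqrt(2.0 * s2 / tl) * dW

# Smoke run in homogeneous turbulence (illustrative values), where the
# gradient term is identically zero
rng = random.Random(1)
s2 = lambda z: 1.0e-4     # m^2 s^-2
ds2 = lambda z: 0.0
TLf = lambda z: 100.0     # s
z, w = 0.0, 0.0
for _ in range(500):
    z, w = wm_lsm_step(z, w, s2, ds2, TLf, 1.0, rng)
```

In inhomogeneous turbulence the accuracy of dsigma2_dz, evaluated from gridded data, is exactly the issue examined in section 3.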
c. The inhomogeneity index in 1D turbulence
If we nondimensionalize (5) using
we get (dropping primes)
is the inhomogeneity index [as mentioned in the Wilson and Sawford (1996) review], and in the sense that these equations are Lagrangian equations, z = z(t) so that IH = IH(z(t)). Note that in the context of integrating the 1D model, IH depends on the input data. Although the equations are typically not integrated in nondimensional form, the inhomogeneity index serves as a useful indicator for possible problems with the input data. (Note: IH as a measure of turbulence inhomogeneity can be misleading because, if ∂σ2w/∂z = 0 and TL = TL(z), then IH = 0, but the turbulence is still considered inhomogeneous. This is often the case in the lower atmosphere.)
d. The 3D solution is not unique
The solution of the 3D Langevin equations is considerably more complicated and, importantly, no unique solution is known to exist. (The same is true in 2D.) Thomson derived his “simplest” solution, in which each velocity component is of the form
containing 74 terms. Note that for the typical coastal circulation model where, for example, u′υ′ ∼ K∂U/∂y, the terms in (7) are derivable. A further problem with the simplest solution is that it is impossible to determine whether it is a physically realistic solution. To the authors' knowledge no attempt has been made to use a 3D LSM of this nature; if 3D stochastic modeling is required, an RDM, described below, is preferable. In the atmosphere, the 3D model above is often used in its 2D form (Rodean 1996, p. 44; Flesch et al. 1995) after simplifying assumptions are made regarding the mean flow field and turbulent correlations (u′υ′, etc.).
e. The random displacement model
A random displacement model is the “diffusion limit” of an LSM: the limit TL → 0 or t ≫ TL. These models of particle displacements get around the problems with the 3D (and 2D) LSMs. Various derivations of this limit for the 1D model exist in the literature (see, e.g., Durbin 1984; Boughton et al. 1987). Here we quote the general result from Rodean (1996):

dxi = [Ui + ∂Kij/∂xj] dt + (2Kij)1/2 d𝒲j,     (8)
where Kij is the eddy diffusivity. Equation (8) is a unique solution to the turbulent diffusion problem. Applying the Fokker–Planck equation, rearranging, and using the fact that the concentration C is related to the pdf P leads to

∂C/∂t + ∂(UiC)/∂xi = (∂/∂xi)(Kij ∂C/∂xj),     (9)
which shows that the RDM is equivalent to an advection–diffusion equation model. We see that the RDM provides a simple, unique, 3D alternative to an LSM, valid for timescales much longer than TL.
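The 1D reduction of the RDM is easy to code. The sketch below is our own illustration (all parameter values arbitrary): it steps dz = (∂K/∂z) dt + (2K)1/2 d𝒲 and checks that, for constant K, the ensemble variance of particle position grows as 2Kt, as an advection–diffusion model requires.

```python
import math
import random

def rdm_step(z, K, dK_dz, dt, rng):
    """One step of the 1D random displacement model,
    dz = (dK/dz) dt + sqrt(2 K) dW,
    i.e., the diffusion-limit (t >> TL) particle model.  The dK/dz drift
    prevents artificial accumulation where K is small."""
    dW = rng.gauss(0.0, math.sqrt(dt))
    return z + dK_dz(z) * dt + math.sqrt(2.0 * K(z)) * dW

# With constant K the ensemble variance should grow as 2*K*t
rng = random.Random(2)
K, dt, nsteps, P = 1.0e-2, 10.0, 100, 2000    # m^2/s, s, -, particles
zs = [0.0] * P
for _ in range(nsteps):
    zs = [rdm_step(z, lambda _: K, lambda _: 0.0, dt, rng) for z in zs]
t = nsteps * dt
var = sum(z * z for z in zs) / P   # should be close to 2*K*t = 20 m^2
```

For spatially varying K the same function applies with the finite-difference caveats discussed in section 3.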
f. TL and the accuracy of an LSM versus an RDM
In general, an LSM is considered to be more accurate for short timescales and space scales and in complex geometries than an advection–diffusion equation model (or RDM; Du et al. 1994). The reason for this can be understood using Taylor's classic theory of turbulent dispersion (Taylor 1921). First, we define the Lagrangian timescale as the integral of the velocity autocorrelation function r:

TL = ∫0∞ r(τ) dτ.
[Note that in the context of LSMs TL = (2σ2u)/(C0ε) (Wilson and Sawford 1996), which for steady homogeneous turbulence is an integral timescale. For unsteady, inhomogeneous turbulence the definition still applies locally.] Taylor's theory for the evolution of the ensemble average variance X2 is

(1/2) dX2/dt = σ2u ∫0t r(τ) dτ,
where we think of the left-hand side as equal to the effective diffusivity K, and take the autocorrelation function on the right-hand side to have the classic decaying “exponential correlogram” form r(τ) = exp(−τ/TL). For timescales ≪TL, when the initial patch is much smaller than the turbulent eddy scale, we get

X2 ≈ σ2u t2,
while in the diffusion limit, when the patch has grown larger than the eddy scale, we get a K-theory model

X2 ≈ 2σ2u TL t, with K = σ2u TL.
Thus K-theory, or eddy diffusion, models are considered to overestimate dispersion for times much shorter than TL and to be inaccurate near source regions. As well, in complicated terrains it is considered more accurate to track particles than to solve an advection–diffusion equation.
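For the exponential correlogram the Taylor integral has a closed form, and both limits can be checked numerically. A short sketch of our own (parameter values illustrative):

```python
import math

def taylor_variance(t, sigma2_u, TL):
    """Ensemble-average displacement variance X^2(t) for the exponential
    correlogram r(tau) = exp(-tau/TL), from Taylor's theory:
        X^2 = 2 * sigma2_u * TL^2 * (t/TL - 1 + exp(-t/TL)).
    Small t: X^2 ~ sigma2_u * t^2 (ballistic).
    Large t: X^2 ~ 2 * sigma2_u * TL * t (diffusive, K = sigma2_u * TL)."""
    return 2.0 * sigma2_u * TL ** 2 * (t / TL - 1.0 + math.exp(-t / TL))

sigma2_u, TL = 1.0, 10.0
short_t = taylor_variance(0.01, sigma2_u, TL)    # ballistic regime
long_t = taylor_variance(1.0e4, sigma2_u, TL)    # diffusive regime
```

The two regimes recover exactly the short-time and K-theory expressions above, which is why a K-theory model overestimates dispersion for t ≪ TL.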
The above conclusions must be tempered by two facts. First, the velocity and turbulent quantities driving an LSM are usually output from a discretized circulation model, which introduces separate questions of accuracy. Second, it typically takes 10–100 times more computer time to integrate an LSM versus an advection–diffusion model, and this must be weighed when considering the (possible) convenience and added precision of an LSM.
g. Boundary effects
An LSM must be complemented with a correct boundary reflection scheme in order to preserve particle numbers (the no-flux condition) and velocity correlations, and to allow the WMC to be satisfied. The details are complicated (Wilson and Flesch 1993) but can be distilled down to the requirement that the turbulence must become homogeneous as a boundary is approached. As there is always an unresolved boundary layer, this condition can be artificially imposed on the data. In practice, however, correct results are often reported in violation of these theoretical requirements (e.g., Legg and Raupach 1982).
h. Time step choice and dt bias
Typically, LSMs use an explicit time-stepping scheme for the stochastic component. In inhomogeneous turbulence where TL changes with position, it is tempting to use an adaptive scheme whereby the time step changes based on the local TL, instead of one in which dt < min(TL). Wilson and Flesch (1993) showed that an adaptive stepping scheme can result in what they call a dt-bias effect. Briefly put, they considered a variable time step approach in inhomogeneous turbulence characterized by σ2w = const and TL = TL(z) and showed that this leads to a net bias velocity due to a particle in a TL gradient having a different dt depending on whether it was going up or down. This bias velocity can lead to the unmixing of initially uniform particle fields. Therefore, it is safer to use a constant dt.
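A minimal helper reflecting this advice is sketched below (our own code; the safety factor of 20 mirrors the 0.5-s step used for the 1D runs of section 3, where min(TL) is roughly 10 s, but is otherwise an arbitrary choice).

```python
def choose_dt(TL_profile, safety=20.0):
    """Constant time step from the minimum Lagrangian timescale in the input
    data.  A fixed dt avoids the dt-bias (spurious net velocity and unmixing)
    that adaptive, TL-dependent stepping can produce in inhomogeneous
    turbulence (Wilson and Flesch 1993)."""
    return min(TL_profile) / safety

# Example: a profile whose smallest TL is 10 s gives dt = 0.5 s
dt = choose_dt([10.0, 50.0, 250.0])
```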
i. Nonneutrally buoyant particles
Up to this point, it has been implicit in our discussion that particles are neutrally buoyant and act like fluid parcels (the WMC required this). In reality, many particles of interest (e.g., fish eggs or larvae) are not neutrally buoyant or may exhibit deliberate behaviors (e.g., swimming). Unfortunately, a satisfactory theory for LSMs of nonpassive particles does not exist.
The essence of the problem is the following. Imagine two time steps of an LSM in which the particle (an egg) has some density-dependent behavior (e.g., Stokes settling velocity). On the first time step the fluid particle moves from z0 to z1 where it has velocity w1. The egg moves to z1 and then to z′1 due to its settling velocity. The trouble begins on the second time step where we have a fluid particle at z1 with w1 and an LSM to describe its next step (from z1), but the egg is at z′1 with a different w. This is called a trajectory crossing problem. Solutions in the literature (Sawford and Guest 1991; Zhuang et al. 1989) consider the spatial correlation between the fluid particle and the “egg” and adjust TL or reset the eddy that the egg is in, depending on assumptions regarding this correlation. The models work fairly well but indicate that much work remains to be done. For oceanographic purposes, where egg–water density differences are typically small, the trajectory crossing problem may not be that important (provided the run length is short enough for this to be true). But if significant swimming behavior is involved, then the chances of a particle wandering out of its eddy may not be small and significant tracking errors may occur.
3. Demonstrating the WMC
In this section we look at the ability of a 1D LSM and a 2D RDM to demonstrate the well-mixed condition using circulation model turbulence data as input. We will see that the output turbulent data from circulation models can make demonstration of the WMC impossible, regardless of time step or the number of particles used. We speculate that this is due to poor representation of the spatial derivatives of turbulent quantities required by the stochastic model. We show for smoothed model data and discretized analytical turbulence data that by increasing the output resolution the WMC can ultimately be demonstrated.
a. Input data
The turbulence data used in this paper are output from a run of the regional finite element model Quoddy4 (Lynch et al. 1996). The circulation model domain extended from the eastern Nova Scotian Shelf to west of Georges Bank and from about 150 km offshelf to the coast. The model had 21 vertical sigma levels designed with higher resolution in both top and bottom boundary layers, 12 192 nodes, and included density, wind stress, and tidal forcing.
Vertical turbulence variables q2, q2l, kq (turbulent kinetic energy, length scale, and vertical diffusivity) are calculated from a Mellor–Yamada level 2.5 scheme (Mellor and Yamada 1974). The derived quantities for use in an LSM (σ2w = 0.3q2/2, TL = kq/σ2w) are output at the sigma coordinate levels. Horizontal turbulence is represented by a horizontally (and temporally) varying diffusivity AH calculated using a Smagorinsky (1963) scheme. (Note that AH does not vary in the vertical.) For use in the 2D RDM tests described below, the data were interpolated onto a 5 × 5 km grid. For all experiments reported in this paper we use tidally averaged (residual) turbulent quantities.
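The conversion from closure output to LSM inputs uses only the two relations quoted above; a sketch (the q2 and kq values are illustrative, not model output):

```python
def lsm_inputs_from_my25(q2, kq):
    """Derived LSM inputs from Mellor-Yamada level 2.5 output, using the
    relations quoted in the text:
        sigma2_w = 0.3 * q2 / 2     (vertical velocity variance)
        TL       = kq / sigma2_w    (Lagrangian timescale)."""
    sigma2_w = 0.3 * q2 / 2.0
    TL = kq / sigma2_w
    return sigma2_w, TL

# Illustrative magnitudes only (q2 in m^2 s^-2, kq in m^2 s^-1)
sigma2_w, TL = lsm_inputs_from_my25(q2=2.0e-3, kq=1.0e-3)
```

These quantities are available only at the discrete sigma levels, which is the root of the derivative-resolution problem examined below.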
Figure 1 (thick lines) shows a vertical profile of turbulence data from a grid point near Browns Bank. The data, with strong vertical variation in σ2w and TL, are considered to be highly inhomogeneous. An inspection of the dataset showed that approximately 85% of the nodes had profiles of this nature. Note that the turbulence, represented by ∂σ2w/∂z and IH, only weakly approximates a homogeneous structure at the boundaries and thus does not satisfy the Wilson and Flesch (1993) boundary condition requirements. For the 1D tests described below, the data were artificially modified to force homogeneity at the boundaries. In practice this made no difference to the results.
Figure 2a is a contour plot of the AH field over Browns Bank. The diffusivity ranges from 50 to 200 m2 s−1, with generally lower values on-bank. For the 2D test, as in the 1D case, gradients in AH were forced to zero at the boundaries.
b. Methods for demonstrating the WMC
One run of an LSM can be considered to be one trial, or sample, of the possible ensemble of trials, and any method for demonstrating the WMC, or assessing the results from an LSM run, has to account for the statistical nature of the output of an LSM. The demonstration of the WMC requires showing that the initial uniform particle concentration (or distribution) remains uniform, in a statistical sense. Due to the nature of the input turbulent data, we use a different method for the 1D and 2D cases.
1) The 1D case: The variance test
In one dimension we test for the WMC as follows. Multiple trials of an LSM are performed starting with a uniform distribution of P particles. For each (24 h) trial the time-mean concentration C(z) at N equally spaced levels is calculated. [Concentration is calculated using a residence time algorithm (Luhar and Rao 1993).] For an ensemble of M trials the ensemble mean concentration 〈C〉 ± std dev is computed. If 〈C〉 is within one std dev of the initial concentration at all levels, then we consider the WMC to be demonstrated. However, this fact alone is not sufficient because we need to know something about the statistical stability of this result, that is, whether or not this is the expected result or just a chance occurrence.
We determine this by repeatedly subsampling the total M-trial ensemble and forming the statistics
where S2 is the mean of the distribution of sample variances, (V)1/2 is the standard deviation of the mean or standard error, N is the number of levels, T the number of trials in the sample, and R the number of independent T-sized samples drawn from the M-trial ensemble (Steel and Torrie 1960, p. 56). We expect that V(T) will be a decaying function of sample size T, and the decay scale in T gives an indication of the number of trials (or particles) required to yield a statistically acceptable result. We call this procedure the variance test.
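One plausible implementation of this subsampling procedure is sketched below. This is our own code: the synthetic well-mixed trial data and the exact form of the statistics are illustrative (the paper's statistics follow Steel and Torrie 1960).

```python
import random
import statistics

def variance_test(trials, T, R, rng):
    """Subsampling variance test (illustrative implementation).  From an
    M-trial ensemble (trials[m][n] = time-mean concentration of trial m at
    level n), draw R random T-trial samples, average each sample into a mean
    profile, and take the variance of that profile across the N levels.
    Returns the mean S2 of the R sample variances and its standard error."""
    sample_vars = []
    for _ in range(R):
        sample = rng.sample(trials, T)
        N = len(sample[0])
        mean_profile = [sum(tr[n] for tr in sample) / T for n in range(N)]
        sample_vars.append(statistics.pvariance(mean_profile))
    S2 = statistics.mean(sample_vars)
    stderr = statistics.pstdev(sample_vars) / len(sample_vars) ** 0.5
    return S2, stderr

# Synthetic well-mixed ensemble: uniform concentration 1.0 plus trial noise
rng = random.Random(3)
M, N = 20, 30
trials = [[1.0 + rng.gauss(0.0, 0.1) for _ in range(N)] for _ in range(M)]
S2_small, _ = variance_test(trials, T=2, R=200, rng=rng)
S2_large, _ = variance_test(trials, T=10, R=200, rng=rng)
```

The decay of S2 with sample size T mirrors the behavior of V described above: where the curve levels off indicates how many trials (or particles) are needed.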
2) The 2D case: The correlation test
In 2D it is more complicated to devise a WMC test. Our test is based on the fact that if a WMC does not exist, then a significant correlation would exist between the perturbation concentration field C′(x, y, t) = C(x, y, t) − C̄ and the perturbation diffusivity field A′H(x, y) = AH(x, y) − ĀH, where C̄ is the initial well-mixed concentration and ĀH is the xy-averaged diffusivity field.
To make this idea clearer, Figs. 2a,b show the Browns Bank AH field and the concentration field at day 16 from a run in which the drift (i.e., ∇AH) term in the 2D RDM [Eq. (8)] is turned off. The negative C′: A′H correlation is obvious with regions of higher concentration corresponding to lower diffusivity and vice versa.
To test for the WMC, we calculate the normalized correlation coefficient versus time,
where C′ · A′H = Σi Σj C′ij A′Hij, and σ2 is the variance. Significance levels are calculated using a version of the technique of Perry and Smith (1994). At a given time, the C′ field was repeatedly randomized and the test statistic Cor recalculated. A histogram of the 2000 Cor values was constructed, and the ±Cor values corresponding to 90% and 95% of the area were determined. So, for example, the 95% significance values mean that, given the set of C′s at a particular time, 95% of the time we would expect a random correlation between C′ and A′H to lie between ±Cor95. If the actual Cor is outside of this range, then it is deemed significant; and if this result pertains for all time steps and numbers of particles, then we consider that the WMC is not attained.
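A randomization scheme of this kind can be sketched as follows (our own code; the toy anomaly fields are constructed to be perfectly anticorrelated, so the statistic should comfortably exceed the permutation bound).

```python
import random

def norm_corr(c, a):
    """Normalized correlation of two flattened zero-mean anomaly fields."""
    num = sum(ci * ai for ci, ai in zip(c, a))
    den = (sum(ci * ci for ci in c) * sum(ai * ai for ai in a)) ** 0.5
    return num / den

def perm_bound(c, a, nperm=2000, level=0.95, rng=random):
    """Randomization significance bound in the spirit of Perry and Smith
    (1994): repeatedly shuffle the anomaly field c, recompute |Cor|, and
    return the value exceeded by only (1 - level) of the shuffles."""
    cs = list(c)
    null = []
    for _ in range(nperm):
        rng.shuffle(cs)
        null.append(abs(norm_corr(cs, a)))
    null.sort()
    return null[int(level * nperm) - 1]

# Toy anomaly fields: c = -a gives Cor = -1, which is clearly significant
rng = random.Random(4)
a_anom = [i - 49.5 for i in range(100)]
c_anom = [-x for x in a_anom]
cor = norm_corr(c_anom, a_anom)
bound = perm_bound(c_anom, a_anom, rng=rng)
significant = abs(cor) > bound
```

For the real fields the same comparison is made at every output time; the WMC is rejected only if the correlation stays outside the bound for all times and particle numbers.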
c. The 1D LSM tests
Using the “raw” model data (Fig. 1) the WMC was never found to be met. This is illustrated in Fig. 3, which shows a plot of 〈C〉 ± std dev, averaged over ten 500-particle trials. The ensemble mean concentration has large dips between −10 and −20 m, and −40 and −50 m, indicating that particles are evacuated from that zone. These problem regions correspond to regions of rapidly changing IH (Fig. 1). The plot also shows the behavior of V versus the number of trials, determined by repeatedly subsampling the 20-trial ensemble. Note that at least 10 trials are required before V levels off. The idea that about 5000 particles are required to achieve statistical certainty in an LS model is consistent with the atmospheric literature on this subject. The above result pertained regardless of the time step chosen or the method of calculating vertical derivatives. As well, runs where the boundaries were extended “to infinity” with homogeneous turbulence—to eliminate possible boundary effects—showed no difference.
Two possible reasons for failure to meet the WMC are as follows. 1) The model formulation is somehow deficient (e.g., due to truncation at some lower order) and higher-order derivative terms are required to make the model work in highly inhomogeneous turbulence. Or 2) the input data resolution is such that calculation of derivatives is too imprecise to allow demonstration of the WMC; that is, the data are effectively discontinuous. In the derivation of LS models (Gardiner 1983; Rodean 1996) it is not obvious that important higher-order terms are omitted, and this fact plus the historical success in the use of LSMs makes us doubt that the model formulation is deficient. Given the nature of the input data (Fig. 1) it is more likely that the resolution offered by the raw data combined with the inhomogeneous nature of the turbulence leads to fields that are not resolved smoothly enough to allow the model to satisfy the WMC.
To demonstrate this, the TL and σ2w data were fit to sixth- and fifth-order polynomials, respectively, and output at 0.5-m resolution (Fig. 1, dashed lines). Note that the fit produces smooth and reasonable looking σ2w and TL profiles and significantly reduces the amplitude of the IH profile. Figure 4a shows 〈C〉 ± std dev (10 trials) for the smoothed-data model. The mean curve shows small-scale wiggles but is within one std dev of the initial concentration of five particles per meter at all levels, thus satisfying the WMC. Figures 4b,c show the behavior of the average variance and the fraction of trials meeting the WMC as a function of number of trials. The latter was computed by repeatedly subsampling the 20-trial ensemble and shows again that at least 10 trials must be done (5000 total particles) to have statistical confidence in the LSM output.
The above runs used a constant time step of 0.5 s (almost 20 times smaller than the minimum TL in the dataset). When the time step was increased to 10 s, the WMC was never met for the same input data. Figure 5 is a representative result for dt = 10.0 s, showing an evacuation of particles near −40 m as in Fig. 3, and problems near the bottom boundary where the time step is O(TL). Thus, even for smooth data an incorrect choice of time step can lead to incorrect results.
To test the hypothesis that data smoothness–resolution, with its effect on derivative calculation, is the reason that the WMC is not met, we used Gaussian analytical TL and σ2w profiles
output at increasing vertical resolution, and investigated whether the WMC could be satisfied. The reason for the choice of a Gaussian shape is that it roughly represents the vertical profile of turbulence in a two–boundary layer system and has the properties of strong variation in IH plus homogeneous turbulence at the boundaries, thus removing the possibility that problems are due to boundary effects.
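The resolution effect can be isolated by comparing finite-difference and analytical derivatives of a Gaussian variance profile. In the sketch below (our own code) the amplitude and width are hypothetical, while the 2.5- and 0.25-m spacings echo the output resolutions tested here.

```python
import math

# Illustrative Gaussian sigma2_w profile (amplitude and width hypothetical)
S0, H = 1.0e-4, 25.0   # m^2 s^-2, m

def sigma2_gauss(z):
    return S0 * math.exp(-(z / H) ** 2)

def dsigma2_exact(z):
    """Analytical derivative, available only when an analytical profile is."""
    return -2.0 * z / H ** 2 * sigma2_gauss(z)

def dsigma2_fd(z, dz):
    """Centered difference at grid spacing dz, as an LSM driven by
    discretely output turbulence data must use."""
    return (sigma2_gauss(z + dz) - sigma2_gauss(z - dz)) / (2.0 * dz)

z = 15.0
err_coarse = abs(dsigma2_fd(z, 2.5) - dsigma2_exact(z))    # 2.5-m output
err_fine = abs(dsigma2_fd(z, 0.25) - dsigma2_exact(z))     # 0.25-m output
```

The centered-difference error scales as dz², so refining the output grid by a factor of 10 reduces the derivative error by roughly 100, consistent with the WMC being recovered only at the finer resolution.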
Figure 6 shows the result for turbulence data output at 2.5-m resolution, for which the WMC was not met. Figure 6a contains the 20-trial ensemble of curves, exhibiting a peak at zero depth with only one curve having a value on the low side of the eight particles per meter mean concentration near that depth. This is illustrated in Fig. 6b, where the 20-trial ensemble mean ±1 std dev and the mean concentration line are plotted. Figure 6c shows the average variance versus number of trials, indicating that at about 10 trials a statistically stable result is obtained.
Figure 7 shows the result for turbulence data output at 0.25-m resolution where the WMC is attained (Fig. 7a). Figure 7b shows the variance versus number of trials, again leveling off around 10 trials. Figure 7c shows the fraction of trials satisfying the WMC versus number of trials indicating that 16 trials (8000 particles) are required to demonstrate the WMC with 100% certainty.
We note that the above results are sensitive to the peak-to-peak amplitude of IH, ΔIH, a tunable parameter. As ΔIH decreases it gets progressively easier to demonstrate the WMC; as it increases, a point is reached where it is impossible to show a WMC except when analytical functions are used within the LSM to calculate derivative quantities.
The above results support the hypothesis that resolution problems and their effect on turbulent derivative calculation lead to the inability to satisfy the WMC.
d. The 2D RDM tests
In this section we look at the performance of a 2D random displacement model [Eq. (8)] using circulation model output data (AH field) and an analytical AH field as input data. Figure 8 is a plot of the normalized correlation coefficient versus time, and the 95% significance level, for a run using the circulation model AH field. The figure shows that the correlation levels off after about 18 days and is significant; that is, the WMC is not met using the AH field output from the circulation model. The leveling off of the correlation represents the time it takes for a diffusive equilibrium to be established. Changes in the time step and number of particles made no difference to this result.
To test whether the inability to demonstrate a WMC was due to resolution effects we ran the RDM using a 2D symmetric analytical Gaussian AH field, output at 2- and 10-km resolution. The origin of the Gaussian was at the center of the domain with a spread of 20 km. The correlation plots for 10- and 2-km resolutions (Fig. 9) show that, as in the 1D case, a finer resolution allows the WMC to be realized. As in the 1D case, if the “inhomogeneity” of the 2D AH field is increased (by decreasing the spread), then the demonstration of the WMC becomes more difficult unless the analytical function is used to calculate the spatial derivatives.
To summarize, circulation model output turbulence data in both the vertical and horizontal directions may be such that the well-mixed condition cannot be demonstrated. Tests using analytical forms of turbulence data show that this is due to resolution problems with inhomogeneous turbulence data. As output resolution is increased, derivatives of turbulent quantities are better determined and the WMC can be demonstrated.
4. Determining the number of particles
The tests for the WMC illustrate the principal method for determining whether enough particles were used: perform multiple trials and investigate how the answer (or some measure of it) changes. The variance test is an example of this procedure. Without a test of this nature one cannot be certain that the output from the stochastic model is representative of the (desired) ensemble statistics, or an outlier from which it would be dangerous to draw conclusions. In this section we identify two types of errors due to underseeding (or undersampling) the source region and show that the number of particles required, and thus the potential for error, is specific to the problem under consideration.
To illustrate what we will call a U-I underseeding error (U-I error) we ran a 1D LSM on the simple problem of relaxation of an isolated uniform plug of tracer in homogeneous turbulence (the “top-hat” problem). The analytical solution to this problem provides a ground truth to evaluate the LSM performance. Figure 10a shows the concentration profile after 3 h for ten 500-particle trials of an LSM, with initial uniform seeding between ±12.5 m. Figure 10b shows the ensemble mean ± std dev for the LSM and the analytical solution. Note that the ensemble average accurately simulates the analytical solution, and that any of the 10 trials would reasonably represent the analytical concentration profile. That is, 500-particle seeding is dense enough that a single trial is representative of the ensemble and thus approximates the correct answer everywhere.
Figure 10c is the same as Fig. 10a, except that the initial domain is seeded with 50 particles. In this case, the variation in the 10 trials is much larger, although the ensemble mean plus deviation (Fig. 10d) still captures the analytical solution. However, if one were to have done just the one trial highlighted by the wider black line in Fig. 10c, errors as great as 100% (compared to the analytical solution) would have resulted. Therefore, if the source region is underseeded, an individual trial can poorly represent the ensemble and thus lead to erroneous conclusions. We call this a U-I error.
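The top-hat experiment can be sketched as follows. This is a hedged illustration: the diffusivity, time step, and bin width are assumed values, not those used to produce Fig. 10. In homogeneous turbulence the random walk needs no drift correction, and the analytical solution for a diffusing top hat is C(x,t) = (C0/2)[erf((a−x)/√(4Kt)) + erf((a+x)/√(4Kt))].

```python
import math
import random

def top_hat_experiment(n_particles=50000, K=0.01, a=12.5, t_end=3 * 3600.0,
                       dt=600.0, seed=2):
    """Random-walk solution of the 'top-hat' problem in homogeneous turbulence.

    Particles start uniformly in [-a, a] (initial concentration C0 = 1) and
    take Gaussian steps of variance 2*K*dt. K is an assumed diffusivity.
    Returns (simulated, analytical) concentration at x = 0 after t_end.
    """
    rng = random.Random(seed)
    xs = [-a + 2.0 * a * (i + 0.5) / n_particles for i in range(n_particles)]
    step = math.sqrt(2.0 * K * dt)   # homogeneous: step size constant, no drift
    for _ in range(int(t_end / dt)):
        xs = [x + step * rng.gauss(0.0, 1.0) for x in xs]

    # concentration in a bin around x = 0: each particle carries mass 2a/N
    half_bin = 2.5
    count = sum(1 for x in xs if -half_bin <= x <= half_bin)
    c_sim = count * (2.0 * a / n_particles) / (2.0 * half_bin)

    # analytical top-hat solution evaluated at x = 0
    sig = math.sqrt(4.0 * K * t_end)
    c_exact = math.erf(a / sig)
    return c_sim, c_exact
```

Rerunning with a different seed and far fewer particles (say 50) reproduces the U-I behavior: individual trials scatter widely about the analytical curve even though their ensemble mean remains accurate.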
To illustrate a U-II error and show how the required number of particles is specific to the problem under consideration, we refer to more relevant examples using Quoddy flow fields for the southwest Nova Scotia–Bay of Fundy (SWN–BoF) region.
Browns Bank is the principal spawning ground for haddock in the SWN–BoF ecosystem, and there is considerable interest in the partitioning of haddock eggs and larvae between the bank, with its retaining gyre, and downstream in the BoF (Campana et al. 1989; Shackell et al. 1999). We used the Quoddy 3D climatological spring flow field with residual horizontal turbulence (AH field) to track particles spawned on the southeast portion of the bank to look at two questions: 1) what is the fraction of particles retained on the bank as a function of time, and 2) what is the concentration in some downstream grid cell as a function of time? We desire to know the number of particles/trials required to answer these questions with statistical confidence and accuracy.
As an illustration of the problem, Fig. 11 shows particle drift and dispersion at 10-day intervals for a 500-particle trial. The Browns Bank 100-m isobath is outlined, plus a 20 × 20 km box downstream at the mouth of the BoF. In this experiment, particles are tracked at the 20-m level, and the AH field is smoothed in order to avoid problems with the WMC.
Figure 12 shows the fraction of particles retained within the Browns Bank 100-m isobath as a function of time, for various total numbers of particles (100–2000). All of the curves are remarkably similar, with as few as 100 particles (solid line) yielding an answer accurate to about 10% after 20 days. For this question, 500 particles are all that are required to obtain a correct answer.
To understand why this is so, think of this as a source-target problem, with Browns Bank as the source and, effectively, Browns Bank (or not Browns Bank) as the target. Looked at in this way, it is logical that few particles are required for an accurate answer because the target area is large and is thus “easy to hit.”
Looked at in the same way, we can anticipate that the problem of accurately determining the concentration at a remote downstream location will be more difficult and will require more particles. Figure 13 shows the fraction of particles found within a 20 × 20 km square near the mouth of the BoF at days 45–60 for sixteen 2000-particle trials. Note first that the maximum spread is about a factor of 2 at day 50, and by eyeballing the mean and scatter at each time, one would estimate that a 2000-particle trial is within about 30%–40% of the ensemble mean. If we were concerned with the temperature field that the particles experienced, and this field changed over a scale of hundreds of kilometers, then this level of accuracy could be considered good enough. If, on the other hand, the particles represented an oil spill, and the 20 × 20 km grid cell were part of a major feeding ground, then greater accuracy may be required.
To get a more precise measure we use the variance test described in section 3. Figures 14a,b show the mean concentration and std error of the mean at day 45 as a function of number of 2000-particle trials, determined (as above) by repeatedly subsampling the 16-trial ensemble. Figure 14c shows the 95% confidence interval determined from a Student's t-test, expressed as a fraction of the mean. We see that the mean and std error stabilize at around eight trials (or 16 000 particles), at which point the t-test would indicate that the true mean would lie between ±10% of the estimated mean (≈0.0325), at the 95% confidence level. Note that the mean varies only slightly with the number of trials, and the std error is small (∼0.002/0.0325 ∼ 6% or less) even for two trials, or 4000 particles. Therefore, with less confidence, but also considerably less computer time, one 4000-particle run would give a satisfactory answer.
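The variance test amounts to repeated subsampling of the trial ensemble until the statistics stabilize. A minimal sketch follows, using synthetic trial values in place of the model output; the function name, parameter choices, and hardcoded t-table are ours.

```python
import math
import random
import statistics

# two-sided 95% Student's t critical values, t_{0.975, df}, for df = 1..15
T95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365,
       8: 2.306, 9: 2.262, 10: 2.228, 11: 2.201, 12: 2.179, 13: 2.160,
       14: 2.145, 15: 2.131}

def variance_test(trial_values, n, n_resamples=500, seed=3):
    """Repeatedly subsample n of the available trials and average the
    subsample mean, the std error of the mean, and the 95% CI half-width
    expressed as a fraction of the mean. Statistics have 'stabilized' when
    increasing n no longer changes them appreciably."""
    rng = random.Random(seed)
    means, errs, fracs = [], [], []
    for _ in range(n_resamples):
        sub = rng.sample(trial_values, n)
        m = statistics.mean(sub)
        se = statistics.stdev(sub) / math.sqrt(n)   # std error of the mean
        means.append(m)
        errs.append(se)
        fracs.append(T95[n - 1] * se / m)           # CI half-width / mean
    return (statistics.mean(means), statistics.mean(errs),
            statistics.mean(fracs))
```

With 16 trial concentrations in hand, sweeping n from 2 to 16 and plotting the three returned quantities reproduces the structure of Figs. 14a–c: the mean settles quickly, while the confidence interval narrows as more trials are pooled.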
The same technique can be used on 500-particle trials, shown in Fig. 15. The figure shows that again the mean and standard error stabilize around 9–10 trials, but with a much higher std error (≈25%) and a confidence interval of about 60% of the mean. Therefore, 10 × 500 particles does not seem to yield a satisfactory result and calls into question the implied equivalence between N trials of M particles and one trial of N × M particles. Note, however, that the mean concentration is roughly half as large as that seen in 2000-particle runs. This illustrates what we call a U-II error, resulting from underseeding the source area in a way that does not sufficiently sample the subarea that would, on average, have trajectories that pass through the target.

This problem, which also occurs for strictly deterministic flow fields, is illustrated in Fig. 16. Consider the situation, for deterministic flow, in which the middle 20% of a source region (S′ = 0.2S) is the source for trajectories that intercept the target region (T) downstream (Fig. 16a). Let us uniformly seed the entire region with an increasing number of particles and predict the fraction of particles that will intercept the target. One particle seeded uniformly in S would lie in S′ and thus we would predict that 100% of particles would hit T; two particles would miss S′ completely and lead to a 0% prediction, and so on. Figure 16b shows the prediction as a function of the number of uniformly seeded particles and indicates that there is a minimum number of particles required to obtain an accurate answer.

Figure 16c shows the day-45 concentration of particles in the 20 × 20 km grid discussed above for 100-, 500-, 1000-, 2000-, and 4000-particle trials and shows that for this specific problem at least 1000 particles are required to avoid a U-II error.
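The counting argument behind Fig. 16b can be reproduced directly. The sketch below uses a unit source region with its middle 20% feeding the target, as in the geometry described above; the function name is ours.

```python
def predicted_hit_fraction(n_particles, s_lo=0.4, s_hi=0.6):
    """Fraction of n uniformly seeded particles whose start positions fall in
    the sub-region S' = [s_lo, s_hi] of the unit source region S = [0, 1].

    In the deterministic picture of Fig. 16, only trajectories starting in S'
    intercept the target, so the true answer is s_hi - s_lo = 0.2 regardless
    of how many particles are seeded.
    """
    positions = [(i + 0.5) / n_particles for i in range(n_particles)]
    hits = sum(1 for x in positions if s_lo <= x <= s_hi)
    return hits / n_particles
```

One particle (seeded at x = 0.5) lies in S′ and predicts a 100% hit rate; two particles (at 0.25 and 0.75) miss S′ entirely and predict 0%; only for larger seedings does the prediction settle near the true value of 0.2, which is the minimum-particle threshold of Fig. 16b.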
In this paper we have presented various aspects of Lagrangian stochastic modeling directed toward practical applications in coastal oceanography. Starting from a general review of theory, we looked at one- and two-dimensional tests of the well-mixed condition and the problem of determining the number of particles or trials required to produce reliable results.
Oceanographic stochastic modeling tends to be different from atmospheric modeling in that turbulent data for use in oceanographic stochastic models are output from ocean circulation models while atmospheric models use a surface stress and analytical functions to provide vertical structure. We found (section 3) that the output turbulence data from coastal circulation models can be effectively discontinuous so that first derivatives do not exist, and that this makes demonstration of the WMC impossible. We showed that this was likely the correct interpretation by using analytical turbulence data output at various resolutions and showing that the WMC could only be demonstrated for sufficiently high resolutions. In all cases discussed, the use of the analytical functions themselves removed this problem. This makes the case for the development of canonical vertical structure functions for oceanic turbulence. Although beyond the scope of this work, such solutions may be possible from simplification of the governing turbulence closure equations. Loss of accuracy due to simplification could be balanced against the ability to use vertical turbulent schemes in oceanographic particle tracking.
Despite difficulties in demonstrating the WMC, it is possible that for practical problems acceptable answers could still be obtained; that is, the WMC test could be too rigorous. It is difficult to answer this question because the alternative to an LSM, an advection–diffusion equation model, suffers from inaccuracies as well, principally involving artificial diffusion, and overestimation of dispersion for short release times (see, e.g., Zambianchi and Griffa 1994). Nevertheless, we compared a 1D RDM, a 1D LSM, and a diffusion equation model (DM) based on the turbulence data in Fig. 1 for the problem of dispersion of an initial concentration of passive tracer with a peak at −30 m. [This profile is meant to simulate the observed distribution of haddock larvae in SWN; see Frank et al. (1989, their Fig. 4).] Despite arguments about the degree of diffusion in the diffusion model, the pattern obtained will be essentially correct. Figure 17a shows the DM solution at 24 h, plus the initial concentration and input Kv profiles. The error measure ratios RDM/DM and LSM/DM are plotted in Fig. 17b. We see that the greatest LSM errors occur at exactly the regions where the WMC problems occurred (Fig. 3a) and that the RDM arguably does slightly better than the LSM. This comparison indicates that if there are problems in demonstrating the WMC, then this is a warning flag for the use of a vertical turbulent scheme in a particle tracking routine.
We discussed (section 4) how to determine the number of particles, or number of trials, required to have statistical confidence in one's results. The method involves doing repeated trials using the stochastic model, forming the ensemble average and standard deviation, and determining when these statistics stabilize. This method was used to determine whether the WMC was satisfied (the variance and correlation tests described in section 3).
We identified two types of errors related to underseeding the particle source region (section 4). What we call a U-I error was illustrated by using an LSM to model a problem with an analytical solution. This type of error is due to using so few particles that significant deviation from the desired answer may result. In other words the uncertainty in the output is so large that the result may bear little resemblance to the true answer. We see this as akin to an undersampling problem in which the underlying distribution is so poorly sampled that the statistics of the sample do not properly represent the statistics of the distribution.
A U-II error is also an undersampling error, but it involves an underseeding of the part of the source region whose trajectories intersect the target area and illustrates possible problems associated with indiscriminate use of statistical tests. Essentially, it is possible with sparse seeding to miss source regions whose (turbulent) streamlines intersect the desired target. Repeated trials with the same seeding lead to stable but incorrect statistics. The example we gave (section 4) showed how one could underpredict downstream particle concentration by a factor of 2.
In Lagrangian stochastic modeling there is an apparent equivalence between N trials of M particles, and one trial of N × M particles. In the example used to illustrate a U-I error, we saw that ten 50-particle trials or one 500-particle trial gave acceptable results. However, the U-II error example pointed out a caveat in this equivalence: if a U-II underseeding situation pertains, the ensemble of any number of trials will give the wrong answer. The lesson is that one has to test for a minimum number of particles as well as statistical stability (Fig. 16).
Associated with the above are the ideas of how the question being asked determines the required number of particles and what a satisfactory answer is. The example of the fraction of particles seeded on Browns Bank that were retained on Browns Bank as a function of time required few particles (∼500) to achieve an accurate answer, while the accurate determination of the concentration in a 20 × 20 km region downstream at the mouth of the Bay of Fundy required a minimum of eight 2000-particle trials. In the latter example we showed that the uncertainty, once the minimum number of particles was established, was about 30%, so that if accuracy requirements are low, one 2000-particle trial would be adequate (Fig. 13).
We discussed the variability in the required number of particles in terms of a source–target problem. If the target is “large” and easy to hit, as in the case of the retention problem, then few particles will be required. If the target is “small,” then more particles will be required and the possibility of a U-II error increases. It is difficult to make the concept of effective target size rigorous, but factors involved would be the distribution of streamlines emanating from the source that intersect the target, the turbulent intensity between the source and target, and the timescale for traveling the source-target distance. Because Lagrangian stochastic modeling is a computer-intensive task with repeated trials of large numbers of particles typically necessary to produce believable results, a recipe based on the source–target concept would be a useful tool to guide modeling efforts.
The increase in the use of Lagrangian stochastic models in coastal oceanography is driven by the use of individual-based models in biophysical modeling of the dispersion, growth and mortality of fish eggs and larvae. The main advantage of individual-based models is the simple representation of growth and mortality functions. Although problems exist in translating these relationships to continuum mechanics form, doing so allows the complete model to be recast in advection–diffusion equation form. This transformation would eliminate problems related to the well-mixed condition and speed up computation time. It is possible that the work required to convert individual-based models to concentration-based equations will be well worth the effort.
Author D. Brickman was supported by U.S. GLOBEC.
Corresponding author address: Dr. David Brickman, Department of Fisheries and Oceans, Bedford Institute of Oceanography, P.O. Box 1006, Dartmouth, NS B2Y 4A2, Canada. Email: email@example.com
U.S. GLOBEC Contribution Number 186.