## 1. Introduction

Small-scale mixing is an important issue for a variety of atmospheric and oceanic processes. In situ measurements of velocity and/or temperature microstructures in the atmosphere and ocean enable direct estimates of turbulence parameters to be obtained, but such measurements remain difficult, expensive, and sparse (e.g., Alisse and Sidi 2000; St. Laurent and Schmitt 1999).

Thorpe (1977) proposed an elegant method to identify and characterize turbulent patches from in situ measurements by comparing an observed and a sorted vertical profile of potential density (or potential temperature). The stable sorted profile corresponds to a rearranged (reference) profile that can be obtained by adiabatic displacements. Because a vertical profile of potential density (temperature) in a stably stratified fluid is a monotonic function of depth (altitude), overturns display clear signatures in the difference between the measured and sorted profiles. Such overturns trigger convective instabilities, which in turn produce turbulent mixing. The Thorpe method can a priori be applied to standard soundings, either conductivity–temperature–depth (CTD) measurements in the ocean or lakes (e.g., Thorpe 1977; Galbraith and Kelley 1996; Ferron et al. 1998; Alford and Pinkel 2000) or pressure–temperature (PT) measurements in the atmosphere (Luce et al. 2002; Gavrilov et al. 2005; Clayson and Kantha 2008). Recently, it has been proposed to apply the Thorpe analysis to a huge meteorological radiosondes database to infer the space–time variability of atmospheric turbulence (Clayson and Kantha 2008).

The Thorpe signal is defined at each altitude (or depth) as the signal difference (potential density or potential temperature) between the observed and ordered profiles. The difference of altitude (depth) of a data bin in the observed and sorted profiles defines the Thorpe displacement. Thorpe displacements can be viewed as the vertical distances the observed fluid parcels must be adiabatically displaced so that the fluid becomes gravitationally stable everywhere, the available potential energy reaching a minimum. Thorpe displacements are thus related to the available turbulent potential energy (Dillon 1984). The Thorpe length $L_T$ of an overturn is the root-mean-square (rms) of Thorpe displacements for the considered overturn. The Thorpe length is thought to be proportional to the Ozmidov scale $L_0 = (E_k/N^3)^{1/2}$, where $E_k$ is the kinetic energy dissipation rate, and $N$ is the buoyancy frequency (Ozmidov 1965; Dillon 1982). Several methods have been proposed to relate either $E_k$ or $L_T$ to an eddy diffusion coefficient $K$ (Gavrilov et al. 2005).
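The definitions above can be illustrated with a minimal Python sketch. It assumes a profile stored bottom-up in which the stable state is increasing (potential temperature versus altitude), and it takes the displacement of a bin as its position minus its rank in the sorted profile, times the vertical step. The function names `thorpe_analysis` and `thorpe_length` are illustrative and do not come from the paper.

```python
import math

def thorpe_analysis(profile, dz):
    """Return (sorted_profile, thorpe_signal, thorpe_displacements)."""
    n = len(profile)
    # order[r] is the index of the observed bin that ends up at rank r (stable sort).
    order = sorted(range(n), key=lambda i: profile[i])
    sorted_profile = [profile[i] for i in order]
    # Thorpe signal: observed minus sorted value at each level.
    signal = [obs - ref for obs, ref in zip(profile, sorted_profile)]
    # Thorpe displacement of each observed bin: (position - rank) * dz.
    displacements = [0.0] * n
    for rank, i in enumerate(order):
        displacements[i] = (i - rank) * dz
    return sorted_profile, signal, displacements

def thorpe_length(displacements):
    """Root-mean-square of the displacements over the bins of one overturn."""
    return math.sqrt(sum(d * d for d in displacements) / len(displacements))

# Toy profile with a single two-bin inversion (bins 2 and 3 are swapped).
theta = [1.0, 2.0, 4.0, 3.0, 5.0]
reference, signal, disp = thorpe_analysis(theta, dz=1.0)
l_t = thorpe_length(disp[2:4])
```

In this toy case the signal and the displacements are nonzero only in the two swapped bins, and the Thorpe length is evaluated over those bins alone.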

The Thorpe scale approach has several limitations, however. For example, ship motions or balloon wake can cause severe errors in perturbing the probe movement or the flow itself. Beyond such problems (not discussed here), instrumental noise is a key issue, because noise can produce potential density/temperature inversions in the observed profile, especially in case of weak background stability. In the present context, an inversion is defined as a localized decrease of potential temperature versus height, or of potential density versus depth, whatever its origin. Following the terminology used by Johnson and Garrett (2004), the term *inversion* will refer to both real and artificial structures in a potential density/temperature profile. The term *overturn* will specifically refer to an inversion resulting from atmospheric motions (turbulence or Kelvin–Helmholtz billows). The selection of artificial inversions as overturns may result in a dramatic overestimation of the turbulent fraction. The space–time inhomogeneity of turbulence can give rise to diffusion coefficient estimations diverging by several orders of magnitude. It is therefore crucial to apply quantitative procedures to discriminate overturns from inversions produced by noise.

A variety of approaches have been proposed for this objective. Thorpe (1977) rejected displacements for which the Thorpe signal does not exceed a predetermined noise level. Ferron et al. (1998) and Gargett and Garner (2008) defined an intermediate density profile in which the density of neighboring points only differs if the difference between them is larger than a predetermined noise threshold. Also, these authors required an overturn to be an isolated region, that is, a region for which fluid parcels are exchanged with others belonging to the *same* region. Galbraith and Kelley (1996) proposed a series of tests for identifying overturns within CTD profiles. First, they considered limits on the minimum size for overturns to be detectable based on the instrumental noise and density gradient. Second, the authors suggested that the run length, defined as the number of adjacent bins of the Thorpe signal with the same sign, can be a useful diagnostic. They based the selection of overturns on an ad hoc value of the ratio between the observed run lengths and that expected from noise. Timmermans et al. (2003) suggested comparing their observed run lengths to the rms run length expected from a random sequence of Bernoulli trials with probability of ½. Gavrilov et al. (2005) selected data segments (arbitrarily of 12.8-m length) as not affected by the noise by applying two combined criteria: 1) the horizontal structure function at 1 m should exceed twice the noise variance, and 2) a potential temperature increase defined as $L_T\, d\theta_0/dz$ (where $\theta_0$ is the sorted potential temperature) should exceed twice the noise standard deviation. Piera et al. (2002) proposed another approach. After a wavelet denoising of the data, they estimated the potential displacement errors defined as the standard deviation of displacements induced by random instrumental noise. They then considered a data segment as a part of an overturn from the fraction of displacements exceeding the potential errors within that segment.

Alternatively, Alford and Pinkel (2000) qualified selected inversions as overturns when these inversions were present in both temperature and density profiles. Gargett and Garner (2008) proposed a diagnostic parameter allowing for the elimination of inversions resulting from density spikes. Finally, Galbraith and Kelley (1996) as well as Gargett and Garner (2008) introduced tests for rejecting density inversions caused by time-response mismatches in temperature and conductivity sensors. Such validation methods, mostly relevant in an oceanographic context, were addressed in a recent paper of Gargett and Garner (2008) and will not be further discussed here.

The selection methods based on a threshold depending on the noise level (amplitude of Thorpe signal, run lengths, and noise-induced displacements) are potentially very effective if measurement noise induces random errors in the sorted profile. The key assumption of these methods is that the errors on the Thorpe signal, or on the Thorpe displacements, induced by instrumental noise are independent. This assumption is questionable, however. For instance, Johnson and Garrett (2004) investigated the effects of noise on run lengths. These authors clearly demonstrated that the run-length statistics are related to both the background stratification and the size of the data sample. They concluded that “comparing the distribution of run-lengths or the rms run-length with that expected from a random uncorrelated series is not a reliable way of distinguishing between signal and noise.”

In the present paper, we propose an alternative method to discriminate overturns from inversions induced by random noise. The method is based on the data range statistics within the detected inversions. The range of a sample is defined as the difference between the maximum and the minimum values in that sample. A statistical hypothesis test enabling the discrimination of turbulent overturns from noise-induced inversions is described.

The paper is structured as follows. In section 2, the effects of noise upon a sorting procedure are investigated. It is shown that the Thorpe signal and displacements due to noise are not independent. A quantitative criterion for the minimum size an overturn must reach to be detected is derived. In section 3, we describe the selection method of overturns and evaluate its performance through Monte Carlo simulations. Section 4 illustrates the use of the method on an atmospheric profile at low and high vertical resolutions. The conclusions are drawn in section 5.

## 2. Instrumental noise effects on the Thorpe analysis

### a. Basic illustration

Measured density (or temperature) profiles are affected by random instrumental noise. Considering for simplicity a monotonic profile of potential density/temperature to which noise is added, artificial inversions can be produced by the noise only. The probability of occurrence for noise-induced permutations depends on the stratification, on the vertical resolution, and on the noise level (see appendix A).

Let us consider a signal $X$ (potential density or temperature) regularly sampled along the vertical, that is, $X_i = X(z_i)$, where $z_i = z_0 + i \times \delta z$ is the $i$th sampled level (altitude or depth); $z_0$ and $\delta z$ are an initial level and the vertical sample step, respectively. A key parameter for quantifying the impact of measurement noise in a sorting procedure is the average of the signal difference between consecutive data bins, $\tau$ (the trend), relative to the noise standard deviation $\sigma_N$:

$$\zeta = \frac{\tau}{\sigma_N}, \tag{1}$$

where $\tau = \langle X(z_{i+1}) - X(z_i) \rangle$ and $\zeta$ is the trend-to-noise ratio (tnr). For a sample of size $n$, the trend estimate reduces to

$$\tau = \frac{X_n - X_1}{n - 1}. \tag{2}$$

It is shown in appendix A that the probability of a noise-induced permutation between two bins can be expressed as a simple function of tnr $\zeta$ only. A bulk tnr $\zeta$ […] $\zeta$ […]$^{-1}$ corresponding to microstructure measurements for comparable stratification conditions (e.g., Gavrilov et al. 2005).

Artificial inversions can easily be produced in case of weak stability. Figure 1a shows a synthetic signal (dots) made of two sections with distinct linear trends: a region with small tnr [*ζ* = 0.5 × 10^{−2}, from bin 50 to bin 150, is enclosed within a region with larger tnr (*ζ* = 10^{−1})]. A normally distributed noise *B* ∼ *N*(0, 1) is added. There is no overturn in this case. Such staircase profiles are commonly observed in real geophysical flows (e.g., Dalaudier et al. 1994; Woods 1968; Coulman et al. 1995). The dashed curve of Fig. 1a shows the sorted profile. Within the weakly stratified region, the sorted profile is clearly biased because of the sorting of noise. Figures 1b and 1c show the Thorpe signal and the Thorpe displacements, respectively. The Thorpe signal (displacements) is biased toward positive (negative) values in the lower part of the weakly stratified region and toward negative (positive) values in the upper part of this region. Such vertical distributions for the Thorpe signal or Thorpe displacements are very similar to the ones observed in case of a real overturn. The issue is how to discriminate without ambiguity overturns from such noise-induced inversions.

### b. Sorting of a noise sample

To illustrate the impact of the sorting of noise, Fig. 2a shows a sample of 100 independent normally distributed variables *B* ∼ *N*(0, 1) (dots on left panel). The solid line shows the sorted profile. As expected, the bins in the lower part of the sorted profile are systematically negative, those in the upper part are positive, and the slope of the sorted profile is positive everywhere. The sorted profile is indeed an empirical estimate of the cumulative distribution function (cdf) of the sample. The dashed curve of Fig. 2a shows the associated theoretical cdf (×100) for a normally distributed population. The Thorpe signal and the Thorpe displacements are shown in Figs. 2b and 2c, respectively. Clearly, the Thorpe signal or Thorpe displacements of neighboring points are correlated.

Let $D(i)$ be the difference between the position $i$ and the rank of the $i$th bin $R(i)$, that is, $D(i) = i - R(i)$. [The Thorpe displacement $D_T$ is simply $D_T(i) = D(i) \times \delta z$.] For $n$ independent, identically distributed (iid) variables, the probability for the $i$th bin to be at any rank $k$ is simply $1/n$. On the other hand, the possible values for $D(i)$ range between $i - n$ and $i - 1$, that is,

$$i - n \le D(i) \le i - 1. \tag{3}$$

The displacement $D(i)$ is uniformly distributed: $D(i) \sim U(-n + i;\, i - 1)$, and the mathematical expectation $\mathrm{E}[D(i)]$ is

$$\mathrm{E}[D(i)] = i - \frac{n + 1}{2}. \tag{4}$$

For a given bin $i$, the expectation of displacement for $n$ iid variables depends on position $i$, but also on the sample size $n$ (dashed curve of Fig. 2c). The sorting of a noisy signal within a neutral (or weakly stratified) region will systematically produce such a positive bias in the reference profile, and thus in the Thorpe signal and Thorpe displacements, making such a region indistinguishable from an overturn (cf. with Fig. 1c).
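The position dependence of the expected displacement can be checked with a small Monte Carlo experiment. The sketch below is an illustration, not the authors' code. It sorts many samples of iid normal noise and compares the mean of position minus rank at each bin with the mean of a uniform variable on the integers from i − n to i − 1, namely i − (n + 1)/2. The function name, number of runs, and seed are arbitrary choices.

```python
import random

def mean_displacements(n, runs, seed=0):
    """Average of D(i) = i - R(i) over `runs` sorted samples of n iid N(0, 1) values."""
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(runs):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        order = sorted(range(n), key=noise.__getitem__)
        for rank0, i0 in enumerate(order):
            # 0-based position minus 0-based rank equals the 1-based difference.
            totals[i0] += i0 - rank0
    return [t / runs for t in totals]

n = 100
simulated = mean_displacements(n, runs=2000)
# Mean of a uniform variable on the integers i - n, ..., i - 1 (i is 1-based).
expected = [i - (n + 1) / 2 for i in range(1, n + 1)]
worst = max(abs(s - e) for s, e in zip(simulated, expected))
print(f"largest deviation from i - (n + 1)/2: {worst:.2f} bins")
```

The simulated means are strongly negative at the bottom of the sample and strongly positive at the top, even though the input is pure noise with no stratification.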

It is clear from this example that the run-length diagnostic (Galbraith and Kelley 1996) cannot be relevant for such cases. As already noticed by Johnson and Garrett (2004), long run lengths are expected at the top and bottom of weakly stratified regions. The method based on a comparison between Thorpe displacements and noise-induced displacements (Piera et al. 2002) also fails because 1) the displacements due to noise are no longer iid; 2) the noise-induced displacements are not simply proportional to the standard deviation of noise (Fig. 2c); and 3) the mean stratification (from which are estimated the noise-induced displacements), usually evaluated from the sorted profile, is systematically biased (overestimated) due to the sorting of noise (Fig. 2a) if the stability is weak.

### c. Size of detectable overturns

As stated by Galbraith and Kelley (1996), a limit in the detection of overturns results from the need to measure density (temperature) differences within overturns exceeding a threshold imposed by noise and background stratification. However, this threshold depends not only on the noise level but also on the size of the overturn. The difference between the maximum and minimum of the values of a sample defines the sample range $W$. The statistics of the range for an $n$ sample of iid variables (i.e., random noise) depend on both the probability distribution and the size of the sample $n$ (e.g., David 1981, and appendix B). For instance, Fig. 3 shows the pdf of the range $W_N(n)$ of $n$ iid variables $N(0, 1)$, for $n = 10, 20, 50$ (the subscript $N$ stands for a normally distributed parent population). These pdfs are estimated by numerical integrations of Eq. (B5). As the sample size increases, the pdf of the range is shifted toward larger values and its standard deviation decreases. From these pdfs, various moments and percentiles can be estimated. Figure 4 shows the expectation of the range $\mathrm{E}[W_N(n)]$ as well as the 95 and 99 percentiles, $w_{95}$ and $w_{99}$, respectively, for iid variables, $X \sim N(0, 1)$, as a function of the sample size $n$ for $2 \le n \le 5000$ (see also Table 2).
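Percentiles of the range of normal noise can be computed numerically. The sketch below is hedged: it uses the textbook order-statistics expression for the cdf of the range of n iid variables, P(W ≤ w) = n ∫ f(x) [F(x + w) − F(x)]^(n−1) dx, which is assumed here to play the role of Eq. (B5), an equation not reproduced in this section. The integration limits, step count, and function names are illustrative choices.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def range_cdf(w, n, lo=-8.0, hi=8.0, steps=1600):
    """P(W_N(n) <= w) for n iid N(0, 1) variables, by the trapezoidal rule."""
    if w <= 0.0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        x = lo + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * norm_pdf(x) * (norm_cdf(x + w) - norm_cdf(x)) ** (n - 1)
    return n * h * total

def range_percentile(n, p, tol=1e-4):
    """Smallest w (to within tol) such that P(W_N(n) <= w) >= p / 100, by bisection."""
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n) < p / 100.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

w95_20 = range_percentile(20, 95)
w99_20 = range_percentile(20, 99)
print(f"w95(20) = {w95_20:.2f}, w99(20) = {w99_20:.2f}")
```

For comparison, the values quoted later in the text for a 20-bin noise sample are 5.01 and 5.65.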

Let us consider an inversion of size $n$, that is, of thickness $L_I = (n - 1)\,\delta z$. To distinguish an overturn from a noise-induced inversion, the range of the data (potential density or temperature) within the inversion, $W(n)$, must significantly exceed the range of a noise sample of similar size, $W_N(n)$ [Eq. (5)]. The order of magnitude of the minimum tnr $\zeta_{\min}$ for an overturn of size $n$ to be detectable is derived from Eq. (5) [Eq. (6)]. Figure 5 shows $\zeta_{\min}$ as a function of the sample size $n$, that is, the size of a detected inversion. The points below the curves are unlikely to be observed, especially if a trend (negative or positive) exists (if any trend exists, the expectation of range is always larger than the expectation for the null-trend case). For a given tnr, that is, for a given stratification, measurement resolution, and noise level, the curves indicate the minimum length (in number of bins) for an overturn to be detectable. For the sake of comparison, the threshold tnr proposed by Galbraith and Kelley (1996) is, with our notations, $\zeta_{\mathrm{GK}} = 2/(n - 1)$ [from their Eq. (5)]. This threshold is shown in Fig. 5 (dotted line). For large inversions, this criterion is too optimistic, as the increase of data range due to the sorting of noise is ignored.
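As a rough illustration of how a size-dependent threshold of this kind can arise, the following worked relation rests on two assumptions that are not taken from the paper's Eqs. (5) and (6): that the data range across an inversion of $n$ bins scales as the trend times the number of steps, $W(n) \approx (n - 1)\tau$, and that "significantly exceed" is read as exceeding a high percentile $\sigma_N\, w_p(n)$ of the range of a noise sample with standard deviation $\sigma_N$.

```latex
W(n) \approx (n - 1)\,\tau \;>\; \sigma_N\, w_p(n)
\quad\Longrightarrow\quad
\zeta = \frac{\tau}{\sigma_N} \;>\; \frac{w_p(n)}{n - 1}
```

Under that reading, replacing $w_p(n)$ by the constant 2 gives the same form as $\zeta_{\mathrm{GK}} = 2/(n - 1)$. Because the percentiles of the noise range grow with $n$, a constant numerator becomes increasingly permissive for large inversions.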

## 3. Overturn identification

### a. The selection procedure

The basic idea of this paper is to infer the existence of an overturn from the range of the signal within the identified inversions (artificial or real). If an inversion is produced by the sorting of noise (*H*_{0} or null hypothesis), the range of the data is expected to be distributed as the range of an *n* sample of noise. Conversely, if an overturn is produced within a stably stratified region (*H*_{1} hypothesis), the signal range is expected to exceed significantly the range of a noise sample. Indeed, as the background trend increases, whatever the sign of the trend is, the pdf of the range is shifted toward larger values. A quantitative criterion allowing for the acceptance or rejection of the null hypothesis (*H*_{0}) can be obtained by comparing the signal range with the pdf of the range of a noise sample of the same size. The inversion is likely to be an overturn if the signal range exceeds a threshold corresponding to a large percentile of the range of the noise sample. Otherwise, one cannot draw a conclusion.

To illustrate our purpose, we build a synthetic signal $X_i$, $1 \le i \le 100$. Normally distributed noise $B \sim N(0, 1)$ is added to a test model profile. A realization for which the bulk tnr is $\zeta = 0.5$ ($\sigma_N = 1$) is shown in Fig. 6. “Solid-body” and “sine” overturns are simulated from bin 11 to 30 and 71 to 90, respectively (Fig. 6a). A weakly stratified region, from bin 41 to 60, produces an artificial inversion that is clearly visible in the Thorpe displacements profile (Fig. 6d).

To determine the size of the detectable overturns, we first evaluate a bulk tnr. From the trend estimate $\tau = \langle X_{i+1} - X_i \rangle = (X_{100} - X_1)/(100 - 1)$ and standard deviation of noise $\sigma_N$, we deduce $\zeta = \tau/\sigma_N$ (dashed line of Fig. 6b). A local value for the tnr is estimated from the sorted profile: $\zeta_i = (X_{i+1:100} - X_{i-1:100})/(2\sigma_N)$, where $X_{i:100}$ is the bin of rank $i$ within the sorted 100 samples (dashed curve of Fig. 6a). A minimum of tnr is clearly visible in the central part of the profile, where the stratification is weak. Given the bulk tnr, an order of magnitude for the size of a detectable overturn can be estimated from Fig. 5. From our simulated dataset, the bulk tnr is 0.5 and the size for a detectable overturn is on the order of 10 bins (Fig. 5). Consequently, noise reduction is not required because we consider 20-bin inversions in the present case. The noise reduction of data will be discussed in section 4c.
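The bulk and local tnr estimates described above can be sketched in a few lines of Python. The function names and the noise-free staircase test profile are illustrative assumptions. The profile is built without noise so that the recovered values can be read directly against the slopes that were put in.

```python
def bulk_tnr(x, sigma_n):
    """Trend between the first and last bins, per step, over the noise std."""
    return (x[-1] - x[0]) / (len(x) - 1) / sigma_n

def local_tnr(x, sigma_n):
    """Centered difference of the sorted profile over 2 * sigma_n (interior bins only)."""
    s = sorted(x)
    return [(s[i + 1] - s[i - 1]) / (2.0 * sigma_n) for i in range(1, len(s) - 1)]

# Hypothetical staircase: slope 0.5 per bin, except 0.05 per bin for bins 40 to 59.
profile, level = [], 0.0
for i in range(100):
    profile.append(level)
    level += 0.05 if 40 <= i < 60 else 0.5

zeta_bulk = bulk_tnr(profile, sigma_n=1.0)
zeta_local = local_tnr(profile, sigma_n=1.0)
print(f"bulk tnr = {zeta_bulk:.3f}, minimum local tnr = {min(zeta_local):.3f}")
```

With noise added, the sorted profile in the weakly stratified part would be steepened by the sorting of noise, so the local tnr there would tend to be overestimated, as discussed in section 2b.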

The next step consists of identifying inversions (real and artificial) from the signal (potential density or potential temperature). According to Dillon (1984), an inversion is defined as an isolated region, that is, a region for which $\sum_{i=1}^{n} D(i) = 0$, where $n$ is the size of the inversion. We also required, for any $1 \le k \le n$, that […]. The question is then whether a detected inversion is an overturn ($H_1$ hypothesis) or not ($H_0$ hypothesis). If not, the inversion is believed to be induced by noise. The signal range within the detected inversions has to be compared with the range of a noise sample of the same size because the pdf of the range depends on the size of the sample (see Fig. 3). Let $w_p(n)$ be the $p$th percentile of the range of $n$ iid normally distributed variables, that is, $\Pr[W_N(n) < w_p(n)] = p/100$. By retaining a $p$% confidence level, an inversion of size $n$ is recognized as a noise-induced one ($H_0$ hypothesis) if its data range $W(n)$ is such that

$$W(n) < \sigma_N\, w_p(n), \tag{7}$$

and $H_0$ is rejected otherwise ($H_1$ hypothesis). The probability of rejecting $H_0$ if $H_0$ is realized (type I error, corresponding to an erroneous detection) defines the significance level $\alpha$ ($\alpha = 1 - p/100 = 0.05$ or $0.01$). The probability $\beta$ of accepting the $H_0$ hypothesis when $H_1$ is realized (type II error, corresponding to the nondetection of a real overturn) depends on the characteristics of the overturns and may be estimated through Monte Carlo simulations (see Table 1). The power of a hypothesis test is usually defined as the probability of rejecting $H_0$ when it is false, that is, to detect an existing overturn, which is equal to $1 - \beta$ (Freeman 1963). The thresholds of range $w_p(n)$ corresponding to significance levels $\alpha = 0.05$ and $\alpha = 0.01$, that is, $w_{95}$ and $w_{99}$, are plotted in Fig. 4 and tabulated in Table 2. The expectation, standard deviation, as well as various percentiles of the range of a normally distributed population $N(0, 1)$ can be downloaded as an American Standard Code for Information Interchange (ASCII) file (available online at ftp://ftp.aero.jussieu.fr/pub/os/WN.txt). If the noise is assumed to follow a distribution other than normal, the percentiles of the range have to be numerically calculated from Eq. (B5).
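The range test itself is short to express in code. The sketch below is an illustration under stated assumptions: the two thresholds for a 20-bin sample are the values quoted in the text from Table 2, the decision rule compares the data range with the noise standard deviation times the tabulated percentile, and the names `is_overturn` and `type_one_rate` are hypothetical. The Monte Carlo part only checks that pure noise is flagged at a rate close to the chosen significance level.

```python
import random

# Percentiles of the range of 20 iid N(0, 1) variables, as quoted in the text (Table 2).
W_P_20 = {95: 5.01, 99: 5.65}

def data_range(values):
    return max(values) - min(values)

def is_overturn(values, sigma_n, w_p):
    """Reject H0 (noise-induced inversion) when the range exceeds sigma_n * w_p."""
    return data_range(values) > sigma_n * w_p

def type_one_rate(n, w_p, runs, seed=1):
    """Fraction of pure-noise samples of size n wrongly flagged as overturns."""
    rng = random.Random(seed)
    hits = sum(
        is_overturn([rng.gauss(0.0, 1.0) for _ in range(n)], 1.0, w_p)
        for _ in range(runs)
    )
    return hits / runs

rate_99 = type_one_rate(20, W_P_20[99], runs=20000)
print(f"empirical type I error at the 1% level: {rate_99:.4f}")

# A 20-bin segment with a strong trend (range 11.4) versus a weak one (range 1.9).
strong = [0.6 * i for i in range(20)]
weak = [0.1 * i for i in range(20)]
print(is_overturn(strong, 1.0, W_P_20[99]), is_overturn(weak, 1.0, W_P_20[99]))
```

For sample sizes other than 20, the threshold would have to come from Table 2, from the ASCII file mentioned above, or from a numerical evaluation of the range distribution.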

### b. Validation of the hypothesis test

Monte Carlo simulations (1000 runs) were performed from the synthetics presented in the previous section with various bulk tnr values. For the particular realization shown in Fig. 6, for which tnr = 0.5, the detected inversions extend over about 85% of the profile due to the generation of small size noise-induced inversions.

The thresholds of range corresponding to significance levels of 5% and 1% for a 20-bin noise sample are $w_{95}(20) = 5.01$ and $w_{99}(20) = 5.65$, respectively (Table 2). For a tnr $\zeta$ […] $W = 11.9 \pm 1.8$ (here $\sigma_N = 1$). The sine overturn (bins 71–90), for which $W = 7.0 \pm 1$, is identified for 98.1% of the runs (type II error for about 1.9% of the cases). For the artificial inversion (bins 41–60), $W = 3.6 \pm 0.95$. The inversion is selected as an overturn for about 0.7% of the runs (see Table 1). Figure 6 (top panel) shows the position and vertical extent of the overturns as well as the corresponding Thorpe lengths for this particular realization, a case where the two overturns are correctly identified (more than 98% of the cases).

For *ζ**ζ*

### c. Influence of stirring by turbulence

Within the ocean or atmosphere, overturning layers frequently contain random (turbulent) fluctuations and solid-body or sine overturns are only seldom observed. Furthermore, the cumulative sum criterion applied for the detection of inversions may appear questionable in case of a random distribution of fluid particles positions within the inversions. Another simulation was thus performed by assuming particles randomly distributed within the inversions. Such a complete randomization of the bins’ positions simulates an extreme influence of stirring. Compared to the previous case, the data range remains unchanged within the inversions, but the Thorpe displacements are now fully random (and not only randomized by the noise). It is likely that most of the oceanic or atmospheric overturns lie in between these two extreme cases, from “solid” to “fully stirred” overturns.

As mentioned previously, we performed 1000 runs with various bulk tnr but with stirred inversions. The lower panels of Fig. 6 show the results of the selection procedure for the same particular realization (upper panels) but with fully stirred inversions. No significant difference was noticed, neither in the inversions detection nor the fraction of selected overturns (Table 1). The Thorpe length estimate does not seem to be affected much by the assumed complete stirring, although it is slightly smaller in case of a solid-body overturn. Also, the shape and amplitude of the cumulative sums of displacements are now roughly the same for the three main inversions (Fig. 6h). Therefore, one can conclude that the proposed selection criterion for the detection of overturns does not depend on the presence (or importance) of stirring within the inversions.

Results of Monte Carlo simulations for three different bulk tnr values $\zeta$ […] $n = 20$ are summarized in Table 1. Clearly, for $n = 20$ the selection method is not effective if $\zeta$ […] $\zeta$ […]

### d. Power of the hypothesis test

For a fixed inversion size ($n = 20$), the power of the test is observed to increase with tnr $\zeta$ […] $n$. The power of the test was estimated through Monte Carlo simulations for $2 \le n \le 1000$, and $0.001 \le \zeta$ […] $n$ and $\zeta$ […] $n$, the power is increasing from 0 to 1 starting from a tnr threshold depending on $n$. Identically, the power is increasing with $n$ for constant $\zeta$ […] $n > 50$), the test power is high (>0.9) for the tnr values corresponding to a significance level 0.05 or 0.01 (dashed curves). Conversely, for the small overturns ($n \le 10$) the power is only slightly larger than 0.5 for the threshold tnr.

## 4. Example with real atmospheric data

### a. Dataset description

The proposed selection procedure is now applied to vertical profiles of atmospheric temperature and pressure. We used data collected from a RS90G Vaisala radiosonde launched during a radar-balloon observation campaign called the Middle and Upper Atmosphere (MU) Radar, Temperature Sheets, and Interferometry (MUTSI; Luce et al. 2002; Gavrilov et al. 2005). The initial sampling frequency and vertical resolution (after resampling at a constant vertical step) were 0.69 Hz and 6 m, respectively. High-resolution (HR) balloon measurements were also performed with sampling frequency and vertical resolution of 50 Hz, and ∼10 cm, respectively. The reader can find more details about the HR measurements in Gavrilov et al. (2005). By contrast, the Vaisala data will hereafter be labeled as low-resolution (LR) data. The HR and LR sensors were on the same gondola, and the data were collected simultaneously.

The potential temperature $\theta$ is inferred from the measured temperature and pressure using the relation for an ideal diatomic gas:

$$\theta = T \left( \frac{1000}{P} \right)^{2/7}, \tag{8}$$

where $T$ is the temperature (K) and $P$ is the pressure (hPa). Several nearly neutral layers can be seen around 4-km altitude and above 10-km altitude.
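The conversion to potential temperature can be sketched as follows, assuming the usual 1000-hPa reference pressure and the exponent 2/7 of an ideal diatomic gas (both are assumptions of this sketch). The same pressure factor multiplies the temperature noise, which is why the noise level of the potential temperature depends on altitude. The 560-hPa level used in the example is a hypothetical value chosen for illustration only.

```python
def potential_temperature(temp_k, pressure_hpa, p_ref_hpa=1000.0):
    """theta = T * (p_ref / P) ** (2 / 7), with T in K and P in hPa."""
    return temp_k * (p_ref_hpa / pressure_hpa) ** (2.0 / 7.0)

def theta_noise_std(temp_noise_std, pressure_hpa, p_ref_hpa=1000.0):
    """Temperature noise scaled by the same pressure factor (pressure noise neglected)."""
    return temp_noise_std * (p_ref_hpa / pressure_hpa) ** (2.0 / 7.0)

theta_500 = potential_temperature(250.0, 500.0)
noise_560 = theta_noise_std(22.0, 560.0)  # mK, for 22 mK of temperature noise
print(f"theta at 500 hPa and 250 K: {theta_500:.1f} K")
print(f"potential temperature noise at 560 hPa: {noise_560:.1f} mK")
```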

In the present work, we consider tropospheric data only (from 4.5- to 11.8-km altitude), that is, data acquired in relatively weak stability (low tnr) conditions. The noise is evaluated experimentally from the temperature measurements for both LR and HR data: after the removal of a linear trend on short data sequences (five points), we calculated the mean of the squared data differences. The average from all the sequences is an estimate of twice the noise variance. The standard deviation of noise for LR (HR) temperature measurements is found to be 22 (2.2) mK. The noise level of the observed temperature data is independent of altitude, whereas the noise level of the potential temperature slightly varies with altitude because of the pressure term [Eq. (8)]: it increases from 26 to 35 (2.6–3.5) mK for LR (HR) data for heights ranging from 4.5 to 11.8 km.
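The noise estimate described above can be sketched as follows. This is one possible reading of the procedure, with assumptions stated plainly: sequences are taken as non-overlapping blocks of five points, the linear trend is removed by least squares within each block, and the squared differences are taken between consecutive residuals. The synthetic profile, seed, and function name are illustrative.

```python
import math
import random

def estimate_noise_std(x, seg=5):
    """Noise std from squared consecutive differences of detrended short sequences."""
    t_mean = (seg - 1) / 2.0
    t_var = sum((t - t_mean) ** 2 for t in range(seg))
    seq_means = []
    for start in range(0, len(x) - seg + 1, seg):
        y = x[start:start + seg]
        y_mean = sum(y) / seg
        # Least-squares slope of the sequence, then residuals about the fitted line.
        slope = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y)) / t_var
        resid = [v - y_mean - slope * (t - t_mean) for t, v in enumerate(y)]
        sq_diffs = [(resid[k + 1] - resid[k]) ** 2 for k in range(seg - 1)]
        seq_means.append(sum(sq_diffs) / (seg - 1))
    # The average of the squared differences estimates twice the noise variance.
    return math.sqrt(0.5 * sum(seq_means) / len(seq_means))

rng = random.Random(3)
true_sigma = 0.022  # K, the LR temperature noise level quoted in the text
profile = [300.0 + 0.005 * i + rng.gauss(0.0, true_sigma) for i in range(5000)]
sigma_hat = estimate_noise_std(profile)
print(f"estimated noise std: {1000.0 * sigma_hat:.1f} mK")
```

With this particular detrending, the fitted slope absorbs a small part of the noise, so the estimate is expected to sit a few percent below the true value for five-point sequences.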

The vertical profile of tnr for the LR data is shown in Fig. 8a. The bulk tnr *ζ*_{LR} is ∼0.9 (dashed line), the local values showing minima less than 10^{−1} around 11-km altitude. For such a bulk tnr, one cannot expect to detect overturns smaller than *n* ∼ 6 bins, that is, thinner than 30 m (Fig. 5). Furthermore, the test power reaches 0.95 for *n* ∼ 10 (54 m). Within the low stability region, the threshold size is ∼70 bins (414 m). The bulk tnr of the HR measurements is *ζ*_{HR} ∼ 0.14, so detectable size is *n* ∼ 50 bins (5 m), with test power reaching 0.95 for such a size. As shown later in the study, a denoising procedure can provide better performances.

### b. LR data analysis

Figure 8 shows the Thorpe displacements (Fig. 8c) and the cumulative sum of displacements (Fig. 8b) for the LR data. The Thorpe displacements range from about 10 to 300–400 m. For this particular profile, the (real and artificial) inversions extend over 60% of the upper troposphere.

By applying the selection test with a 1% significance level, about 91% of the inversions are not selected as overturns. The turbulent fraction, that is, the fraction of the profile which is selected as overturns, falls down to 27%. The four selected turbulent patches are shown in Fig. 8d (thick lines); here the Thorpe lengths for the detected inversions with the selected (presumably turbulent) overturns are highlighted. The sorted potential temperature profile is superimposed on the figure. A thick turbulent region is observed from altitudes 10.1–11.2 km, with a Thorpe length of about 140 m. Thinner layers are also observed below 6.5 km, with Thorpe lengths ranging from 25 to 50 m. It is worth noting that most of the inversions are not selected as overturns: for their given tnr and sizes, their ranges cannot be distinguished from the range of random noise samples. Of course, a less restrictive confidence level can be used, at the expense of a larger rate of (accepted) errors. For *α* = 25%, the ratio of selected overturn is 21%, and for *α* = 25%, it reaches 40%.

### c. Denoised HR data analysis

As mentioned previously, $\zeta_{\mathrm{HR}} \sim 0.14$, the detectable size for an overturn with 1% significance level being $n \approx 50$ bins (5 m). Also, for such a low tnr the vertical extent of turbulent overturns is likely poorly resolved. A denoising procedure aimed at increasing $\zeta$ is therefore needed. Undersampling by a factor $m$ increases $\zeta$ by a factor $m$, whereas smoothing with a running filter of size $m$ reduces the standard deviation of noise by a factor $\sim m^{-1/2}$, thus increasing $\zeta$ by a factor $m^{1/2}$. Consequently, a simple way to increase the tnr by a factor $m^{3/2}$, while keeping the noise uncorrelated, consists of 1) smoothing out the profile with a running filter of size $m$ and 2) undersampling the smoothed profile by the same factor $m$.

The size of the running window (undersampling factor) must be chosen to optimize the procedure. It is expected that, above a certain tnr value, the minimum detectable size of overturn reaches 2 bins. A further downsampling is useless because a degradation of the vertical resolution will only lead to an increase of the smallest size of the detectable overturn. Figure 9 shows the size of the detectable overturn (m) versus the tnr *ζ* of the subsampled data for various initial (raw data) tnr values. An optimum is found for 3 ≤ *ζ* ≤ 5.

Based on this empirical result, the HR potential temperature profile has been smoothed out with a 9-point Hamming window and resampled at a vertical step of nine bins (0.9 m). This procedure increased *ζ*_{HR} by a factor ∼9^{3/2} = 27, that is, *ζ*_{HR} ≈ 4 (Fig. 10a).
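The gain in tnr from smoothing and undersampling can be checked numerically. The sketch below swaps in a plain boxcar average over non-overlapping windows instead of the Hamming window used for the data, so it illustrates the $m^{3/2}$ scaling rather than reproducing the actual processing. The trend of 0.14 per bin with unit noise mimics the bulk tnr of the HR data. The seed and function names are arbitrary.

```python
import math
import random

def smooth_and_undersample(x, m):
    """Boxcar average over non-overlapping windows of m bins (smoothing + undersampling)."""
    return [sum(x[k:k + m]) / m for k in range(0, len(x) - m + 1, m)]

def std(values):
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

m = 9
rng = random.Random(2)
trend = [0.14 * i for i in range(9000)]            # tau = 0.14 per bin
noise = [rng.gauss(0.0, 1.0) for _ in range(9000)]  # sigma_N = 1

coarse_trend = smooth_and_undersample(trend, m)
coarse_noise = smooth_and_undersample(noise, m)

tau_gain = (coarse_trend[1] - coarse_trend[0]) / (trend[1] - trend[0])
noise_gain = std(coarse_noise) / std(noise)
tnr_gain = tau_gain / noise_gain
print(f"trend x{tau_gain:.1f}, noise x{noise_gain:.2f}, tnr x{tnr_gain:.1f}")
```

Because the windows do not overlap, the averaged noise values remain independent from one coarse bin to the next, which is the property the selection test relies on.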

The selected turbulent overturns from the degraded HR data are shown in Fig. 10d. The contrast with the results obtained from the LR data (Fig. 8) is striking: the turbulent fraction is now 47% (27% from the LR data). A large fraction of small size overturns are selected from the HR data and Thorpe lengths are also much smaller, ranging from ∼5 to 140 m. Although not surprising in regions where inversions are not detected from the LR profile (because the resolution of the degraded HR data is about 6 times better), overturn sizes and Thorpe lengths are also observed to be smaller in the 5–6.5-km altitude domain, where overturns are detected from both datasets. One also observes that the thick turbulent patch selected from LR data (from 10.1 to 11.2 km) is now split into several patches, the thicker one having about 300-m depth. However, the Thorpe lengths for this large overturn are similar from both datasets.

The large differences in the size of the selected overturns are very likely a consequence of the poor precision in the evaluation of their vertical extent for low tnr data. The tnr of LR data is so small within the 10–11-km altitude domain, *ζ*_{LR} ∼ 10^{−1}, that a widening effect due to the large probability of noise-induced permutations is likely to occur (see appendix A and Fig. A2). For instance, with *ζ* = 0.1 the probability for a permutation due to noise between adjacent bins is 0.46 and remains as large as 0.3 for two bins five steps apart (Fig. A2).

## 5. Summary and conclusions

The present work was motivated by the need for a robust procedure, applicable to Thorpe analyses, for distinguishing real overturns from inversions produced by instrumental noise. For this purpose, we built a hypothesis test based on the statistics of the data range. The theory of order statistics was presented by David (1981), and the fundamentals of the range statistics are recalled in appendix B.

The procedure for selecting overturns consists of three steps:

- We first estimate the bulk tnr [Eqs. (1) and (2)], which gives an indication of the minimum size (in number of bins) of the detectable overturns [Eq. (6) and Fig. 5]. If this size is too large, a preliminary denoising procedure is required (section 4c).
- The Thorpe displacement profile is constructed. The noise-induced and real inversions are detected within the profile from the cumulative sum of the Thorpe displacements.
- An inversion is selected as an overturn if its range exceeds a large prescribed percentile of the range of a noise sample of the same size [Eq. (7)]. For practical purposes, we tabulated the moments, as well as various percentiles, of the range for normally distributed variables as a function of the sample size [Table 2 and an ASCII file (available online at ftp://ftp.aero.jussieu.fr/pub/os/WN.txt)].
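
The last two steps can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the results above. The names `find_inversions`, `select_overturns`, `sigma_n`, and `critical_range` are ours, and the argument `critical_range` is a hypothetical callable standing for a lookup of the tabulated range percentile for a sample of *n* bins. The sketch assumes that the test compares the raw data range within the inversion with the noise standard deviation multiplied by that percentile; the exact form of Eq. (7) should be checked before any real use.

```python
def find_inversions(theta):
    """Return (start, end) bin indices of inversions in a profile.

    Displacements are expressed in bins, as a function of the sorted position.
    Their cumulative sum is zero outside inversions and nonzero inside.
    """
    n = len(theta)
    # order[k] is the original index of the k-th smallest value (stable sort).
    order = sorted(range(n), key=lambda i: theta[i])
    inversions = []
    cumulative = 0
    start = None
    for k in range(n):
        cumulative += order[k] - k  # Thorpe displacement in bins
        if cumulative != 0 and start is None:
            start = k  # an inversion begins
        elif cumulative == 0 and start is not None:
            inversions.append((start, k))  # the inversion closes here
            start = None
    return inversions


def select_overturns(theta, sigma_n, critical_range):
    """Keep inversions whose data range exceeds the scaled noise percentile.

    sigma_n: noise standard deviation (assumed known).
    critical_range: hypothetical callable n -> percentile of the range of
    n iid N(0, 1) variables (e.g., read from a table).
    """
    selected = []
    for start, end in find_inversions(theta):
        segment = theta[start:end + 1]
        n = end - start + 1
        if max(segment) - min(segment) > sigma_n * critical_range(n):
            selected.append((start, end))
    return selected


if __name__ == "__main__":
    profile = [0.0, 1.0, 2.0, 5.0, 4.0, 3.0, 6.0, 7.0]
    print(find_inversions(profile))
    print(select_overturns(profile, sigma_n=0.1, critical_range=lambda n: 5.0))
```

Passing the percentile lookup as a callable keeps the sketch independent of how the table is stored; the constant used in the usage lines is arbitrary and serves only to exercise the code.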

The statistical test was validated through Monte Carlo simulations for various tnr values and inversion sizes and by assuming stirred and nonstirred inversions. The probability for an erroneous selection (type I error) is limited by a prescribed significance level *α*, usually 0.05 or 0.01. The power of the test, that is, the percentage of correctly detected overturns, is found to be an increasing function of both the tnr and the size of the inversions (Fig. 5).

The denoising procedure is based on the degradation of the vertical resolution of the original data. It aims at increasing the tnr while preserving the independence of the noise between data bins. The tnr of the degraded data increases by a factor *m*^{3/2}, where *m* is the undersampling factor (i.e., the number of bins of the running filter). The parameter *m* is constrained by a trade-off between the desired detectable size of overturns and the reliability of the selection test (quantified through its power).
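
The effect of the degradation on the trend can be illustrated with a minimal sketch. It assumes a plain average over non-overlapping blocks of *m* bins, used here as a simple stand-in for the Hamming window applied to the HR data; the function names `degrade` and `tnr_gain` are ours. Because the blocks do not overlap, the noise of the resampled bins remains independent.

```python
def degrade(profile, m):
    """Average non-overlapping blocks of m bins (boxcar stand-in filter).

    Trailing bins that do not fill a complete block are dropped.
    """
    n_blocks = len(profile) // m
    return [sum(profile[j * m:(j + 1) * m]) / m for j in range(n_blocks)]


def tnr_gain(m):
    """Expected tnr gain m**(3/2): trend per bin x m, noise std / sqrt(m)."""
    return m ** 1.5


if __name__ == "__main__":
    # A noise-free linear profile with a trend of 1 per bin.
    raw = [float(i) for i in range(18)]
    coarse = degrade(raw, 9)
    print(coarse[1] - coarse[0])   # trend per resampled bin
    print(0.14 * tnr_gain(9))      # degraded tnr for a raw tnr of 0.14
```

For uniform weights the noise standard deviation of each block mean is reduced by √*m*; for a tapered window the reduction differs somewhat, so the *m*^{3/2} factor should be read as an approximation.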

To illustrate the method, the selection procedure was applied to two datasets obtained simultaneously during a single balloon flight: 1) a low-resolution (LR) profile with vertical resolution *δz* = 6 m and a bulk tnr *ζ*_{LR} ∼ 0.9, and 2) a high-resolution (HR) profile with *δz* = 0.1 m and *ζ*_{HR} ∼ 0.14. From the LR dataset, the size of detectable overturns with a significance level *α* = 0.01 is *n* ∼ 6 bins, that is, 30 m (Fig. 5). For such a size, the probability of overturn detection, that is, the power of the test, is about 0.5. A power of 0.95 can be obtained by increasing the detectable size up to 10 bins (54 m). For the HR profile, with the same significance level and 0.95 power, the detectable overturn size is *n* ∼ 50 (5 m).

Clearly, overturns of a smaller size can be detected from the HR profile if a denoising procedure is applied. By degrading the vertical resolution to 0.9 m (*m* = 9), the bulk tnr is increased to *ζ*_{HR} ≈ 4 (Fig. 10a).

Major differences in the size distribution of the selected overturns are observed between these two datasets (Figs. 8 and 10). Not surprisingly, smaller overturns are selected from the HR profile: the selected turbulent fraction is almost twice as large (45% versus 27%), and most of the HR-selected overturns are thinner than 100 m. The results demonstrate the importance of HR measurements with high tnr for determining the turbulent fraction of the atmosphere. The difference in the size of the overturns selected within the same altitude domain from both datasets is likely a consequence of the poor determination of the boundaries of the overturns (defined as isolated layers) for low tnr conditions (*ζ*_{LR} ≪ 1 where overturns are detected). This effect is also visible in the Monte Carlo simulations; see, for instance, the relative widening of the detected overturns for low tnr conditions (Table 1). Consequently, the size distribution of the selected overturns appears to be very sensitive to the tnr of the observations.

The selection procedure described above provides, at least theoretically, a way to analyze large datasets to describe the turbulence climatology in the atmosphere or in the ocean, as suggested by Clayson and Kantha (2008). However, the data resolution and the noise level, which are combined in the tnr, are key issues when applying a Thorpe analysis. A major conclusion of this work is that the data tnr determines the minimum size of the overturns detectable by a Thorpe analysis. Clearly, the detected turbulent fraction, as well as the size distribution of the selected overturns, depends heavily on this parameter.

## Acknowledgments

The authors wish to acknowledge Centre National d’Études Spatiales (CNES) for operating the balloons in Japan and the Research Institute for Sustainable Humanosphere (RISH) for their contribution in the MUTSI campaign. The authors are also grateful to two anonymous reviewers for helpful comments on the manuscript.

## REFERENCES

Alford, M. H., and R. Pinkel, 2000: Observation of overturning in the thermocline: The context of ocean mixing. *J. Phys. Oceanogr.*, **30**, 805–832.

Alisse, J-R., and C. Sidi, 2000: Experimental probability density functions of small-scale fluctuations in the stably stratified atmosphere. *J. Fluid Mech.*, **402**, 137–162.

Clayson, C. A., and L. Kantha, 2008: On turbulence and mixing in the free atmosphere inferred from high-resolution soundings. *J. Atmos. Oceanic Technol.*, **25**, 833–852.

Coulman, C., J. Vernin, and A. Fuchs, 1995: Optical seeing—Mechanism of formation of thin turbulent laminae in the atmosphere. *Appl. Opt.*, **34**, 5461–5474.

Dalaudier, F., C. Sidi, M. Crochet, and J. Vernin, 1994: Direct evidence of sheets in the atmospheric temperature field. *J. Atmos. Sci.*, **51**, 237–248.

David, H. A., 1981: *Order Statistics*. John Wiley & Sons, 360 pp.

Dillon, T. M., 1982: Vertical overturns: A comparison of Thorpe and Ozmidov length scales. *J. Geophys. Res.*, **87**, 9601–9613.

Dillon, T. M., 1984: The energetics of overturning structures: Implications for the theory of fossil turbulence. *J. Phys. Oceanogr.*, **14**, 541–549.

Ferron, B., H. Mercier, K. Speer, A. Gargett, and K. Polzin, 1998: Mixing in the Romanche fracture zone. *J. Phys. Oceanogr.*, **28**, 1929–1945.

Freeman, H., 1963: Testing statistical hypotheses. *Introduction to Statistical Inference*, Addison-Wesley Publishing Company, 283–305.

Galbraith, P. S., and D. E. Kelley, 1996: Identifying overturns in CTD profiles. *J. Atmos. Oceanic Technol.*, **13**, 688–702.

Gargett, A. E., and T. Garner, 2008: Determining Thorpe scales from ship-lowered CTD density profiles. *J. Atmos. Oceanic Technol.*, **25**, 1657–1670.

Gavrilov, N. M., H. Luce, M. Crochet, F. Dalaudier, and S. Fukao, 2005: Turbulence parameter estimation from high-resolution balloon temperature measurements of the MUTSI-2000 campaign. *Ann. Geophys.*, **23**, 2401–2413.

Johnson, H. L., and C. Garrett, 2004: Effect of noise on Thorpe scales and run lengths. *J. Phys. Oceanogr.*, **34**, 2359–2372.

Luce, H., S. Fukao, F. Dalaudier, and M. Crochet, 2002: Strong mixing events observed near the tropopause with the MU radar and high-resolution balloon techniques. *J. Atmos. Sci.*, **59**, 2885–2896.

Ozmidov, R. V., 1965: On the turbulent exchange in a stably stratified ocean. *Atmos. Ocean Phys.*, **1**, 861–871.

Piera, J., E. Roget, and J. Catalan, 2002: Turbulent patch identification in microstructure profiles: A method based on wavelet denoising and Thorpe displacement analysis. *J. Atmos. Oceanic Technol.*, **19**, 1390–1402.

Robinson-Cox, J., 1992: Tables of order statistics of normal random variables under linear trend. *Commun. Stat. Theory Methods*, **21**, 3497–3520.

St. Laurent, L., and R. W. Schmitt, 1999: The contribution of salt fingers to vertical mixing in the North Atlantic tracer release experiment. *J. Phys. Oceanogr.*, **29**, 1404–1424.

Thorpe, S. A., 1977: Turbulence and mixing in a Scottish loch. *Philos. Trans. Roy. Soc. London*, **286A**, 125–181.

Timmermans, M-L., C. Garrett, and E. Carmack, 2003: The thermohaline structure and evolution of the deep waters in the Canada basin, Arctic Ocean. *Deep-Sea Res. I*, **50**, 1305–1321.

Woods, J. D., 1968: Wave-induced shear instability in the summer thermocline. *J. Fluid Mech.*, **32**, 791–800.

## APPENDIX A

### Probability for a Noise-Induced Permutation

Consider a sample {*X*_{1}, *X*_{2}, … , *X*_{n}} consisting of a random noise superimposed on a linear trend. The *i*th bin can be expressed as

$$X_i = s_i + B_i = i\tau + B_i ,$$

where *τ* is the trend (*τ* = *s*_{i+1} − *s*_{i}), *s*_{i} is the deterministic signal, and *B*_{i} is the noise for the *i*th bin. We assume a normally distributed noise [*B*_{i} ∼ *N*(0, *σ*_{N})] and *τ* ≥ 0. The random variables *X*_{i} are independent, *X*_{i} ∼ *N*(*iτ*, *σ*_{N}), with cumulative distribution function (cdf) *F*_{i}.

An inversion between two adjacent bins occurs if *X*_{i+1} < *X*_{i}. Let *p*_{1} be the probability of such an inversion, that is, *p*_{1} = Pr[*X*_{i+1} < *X*_{i}]. Starting from the conditional probability Pr[*X*_{i+1} < *x* | *X*_{i} = *x*], *p*_{1} reads

$$p_1 = \int_{-\infty}^{+\infty} \Pr[X_{i+1} < x \mid X_i = x]\; d\Pr[X_i = x] .$$

As *X*_{i} = *iτ* + *B*_{i}, *d*Pr[*X*_{i} = *x*] = *d*Pr[*B*_{i} = *b*] = *f*(*b*) *db*, where *f* is the probability density function (pdf) of the noise (Fig. A1). It then follows that

$$p_1 = \int_{-\infty}^{+\infty} \Pr[B_{i+1} < b - \tau \mid B_i = b]\, f(b)\, db .$$

The *B*_{i} are independent. Consequently,

$$\Pr[B_{i+1} < b - \tau \mid B_i = b] = \Pr[B_{i+1} < b - \tau] = F(b - \tau) ,$$

where *F* is the cdf of the noise. Finally, we get

$$p_1 = \int_{-\infty}^{+\infty} F(b - \tau)\, f(b)\, db .$$

Here, *F*(*b* − *τ*) is the conditional probability for an inversion of bins *i* and *i* + 1, given *B*_{i} = *b*.

For a Gaussian noise, *F* reads

$$F(b) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{b}{\sigma_N\sqrt{2}}\right)\right] ,$$

so that

$$p_1 = \frac{1}{2}\left[1 - \operatorname{erf}\!\left(\frac{\zeta}{2}\right)\right] ,$$

where *ζ* = *τ*/*σ*_{N} is the trend-to-noise ratio (tnr) (Fig. A2). The probability *p*_{1} thus depends on the tnr *ζ* only. The probability for a permutation between two bins *k* steps apart reads

$$p_k = \frac{1}{2}\left[1 - \operatorname{erf}\!\left(\frac{k\zeta}{2}\right)\right] .$$

For *ζ* = 0, *p*_{k} = ½ for all *k*: the probability for a permutation with any bin is ½. If *ζ* ≠ 0, the probability is a simple function of the tnr *ζ* and of the distance between bins expressed as a number of steps *k*. As the probability for two data bins to permute is a decreasing function of tnr, the run lengths due to noise must also decrease with tnr.
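
The dependence of the permutation probability on *ζ* and *k* can be checked numerically. The sketch below is illustrative only: it assumes independent Gaussian noise of standard deviation *σ*_{N} on each bin, for which the difference between two bins *k* steps apart is normally distributed with mean *kτ* and standard deviation *σ*_{N}√2, giving the closed form *p*_{k} = ½[1 − erf(*kζ*/2)]. The function names are ours.

```python
import math
import random


def permutation_probability(zeta, k=1):
    """Closed form p_k for independent Gaussian noise (assumed model)."""
    return 0.5 * (1.0 - math.erf(k * zeta / 2.0))


def simulate_permutation_probability(zeta, k=1, trials=200_000, seed=0):
    """Monte Carlo estimate of Pr[X_{i+k} < X_i] with unit noise std."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        x_i = rng.gauss(0.0, 1.0)              # bin i, trend removed
        x_ik = k * zeta + rng.gauss(0.0, 1.0)  # bin i + k
        if x_ik < x_i:
            count += 1
    return count / trials


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, permutation_probability(0.5, k),
              simulate_permutation_probability(0.5, k))
```

The simulation uses a fixed seed so that the comparison is reproducible; the agreement between the two columns is expected only within the Monte Carlo sampling error.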

## APPENDIX B

### Order Statistics

In this appendix, we recall the basic results of the statistics of range, as described by David (1981). Sorting the sample {*X*_{1}, *X*_{2}, … , *X*_{n}} gives {*X*_{1:n}, *X*_{2:n}, … , *X*_{n:n}}, where *X*_{i:n} is defined as the *i*th order statistic (*i* = 1, 2, … , *n*).

#### Independent identically distributed (iid) variables

We first assume that the *n* variables are iid (i.e., *τ* = 0). Let *f* and *F* be the pdf and the cdf of the sample, respectively. The cdf of the *r*th sorted variable, *F*_{r:n}, is obtained by noting that *F*_{r:n}(*x*) is the probability that at least *r* of the *X*_{i} are less than or equal to *x*, that is,

$$F_{r:n}(x) = \sum_{i=r}^{n} \binom{n}{i}\, F(x)^{i}\, [1 - F(x)]^{n-i} .$$

The pdf is deduced by differentiation:

$$f_{r:n}(x) = \frac{n!}{(r-1)!\,(n-r)!}\, F(x)^{r-1}\, [1 - F(x)]^{n-r}\, f(x) .$$

The functions *f*_{r:n} and *F*_{r:n} can easily be numerically calculated if one specifies the probability distribution of the sample.

The *n* range *W*(*n*) is defined as the difference between the maximum and minimum of the values in the *n* sample, that is, *W*(*n*) = *X*_{n:n} − *X*_{1:n}. The cdf of the range of *n* iid variables is derived by noting that (David 1981)

$$n\, f(x)\, dx\, [F(x + w) - F(x)]^{n-1}$$

is the probability, given *x*, that one of the *X*_{i} falls into the interval (*x*, *x* + *dx*) and all of the *n* − 1 remaining *X*_{i} fall into (*x*, *x* + *w*). The resulting cdf *F*_{W}(*w*) is obtained by integrating over *x*:

$$F_W(w) = n \int_{-\infty}^{+\infty} f(x)\, [F(x + w) - F(x)]^{n-1}\, dx .$$

The following pdf *f*_{W} is deduced:

$$f_W(w) = n(n-1) \int_{-\infty}^{+\infty} f(x)\, f(x + w)\, [F(x + w) - F(x)]^{n-2}\, dx .$$

Figure 3 shows the pdf of the *n* range for a normally distributed parent population for *n* = 10, 20, and 50. We used this last expression to numerically compute the various percentiles of the range for a normally distributed parent population (Table 2).
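
As an illustration of such a numerical computation, the following sketch evaluates the cdf of the range of *n* iid *N*(0, 1) variables, *F*_{W}(*w*) = *n* ∫ *f*(*x*)[*F*(*x* + *w*) − *F*(*x*)]^{n−1} *dx*, with a trapezoidal rule and inverts it by bisection. It works from the cdf rather than the pdf, and it is not the code used to produce Table 2; the function names, the integration bounds, and the number of steps are our assumptions.

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def range_cdf(w, n, x_max=8.0, steps=2000):
    """cdf of the range of n iid N(0, 1) variables (trapezoidal rule)."""
    if w <= 0.0:
        return 0.0
    h = 2.0 * x_max / steps
    total = 0.0
    for j in range(steps + 1):
        x = -x_max + j * h
        weight = 0.5 if j in (0, steps) else 1.0  # trapezoid end points
        inner = normal_cdf(x + w) - normal_cdf(x)
        total += weight * normal_pdf(x) * inner ** (n - 1)
    return n * h * total


def range_percentile(p, n, tol=1e-6):
    """Smallest w (to within tol) such that range_cdf(w, n) >= p."""
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (10, 20, 50):
        print(n, range_percentile(0.95, n), range_percentile(0.99, n))
```

For *n* = 2 the range is the absolute difference of two normal variables, so that *F*_{W}(*w*) = erf(*w*/2); this case provides a convenient check of the integration.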

#### Non-iid variables

If the *i*th variable has cdf *F*_{i}, the cdf of the *r*th sorted element reads (e.g., David 1981)

$$F_{r:n}(x) = \sum_{i=r}^{n} \sum_{S_i} \prod_{l=1}^{i} F_{j_l}(x) \prod_{l=i+1}^{n} \left[1 - F_{j_l}(x)\right] ,$$

where the summation *S*_{i} extends over all permutations *j*_{1}, *j*_{2}, … , *j*_{n} of (1, 2, … , *n*), for which *j*_{1} < *j*_{2} < · · · < *j*_{i} and *j*_{i+1} < · · · < *j*_{n}. The ensuing complications are considerable. However, this cdf, and the moments, can be numerically evaluated for a limited number of variables (say, *n* ≤ 20) without great difficulty if one assumes a linear trend (e.g., Robinson-Cox 1992). The moments of the *n* range are even functions of the trend *τ*. For any trend (positive or negative), the *n*-range expectation is larger than in the case of iid variables (i.e., the null-trend case).
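
These two properties of the *n*-range expectation can be illustrated with a short Monte Carlo sketch. It is a simple simulation under the assumption of independent Gaussian noise added to a linear trend, not an evaluation of the exact cdf; the function name `mean_range` and the default parameters are ours.

```python
import random


def mean_range(n, tau, sigma=1.0, trials=5000, seed=1):
    """Monte Carlo estimate of E[W(n)] for X_i = i * tau + Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [i * tau + rng.gauss(0.0, sigma) for i in range(n)]
        total += max(sample) - min(sample)
    return total / trials


if __name__ == "__main__":
    for tau in (-0.5, 0.0, 0.5):
        print(tau, mean_range(10, tau))
```

With a fixed seed the estimates are reproducible. The values for opposite trends should agree within the sampling error, and both should exceed the null-trend value.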

Table 1. Results of the Monte Carlo simulations. Here, *L*_{T} and inversion widths are expressed in bin units. L1, L2, and L3 designate the solid-body, noise-induced, and sine inversions, respectively. The percentage of selected overturns (PSO) shows the power of the test for L1 and L3 overturns.

Table 2. Moments and percentiles of the range for *n* iid variables with a normal *N*(0, 1) pdf.