## Abstract

The statistical properties of turbulence at upper levels in the atmosphere [upper troposphere and lower stratosphere (UTLS)] are still not well known, partly because of the lack of adequate routine observations. This is despite the obvious benefit that such observations would have for alerting aircraft of potentially hazardous conditions, either in real time or for route planning. To address this deficiency, a research project sponsored by the Federal Aviation Administration has developed a software package that automatically estimates and reports atmospheric turbulence intensity levels (as EDR ≡ *ε*^{1/3}, where *ε* is the energy or eddy dissipation rate). The package has been tested and evaluated on commercial aircraft. The amount of turbulence data gathered from these in situ reports is unprecedented. As of January 2014, there are ~200 aircraft outfitted with this system, contributing to over 137 million archived records of EDR values through 2013, most of which were taken at cruise levels of commercial aircraft, that is, in the UTLS. In this paper, techniques used for estimating EDR are outlined and comparisons with pilot reports from the same or nearby aircraft are presented. These reports allow calibration of EDR in terms of traditionally reported intensity categories (“light,” “moderate,” or “severe”). The results of some statistical analyses of EDR values are also presented. These analyses are restricted to the United States for now, but, as this program is expanded to international carriers, such data will begin to become available over other areas of the globe.

## 1. Introduction

Commercial and general-aviation aircraft continue to encounter unexpected turbulence that requires immediate changes in flight paths or is hazardous to the aircraft and occupants. For commercial air carriers, turbulence is by far the leading cause of occupant injuries (e.g., Bass 2002; Tvaryanas 2003). As a consequence, the costs to airlines that result from turbulence encounters for injuries (medical attention and liability suits), cabin and aircraft damage, time lost to inspection and maintenance, and impacts of delays are substantial. In addition, turbulence encounters continue to lead to the public perception that air travel can be unpleasant and even unsafe.

More numerous quantitative turbulence observations could allow construction of reliable turbulence maps within the national airspace, which in turn would provide better tactical avoidance options and data that can be used to produce better turbulence forecasts for strategic avoidance (e.g., Bass et al. 2001; Bass 2002). Traditionally, the only routine observations of turbulence have been those provided verbally by pilots [known as pilot reports (PIREP) in the United States and as air reports (AIREP) internationally], but these can have substantial errors in reported intensity, position, and time (Schwartz 1996; Sharman et al. 2006; Bass 2002). In the United States, PIREPs categorize turbulence on perceived intensity scales of “smooth” (also referred to as “null,” “nil,” or “neg”), “light,” “moderate,” “severe,” or “extreme” (Federal Aviation Administration 2012, their Table 7-1-9). Although formal definitions of these severity categories are provided in terms of normal accelerations or airspeed fluctuations, in practice they are both subjective (based on aircrew interpretation) and aircraft dependent, making them ill suited for providing reliable and consistent maps of atmospheric turbulence levels.

To address these deficiencies, an in situ turbulence-reporting algorithm (Cornman et al. 1995, 2004) has been implemented on some U.S. commercial air carriers (currently about 200 aircraft). The algorithm estimates atmospheric turbulence intensity as inferred from the cube root of the energy or eddy dissipation rate (here termed “EDR” ≡ *ε*^{1/3}; m^{2/3} s^{−1}) as based on estimated vertical wind velocity *w* or aircraft vertical acceleration. The mean and peak EDR encountered over each 1 min of flight are automatically downlinked at user-specified intervals, and their reported positions and times are accurate to within at least 10 km and 1 min; these accuracies are much better than those of typical PIREPs, making the reports ideal for operational purposes (Emanuel et al. 2013), turbulence statistical analyses [especially in the upper troposphere and lower stratosphere (UTLS)], and turbulence case studies (e.g., Lane et al. 2012; Sharman et al. 2012b; Williams 2014). From an aviation turbulence forecasting/nowcasting perspective, an aircraft-independent metric such as EDR is preferred, since the forecasts should be an atmospheric quantity and independent of aircraft type.

The use of EDR as a metric of turbulence intensity was originally suggested by MacCready (1964) and is particularly useful operationally, since *ε*^{1/3} is proportional to the root-mean-square (RMS) vertical acceleration experienced by an aircraft for specific flight conditions (MacCready 1964; Cornman et al. 1995). An alternative reporting method for atmospheric turbulence intensity is the maximum derived equivalent vertical gust velocity (DEVG) or *U*_{de} = 10 × DEVG (e.g., Sherman 1985; Gill 2014), currently implemented on some Qantas and British Airways Boeing Company (B) 747-400s. Although DEVG is claimed to be an aircraft-independent metric and can be roughly correlated with EDR (Stickland 1998a), it is not a true atmospheric turbulence intensity metric and is therefore less useful than EDR for atmospheric turbulence forecasting applications (Stickland 1998a,b). Further, MacCready (1964) recommends EDR as the turbulence reporting “standard” in part because of its relative simplicity. For these reasons EDR has become the atmospheric turbulence metric required by the International Civil Aviation Organization (ICAO) (ICAO 2001, available from the authors in electronic form; ICAO 2010) for routine turbulence reporting and will likely become the Aircraft Meteorological Data Relay (AMDAR) standard for turbulence reporting as well.

The EDR estimation algorithms to be presented here have been implemented on the entire fleets of United Airlines (UAL) B737-300 and B757-200 aircraft (denoted here as UAL737s and UAL757s, respectively), and Delta Air Lines (DAL) B737-700/-800 aircraft (denoted as DAL737s). Both estimation algorithms report EDR, but the algorithm deployed on the UAL aircraft utilizes aircraft accelerometer data along with an aircraft response model, whereas the newer algorithm deployed on DAL aircraft utilizes aircraft-calculated vertical winds in place of accelerometer data, obviating the need for an aircraft response function model, and includes more sophisticated onboard quality-control algorithms. Figure 1 shows the number of in situ EDR reports received in 2012 for flight levels (FLs) above 20 000 ft (denoted as FL200 and ≈6.1 km) from the UAL and DAL deployments. As can be seen, the UAL757 data are primarily from routes between the northeastern and southwestern United States, whereas the DAL737 routes also include the Southeast and thus provide coverage that is somewhat more comprehensive. The UAL757 reports are provided routinely once per minute, whereas the DAL737 reports consist of 15-min routine reports plus “triggered” reports when the estimated EDR value exceeds some predetermined threshold. Thus, the number of DAL737 reports from above FL200 received in that year was lower (1 979 438) than the number of UAL reports received (13 183 573) from above that level. Still, the number of combined reports is unprecedented. For comparison, the total number of turbulence PIREPs recorded through the National Oceanic and Atmospheric Administration (NOAA) Family of Services (FOS) at all flight levels of all intensities for all aircraft (including general aviation) for 2012 was 1 049 660.

The in situ EDR algorithm was originally deployed on the entire UAL B737-300 and B757-200 fleets in 2001. The B737-300 fleet was gradually retired; the last B737-300 was retired in October of 2009. The DAL B737-700/-800 aircraft were equipped with the vertical winds–based algorithm beginning in August of 2008. The algorithm is also currently being implemented on DAL 767-300/-300ER/-400ERs, but fleetwide implementation was not complete at this writing (about 90 aircraft are expected once it is). In the analyses presented here, an archive of in situ EDR data is used that includes 128.3 million UAL757 reports from 2004 to 2013 above FL200 and 9.6 million DAL737 reports from 1 October 2008 to 2013. UAL757 in situ EDR reports prior to 2004 and those from several UAL737 aircraft had quality-control issues and so were not used in these analyses. Because the DAL767 data are relatively recent and fleetwide implementation had still not been completed by the end of 2013, those data were not used either.

In this paper a brief description of the in situ EDR estimation algorithms is provided in section 2. Section 3 provides some results of comparisons of PIREPs with EDR that were performed to provide calibration of EDR for specific aircraft types to turbulence intensities traditionally reported by pilots. Section 4 provides statistical analyses of the EDR data that allow some assessments of turbulence occurrence and distributions, especially at cruise altitudes (i.e., in the UTLS). Section 5 provides a summary and conclusions.

## 2. Description of the EDR estimation algorithms

Aircraft response to atmospheric turbulence depends on aircraft size, weight, cruise speed, altitude, attitude, and aerodynamics, and only a certain range of wavelengths of turbulent eddies is felt by aircraft as “bumpiness.” For most commercial aircraft this size range is from approximately 10 m to 1 km (MacCready 1964; Vinnichenko et al. 1980; Hoblit 1988). The response is far more sensitive to vertical gusts than to longitudinal or lateral gusts (Hoblit 1988), however, and therefore vertical-gust estimates are most appropriate for aircraft applications. Since most of the energy responsible for aircraft bumpiness is in the so-called inertial range (e.g., MacCready 1964), it is important that any turbulence estimation algorithm be based on an underlying turbulence model that provides a reasonable representation of turbulence on these scales. Faster-moving aircraft will also be responsive to larger scales that are typically outside the inertial subrange (MacCready 1964). One model that includes representations of both the inertial subrange and the larger scales beyond it and that has been used extensively by both the aerodynamics and meteorological communities is the von Kármán spectral model (e.g., Hinze 1959; Hoblit 1988; Murrow 1987; Murrow et al. 1982; Founda et al. 1997; Kristensen and Lenschow 1987).

The EDR estimation algorithms used in this study incorporate the von Kármán spectral model and either 1) the measured vertical accelerations together with a model of the aircraft response as implemented on UAL757 aircraft, or 2) a vertical winds–based algorithm currently implemented on DAL737s. Only a brief summary of the accelerometer-based method is provided below because it is described in detail in Cornman et al. (1995). A description of the vertical winds–based method is not currently available, however, and therefore a more detailed presentation is provided here. In either case, the algorithm is distributed as a software package that can be loaded on the Aircraft Condition and Monitoring System or other suitable onboard computer. No modifications to aircraft hardware are required.

### a. The von Kármán wind model

For vertical gusts, which are mainly normal to the direction of flight, the transverse velocity spectrum is used. In its most general form this is given by the Fourier transform of the two-point transverse correlation function *B*_{NN} (e.g., Monin and Yaglom 1975, p. 43):

$$F_N(k) = \frac{2}{\pi}\int_0^{\infty} B_{NN}(r)\cos(kr)\,dr, \tag{1}$$

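where *k* is the horizontal wavenumber. For an incompressible isotropic turbulent velocity field, the von Kármán spectral model can be written (von Kármán 1948; Hinze 1959; Frehlich et al. 2001) as

$$B_{NN}(r) = \sigma_w^2\,\frac{2^{2/3}}{\Gamma(1/3)}\left(\frac{r}{L}\right)^{1/3}\left[K_{1/3}\!\left(\frac{r}{L}\right) - \frac{r}{2L}\,K_{2/3}\!\left(\frac{r}{L}\right)\right], \tag{2}$$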

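where *r* is the separation distance, *σ*_{w}^{2} is the variance of the vertical velocity component, *L* is a length scale, Γ is the gamma function, and *K*_{ν} is the modified Bessel function of the second kind of order *ν*. Then, from Eq. (1),

$$F_N(k) = \frac{\sigma_w^2 L_i}{\pi}\,\frac{1 + \tfrac{8}{3}(kL)^2}{\left[1 + (kL)^2\right]^{11/6}}, \tag{3}$$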

where the longitudinal integral length scale *L*_{i} is

$$L_i = \frac{\sqrt{\pi}\,\Gamma(5/6)}{\Gamma(1/3)}\,L \approx 0.747\,L. \tag{4}$$

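The vertical velocity variance can be written in terms of *ε* as (e.g., Mann 1994; Cornman et al. 1995)

$$\sigma_w^2 = \frac{9\sqrt{\pi}\,\Gamma(1/3)}{55\,\Gamma(5/6)}\,\alpha\,\varepsilon^{2/3}L^{2/3}, \tag{5}$$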

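where *α* is an empirical constant estimated to be 1.5–1.7 (Kristensen and Lenschow 1987; Mann 1994; Cornman et al. 1995); the current algorithms use *α* = 1.6. Then, in terms of *ε*^{2/3}, Eq. (3) becomes (Mann 1994, 1998; Kristensen and Lenschow 1987)

$$F_N(k) = \frac{9}{55}\,\alpha\,\varepsilon^{2/3}L^{5/3}\,\frac{1 + \tfrac{8}{3}(kL)^2}{\left[1 + (kL)^2\right]^{11/6}}. \tag{6}$$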

For large wavenumbers (inertial range),

$$F_N(k) \approx \frac{24}{55}\,\alpha\,\varepsilon^{2/3}k^{-5/3}, \tag{7}$$

which is similar to the formulation used by MacCready (1962, 1964).

An example plot of the spectrum from Eq. (6) is shown in Fig. 2 for two values of *ε*^{1/3} [0.1 and 0.3 m^{2/3} s^{−1} corresponding to the ICAO (2001) recommended thresholds of light and moderate, respectively], and three values of *L* (1500, 1000, and 500 m). The plots demonstrate the higher spectral levels for all wavenumbers for larger values of *ε*^{1/3}. For low wavenumbers the transverse spectrum rolls off at the knee, and in that region the spectral levels depend on *L*. For higher wavenumbers the spectra exhibit the *k*^{−5/3} behavior described by Eq. (7) and are independent of *L*. The exact value of *L* cannot be specified in a universal way; it is situation dependent, and so some representative value must be used. Murrow (1987) finds that for high-altitude encounters *L* varies from about 300 to 2000 m depending on the turbulence source. For the EDR algorithms described here a value of *L* = 669 m (corresponding to *L*_{i} = 500 m) is used throughout.
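The behavior described for Fig. 2 can be explored numerically. The following Python sketch evaluates a one-sided von Kármán transverse spectrum and its inertial-range limit; the normalization used (the standard textbook form with prefactor 9/55) is an assumption made for illustration, and the function names are hypothetical rather than taken from the operational software.

```python
import math

ALPHA = 1.6      # empirical constant used by the current algorithms
L_SCALE = 669.0  # von Karman length scale L (m)


def integral_length_scale(L=L_SCALE):
    """Longitudinal integral length scale L_i for a von Karman length scale L."""
    return math.sqrt(math.pi) * math.gamma(5.0 / 6.0) / math.gamma(1.0 / 3.0) * L


def von_karman_transverse(k, edr, L=L_SCALE, alpha=ALPHA):
    """One-sided transverse spectrum F_N(k) in m^3 s^-2 (assumed normalization).

    k is the wavenumber (rad/m) and edr is eps^(1/3) in m^(2/3) s^-1.
    """
    kl2 = (k * L) ** 2
    shape = (1.0 + (8.0 / 3.0) * kl2) / (1.0 + kl2) ** (11.0 / 6.0)
    return (9.0 / 55.0) * alpha * edr ** 2 * L ** (5.0 / 3.0) * shape


def inertial_range(k, edr, alpha=ALPHA):
    """High-wavenumber limit of the spectrum, proportional to k^(-5/3)."""
    return (24.0 / 55.0) * alpha * edr ** 2 * k ** (-5.0 / 3.0)


if __name__ == "__main__":
    print(f"L_i = {integral_length_scale():.1f} m")
    for k in (1e-4, 1e-3, 1e-2, 1e-1):
        full = von_karman_transverse(k, 0.3)
        limit = inertial_range(k, 0.3)
        print(f"k={k:g} rad/m  F_N={full:.4g}  inertial={limit:.4g}")
```

With *L* = 669 m the first function returns an integral length scale close to 500 m, and the two spectra converge at high wavenumbers, where the result no longer depends on *L*.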

For airborne applications, a time series is obtained and temporal spectra are computed. Assuming Taylor’s frozen hypothesis is valid (this assumption is reasonable given the high airspeeds of the aircraft relative to the fluctuation velocities; e.g., Wyngaard and Clifford 1977; Bacmeister et al. 1996; Gardner and Gardner 1993), the temporal vertical wind spectrum *S*_{w}(*f*) is related to the vertical wind spatial spectrum *F*_{N}(*k*) through

$$S_w(f) = \frac{2\pi}{V_T}\,F_N\!\left(k = \frac{2\pi f}{V_T}\right), \tag{8}$$

where *f* is frequency and *V*_{T} is the aircraft true airspeed, assumed to be constant for the spectral computations.
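Under Taylor’s hypothesis the conversion between spatial and temporal spectra is a change of variable, *k* = 2π*f*/*V*_{T}, with a factor 2π/*V*_{T} that preserves variance. The Python sketch below illustrates this mapping; the cruise airspeed of 230 m s^{−1} and the function names are illustrative assumptions.

```python
import math


def wavenumber_from_frequency(f_hz, true_airspeed):
    """Taylor's frozen-turbulence mapping k = 2*pi*f / V_T (rad/m)."""
    return 2.0 * math.pi * f_hz / true_airspeed


def temporal_from_spatial(spatial_spectrum, f_hz, true_airspeed):
    """Convert a spatial spectrum F_N(k) into a temporal spectrum S_w(f).

    The factor 2*pi / V_T preserves variance: S_w(f) df = F_N(k) dk.
    """
    k = wavenumber_from_frequency(f_hz, true_airspeed)
    return (2.0 * math.pi / true_airspeed) * spatial_spectrum(k)


def wavelength_from_frequency(f_hz, true_airspeed):
    """Along-track wavelength (m) sampled at frequency f_hz."""
    return true_airspeed / f_hz


if __name__ == "__main__":
    v_t = 230.0  # hypothetical cruise true airspeed (m/s)
    for f in (0.1, 0.5, 0.8, 3.5):
        print(f"{f} Hz -> {wavelength_from_frequency(f, v_t):.0f} m")
```

Because the mapping is linear in *f*, a *k*^{−5/3} spatial spectrum appears as an *f*^{−5/3} temporal spectrum for any constant airspeed.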

### b. Accelerometer-based EDR estimation method

This EDR estimation technique is described in detail in Cornman et al. (1995). The variance in the aircraft’s vertical acceleration, contained between the frequencies *f*_{l} and *f*_{h}, is given by

where *H*(*f*) is the product of the aircraft vertical acceleration response function and a bandpass filter. An approach to calculating the response function is presented in Cornman et al. (1995). [Note that there is a typographical error in Eq. (26) of that reference that involves the variable *θ*.] If it is assumed that the temporal velocity spectrum can be written as the product of the estimated eddy dissipation rate (to the two-thirds power) and a prescribed function *ϕ*_{w}, that is, *S*_{w}(*f*) = *ε*^{2/3}*ϕ*_{w}(*f*), then the estimated EDR is derived from

where is the bandpass-filtered vertical acceleration standard deviation and *F* is given by

In the current implementation *f*_{l} and *f*_{h} are set to 0.1 and 0.8 Hz, respectively. The algorithm calculates the running RMS of the filtered signal over 10-s windows. At an 8-Hz sampling rate, this provides 480 EDR estimates per 1-min reporting interval, and the median and 90th percentile (referred to as the “peak”) of these estimates are used for communicating the turbulence experience.
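The windowing and summary statistics of the accelerometer-based method can be sketched as follows. This is a minimal Python illustration, not the flight software: `response_gain` is a hypothetical scalar standing in for the aircraft-response term that converts acceleration RMS to EDR, the nearest-rank percentile is an assumed convention, and the handling of window edges is not stated in the text.

```python
import math
from statistics import median

SAMPLE_RATE_HZ = 8
WINDOW_S = 10


def running_rms(samples, window):
    """RMS of each trailing window of `window` samples (full windows only)."""
    if len(samples) < window:
        return []
    out = []
    for end in range(window, len(samples) + 1):
        chunk = samples[end - window:end]
        out.append(math.sqrt(sum(v * v for v in chunk) / window))
    return out


def percentile(values, pct):
    """Nearest-rank percentile (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize_minute(filtered_accel, response_gain):
    """Return (median, 90th percentile) EDR for one block of filtered data.

    `response_gain` is a hypothetical scalar standing in for the
    aircraft-response term that converts acceleration RMS to EDR.
    """
    window = SAMPLE_RATE_HZ * WINDOW_S
    edr = [rms / response_gain for rms in running_rms(filtered_accel, window)]
    return median(edr), percentile(edr, 90)
```

With full trailing windows, a buffer holding the current minute plus the preceding 79 samples yields 480 running-RMS values, one per new sample.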

### c. Vertical wind–based EDR estimation method

In the vertical wind–based EDR estimation method, onboard estimates of the vertical wind are used and then a frequency-domain, single-parameter, maximum-likelihood calculation is used to estimate EDR. The technique was only briefly outlined in Cornman et al. (2004), and so a more detailed description is given here. A similar technique has been used by Chan (2010), but on flight-recorder data.

The vertical wind estimate *w* (positive upward) is computed as the difference between the velocity with respect to the air and the vehicle inertial vertical velocity (IVV) through (e.g., Lenschow 1972)

where *V*_{T} is the true airspeed, *φ* is the roll angle, *θ* is the pitch angle, *β* is the sideslip angle, d*θ*/d*t* is the pitch rate, *α*_{b} is the body-axis angle of attack, and *M* is the longitudinal distance from the angle-of-attack measurement locations to the aircraft center of gravity. For most cases the sideslip angle is small and can be ignored (Haverdings and Chan 2010). In a similar way, in tests using high-rate data, the pitch-rate term does not seem to contribute significantly to the EDR estimates either (even for pitch rates of 4°–5° s^{−1} the maximum contribution of the pitch-rate term to EDR is ~0.03 m^{2/3} s^{−1}, well within operationally useful limits), and therefore Eq. (12) to a good approximation is (Parks et al. 1985; Wingrove and Bach 1994; Haverdings and Chan 2010)

The body-axis angle of attack *α*_{b} is obtained from the left and right vane-measured angles of attack (*α*_{L} and *α*_{R}) through

The calibration constants *a*_{0} and *a*_{1} are estimated in a preprocessing step using a least squares linear fit to several hours of straight and level flight data under nominally smooth conditions, for which *α*_{b} should be approximately equal to *θ*. A similar approach was suggested by Drüe and Heinemann (2013). A major advantage of the winds-based algorithm is that it requires fewer aircraft parameters than accelerometer-based methods for computing EDR: with Eqs. (13) and (14) only six real-time aircraft flight parameters are required, as shown in Table 1. Note that the minimum required sampling frequency for most of these is 4 Hz, but for the implementations to date all flight parameters have been available at 8 Hz.
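The calibration and vertical-wind steps can be illustrated with a short Python sketch. Two assumptions are made here that are not confirmed by the text: the body-axis angle of attack is taken to be linear in the mean of the left and right vane angles, and the vertical wind uses the standard kinematic relation with sideslip and pitch rate neglected. All function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y ~ a0 + a1 * x; returns (a0, a1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    return mean_y - a1 * mean_x, a1


def calibrate_vanes(alpha_left, alpha_right, pitch):
    """Estimate (a0, a1) from smooth, straight-and-level data (angles in rad).

    Assumes the body-axis angle of attack is linear in the mean vane angle
    and equals the pitch angle in smooth, level flight.
    """
    vane_mean = [0.5 * (l + r) for l, r in zip(alpha_left, alpha_right)]
    return fit_line(vane_mean, pitch)


def vertical_wind(v_true, pitch, roll, alpha_body, ivv):
    """Vertical wind (m/s, positive upward), sideslip and pitch rate neglected.

    Inertial vertical velocity minus the vertical velocity relative to the air.
    """
    sin_gamma_air = (math.cos(alpha_body) * math.sin(pitch)
                     - math.sin(alpha_body) * math.cos(roll) * math.cos(pitch))
    return ivv - v_true * sin_gamma_air
```

In wings-level flight the air-relative flight-path angle reduces to *θ* − *α*_{b}, so the sketch returns zero vertical wind when pitch and angle of attack are equal and the inertial vertical velocity is zero.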

To compute the empirical vertical wind power spectrum, the time series of *w* from Eq. (13) is first linearly detrended (using a least squares fit) and (Tukey) windowed and then is Fourier transformed:

where *w*^{dT} is the detrended and windowed vertical wind time series, *f*_{s} is the sampling frequency, *m* = 10*f*_{s} (corresponding to 10 s of data), and *k* = 0, …, 5*f*_{s}. Using the spectral-domain-based maximum-likelihood estimation method (Smalikho 1997), the EDR estimate is

where *k*_{l} and *k*_{h} are the lower and upper index bounds corresponding to the frequencies *f*_{l} and *f*_{h} over which the average is taken (set to 0.5 and 3.5 Hz for the 8-Hz implementations), *γ* is a bias-correction term, and the other spectral quantity is a “model” spectrum. Several possible model spectra could be used. The von Kármán spectrum [Eqs. (6) and (8)] would be a logical choice except that it does not take into account filter effects, finite length, discrete time sampling, or the window function. To generate unbiased estimates of EDR, the model spectrum needs to take into account the procedure for generating the empirical spectrum. In the current algorithm implementation, the model spectrum is the periodogram of the computed windowed von Kármán autocorrelation function with unit *ε*^{1/3} (e.g., Frehlich et al. 2001):

with *B*_{NN} being given by Eqs. (2) and (6) with *L* = 669 m (*L*_{i} = 500 m) and where *T* is the autocorrelation function for the (power normalized) Tukey window:

where the taper width *M* = floor(0.1*m* − 0.2) is about 10%.

In practice the aircraft-measured fields (Table 1) used to compute *w* are typically filtered in an analog-to-digital conversion step. In the case of commercial aircraft the details of the filtering are often not known, and the empirical parameter *γ* in Eq. (16) is used to account for these effects. The parameter *γ* is computed in a preprocessing step by comparing the EDR estimate computed over a limited frequency range (0.5–1.5 Hz), over which filter effects should be small and the spectral levels should be unbiased, with the estimate computed over the full range, setting *γ* to 1 in both runs. The two resulting estimates are plotted against each other using many hours of actual aircraft data, and the errors-in-variables linear regression technique (Press et al. 1992) is applied, excluding small (<0.05 m^{2/3} s^{−1}) values. The current algorithm implementation uses *γ* = 1.3, but this choice depends on the frequency ranges used and the aircraft-specific filtering.
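The overall flow of the spectral estimate (detrend, taper, periodogram, ratio to a unit-EDR model spectrum, average over a frequency band) can be sketched in Python. This is a simplified stand-in and not the operational estimator: the model spectrum here is the pure inertial-range form rather than the windowed von Kármán periodogram, the bias correction *γ* is omitted, the taper is a simple cosine taper over roughly 10% of the samples at each end, and all names are hypothetical.

```python
import cmath
import math

ALPHA = 1.6


def detrend(x):
    """Remove the least squares straight line from the series."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x)) / stt
    return [v - x_mean - slope * (t - t_mean) for t, v in enumerate(x)]


def taper_window(n, taper_fraction=0.1):
    """Cosine taper over `taper_fraction` of the samples at each end."""
    m = max(1, int(taper_fraction * n))
    w = [1.0] * n
    for i in range(m):
        w[i] = w[n - 1 - i] = 0.5 * (1.0 - math.cos(math.pi * (i + 0.5) / m))
    return w


def periodogram(x, fs):
    """One-sided power spectral density at indices 1 .. n//2 - 1."""
    n = len(x)
    w = taper_window(n)
    xw = [wi * xi for wi, xi in zip(w, detrend(x))]
    norm = fs * sum(wi * wi for wi in w)
    freqs, psd = [], []
    for k in range(1, n // 2):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(xw))
        freqs.append(k * fs / n)
        psd.append(2.0 * abs(coeff) ** 2 / norm)
    return freqs, psd


def model_spectrum(f, v_true, alpha=ALPHA):
    """Inertial-range temporal spectrum for unit EDR (simplified stand-in)."""
    return ((24.0 / 55.0) * alpha
            * (v_true / (2.0 * math.pi)) ** (2.0 / 3.0) * f ** (-5.0 / 3.0))


def estimate_edr(w, fs, v_true, f_low=0.5, f_high=3.5):
    """Square root of the mean ratio of empirical to unit-EDR model spectrum."""
    freqs, psd = periodogram(w, fs)
    ratios = [p / model_spectrum(f, v_true)
              for f, p in zip(freqs, psd) if f_low <= f <= f_high]
    return math.sqrt(sum(ratios) / len(ratios))
```

Because the spectrum scales as *ε*^{2/3}, doubling the amplitude of the wind fluctuations doubles the returned *ε*^{1/3}, and adding a linear trend to the input leaves the estimate unchanged.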

As with the acceleration-based method, a nominal 1-min computing interval is used. Within this 1-min interval, shorter time windows are used to derive individual EDR estimates. The choice of window length is a compromise between capturing important short discrete events and having enough samples to provide stable computational statistics. The analysis of large-amplitude, discrete turbulence encounters indicates that the temporal duration at typical commercial transport cruise speeds is on the order of 5–15 s (see, e.g., Fig. 4b of Sharman et al. 2012a). Therefore, a 10-s window with ½ overlap seems a reasonable compromise. This is consistent with Haverdings and Chan (2010), who use a moving 10–20-s window with 4-Hz sampling. In the implementations to date, vertical winds are available at 8 Hz; thus the 10-s window provides 80 data points per spectrum and 12 samples per minute. Examples of the output of the algorithm are shown in Fig. 3. Note that the *f*^{−5/3} line fits all spectra fairly well. Even for large-amplitude discrete events, the *f*^{−5/3} approximation is well satisfied at the higher frequencies (e.g., Sharman et al. 2012a; Mark and Fischer 1976). Differences in the 12 realizations sampled during the 1-min interval are to be expected, however, given the random nature of atmospheric turbulence; therefore, only the ensemble mean and peak of the 12 EDR samples are computed for possible downlink.

Figure 4 provides a sample verification of the ability of the vertical winds–based algorithm to reproduce the EDR from an input von Kármán turbulent wind field of known spatial statistics and assumed aircraft filtering effects. For the filter, a two-pole Butterworth filter with a 3-Hz stop-band cutoff (e.g., Porat 1997, chapter 10) is used. Figure 4 shows 10 000 realizations of a filtered and windowed von Kármán wind field produced using the method of Frehlich et al. (2001) with true *ε*^{1/3} specified uniformly between 0 and 0.5 m^{2/3} s^{−1} and with *L*_{i} = 500 m. From the 1D simulated *w* field, 10-s-long intervals sampled at 8 Hz can be constructed to imitate the sampling by a hypothetical aircraft flying along the simulated turbulence line. A comparison of Fig. 4a, which uses the von Kármán spectrum without sampling effects [Eq. (16) with the model spectrum taken as Eqs. (6) and (8) and with *γ* = 1], with Fig. 4b, which includes sampling effects [Eq. (16) with the model spectrum taken as Eq. (17) and with *γ* = 1.3], gives some idea of the effects of the bias-correction factor *γ* and choice of spectral model. Note that in Fig. 4b the input *ε*^{1/3} agrees statistically very well with the output and the bias is small. Other verifications of the in situ EDR method have been performed by Chan (2010) and Chan and Lee (2012) on the basis of comparisons with ground-based lidar data and by Williams et al. (2006) using comparisons with ground-based radar spectral-width data.

### d. EDR reporting

In the ideal situation, all 1-min mean and peak EDR estimates would be downlinked, but, if standard air-to-ground communications were used, this protocol would result in substantial costs to downlink mostly smooth turbulence experiences. To minimize transmission costs, two strategies are implemented. The first strategy is to bin the resultant 1-min mean and peak EDRs to a fairly coarse resolution to minimize the number of characters in the downlinked message. In the current UAL implementation, a bin width of 0.1 m^{2/3} s^{−1} is used; the DAL implementations use a discretization of 0.02 m^{2/3} s^{−1}. For most aviation applications, these resolutions are sufficient for operational needs (ICAO 2010).
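The effect of the two reporting resolutions can be seen with a small Python sketch. The rounding rule is not stated in the text; flooring to the lower bin edge is assumed here, and the function name is hypothetical.

```python
import math


def bin_edr(edr, width):
    """Quantize an EDR value to the lower edge of its bin (assumed rule)."""
    index = math.floor(edr / width + 1e-9)  # small guard against round-off
    return round(index * width, 10)


UAL_BIN_WIDTH = 0.1   # m^(2/3) s^-1
DAL_BIN_WIDTH = 0.02  # m^(2/3) s^-1

if __name__ == "__main__":
    for value in (0.04, 0.137, 0.37):
        print(value, bin_edr(value, UAL_BIN_WIDTH), bin_edr(value, DAL_BIN_WIDTH))
```

Under this assumption, any value below 0.1 m^{2/3} s^{−1} falls in the lowest UAL bin, whereas the finer DAL discretization still separates several levels within that range.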

The second strategy used is to downlink only the median and 90th percentile (UAL) or mean and peak (DAL) of the individual EDR estimates over the 1-min period. From these values, a qualitative assessment can be made of whether the turbulence is relatively continuous or discrete. For UAL, these are downlinked for all 1-min estimates. Although this sampling strategy is clearly the desired one, it does result in a very large number of reports of smooth (cf. Fig. 10, described below). Another option used with DAL aircraft provides “event based” reporting in combination with routine reporting over a longer time interval. In this method, mean and peak turbulence values are added to the routine wind and temperature Aircraft Communications Addressing and Reporting System (ACARS)/AMDAR reports (Moninger et al. 2003) at nominally 15–30-min intervals, but turbulence reports would also be immediately generated (or triggered) when the peak EDR level exceeded a predetermined threshold. In this case, a window of previous EDR values (e.g., the current 1-min sample plus or minus the past five or so 1-min samples) is used to better isolate discrete events. Another reporting trigger could be that *x* of the past *n* (e.g., 3 out of 5) EDR (peak or mean) estimates exceeded a lower threshold.

For the DAL implementations, three triggers are used. A “type 1” trigger occurs when the peak EDR exceeds a predetermined threshold set by the airline (currently 0.18 m^{2/3} s^{−1}). This trigger causes an immediate downlink. “Type 2” detects a fairly consistent medium intensity (currently based on three of the last six 1-min peak EDRs exceeding 0.12 m^{2/3} s^{−1}). “Type 3” detects consistent lower intensity (based on four of the last six 1-min *mean* EDRs exceeding 0.06 m^{2/3} s^{−1}). A 6-min-long follow-up report occurs 6 min after a type-1 or type-2 report. An example of downlinks using these reporting strategies is shown in Fig. 5.
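The three triggers can be expressed compactly in Python. This sketch assumes that “three of the last six” and “four of the last six” mean at least that many exceedances, and it does not model follow-up reports or any precedence among triggers; the class and constant names are hypothetical.

```python
from collections import deque

TYPE1_PEAK = 0.18   # immediate downlink threshold
TYPE2_PEAK = 0.12   # three of the last six 1-min peaks
TYPE3_MEAN = 0.06   # four of the last six 1-min means


class TriggerMonitor:
    """Tracks the last six 1-min reports and evaluates the three triggers."""

    def __init__(self):
        self.peaks = deque(maxlen=6)
        self.means = deque(maxlen=6)

    def update(self, mean_edr, peak_edr):
        """Add one 1-min (mean, peak) pair; return the set of triggers fired."""
        self.peaks.append(peak_edr)
        self.means.append(mean_edr)
        fired = set()
        if peak_edr > TYPE1_PEAK:
            fired.add(1)
        if sum(p > TYPE2_PEAK for p in self.peaks) >= 3:
            fired.add(2)
        if sum(m > TYPE3_MEAN for m in self.means) >= 4:
            fired.add(3)
        return fired
```

A single strong peak fires only the type-1 trigger, whereas a sustained run of moderate peaks or elevated means is needed for types 2 and 3.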

## 3. Calibration of EDR estimates to traditional PIREP intensities

EDR is a measure of atmospheric turbulence intensity; what a pilot experiences and reports is probably best correlated to the peak acceleration over some time interval (Bass 1999), however. Although EDR is related to the RMS vertical acceleration (*σ*_{g}, expressed in units of gravitational acceleration *g*) through Eq. (10), Bowles and Buck (2009) showed that *σ*_{g} was also correlated well with peak loads and therefore with what a pilot would tend to report. They found that the peak *g* is ≈ 2.6 *σ*_{g} (where *σ*_{g} was computed using a running 5-s window), and they determined *σ*_{g} thresholds of 0.1, 0.3, and 0.6 *g* units corresponding to the moderate, severe, and extreme thresholds, respectively. This result is similar to the numbers suggested by Zbrozek (1961; see also MacCready 1964), who provides *σ*_{g} thresholds of 0.05, 0.10, 0.20, and 0.30 *g* units for slight, moderate, moderately heavy, and very heavy, respectively.

Still other studies have related *ε*^{1/3} to PIREP turbulence intensity categories for typical commercial aircraft. MacCready (1964) estimates EDR thresholds of 0.03, 0.07, 0.16, and 0.38 m^{2/3} s^{−1} for light, moderate, heavy, and extreme, respectively, and Lee et al. (1988) and Bohne (1985) estimate 0.07, 0.16, and 0.38 m^{2/3} s^{−1} for moderate, severe, and extreme, respectively. The ICAO Annex 3 (“Meteorological Service for International Air Navigation”; ICAO 2001) established threshold standards of 0.10, 0.30, and 0.50 m^{2/3} s^{−1} for maximum EDR values during a 1-min sampling interval for light, moderate, and severe turbulence intensities for medium-sized aircraft. These thresholds were later increased in ICAO (2007, 2010) to 0.10, 0.40, and 0.70 m^{2/3} s^{−1}.
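The practical consequence of the change in ICAO thresholds can be shown with a short Python lookup. The label “smooth” for values below the light threshold and the treatment of a value exactly at a threshold as belonging to the higher category are assumptions made for illustration.

```python
import bisect

LABELS = ("smooth", "light", "moderate", "severe")
THRESHOLDS = {
    "ICAO 2001": (0.10, 0.30, 0.50),
    "ICAO 2007/2010": (0.10, 0.40, 0.70),
}


def classify(peak_edr, standard):
    """Map a 1-min peak EDR to a category under the named threshold set."""
    return LABELS[bisect.bisect_right(THRESHOLDS[standard], peak_edr)]
```

For example, a peak EDR of 0.35 m^{2/3} s^{−1} is moderate under the 2001 thresholds but only light under the later ones.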

Other *ε*^{1/3} threshold values that are based on the longitudinal wind component *u* have been suggested: Vinnichenko and Dutton (1969) and Dutton (1971) estimated threshold values of 0.14, 0.23, and 0.46 m^{2/3} s^{−1} for light, moderate, and severe, respectively. Trout and Panofsky (1969) estimated geometric means of threshold values from different sources as 0.14, 0.20, and 0.41 m^{2/3} s^{−1} for light, moderate, and severe, respectively. Note, however, that *ε*^{1/3} estimates that are based on the longitudinal or horizontal wind component will not necessarily be the same as those derived from the vertical wind component, especially for isolated higher-turbulence events in the stratified-shear-flow environments that are characteristic of the UTLS (e.g., Sharman and Frehlich 2003; Schumann et al. 1995; Kennedy and Shapiro 1975; Lilly et al. 1974).

### a. Comparisons of PIREPs with in situ EDR B737 and B757 data

The estimates cited above are based on too few samples to derive reliable PIREP-to-EDR mappings, and the means by which the mappings were produced were not well documented. Therefore, it is difficult to assess their uncertainty bounds. Here we statistically assess PIREP errors and obtain EDR values corresponding to traditional PIREP intensity categories by comparing a wide range of PIREPs with EDR data from the same or nearby aircraft. The PIREPs used derive from NOAA’s FOS and from proprietary DAL verbal reports; they are compared with EDR data from UAL757s and DAL737s. To maximize the number of comparisons, all available data through the end of 2013 were used (viz., UAL757 data from 2004 to 2013 and DAL737 data from 1 October 2008 to 31 December 2013).

To match a PIREP to an EDR report from the same aircraft, some criteria must be developed to define a match. By subjective review of flight tracks with moderate–severe and severe PIREP cases, the following criteria were found to best define a match: the maximum peak EDR report is within 15 min, 150 km, and 1200 ft (366 m) vertically of the PIREP location. Use of these criteria results in a very large number of matches: 69 867 for UAL757 data and 224 625 for DAL737 data. The number of DAL737 matches exceeds the number of UAL757 matches even though the UAL757 EDR dataset is larger than the DAL737 dataset because every PIREP in the set of proprietary DAL verbal reports contains aircraft identifying information and for the FOS PIREPs this is not true. The large number of matches makes it possible to develop robust statistics of the position and timing errors associated with PIREPs. Earlier estimates of these errors were provided by Schwartz (1996) and Sharman et al. (2006), but these estimates were based on a limited number of sample comparisons. In the following, the verbal PIREP intensity categories of smooth to extreme are converted to a 0–8 scale, where 0 corresponds to smooth, 1 is smooth-light, and 2, 4, 6, and 8 correspond to light, moderate, severe, and extreme, respectively. Table 2 summarizes the average error statistics for intensities 1–6 for UAL757 and 1–4 for DAL737 PIREPs. Intensity category 0, or smooth, is excluded from the averaging because, with the majority of EDR reports being null turbulence, it is unlikely that the closest EDR report precisely corresponds to the smooth PIREP and therefore including these matches in the averaging would make the error lower than it otherwise would be. Intensity categories 5 and 6, or moderate–severe and severe, are excluded from DAL737 averaging because of the low sample size (1 and 2) in those categories. 
Table 2 indicates that the errors in reporting are for the most part remarkably consistent between DAL737 and UAL757, and the median and average distance errors are very similar at ~35 and 46 km, respectively. In general, the time differences are negative, indicating that the PIREP occurred after the event. Because of issues with the precision of the timing, however, if a PIREP is shortly after an event then the timing difference can appear as positive.
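The same-aircraft matching criteria (15 min, 150 km, and 1200 ft, keeping the largest peak EDR) can be sketched in Python. The record layout, the haversine distance, and the function names are illustrative assumptions; the text does not say how horizontal distance was computed.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class Report:
    time_min: float      # minutes since an arbitrary epoch
    lat: float           # degrees
    lon: float           # degrees
    alt_ft: float
    peak_edr: float = 0.0


def great_circle_km(a, b):
    """Haversine distance between two reports (km)."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlat = p2 - p1
    dlon = math.radians(b.lon - a.lon)
    h = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def match_pirep(pirep, edr_reports, max_min=15.0, max_km=150.0, max_ft=1200.0):
    """Return the in-window EDR report with the largest peak, or None."""
    candidates = [r for r in edr_reports
                  if abs(r.time_min - pirep.time_min) <= max_min
                  and abs(r.alt_ft - pirep.alt_ft) <= max_ft
                  and great_circle_km(pirep, r) <= max_km]
    return max(candidates, key=lambda r: r.peak_edr, default=None)
```

The tighter aircraft-type comparison corresponds to calling `match_pirep` with `max_min=3.9` and `max_km=75.0`.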

PIREP–EDR matches reported by the exact same aircraft can only be accomplished if some identifier information is included in the PIREP. This is often not the case, but additional matches of PIREPs with corresponding EDR reports can be derived by matching PIREPs that indicate their aircraft type to be the same as the EDR-equipped aircraft but do not indicate specific airline, flight, or tail-number information. To maximize the likelihood that the two matched reports derive from the same aircraft, the time and position radius about the PIREP location are reduced to the absolute maximum of the 25th and 75th percentile of the distance and timing errors from Table 2. Thus for this secondary comparison, “nearby” is defined as the maximum in situ peak EDR report within 3.9 min, 75 km, and 1200 ft (366 m) vertically of the PIREP location. These aircraft-type matches provide an additional 3525 UAL757 and 1738 DAL737 cases, including 88 moderate–severe and severe matches.

Figure 6 shows the results of these comparisons as box plots. The large spread in intensity categories 5 and 6 is due to the lower sample size and the possibility that some reports were matched that did not originate from the same aircraft. Note that some inconsistencies in the box plots are present because the bin width of UAL757 data (0.1 m^{2/3} s^{−1}) is much larger than the bin width of the DAL737 data. To compensate for this effect somewhat, the UAL757 data for each 0.1 m^{2/3} s^{−1} bin were placed at the lower bin value + 0.02 m^{2/3} s^{−1}. With this adjustment, for both aircraft, the medians are fit very well with a simple quadratic as shown. Of interest is that the DAL B737-800 [maximum takeoff weight (MTOW) of ~174 000 lb (79 000 kg)] and the UAL B757-200 [MTOW of ~220 000 lb (99 000 kg)] aircraft, which are both in the ICAO “medium” weight category [15 500–300 000 lb (7000–136 000 kg); e.g., http://www.faa.gov/air_traffic/publications/atpubs/CNT/5-1-1.htm], have very similar statistics, and the best-fit quadratic curves

$$\varepsilon^{1/3} = C P^{2} \tag{19}$$

to the medians, where *P* is the PIREP intensity on the 0–8 scale, are very similar, with *C* = 0.0138 (UAL757) and *C* = 0.0125 (DAL737). The 25%–75% range using both UAL757 and DAL737 data of light values (*P* = 1 and 2) is 0.01–0.12, for moderate (*P* = 3 and 4) it is 0.01–0.26, and for severe (*P* = 5 and 6) it is 0.08–0.72 m^{2/3} s^{−1}. Taking the EDR “threshold” for a particular severity category as the midrange between standard intensity categories (i.e., *P* = 1, 3, or 5) and using the average of the UAL and DAL *C* values, Eq. (19) gives EDR threshold values of 0.013, 0.118, and 0.33 m^{2/3} s^{−1}, respectively. Alternatively, taking the threshold to be the average of the medians of the standard intensity categories (*P* = 2, 4, or 6) gives EDR values of 0.01, 0.21, and 0.47 m^{2/3} s^{−1} corresponding to light, moderate, and severe thresholds, respectively. In either case, these values are considerably lower than the current ICAO (2007, 2010) standards but are not too different from the values listed by the other investigators referenced above. This result implies that the current ICAO recommendations (0.10, 0.40, and 0.70 m^{2/3} s^{−1}) might need to be revised downward.
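The midrange thresholds quoted above can be reproduced with a short calculation. The sketch below assumes that the quadratic fit has the one-parameter form EDR = *C* *P*^{2}, which is the form that recovers the quoted values; the function and variable names are illustrative only.

```python
# Quadratic-fit coefficients quoted in the text (m^(2/3) s^-1 per unit P^2).
C_UAL757 = 0.0138
C_DAL737 = 0.0125


def edr_from_pirep(p, c):
    """Median EDR for PIREP intensity p, assuming the form EDR = c * p**2."""
    return c * p ** 2


# Average of the two fitted coefficients, as used for the thresholds.
c_avg = 0.5 * (C_UAL757 + C_DAL737)

# Midrange intensities P = 1, 3, 5 taken as the category thresholds.
thresholds = {
    name: edr_from_pirep(p, c_avg)
    for name, p in (("light", 1), ("moderate", 3), ("severe", 5))
}

# ICAO values quoted in the text, for comparison.
icao = {"light": 0.10, "moderate": 0.40, "severe": 0.70}

for name, value in thresholds.items():
    print(f"{name:9s} fit threshold = {value:.3f}   ICAO = {icao[name]:.2f}")
```

With these inputs the fit thresholds evaluate to about 0.013, 0.118, and 0.329 m^{2/3} s^{−1}, each below the corresponding ICAO value.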

### b. Transformations to other aircraft

EDR is a state-of-the-atmosphere turbulence intensity metric; that is, it is an aircraft-independent measure. Although this quality is ideal for turbulence nowcasting/forecasting purposes, there may be operational users who desire an aircraft-dependent turbulence value. From Eq. (10), they are related; that is, the RMS vertical acceleration (RMS *g*, or *σ*_{g}) is proportional to EDR through the aircraft response function. In some cases, the aircraft response function may be available from the manufacturer or through aircraft simulators. If these data are not available, a mathematical modeling approach can be used to estimate the response function (e.g., Cornman et al. 1995; Buck and Newman 2006).

In any case, if one is given the integral of the aircraft response function *F*, then from Eq. (10)

$$\sigma_{g} = F(\mathrm{ac}, V_{T}, z, W)\,\varepsilon^{1/3}, \tag{20}$$

where *F* is in general a function of aircraft type ac, *V*_{T}, current altitude *z*, and current aircraft weight *W*. From this formula, *σ*_{g} can be readily converted from, say, a reporting aircraft “1” to another aircraft “2,” both experiencing the same atmospheric turbulence level *ε*^{1/3}, by use of

$$\sigma_{g,2} = \frac{F_{2}}{F_{1}}\,\sigma_{g,1}. \tag{21}$$

In a practical scenario with EDR reporting, if an air traffic controller or dispatcher desires a display of *σ*_{g} for a given aircraft—generated from the EDR reports sent from other aircraft—Eq. (21) would be used. The conversion factors *F* would have to be known for the given aircraft and flight condition, however. Lookup tables for all (or many) aircraft types could be programmed into the display system, but the aircraft-specific, real-time factors would have to be made available to the display system. For pilots that have access to EDR information from another aircraft through downlinked or crosslinked data and desire *σ*_{g} (or peak *g*) for their aircraft, things are easier since that aircraft should have available on board the required parameters as well as its own lookup table for the conversion factor.
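A minimal sketch of the lookup-table conversion described above, assuming that *σ*_{g} is the product of the response-function integral *F* and the EDR. The table keys and the numerical *F* values are hypothetical placeholders chosen for illustration; real values would have to come from the manufacturer, a simulator, or a response-function model.

```python
# Hypothetical lookup table of response-function integrals F, in units of
# g per (m^(2/3) s^-1). Keys are (aircraft label, flight condition).
# These numbers are placeholders, NOT measured or manufacturer data.
F_TABLE = {
    ("aircraft_1", "cruise"): 1.2,
    ("aircraft_2", "cruise"): 0.8,
}


def sigma_g_from_edr(edr, f):
    """RMS vertical acceleration (g) for a given EDR, assuming sigma_g = F * EDR."""
    return f * edr


def convert_sigma_g(sigma_g_1, f_1, f_2):
    """Convert sigma_g of aircraft 1 to aircraft 2 in the same turbulence."""
    return sigma_g_1 * f_2 / f_1


edr = 0.2  # m^(2/3) s^-1, as reported by aircraft 1
f1 = F_TABLE[("aircraft_1", "cruise")]
f2 = F_TABLE[("aircraft_2", "cruise")]

sigma_1 = sigma_g_from_edr(edr, f1)          # response of the reporting aircraft
sigma_2 = convert_sigma_g(sigma_1, f1, f2)   # response expected for aircraft 2
```

Because EDR is aircraft independent, converting through the ratio of *F* values gives the same result as applying the second aircraft's *F* directly to the reported EDR.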

Equation (21) can also be used to generalize the results of section 3a to apply to other aircraft types. As indicated above, if one assumes that a PIREP ∝ peak *g* ∝ *σ*_{g}, then from Eqs. (20) and (21) for a given EDR

$$P_{2} = \frac{F_{2}}{F_{1}}\,P_{1}. \tag{22}$$

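To estimate the EDR for a particular aircraft type given a PIREP from another aircraft type, the expression in Eq. (20) can be used in Eq. (22) to give

$$\varepsilon^{1/3} = C_{1}\left(\frac{F_{1}}{F_{2}}\right)^{2} P_{2}^{2}, \tag{23}$$

where *C*_{1} is the quadratic-fit coefficient of the reference aircraft “1” and *P*_{2} is the PIREP intensity reported by aircraft “2.”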

Using the UAL757 quadratic fits above—that is, *C* = 0.0138—and *F* values computed from aircraft response functions by the techniques outlined in Cornman et al. (1995) gives PIREP–EDR maps for other aircraft weight classes as shown in Fig. 7. As expected, heavier aircraft require a higher EDR for a given PIREP intensity, with the difference between aircraft types increasing with higher severities. The difference is substantial for a PIREP in the severe (*P* = 6) category, giving corresponding EDRs of 0.41, 0.49, and 0.61 m^{2/3} s^{−1} for small business jet (SBJ), B737, and B747 aircraft, that is, an ~0.2 m^{2/3} s^{−1} EDR difference between the light and heavy aircraft. The difference is less significant for the moderate (*P* = 4) category PIREPs, with 0.18, 0.22, and 0.27 m^{2/3} s^{−1} or an EDR difference within 0.1 m^{2/3} s^{−1}.
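As a consistency check on the values quoted above, one can back out the quadratic coefficient implied for each aircraft class. The sketch assumes that each curve in Fig. 7 has the form EDR = *C* *P*^{2}, so that the implied coefficient is simply EDR/*P*^{2}; the dictionary layout and names are illustrative.

```python
# Reference coefficient from the UAL757 quadratic fit quoted in the text.
C_REF = 0.0138

# EDR values (m^(2/3) s^-1) quoted in the text for P = 4 and P = 6.
QUOTED_EDR = {
    "SBJ":  {4: 0.18, 6: 0.41},
    "B737": {4: 0.22, 6: 0.49},
    "B747": {4: 0.27, 6: 0.61},
}


def implied_coefficient(edr, p):
    """Coefficient C implied by one (P, EDR) pair, assuming EDR = C * P**2."""
    return edr / p ** 2


implied = {
    ac: {p: implied_coefficient(e, p) for p, e in by_p.items()}
    for ac, by_p in QUOTED_EDR.items()
}
mean_c = {ac: sum(cs.values()) / len(cs) for ac, cs in implied.items()}

for ac, c in mean_c.items():
    print(f"{ac:5s} implied C = {c:.4f}   C / C_ref = {c / C_REF:.2f}")
```

The two intensities give nearly the same coefficient for each class, and the coefficient increases from the small business jet to the B747, in line with the statement that heavier aircraft require a higher EDR for a given PIREP intensity.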

In summary, if aircraft type is taken into account, EDR reports are consistent with PIREPs, although developing precise thresholds for the light, moderate, and severe intensity categories is difficult given the spread in the comparison data. The exact number is probably not so important operationally, however, and operationally useful margins are currently being assessed (Emanuel et al. 2013). A simple model can be used to convert a PIREP or EDR value from one aircraft type to a PIREP or EDR value for another aircraft type, allowing EDR information to be used to derive aircraft-specific response.

## 4. Derived EDR climatologies

The tremendous number of recorded EDR estimates available from the UAL757 and DAL737 fleets provides unprecedented sampling of the turbulent state of the atmosphere, especially at the cruise (UTLS) altitudes at which commercial aircraft spend most of their time. For the 10-yr period 2004–13, 128.3 million UAL757 1-min peak and median EDR reports were recorded for flight altitudes ≥ 20 000 ft; for the 5-yr period 2009–13, 6.1 million (flight altitudes ≥ 20 000 ft) and 9.6 million (all altitudes) DAL737 EDR reports were available, with a significant percentage of these reports being taken during climbs and descents. The UAL757 reporting algorithm in climbs and descents sometimes gives suspect results, and therefore reports taken below FL200 are not used. Note that both the UAL757 and DAL737 datasets are probably biased toward lower values because commercial aircraft will attempt to avoid turbulence if possible; these effects are difficult to estimate. Nevertheless, some useful turbulence statistics to describe the climatological behavior (“climatologies”) can still be developed.

One such measure is the seasonal variability of turbulence intensity, as shown in Fig. 8. Shown in this figure is the percentage of peak EDR reports greater than 0.2 m^{2/3} s^{−1}, averaged by month over all records for both UAL757 data and DAL737 data. At lower altitudes (upper curve; DAL737 data) the overall percentage is higher and the seasonal swings are larger, with maxima occurring in the spring, presumably because of the increased frequency of convection. At upper altitudes, the overall incidence is much smaller (lower curves, multiplied by a factor of 5 relative to the scaling of the upper curve), with the DAL737 and UAL757 data agreeing fairly well in magnitude; these curves also exhibit maxima in the spring but with a secondary wintertime maximum in some years. This pattern is consistent with PIREP climatologies derived by Wolff and Sharman (2008) and climate model–derived seasonal variabilities reported in Jaeger and Sprenger (2007).

By examining sequential EDR data records it is also possible to derive estimates of the statistics of the length and depth of turbulence patches. Here a turbulence patch is defined as a series of peak EDR reports beginning and ending with peak EDR values of ≥0.1 m^{2/3} s^{−1}, containing at least one peak EDR of ≥0.2 m^{2/3} s^{−1}, and separated by a minimum of 5 min during which peak EDRs are all below 0.1 m^{2/3} s^{−1}. Figure 9a shows the cumulative distribution function (CDF) of the length of turbulence patches for aircraft in straight-and-level cruise flight above FL200 (56 017 UAL757 and 13 633 DAL737 data points), and Fig. 9b shows the CDF of the turbulence patch depth for DAL737 aircraft in climbs and descents (32 861 DAL737 data points). Here again, to maximize the number of samples obtained that satisfy the above criteria, all available data through the end of 2013 were used: UAL757 data from 2004 to 2013 and DAL737 data from 1 October 2008 to 2013. In Fig. 9a, the UAL and DAL statistics are slightly different, presumably because of differing sampling sizes and differing route structures of the airlines. The median patch length is about 57 (UAL) and 66 (DAL) km, corresponding to flight times of 4.3 and 5.6 min, respectively, assuming a cruise speed of 223 m s^{−1} (500 mi h^{−1}). From Fig. 9b, the median depth is slightly greater at lower levels (~1.4 km) than at higher levels (~0.9 km). This result may be due to penetration through deep cloud layers. These numbers are consistent with previous studies; for example, Vinnichenko et al. (1980, their Fig. 9.6) reported a median patch length of ~20–60 km and a median patch depth of ~200–700 m depending on location, whereas Steiner (1966) estimated the median patch lengths for clear-air turbulence to be less than 10 mi (~16 km) with a thickness of 2000–3000 ft (~600–900 m).
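The patch definition above lends itself to a simple scan over a sequence of 1-min peak EDR reports. The following sketch is one possible reading of that definition, not the code used for the study. It assumes evenly spaced 1-min reports from a single flight segment, treats each report as covering one full reporting interval when converting to a length, and uses hypothetical function names.

```python
def find_patches(peak_edr, edge=0.1, core=0.2, min_gap=5):
    """Return (start, end) index pairs of turbulence patches.

    A patch starts and ends on reports >= edge, contains at least one
    report >= core, and is separated from its neighbours by at least
    min_gap consecutive reports below edge.
    """
    groups = []
    start = last = None
    for i, value in enumerate(peak_edr):
        if value < edge:
            continue
        # Close the open group if enough quiet reports have elapsed.
        if start is not None and i - last - 1 >= min_gap:
            groups.append((start, last))
            start = None
        if start is None:
            start = i
        last = i
    if start is not None:
        groups.append((start, last))
    # Keep only groups that reach the core threshold at least once.
    return [(s, e) for s, e in groups if max(peak_edr[s:e + 1]) >= core]


def patch_length_km(patch, speed_m_s=223.0, report_interval_min=1.0):
    """Along-track length, assuming each report spans one reporting interval."""
    start, end = patch
    duration_s = (end - start + 1) * report_interval_min * 60.0
    return duration_s * speed_m_s / 1000.0


# Synthetic 1-min peak EDR series (m^(2/3) s^-1).
series = [0.02, 0.12, 0.25, 0.05, 0.11,
          0.02, 0.02, 0.02, 0.02, 0.02,
          0.15, 0.03,
          0.02, 0.02, 0.02, 0.02, 0.02,
          0.30]
patches = find_patches(series)
lengths = [patch_length_km(p) for p in patches]
```

In this synthetic series the isolated 0.15 report is dropped because it never reaches the 0.2 threshold, while the brief dip to 0.05 does not split the first patch because it lasts less than 5 min.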

Another quantity of interest that can be estimated from the EDR data is the probability density function (PDF) of *ε*^{1/3}. This function is plotted in Fig. 10 for the UAL757 (routine only) and DAL737 (routine + triggered) peak EDR reports in the cruise ranges (≥FL200) for all UAL757 reports from 2004 to 2013 and DAL737 reports from 2009 to 2013. These results are slightly different, as would be expected for the different sampling strategies and EDR output resolutions used, as well as the different route structures flown. Both datasets fit very well to a lognormal PDF, consistent with previous studies (e.g., Nastrom and Gage 1985; Frehlich 1992; Frehlich and Sharman 2004):

$$f(x) = \frac{1}{x\,\sigma_{\ln x}\sqrt{2\pi}}\exp\left[-\frac{\left(\ln x - \langle \ln x \rangle\right)^{2}}{2\sigma_{\ln x}^{2}}\right], \tag{24}$$

where

$$\sigma_{\ln x}^{2} = \left\langle \left(\ln x - \langle \ln x \rangle\right)^{2} \right\rangle, \tag{25}$$

with *x* being peak, median, or mean EDR and the angle brackets indicating the ensemble average. Here the fits are performed using Powell’s method (Press et al. 1992) to minimize the relative mean-square error between the function in Eq. (24) and the binned EDR data (bin widths of 0.1 and 0.02 m^{2/3} s^{−1} for UAL757 and DAL737 data, respectively). For the DAL737 data, some of the lowest and highest bins were not used because of expected noise (relative to the small EDR values in the lowest bins) and too few samples available, respectively. These fits give for the UAL757 and DAL737 peak data, respectively, 〈ln(*ε*^{1/3})〉 = −3.83 and −2.85 and *σ*_{ln *x*} = 0.787 and 0.571, or 〈*ε*^{1/3}〉 = 0.030 and 0.068 m^{2/3} s^{−1} and 〈*ε*〉 = 1.7 × 10^{−4} and 8.3 × 10^{−4} m^{2} s^{−3}. The higher values obtained from the DAL data are consistent with the biased sampling of higher turbulence intensity events because of the triggered reporting strategy relative to the UAL routine reporting. The UAL757 median and DAL737 mean (routine reports only) EDR data fits give lower values of 〈*ε*〉 = 6.2 × 10^{−5} and 2.9 × 10^{−4} m^{2} s^{−3}, respectively (not shown). For comparison, on the basis of an analysis of the Global Atmospheric Sampling Program (GASP; Nastrom and Gage 1985) data, Frehlich and Sharman (2004) estimated 〈*ε*〉 ≈ 7.64 × 10^{−5} m^{2} s^{−3}, whereas Lindborg (1999) estimates 〈*ε*〉 ≈ 7.6 × 10^{−5} m^{2} s^{−3} from cruising-level GASP and Measurements of Ozone and Water Vapor by In-Service Airbus Aircraft (MOZAIC) data, and Frehlich and Sharman (2010) estimate 〈*ε*〉 ≈ 5.2 × 10^{−5} m^{2} s^{−3} at 10-km altitude on the basis of second-order structure-function analyses of ACARS/AMDAR data. Thus the in situ EDR-derived mean values are consistent with other estimates, but all of these estimates are probably somewhat low since, as already mentioned, they are based on commercial aircraft flights that to some extent will try to avoid known areas of turbulence.
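The quoted mean values follow from the fitted lognormal parameters through the standard moment formula 〈*x*^{n}〉 = exp(*n*μ + *n*^{2}σ^{2}/2). The sketch below takes the second fitted parameter (0.787 and 0.571) to be the standard deviation of ln(*ε*^{1/3}), an assumption that is consistent with the quoted means; the function names are illustrative.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Lognormal PDF with mean mu and standard deviation sigma of ln(x)."""
    return math.exp(-(math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)) / (
        x * sigma * math.sqrt(2.0 * math.pi)
    )


def lognormal_moment(n, mu, sigma):
    """n-th raw moment <x**n> of a lognormal variable."""
    return math.exp(n * mu + 0.5 * (n * sigma) ** 2)


# Fitted parameters quoted in the text: (<ln EDR>, std dev of ln EDR).
FITS = {"UAL757": (-3.83, 0.787), "DAL737": (-2.85, 0.571)}

results = {}
for fleet, (mu, sigma) in FITS.items():
    mean_edr = lognormal_moment(1, mu, sigma)  # <eps^(1/3)>, m^(2/3) s^-1
    mean_eps = lognormal_moment(3, mu, sigma)  # <eps>, m^2 s^-3
    results[fleet] = (mean_edr, mean_eps)
    print(f"{fleet}: <EDR> = {mean_edr:.3f}, <eps> = {mean_eps:.2e}")
```

The computed means agree with the quoted 〈*ε*^{1/3}〉 and 〈*ε*〉 values to within the rounding of the fitted parameters.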

This avoidance bias is difficult to estimate, but some inference of its magnitude can be made by comparing current PIREP moderate-or-greater (MOG) encounter frequencies with frequencies tabulated in earlier years when presumably poorer reporting and forecasting techniques led to more frequent turbulence encounters. For instance, Colson (1963) estimated that the maximum wintertime encounter frequency of pilot-reported MOG events in 1960 was about 15% after removing PIREPs associated with convection, whereas in the more recent survey by Wolff and Sharman (2008) the upper-level MOG encounter frequency was less than 2%, which implies that the avoidance bias could be substantial. With this in mind, using DAL triggered reports (which contain one point above a threshold plus 11 surrounding 1-min points) is probably more representative of the true distributions of turbulence. Binning these reports into 3000-ft (~0.91 km) vertical intervals and performing the lognormal PDF fit in each interval provides 〈ln(*ε*^{1/3})〉 or 〈*ε*^{1/3}〉 as a function of height. The result is shown in Fig. 11 as a relative distribution of 〈*ε*^{1/3}〉, where now the brackets denote an average over all triggered reports in each altitude interval for reports received during 2009–13. This distribution is consistent with other studies (e.g., Frehlich and Sharman 2010, their Fig. 15; Wolff and Sharman 2008, their Fig. 6; Steiner 1966, his Fig. 1) that show maxima near the surface and aloft, with minimum intensities at midtropospheric levels and a falloff at the highest altitudes sampled by commercial aircraft. Relative to previous studies, however, much more data went into this figure, and the data are more quantitative. Therefore, it should be a more robust result, although it is only representative of the regions covered by DAL737 flights (cf. Fig. 1).
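The altitude binning described above can be sketched as follows. Note that this example swaps in a simpler technique: it estimates the lognormal parameters in each bin from the sample mean and variance of ln(EDR) rather than by the Powell fit to binned PDFs. The input layout (pairs of altitude in feet and EDR) and the function names are assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_by_altitude(reports, width_ft=3000.0):
    """Group EDR values by altitude-bin index; reports are (alt_ft, edr) pairs."""
    bins = defaultdict(list)
    for alt_ft, edr in reports:
        if edr > 0.0:  # ln(EDR) is undefined for null reports
            bins[int(alt_ft // width_ft)].append(edr)
    return bins


def lognormal_mean(values):
    """Mean of a lognormal variable from sample moments of ln(x)."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    return math.exp(mu + 0.5 * var)


def mean_edr_profile(reports, width_ft=3000.0):
    """Map each bin's lower altitude edge (ft) to its lognormal-mean EDR."""
    bins = bin_by_altitude(reports, width_ft)
    return {k * width_ft: lognormal_mean(v) for k, v in sorted(bins.items())}


# Synthetic (altitude in ft, EDR in m^(2/3) s^-1) reports.
reports = [(31000.0, 0.1), (32500.0, 0.1), (34000.0, 0.2), (35900.0, 0.2)]
profile = mean_edr_profile(reports)
```

Dividing each bin value by the maximum over all bins would give a relative distribution with height.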

## 5. Summary and conclusions

Automated in situ EDR reports currently available from some airlines (and probably from more in the future) provide unprecedented sampling of aircraft-scale atmospheric turbulence levels, especially in the UTLS. The large volume of data allows construction of more robust statistics of upper-level turbulence relative to previous, more limited, studies. There are eight specific findings:

1. Careful comparisons of thousands of PIREPs with EDR reports from the same and nearby aircraft showed the 25%–75% range (median) EDR values for different PIREP categories to be 0.01–0.12 (0.01) m^{2/3} s^{−1} for “light,” 0.01–0.26 (0.22) m^{2/3} s^{−1} for “moderate,” and 0.08–0.72 (0.47) m^{2/3} s^{−1} for “severe”; thus there is substantial overlap in the categories, as expected. Of course this is valid only for medium-sized aircraft; in general, the comparison statistics were remarkably consistent between the UAL B757-200s using the accelerometer-based estimation method and the DAL B737-800s using the vertical winds–based estimation method, however. In any event, these values are considerably lower than the current ICAO (2007, 2010) standards of 0.1, 0.4, and 0.7 m^{2/3} s^{−1}, but are not too different from the values listed by other investigators cited in this paper. This result implies that the ICAO values should probably be revised downward, at least for medium-sized aircraft.

2. The medians of the EDR-to-PIREP comparison data fit a simple quadratic [Eq. (19)] very well, and the fits are very close, even though the two aircraft have different flight performance and weights.

3. The quadratic fit and information about the aircraft response function can be used to map EDRs from one aircraft type to another type.

4. An obvious seasonal dependence is exhibited, with maxima in the spring at all flight levels and in winter at upper levels. This behavior is consistent with previous studies that used different data sources.

5. The median turbulence patch length at cruise altitudes (≥FL200) is ~60–70 km, and median depth at all flight altitudes is ~1 km.

6. The EDR data fit a lognormal distribution very well at all altitudes.

7. Higher values of EDR are very rare indeed, although some of this result is due to an avoidance bias that is difficult to quantify. Only 10% of both UAL and DAL data have EDR > 0.1 m^{2/3} s^{−1}, and the frequency of EDR values > 0.5 m^{2/3} s^{−1} is only ~2 × 10^{−5}.

8. The average altitude distribution shows maxima at midlevels at ~3 km and at upper levels at ~11 km.

Operationally, the availability of in situ EDR data in near–real time provides better situational awareness of turbulence within the national airspace that can be used for tactical avoidance and, in turn, allows enhancements in safety, capacity, and operational efficiencies. Further, given the higher reporting frequency, aircraft independence, and the higher time and position accuracy relative to PIREPs, the in situ EDR data provide a valuable verification and tuning source for turbulence nowcasting/forecasting algorithms that have traditionally been limited to PIREPs (e.g., Sharman et al. 2006; Kim et al. 2011; McCann et al. 2012) or to accelerometer metrics (e.g., Gill 2014), both of which introduce uncertainties in the results because they do not directly provide an atmospheric turbulence intensity metric. Given that the preferred verification source for turbulence forecast and nowcast algorithms is EDR, it makes sense that future versions of turbulence forecasting algorithms output EDR as the intensity metric, and this approach has already been adopted for the updated Graphical Turbulence Guidance product (Sharman et al. 2006; output available online at http://www.aviationweather.gov/adds/turbulence).

The EDR data also provide a rich data source for turbulence case studies, especially those related to near-cloud turbulence for which high position and timing accuracies are required (e.g., Lane et al. 2012; Sharman et al. 2012b; Williams 2014). Other applications of EDR data include their use as a verification source for remote sensing techniques (e.g., Williams et al. 2006; Chan 2010) and to provide input to real-time wake-vortex models for computing wake decay (e.g., Holzäpfel 2003). Other possible applications are discussed in Emanuel et al. (2013). If, as expected, other airlines implement these automated in situ EDR reports, wider and more complete coverage of the airspace will allow better evaluations of UTLS turbulence statistics globally as well as lead to enhanced efficiency and safety of flight; these EDR reports may ultimately replace PIREPs as the turbulence reporting standard.

## Acknowledgments

This research is in response to requirements and funding by the U.S. Federal Aviation Administration (FAA). The views expressed are those of the authors and do not necessarily represent the official policy or position of the FAA. Thanks are given to Rod Frehlich for developing the lognormal-fit algorithm used in section 4. The authors are grateful to three anonymous reviewers for their constructive comments that led to clarifications in the manuscript.

## REFERENCES

*Proc. 18th Digital Avionics Systems Conf.,* St. Louis, MO, IEEE, 6.D.3-1–6.D.3-8.

*11th Conf. on Aviation, Range, and Aerospace Meteorology,* Hyannis, MA, Amer. Meteor. Soc., 4.3. [Available online at https://ams.confex.com/ams/pdfpapers/81622.pdf.]

*In situ* performance standard for eddy dissipation rate. Preprint and Recording, *16th Conf. of Aviation, Range, and Aerospace Meteorology,* Austin, TX, Amer. Meteor. Soc., 11.3. [Available online at https://ams.confex.com/ams/93Annual/webprogram/Paper219007.html.]

*Turbulence: An Introduction to Its Mechanism and Theory.* McGraw-Hill, 586 pp.

*Gust Loads on Aircraft: Concepts and Applications.* AIAA Education Series, Institute of Aeronautics and Astronautics, 306 pp.

*Statistical Fluid Mechanics: Mechanics of Turbulence.* Vol. 2, MIT Press, 874 pp.

*Atmospheric Turbulence Relative to Aviation, Missile, and Space Programs,* NASA Conf. Publ. 2468, 137–154.

*A Course in Digital Signal Processing.* John Wiley and Sons, 602 pp.

*Numerical Recipes: The Art of Scientific Computing.* 2nd ed. Cambridge University Press, 963 pp.

*Proc. 41st Aerospace Sciences Meeting and Exhibit,* Reno, NV, AIAA, AIAA-2003-194.

*Turbulence in the Free Atmosphere.* Plenum, 310 pp.

*Proc. 44th AIAA Aerospace Sciences Meeting and Exhibit,* Reno, NV, AIAA, AIAA 2006-0076. [Available online at ftp://ftp.rap.ucar.edu/pub/jkwillia/NTDA/DAR_1/AIAA-2006-76-806.pdf.]

## Footnotes

The National Center for Atmospheric Research is sponsored by the National Science Foundation.