## Abstract

A novel approach is constructed for determining both the occurrence and the duration of stationary periods within time series. After reviewing four statistical techniques that provide stationarity measures invariant to applying a constant offset to the variable of interest, the reverse arrangement test is selected as the basic statistical operation to construct the approach. The probability distributions of the starting and ending points of stationary intervals are used to determine the nonstationary location at which a period should be split into two subperiods and nonstationary samples that should be discarded from analysis of time-averaged statistics. The approach provides efficient analysis of long-term datasets and is capable of relating data sampled at multiple locations. Applying the approach to data obtained within a walnut orchard canopy during the defoliated phase of the Canopy Horizontal Array Turbulence Study (CHATS) yields periods with stationary within-canopy velocities required by analysis of the bulk drag–wind relationship. The uncertainties in empirical estimates of the bulk drag–wind relationship associated with nonstationarity and finite duration of time series are evaluated. An integral view of various stationarity measures is presented, with a highlight of their links to physical processes.

## 1. Introduction

Stationary time series are required to precisely estimate time-averaged statistics. Without prior knowledge of a variable’s characteristics, a stationarity measure should desirably remain invariant to applying a constant offset to the variable. Statistical techniques satisfying this requirement include the integral time scale (Lumley and Panofsky 1964), the reverse arrangement test (Kendall et al. 1979), the run test (Brownlee 1965), and the nonstationarity ratio (Mahrt 1998). In previous studies using these techniques, the time series were either prescribed with a fixed duration (e.g., Večenaj and de Wekker 2015; Mahrt 1998; Andreas et al. 2003) or predetermined using visual inspection (e.g., Dias et al. 2004). Arbitrary criteria were chosen to separate between stationary and nonstationary time series, such as an upper bound of 2 min for the integral time scale (Dias et al. 2004), a level of significance of 0.05 for the reverse arrangement test and the run test (Dias et al. 2004; Večenaj and de Wekker 2015), and an upper bound of 2 for the nonstationarity ratio (Mahrt 1998; Večenaj and de Wekker 2015). The degrees of nonstationarity characterized by these arbitrary criteria are not equivalent, and their relationships may vary from one time series to another (Dias et al. 2004; Večenaj and de Wekker 2015). No standard approaches have been proposed for (i) linking stationarity measures to physical mechanisms, (ii) determining the occurrence and duration of stationary periods without visual inspection, and (iii) investigating the dependence of time-averaged statistics on the criteria that separate stationary and nonstationary periods.

The goal of this work is to construct an approach to determine stationary periods that accommodates the abovementioned three aspects. To facilitate reading equations in this manuscript, the appendix summarizes the definition of mathematical terms. The approach is applicable to any time series, and is applied here to within-canopy mean velocities involved in empirical estimates of the bulk drag–wind relationship. Canopies of roughness elements (e.g., crops, forests, and cities) cover a considerable fraction of the land surface and provide essential forcing for the evolution of weather, climate, and biogeochemistry. Therefore, accurate representation of the bulk drag–wind relationship is critical for numerical simulations in which canopies are not resolved.

Chatfield’s (1996, 28–29) definition of strictly stationary suggests that mean trends and periodic variations are two major causes of nonstationarity. Section 2 reviews the four available statistical techniques, showing that interpreting the integral time scale and nonstationarity ratio can be misleading without prior knowledge of mean trends and periodic variations. Section 3a reviews the assumptions used in empirical estimates of the bulk drag–wind relationship, and it shows that determining periods with stationary within-canopy mean velocities reduces uncertainties in analysis of the relationship. The reverse arrangement test is selected as the basic statistical operation to construct an approach that determines both the occurrence and the duration of stationary periods. Section 3b describes the procedures of analyzing data from the Canopy Horizontal Array Turbulence Study (CHATS; NCAR Earth Observing Laboratory 2011; Patton et al. 2011), an ideal dataset providing high-frequency time series of all three velocity components measured at seven levels within a walnut orchard canopy. Section 4 presents results of stationary periods during the defoliated phase of CHATS, highlighting the sensitivity of empirical estimates of the bulk drag–wind relationship to the criterion specified for the reverse arrangement test. Section 5a estimates the integral time scales for all three velocity components and the vertical turbulent momentum flux, and then investigates uncertainties in time-averaged statistics associated with the finite duration of time series. Section 5b provides an integral view of the four stationarity measures reviewed in section 2, highlighting their links to physical processes. Section 6 outlines the major findings about determining stationary periods in time series and evaluating the uncertainties in empirical estimates of the bulk drag–wind relationship.

## 2. Statistical techniques measuring the stationarity of time series

### a. The integral time scale

The integral time scale measures the duration that a variable, *χ*, is correlated with itself (Lumley and Panofsky 1964, 14–15),

where

is the autocovariance function. Here an overbar indicates averaging over an infinite duration (equivalent to ensemble averaging), is the variance of *χ*, and is the autocorrelation function. The integral in (1) is finite only in the absence of a mean trend and therefore the value of is linked to the stationarity of a time series (Lumley and Panofsky 1964, 15–16). However, time series in reality are always of finite duration, compromising the representativeness of the estimated autocorrelation function (Yaglom 1987, 237–241). Without prior knowledge of any mean trends, interpreting an integral time scale estimated using raw field-observed time series can be misleading.

However, knowing the order of magnitude of is a prerequisite to using three of the available statistical techniques, that is, the reverse arrangement test (Kendall et al. 1979), the run test (Brownlee 1965), and the nonstationarity ratio (Mahrt 1998). All three techniques require a minimum number of mutually independent samples, an assumption never perfectly satisfied in field experiments. Practically, averaging *χ* over subrecords of duration that do not overlap provides approximately independent samples. Before knowing the stationarity of *χ*, one can use previously published data to estimate the order of . Section 3b shows an example of estimating the integral time scales of within-canopy velocity components using a combination of wind tunnel measurements, large-eddy simulation (LES) results, and field-observed velocity spectra.

After determining stationary time periods, the integral time scales are estimated using two approaches: 1) the value of *η* at (Kaimal and Finnigan 1994, p. 35; where *e* is the base for natural logarithms) and 2) the integral of up to its first zero crossing (Lenschow and Stankov 1986). Approach 1 is consistent with the strategy used by an LES study (Patton et al. 2016) involved in estimating integral time scales in section 3b, and is conceptually similar to a spectrum-based approach used by Lenschow et al. (1994). Approach 2 is useful for justifying the choice of because it usually provides larger estimates of integral time scales than alternative approaches [e.g., the integral of the autocorrelation function modified by lag-window functions used by Dias et al. (2004)]. The estimated integral time scales are then used to quantify the uncertainties in time-averaged statistics with respect to true ensemble averages (Lumley and Panofsky 1964, 35–37),

where the hat indicates time-averaging over duration .

### b. The reverse arrangement test

Given a sequence of samples , where , 2, …, *N*, a reverse arrangement is defined as for (Kendall et al. 1979, p. 505). Here a tilde indicates averaging over a subrecord of duration , which provides approximately independent samples. For the simplest case, , the possible number of reverse arrangements are 0 and 1, and the corresponding numbers of possible permutations, and , are both 1. For , the number of possible permutations for a given number of reverse arrangements *A* can be calculated as (Kendall et al. 1979, p. 506)

where and . The probability density function for a sequence of *N* samples to have *A* reverse arrangements is

Figure 1a provides theoretical values of for and 30. The cumulative probability, , measures the significance of mean trends. Specifically, indicates a positive mean trend, while indicates a negative mean trend. The significance of mean trends increases as approaches 0 or 1.

Given a level of significance *α* we can find two values of reverse arrangements, and , such that (Kendall et al. 1979, p. 513)

Example values of and for are indicated by vertical dotted lines in Fig. 1a. A time series is declared stationary for , and the reliability is guaranteed for a sample size (Bendat and Piersol 2011, 97–99).

### c. The run test

Given , where , 2, …, *N*, we define a run as the instance that a time series crosses the median of samples (Brownlee 1965, 231–232). The probability for a time series of *N* samples to have *Q* runs is (Brownlee 1965, 224–226)

Here *r* is the number of samples below the median, and is the number of samples above or at median. For even values of *N*, , and for odd values of *N*, . The possible number of runs *Q* ranges from 1 to . The time series crosses its median *s* times upward and times downward. For even values of *Q*, , and for odd values of *Q*, . Figure 1b provides theoretical values of for and 30. The cumulative probability, , measures the biases toward large- or small-scale variations. Specifically, indicates a bias toward variations on scales greater than , including but not limited to mean trends, whereas indicates a bias toward variations on scales smaller than . The significance of biases increases as approaches 0 or 1.

Given a level of *α*, we can find two numbers of runs, and , such that (Brownlee 1965, p. 226)

Example values of and for are indicated by vertical dotted lines in Fig. 1b. A time series is declared stationary for , and the reliability is guaranteed for a sample size (Bendat and Piersol 2011, 95–97).

### d. The nonstationarity ratio

The nonstationarity ratio proposed by Mahrt (1998) requires division of a time series of duration into *I* records, and each record consists of *J* approximately independent samples, . Each record has a mean value,

where a double tilde indicates averaging over a record of duration . The standard deviation of the record mean values is calculated as

where is the mean value of the entire time series. Meanwhile, the standard deviation of samples within each record is

and the random variability of records is

The nonstationarity ratio, defined as

measures the contribution from variations on scales larger than 10% of the duration of a time series, including but not limited to mean trends (Andreas et al. 2003). If samples are random, then NR is approximately unity for and (Bendat and Piersol 2011, 82–83). Mahrt (1998) proposed to declare a time series nonstationary. For a time series of duration , the averaging time () for a sample increases with decreasing sample size (), that is, . Because samples become more independent from each other as the sample averaging time increases, we adopt and from Večenaj and de Wekker (2015), which yield a smaller sample size compared to values in other studies (e.g., Mahrt 1998; Andreas et al. 2003).

To examine the sensitivity of stationarity measures to periodic variations within a time series, we construct a virtual time series of 40 samples (Fig. 2), where

Because the characteristics of this time series remain unchanging, the stationarity measures should be similar for two shorter time series of the same number of samples starting from different time locations. For example, all time series of 30 samples starting from are highly nonstationary according to the reverse arrangement test and the run test, because values of and are always less than 0.01 (not shown). However, the values of NR for a time series of 30 samples can vary from 0.45 to infinity (Fig. 2b). Using NR to measure stationarity suggests that a time series of 30 samples within the virtual time series of 40 samples would be deemed both stationary () and highly nonstationary (). Therefore, without prior knowledge of periodic variations within a time series, interpreting the results of nonstationarity ratio can be misleading. This finding is consistent with Andreas et al. (2003), who reported a sensitivity of NR to periodic variations generated by organized coherent structures.

## 3. Analyzing the stationarity of within-canopy mean velocities

### a. Assumptions used in empirical estimates of the bulk drag–wind relationship

Atmospheric boundary layer (ABL) flows can be described using the Boussinesq approximation of the Navier–Stokes equation in a rotating frame of reference:

where is the instantaneous velocity vector, ν is the kinematic viscosity, is the effective gravitational acceleration, and is the angular velocity of the rotating frame of reference. Variables *ρ* and *p* are the perturbations of pressure and density with respect to a hydrostatic base state, and , respectively. In the presence of canopy of roughness elements (e.g., plants, wind turbines, and buildings), averaging (17) in horizontal directions (represented by angle brackets) yields

The noncommutativity of horizontal averaging and spatial differentiation induces a force term that consists of form and viscous drag exerted by canopy elements on the flow. Because of the complexity of evolving boundary layers on surfaces of individual canopy elements, deriving a model for canopy drag () from first principles is nontrivial. Thus, empirical estimates of canopy drag from experimental data are mandatory, which require simplification of (18) as much as possible. With the assumptions of large Rossby and Reynolds numbers and homogeneous flow on the horizontal averaging scale, applying ensemble averaging to the streamwise component of (18) yields

The vertical momentum flux consists of three terms,

where and are instantaneous velocity fluctuations with respect to their ensemble averages, and and are the deviations of velocity ensemble averages with respect to their horizontally averaged values. The second term, , becomes zero when vanishes in the presence of flat topography. If the canopy is sufficiently dense, then the mean vertical turbulent momentum flux does not penetrate through the canopy to the ground (i.e., ), and the dispersive flux is typically negligible compared with the canopy-top mean vertical turbulent momentum flux . Here the vertical bar represents evaluating turbulence statistics at the location indicated by the subscript. Thus, for a dense canopy on top of flat topography, integrating (19) over the entire canopy depth (, where *h* is the canopy height) yields

The term is known as the bulk mean canopy drag. In the streamwise direction, the vertically integrated mean pressure gradient is typically negligible compared with . In the absence of mean acceleration—that is, —the bulk mean canopy drag should remain unchanging and therefore should remain constant. In summary, if all the assumptions used to obtain (21) are satisfied, then removing timeperiods with nonstationary within-canopy mean velocities should reduce the variability of the bulk drag–wind relationship.

Within-canopy velocity components are frequently influenced by nonstationarity sources, such as weather systems and diurnal cycles (Van der Hoven 1957; Fiedler and Panofsky 1970). Turbulent flows within canopies are also dominated by organized coherent structures (Gao et al. 1989; Finnigan et al. 2009). According to the discussion in section 2, raw time series of within-canopy velocity components cannot be analyzed directly using the integral time scale and nonstationarity ratio techniques. For the data analysis in this work, the reverse arrangement test is more desirable than the run test for three major reasons. First, the reverse arrangement test measures mean trends only (section 2b), whereas the run test measures a combination of mean trends and periodic variations that cannot be distinguished from each other (section 2c). The enhancement of canopy drag during acceleration and the reduction of canopy drag during deceleration are qualitatively understood, whereas the impact of periodic velocity variations on canopy drag remains unclear. Second, the probabilities for a time series of velocity components to show positive and negative mean trends are approximately equal, whereas a balance between large- and small-scale variations is not guaranteed. Third, the range of reverse arrangements, , is times greater than the range of runs, , and therefore the reverse arrangement test provides finer stationarity measures as compared to the run test (see Fig. 1).

### b. The procedures of analyzing the CHATS data

#### 1) Overview of CHATS data

During CHATS, 13 levels of sonic anemometers were deployed facing west on a vertical tower near the north edge of a walnut orchard canopy (NCAR Earth Observing Laboratory 2011; Patton et al. 2011). The trees were uniformly planted on a flat topography with roughly 7-m spacing, and the canopy height is 10 m. The lowest seven levels of sonic anemometers were deployed within the canopy (, 3, 4.5, 6, 7.5, 9, and 10 m). The Campbell Scientific CSAT3 sonic anemometers measured all three velocity components and virtual temperature at a frequency of 60 Hz. The flow distortion as a result of transducer shadowing is corrected following the approaches suggested by Horst et al. (2015). The sonic tilt angles were calculated using the planar fit technique (Wilczak et al. 2001), and the coordinate system was rotated so that the average of vertical velocity over the entire duration of the experiment became zero. The resulting velocity components are along the axes of a local north–west–up (NWU) coordinate system. The horizontally homogeneous flow assumption used to obtain (18) requires sufficiently long fetch of canopy upwind from the instruments. Therefore, data analysis in this work is limited to periods during which canopy-top mean wind directions were within 30° from south. An initial interrogation of 5-min averaged data collected during the defoliated phase of CHATS (from 0800 UTC 15 March 2007 to 0800 UTC 14 April 2007) suggests that the longest period satisfying the mean wind direction requirement is 245 min.

As explained in section 2a, obtaining approximately independent samples for the reverse arrangement test requires estimating the order of integral time scales before knowing the stationarity of time series. We use two approaches to estimate an upper bound for the integral time scales of within-canopy velocities. The first approach uses wind tunnel measurements (Shaw et al. 1995) and LES results (Patton et al. 2016). The LES study coupled a typical forest canopy with ABL turbulent motions for stability conditions ranging from near neutral to free convection. The canopy-top integral length scales of *u* are smaller than eight canopy heights (Fig. 12a in Patton et al. 2016). The conversion from integral length scales to integral time scales uses information from wind tunnel measurements. At canopy top, the integral length scale of *u* is 3 times greater than the product of and (Fig. 15 in Shaw et al. 1995). Thus, an upper bound for the canopy top is estimated as

where is the duration of an individual time series. Plugging m and m s^{−1} across all case studies investigated in section 4 into (22) yields s. The second approach uses LES results (Patton et al. 2016) and velocity spectra measured during CHATS. The *integral length scales* and the peak wavelengths in the wavenumber-weighted spectra are theoretically related (Kristensen et al. 1989). LES results suggest that the canopy-top peak wavelengths are 1.9–3.3 times the integral length scales of *u* for stability conditions ranging from near neutral to free convection [combining Figs. 8 and 12a in Patton et al. (2016)]. For case studies investigated in section 4, the frequency-weighted spectra of the south–north velocity component (within 30° from the streamwise direction) show peak periods ranging from 18 to 60 s (not shown). If the ratio between a peak period and an integral time scale equals the ratio between a peak wavelength and an integral length scale, then we estimate s s. Both approaches yield consistent estimates of the upper bound for . Within the canopy, values of are within of its canopy-top magnitude (Fig. 17 in Shaw et al. 1995). Assuming that is about an order of magnitude smaller than , and and are of similar magnitude, we estimate an upper bound of 50 s for the integral time scales of within-canopy velocities during CHATS.

A duration greater than 30 min is typically required to obtain adequate sampling of ABL turbulent motions and to provide time-averaged statistics representative of ensemble averages (Kaimal and Finnigan 1994, 254–256). A minimum duration of 30 min and the requirement of at least 10 samples for the reverse arrangement test suggest an averaging time of min, which satisfies s. Hereafter, stationarity of within-canopy mean velocity is evaluated by applying the reverse arrangement test to time series of 3-min averaged samples. Because a time period may contain several stationary subperiods separated by nonstationary subperiods, the analysis consists of two steps: (i) determining the nonstationary location at which a period should be split into two subperiods and (ii) removing nonstationary samples from further analysis.

#### 2) Splitting a period at nonstationary locations

Given a period of duration , applying the reverse arrangement test with a level of α to velocity time series at a specified height determines a group of intervals with stationary mean velocity components in all three directions. The durations of stationary intervals () range from 30 min to , and the probability distributions of the starting and ending points of these stationary intervals are used to determine nonstationary locations, as shown in Fig. 3. The probability distributions of the starting and ending points of a group of stationary intervals typically show a number of “modes” separated by time periods associated with zero probability distributions. Specifically, if the probability distribution of shows

then and are the starting and ending points of the *i*th mode. The importance of an individual mode is measured using its cumulative probability distribution, . The mode yielding the largest cumulative probability distribution is defined as the primary mode, and the starting point of the primary mode of is represented as . At least one interval starting at is stationary, whereas all intervals starting at are nonstationary. Therefore, the sample of averaged velocities between and contributes to the nonstationarity. In a statistical sense across all levels within the canopy, at least a portion of the period before the latest starting point of the primary modes of , , is nonstationary. The nonstationarity of a sample of averaged velocities between and is measured by the number of levels showing a zero probability distribution of , denoted as . Similarly, a primary mode can be identified for the probability distribution of , and the ending point of the primary mode is represented as . At least one interval ending at is stationary, whereas all intervals ending at are nonstationary. Therefore, the sample of averaged velocities between and contributes to the nonstationarity. In a statistical sense across all levels within the canopy, at least a portion of the period after the earliest ending point of the primary modes of , , is nonstationary. The nonstationarity of a sample of averaged velocities between and is measured by the number of levels showing a zero probability distribution of , denoted as . The strategy to split a period into multiple subperiods depends on the combination of and .

If min and min, then the period needs no further splitting.

If min and min, then a maximum is found for 30 min , and the period is split into two subperiods at the corresponding location.

If min and min, then a maximum is found for min, and the period is split into two subperiods at the corresponding location.

If min and min, then a maximum is found for 30 min and a maximum is found for min. The period is split into two subperiods at the location corresponding to the largest maximum value.

The abovementioned procedures are repeated until min and min are satisfied for all subperiods. Then the resulting subperiods are passed to analysis in the next subsection.

#### 3) Removing nonstationary samples from further analysis

For a subperiod of duration obtained after completing the procedures in Fig. 3, applying the reverse arrangement test with a level of significance (α) to velocity time series at all heights within the canopy determines a group of stationary intervals. Recall that this subperiod satisfies min and min (i.e., needing no further splitting), and samples before and after contribute to the nonstationarity. Figure 4 shows the strategy to remove nonstationary samples, which depends on the combination of and .

If and , then all intervals are used for further analysis of statistics, that is, .

If and , then removing members containing samples before yields a remaining group of intervals. These remaining intervals are used for further analysis of statistics, that is, .

If and , then removing members containing samples after yields a remaining group of intervals. These remaining intervals are used for further analysis of statistics, that is, .

If and , then removing members containing samples before yields a remaining group of intervals, while removing members containing samples after yields a remaining group of intervals. If , then samples before contribute less to the nonstationarity than samples after , and the subperiod is modified by removing samples after . Otherwise, the subperiod is modified by removing samples before . The procedures in Fig. 4 are repeated for the modified subperiod.

The procedures in Fig. 4 finally yield a remaining group of intervals with stationary mean velocities at all levels within the canopy. The variability of time-averaged statistics for a group of intervals can be quantified as

where for each interval is weighted by its duration to provide a statistical mean,

and the variance of across intervals is estimated as

So far all procedures have been conducted in the local NWU coordinate system (i.e., south–north, west–east, and vertical directions). The canopy-top mean horizontal velocity components calculated using (25) provide a mean wind direction. Data are then rotated into a coordinate system aligned with the canopy-top mean flow (i.e., streamwise, spanwise, and vertical directions). Empirical estimates of the bulk drag–wind relationship use mean velocity and mean vertical turbulent momentum flux at canopy top from all available subperiods. On a logarithmic scale, can be considered as negligible if

where is an arbitrarily chosen small number. Thus, empirical estimates of the bulk drag–wind relationship become more reliable with increasing ranges of and across available subperiods and decreasing and for individual subperiods.

## 4. Results

### a. An overview of 37 case studies during the defoliated phase of CHATS

Figure 5 shows the distributions of a total of 37 periods during the defoliated phase of CHATS satisfying the requirements of (i) canopy-top mean wind directions within 30° from south, determined using the initial interrogation of 5-min averaged data; and (ii) at least two stationary intervals longer than 30 min for calculating the variance of time-averaged statistics in (26), determined by applying the reverse arrangement test with to 3-min averaged within-canopy velocities. The durations of these periods range from 35 to 245 min, and the majority of these periods are shorter than 120 min (Fig. 5a).

Figure 5b shows that most of the 37 periods occur from late afternoon to sunrise. Figure 5c provides rough estimates of the canopy-top friction velocity,

and the stability parameter , where the Obukhov length is calculated as

Here the subscripts “SN” and “EW” indicate south–north and east–west directions, respectively; is the von Kármán constant; m s^{−2} is the gravitational acceleration; and is the virtual potential temperature. The canopy-top friction velocity ranges from 0.14 to 0.81 m s^{−1}, while ranges from −0.86 to 0.55. Positive values of indicate stable conditions, whereas negative values of indicate unstable conditions. According to the classification of stability conditions defined by Dupont and Patton (2012), the majority of the time periods are classified as forced convection, near-neutral, and transition to stable conditions (). Note that the values of and shown in Fig. 5c are used only to suggest the overall characteristics of the turbulence and stability conditions, because the stationarity of time series has not been evaluated yet.

### b. Stationary periods determined for a case study from 0255 to 0455 UTC 4 April 2007

A case study from 0255 to 0455 UTC 4 April 2007 is chosen from those shown in Fig. 5 to illustrate the data-processing procedures and strategies described in section 3b. The turbulence and stability conditions are characterized by m s^{−1} and , classified as transition to stable (Dupont and Patton 2012). Figure 6 shows that the variations of 3-min averaged vertical velocity component are much smaller than the variations of 3-min averaged horizontal velocity components. This visual inspection is consistent with statistical results suggesting that the vertical velocity component typically does not contribute to nonstationarity (not shown). The magnitude of the 3-min averaged south–north velocity component always increases with height in the upper half of the canopy (red, orange, green, and cyan solid lines in Fig. 6a). The magnitude of the 3-min averaged east–west velocity component either increases or remains approximately constant with height (Fig. 6b). Visually determining the stationarity of all three velocity components at all seven levels within the canopy is not obvious.

For , the reverse arrangement test yields groups of , 144, 115, 120, 127, 117, and 113 intervals longer than 30 min with stationary mean velocities at , 0.3, 0.45, 0.6, 0.75, 0.9, and 1, respectively. Figure 7a,b shows the probability distributions of the starting and ending points of these stationary intervals. The latest starting point of the primary modes of and the earliest ending point of the primary modes of across all levels within the canopy are min and min (the gray vertical dotted lines in Figs. 7a,b, respectively). Because min and min, we need to evaluate both the number of levels showing zero probability distribution of for 30 min and the number of levels showing zero probability distribution of for min. A maximum of is found at min and a maximum of is found at min (the black vertical dotted lines in Figs. 7a,b, respectively). Because the maximum is greater than the maximum , min (i.e., 0413 UTC) is the most nonstationary location during the period of duration , at which the period is split into two subperiods (the thick black vertical solid lines in Figs. 6, 7c,d). From 0413 to 0425 UTC, 3-min averaged velocities at all levels within the canopy are characterized by approximately zero east–west components (Fig. 6b) and increasing south–north components (Fig. 6a), supporting the seven-level nonstationarity suggested by the probability distribution of . Interestingly, the magnitude of the east–west velocity component increases with height during the subperiod from 0255 to 0413 UTC, but it remains approximately constant with height during the subperiod from 0413 to 0455 UTC (Fig. 6b).

We then evaluate the stationarity of the two subperiods split at 0413 UTC. For the subperiod from 0255 to 0413 UTC ( min, to the left of the thick black vertical solid lines in Figs. 7c,d), the probability distributions of the starting and ending points of intervals with stationary mean velocities at individual levels within the canopy yield and min. This subperiod needs neither splitting nor modification. The reverse arrangement test with determines a group of intervals with stationary mean velocities at all levels within the canopy, none of which contains samples after min (i.e., 0407 UTC). Similarly, for the subperiod from 0413 to 0455 UTC ( min, to the right of the thick black vertical solid lines in Figs. 7c,d), the probability distributions of the starting and ending points of intervals with stationary mean velocities at individual levels within the canopy yield and min. This subperiod needs neither further splitting nor further modification. The reverse arrangement test with determines a group of intervals with stationary mean velocities at all levels within the canopy.

Table 1 compares statistics calculated for stationary intervals during the subperiod from 0255 to 0413 UTC and for stationary intervals during the subperiods from 0413 to 0455 UTC. The canopy-top weighted mean wind speed, , remained the same, while the canopy-top weighted mean wind direction, , changed by 18°. These results suggest that the seven-level nonstationarity detected in Fig. 7b is a sudden change in the mean wind direction. After rotating the coordinate system to align with the canopy-top mean wind during individual subperiods, the 3-min averaged steamwise velocity component always increases with height in the upper canopy (Fig. 8a), whereas the 3-min averaged spanwise velocity component always remains approximately constant with height (Fig. 8b).

The ratio between and in Table 1 measures the alignment between canopy-top mean shear stress and mean wind, with small values of this ratio implying better alignment than large values. During the subperiod from 0255 to 0413 UTC, a ratio of 0.022 indicates good alignment between canopy-top mean shear stress and mean wind, whereas during the subperiod from 0413 to 0455 UTC, a ratio of 0.158 indicates misalignment between canopy-top mean shear stress and mean wind. The misalignment is caused by a lag in the change of mean shear stress in response to the change in mean wind direction, during which the mean shear stress is actively adjusting toward the direction of mean wind. This hysteresis is a type of nonstationarity, even if the uncertainties associated with nonstationarity, and , are negligible.

The stationarity of mean velocities during the time period from 0255 to 0455 UTC 4 April 2007 is further investigated by increasing the level of significance used for the reverse arrangement test. The requirement of at least two intervals longer than 30 min with stationary mean velocities at all levels within the canopy suggests a maximum level of significance, . For , the probability distributions of the starting and ending points of stationary intervals suggest splitting the period of duration min into two subperiods at a location ranging from 0404 to 0413 UTC. During the subperiod prior to the split, variances of time-averaged statistics in (26) are available for , whereas during the subperiod after the split, variances of time-averaged statistics in (26) are available only for . These results also reveal the connection between nonstationarity and the misalignment between mean shear stress and mean wind. Thus, empirical estimates of the bulk drag–wind relationship should only consider the subperiods that additionally show good alignment between canopy-top mean shear stress and mean wind.

### c. Stationary periods determined for all case studies

The analysis procedures proposed in section 3b are conducted for the 37 case studies during the defoliated phase of CHATS shown in Fig. 5. Similar to the case study shown in section 4b, after rotating the coordinate system to align with the canopy-top mean wind during individual subperiods, the 3-min averaged spanwise velocity component always remains approximately constant with height. Note that the sample averaging time, min, is sufficient for a within-canopy velocity component to become uncorrelated with itself (assumed in section 3b and verified in section 5a). The collapse of the 3-min averaged spanwise velocity component at various levels implies no rotation of mean wind direction with height within the canopy during stationary periods determined for case studies shown in Fig. 5. A reliable approach for determining stationary periods is critical for evaluating time-averaged statistics and consequently for examining the rotation of within-canopy mean wind direction.

As discussed in section 4b, empirical estimates of the bulk drag–wind relationship should consider only stationary subperiods that also show good alignment between canopy-top mean shear stress and mean wind. Here an angle between mean shear stress and mean wind smaller than 5° is arbitrarily chosen as a criterion for good alignment, corresponding to the magnitude of smaller than of the magnitude of . Figure 9 shows results of against for on a log–log scale. The error bars indicate and for individual subperiods. Red symbols representing subperiods showing good alignment between canopy-top mean shear stress and mean wind and negligible and suggest an approximate power-law relationship between the weighted mean vertical turbulent momentum flux and the weighted mean wind at canopy top, insensitive to the level of significance *α* between 0.05 and 0.35. Adding black symbols representing subperiods showing good alignment between canopy-top mean shear stress and mean wind but nonnegligible and does not change the power-law behavior of the bulk drag–wind relationship. The black symbols are few, implying that the majority of subperiods showing good alignment between canopy-top mean shear stress and mean wind also show negligible and . Gray symbols representing subperiods showing misalignment between canopy-top mean shear stress and mean wind can deviate considerably from the power-law bulk drag–wind relationship suggested by red and black symbols. The subperiod yielding the greatest deviation is characterized by m s^{−1} and m^{2} s^{−2}, corresponding to the only case study classified as free convection (, the circle below the red dashed line in Fig. 5c). This subperiod also yields the greatest misalignment between mean shear stress and mean wind at canopy top, .

For further analysis in section 5, we specify , which yields 23 subperiods showing good alignment between canopy-top mean shear stress and mean wind and negligible and . These subperiods yield the greatest range of and for empirical estimates of the bulk drag–wind relationship (comparing the span of red symbols in Fig. 9c with those in other panels of Fig. 9). All subperiods are classified as forced convection, near-neutral, and transition to stable conditions, during which mean shear contributes considerably to the production of turbulent kinetic energy (TKE).

## 5. Discussion

### a. Estimating the integral time scales and the uncertainties associated with finite duration

Figure 10 shows estimates of the integral time scales of the three velocity components and the vertical turbulent momentum flux for the 23 subperiods shown as red symbols in Fig. 9c as the value of *η* at . The integral time-scale estimates are always finite, a necessary but insufficient requirement for stationary time series of finite duration. In general, the integral time scales estimated from field data are shorter than 50 s (to the left of the vertical dotted lines in Figs. 10a,b), consistent with the assumption obtained using wind tunnel measurements, LES results, and velocity spectra (see section 3b). The two exceptional subperiods are (i) from 0650 to 0740 UTC 1 April 2007 (the olive line, characterized by = 0.41) and (ii) from 0820 to 0953 UTC 1 April 2007 (the green line, characterized by = 0.55). These two most stable subperiods (the top two circles in Fig. 5c) yield an order of magnitude larger than the other subperiods (see Fig. 10b). Visual inspection of the Raman-Shifted Eye-Safe Aerosol Lidar movies (Mayor 2010) identifies wavelike propagation of aerosols during these two subperiods, similar to the microscale internal gravity waves in very stable flows above the walnut orchard canopy reported by Mayor (2017). Because of the occurrence of microscale internal gravity waves and the large integral time scales of the spanwise velocity component, these two stable subperiods characterized by should require longer durations of time series than the other subperiods (including unstable subperiods). These results suggest that the conclusions obtained by Dupont and Patton (2012) for stable conditions may have been biased by their choice of a duration of 5 min for individual time series.

To justify the choice of , Fig. 11 compares the integral time scales estimated by integrating to its first zero crossing to those estimated as *η* at (i.e., results shown in Fig. 10). Except for the two most stable subperiods represented by olive and green circles, respectively, the integral time-scale estimates obtained from both approaches are mostly shorter than 1.5 min, so the choice to use min is appropriate. For and shorter than 1.5 min (circles in the bottom-left quadrant in Figs. 11a,b), integrating to its first zero crossing usually provides larger estimates of and than evaluating *η* at (most circles are above the 1:1 lines). For and , the two approaches provide estimates in good agreement with each other (most circles are around the 1:1 lines in Figs. 11c,d).

Because the two most stable subperiods behave considerably differently from the other subperiods, they are hereafter excluded from the analysis. Figure 12 shows the uncertainties of time-averaged turbulence statistics associated with the finite duration of time series estimated using (3) for the remaining 21 subperiods. Similar to (27), can be considered negligible if

According to (3), tends to decrease with increasing . For min, the estimates of are almost always negligible (i.e., below the horizontal dotted line in Fig. 12b). However, min is required for the estimates of to become always negligible (i.e., below the horizontal dotted line in Fig. 12a). These results suggest that the uncertainties in canopy-top mean velocity associated with finite duration dominate the uncertainties in empirical estimates of the bulk drag–wind relationship. Practically, two approaches can be used to improve the reliability of empirical estimates of the bulk drag–wind relationship: (i) increasing the lower limit of from 30 to 48 min and (ii) eliminating stationary intervals showing nonnegligible uncertainties associated with the finite duration of time series (i.e., circles above the horizontal dotted lines in Fig. 12).

### b. An integrated view of stationarity measures

By definition, for stationary time series determined using the reverse arrangement test with , (see details in section 2b) is always confined within the range from 0.075 to 0.925. For the 21 subperiods investigated in Fig. 12, the probability distributions of are evaluated for time series of 3-min averaged samples of the three velocity components in the mean-flow coordinate system. The values of obtained for and are almost always confined within the range from 0.075 to 0.925 (Figs. 13a,g). The effect of rotating the coordinate system is most evident in the spanwise direction, as the values of obtained for are sometimes beyond the range from 0.075 to 0.925 (Fig. 13d). These results are consistent with the visual inspection of the differences between horizontal velocity components in the local NWU and mean-flow coordinate systems during the period from 0255 to 0455 UTC 4 April 2007 (comparing Figs. 6, 8). Specifically, the time series of and are similar, whereas the time series and are noticeably different from each other.

For subperiods with stationary mean velocities determined using the reverse arrangement test, the mean trends in velocity components are negligible. Then (see details in section 2c) measures the biases toward large- or small-scale periodic patterns. For the 21 subperiods investigated in Fig. 12, the probability distributions of are evaluated for 3-min averaged time series of the three velocity components in the mean-flow coordinate system. The values of obtained for are slightly biased toward the range smaller than 0.5 (Fig. 13b), while the values of obtained for are strongly biased toward the range smaller than 0.5 (Fig. 13e). These results suggest that large-scale turbulent motions contribute more to the spanwise velocity component than to the streamwise velocity component, consistent with visual inspection of Fig. 8. The values of obtained for are slightly biased toward the range larger than 0.5 (Fig. 13h), implying that small-scale turbulent motions contribute more to the vertical velocity component than to the horizontal velocity components. These results are consistent with the anisotropic nature of turbulent motions near Earth’s surface.

The contribution from large-scale motions also influences the estimates of the nonstationarity ratio. Using and , an individual time series of duration 30 min 78 min is divided into samples of time average over 1 min min (see details in section 2d). For the 21 subperiods investigated in Fig. 12, the probability distributions of NR are evaluated for time series of the three velocity components in the mean-flow coordinate system. The values of NR obtained for peak at about 1 and are always confined within 2 (Fig. 13i), consistent with the expectation of stationarity proposed by Mahrt (1998). However, the values of NR obtained for peak at about 1.8 (Fig. 13c), while the values of NR obtained for show a wide peak from 1.4 to 2.4 (Fig. 13f). A remarkable portion of time series of and yield values of NR greater than 2 [i.e., the stationarity criterion proposed by Mahrt (1998)], implying a limitation of using the nonstationarity ratio to evaluate the stationarity of horizontal velocity components in the ABL.

Now we evaluate the stationarity of canopy-top mean vertical turbulent momentum flux. The integral time scales of *w* and at canopy top are on the same order of magnitude (see Figs. 10c,d, 11c,d), implying that turbulent motions on various scales contribute similarly to the vertical velocity component and vertical turbulent momentum flux. The values of evaluated for show greater probability of encountering a mean trend than those evaluated for velocity components (comparing Fig. 13j with Figs. 13a,d,g), partially because of the mean shear stress lagging behind the change in mean wind. The greater probability of encountering a mean trend leads to values of obtained for biased toward the range smaller than 0.5 (Fig. 13k). Consistently, the values of NR evaluated for are greater than those evaluated for (comparing Fig. 13l with Fig. 13i). These results reveal the necessity to link stationary measures back to physical processes, or otherwise the various stationarity measures may seem too complicated to be understood, as suggested by previously published studies (e.g., Dias et al. 2004; Večenaj and de Wekker 2015). Taking the vertical turbulent momentum flux as an example, we see that the integral time scales are always much smaller than 2 min [i.e., the criterion used by Dias et al. (2004); see Figs. 10d, 11d], and the values of NR are mostly confined within 2 [i.e., the criterion proposed by Mahrt (1998); see Fig. 13l]. However, results from the reverse arrangement test sometimes show nonnegligible mean trends (Fig. 13j), and the effects of nonnegligible mean trends are evident in results from the run test (Fig. 13k).

## 6. Conclusions

A novel approach has been proposed to determine both the occurrence and duration of stationary periods in time series. This approach provides efficient analysis of long-term datasets and is capable of relating data sampled at multiple locations. The basic idea is that the probability distributions of the starting and ending points of stationary intervals provide information about (i) the nonstationary location at which a period should be split into two subperiods (Fig. 3) and (ii) the nonstationary samples within a subperiod that should be removed from analysis of time-averaged statistics (Fig. 4). Section 3a uses the empirical estimates of the bulk drag–wind relationship as an example, showing how to identify the statistics of interest (i.e., within-canopy mean velocities) and how to choose the most desirable statistical technique (i.e., the reverse arrangement test). Section 3b uses CHATS data as an example, showing how to choose the duration of individual time series (i.e., from 30 min to a few hours) and how to obtain approximately independent samples (i.e., 3-min averaged velocity components). Key findings arising from analysis of field data from the defoliated phase of CHATS include the following:

Rotating data into a coordinate system aligned with the canopy-top mean wind leads to a collapse of time series of the 3-min averaged spanwise velocity component at various heights within the canopy (Fig. 8).

The mean shear stress lagging behind the change in mean wind direction is shown to be a cause of misalignment between mean shear stress and mean wind at canopy top. This nonstationary situation is also demonstrated by the greater probability of encountering a mean trend for mean vertical turbulent momentum flux than velocity components, and should be removed from an analysis of the bulk drag–wind relationship.

During the defoliated phase of CHATS, stationary subperiods showing good alignment between mean shear stress and mean wind at canopy top usually show negligible variability associated with nonstationarity as well. These subperiods suggest an approximate power-law relationship between the mean vertical turbulent momentum flux and the mean wind at canopy top, with flexibility to choose the level of significance

*α*between 0.05 and 0.35 (Fig. 9). Choosing a level of significance yields the greatest span of mean velocity and mean vertical turbulent momentum flux at canopy top for reliable empirical estimates of the bulk drag–wind relationship.Integral time-scale estimates satisfy the assumed upper limit of 50 s obtained from wind tunnel measurements, LES results, and velocity spectra, except for the two most stable subperiods (Fig. 10). These two subperiods are removed from further analysis in section 5 because the occurrence of microscale internal gravity waves leads to values of an order of magnitude larger than the other subperiods.

The integral time-scale estimates are used to evaluate the representativeness of time-averaged statistics with respect to ensemble averages (Fig. 12). The uncertainties in canopy-top mean velocity associated with the finite durations of time series dominate the uncertainties in empirical estimates of the bulk drag–wind relationship.

An integrated view of stationarity measures provided by the reverse arrangement test, the run test, and the nonstationarity ratio is presented (Fig. 13). In the absence of mean trends, the run test reveals the contribution from large-scale turbulent motions. Anisotropic contributions of large-scale motions to velocity components are responsible for the sensitivity of the reverse arrangement test to the rotation of coordinate system. Evaluating the stationarity using the nonstationarity ratio is applicable to the vertical velocity component but inapplicable to horizontal velocity components because of considerable contribution from large-scale motions.

The procedures and strategies described in section 3b are applicable to any time series when the time scale required to obtain time-average statistics representative of ensemble averages is larger than the product between the required minimum number of samples and the time scale for samples to become approximately independent. Nevertheless, interpreting stationarity measures should always refer back to physical processes. For example, evaluating the stationarity of mean velocities and investigating the alignment between mean shear stress and mean wind in the atmospheric surface layer are meaningful only when mean shear contributes considerably to the production of TKE. Consequently, the majority of stationary periods detected in this manuscript are classified as forced convection, near-neutral, and transition to stable conditions (). For free convection () and stable () conditions, mean shear likely becomes unimportant, calling for alternative strategies to evaluate the stationarity of turbulence. If turbulence statistics of interest satisfy Galilean invariance or are positively definite, then one can use additional statistical techniques to measure stationarity (e.g., Foken and Wichura 1996; Vickers and Mahrt 1997; Affre et al. 2000; Nappo et al. 2010; Hiscox et al. 2015).

## Acknowledgments

Funding for this research was provided by the National Center for Atmospheric Research (NCAR) through support from the National Science Foundation under Cooperative Support Agreement AGS-0856145. Ying Pan was supported by NCAR's Advanced Study Program. We are grateful to Don Lenschow, Sean Burns, and two anonymous reviewers for their comments on earlier drafts of the manuscript. We would like to acknowledge the operational, technical, and scientific support provided by NCAR’s Earth Observing Laboratory (particularly, Steven Oncley, Gordon Maclean, and Tom Horst), the high-performance computing support from Yellowstone (ark:/85065/d7wd3xhc) provided by NCAR’s Computational and Information Systems Laboratory, and the technical and partial page charge support provided by NCAR's Mesoscale and Microscale Meteorology Laboratory.

### APPENDIX

## REFERENCES

*Random Data: Analysis and Measurement Procedures*. John Wiley & Sons, 566 pp.

*Statistical Theory and Methodology in Science and Engineering*. John Wiley & Sons, 590 pp.

*The Analysis of Time Series: An Introduction*. Chapman & Hall, 283 pp.

*Atmospheric Boundary Layer Flows: Their Structure and Measurement*. Oxford University Press, 289 pp.

*The Advanced Theory of Statistics*. Vol. 2. Macmillan Publishing Co. Inc., 748 pp.

*The Structure of Atmospheric Turbulence*. John Wiley & Sons, Inc., 231 pp.

*Correlation Theory of Stationary and Related Random Functions: Supplementary Notes and References.*Springer-Verlag, 526 pp.

## Footnotes

^{a}

Current affiliation: The Pennsylvania State University, University Park, Pennsylvania.

^{b}

The National Center for Atmospheric Research is sponsored by the National Science Foundation.

© 2017 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).