## 1. Introduction

Zonal-mean flow variability has received considerable attention during the past decade because of its implications for both medium-range weather forecasting and climate prediction. Zonal-mean flow anomalies are known to exhibit two major types of behavior: (i) *zonal index variability* (e.g., Robinson 1991; Yu and Hartmann 1993; Feldstein and Lee 1998; Lorenz and Hartmann 2001, 2003), which can be thought of in terms of irregular north–south shifts of the midlatitude jet combined with jet intensification or weakening (Monahan and Fyfe 2006); and (ii) *quasi-periodic poleward, zonal-mean anomaly propagation* (e.g., Riehl et al. 1950; James and Dodd 1996; Feldstein 1998; Kravtsov et al. 2006; Lee et al. 2007). The time scale for the zonal index is approximately 10 days (Feldstein 2000) and that for the poleward, zonal-mean anomaly propagation about 60 days (Feldstein 1998). The atmospheric flow anomalies associated with the first type of behavior are related to the so-called annular modes (Thompson and Wallace 1998, 2000). The annular mode of the Northern Hemisphere is often referred to as the Arctic Oscillation (AO) and may be related to the more geographically confined North Atlantic Oscillation (NAO; Deser 2000; Hurrell et al. 2003; Feldstein and Franzke 2006). An analogous feature in the Southern Hemisphere is called the Southern Annular Mode (SAM; Thompson and Wallace 2000). The poleward propagation (PP) of the zonal-mean zonal flow anomalies has also been observed in both hemispheres (Feldstein 1998; Lorenz and Hartmann 2001, 2003; Kravtsov et al. 2006).

One of the most interesting questions regarding the dynamics of the zonal index involves the role of transient eddy momentum fluxes. In particular, many authors have argued that the persistence of zonal index anomalies is due to the so-called synoptic eddy feedback (Robinson 1991, 2000; Feldstein and Lee 1998; Lorenz and Hartmann 2001, 2003), that is, the synoptic-scale, transient eddy momentum flux convergence responding to the presence of the zonal index anomalies so as to reinforce the zonal index anomalies.

This feedback is a *statistically linear* process if the eddy momentum flux convergence can be parameterized as a linear function of the zonal-mean flow anomalies. To quantify such effects, one can, for example, project various terms in the vertically and zonally averaged zonal-mean zonal momentum equation onto the spatial pattern associated with the leading mode of zonally averaged variability (Robinson 1991; Lorenz and Hartmann 2001, 2003). In a flat-bottom geometry, with *z* representing the projection of the anomalous vertically integrated zonal-mean zonal wind onto this spatial pattern, and for linear surface drag characterized by time scale *T*_{d}, the equation governing the time series *z*(*t*), *in the absence* of eddy feedbacks, would be

$$\frac{dz}{dt} = -\frac{z}{T_d} + \eta, \tag{1}$$

where *η* is a stochastic representation of the effect of transient eddy momentum flux convergence that is not influenced by the zonal flow anomaly. If the eddy feedback is active but works in a linear fashion, (1) will be modified as follows:

$$\frac{dz}{dt} \approx -\left(\frac{1}{T_d} + \frac{1}{T_L} - \frac{1}{T_H}\right) z + \eta \equiv -\frac{z}{T_{d,1}} + \eta, \tag{2}$$

where *T*_{L} and *T*_{H} are the time scales associated with *z* projections of low- and high-pass filtered eddy momentum flux convergence anomalies, respectively. The approximate-equality sign in (2) indicates that we have neglected interactions between high-frequency and low-frequency eddies in computing *T*_{d,1}. The negative sign in front of *T*_{H} reflects the fact that synoptic eddies tend to reinforce the zonal-mean anomaly. In the case of *T*_{d,1} > *T*_{d}, the eddy processes result in an increased persistence of the zonal-flow anomalies.
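The effect of a linear feedback on persistence can be illustrated numerically. The following is a minimal sketch, assuming that the effective damping time scale combines as 1/*T*_{d,1} = 1/*T*_{d} + 1/*T*_{L} − 1/*T*_{H} (our reading of the description above) and that *η* is unit-variance white noise sampled daily. The function names and the three time-scale values are hypothetical and chosen only for illustration; they are not taken from the simulations.

```python
import numpy as np

def effective_timescale(T_d, T_L, T_H):
    """Combine time scales, assuming 1/T_d1 = 1/T_d + 1/T_L - 1/T_H."""
    return 1.0 / (1.0 / T_d + 1.0 / T_L - 1.0 / T_H)

def simulate_z(T_damp, n_days=20000, seed=0):
    """Daily Euler steps of dz/dt = -z/T_damp + eta, with eta as white noise."""
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n_days)
    z = np.zeros(n_days)
    for t in range(1, n_days):
        z[t] = z[t - 1] * (1.0 - 1.0 / T_damp) + eta[t]
    return z

def lag1_autocorrelation(z):
    """Lag-1 autocorrelation, used here as a simple persistence measure."""
    a = z - z.mean()
    return float(np.dot(a[:-1], a[1:]) / np.dot(a, a))

# Hypothetical time scales in days (illustrative only).
T_d, T_L, T_H = 6.0, 30.0, 10.0
T_d1 = effective_timescale(T_d, T_L, T_H)

r_drag_only = lag1_autocorrelation(simulate_z(T_d))
r_with_feedback = lag1_autocorrelation(simulate_z(T_d1))
print(T_d1, r_drag_only, r_with_feedback)
```

With these illustrative values the effective time scale is 10 days, longer than the 6-day drag time scale, so the series with feedback is the more persistent of the two.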

The above linearity implies that *z* anomalies of all magnitudes and signs are damped at the same rate. On the other hand, if *T*_{d,1} decreases over some range of *z* values, then we refer to the feedback as being *statistically nonlinear* over this range. In this study, we will use the term *regime* to refer to flow patterns associated with these values of *z* (for a more general context, see Reinhold and Pierrehumbert 1982, 1985; Plaut and Vautard 1994; Koo et al. 2003). One of the main objectives of the present study is to develop and apply statistical methodology for the identification of regimes.

It is important to keep in mind that this study identifies regimes as being statistically nonlinear, in the sense that the variability cannot be described by a linear model with constant coefficients and additive Gaussian noise (i.e., noise that is independent of the state of the system). From this perspective, the occurrence of regimes points to the occurrence of deterministic nonlinearities. However, there is also the possibility that the regimes arise from multiplicative noise. In this case, the variability can be represented by a model that is linear in the resolved variables, but with noise that depends on the state of the system (Sura et al. 2005). As shown in Sura et al. (2005), deterministic nonlinear models and multiplicative noise models can yield very similar statistics. An examination of regimes in the context of multiplicative noise is beyond the scope of the present study. Such an analysis, which requires much longer simulations than those presented in this manuscript, is planned for future research.

The dynamics of the poleward zonal-mean flow anomaly propagation was also shown to involve transient eddies (James and Dodd 1996; Feldstein 1998). Lee et al. (2007) demonstrated that the poleward propagation is driven by Rossby wave breaking at a critical latitude, that is, the latitude at which the phase speed of the wave is equal to the local zonal-mean zonal wind speed. In a manner analogous to the wave–mean flow interaction associated with the stratospheric quasi-biennial oscillation, the wave breaking modifies the zonal-mean flow so as to shift the critical latitude in a poleward direction. As a result, subsequent wave breaking occurs at a higher latitude. In this manner, the wave-breaking and critical latitudes are systematically shifted poleward in a leap-frog fashion, resulting in the quasiperiodic character of the poleward propagation. As such, nonlinear wave dynamics is a crucial part of the process, but the quasiperiodic behavior suggests that most of the poleward zonal-mean flow anomaly propagation can be described with a linear stochastic model.

Son and Lee (2006) analyzed a series of GCM simulations. They found that zonal index behavior dominates when the climatological, zonal-mean, potential vorticity (PV) meridional gradient is sharp and strong, whereas poleward zonal-mean flow anomaly propagation is most apparent when the climatological flow is characterized by a zonal-mean, meridional potential vorticity gradient that is weak and broad. The present paper aims at establishing, via statistical analyses of the series of idealized climate model simulations of Son and Lee (2006), the degree to which nonlinear feedbacks account for the impact of transient eddies on both the zonal-index and poleward zonal-mean flow anomaly propagation. The statistical diagnoses performed here provide a framework for future dynamical analyses of the phenomena captured in these simulations.

## 2. Paper outline and methodology

### a. GCM simulations

We analyze the seven model simulations performed by Son et al. (2008). These simulations correspond to longer time integrations of selected model runs from Son and Lee (2006). The model consists of the dynamical core of the Geophysical Fluid Dynamics Laboratory (GFDL) GCM with an R30 horizontal resolution and 10 equally spaced vertical sigma levels. The model’s forcing was represented by a 30-day relaxation to various equilibrium temperature profiles. In this study, we analyze seven control flat-bottom simulations, CS-1–CS-7, which differ in the strength of the two model parameters that represent tropical heating and the width of the midlatitude baroclinic zone. As discussed in Son and Lee (2005, 2006), the CS-1 and CS-2 control runs are characterized by a climatological, zonal-mean, meridional potential vorticity gradient that is strong and sharp, and therefore zonal index variability dominates; the CS-6 and CS-7 runs exhibit broad and weak PV gradients, yielding poleward zonal-mean anomaly propagation; and the CS-3 through CS-5 runs show intermediate characteristics. The basic state’s PV gradients are modified in the runs with mountains (Son et al. 2008); we will analyze some of these simulations as well, but our focus will be on the results with the flat-bottom model configuration.

Each model simulation is 14 000 days long and is archived daily. Because the forcing in each flat-bottom case is symmetric with respect to the equator, the output from the Northern Hemisphere (NH) and Southern Hemisphere (SH) of each model run corresponds to different realizations associated with the same dynamical processes. We analyzed the NH and SH datasets separately and confirmed that the results reported below are robust (i.e., are reproduced in both the NH and SH analyses). In the following presentation, we will only show the NH results.

### b. Linear stochastic model

In section 3, the output from the climate model runs is examined in the phase space of the 10 leading empirical orthogonal functions (EOFs) of the vertically integrated zonal-mean zonal wind. Linear stochastic models are constructed from the 10-dimensional (10-D) phase space using the methodology of Kravtsov et al. (2005, 2006). After constructing the stochastic model, we employ this model to produce 100 10-D surrogate linear realizations for each of the seven model simulations. We repeat all of the statistical analyses of the idealized model runs [probability density function (PDF) estimation, spectral analysis, etc.] with the surrogate data from the stochastic model runs. For each quantity *f* that was obtained from the analysis of the idealized model simulations (e.g., the value of the probability density in a certain bin of the phase space), there are 100 corresponding surrogate quantities *f*_{k} (*k* = 1, …, 100). We will then sort the values of *f*_{k} in ascending order and specify the 95% confidence interval as corresponding to the range between the 3rd and 97th of the sorted surrogate values. If *f* lies outside of this confidence interval, we conclude that this aspect of the model behavior is different from that of its prototype linear stochastic model.
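The ranking step can be sketched as follows. This is a minimal illustration, assuming that the interval bounds are simply the 3rd and 97th entries of the sorted surrogate values (1-based ranks); the function name and the test values are hypothetical.

```python
import numpy as np

def outside_surrogate_interval(f_model, f_surrogates, lo_rank=3, hi_rank=97):
    """True if f_model lies outside the range spanned by the lo_rank-th and
    hi_rank-th sorted surrogate values (ranks are 1-based)."""
    s = np.sort(np.asarray(f_surrogates, dtype=float))
    lower, upper = s[lo_rank - 1], s[hi_rank - 1]
    return bool(f_model < lower or f_model > upper)

# Hypothetical surrogate values 1, 2, ..., 100, so the interval is [3, 97].
f_k = np.arange(1, 101)
print(outside_surrogate_interval(50.0, f_k))  # inside the interval
print(outside_surrogate_interval(98.0, f_k))  # above the 97th sorted value
```

Values that coincide with one of the two bounds are treated here as lying inside the interval; that convention is our choice.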

### c. PDF estimation and identification of persistent regimes

In sections 3 and 4, the PDF estimation is performed in the subspace of EOF 1 and EOFs 1 and 2, respectively. These EOFs typically correspond to a mixture of north/south shifts and intensification/weakening of the zonal-mean zonal jet (Monahan and Fyfe 2006). We employ a kernel-density estimator with the Epanechnikov kernel and a smoothing parameter of 0.7 to compute the PDFs.
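For reference, a one-dimensional kernel-density estimate with the Epanechnikov kernel can be sketched as below. This is a minimal sketch, assuming that the smoothing parameter of 0.7 acts as the kernel half-width in units of a normalized PC; the function name and the synthetic Gaussian sample are hypothetical.

```python
import numpy as np

def epanechnikov_kde(samples, grid, h=0.7):
    """1-D kernel-density estimate with K(u) = 0.75 * (1 - u**2) for |u| <= 1."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - samples[None, :]) / h        # (n_grid, n_samples)
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return k.sum(axis=1) / (samples.size * h)

# Synthetic stand-in for a normalized PC time series.
rng = np.random.default_rng(0)
pc = rng.standard_normal(5000)
grid = np.linspace(-6.0, 6.0, 241)
pdf = epanechnikov_kde(pc, grid)
area = pdf.sum() * (grid[1] - grid[0])   # should be close to 1
print(area)
```

Because the kernel has compact support, each sample contributes only to grid points within a distance *h*, which keeps the estimate local.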

We will use the PDF estimation to identify *regimes* within the EOF-1–EOF-2 phase space as the regions of increased probability of persistence [see, e.g., Vautard et al. (1988); also, note that this definition of regimes is entirely consistent with that in section 1]. To do so, we will first compute PDFs of both unfiltered and low-pass-filtered principal component time series associated with EOFs 1 and 2. We will then compute, for all points within the EOF-1–EOF-2 phase space, the ratio *R* of the low-pass-filtered PDF values to the unfiltered PDF values. Those regions in the phase space where *R* exhibits local maxima exceeding the 95% confidence level, based on *R* values computed from the corresponding linear model surrogates, will be marked as belonging to one of the few atmospheric *regimes*.
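The construction of *R* can be sketched on synthetic data. This is a minimal illustration, assuming a radially symmetric two-dimensional Epanechnikov kernel and a 10-day boxcar running mean as the low-pass filter (the filter adopted in section 4a); the red-noise series, the grid, and all function names are hypothetical. The significance ranking against linear surrogates is not included.

```python
import numpy as np

def red_noise(n, rho, rng):
    """Unit-variance AR(1) series used as a stand-in for a daily PC."""
    x = np.zeros(n)
    e = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + e[t]
    return x / x.std()

def boxcar(x, width=10):
    """Running mean over `width` consecutive days."""
    return np.convolve(x, np.ones(width) / width, mode="valid")

def kde2d(pc1, pc2, g1, g2, h=0.7):
    """2-D estimate with the radial kernel K(r) = (2/pi) * (1 - r**2), r <= 1."""
    pts = np.column_stack([pc1, pc2])
    G1, G2 = np.meshgrid(g1, g2, indexing="ij")
    grid = np.column_stack([G1.ravel(), G2.ravel()])
    r2 = (((grid[:, None, :] - pts[None, :, :]) / h) ** 2).sum(axis=2)
    k = np.where(r2 <= 1.0, (2.0 / np.pi) * (1.0 - r2), 0.0)
    return (k.sum(axis=1) / (len(pts) * h**2)).reshape(G1.shape)

rng = np.random.default_rng(0)
pc1, pc2 = red_noise(6000, 0.8, rng), red_noise(6000, 0.8, rng)
g = np.linspace(-3.0, 3.0, 13)            # g[6] is the phase-space origin

pdf_raw = kde2d(pc1, pc2, g, g)
pdf_low = kde2d(boxcar(pc1), boxcar(pc2), g, g)

# Ratio of low-pass-filtered to unfiltered PDFs, masked where data are sparse.
R = np.full(pdf_raw.shape, np.nan)
ok = pdf_raw > 1e-3
R[ok] = pdf_low[ok] / pdf_raw[ok]
print(R[6, 6])
```

For this purely linear synthetic process, *R* exceeds 1 near the origin only because filtering narrows the distribution, which is why the ranking against surrogates is needed before a maximum of *R* is interpreted.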

The presence of a local *R* maximum is a necessary but not sufficient condition for the identification of a regime characterized by persistent anomalies. Because the PDF of “persistent” anomalies is based on low-pass-filtered data, its standard deviation is necessarily reduced compared to that of the raw data, so the region near the phase-space origin will be characterized by *R* > 1; in the same way, the outskirts of the distributions will exhibit *R* < 1. The local *R* maximum near the origin is purely due to geometric considerations and has nothing to do with increased persistence of anomalies at that location. Although ranking *R* values relative to the predictions of linear stochastic surrogates eliminates the geometric bias in the regime identification, we perform a number of additional tests to ensure that the regimes identified are both (i) climatologically significant (i.e., the system spends a substantial fraction of time in a regime) and (ii) characterized by population density and *persistence* that exceed the 95th percentile suggested by linear stochastic models. One of the persistence characteristics that we will use is based on the average regime duration, that is, the average amount of time that a flow trajectory spends in the regime during a single regime episode. We will also use a somewhat more robust persistence measure based on the *e*-folding time scale of the regime frequency–duration diagram (see section 4c). *The regimes are thus defined as the regions of the GCM’s phase space and their corresponding spatiotemporal anomalies that are more densely populated and are characterized by enhanced persistence relative to predictions of the linear stochastic model.*

### d. Identification of oscillatory variability

In section 5, we use multichannel singular spectrum analysis [M-SSA; see Ghil et al. (2002) and references therein] to describe low-frequency oscillatory variability in the full and reduced (linear stochastic) models. We will apply M-SSA to the 5-day nonoverlapping means of the leading two daily principal component (PC) time series obtained by EOF analysis of the 14 000-day-long raw data, which will result in a 2800 × 2 input data matrix. We will refer to each of the two “spatial” dimensions of the input data matrix as “channels.”

M-SSA performs eigenanalysis of the covariance matrix of an extended data matrix that is based on the input data matrix and its *M* time-lagged copies (we used a window size of *M* = 200, which along with the 5-day sampling allowed us to resolve time scales of less than 1000 days). The leading “oscillatory” mode is defined as the pair of dominant eigenmodes; however, it will only correspond to a genuine oscillation if the corresponding eigenvectors are in quadrature and have a similar period. Upon identification of the dominant eigenmode pair, one can obtain the time series of the associated reconstructed components (RCs). RCs can be viewed as narrowband-filtered versions of PCs, in which the filters are derived objectively from M-SSA information to best isolate the time evolution of the oscillatory signal. We also obtain daily versions of the RCs by applying a cubic spline to their original 5-day-sampled versions.
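The first steps of this procedure (5-day averaging, construction of the lag-extended matrix, and eigenanalysis of its covariance) can be sketched as follows. This is a minimal illustration on a synthetic two-channel oscillation, not a full M-SSA implementation: the reconstruction of RCs and any significance testing are omitted, a window of *M* = 36 is used instead of 200 to keep the example small, and all names are hypothetical.

```python
import numpy as np

def five_day_means(daily, block=5):
    """Nonoverlapping block means of a (n_days, n_channels) array."""
    n = (daily.shape[0] // block) * block
    return daily[:n].reshape(-1, block, daily.shape[1]).mean(axis=1)

def mssa_eigen(X, M):
    """Eigenvalues and eigenvectors of the lag-covariance matrix built from
    M lagged copies of each channel of X (shape: n_times x n_channels)."""
    X = X - X.mean(axis=0)
    n_times, n_channels = X.shape
    n_rows = n_times - M + 1
    # Extended matrix: for each channel, M columns shifted by 0..M-1 samples.
    Y = np.hstack([
        np.column_stack([X[j:j + n_rows, ch] for j in range(M)])
        for ch in range(n_channels)
    ])
    C = Y.T @ Y / n_rows
    evals, evecs = np.linalg.eigh(C)
    order = np.argsort(evals)[::-1]          # sort by decreasing variance
    return evals[order], evecs[:, order]

# Synthetic daily "PCs": a 60-day oscillation in quadrature plus white noise.
rng = np.random.default_rng(0)
t = np.arange(14000)
daily = np.column_stack([np.sin(2 * np.pi * t / 60), np.cos(2 * np.pi * t / 60)])
daily = daily + 0.5 * rng.standard_normal(daily.shape)

X = five_day_means(daily)                    # 2800 x 2 input matrix
evals, evecs = mssa_eigen(X, M=36)
print(X.shape, evals[:3] / evals.sum())
```

In this synthetic case the two leading eigenvalues are nearly equal and well separated from the rest, which is the signature of an oscillatory pair.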

We next define eight phase categories of oscillatory variability of the leading mode by constructing a phase time series *φ*(*t*) = −arctan(ΔRC_{1}/RC_{1}), − *π* < *φ* < *π* (see, e.g., Plaut and Vautard 1994). Here, RC_{1} and ΔRC_{1} are the normalized channel-1 reconstructed component (the time series of the leading mode) and its normalized tendency, respectively. The days belonging to certain phase ranges of the oscillatory variability can be defined in two ways. For the first method, the 14 000 values of *φ*(*t*) can be sorted in ascending order. The days corresponding to the smallest 1/8 × 14 000 = 1750 values of sorted *φ*(*t*) are ascribed to phase category 1, the days belonging to the next 1750 values to phase category 2, and so on up to phase category 8 (note that the 1750 days belonging to any given composite phase do not correspond to consecutive days in the original unsorted time series). In this classification, the population of each phase category is the same. We will use this classification for constructing composite patterns of oscillatory variability in physical space (section 5) and for investigating the relationship between oscillations and regimes (sections 2e and 6). Because the composite “time” (the eight-valued vector obtained by compositing the original 14 000-valued linearly increasing time vector) in this method is linearly increasing, the propagation speed is not distorted.
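The equal-population classification can be sketched as follows. This is a minimal illustration, assuming that the arctangent is evaluated in its four-quadrant form so that the phase covers the full range from −*π* to *π*, and that the tendency is estimated by centered differences; the function names and the synthetic 70-day oscillation are hypothetical.

```python
import numpy as np

def phase_series(rc):
    """Phase from a normalized series and its normalized tendency, using the
    four-quadrant arctangent (an assumption about how arctan is evaluated)."""
    rc_n = (rc - rc.mean()) / rc.std()
    tend = np.gradient(rc_n)
    tend_n = tend / tend.std()
    return -np.arctan2(tend_n, rc_n)

def equal_population_categories(phi, n_cat=8):
    """Assign each day to one of n_cat categories of equal size by rank."""
    ranks = np.argsort(np.argsort(phi, kind="stable"), kind="stable")
    return ranks * n_cat // phi.size + 1

# Synthetic stand-in for the daily channel-1 reconstructed component.
t = np.arange(14000)
rc1 = np.cos(2 * np.pi * t / 70)

phi = phase_series(rc1)
category = equal_population_categories(phi)
counts = np.bincount(category, minlength=9)[1:]
print(counts)
```

For a harmonic signal, the minus sign in the definition makes the phase increase with time, so the categories are traversed in ascending order.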

A second definition of the phase category is to designate those days on which −*π* < *φ* < −3*π*/4 as phase 1, those days on which −3*π*/4 < *φ* < −*π*/2 as phase 2, and so on. For a purely linear harmonic oscillation, the duration of each phase category will be the same (equal to one eighth of the oscillation’s period). However, for an arbitrary time series generated by nonlinear processes, the number of days in each phase category may be unequal. We will thus use this definition to identify the deviations from linearity in the models’ low-frequency variability (LFV) in section 5.
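The fixed-width classification amounts to binning the phase into eight intervals of width *π*/4. A minimal sketch, with a hypothetical function name and hand-picked phase values:

```python
import numpy as np

def fixed_width_categories(phi, n_cat=8):
    """Category 1 covers [-pi, -3*pi/4), category 2 the next pi/4, and so on."""
    edges = np.linspace(-np.pi, np.pi, n_cat + 1)
    # Only interior edges are passed, so values at -pi or pi fall in 1 or n_cat.
    return np.digitize(phi, edges[1:-1]) + 1

# Hand-picked phases (radians) for illustration.
phi_demo = np.array([-3.0, -2.0, 0.1, 3.0])
cats = fixed_width_categories(phi_demo)
fractions = np.bincount(cats, minlength=9)[1:] / phi_demo.size
print(cats, fractions)
```

Comparing the occupancy fraction of each category with 1/8 then gives a simple measure of how far the oscillation departs from a harmonic one.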

### e. Regimes and oscillations

In section 6, we will explore connections between oscillatory variability and regime behavior by examining the simultaneous occurrences of regimes and certain phase categories of the dominant oscillatory signal. Once again, any possible association between the regimes and oscillations will be ranked against the results from the linear stochastic models (section 3). Note that the latter models have no regime behavior by definition, but one can still use the same phase space regions as in the idealized model simulations to mark the surrogate “regimes.”

We summarize and interpret our results in section 7.

## 3. Performance of linear stochastic models

In this section, we compare the one-dimensional PDFs and autocorrelation functions of the idealized climate model and corresponding linear stochastic model runs (section 2b). To demonstrate the ability of the linear stochastic models to capture the essential features of the climate model, we show the results for two cases, CS-2 and CS-6, which, as we will see below, reflect two very different dynamical regimes. These two cases were also analyzed in detail by Son et al. (2008). The one-dimensional PDFs and autocorrelation functions of the nine leading zonal-mean zonal flow PCs of the CS-2 run are shown in Figs. 1 and 2; the corresponding quantities for the CS-6 run are illustrated in Figs. 3 and 4. In these figures, the solid lines indicate the idealized model results, while the dashed lines mark the 95% spread associated with the corresponding linear stochastic models. As can be seen, the linear stochastic models capture very well the variability of the PCs in both the CS-2 and CS-6 runs. (For display purposes, we do not show the PDFs for the tenth PC. As with the other PCs, for PC-10 the PDFs and autocorrelation functions of the linear stochastic model closely resemble those of the idealized model.) Furthermore, because the linear models are driven by Gaussian noise, their PDFs must also be Gaussian. In the climate model, the largest deviations from Gaussianity occur in the PDF of PC-2; somewhat smaller deviations occur in the PDFs of PC-1, PC-3, PC-4, and PC-5.

The 95% spread of the linear model simulation tightly bounds the autocorrelation functions of the leading PCs from the CS-2 simulation (Fig. 2); the only feature of the climate model simulation not captured by the linear model is the weakly oscillatory character of PC-3. All other PCs are decaying monotonically in both the climate and the linear stochastic models. In contrast, the variability of CS-6 is dominated by a very pronounced low-frequency oscillation, which is also remarkably well captured by the linear model, although it is slightly more energetic in the latter (Fig. 4).

In summary, linear stochastic models provide an excellent statistical description of the variability in the CS-2 and CS-6 runs. A similar degree of resemblance is found between the results from the other five climate model runs and those of the corresponding linear model simulations (not shown). Note that the success of the linear stochastic models in reproducing the climate model results does not necessarily mean that the underlying dynamics is linear (see introductory comments of section 1). Rather, it shows that nonlinear processes instrumental in shaping and maintaining certain modes of LFV in the climate model (namely, in our case, the modes of zonal-mean zonal flow variability) may be parameterized linearly or represented, on a long time scale, as a fast random forcing, which energizes the slower dynamical modes under consideration. In the following presentation, we will exploit the ability of linear stochastic models to closely approximate the climate model’s zonal-flow variability to help identify nonlinear features of the climate model simulations by tracking the differences between the full model statistics and those of the linear stochastic models. It is important to note that in contrast to those nonlinear features that can be linearly parameterized, as discussed above, this comparison between the statistics of the climate model and the linear stochastic model will be able to identify nonlinear features that cannot be linearly parameterized.

## 4. Persistent regimes

### a. Persistence probability prescreening

To identify regions in the climate model’s phase space in which the probability of persistence is highest, we use both raw and low-pass-filtered anomalies (see section 2c). The conceptual justification for our method comes from the following thought experiment. Let us decompose each of the model trajectories into its “low-frequency” main path, presumably reflecting the system’s slow dynamics, and the fast irregular deviations from this path. Next, let us examine all regions in the EOF-1–EOF-2 phase space and consider all trajectories passing through each of these regions. The unfiltered version of the line representing a path will in fact have a finite “width” due to fast random deviations from its dominant evolution. When this width exceeds the size of the phase-space region under consideration, the unfiltered anomalies will be detected outside of this region, whereas the actual low-frequency path is, of course, still inside. Similarly, some of the unfiltered anomalies that are detected within the control region will have the filtered main paths elsewhere.

As described in section 2c, we identify the potential regime regions as those in which the ratio *R* of PDF values based on low-pass-filtered and unfiltered data exhibits local maxima that exceed the 95th percentile predicted by the corresponding linear stochastic models (see also Vautard et al. 1988). The filter size should be sufficiently large to remove fast transients but small enough not to distort the main paths. We determined the optimal filter size by trial and error and settled on a 10-day boxcar running average to define the low-pass-filtered anomalies.

For each of the CS-1–CS-7 control runs, we show in Fig. 5 the ratio *R* for the two-dimensional (in the EOF-1–EOF-2 plane) PDFs. We identify a few (up to four) separate regions characterized by local *R* maxima that exceed the 95th percentile associated with linear surrogates. We approximate the boundaries of the regimes with elliptical curves and will subsequently identify the time intervals spent in each of the regimes. The regime regions in Fig. 5 also correspond to statistically significant (relative, once again, to the null hypothesis based on our linear stochastic model) ridges or shoulders of both raw and low-pass-filtered PDFs (not shown, but see relevant analyses below). Similar results are obtained using 5-day and 20-day low-pass filtering, albeit with lower PDF ratios for the former case and indications of oversmoothing for the latter.

We labeled the potential regimes as follows: Regime 1 (regime 4) is characterized by anomalies having a moderate (large) positive EOF-1 projection; regimes 2 and 3 both have a negative EOF-1 projection, but the former is located closer to the origin than the latter. It turns out that regimes 3 and 4 are visited very rarely (consistent with their location away from the phase-space origin) and are thus not climatologically important. We will therefore only consider potential regimes 1 and 2 in the subsequent analyses. *These analyses are aimed at establishing if the regime behavior is indeed associated with dynamical processes that cannot be parameterized linearly* [see the examples in section 1, near the discussion of Eqs. (1) and (2)].

The composite vertically averaged, zonal-mean, zonal-wind patterns associated with potential regimes 1 and 2 are shown in Fig. 6. We multiplied the regime anomalies by a factor of 3, and the climatological winds are shown for reference. In CS-1–CS-3, where the eddy-driven jet dominates the climatology, regime 1 (regime 2) corresponds to a poleward (equatorward) shift of this jet. These anomalies thus correspond to either phase of the zonal index. In simulations CS-4 and CS-5, on the other hand, whereas regime 1 still describes the poleward shift of the jet, regime 2 indicates a strengthening of the eddy-driven jet. Finally, regime 1 in CS-6 corresponds to a splitting of the subtropical and eddy-driven jets and regime 2 to the slight merging of the two jets, whereas in CS-7, regime 1 (regime 2) amounts to a poleward (equatorward) movement of the two jets.

### b. Are the regimes more densely populated than predicted by linear stochastic models?

We next count the number of days (out of a total of 14 000 days for each simulation) spent in each of the two main regimes and rank them relative to that in the corresponding region in the EOF-1–EOF-2 plane of the linear stochastic model. Solid lines and crosses in Fig. 7 show the regime occupation estimates based on the 10-day boxcar-filtered PCs 1 and 2 of the climate and linear models for each of the CS-1–CS-7 runs (the simulation number is shown on the abscissa of each panel), and dashed lines indicate the spread of this quantity in the corresponding linear stochastic model surrogates. We see that both potential regimes 1 and 2 exhibit a significantly higher-than-linear frequency of occurrence (equal to or exceeding the 97th percentile) in all simulations, except for regime 2 in CS-7. Similar (although not quantitatively identical) results were obtained with the regime days identified using raw PCs (not shown): the main and only difference is that for each regime, and for each simulation, the occupation by the raw anomalies is somewhat less dense.
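Counting regime days reduces to a point-in-ellipse test in the EOF-1–EOF-2 plane, followed by ranking the count against the surrogate counts. The following is a minimal sketch; the ellipse parameters, the sample points, and the surrogate counts are hypothetical and do not correspond to the regime boundaries in Fig. 5.

```python
import numpy as np

def in_ellipse(pc1, pc2, center, semi_axes, angle=0.0):
    """Boolean mask of points inside an ellipse whose first semi-axis is
    rotated counterclockwise by `angle` (radians) from the PC-1 axis."""
    dx = np.asarray(pc1, dtype=float) - center[0]
    dy = np.asarray(pc2, dtype=float) - center[1]
    c, s = np.cos(angle), np.sin(angle)
    u = (c * dx + s * dy) / semi_axes[0]
    v = (-s * dx + c * dy) / semi_axes[1]
    return u**2 + v**2 <= 1.0

def percentile_rank(model_count, surrogate_counts):
    """Percentage of surrogate counts that fall below the model count."""
    return 100.0 * np.mean(np.asarray(surrogate_counts) < model_count)

# Hypothetical daily points and a hypothetical regime ellipse.
pc1 = np.array([0.0, 1.0, 0.5, 2.0])
pc2 = np.array([0.0, 0.0, 0.5, 2.0])
mask = in_ellipse(pc1, pc2, center=(0.0, 0.0), semi_axes=(1.0, 0.5))
regime_days = int(mask.sum())

# Hypothetical surrogate counts 0, 1, ..., 99.
rank = percentile_rank(97, np.arange(100))
print(regime_days, rank)
```

The boolean mask returned by `in_ellipse` can be reused to extract regime episodes and their durations.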

We have thus established that the two dominant potential regimes found in section 4a exhibit an abnormal population relative to linear-model predictions in all simulations but CS-7, in which regime 2 cannot be identified as special. Note also that the following potential regimes have relatively poor populations of under 1000 days: regime 1 in CS-1 and CS-4 and regime 2 in CS-4 and CS-6 (we have already excluded potential regime 2 in CS-7 from consideration).

### c. Are the regimes more persistent than predicted by linear stochastic models?

We next perform tests to isolate those potential regimes that are indeed abnormally persistent relative to linear stochastic model predictions. The simplest measure of regime persistence is the average duration of the regime episodes. Another, more general method to quantify persistence is to construct frequency–duration diagrams by counting the number of regime episodes of a given duration. This dependence is typically exponential, so that the *e*-folding time scale computed from the slope of the frequency–duration diagram plotted in log-linear coordinates provides an estimate of the regime’s persistence, which can be more robust than the average regime duration measure (see below). This approach is particularly useful for those regimes with relatively poor populations, as described in the previous subsection.

The frequency–duration diagrams (see Fig. 8) are computed by counting the number of regime episodes *exceeding* a given duration. We show the results for potential regime 1 in simulations CS-1 and CS-4 (Figs. 8a,b) and for potential regime 2 in simulations CS-2 and CS-4 (Figs. 8c,d); the dashed lines represent the spread in the frequency–duration diagrams based on linear stochastic model simulations (the four panels in Fig. 8 represent examples of potential regimes with relatively small populations). If the GCM-based frequency–duration diagrams are outside of the spread predicted by the linear models, the log-linear slopes (or, equivalently, the *e*-folding time scales) generated by the linear models must all be uniformly lower or higher than the GCM-generated slopes; otherwise, the GCM-generated slopes are statistically indistinguishable from the “linear” slopes.
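The persistence diagnostics can be sketched as follows, given a daily boolean series that marks regime days. This is a minimal illustration, assuming that "exceeding" means strictly longer than the given duration and that the *e*-folding time is the negative reciprocal of the least squares slope of the log counts; the function names and test inputs are hypothetical.

```python
import numpy as np

def episode_durations(in_regime):
    """Lengths (in days) of consecutive runs of True in a daily boolean series."""
    flags = np.asarray(in_regime, dtype=bool).astype(int)
    change = np.diff(np.concatenate([[0], flags, [0]]))
    starts = np.flatnonzero(change == 1)
    ends = np.flatnonzero(change == -1)
    return ends - starts

def frequency_duration(durations, max_days=15):
    """Number of episodes lasting longer than d days, for d = 1..max_days."""
    days = np.arange(1, max_days + 1)
    counts = np.array([(durations > d).sum() for d in days])
    return days, counts

def efolding_time(days, counts):
    """e-folding time from the least squares slope of log(counts) versus days."""
    ok = counts > 0
    slope, _ = np.polyfit(days[ok], np.log(counts[ok]), 1)
    return -1.0 / slope

durations = episode_durations([0, 1, 1, 0, 1, 0, 1, 1, 1])
mean_duration = durations.mean()

# Idealized exponential diagram with a 4-day e-folding time.
days = np.arange(1, 16)
tau = efolding_time(days, 1000.0 * np.exp(-days / 4.0))
print(durations, mean_duration, tau)
```

Applying the same functions to each surrogate realization yields the spread against which the GCM-based values are ranked.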

Note first that the GCM-based frequency–duration dependencies are well within the spread predicted by the linear models for CS-1, potential regime 1 (Fig. 8a) and CS-2, potential regime 2 (Fig. 8c). These potential regimes are therefore *not* abnormally persistent relative to linear stochastic models, and we exclude them from further consideration. In contrast, the slopes of the frequency–duration plots for the two potential regimes in CS-4 are significantly lower than the corresponding linear slopes (see Figs. 8b,d) in the 1–10-day portion of the frequency–duration diagram. Because both CS-4 regimes occupy a relatively small area in the EOF-1–EOF-2 plane (Fig. 5), and thus the corresponding number of regime days is also small (Fig. 7), the statistics of long-lived anomalies (with durations exceeding 15 days) for these regimes are unreliable. For this reason, the slope of the log-linear frequency–duration diagram for relatively short-duration anomalies represents a more robust measure of persistence than any measures that consider the anomalies of all durations (including the average regime duration measure).

Figure 9 summarizes the persistence characteristics of the potential regimes in all GCM runs based on both (top) average duration and (bottom) *e*-folding time scale; the latter time scale was computed from the least squares slope of the frequency–duration diagram (in log-linear coordinates, as in Fig. 8) for durations from 1 to 15 days.^{1} The runs in which the average duration of the potential regimes exceeds the 95th percentile of the linear surrogates are CS-2, -3, -5, -6, and -7 for potential regime 1, and CS-1, -3, -4, and -5 for potential regime 2. These results are generally consistent with those based on *e*-folding scales, except that the latter measure additionally identifies regime 1 in CS-4 as significantly persistent (consistent with Fig. 8b and our above discussion), whereas the *e*-folding scale for regime 2 in CS-4 falls slightly short of the 95% value. This happens because the 15-day duration used in computing the slope includes a nonlinear drop in the frequency–duration diagram between 12 and 15 days in Fig. 8d, thus resulting in a higher least squares slope than that based on shorter-duration events. The latter slope is significantly lower than the one predicted by linear stochastic surrogates. We thus conclude that both potential regimes in CS-4 are characterized by significantly more persistent anomalies than the analogous regions in the linear stochastic models. Finally, regime 1 of CS-6 is dominated by anomalies with very long duration, which can be deduced from its significantly high average duration, but the linear slope in the frequency–duration diagram is not lower than that of the linear surrogate model at the 97% significance level (Fig. 9, top left and bottom left). We classify it as belonging to the persistent group based on the fact that its average duration exceeds the 99% significance level (not shown).

The above regime-duration diagnostics were computed for the regime episodes based on the 10-day boxcar-filtered data. The same diagnostics applied to the raw data fail to indicate any changes in regime persistence between the climate model data and the linear model surrogates: the regime duration is always within the bounds predicted by the linear models (not shown). This argues that nonlinear feedbacks responsible for the origin of the flow regimes work on a longer time scale, suggesting that these features are associated with the main-path flow anomalies (see section 4a).

### d. How important are the regimes?

Based on the foregoing analyses, we have identified two main flow regimes, whose population and persistence significantly exceed the bounds predicted by linear stochastic models. These persistent regimes, which cannot be parameterized linearly, rely on nonlinear dynamics of the atmospheric flow simulated by the GCMs. Regime 1 exists in all simulations but CS-1, whereas regime 2 is not identified in CS-2, -6, and -7.

To address the question of how significant the detected nonlinearities are, we examine the performance of linear stochastic models as a forecasting tool for the full models and compare their overall forecast skill with that in the “regime” and “no regime” situations. The results of such a comparison are shown in Fig. 10, which displays the skill computed as the linear correlation between the actual and linear-model forecasted time series of PCs 1–3 for the GCM simulations CS-1, CS-3, and CS-6. The linear models here were trained using the first 7000 days of the corresponding GCM simulation, whereas the skills were computed based on the last 7000 days; the forecast skill is plotted as a function of the forecast lead time. For this calculation, an *n*-day forecast is made for each PC on each of the last 7000 days. Together with the observed PC values on the forecast target days, this generates a pair of time series. The pairs of *n*-day forecast/observed values are then separated into nonregime and regime days. After this division, the linear correlations between the forecasted and observed PC values are computed. The solid line in each panel shows the forecast skill computed for all of the 7000 verification days, the dashed line represents the correlations between actual and forecasted time series computed using the no-regime target days, and the dashed–dotted line shows correlations computed using only the regime days (the days in which the model trajectory is in one of the two main regimes: 1 or 2).
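The stratification of skill by regime membership can be sketched as follows. This is a minimal illustration, not the procedure used to produce Fig. 10: it assumes that the forecasts for a single PC and a single lead time are already available as a list, that regime membership is evaluated on the forecast target day, and the function names `correlation` and `stratified_skill` are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def stratified_skill(forecast, observed, is_regime_day):
    """Forecast skill for all days, regime days only, and nonregime days only.

    Assumption: is_regime_day[i] flags whether target day i lies in regime 1 or 2.
    """
    pairs = list(zip(forecast, observed, is_regime_day))

    def skill(subset):
        return correlation([p[0] for p in subset], [p[1] for p in subset])

    return {
        "all": skill(pairs),
        "regime": skill([p for p in pairs if p[2]]),
        "nonregime": skill([p for p in pairs if not p[2]]),
    }
```

Repeating the calculation for each lead time *n* would give one curve per category, analogous to the three line styles described above.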

For each case except for PC-3 in CS-6, the linear model’s skill during regime days drops significantly, and the skill during no-regime days only is always slightly higher than the skill computed over the whole time series. Based on the overall and nonregime skill for PC-1, the linear model performs worst (best) for the CS-1 (CS-6) runs, whereas the opposite is true for the skill computed using regime days only. The former effect indicates that the importance of statistically nonlinear feedbacks decreases from CS-1 to CS-6, which may be associated with the gradual shift of dominant variability from stationary, zonal-index type behavior to poleward propagation (see section 5), whose somewhat shorter time scales are better captured by the linear model forecast. The reasons for the decrease of regime-day skill from CS-1 to CS-6 are presently unclear and require further study.

From the above analyses, we can conclude that (i) during regime episodes the linear model is inadequate for prediction, and one can potentially improve the quality of the forecast relative to that of the linear model; (ii) this effect occurs because regimes as defined in our study involve linearly nonparameterizable or statistically nonlinear behavior; and (iii) the lack of the linear model’s skill during regime episodes negatively impacts the overall skill as well: more so in the strongly nonlinear case CS-1, to a smaller degree in the intermediate CS-3 case, and almost unnoticeably in CS-6,^{2} in which the regimes are relatively rare (see Fig. 7).

## 5. Low-frequency oscillations

Son and Lee (2006) reported the occurrence of quasiperiodic poleward propagation in CS-5, -6, and -7; this propagating signal was less pronounced in CS-3 and -4, whereas runs CS-1 and -2 were dominated by irregular and stationary zonal-index-type anomalies. The purpose of this section is to diagnose whether the poleward-propagating feature can be described as statistically linear or nonlinear. This identification could provide additional clues to the dynamical mechanisms governing this variability.

We have reconstructed the dominant mode of oscillatory variability in the GCM simulations and their linear counterparts using M-SSA analysis of the PC-1–PC-2 combined time series, as described in section 2d. All 14 000 points in each of the GCM time series were then classified as belonging to one of the eight phase categories of the model’s dominant oscillation (see section 2d). To address the character of this variability, we first computed the time intervals between two consecutive minima of the reconstructed component (RC-1) time series. The ensemble average of the length of these intervals provides an estimate of the dominant time scale of the model’s LFV; the standard deviation of these lengths, relative to their ensemble-mean value, gives the characteristic bandwidth of the LFV.
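The time scale and bandwidth estimates described above can be sketched in a few lines of Python. The sketch assumes that RC-1 is smooth enough for strict local minima to be meaningful and uses the population standard deviation; the function names `minima_intervals` and `time_scale_and_bandwidth` are hypothetical.

```python
import math


def minima_intervals(rc):
    """Number of days between consecutive strict local minima of a time series."""
    minima = [
        i for i in range(1, len(rc) - 1)
        if rc[i] < rc[i - 1] and rc[i] < rc[i + 1]
    ]
    return [b - a for a, b in zip(minima, minima[1:])]


def time_scale_and_bandwidth(rc):
    """Mean interval between minima and its standard deviation relative to the mean."""
    gaps = minima_intervals(rc)
    mean_gap = sum(gaps) / len(gaps)
    variance = sum((g - mean_gap) ** 2 for g in gaps) / len(gaps)
    return mean_gap, math.sqrt(variance) / mean_gap
```

For a pure sinusoid the relative bandwidth is zero; irregular variability gives a broad spread of intervals and hence a large relative bandwidth.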

These two quantities are displayed in Fig. 11 in the form of an error-bar plot showing the central time scale of the variability along with error bars representing the interval of ± one standard deviation of this time scale. From CS-1 through CS-7, the time scale of variability becomes progressively shorter and, because the bandwidth becomes smaller, the variability becomes more oscillatory in character. The LFV in CS-1 is ultralow-frequency and irregular; the CS-2–CS-5 runs form a group of runs with a moderately short time scale and intermediate bandwidth; and the LFV in CS-6 and CS-7 possesses a quasi-oscillatory character.

Another way of examining the differences in the oscillatory LFV in the CS-1–CS-7 runs is to compute the composite oscillation by averaging the zonal-mean zonal wind anomalies during each of the eight oscillation phase categories. This was computed with the same population for each phase category (see section 2d). With this definition of phase category, flow characteristics such as the anomaly propagation speed are accurately represented in the consecutive composite maps. The examination of these composite oscillations (Fig. 12) reveals that the simulations CS-1 and CS-2 are dominated by stationary anomalies, whereas cases CS-6 and CS-7 exhibit poleward propagation. The intermediate runs, CS-3–CS-5, possess characteristics of both of these extreme cases, so that during some phases the anomalies appear to be stationary or nearly so, whereas other phases show a faster poleward propagation. In CS-6, the phases that were nearly stationary in CS-5 start to exhibit faster propagation, but the fast-propagating phases slow down, resulting in a nearly uniform propagation speed throughout the course of the oscillation; the latter is also true for CS-7. These oscillatory characteristics are consistent with the findings of Son and Lee (2006), who based their analysis on one-point correlation maps of the zonal-mean flow anomalies. Finally, we note that our linear stochastic model yields results that are analogous to those of Figs. 11 and 12 (not shown; however, the autocorrelation plots in Figs. 2 and 4 are an example of a fairly impressive linear match).

To address the question of whether the variability described in this section is entirely statistically linear, we will use our second definition of phase category (see section 2d), in which we choose the eight phase ranges to have the same angular extent (in our case, Δ*φ* = 2*π*/8).^{3} For a purely linear, harmonic oscillation, described by ordinary differential equations with constant coefficients of the kind of our linear stochastic model, the total number of days the system spends in each composite phase would be the same; deviations from the constant population would indicate nonlinearity (an example of such a nonlinear oscillation would be sawtooth-type oscillations, in which some phases are faster than others). In a noisy, finite-length time series the phase populations may, of course, deviate from a constant value just because of sampling.
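The equal-angle phase classification can be illustrated with a short Python sketch. It assumes that the phase angle is measured in the plane of RC-1 and its time derivative, that the derivative is approximated by a centered difference, and that both coordinates are scaled by their standard deviations before the angle is computed; these choices and the function names are hypothetical.

```python
import math


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def phase_populations(rc, n_phases=8):
    """Number of days spent in each of n_phases equal-angle phase categories."""
    x = rc[1:-1]
    # Centered-difference estimate of the time derivative at interior points.
    dx = [(rc[t + 1] - rc[t - 1]) / 2.0 for t in range(1, len(rc) - 1)]
    sx, sdx = std(x), std(dx)
    width = 2.0 * math.pi / n_phases
    counts = [0] * n_phases
    for a, b in zip(x, dx):
        angle = math.atan2(b / sdx, a / sx) % (2.0 * math.pi)
        # Guard against rounding that places the angle exactly at 2*pi.
        counts[min(int(angle / width), n_phases - 1)] += 1
    return counts
```

For a noise-free sinusoid sampled over a whole number of periods, the counts are identical across the phase categories. An oscillation in which some phases are traversed faster than others would be expected to give unequal counts.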

Figure 13 shows the eight phases of the LFV in the CS-3–CS-7 runs. Here, the GCM results are shown using the solid lines with crosses; the 95% confidence intervals are shown with dashed lines. The results argue that the LFV in CS-3–CS-5 exhibits certain phases with duration significantly outside that of the spread predicted by the linear stochastic null hypothesis. In particular, composite phase 5 in CS-3 and CS-4 and phases 2, 3, 6, and 7 in CS-5 are anomalously fast (less populated than predicted by linear models), whereas the composite phases 2 and 4 of CS-4, as well as phases 1 and 5 of CS-5, are anomalously slow. In contrast, the runs CS-6 and -7 are dominated by narrowband statistically linear oscillations because their phase population is within the linear-fit bounds.
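The comparison against the linear stochastic null hypothesis can be sketched as a percentile test on an ensemble of surrogate phase populations. This is only an illustration: it assumes a two-sided interval bounded by the 2.5th and 97.5th percentiles of the surrogates and a linearly interpolated percentile, neither of which is specified in the text, and the function names are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a list of numbers."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (position - low)


def classify_phases(gcm_counts, surrogate_counts, level=95.0):
    """Label each phase category as 'fast', 'slow', or 'linear'.

    gcm_counts[k] is the GCM population of phase k; surrogate_counts is a list
    of such population lists, one per linear-model surrogate run.
    """
    tail = (100.0 - level) / 2.0
    labels = []
    for k, population in enumerate(gcm_counts):
        sample = [run[k] for run in surrogate_counts]
        lower = percentile(sample, tail)
        upper = percentile(sample, 100.0 - tail)
        if population < lower:
            labels.append("fast")    # less populated than the linear prediction
        elif population > upper:
            labels.append("slow")    # more populated than the linear prediction
        else:
            labels.append("linear")  # within the surrogate spread
    return labels
```

Under these assumptions, a run would be called statistically linear only if every phase category receives the label `"linear"`.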

## 6. Are regimes and oscillations related?

The analyses of sections 3–5 have indicated that the GCM simulations CS-3–CS-7 exhibit both statistically nonlinear regime behavior (existence of abnormally persistent flow anomalies; see section 4) and quasiperiodic, ultralow-frequency variability (ULFV; section 5). Although these oscillations are fairly well reproduced by our linear stochastic models (section 3), the occupation frequency of some of the oscillation phase categories in CS-3–CS-5 exhibits statistically significant deviations from linear prediction, whereas the oscillations in CS-6 and CS-7 are found to be statistically linear. These features raise the question of whether the anomalously persistent regimes in CS-3–CS-5 correspond to the abnormally slow phases of the oscillatory LFV in these runs.

We address this issue by computing conditional probabilities of the regime occurrences given the particular phase category of the oscillatory signal for each of the CS-3–CS-5 model simulations. For that purpose, we count the number of days for which regimes co-occur with phase values within a given phase category (recall that there are a total of eight phase categories; note also that in the present analysis we define the categories to have equal population; see section 2d), and normalize by the total number of the phase-category days (14 000/8 = 1750 days). These probabilities clearly have to be nonuniform because the regimes and oscillations are sampled in certain regions of the EOF-1–EOF-2 phase space: regime 1 and oscillation phases 7, 8, 1, and 2 are typically in the right half of this plane, whereas regime 2 and oscillation phases 3, 4, 5, and 6 are in the left half. As an example of this analysis, we plot the conditional probabilities of regimes 1 and 2 for the CS-5 simulation in Fig. 14. Although the regime probabilities based on the GCM data follow the general shape predicted by the linear stochastic models, the quantitative differences are substantial. In particular, the probability of regime 1 is the highest during phase 1 and that of regime 2 during phase 3.
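The conditional probability calculation amounts to a simple count, sketched below in Python. The sketch assumes that each day already carries a phase-category index (starting at 0) and a regime flag; the function name `conditional_regime_probability` is hypothetical.

```python
def conditional_regime_probability(phase_category, in_regime, n_phases=8):
    """Fraction of days in each phase category on which the regime occurs.

    phase_category[i] is an integer in 0..n_phases-1 for day i and
    in_regime[i] is True if day i belongs to the regime of interest.
    """
    days = [0] * n_phases
    hits = [0] * n_phases
    for category, flag in zip(phase_category, in_regime):
        days[category] += 1
        if flag:
            hits[category] += 1
    # Empty categories have no defined probability.
    return [h / d if d else float("nan") for h, d in zip(hits, days)]
```

With equal-population categories the denominators are all the same, so the probabilities in different categories are directly comparable.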

We repeat this analysis for all other runs and regimes, and thus define the phase categories of the dominant ULFV associated with regimes 1 and 2. The phase categories for runs CS-3–CS-7 are identified by arrows in Fig. 13 and show conflicting results in response to the question posed in this section’s opening paragraph; that is, whereas in simulations CS-4 and CS-5, regime 1 is indeed associated with the slow phases of the oscillation (phases 2 and 1, respectively),^{4} in runs CS-3, -6 and -7 the regimes occur during the phase ranges whose duration is within that predicted by the linear stochastic fit. On the other hand, regime 2 in CS-3–CS-5 occurs during the fast rather than the slow phase of the oscillation (phase categories 5, 5, and 3, respectively).

The above results, especially those associating *persistent* regime 2 with *fast* phase categories, are somewhat paradoxical. Their statistical explanation may lie in the fact that the conditional probabilities of regime occurrences do not exceed a value of 1/3 (see Fig. 14), which means that at least 2/3 of the points used to define each phase range are nonregime points. Therefore, there are enough degrees of freedom in the system to maintain both types of behavior; in particular, the slowing down or speeding up of the dominant oscillation’s trajectories in certain regions of the EOF-1–EOF-2 phase space does not necessarily mean that these regions will have persistence characteristics exceeding the predictions of the linear stochastic models (because other flow anomalies *not* associated with the dominant oscillations participate in populating these phase-space regions). Another way to state the above is to say that there appears to be little or no definitive association between the regimes (defined as regions of abnormal persistence in the EOF-1–EOF-2 phase space) and the slow (or fast) phase categories of the dominant ULFV in our model.

Further support for this conclusion comes from analyses of simulations with nontrivial topography (Son et al. 2008). These analyses (not shown) identify much more linear oscillatory variability (with phase-category average duration being well within bounds provided by linear stochastic fit) than in the control, flat-bottom runs across all of the CS-1–CS-7 simulations. This linear oscillatory variability does, nevertheless, coexist with regime behavior, much like our finding that the nonlinearly maintained regime 1 coexists with the apparently linear oscillations of CS-6 and CS-7.

## 7. Summary and discussion

We have analyzed aspects of zonal-mean flow variability in a suite of seven GCM simulations (CS-1–CS-7) with a variety of statistical methods (section 2). These simulations were characterized by differing thermal boundary forcings (Son et al. 2008) and therefore by different climatological zonal-mean potential vorticity (PV) gradients, which translated into changes in low-frequency variability across the simulations. In particular, simulations CS-1 and CS-2 had sharp and strong climatological PV gradients and were dominated by quasistationary, irregular, zonal-index behavior, whereas simulations CS-6 and CS-7 had weaker PV gradients and exhibited quasiperiodic poleward propagation (PP; Son and Lee 2005, 2006). Both zonal-index and PP behavior involve nonlinear processes associated with synoptic eddy fluxes of heat and momentum. Some of these processes can, however, be parameterized statistically as a linear function of the zonal-mean flow state. In this case, we have referred to the associated variability as *statistically linear*; otherwise, the term *statistically nonlinear* has been used. Identifying statistically nonlinear aspects of variability is important because they result in selection of persistent flow anomalies, which are potentially more predictable.

We have distinguished between the statistically linear and nonlinear components of the variability by fitting each simulation to a linear stochastic model; this approach is conceptually analogous to the method of Berner and Branstator (2007). Our stochastic model provides an excellent zero-order description of the zonal-mean zonal flow variability in the GCM (section 3; Figs. 1–4). However, we have shown in section 4 that these linear stochastic models underestimate (by a factor of 1.25–2) the population and persistence characteristics of certain flow patterns observed in the full GCM simulations (Figs. 6–9). We refer to such persistent anomalies as flow *regimes*. The forecast skill of the linear stochastic models deteriorates substantially for the regime days, which negatively impacts the overall forecast skill (Fig. 10).

In all runs, both flow regimes and ultralow-frequency variability were found. Gradual changes were observed from the irregular, quasi-stationary variability of the CS-1 and CS-2 runs to that with a substantial degree of quasi-periodic poleward propagation in runs CS-6 and CS-7 (see Figs. 11 and 12). The regimes do tend to occur preferentially during certain phases of this oscillatory variability, but no definitive association between regime occurrences and anomalous slowing down of dominant oscillation trajectories was found (see Figs. 13 and 14).

Our earlier statements about weak association between regimes and oscillations are at apparent odds with those of Koo et al. (2003) and Kravtsov et al. (2006). These authors studied the variability of the observed zonal-mean flow in the Southern and Northern Hemispheres, respectively. Koo et al. (2003) did not attempt to distinguish between statistically linear and nonlinear processes. Kravtsov et al. (2006), on the other hand, did detect an anomalous “nonlinear” slowdown of oscillation trajectories passing through the regime regions, thus arguing for a stronger association between the regimes and oscillations than implied by our present work. We plan to further study the causes for this discrepancy in the future.

It thus appears that the regimes and oscillations in the present model are governed by different dynamical mechanisms. Statistically, this means that the regimes may not be associated with the nonlinear slowdown of the phase-space trajectories corresponding to the dominant oscillatory variability (Ghil and Robertson 2002). It appears that the regimes arise preferentially from flow anomalies that are at most only weakly related to the dominant oscillation (i.e., from the anomalies associated with nonleading M-SSA modes of the GCM time series).

Son and Lee (2006) have shown that zonal-flow variability is dominated either by the zonal-index behavior or by poleward propagation. By showing that the regimes and oscillations are at best only weakly related, we have demonstrated that this statement does not imply the mutual exclusiveness of regimes and oscillations. One consequence of this is that both phases of the zonal index are not necessarily associated with regimes. Indeed, despite the observation that potential regimes 1 and 2 in runs CS-1 and CS-2 (Fig. 5) are very similar to the two phases of the zonal index (Fig. 6), there is only one “true” persistent regime in each case, namely regime 2 in CS-1 and regime 1 in CS-2. Similarly, the presence of a propagating quasi-oscillatory signal does not mean that multiple regimes are inhibited because we can detect regimes in runs CS-4 and -5 that simultaneously exhibit substantial poleward propagation.

Our results regarding weak statistical association between these two types of zonal-flow behavior are consistent with there being different dynamical mechanisms that maintain the anomalies (Lee et al. 2007; Son et al. 2008). In a future study, we plan to diagnose eddy momentum fluxes conditioned on the occurrences of persistent regimes or phases of propagating oscillations, to study the role of synoptic eddy feedback (Robinson 1991, 2000; Feldstein and Lee 1998; Lorenz and Hartmann 2001, 2003) in the zonal-index-type persistent states and the role of synoptic-wave breaking in controlling ultralow-frequency poleward propagation (James and Dodd 1996; Feldstein 1998; Robinson 2000, 2006).

## Acknowledgments

We are grateful to two anonymous reviewers for their comments, which helped us strengthen the manuscript presentation. This research was supported by the following grants: U.S. Department of Energy Grant DE-FG02-02ER63413, National Aeronautics and Space Administration Grant NNG-06-AG66G-1 (SK), and National Science Foundation Grants ATM-0324908, ATM-0351044, and ATM-0649512 (JTH, SBF, SL, SWS).

## REFERENCES

Berner, J., and G. Branstator, 2007: Linear and nonlinear signatures in the planetary wave dynamics of an AGCM: Probability density functions. *J. Atmos. Sci.*, **64**, 117–136.

Deser, C., 2000: On the teleconnectivity of the “Arctic Oscillation.” *Geophys. Res. Lett.*, **27**, 779–782.

Feldstein, S. B., 1998: An observational study of the intraseasonal poleward propagation of zonal mean flow anomalies. *J. Atmos. Sci.*, **55**, 2516–2529.

Feldstein, S. B., 2000: Is interannual zonal mean flow variability simply climate noise? *J. Climate*, **13**, 2356–2362.

Feldstein, S. B., and S. Lee, 1998: Is the atmospheric zonal index driven by an eddy feedback? *J. Atmos. Sci.*, **55**, 3077–3086.

Feldstein, S. B., and C. Franzke, 2006: Are the North Atlantic Oscillation and the Northern Annular Mode distinguishable? *J. Atmos. Sci.*, **63**, 2915–2930.

Ghil, M., and A. W. Robertson, 2002: “Waves” vs. “particles” in the atmosphere’s phase space: A pathway to long-range forecasting? *Proc. Natl. Acad. Sci. USA*, **99**, (Suppl. 1), 2493–2500.

Ghil, M., 2002: Advanced spectral methods for climatic time series. *Rev. Geophys.*, **40**, 1003, doi:10.1029/2000RG000092.

Hurrell, J. W., Y. Kushnir, M. Visbeck, and G. Ottersen, 2003: An overview of the North Atlantic Oscillation. *The North Atlantic Oscillation: Climate Significance and Environmental Impact, Geophys. Monogr.*, Vol. 134, Amer. Geophys. Union, 1–35.

James, I. N., and J. P. Dodd, 1996: A mechanism for the low-frequency variability of the midlatitude troposphere. *Quart. J. Roy. Meteor. Soc.*, **122**, 1197–1210.

Koo, S., A. W. Robertson, and M. Ghil, 2003: Multiple regimes and low-frequency oscillations in the Southern Hemisphere’s zonal-mean flow. *J. Geophys. Res.*, **107**, 4596, doi:10.1029/2001JD001353.

Kravtsov, S., D. Kondrashov, and M. Ghil, 2005: Multi-level regression modeling of nonlinear processes: Derivation and applications to climatic variability. *J. Climate*, **18**, 4404–4424.

Kravtsov, S., A. W. Robertson, and M. Ghil, 2006: Multiple regimes and low-frequency oscillations in the Northern Hemisphere’s zonal-mean flow. *J. Atmos. Sci.*, **63**, 840–860.

Lee, S., S-W. Son, K. Grise, and S. B. Feldstein, 2007: A mechanism for the poleward propagation of zonal mean flow anomalies. *J. Atmos. Sci.*, **64**, 849–868.

Lorenz, D. J., and D. L. Hartmann, 2001: Eddy–zonal flow feedback in the Southern Hemisphere. *J. Atmos. Sci.*, **58**, 3312–3327.

Lorenz, D. J., and D. L. Hartmann, 2003: Eddy–zonal flow feedback in the Northern Hemisphere winter. *J. Climate*, **16**, 1212–1227.

Monahan, A. H., and J. C. Fyfe, 2006: On the nature of zonal jet EOFs. *J. Climate*, **19**, 6409–6424.

Plaut, G., and R. Vautard, 1994: Spells of low-frequency variability and weather regimes in the Northern Hemisphere. *J. Atmos. Sci.*, **51**, 210–236.

Reinhold, B. B., and R. T. Pierrehumbert, 1982: Dynamics of weather regimes: Quasi-stationary waves and blocking. *Mon. Wea. Rev.*, **110**, 1105–1145.

Reinhold, B. B., and R. T. Pierrehumbert, 1985: Corrections to “Dynamics of weather regimes: Quasi-stationary waves and blocking.” *Mon. Wea. Rev.*, **113**, 2055–2056.

Riehl, T., C. Yeh, and N. E. La Seur, 1950: A study of variations of the general circulation. *J. Meteor.*, **7**, 181–194.

Robinson, W. A., 1991: The dynamics of the zonal index in a simple model of the atmosphere. *Tellus*, **43A**, 295–305.

Robinson, W. A., 2000: A baroclinic mechanism for the eddy feedback on the zonal index. *J. Atmos. Sci.*, **57**, 415–422.

Robinson, W. A., 2006: On the self-maintenance of midlatitude jets. *J. Atmos. Sci.*, **63**, 2109–2122.

Son, S-W., and S. Lee, 2005: The response of westerly jets to thermal driving in a primitive equation model. *J. Atmos. Sci.*, **62**, 3741–3757.

Son, S-W., and S. Lee, 2006: Preferred modes of variability and their relationship with climate change. *J. Climate*, **19**, 2063–2075.

Son, S-W., S. Lee, S. Feldstein, and J. Ten Hoeve, 2008: Time scale and feedback of zonal mean flow variability. *J. Atmos. Sci.*, **65**, 935–952.

Sura, P., M. Newman, C. Penland, and P. Sardeshmukh, 2005: Multiplicative noise and non-Gaussianity: A paradigm for atmospheric regimes? *J. Atmos. Sci.*, **62**, 1391–1409.

Thompson, D. W. J., and J. M. Wallace, 1998: The Arctic Oscillation signature in the wintertime geopotential height and temperature fields. *Geophys. Res. Lett.*, **25**, 1297–1300.

Thompson, D. W. J., and J. M. Wallace, 2000: Annular modes in the extratropical circulation. Part I: Month-to-month variability. *J. Climate*, **13**, 1000–1016.

Vautard, R., B. Legras, and M. Déqué, 1988: On the source of midlatitude low-frequency variability. Part I: A statistical approach to persistence. *J. Atmos. Sci.*, **45**, 2811–2843.

Yu, J-Y., and D. L. Hartmann, 1993: Zonal flow vacillation and eddy forcing in a simple GCM of the atmosphere. *J. Atmos. Sci.*, **50**, 3244–3259.

^{1} Note that the low values for the *e*-folding time scales, on the order of a few days, do not mean that the events are relatively short; they simply refer to an alternative quantitative measure of persistence. As can be seen in Fig. 9, an average duration of about 8 days corresponds to an *e*-folding time scale of about 3 days.

^{2} We note here that in the simulation CS-7 the overall effect of nonlinearity becomes more pronounced compared to CS-6 as far as the difference in the regime/nonregime skills is concerned (not shown).

^{3} Recall that the oscillation phase categories in each case are determined by the phase angle in the phase space of RC-1 and its derivative, not in the RC-1–RC-2 space.

^{4} We note here that although the dominant variability in simulations CS-1 and CS-2 is irregular, we can still define the phases of this variability in the same way as for other runs and identify both the phase-category average population and regime associations. In particular, the occurrence of regime 2 in CS-1 and regime 1 in CS-2 is found to be most probable during the abnormally slow phases of ULFV in these runs (not shown).