## Abstract

This paper constructs and analyzes a reduced nonlinear stochastic model of extratropical low-frequency variability. To do so, it applies multilevel quadratic regression to the output of a long simulation of a global baroclinic, quasigeostrophic, three-level (QG3) model with topography; the model's phase space has a dimension of *O*(10^{4}).

The reduced model has 45 variables and captures well the non-Gaussian features of the QG3 model's probability density function (PDF). In particular, the reduced model's PDF shares with the QG3 model its four anomalously persistent flow patterns, which correspond to opposite phases of the Arctic Oscillation and the North Atlantic Oscillation, as well as the Markov chain of transitions between these regimes. In addition, multichannel singular spectrum analysis identifies intraseasonal oscillations with a period of 35–37 days and of 20 days in the data generated by both the QG3 model and its low-dimensional analog.

An analytical and numerical study of the reduced model starts with the fixed points and oscillatory eigenmodes of the model's deterministic part and uses systematically an increasing noise parameter to connect these with the behavior of the full, stochastically forced model version. The results of this study point to the origin of the QG3 model's multiple regimes and intraseasonal oscillations and identify the connections between the two types of behavior.

## 1. Introduction

We analyze the output of a global, three-level, quasigeostrophic (QG3) atmospheric model with topography (Marshall and Molteni 1993; D'Andrea and Vautard 2001; Kondrashov et al. 2004). As shown by these authors, the QG3 model has a fairly realistic climatology and rich variability, which also compares favorably with atmospheric behavior observed in Northern Hemisphere (NH) midlatitude flows. In addition to synoptic variability associated with baroclinic eddies, the model is characterized on longer time scales by the existence of a few persistent and recurrent flow patterns, or weather regimes (Reinhold and Pierrehumbert 1982; Legras and Ghil 1985; Molteni 2002), as well as by intraseasonal oscillations (Ghil and Robertson 2000, 2002; Kondrashov et al. 2004).

These coarse-grained features of the model's low-frequency variability (LFV) may be better understood by using reduced models, which have considerably fewer degrees of freedom. Such models can accurately represent both linear and nonlinear aspects of the full model's LFV, while parameterizing the effect of higher-frequency, synoptic transients on LFV (Robinson 1996, 2000; Lorenz and Hartmann 2001, 2003; Kravtsov et al. 2003, 2005a). In this paper, we construct and analyze such a reduced nonlinear model, which is based solely on the output of a long QG3 simulation and involves a stochastic parameterization of the synoptic eddies' effect on the full model's LFV (Kravtsov et al. 2005b).

To construct reduced dynamical models, one often rewrites the full dynamical model equations in terms of empirical orthogonal functions (EOFs; Preisendorfer 1988) derived from a long simulation of the latter. The equations in this basis are then truncated by retaining only a few leading EOFs that represent large-scale, low-frequency flow (Rinne and Karhila 1975; Schubert 1985; Sirovich and Rodriguez 1987; Mundt and Hart 1994; Selten 1995, 1997), while the residual variance associated with the remaining EOFs is usually treated as random forcing. Alternatively, one can develop a deterministic, flow-dependent parameterization of these fast unresolved processes based on the library of differences between the tendency of the full and truncated models (D'Andrea and Vautard 2001; D'Andrea 2002). Yet another approach to this closure problem, which is mathematically rigorous in the limit of significant scale separation, has been developed by Majda et al. (1999, 2001, 2002, 2003). Franzke et al. (2005) have recently applied this approach to a barotropic model on the sphere, with a T21 resolution, while C. Franzke and A. Majda (2005, personal communication) have applied it to the QG3 model.

The closure problem above can be effectively addressed in a data-driven, rather than model-driven approach, by using inverse stochastic models; these models rely almost entirely on the dataset's information content, while making only minimal assumptions about the underlying dynamics. The simplest type of inverse stochastic model is the so-called linear inverse model (LIM; Penland 1989, 1996; Penland and Ghil 1993), in which the dependence of the state vector's time derivatives on the state vector itself is assumed to be linear, while the time-dependent flow is forced by spatially correlated white noise. LIMs have shown some success in predicting El Niño–Southern Oscillation (ENSO; Penland and Sardeshmukh 1995; Johnson et al. 2000), tropical Atlantic sea surface temperature variability (Penland and Matrosova 1998), as well as extratropical atmospheric variability (Winkler et al. 2001).

Kravtsov et al. (2005b) have recently developed generalizations of LIMs that relax the assumptions of model linearity and of the noise being white in time. Colored noise, in particular red noise, can be accommodated by allowing additional model levels, which permit one to achieve truly white noise at the last level; model nonlinearities are accommodated in the deterministic part of the first level. Such nonlinear, multiple-level inverse models have proven useful in modeling LFV of NH geopotential height anomalies (Kravtsov et al. 2005b), as well as tropical sea surface temperature variability (Kondrashov et al. 2005). In the present paper we apply methodology of Kravtsov et al. (2005b) and of Kondrashov et al. (2005) to obtain reduced analogs of the QG3 model. The best reduced model so obtained is then used to shed light on the full model's LFV.

The paper is organized as follows. In section 2, we describe the experimental setup of the QG3 model and the statistical methods used to analyze its behavior; the multilevel regression modeling technique used to construct the reduced model is reviewed in the appendix. An excellent match between the statistical properties of the full and reduced models is documented in section 3, in terms of both intraseasonal oscillations and multiple flow regimes. In section 4 we interpret the results of section 3 by examining quasi-stationary states and linear eigenmodes of the deterministic part of the reduced model, and study the behavior of this model as a function of the stochastic forcing amplitude. Concluding remarks follow in section 5.

## 2. Data and methodology

We analyze the daily output from a perpetual-winter simulation that is 54 000 days long and was described in detail by Kondrashov et al. (2004). Each of the three levels of the QG3 model has 1024 grid points in the NH. Since the model's LFV is equivalent barotropic, we apply principal component (PC) analysis to its NH 500-hPa streamfunction anomalies to reduce the dataset's dimensionality; in computing EOFs, the anomalies are weighted by the cosine of the latitude (Branstator 1987).

We study the QG3 model's behavior in the phase space spanned by its leading EOFs by applying two distinct types of statistical data analysis: probability density function (PDF) estimation via Gaussian mixtures (Smyth et al. 1999; Hannachi and O'Neil 2001; Kondrashov et al. 2004) and the multichannel version of singular spectrum analysis (M-SSA; see Ghil et al. 2002 and references therein). These two types of analysis correspond to two complementary descriptions of LFV: (i) the episodic one in terms of anomalously persistent multiple flow regimes, and the Markov chain of transitions between them; and (ii) the one that emphasizes low-frequency oscillations (Ghil and Robertson 2002).

The weather regimes correspond to the *k* clusters obtained by the Gaussian mixture analysis applied in the subspace of *d* leading EOFs. The optimal number *k** of clusters is determined using a built-in criterion, based on cross-validated log-likelihood estimates. Each data point has a degree of membership in several clusters, depending on its position with respect to each cluster centroid and the weight of that cluster. Based on a combination of these two criteria, one can associate each point with a unique cluster and obtain the composite pattern of anomalies associated with this cluster. Furthermore, we define regime events as the number of consecutive points (days) along the model trajectory that fall within a given cluster, and compute various quantities related to conditional probabilities of regime occurrence, which characterize the statistics of the transitions between regimes (Mo and Ghil 1988; Kimoto and Ghil 1993b). The preferred transition paths so identified may, in certain cases, be associated with low-frequency oscillations present in the dataset (Ghil et al. 1991; Kondrashov et al. 2004).

These oscillations will be identified here by M-SSA analysis (Keppenne and Ghil 1993), which is especially useful in the study of amplitude- and phase-modulated signals. M-SSA finds eigenvalues and eigenvectors of the grand covariance matrix built from lagged copies of a vector time series. An oscillatory mode is characterized by a pair of nearly equal eigenvalues and periodic eigenvectors that correspond to the same frequency. Following Ghil et al. (2002), we apply a Monte Carlo test to ascertain statistical significance of the oscillations detected by M-SSA. In addition, we subject suspected oscillatory pairs to the lag-correlation test of Plaut and Vautard (1994). All oscillatory signals identified in section 3 pass both of these tests.

The data-adaptively bandpass-filtered time series associated with the oscillatory modes are called reconstructed components (RCs), which we use to define the oscillation's phase categories (Keppenne and Ghil 1993; Plaut and Vautard 1994). Similarly to the regime compositing described above, we also performed a composite analysis keyed to a given phase of a low-frequency oscillation to describe the anomalies in physical space associated with that phase (Ghil and Mo 1991; Plaut and Vautard 1994).

The statistical significance of any composite quantity can be estimated using a nonparametric Monte Carlo method (Dole and Gordon 1983; Vautard et al. 1990; Plaut and Vautard 1994). To do so, one gathers into time segments the consecutive days belonging to a given oscillation phase or a given regime, including a “null regime,” defined as all data points that do not belong to any of the regimes identified. These segments are randomly shuffled 100 times, thus providing 100 independent realizations with the same length as the original time series. Each quantity estimated for a given composite, whether keyed to a given regime or to an oscillation's given phase category, can also be computed using the 100 shuffled sets of category numbers. The 95% confidence interval, for example, is then bounded by the 2.5th and 97.5th percentiles of the random values so computed, sorted in ascending order.

The procedure for constructing the reduced model has been described in detail by Kravtsov et al. (2005b) and is reviewed here briefly in the appendix. The model has, in general, *I* variables at the first level and *N* levels; see Eq. (A1). We chose the number *I* of state vector components (EOFs) to achieve maximal correspondence between the behavior of the reduced model and full QG3 model in terms of their PDF structure and the periods, as well as spatial patterns, of low-frequency oscillations in each model. The reduced model with *I* = 15 variables and quadratic nonlinearities at the first level produced optimal results, while the number of levels is *N* = 3, and it is given by the requirement that the additive noise be truly white at the last level; see Kravtsov et al. (2005b). The total number of variables in the inverse model is thus *I* × *N* = 15 × 3 = 45, two orders of magnitude less than that in the full QG3 model.

## 3. Weather regimes and intraseasonal oscillations

We performed long simulations (≥54 000 days) of the reduced model (A1) with *I* = 15 and *N* = 3, using a time step of *δt* = 1 day. In this section, we compare the statistical properties of the datasets produced by the reduced model and the full QG3 model.

### a. Weather regimes

The PDF of the datasets produced by the QG3 and the reduced model are shown Fig. 1 in the subspace of the QG3 model's three leading EOFs. The clusters were found using mixtures of *k* = 4 Gaussian components in a phase subspace of three leading EOFs, which capture 25% of the total variance. The optimal number of clusters is *k* = 4 for both the QG3 simulation and for the reduced-model dataset, as determined by the cross-validation procedure of Smyth et al. (1999); see also Kondrashov et al. (2004). The locations, shapes and sizes of clusters, and hence the general shape of the QG3 model's PDF, are well reproduced by our reduced model's PDF in Fig. 1. On the other hand, linear reduced models driven by Gaussian noise cannot capture the clearly non-Gaussian shape of the QG3 model's PDF and the associated weather regimes; see further discussion of this point in Kravtsov et al. (2005b) and Kondrashov et al. (2005).

The composites over the data points that belong to each of the ellipses in Fig. 1 represent, in physical space, the patterns of four planetary flow regimes (Legras and Ghil 1985; Ghil and Childress 1987, chapter 6; Mo and Ghil 1988; Cheng and Wallace 1993; Kimoto and Ghil 1993a, b; Hannachi 1997; Smyth et al. 1999; Hannachi and O'Neill 2001; Molteni 2002). These regimes are associated with the opposite phases of the NH annular mode, the so-called Arctic Oscillation (AO; Deser 2000; Thompson and Wallace 2000; Thompson et al. 2000; Wallace 2000), and the sectorial North Atlantic Oscillation (NAO; Hurrel 1995) pattern.

In Fig. 1a, cluster AO^{−} occupies a distinctive region on the PDF ridge that stretches along EOF-1. It corresponds to the low-index phase of the AO (Deser 2000; Wallace 2000). The clusters AO^{+}, NAO^{−}, and NAO^{+} are located around the global PDF maximum, with the centroid of AO^{+} to the left and below, NAO^{+} above, and NAO^{−} slightly to the right of this maximum, respectively. These four regimes are not identical to but in fairly good agreement with the observational results of Cheng and Wallace (1993) and Smyth et al. (1999); see also Ghil and Robertson (2002) and Kondrashov et al. (2004).

Since we are ultimately interested in explaining LFV in NH observations, one might question the independence of the two NAO-related and the two AO-related regimes in the QG3 model. The relations between hemispheric and sectorial regimes are a topic of continuing investigation and debate. Watanabe (2004), for instance, showed, using both NH observations and a linear barotropic model, how the sectorial NAO could have, at certain times of year and in combination with other sectorial phenomena, a downstream effect extending all the way to East Asia. Kravtsov et al. (2006), on the other hand, distinguish in NH observations between a more truly hemispheric character of AO^{−} versus more complex and sectorial manifestations of AO^{+}. As we shall see in section 4, these observational results are reflected in the reduced model's AO^{−} regime being associated with a stable fixed point of our model's deterministic part.

The streamfunction anomalies associated with each regime centroid of the QG3 model are plotted in Fig. 2. The spatial correlations between these anomaly patterns and those obtained from the reduced model (not shown) all exceed 0.9. They are thus much higher than the correlations obtained by D'Andrea and Vautard (2001) and D'Andrea (2002) in their 10-variable reduced model.

### b. Intraseasonal oscillations

We compare the results of M-SSA analysis for the QG3-model and the reduced-model simulations in Fig. 3. Both datasets have the same length of 54 000 days. M-SSA was applied to the time series of the three leading PCs of the QG3 data. The QG3-model spectrum (dashed line) and the reduced-model spectrum (solid line) show excellent agreement in terms of overall shape, as well as in the location of the leading significant oscillatory pair, marked by an arrow. This oscillatory mode has a period of about 37 days in both models and will be analyzed in detail in the remainder of this paper. The second, less energetic, oscillatory mode has a period of about 20 days in the QG3 model (see Kondrashov et al. 2004), and lies outside of the frequency range of Fig. 3. This secondary mode is also present in the reduced model, but is less robust (see section 4b below). The other spectral peaks in Fig. 3 correspond to single PCs and not to oscillatory pairs in M-SSA.

The composite maps keyed to the phases of this oscillation (see section 2) are shown in Fig. 4 for the QG3-model simulation. The reduced model produces results that are virtually identical to those in Fig. 4 (not shown). Figures 4a–d illustrate four successive, equally populated composites that, taken together, cover one-half cycle of the oscillation.

The composites of Figs. 4a,d,b have a strong resemblance to the NAO^{+}, NAO^{−}, and AO^{+} flow-regime centroids of Figs. 2a–c, respectively. After passing through the NAO^{+} (Fig. 4a) and AO^{+} (Fig. 4b) phases, the next phase of our reduced model's oscillation resembles the Pacific–North American (PNA) pattern (Fig. 4c), before reaching the NAO^{−} phase (Fig. 4d); the PNA is a well-known pattern that, along with the AO and NAO, characterizes dominant modes of the NH LFV (Wallace and Gutzler 1981; Ghil and Robertson 2002). Overall, this oscillation shares several features of the oscillatory topographic instability described by Ghil and associates in a hierarchy of models and in NH observations (Legras and Ghil 1985; Jin and Ghil 1990; Ghil and Mo 1991; Ghil and Robertson 2000; Lott et al. 2001, 2004a, b); see Kondrashov et al. (2004) for a more detailed discussion.

### c. Connection between regimes and oscillations

The similarity between the regime composites and certain phases of the low-frequency oscillations described in the previous subsection points to possible relationships between the episodic and oscillatory descriptions of LFV (Ghil et al. 1991; Kimoto and Ghil 1993b; Plaut and Vautard 1994; Ghil and Robertson 2002; Koo et al. 2002; Kravtsov et al. 2006). Kondrashov et al. (2004) computed conditional probabilities of regime transitions for the QG3 model and identified the preferred cycle NAO^{+} → AO^{+} → NAO^{−}, which is consistent with the trajectory of the intraseasonal oscillation. We have computed the transition probability matrix for our reduced model (not shown) and have found it to be almost identical to that of the QG3 model (see Kondrashov et al. 2004, their Table 6).

The relationship between the regimes and oscillations can be quantified by computing the conditional probability of a given regime occurrence, assuming the knowledge of the intraseasonal oscillation's phase category (see section 2), as shown in Fig. 5a for the QG3 model (left panel) and the reduced model (right panel). In both models, the NAO^{+}, NAO^{−}, and AO^{+} regimes are associated with the same distinct phases of the intraseasonal oscillation, while the occurrence probability of the AO^{−} regime does not strongly depend on the oscillation's phase category. This is consistent with the oscillation's spatial patterns in Fig. 4, of which only three resemble one of the four flow regimes. Small differences in phase categorization between Fig. 4 and Fig. 5 are due to the fact that phases of the 37-day oscillation strongly resemble, but are not identical to the weather regimes; compare Figs. 2 and 4.

To examine further the relationship between regimes and oscillations, we computed the composite phase velocity in the plane spanned by the pair of RCs that captures most of the variance associated with the intraseasonal oscillation, and the corresponding tendency. This pair is given, in both models, by RCs 8–9. Since the RCs in M-SSA are themselves vector-valued time series, we used the second channel, denoted by RC*-2, of this pair, along with its tendency. Both RC*-2 and the tendency time series were normalized by their respective standard deviations; with this normalization, a purely sinusoidal oscillation has a constant phase velocity (see Kravtsov et al. 2006).

The results are shown in Fig. 5b for the QG3-model (left panel) and the reduced model (right panel): for both models, the trajectory slows down considerably in the AO^{+} and NAO^{−} regimes, while it accelerates in the NAO^{+} regime. These results are consistent with a larger number of days and regime events for the high-index AO^{+} and blocked NAO^{−} regimes, when compared to these quantities for the zonal NAO^{+} regime (Kondrashov et al. 2004).

## 4. Analysis of the reduced model

To identify the roots of multiple flow regimes and intraseasonal oscillations, we study in sections 4a and 4b the properties of the full dynamical operator of Eq. (A1) obtained by dropping the stochastic forcing term *d***r**^{(2)} at the third and last level of the reduced model. In this study, we also consider the even simpler, quadratic operator obtained by retaining only the deterministic component of the model's first level. Section 4c then traces the onset of the complete behavior described in section 3, as the stochastic forcing amplitude changes from zero to the value obtained by the inverse modeling procedure described in section 2 and the appendix.

### a. Steady and quasi-stationary states

We identify quasi-stationary states of the reduced model's full dynamical operator by computing multiple local minima of the quadratic functional defined as the sum, over all 45 state variables, of their squared tendencies (Legras and Ghil 1985; Mukougawa 1988; Vautard and Legras 1988). To do so, we apply a “subspace trust region” technique from Matlab's Optimization Toolbox. This method is based on the interior-reflective Newton procedure (Coleman and Li 1994, 1996), which involves approximate solutions of the minimization problem for this functional; these solutions, in turn, rely on using a preconditioned conjugate-gradient method. The degree of quasi-stationarity of the minimal-tendency states so obtained is controlled by a preset tolerance *β:* for a given *β* and a given initial guess of the solution, the procedure iteratively corrects the solution until the value of the functional drops below *β*. The values of *β* we use below should be compared with the typical variance of the reduced model's state vector, which is approximately equal to 10^{−2}. Once *β* is chosen, we repeat the minimization procedure many times for randomly chosen initial states.

For *β* ≤ 10^{−8}, the procedure converges to the single solution located in the vicinity of the AO^{−} regime (see Fig. 1), irrespective of the initial guess used. This is a true steady state, which is linearly stable (see section 4b below) and can be obtained by direct integration of the reduced-model equations with no stochastic forcing. The AO^{−} regime in the reduced-model and in the full QG3 model simulations is thus associated with the presence of this stable fixed point in the reduced model's dynamical operator.

As *β* increases, we find multiple local minima that correspond to quasi-stationary states of the reduced model. The locations of these minima in the phase space of the latter's three leading EOFs are shown in Fig. 6, for *β* = 10^{−6} in Fig. 6a and *β* = 10^{−5} in Fig. 6b. If *β* is sufficiently small, *β* < 10^{−7}, the quasi-stationary states are still located mostly in the vicinity of the AO^{−} regime (not shown). With increasing *β*, these states start to progressively spread out toward the climatological mean of the full QG3 model and form a distinct “tongue” (Figs. 6a,b) aligned along the EOF-1 axis. This small-tendency region in the reduced model's phase space is thus responsible for the PDF ridge in Figs. 1a,b.

### b. Linear stability analysis

We perform here a linear stability analysis of the reduced model's fixed points. The eigenvectors associated with pairs of complex eigenvalues represent damped oscillations, whose decay time and period are related to the inverse of the corresponding eigenvalue's real and imaginary parts, respectively.

The dynamical operator we considered so far arises by dropping the white-noise forcing *d***r**^{(2)} at the third level; as a result, *d***r**^{(1)} and *d***r**^{(0)} also become deterministic. It is quite useful to investigate the fixed points of the reduced model's deterministic operator at the first level as well. Instead of a single stable steady state for the full dynamical operator with all 3 levels discussed in section 4a, two steady states are found: one of them is located near the AO^{−} regime and it is linearly unstable, with a growth rate of ≈25 days (see below), while the other is located near the climatology and it is stable. The locations of all fixed points are shown in Fig. 7a, while the eigenspectrum for each of them is shown in Fig. 7b.

For the unique steady state of the full, three-level operator (diamonds in Figs. 7a,b), all of the eigenmodes are decaying. The two least-damped modes are oscillatory; they are characterized by a decay time of approximately 6 days, and have periods of 20 and 35 days, respectively. These two periods are close to the ones obtained by M-SSA analysis of the full QG3 and reduced-model solutions in section 3b (see also Kondrashov et al. 2004). It is tempting, therefore, to interpret the oscillations identified in the two models' simulations as these damped modes. In the reduced model they are excited by the stochastic forcing, while in the full QG3 model their excitation is due to the smaller-scale modes that have been filtered out of the deterministic dynamics by our mode-reduction approach.

The trajectory of the 37-day intraseasonal oscillation, which is shown in Figs. 6a,b, evolves around the time-mean state of the QG3 model, rather than around the model's steady state. This result is consistent with the fact that the AO^{−} regime does not appear to be associated with our 37-day oscillation (see section 3c here and section 5 in Kondrashov et al. 2004).

We computed therefore the linear eigenmodes of the reduced model linearized about the climatological mean state and found that the least-damped eigenmode in this case has a decay time of 6 days and a period of 37 days (filled circles in Figs. 7a,b), similar to what we obtained when linearizing about the unique steady state of the reduced model's full dynamical operator. The 20-day mode, however, is strongly damped when linearizing about the reduced model's climatology, with a decay time of 3 days or less (not shown), which might explain the dominance of the 37-day cycle in the QG3 model and in NH observations.

A similar separation in periods is observed in the oscillatory modes associated with the fixed points of the first-level dynamical operator. The least-damped mode for the unstable steady state near AO^{−} (downward-pointing triangles in Figs. 7a,b) has a decay time of ∼15 days and a period of 25 days, while the corresponding mode for the stable steady state near the climatology (filled squares in Figs. 7a,b) has a period of 35 days and a decay time of 16 days; note that the instability of the former steady state is of exponential, rather than oscillatory type.

The shorter decay times of the oscillatory eigenmodes and the transformation of the steady state near AO^{−} from unstable to stable, when going from the first-level to the three-level deterministic dynamics, indicate that the two additional, linear levels involved in modeling the red-noise forcing levels play a damping role. This role makes both mathematical and physical sense. Mathematically, red noise is associated with linear damping of white noise, which is thus correctly captured by the two additional levels. Physically, the effect of the smaller scales on the large ones can involve pumping energy into the latter, but also damping instabilities associated with their self interaction.

Figure 8 shows the real (Fig. 8a) and imaginary (Fig. 8b) parts of the 37-day mode obtained by linearizing the full dynamical operator of the reduced-model equations about climatology, while Figs. 8c,d show the patterns for the 35-day mode obtained by linearization of the same operator about its unique steady state. The spatial patterns of these two eigenmodes are remarkably similar.

The similarity between the two patterns can be interpreted as follows. Both the AO^{−} regime and the reduced model's steady state are primarily associated with the weakening and southward shift of the jet stream position, with respect to climatology. The changes to the jet stream that are related to the AO^{−} anomaly have, therefore, a large degree of zonal symmetry (see Fig. 2d) so that their self-interaction tends to vanish. This feature is consistent with the presence of a tongue of quasi-stationary states along the EOF-1 axis in Fig. 6. The approximate invariance of the least-damped eigenmodes of the reduced-model equations, when linearized about either the climatological jet or its uniformly shifted counterpart means that the least-damped eigenmodes are not sensitive to AO-type changes of the basic state. In other words, the eigenmodes turn out to be patterns whose interaction with AO-type changes of the midlatitude jet vanishes.

We visualize in Figs. 9a–d the linear mode's evolution during one-half of its 37-day oscillation cycle. This figure shows one-to-one correspondence between the linear eigenmode of the reduced model and the simulated intraseasonal pattern of the full QG3 model in Fig. 4. We conclude, therefore, that the QG3 model's 37-day intraseasonal oscillation is associated with the damped linear eigenmode of the reduced model, excited by the modes of the full model that are captured by the stochastic forcing in the reduced one.

The power spectrum of the PCs is getting progressively whiter as the rank increases, while the spatial scale of the EOFs is getting smaller. In both the reduced and QG3 model simulations, the intraseasonal oscillation has a large amplitude in the four leading PCs, as indicated by the size of the streamfunction anomalies in Fig. 4; the variance associated with this oscillation is very small, though, for the higher-ranked EOFs 5–15 (not shown). The cluster-centroid patterns in both the QG3 model and the reduced model (section 3) also have essentially null projections onto EOFs 5–15 (not shown here, but see Kondrashov et al. 2004). The reduced model with only a few state vector variables at the first level, though—say *I* = 4 variables instead of the 15 used in the reduced model of sections 3 and 4—reproduces neither the period and spatial pattern of the QG3 model's intraseasonal signal nor its PDF structure (not shown).

Based on this evidence, we conclude that interactions between the leading modes 1–4 of the system's LFV, on the one hand, and the intermediate modes represented by EOFs 5–15, on the other, are an essential contributor to the system's low-dimensional, leading-order dynamics. Such interactions between low-frequency flow and intermediate modes may also be instrumental in determining the region of phase space in which the dominant intraseasonal oscillation actually takes place (see Fig. 6), thus removing the degeneracy of the leading eigenmodes of the reduced model's linearization with respect to the shifts of the basic state along the EOF-1 axis (see Fig. 8 and its discussion above). These interactions could also play a role in slowing down the intraseasonal oscillation's trajectory in the vicinity of the anomalously persistent flow regimes detected in section 3c. Besides the intermediate modes 5–15, the “noise” associated with EOFs 16 and higher in the QG3 model can also play a role in the system's leading-order dynamics, as we shall see forthwith.

### c. Emergence of weather regimes

In this section, we track the emergence of the QG3 model's weather regimes. To do so, we introduce a factor *ɛ* (0 ≤ *ɛ* ≤ 1) that multiplies the stochastic forcing term *d***r**^{(2)} at the third level of the reduced model (see the appendix), and perform simulations of the reduced model so modified using different values of *ɛ*. For *ɛ* = 0, the solution converges to the steady state located within the AO^{−} regime (see diamond in Fig. 7a and section 4a). For small values of *ɛ*, the stochastic forcing merely causes the model trajectories to wander in the vicinity of this steady state. As the stochastic forcing amplitude *ɛ* increases, the region occupied by the model's trajectory grows in size. As progressively larger portions of the phase space are filled up, the associated PDF becomes more inhomogeneous.

In Figs. 10a–d, we show a mixture-model approximation of this PDF, along with the clusters supported by the simulated data for *ɛ* = 0.2, 0.4, 0.6, and 1, respectively. The optimum number of clusters *k** for each value of *ɛ* is obtained using the cross-validated log-likelihood criterion (Smyth et al. 1999; see also section 2). For *ɛ* < 0.15, *k** = 1 (not shown), while *k** increases with *ɛ*: *k** = 2 for *ɛ* = 0.2 (Fig. 10a), *k** = 3 for *ɛ* = 0.4 (Fig. 10b), and *k** = 4 for *ɛ* ≥ 0.6 (Figs. 10c,d). This increased non-Gaussianity of the model's PDF is indicative of the weather regimes' nonlinear origin, since a linear system forced by Gaussian stochastic forcing will always support only one mixture-model cluster, centered around the mean.

The AO^{−} regime, characterized by a large positive projection onto EOF-1, is present in all of the reduced-model PDFs of Fig. 10 and is clearly related to the deterministic model version's steady state (see section 4a). As *ɛ* increases, the model PDF becomes elongated in the negative EOF-1 direction (Figs. 10a,b), thereby spending most of the time in the vicinity of the quasi-stationary tongue of Fig. 6. As *ɛ* increases even further (Figs. 10c,d), additional spreading of the PDF along the EOF-1 axis is accompanied by the PDF's increased projection onto the (EOF-2–EOF-3) plane (see Fig. 1c for *ɛ* = 1; not shown for smaller *ɛ*).

This increased three-dimensionality of the PDF is associated with the excitation of the oscillatory modes of sections 3b and 4b. The slow phases of the 37-day oscillation, due to interactions between the model's lowest-frequency EOFs (1–4) and faster (EOFs 5–15) modes (see sections 3c and 4b) become associated with the emergence of additional anomalously persistent regimes. The latter occupy the vicinity of the model's climatological state, farther away from the three-level reduced model's unique steady state and AO^{−} regime.

It is clear, therewith, that our reduced model does not support the oscillatory topographic instability of Ghil and associates (Legras and Ghil 1985; Jin and Ghil 1990; Strong et al. 1993, 1995; Marcus et al. 1994, 1996) as a limit cycle (i.e., as a self-sustained, stable periodic solution) of its empirically inferred, deterministic dynamics. Instead, the 37-day oscillation arises here by the reduced model's stochastic forcing, which represents the QG3 model's eddy variability, pumping of a least-damped eigenmode with the appropriate periodicity. This type of relation between episodic and oscillatory variability in NH LFV, where synoptic variability plays a significant role, has to be added, therefore, to the catalog of such relations already established by Ghil et al. (1991) and by Ghil and Robertson (2002).

## 5. Concluding remarks

### a. Summary

We have constructed a reduced, nonlinear, stochastically forced model for the behavior of the three-level, quasigeostrophic (QG3) model with topography of Marshall and Molteni (1993), using the empirical methodology of Kravtsov et al. (2005b). The QG3 model has O(10^{4}) degrees of freedom. The reduced model, described in the appendix, operates in the phase space of the QG3 model's leading 15 empirical orthogonal functions (EOFs); it involves quadratic nonlinearity at the first model level, while the two additional levels, which describe the effect of unresolved variables on the reduced model's evolution, are both linear (see section 2).

Our reduced model thus has 45 variables and is driven at the third level by a spatially coherent noise that is white in time. Once the size of the state vector (number of EOFs) is chosen, the number of levels in the reduced model is determined automatically by ensuring that the last-level residual forcing's lag-1 covariance matrix vanishes, while its lag-0 covariance matrix converges to a constant matrix. The covariance matrices of the noise, as well as the model's dynamical operator were determined from a very long simulation of the QG3 model, by least squares regression (Wetherill 1986; Press et al. 1994; Kravtsov et al. 2005b). Our model generalizes linear inverse models (LIMs), which have only one level and have been extensively applied to study climate variability (Penland 1989, 1996; Penland and Ghil 1993; Penland and Sardeshmukh 1995; Penland and Matrosova 1998; Winkler et al. 2001).

The dimension *I* = 15 of EOF subspace for constructing the reduced model was chosen to maximize correspondence between its behavior and that of the full QG3 model. We documented remarkable similarity between the full and reduced models, in terms of their anomalously persistent flow regimes, as described by the Gaussian-mixture PDF (section 3a), and in terms of their low-frequency oscillatory modes, as captured by multichannel singular-spectrum analysis (M-SSA; see section 3b). The QG3 model's low-frequency variability (LFV) in turn, captures several key features of observed NH LFV.

In particular, both models are characterized by four anomalously persistent flow regimes, NAO^{+}, NAO^{−}, AO^{+}, and AO^{−} (see Figs. 1, 2 and section 3a), and a dominant oscillatory mode with a period of about 37 days (see Figs. 3, 4 and section 3b). The AO^{+} and NAO^{−} regimes are related, in both models, to anomalous slow-down of the intraseasonal oscillation trajectory, while AO^{−} is a stand-alone regime (see Fig. 5 and section 3c), also associated with the presence of the PDF ridge in Figs. 1a,b.

These features of both models' LFV were interpreted via the analysis of the reduced-model dynamics in section 4: the AO^{−} regime arises from the unique steady state of the reduced model's 3-level deterministic operator, while the PDF ridge of Figs. 1a,b coincides with the location of a plateau of quasi-stationary states of the reduced model (see Fig. 6 and section 4a). As shown in section 4b, the dominant intraseasonal oscillation in both the QG3 and reduced model has a period of about 35–37 days and is associated with the least-damped eigenmode of the latter (Fig. 7), when linearized about its climatological state (cf. Figs. 9 and 4).

This eigenmode is also very similar to the least-damped eigenmode obtained when linearizing the reduced model's deterministic operator about the AO^{−} steady state (cf. Figs. 8a,b and Figs. 8c,d). The similarity between the two eigenmodes reflects their approximate invariance with respect to shifts of the basic state along the EOF-1 axis; the large degree of zonal symmetry of EOF-1 helps explain such an invariance, which is also consistent with the presence of the quasi-stationary tongue along this axis (Fig. 6).

The degeneracy associated with these shifts is removed due to interactions between the largest-scale modes of the system, whose variance is concentrated in the subspace of the four leading EOFs, and the intermediate scales, captured by EOFs 5–15; the reduced model constructed in the phase subspace of the four leading EOFs alone reproduces neither the non-Gaussian features of the QG3 model's PDF, nor its low-frequency oscillatory modes. We also suspect, therefore, that such interactions are responsible for the slow-down of the intraseasonal oscillation's trajectory in the full QG3 model, as well as in our optimal reduced model; this slow-down is associated with the emergence of the AO^{+} and NAO^{−} regimes.

Finally, in section 4c, we studied the emergence of multiple flow regimes and intraseasonal oscillations as the stochastic forcing associated with unresolved small-scale processes increases from zero to its observed value (Fig. 10): the model trajectory is initially confined to a small region near the three-level reduced model's unique steady state and gradually fills up the quasi-stationary ridge along the EOF-1 axis of the reduced model. When the amplitude of the stochastic forcing becomes large enough, the intraseasonal oscillatory mode is excited, resulting in the model's PDF expanding in the EOF-2 and EOF-3 directions, while additional quasi-stationary clusters appear.

### b. Discussion

The methodology of mode reduction used in this paper is based solely on the statistical information contained in the model-generated dataset. D'Andrea and Vautard (2001) and D'Andrea (2002) took a different approach and constructed a low-order representation of the QG3 model by projecting its equations onto a few leading EOFs of the model, and introducing an empirical deterministic closure to account for the effect of unresolved small-scale processes on the reduced model's low-frequency evolution. They compared the performance of the resulting deterministic low-order model with that of the full QG3 model, in terms of approximate correspondence between the full model's weather regimes and the quasi-stationary states of the low-order model. Their flow-dependent parameterization of the closure term turned out to be essential for a good performance of the reduced model. In particular, their Arctic High regime, analogous to our AO^{−} regime, had a high spatial correlation with the corresponding unstable steady state of their deterministic reduced model.

Our model accounts for small-scale processes by using additional model levels that describe their effect on the large-scale variables; the energy for the system's variability is supplied by the stochastic forcing, which represents the joint effect of small-scale instabilities that occur in the full model. The AO^{−} regime of the QG3 and reduced models is associated, in our approach, with the unique steady state of the reduced model's three-level deterministic operator. One can think of this zonally symmetric regime as the zonal flow of classical dynamic meteorology (Gill 1982; Holton 2004).

The correlation between the multiple-flow regimes' anomaly patterns in our reduced model and those of the full QG3 model all exceed 0.9 and are thus higher than the correlations obtained by D'Andrea and Vautard (2001) and D'Andrea (2002). It is interesting that most of the LFV in the QG3 model and in NH observations, including the AO^{+}, NAO^{−} and NAO^{+} regimes, as well as the intraseasonal oscillations, occur fairly far away in phase space from the classical zonal flow of the AO^{−} regime.

Majda et al. (1999, 2001, 2002, 2003) have developed a strategy for systematic mode reduction in deterministic models that govern geophysical flows. This strategy has been recently applied to the analysis of a barotropic atmospheric model (Franzke et al. 2005) and to the QG3 model (C. Franzke and A. Majda 2005, personal communication). Our method, like that of Majda and colleagues, results in a nonlinear deterministic model, driven by stochastic forcing. Unlike their method, ours is completely empirical and has also been applied directly to NH geopotential height anomalies (Kravtsov et al. 2005b), as well as to sea surface temperature anomalies (Kondrashov et al. 2005). The similarities and differences between these two methods, when applying both to a fairly realistic atmospheric model, like QG3, are a matter of considerable interest and further investigation.

The number of variables in our reduced model is much less than the number of degrees of freedom in the full QG3 model, but it is still larger than the number of modes that contain most of the latter models low-frequency variance (EOFs 1–4). The higher-ranked EOFs 5–15 can thus be associated with intermediate-scale modes, while the additive noise in our reduced model's highest level represents the smallest scales, captured by EOFs 16 and higher in the full QG3 model. Note that our higher-order PCs have spectra that become increasingly whiter, rather than being increasingly shifted to higher frequencies or faster decay times.

Direct comparison with the mode-reduction strategy of Majda and coauthors is thus difficult at the present stage: our strategy emphasizes separation in spatial scales—large, intermediate, and small—while theirs emphasizes separation in temporal scales (somewhat like in F. D'Andrea and R. Vautard's work), with slow and fast scales only. This being said, there might be some analogy between the effects on the large-scale LFV of the quadratic terms that describe the interactions between EOFs 1–4 and EOFs 5–15 in our reduced model, on the one hand, and the effects of multiplicative noise in the Franzke et al. (2005) model, on the other.

Empirically based reduced models, like the one constructed in this paper, provide a useful statistical tool for interpretation of complex behavior detected in highly resolved climate models, as well as in observations. Such models not only help compact the dataset's information content but can also provide insights into the dynamics of large-scale, low-frequency climate variability, via the analysis of the reduced model's mathematical structure.

In this way, we have identified the leading modes of the QG3 model's LFV as the least-damped eigenmodes of this system linearized about its climatological state (see also Branstator 1992, 1995; Farrell and Ioannou 1993, 1995; Metz 1994; Da Costa and Vautard 1997; Itoh and Kimoto 1999; Kravtsov et al. 2003, 2005a). We have also pointed out the possible effect of the intermediate and smallest scales of motion on the QG3 model's LFV and on connecting the oscillatory description of LFV with its “episodic” description via multiple flow regimes (Ghil et al. 1991; Ghil and Robertson 2002). The effect of the smaller scales on the largest ones may be related to a more or less active synoptic eddy feedback (Robinson 1996, 2000; Lorenz and Hartmann 2001, 2003; Koo et al. 2002; Kravtsov et al. 2005a), although a detailed look at how the additional levels of the reduced model affect the deterministic dynamics indicates a damping role of the red-noise-like small scales in the QG3 model.

We have provided additional evidence for the AO^{−} regime, detected in the QG3-model simulation and associated with the steady state of our reduced model, representing a dynamical entity that is distinct from the regional NAO phenomenon (see also Deser 2000; Kravtsov et al. 2006). Our reduced-model analysis also suggests that extended-range predictability of the NAO may be possible, due to its connection with intraseasonal oscillations (Lott et al. 2001, 2004a,b) and multiple flow regimes (Ghil and Robertson 2002).

Finally, an important difference between linear and nonlinear reduced models concerns their ability to simulate non-Gaussian PDFs and help predict transitions between different weather regimes. Yang and Reinhold (1991) have reviewed earlier studies of observed NH LFV and shown that large-amplitude transitions between quasi-stationary, persistent states play a key role in it. On the other hand, D'Andrea et al. (1998) have reviewed the performance of 15 general circulation models and found them to generally underestimate the number and duration of blocking events, while Pelly and Hoskins (2003) found that forecasts of blocking inception in advanced numerical weather prediction systems still have no skill starting from a lead time of six days. Thus, if a reduced model can simulate a non-Gaussian PDF whose distinct Gaussians capture multiple weather regimes present in NH LFV, then it might also have some helpful skill at predicting the breaks and onsets of these regimes.

## Acknowledgments

We are grateful to F. D'Andrea for providing a numerical code for the QG3 model and information on its performance. A. W. Robertson and P. J. Smyth kindly provided the codes for the *k*-means method and the Gaussian mixture model and advice on their implementation. C. Franzke and A. J. Majda generously shared with us their preprint “Low-order stochastic mode reduction for a prototype atmospheric GCM” and related correspondence. Three referees provided interesting and constructive comments. This study was supported by NSF Grant ATM00-82131 (DK and MG), as well as by DOE Grant DE-FG02-04ER63881 (SK). Preliminary results of this investigation were presented at the First Annual Assembly of the European Geosciences Union held in Nice, France, in April 2004, and at the Fifth AIMS Dynamical Systems Conference in Pomona, California, in June 2004.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**(

**,**(

**.**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**.**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

### APPENDIX

#### Constructing the Reduced Model

The inverse stochastic counterpart of the QG3 model is constructed in the phase space of *I* leading EOFs of the 500-hPa streamfunction anomaly field. If **x** = {*x _{i}*} is such a state vector of dimension

*I*, then the multilevel quadratic inverse stochastic model has the general form (Kravtsov et al. 2005b)

the matrices 𝗔* _{i}*, the rows

**b**

^{(n)}

_{i}of the matrices 𝗕

^{(n)},

*n*= 0,

*N*and the components

*c*

^{(0)}

_{i}of the vector

**c**

^{(0)}, as well as the components

*r*

^{(n)}

_{i}of each level's residual forcing

**r**

^{(n)},

*n*= 0,

*N*+ 1, are determined by general least squares (Wetherill 1986; Press et al. 1994).

Linear inverse models (LIMs; see section 2) are a particular case of Eq. (A1) with only one level, *N* = 1, and with zero 𝗔* _{i}* and

**c**

^{(0)}. Including quadratic nonlinearity at the first level of the multilevel model allows one to account for processes characterized by non-Gaussian statistics. The stochastic forcing

**r**

^{(0)}at this level, however, typically involves serial correlations and might also depend on the modeled process

**x**. We include, therefore, an additional model level to express the time increments

*d*

**r**

^{(0)}as a linear function of an extended state vector (

**x**,

**r**

^{(0)})

^{T}; in numerical practice, these increments are equivalent to the divided differences of the residual forcing

**r**

^{(0)}.

Linear dependence is used at the second and higher levels ones since the non-Gaussian statistics of the data has already been captured by the first, nonlinear level. More levels are added, until the estimate of the *N*th level's residual **r**^{(N+1)} becomes white in time, and its lag-0 correlation matrix converges to a constant matrix. With the addition of each level, we are accounting for additional time-lag information, thereby squeezing out any time correlations from the residual forcing. Equation (A1) thus describes a wide class of processes in a fashion that explicitly accounts for the modeled process **x** feeding back on the noise statistics (see Kravtsov et al. 2005b).

## Footnotes

* Additional affiliation: Département Terre-Atmosphère-Océan, and Laboratoire de Météorologie Dynamique du CNRS/IPSL, Ecole Normale Supérieure, Paris, France

*Corresponding author address:* Dr. Dmitri Kondrashov, Dept. of Atmospheric and Oceanic Sciences, and Institute of Geophysics and Planetary Physics, University of California, Los Angeles, 405 Hilgard Ave., Los Angeles, CA 90095-1565. Email: dkondras@atmos.ucla.edu