## Abstract

Global sea surface temperature (SST) evolution is analyzed by constructing predictive models that best describe the dataset’s statistics. These inverse models assume that the system’s variability is driven by spatially coherent, additive noise that is white in time and are constructed in the phase space of the dataset’s leading empirical orthogonal functions. Multiple linear regression has been widely used to obtain inverse stochastic models; it is generalized here in two ways. First, the dynamics is allowed to be nonlinear by using polynomial regression. Second, a multilevel extension of classic regression allows the additive noise to be correlated in time; to do so, the residual stochastic forcing at a given level is modeled as a function of variables at this level and the preceding ones. The number of variables, as well as the order of nonlinearity, is determined by optimizing model performance.

The two-level linear and quadratic models have a better El Niño–Southern Oscillation (ENSO) hindcast skill than their one-level counterparts. Estimates of skewness and kurtosis of the models’ simulated Niño-3 index reveal that the quadratic model reproduces better the observed asymmetry between the positive El Niño and negative La Niña events. The benefits of the quadratic model are less clear in terms of its overall, cross-validated hindcast skill; this model outperforms, however, the linear one in predicting the magnitude of extreme SST anomalies.

Seasonal ENSO dependence is captured by incorporating additive, as well as multiplicative forcing with a 12-month period into the first level of each model. The quasi-quadrennial ENSO oscillatory mode is robustly simulated by all models. The “spring barrier” of ENSO forecast skill is explained by Floquet and singular vector analysis, which show that the leading ENSO mode becomes strongly damped in summer, while nonnormal optimum growth has a strong peak in December.

## 1. Introduction

### a. Motivation

The El Niño–Southern Oscillation (ENSO) phenomenon dominates interannual climate signals and has great economic and societal impacts. It originates in the coupled ocean–atmosphere dynamics of the tropical Pacific (Philander 1990), but has a large influence on atmospheric circulation and air–sea interaction outside the tropical belt through associated teleconnections (Alexander et al. 2002; Lau and Nath 2001).

An important aspect of ENSO is that its positive-phase El Niño events are generally characterized by a larger magnitude than its negative La Niña counterparts (Burgers and Stephenson 1999; Hoerling et al. 1997; Sardeshmukh et al. 2000). This statistical skewness is but one of the indicators suggesting that, at least to some extent, the dynamics of ENSO involves nonlinear processes (Ghil and Robertson 2000; Neelin et al. 1994, 1998). At the same time, most detailed numerical models used for operational ENSO prediction significantly underestimate this nonlinearity (Hannachi et al. 2003), and the quality of their forecasts is still far from satisfactory (Barnston et al. 1994, 1999; Landsea and Knaff 2000).

Ghil and Jiang (1998) and Mason and Mimmack (2001) reviewed a variety of statistical models that possess useful ENSO forecast skill; most of these models are still linear. Nonlinear models are getting more attention lately: Timmermann et al. (2001) have applied multiple nonparametric regression analysis to derive a set of low-order ENSO empirical dynamical models, while neural-network models (Grieger and Latif 1994; Tangang et al. 1998) have been used to reconstruct the attractor of ENSO dynamics and make extended ENSO forecasts. The complex structure of multilayer neural-network models, however, makes their results difficult to interpret (Hsieh and Tang 1998), and they did not provide significant improvements in skill over linear methods (Tang et al. 2000).

In this paper, we construct a hierarchy of statistical models for ENSO forecasts. This hierarchy includes a nonlinear model that is easy to interpret, accurately represents both linear and nonlinear aspects of the associated dynamics, and is validated out to 12 months.

### b. Linear inverse models

As a starting point in developing our ENSO model hierarchy, we use the concept of inverse stochastic models. If **X** is the climate-state vector, **X** its time mean, and **x** = **X** − **X** the vector of anomalies, then the evolution of **x** can be expressed as

Here the dot denotes a time derivative, 𝗟 is a linear operator, and **N** represents nonlinear terms; both 𝗟 and **N** may be function of **X**, but this dependence is not taken into consideration here.

The simplest type of inverse stochastic model is the so-called linear inverse model (LIM; Penland 1989, 1996). LIMs are obtained by assuming, in Eq. (1), that **N(x) ***dt* ≈ 𝗧**x ***dt* + *d***r**^{(0)}, where 𝗧 is the matrix describing linear feedbacks of unresolved processes on **x**, and *d***r**^{(0)} is a white-noise process that can be spatially correlated. With this assumption Eq. (1) becomes

The matrix 𝗕^{(0)} and the covariance matrix of the noise 𝗤 ≡ 〈**r**^{(0)}**r**^{(0)T}〉 can be directly estimated from the observed statistics of **x** by multiple linear regression (MLR; Wetherill 1986). LIMs have shown some success in predicting ENSO (Penland and Sardeshmukh 1995; Johnson et al. 2000a, b), tropical Atlantic variability (Penland and Matrosova 1998), as well as extratropical atmospheric variability (Winkler et al. 2001). These models are typically constructed in the phase space of the system’s leading empirical orthogonal functions (EOFs; Preisendorfer 1998). The state vector **x**, or predictor-variable vector, consists of amplitudes of the corresponding principal components (PCs), while the vector of response variables contains their tendencies **ẋ**.

### c. This paper

In the study of most geophysical phenomena, including ENSO, the assumptions of linear, stable dynamics, and of additive white noise used to construct LIMs are only valid to a certain degree of approximation. In particular, the stochastic forcing **r**^{(0)} in Eq. (2) typically involves serial correlations. In addition, when the nonlinearity is strong, the matrices 𝗕^{(0)} and 𝗤 obtained from the data can exhibit substantial dependence on the lag used to fit them (Penland and Ghil 1993).

In the present paper, we consider generalizations of LIMs that use additional statistical information to account for both nonlinearity and serial correlations in the additive noise. Kravtsov et al. (2005, hereafter KKG) first demonstrated the performance of the proposed approach by using three simple, but geophysically relevant examples, namely the deterministic Lorenz (1963) model and two stochastically forced potential-well systems. Next, the methodology was applied to a very long simulation of the Marshall and Molteni (1993) three-layer quasigeostrophic atmospheric model, which has been performed and analyzed by Kondrashov et al. (2004), and finally to a 44-yr set of Northern Hemisphere geopotential heights. In the present paper, the methodology of KKG is applied to a global SST dataset, and the resulting inverse models are used for the analysis and prediction of seasonal-to-interannual climate variability.

One major modification of LIMs is obtained by assuming a polynomial, rather than linear form of **N**(**x**) in Eq. (1), in particular, a quadratic dependence. The *i*th component *N _{i}*(

**x**) of

**N**can then be written as

The matrices 𝗔* _{i}* represent the blocks of a third-order tensor, while the vectors

**b**

^{(0)}

_{i}=

**l**

_{i}+

**t**

_{i}are the rows of the matrix 𝗕

^{(0)}= 𝗟 + 𝗧 [compare with Eq. (2)]. These objects, as well as the components of the vector

**c**

^{(0)}, are estimated here by multiple polynomial regression (MPR; McCullagh and Nelder 1989).

In section 2, we briefly describe a multilevel, quadratic inverse model that deals with the problem of serial correlations in **r**^{(0)}. The global sea-surface temperature (SST) dataset, as well as our analysis procedure, are described in section 3. In section 4, the linear and nonlinear models are compared in terms of their statistical and spectral properties relative to those observed, and we evaluate their forecast potential, with an emphasis on central Pacific SST anomalies. We apply in section 5 month-by-month eigenspectrum analysis, as well as Floquet and singular-vector analysis to explain several features of the modeled and observed data presented in section 4; these features include the quasi-biennial and low-frequency peaks in the power spectra, changes in seasonal variance, and the “spring barrier” in prediction skill. The results are summarized in section 6.

## 2. Nonlinear multilevel model

We construct inverse stochastic models in the phase space of the leading EOFs of the SST field. The quadratic model that we will use below has the general form

where **x** = {*x _{i}*} is the state vector of dimension

*I*. The matrices 𝗔

*, the vectors*

_{i}**b**

^{(0)}

_{i}, and the components

*c*

^{(0)}

_{i}of the vector

**c**

^{(0)}, as well as the components

*r*

^{(0)}

_{i}of the residual forcing

**r**

^{(0)}, are determined by least squares. If the inverse model contains a large number of variables, the statistical distribution of

**r**

^{(0)}at a given instant is nearly Gaussian, according to the central limit theorem (Von Mises 1964).

However, the stochastic forcing **r**^{(0)} in Eq. (4) typically involves serial correlations and might also depend on the modeled process **x**. We include, therefore, in Eq. (5) below, an additional model level to express the known time increments *d***r**^{(0)} as a linear function of an extended state vector [**x**, **r**^{(0)}] ≡ (**x**^{T}, **r**^{(0)T})^{T}. We estimate this level’s residual forcing in turn by least squares. More levels are added in the same way, until the *L*th level’s residual **r**^{(L+1)} becomes white in time, and its lag-0 correlation matrix converges to a constant matrix:

Equation (5) describes a wide class of processes in a fashion that explicitly accounts for the modeled process **x** feeding back on the noise statistics: the vectors **b**^{(l)}_{i} are the rows of matrices 𝗕^{(l)} that represent this “eddy feedback.” The multilevel linear inverse model is obtained by assuming, in Eq. (5), 𝗔* _{i}* ≡

**0**and

**c**

^{(0)}≡

**0**. The classical LIM is thus the one-level version of our multilevel linear model. The details of the methodology and further discussion appear in KKG.

It is well known that extreme ENSO events, both positive (El Niño) and negative (La Niño), tend to occur in boreal winter. Several ways to simulate ENSO phase locking to an annual cycle have been proposed in empirical models. Penland and associates (Penland and Sardeshmukh 1995; Penland 1996; Penland and Matrosova 2001) argued that the dynamical operator in an LIM model for ENSO should be independent of season, and that the phase locking is caused by the seasonal dependence of the stochastic forcing. Johnson et al. (2000a, b), however, suggest the opposite, namely that the seasonality in the dynamical propagator of a Markov model for ENSO cannot be neglected.

Thompson and Battisti (2000, 2001 demonstrated that seasonal variations in the mean state of the linear, dynamical, stochastically forced model help reproduce qualitatively the observed seasonal patterns of variance and lagged autocovariance in tropical Pacific SSTs. Both Johnson et al. (2000b) and Xue et al. (2000) obtained seasonally varying Markov transition matrices for each calendar month. Seasonal-type models, however, have to be trained on much shorter records (i.e., 12 times shorter than the original time series for annually averaged models); this shortcoming leads to larger errors in estimating the regression parameters and, consequently, to a lower prediction skill.

We suggest here an alternative approach to include seasonal dependence in the dynamical part of the first level of our linear and nonlinear models, namely we assume the matrix 𝗕^{(0)} and vector **c**^{(0)} to be periodic with period *T* = 12 months:

The complete time record is thus used to simultaneously estimate the four seasonal-dependence coefficients 𝗕_{s}, 𝗕_{c}, **c**_{c}, and **c**_{s}, which provides greater stability in our regression parameter estimates. Specifying such seasonal dependence only on the first model level gave the best results in cross-validation forecasts, when compared to alternative models (not shown), where the periodic form of Eq. (6) was used on both levels, or on the second level only.

The optimal number of state-vector components, as well as the degree of nonlinearity, has to be assessed, in general, using cross validation. This test is carried out by Monte Carlo simulations in which the inverse model, trained on one part of the available data, is used to estimate the properties of the system evolution during the validation interval that was not included in the training interval. The measure chosen to assess the model depends on the purpose at hand: if the model is to be used for prediction, the cross-validated hindcast skill, quantified by the correlation between the observed and simulated fields, or the root-mean-square (rms) distance between the two, at future times, is an appropriate measure of model performance; in more theoretical applications, certain statistical characteristics of the observed and modeled evolution, such as probability density functions (PDFs) of model variables and their power spectra, have to be compared. We will use both hindcast skill measures and overall statistical behavior to compare the performance of several inverse stochastic models.

## 3. Data and analysis procedure

We construct our inverse stochastic models using a 645-month dataset (January 1950–September 2003) of global, monthly SST anomalies, given on a 5° × 5° grid over the 30°S–60°N latitude belt (Kaplan et al. 1998), with the seasonal cycle removed. The spatial extent of the dataset was chosen to maximize prediction skill of the so-called Niño-3 SST index. This index is defined as the area-averaged SST anomaly over the rectangular box (5°S–5°N, 150°–90°W), outlined by the dashed line in Figs. 2a and 2b below.

Trenberth and Hurrel (1994) have argued that the Niño-3 time series experiences an abrupt positive shift after about 1976, when considering a mean of the record before and after this date. Chao et al. (2000), however, showed that this shift is not unique, but represents one of several phase shifts associated with a 15–20-yr interdecadal oscillation. We find that our model’s performance slightly improves at lead times of more than 8 months, as described in section 4 below, if the 1976/77 shift is removed pointwise from the data. The time interval is divided into two subintervals, 1950–76 and 1977–2003, and the two distinct means of the time series are subtracted from the respective subintervals; the resulting corrected Niño-3 SST index is shown as a solid line in Fig. 1.

The resulting dataset is used in two different ways. When considering long-term statistical properties or computing in-sample hindcasts, we construct our inverse models in the subspace spanned by the leading EOFs computed on the whole time series. On the other hand, when we consider cross-validated hindcasts, we use EOFs of the reduced dataset that leaves out segments of SST evolution, which are several years long and which we subsequently predict (Penland and Sardeshmukh 1995; Johnson et al. 2000a, b). We choose these segments to span complete warm, as well as cold ENSO phases, and thus divide the time series into 3–5-yr-long intervals that start and end in January of the following years: 1950, 1955, 1960, 1964, 1968, 1972, 1975, 1979, 1982, 1986, 1990, 1994, 1997, and 2002. If 5-yr-long segments of equal length are used, some of the training periods include an incomplete large-amplitude ENSO cycle; this inclusion generally results in a reduction of the cross-validated hindcast skill of up to 0.05 in anomaly correlation at a 6-month lead time. This reduction is due to the small number of large ENSO events in the relatively short observed record, which hampers reliable estimates of the model coefficients.

Both in-sample hindcasts and cross-validated hindcasts are evaluated by performing a set of integrations of a given inverse model, initialized at each month in the record, for a given number of months. The prediction characteristics are computed by transforming the model’s solution from EOF space back into physical space, and comparing it against the total observed SST anomaly.

The number of variables (i.e., of EOFs retained) and the order of the regression polynomial used in the inverse models is chosen to maximize the models’ cross-validated hindcast skill: 20 variables is the best choice for all linear and quadratic models; the skill deteriorates if a smaller or larger number is used. A cubic model was also tested, but did not perform as well as the quadratic model in cross-validated hindcasts, and will not be considered here any further. Although the residual forcing at the first level of both linear and quadratic models does involve serial correlations, we construct one-level (1-L) and two-level (2-L) versions of these models for explicit comparison of model performance. The residual forcing at the second level is fairly white, and adding further levels does not improve the hindcast skill. The stochastic forcing in the models is spatially correlated according to the residual forcing’s lag-0 covariance matrix.

The model coefficients, as well as the residual forcing, are estimated for the linear models by ordinary least squares (Press et al. 1994). When higher-order polynomials are used for the inverse models, we have many more parameters to estimate, and the regression problem (see appendix A of KKG) can become ill-conditioned for a short data record. This ill-conditioning is related to the collinearity phenomenon, in which the vectors of predictor variables are close to linear dependence (Wetherill 1986). To deal with this problem for our quadratic model, we use the partial least squares (PLS) procedure (Wold et al. 1984; Höskuldsson 1996). PLS finds the so-called factors, or latent variables, that capture the maximum variance in the predictor variables, as well as achieving high correlation with the response variables. The number of these variables to retain is found by using cross validation.

The resulting models (5) are integrated with Δ*t* = 1 month, being forced at each time step by random realizations of the spatially correlated stochastic forcing at the model’s last level. To produce cross-validated or in-sample hindcasts we use “ensemble forecasting.” Instead of using just one model run, many runs with different random realizations of the stochastic forcing are made. The ensemble mean of the different forecasts (Kalnay 2003) provides a smoother and more reliable seasonal-to-interannual forecast.

## 4. Model comparison and validation

In this section we show first that 2-L models are superior to 1-L ones in terms of cross-validated hindcast skill at long lead time (section 4a). In section 4b our best 2-L linear and quadratic models are then compared in terms of their statistical properties, such as the PDF and seasonal variation [sections 4b(1) and 4b(2)] and power spectra [section 4b(3)] of the observed and modeled SSTs. Finally, the models are compared in section 4c(1) in terms of their predictive skill, and the “spring barrier” in this skill is discussed in section 4c(2).

### a. Comparison between one-level and two-level models

We compare the 1-L and 2-L quadratic models in terms of their respective cross-validated hindcast skills in Fig. 2 (see also Fig. 9 in KKG). Figures 2a and 2b display the spatial distribution of the SST anomaly correlations in the cross-validated hindcasts at 9-month lead, for the 1-L and 2-L model versions, respectively. The hindcast time series represent each the mean of a 100-member ensemble, and they have been cross-validated as described in section 3. Similar results have been obtained when comparing 1-L and 2-L linear models (not shown).

Both quadratic model versions have similar skill patterns, with maximum values over the equatorial Pacific and Indian Ocean; a stronger maximum in the Niño-3 region (outlined by a solid black line in Figs. 2a,b) appears for shorter lead times (not shown). The 2-L model’s skill is higher practically everywhere; this superior skill of the 2-L model is most marked south of the Niño-3 region, as well as in the Indian Ocean.

The correlation between the predicted and observed area-averaged Niño-3 SST anomalies is plotted in Fig. 2c: the 2-L model (solid line) is significantly more skillful than the 1-L model (dotted line) at lead times longer than 4 months; both 1-L and 2-L models beat the damped persistence hindcast (dashed line). The latter has been generated by fitting an AR(1) model to the observed Niño-3 SST time series, and then using this model to damp persistence exponentially in lead time. The 2-L model outperforms the 1-L model in rms error as well, with improvement becoming significant at lead times longer than 6 months (Fig. 2d).

### b. Statistical properties

Given the better performance of the 2-L models, we only consider henceforth their linear and quadratic versions; in this subsection, we examine the long-term statistical properties of these two model versions. Both models were trained on the EOFs of the entire observed SST time series, after which we performed 100 realizations for either model, each being 645 months long, like the observed dataset.

#### 1) Histograms and seasonal dependence

The histograms of the observed and simulated Niño-3 SST index are plotted in the left column of Fig. 3; they have been normalized to have unit area. In each panel, the corresponding normal distribution is plotted as a heavy solid line. In the right column of Fig. 3 we plot the monthly distribution of variance for the corresponding SST indices.

Results for the observed data (1950–2003) are shown in row (a), and results for the 100-member ensemble of realizations of the quadratic and linear model in rows (b) and (c), respectively. The histograms and variances for the simulated data represent averaging over the 100-member ensemble.

The variance of the observed Niño-3 index tends to peak during the winter and drop in spring, which is related to the so-called spring barrier in ENSO prediction (e.g., Balmaseda et al. 1995; Weiss and Weiss 1999). This seasonal feature of ENSO is simulated reasonably well by both the linear and the quadratic 2-L model (right column of Figs. 3b,c); being averaged over many integrations, the seasonal variation is not as large in the models as in the observed data, and the timing of the maximum seasonal variance is slightly off. The seasonal evolution in variance is analyzed in detail in the next subsection.

The observed dataset is characterized by strongly non-Gaussian features, with El Niño events generally having larger magnitude than La Niñas. The corresponding histogram in Fig. 3a exhibits therefore positive skewness. When the histogram is computed by combining all individual realizations of the quadratic model (Fig. 3b), the distribution of the Niño-3 index is also positively skewed, albeit to a lesser extent than for the observed data. In contrast, since the stochastic forcing is normally distributed, the histograms produced by the linear model are bound to approximate Gaussian PDF, as shown in Fig. 3c (see also Figs. 10a–c in KKG). The histograms in Figs. 3b and 3c do not change substantially when using a larger ensemble size.

To quantify the deviations from Gaussianity, we examine the skewness *b*_{1} and kurtosis *b*_{2} − 3 (Mardia 1980), where *b*_{1} = *m*_{3}/*m*^{3/2}_{2}, *b*_{2} − 3 = *m*_{4}/*m*^{2}_{2} − 3, and *m _{k}* is the

*k*th sample moment about the mean. The skewness describes asymmetry of the probability distribution about the mean, while the kurtosis measures its flatness. For a Gaussian distribution both

*b*

_{1}and

*b*

_{2}− 3 equal zero.

We show the distribution of skewness and kurtosis for the single observed and the 100 model simulations, with a linear trend subtracted, in Figs. 3d and 3e. The skewness and kurtosis values for the observed Niño-3 index (vertical solid lines) are equal to 0.79 and 1.26; they fall within the 95% percentile of the quadratic model distribution (Fig. 3d). In contrast, the observed values for both skewness and kurtosis are beyond the 95% confidence limit of the linear model distribution (Fig. 3e). The skewness mean value of 0.01 for the linear model is very close to zero, in agreement with Fig. 3c.

The mean values for skewness and kurtosis in Fig. 3d, 0.39 and 0.51, respectively, are about one-half the observed ones. When using a smaller number of variables (as few as five), the quadratic model matches better the observed moment values, but it has a lower cross-validated hindcast skill.

#### 2) Box-plot statistics

To examine in further detail the seasonal PDF evolution of the observed and simulated data, we use so-called box plots (e.g., Hannachi et al. 2003). These plots show the spread around the mean and the skewness of a given distribution, as well as displaying the outliers; they are displayed in Fig. 4 for the observed data (Fig. 4a), as well as long data (same as used in Figs. 3b,c) from integrations of the nonlinear and linear models (Figs. 4b and 4c, respectively). The symbols for a given calendar month are explained in the caption of the figure.

The observed dataset exhibits significant seasonal dependence of its spread and skewness, with larger spread and strongly positive skewness during boreal winter. The linear model appears to do slightly better in capturing the seasonal dependence of the spread (Figs. 4b,c), while the quadratic model captures better the skewness of the data, primarily in the outliers. The seasonal variation in prediction skill of our models is presented in section 4c.

#### 3) Power spectra

ENSO climate signals involve two oscillatory modes: the lower-frequency or quasi-quadriennial (QQ) mode (Jiang et al. 1995) has a period of 3–7 yr, while the other mode is quasi-biennial (QB; Rasmusson et al. 1990; Ghil et al. 2002). Any model aiming at extended-range ENSO prediction should reproduce these periodicities (Ghil and Jiang 1998; Ghil and Robertson 2000). State-of-the-art climate models, however, still have difficulties in simulating the observed Niño-3 spectra. For example, the latest version of the Community Climate System Model, CCSM2, simulates an ENSO signal with a peak frequency that corresponds to a period of 2–3 yr (Kiehl et al. 2003).

We investigate the existence of such oscillatory modes in our inverse model simulations and compare them with the observed behavior using advanced spectral methods—singular-spectrum analysis (SSA; Vautard and Ghil 1989; Keppenne and Ghil 1992; Dettinger et al. 1995; Ghil et al. 2002) and wavelet analysis (Strang 1989; Meyer 1992). SSA and wavelet methods are complementary: SSA is especially useful for the analysis of amplitude- and phase-modulated signals, while wavelet analysis allows one to follow changes in the amplitude and frequency of an oscillatory signal over time.

SSA computes eigenvalues and eigenvectors of a given time series’ lag-covariance matrix. The eigenvalues are the squares of the singular values after which the method is called, while the eigenvectors are often called temporal EOFs, by analogy with the spatial EOFs that arise in the PC analysis of meteorological fields (Fraedrich 1986; Vautard and Ghil 1989). The projection of the original time series onto the temporal EOFs yields temporal PCs. The SSA window width determines the range of periodicities to be detected. An oscillatory component is represented in SSA by a pair of approximately equal singular values, with the respective temporal EOFs and PCs being in phase quadrature (Vautard and Ghil 1989). To determine the statistical significance of oscillatory pairs, we apply a *χ*^{2} test against a null hypothesis of red noise (Allen and Smith 1996), as well as a lag-correlation test to verify that a given pair of PCs is indeed in quadrature (Ghil and Mo 1991; Vautard et al. 1992).

The continuous wavelet transform (CWT) is defined as the convolution of the signal with scaled, shifted versions of the mother wavelet function. For our analysis, we will use a complex Gaussian wavelet of the eighth order as the basis function. The CWT yields a number of time-dependent wavelet coefficients, which are related to amplitude and frequency. The absolute value of these coefficients yields the wavelet spectrum.

We show the results of the spectral analysis in Fig. 5. In Figs. 5a and 5b, SSA spectra of the observed Niño-3 index (1950–2003), and of a single, 53-yr-long realization of our quadratic, 2-L model are plotted. The corresponding results of the CWT analysis are shown in panels Figs. 5c and 5d. We have used an SSA window width of 60 month, which allows one to capture periodicities as long as 5 yr. Similar results are obtained using windows of 50 and 70 months (not shown). Error bars correspond to the 2.5th and 97.5th percentiles of a red-noise test. Singular values that lie outside this interval are unlikely, at the 95% level, to be due to a red-noise process. For each EOF, a characteristic frequency has been estimated by maximizing its correlation with a sinusoid.

For the observed data, SSA detects one significant oscillatory pair with a period of 47 months (EOFs 1 and 2), and a possible second pair with a period of 27 months (EOFs 3 and 4). The lag-correlation test for the corresponding pair of temporal PCs confirms that both of these pairs are indeed oscillatory. The two pairs thus capture the QB (Rasmusson et al. 1990), and QQ (Jiang et al. 1995) mode. For the 2-L quadratic model’s realization, we find that its four leading EOFs also form two possible pairs, with periods of 56 and 28 months, respectively; both are confirmed to be oscillatory by a lag-correlation test. Two ENSO oscillatory modes are also prominent in much longer model simulations (not shown), though the QB mode is less robust than the QQ one.

The main oscillatory mode in the simulated data has a slightly longer period than observed. The frequency of this oscillation changes with time, however, as it does in the observations (Moron et al. 1998; Ghil et al. 2002; Wang and Wang 1996). The wavelet spectrum of the observed data in Fig. 5c demonstrates changes in the period of ENSO’s low-frequency mode. The period is close to 70 months in the 1950s, undergoes an abrupt drop to about 40 months in the 1960s, and then slowly increases to 50–60 months by the end of the record. Qualitatively similar behavior for the dominant mode is obtained by calculating instantaneous frequencies of the leading pair of EOFs in multiscale SSA (Yiou et al. 2000; not shown). We plan to investigate the interdecadal variability of our models in a follow-up paper in greater detail.

The wavelet spectrum of the simulated data in Fig. 5d is qualitatively similar to the observed one, in as much as the period of the main oscillation changes from 50 to 30 and then back to 50 months during the course of time. The particular realization we display is arbitrary, so we are not attempting to capture the timing of the frequency transitions shown in Fig. 5c, only their range and overall character. We examine further the causes and properties of ENSO oscillatory modes in section 5.

### c. Hindcasts

In this subsection we discuss our models’ cross-validated ENSO hindcasts. The procedure for generating these hindcasts was described in section 3.

#### 1) Ensemble-mean and extreme-event hindcasts

It turns out that the linear and quadratic, 2-L inverse models have comparable cross-validated hindcast skill overall; this skill was only shown in Fig. 2 for the latter. We will see, though, that the quadratic model outperforms its linear counterpart in predicting the magnitude of extreme SST anomalies.

Figures 6a and 6b show the prediction skill *f*_{mean}(*t*) for the mean of a 100-member ensemble hindcast, along with the damped-persistence hindcast. Both models are significantly more skillful than the persistence hindcast, and have almost identical rms errors and anomaly correlations for lead times up to 6 months; at this range, the correlation coefficient for both still exceeds 0.6, which represents a common criterion of useful prediction. At lead times beyond 6 months, the linear model does a slightly better job than the quadratic one.

It is difficult to compare our results with those from other statistical and dynamical models in detail, since different researchers use different measures of skill and different forecast-validation intervals. At the present time, dynamical models do not seem to outperform the best statistical models in forecasting major ENSO indices (Ghil and Jiang 1998; Barnston et al. 1999; Goddard and DeWitt 2005) and hence we limit ourselves here to the latter. In doing so, it is well known that the Niño-3.4 index tends to be slightly easier to forecast than Niño-3 and that smoothing of the forecast over several months is a common practice that also improves anomaly-correlation skills. The 0.62 value of our unsmoothed, monthly anomaly correlation for the Niño-3 index at a 6-month lead in Fig. 6a here is thus quite similar to the cross-validated skill of other statistical models. For instance, both Johnson et al. (2000b) and Xue et al. (2000) used a 3-month smoothing of the Niño-3.4 index in their cross validations and obtained, respectively, 0.60 for the 1956–95 interval (Johnson et al. 2000b, their Fig. 4) and 0.65 for the 1980–95 interval (Xue et al. 2000, their Fig. 12). The competitive skill of our results has led to our best 2-L quadratic model being included in a multimodel prediction scheme at the International Research Institute for Climate Prediction (see online at http://iri.columbia.edu/climate/ENSO/currentinfo/SST-table.html for details).

Based on the same two 100-member ensembles, we plot in Figs. 6c and 6d the forecast skill of an extreme-event hindcast *f*_{extr}(*t*) ≡ *ω*_{+}*f*_{+} + *ω*_{−}*f*_{−}, where *f*_{+} and *f*_{−} are ensemble averages over the top and bottom 20% of the hindcasts, ordered by the magnitude of the predicted SST anomaly, and *ω*_{±} = 0.5[1 ± tanh( *f*_{mean})]. With this weighting, *f*_{extr}(*t*) ≈ *f*_{+} during strong warm events, and *f*_{extr}(*t*) ≈ *f*_{−} during strong cold events. The rms error is expected to be higher in Fig. 6d than in Fig. 6b, since we average now only over forecasts in the tails of the ensemble distribution. The extreme-event hindcast produced by the quadratic model has a somewhat smaller rms error than its linear counterpart (Fig. 6d), while the anomaly correlation with the observed data for both models is essentially the same as for the ensemble mean hindcast (cf. Figs. 6c and 6a). The extreme-event nonlinear hindcasts beat persistence in rms error after 2 months (Fig. 6d) and continues to do so for the following 10 months. The modest improvement of the quadratic versus linear hindcasts comes from the nonlinear model’s ability to capture the non-Gaussian features of ENSO, with positive SST anomalies having larger amplitude than the negative anomalies, while the PDF of the linear inverse model’s time series is normal (see left panels in Fig. 3). The linear model tends, therefore, to underestimate the magnitude of El Niño events and to overestimate that of La Niña events.

To illustrate this point we compare the time series *f*_{extr}(*t*) for both cross-validated hindcasts in Fig. 7a, while the in-sample hindcasts are shown in Fig. 7b. In the latter, both models were trained and verified on the entire available record. We only show the 1970–2003 segment of the record, which contains the major La Niña and El Niño events. Given a short historical record, the in-sample hindcast skill helps us to analyze the potential for future model performance, and represents the best-possible prediction.

Close examination of Fig. 7b shows that the in-sample hindcast curve *f*_{extr}(*t*) for the quadratic model matches the observed time series value of roughly 3.7°C at the peak of the 1997/98 El Niño, while the linear model underestimates it by about 1°C. This comparison suggests that our quadratic inverse model has the potential to predict strong ENSO events, like the 1997/98 El Niño, 6 months in advance, with a probability of 20%. In contrast, the probability of predicting the correct magnitude of this event at such a lead time is vanishingly small for the linear inverse model.

The in-sample hindcasts of both linear and quadratic models, however, are similar for other large El Niños during the 1970–2003 interval. The same is true for the cross-validated hindcast time series in Fig. 7a, which shows no advantage of the quadratic over the linear model for any El Niño during this interval, including the 1997/98 event. The magnitude of La Niñas, on the other hand, is better described by the quadratic model.

This behavior can be understood as follows. In-sample hindcasts tell us essentially how well the model simulates the record; they overestimate, therefore, the model’s predictive skill in a real-time situation. On the other hand, cross validation usually underestimates model performance, especially if the underlying time series contains only a very small number of extreme events. When such events are excluded from the model’s training period, they become statistical outliers, and therefore cannot be well predicted in cross validation. This seems to be the case for some of the extreme ENSO events.

The comparison of Figs. 7a and 7b indicates, for example, that the 1997/98 El Niño does not have any counterpart in the historical record. This uniqueness is further highlighted in the SSA reconstruction of the record. In Fig. 7c, we plot, following Jiang et al. (1995), the reconstruction of the two oscillatory ENSO modes, QQ and QB. The strongest warm events during the 1970–2003 time interval have the QQ and QB modes almost in phase, with the 1997/98 event being almost perfectly phase aligned, even more so than the other large-amplitude events. Still, the amplitude of the actual event exceeds the sum of the QQ and QB in-phase amplitudes, possibly pointing to an additional, highly nonlinear process that “kicks in” at this level of warming.

Including additional information in the model, for instance subsurface temperatures or wind stress, may help capture such an additional process better, and thus obtain better forecast skill for extreme ENSO events. The studies of Xue et al. (2000) and McPhaden (2004) do suggest that the heat content anomalies along the equatorial belt are, in fact, an important ENSO predictor, while the work of Jin (1997a, b,c) shows that these anomalies are out of phase with the SST anomalies. We plan to include additional physical variables in our future work, and to consider carefully their phase relations to the SST.

#### 2) Seasonal dependence of the forecast skill: “Spring barrier”

In Fig. 8, we plot the month-to-month distribution of the skill for the ensemble mean cross-validated hindcasts of our best 2-L linear and quadratic model; results are shown at both 6-month and 3-month lead, with the target month indicated on the abscissa. Figure 8 confirms that our 2-L models, too, exhibits the so-called spring barrier for prediction. Similar results are obtained for the extreme-event hindcasts (not shown). The spring barrier is a common problem for other dynamical and statistical models as well (Balmaseda et al. 1995). It involves a drop of skill in El Niño-3 forecasts with a 3–6-month lead that start in February through April, with the forecast skill rising again in late summer and fall. The related spring time drop of lagged autocorrelation in observed SST anomalies (Flugel and Chang 1998) is reproduced by both our linear and quadratic models (see Fig. 9), with the linear model matching the observations slightly better.

The seasonal prediction barrier occurs in our models in spring and early summer and it is consistent with the seasonal dependence of ENSO variance apparent in the right panels of Fig. 3 and in the Fig. 4. SST anomalies are smaller in late winter through summer, and their evolution in our model at that time is governed primarily by the stochastic forcing; this fact leads to a low signal-to-noise ratio and, hence, to lower predictability for the season in question. The opposite is true for the boreal winter months, when both El Niño and La Niña are at or near peak amplitude and the associated SST signal is large relative to the noise. The existence of the spring barrier is thus related to the question, Why are SST anomalies largest during boreal winter? This question is answered by the eigenspectrum analysis of section 5.

## 5. Dynamical analysis

In this section, we analyze our inverse models’ dynamics in order to explain certain features of the modeled and observed data presented in section 4. In particular, the existence of two distinct oscillatory modes, seasonal variations, as well as the spring barrier in the prediction skill are addressed. We utilize month-by-month eigenspectrum analysis (section 5a) and Floquet analysis (section 5b) to interpret the seasonal modulation of the oscillatory modes. Singular-vector analysis (section 5c) is used to compute the optimum initial patterns that grow into the most energetic Floquet mode. The nonnormal growth reaches its maximum always in winter, thus providing the observed seasonal dependence of prediction skill.

### a. Month-by-month linear analysis

To better understand the seasonal dependence of our models’ forecast skill and oscillatory behavior, we analyze the eigenmodes of the linear operator 𝗕^{(0)} in Eq. (6) for a linear 20-component 2-L model. Since 𝗕^{(0)} is periodic with a period of 12 months, we compute a set of eigenvalues and eigenvectors for each of the 12 calendar months. The complex eigenvalues represent damped oscillations with a decay time and period given by the real and imaginary part of the corresponding eigenvalue.

We have found two robust eigenmodes, whose damping times change from month to month, while their periods and spatial patterns stay fairly similar. These two modes are always ranked in the top three least-damped eigenmodes for any calendar month in which they have been identified; they are associated with the QQ and QB modes of ENSO and are shown in Fig. 10. Virtually identical leading modes are obtained when analyzing the eigenspectrum of the linear part of the 2-L quadratic model (not shown). The other eigenmodes are much more sensitive to the season and change significantly from one model to the other.

Our QQ and QB modes are identified based on high month-to-month pattern correlation. Figure 10a shows how the decay time and period of each mode change through the year. The QQ mode has a period ranging from 47 to 58 months and is least damped in December, with a decay time of 14.3 months; it is not identifiable in summer. In contrast, a more strongly damped QB mode is present in all months, and has a period of about 25 months, except in June when its period is of 40 months. It is least damped is spring and early summer, when its decay time is about 5 months. Penland and Sardeshmukh (1995) obtained a 48-month ENSO mode, similar to the QQ mode here, using an LIM for the tropical Pacific only. The latter analysis, however, included neither the extratropics nor seasonal dependence.

The real and imaginary parts of each oscillatory mode’s complex conjugate eigenvector pair form its spatial patterns and are in phase quadrature. They are shown in Figs. 10b and 10c and Figs. 10d and 10e for the QQ and QB modes, in January and May, respectively, when they are least damped.

The QQ component is very similar to the 4-yr oscillatory ENSO mode identified by multichannel SSA analysis (M-SSA) of an SST global dataset by Ghil et al. (2002). This mode consists of a “horseshoe” pattern over the extratropical Pacific, together with a classic El Niño pattern over the tropical Pacific. Amplitudes elsewhere are weak, although there is some simultaneous warming in the northwestern Indian Ocean and over the tropical Atlantic during an El Niño. The spatial patterns of the QQ and QB modes in our regression models are fairly similar to each other, in agreement with the findings of Moron et al. (1998) and Zhang et al. (1998), who identified robust 2- and 4-yr periodicities using M-SSA on global SSTs.

### b. Floquet analysis

Since we include the march of seasons in our models, the linear operator in each model is periodic in time. This 12-month period is comparable with the growth or decay times of the QQ and QB modes obtained in the previous subsection, as well as with their periods. The month-by-month stability analysis of section 5a is thus not really self-consistent and we apply therefore Floquet analysis to obtain the leading oscillatory modes of the linear system

where 𝗕^{(0)} (*t*) is periodic with period *T* = 12 months, as defined in Eq. (6).

The Floquet modes correspond to the eigenvectors and eigen values of a so-called monodromy matrix (Iooss and Joseph 1980; Strong et al. 1995; Jin et al. 1996). The monodromy matrix 𝗠 is defined as the value taken by the fundamental (also known as the resolvent or propagator) matrix **Φ**(*t*) (Hartman 1982) after integrating for one year the matrix form of the ordinary differential equation (ODE):

where 𝗜 is the *n* × *n* identity matrix; here *n* = 40 for our 2-L 20-component model and 𝗠 = **Φ**(*T*).

The eigenvalues of the matrix 𝗠 are called Floquet multipliers, *α _{j}* =

*e*,

^{σjT}*j*= 1, 2, . . . ,

*n*. The Floquet multipliers are uniquely determined and do not depend on the month in which you start to integrate the ODE (8a), while the Floquet exponents

*σ*= (1/

_{j}*T*)

*log*(

*α*) determine the frequency and growth rate of the Floquet modes

_{j}**X**

_{j}(

*t*) =

*e*

^{σjt}**Y**

_{j}(

*t*) but are defined only up to an additive multiple of

*i*2

*π*/

*T*. The periodic Floquet eigenvectors,

**Y**

*(*

_{j}*t*) =

**Y**

*(*

_{j}*t*+

*T*), describe each mode’s seasonal evolution and are obtained from the eigenvectors

*ψ*of the monodromy matrix:

_{j}**Y**

_{j}(

*t*) =

*e*

^{−σjT}

**Φ**(

*t*)

*ψ*, 𝗠

_{j}*ψ*=

_{j}*α*.

_{j}ψ_{j}Equation (8a) was integrated using an Euler scheme and time step of one month; the convergence of the solutions was checked by using a time step of 1 day, with almost identical results. For our 2-L 20-component linear model, the least-damped, oscillatory Floquet mode corresponds to a classic ENSO cycle, with a period of 52 months and a decay time of 11 months; we refer to it as the ENSO mode. The closest counterpart of the QB mode in this Floquet analysis has a frequency of 41 months and it is strongly damped, with a decay time of 3.4 months. The Floquet analysis of 15-component models, whether linear or linearized quadratic (i.e., *n* = 30), yields a leading ENSO mode with similar spectral and spatial features (not shown).

The spatial–temporal structure of the ENSO mode is represented by the real and imaginary parts of the least-damped Floquet vector; these two parts are in phase quadrature and are shown in Figs. 11a and 11b for January and in Figs. 11c and 11d for June, respectively. These plots reveal significant seasonal variation of the ENSO mode, which is consistent with the eigenspectrum analysis of **B**^{(0)} (*t*) for the individual months above: the mode reaches its maximum amplitude in boreal winter, and its minimum in summer.

The simulated Niño-3 index of the ENSO mode, with exponential decay suppressed, is shown in Fig. 11e. The mode’s main cycle of 52 months is amplitude-modulated by the seasonal cycle of the Floquet eigenvectors. The monthly distribution of variance for the mode’s Niño-3 index (Fig. 11f) is similar to the one observed (see Fig. 3a), and simulated in the full model (Figs. 3b,c).

These analytical results allow us to summarize the causes behind the spring barrier in ENSO predictability. The dominant 4.2-yr oscillatory ENSO signal, represented either by the QQ mode from section 5a or by the ENSO mode in the present subsection, is most energetic in winter and strongly damped in summer; this results in a drop of predictability during summer. The QB mode is present throughout the year, but it is both smaller in amplitude and very strongly damped, so the predictability associated with this mode is limited. The seasonal modulation of ENSO mode is explained via nonnormal growth in the next section.

### c. Singular vectors

Blumenthal (1991) and Penland and Sardeshmukh (1995) have related the development of ENSO events to the nonnormal growth of small perturbations. To obtain information about optimal nonnormal growth in the linear part of our models, we applied singular value decomposition (SVD) to the fundamental matrices **Φ**(*τ*), following Thompson and Battisti (2000):

The columns of 𝗩 contain the so-called optimal initial vectors that achieve a growth equal in magnitude, after a prescribed time *τ*, to the corresponding singular values, which are the elements of the diagonal matrix 𝗦, while the final structure of the evolution over the time *τ* is given by the columns of the 𝗨 matrix.

Figure 12a shows the contour map of the largest singular value of **Φ**(*t*, *τ*) for each initial calendar month *t* and up to a lead time of *τ* = 20 month. For a given starting month, maximum growth always occurs in the following boreal winter. The absolute maximum in growth, by a factor of 3, corresponds to starting conditions in February and a lead time of 10 months. The corresponding optimal initial and final patterns are shown in Figs. 12b and 12c, respectively. The pattern at the maximum-growth point strongly resembles a Floquet ENSO mode (Figs. 11a,b), or a fully developed ENSO event. Having maximum growth in winter leads naturally to the higher variance of the Floquet ENSO mode at the same time, explaining Figs. 11a, 11c and 11f.

Both our 2-L model’s Floquet ENSO mode and singular-vector behavior resemble to a certain extent those computed by Thompson and Battisti (2001) for their stochastically forced, linear model, but are far from identical to them. In the latter, the growth, though high for a February start, reaches its maximum for a May start, while in our case the month of maximum growth is February. Our results provide, therewith, a better explanation of the observed seasonal variance distribution, as well as of the spring barrier in prediction skill.

Our optimum final spatial pattern agrees quite well with that of Penland and Sardeshmukh (1995), and represents a fully developed ENSO event. Our optimum initial spatial pattern has one positive regions that contains overall maximum and it is southwest–northeast oriented; this region starts around (20°S, 150°W) and ends near equatorial Africa. Another region is more or less parallel to the previous one, but it starts around the date line, on the equator and then ends near Baja California, Mexico. These features are very similar to the results of Penland and Sardeshmukh (1995), while Thompson and Battisti (2001) obtained only one strong maximum in the southeastern Equatorial Pacific.

The influence of ENSO events on SST variability in other ocean basins, including the North Pacific, has been studied extensively by assuming an “atmospheric bridge” mechanism (Alexander et al. 2002; Lau and Nath 2001). In our results, the North Pacific exhibits a strong center of action in both optimum initial pattern (Fig. 12b), and ENSO Floquet mode (Figs. 11a–d). Unfortunately, earlier ENSO predictability studies (Penland and Sardeshmukh 1995; Thompson and Battisti 2001) included neither the North Pacific nor the North Atlantic and therefore it is not possible to compare our results over these ocean basins with theirs. It thus remains unclear whether SSTs in the northeastern North Pacific play a truly significant role in the nonnormal growth on the ENSO mode, or whether they are mere statistical artifacts of our analysis, being only passively influenced by SST variability in the tropical Pacific via the “atmospheric bridge.”

## 6. Summary and discussion

We have applied the methodology of KKG to construct empirical, linear and nonlinear, inverse stochastic models for the analysis of global SST evolution with an emphasis on ENSO variability. Overall, our models reproduce well the oscillatory, seasonally modulated behavior of observed ENSO phenomena (section 1). Our method (section 2) generalizes in two ways linear inverse modeling (LIM; Penland 1989, 1996; Penland and Ghil 1993; Penland and Sardeshmukh 1995; Penland and Matrosova 1998; Johnson et al. 2000b; Winkler et al. 2001) approach.

The LIM approach considers the dynamics to be linear, stable, and stochastically forced; it estimates the linear deterministic operator

as well as the structure of the stochastic forcing, directly from observations by multiple linear regression (MLR), while assuming the latter forcing to be white in time. The modifications we introduce to this methodology relax the assumptions of linearity and white noise and lead to the construction of multilevel, polynomial inverse models. Another novel aspect of our models is the inclusion of multiplicative and deterministic seasonal forcing, instead of the seasonal variability of the additive forcing in LIM (Penland 1996).

Linear and quadratic, one-level (1-L) and two-level (2-L) models were obtained in the phase space of the leading EOFs of the 1950–2003 global SST record (Kaplan et al. 1998; see section 3 here). The number of variables to include in each model, as well as the order of the model’s nonlinearity, was determined by cross validation to maximize the model’s hindcast skill. The model was also optimized in terms of other statistical properties, such as the probability distribution function (PDF) and power spectra of the observed and modeled SSTs.

The linear and nonlinear 1-L models’ residual forcing involves serial correlations due, among other things, to the dependence of the stochastic forcing on the modeled flow. The residual forcing of our 2-L, quadratic model is indeed a spatially correlated noise process that is white in time. In section 4a, we have shown that the 2-L model has a significantly better cross-validated hindcast skill than its 1-L counterpart at lead times longer than 4 months (Fig. 2), in particular over the equatorial Pacific, as well as over the Indian Ocean. The 1-L linear model used here is analogous to the LIM used for ENSO prediction by C. Penland and associates (Penland and Sardeshmukh 1995; Penland 1996). Our 2-L model, on the other hand, explicitly represents the low-frequency flow’s feedback on the noise statistics (KKG).

The statistical properties of the linear and quadratic 2-L models were considered in section 4b. In our models, the linear part of the deterministic dynamics is seasonally dependent, while in the LIMs of C. Penland and collaborators the seasonal dependence enters into the covariance of the additive noise. Both our linear and quadratic models correctly simulate the seasonal dependence of ENSO. We have tried several other methodologies, by adding a purely deterministic seasonal forcing to, or including the seasonal forcing into both levels of, our inverse model. Any of these modifications degrade the performance of the corresponding models.

We have shown that our 2-L quadratic model is doing a better job than the linear one in capturing the main non-Gaussian features of the distribution of observed SST anomalies (Fig. 3). The observed values of skewness and kurtosis for the Niño-3 PDF lie within 95% of the possible range for an ensemble of 100 realizations of the quadratic model (Fig. 3d). In contrast, for a similar ensemble of linear models, the observed values lie outside the 95% confidence limits of the models’ distribution for these PDF parameters (Fig. 3e). The quadratic model captures better the seasonal dependence of observed skewness, which is strongly biased toward larger warm events during boreal winter (Fig. 4). On the other hand, the 2-L linear model appears to do slightly better in capturing the seasonal dependence of the spread of the distribution.

The simulations of our linear, as well as nonlinear models are characterized by the presence of two oscillatory modes, a dominant quasi-quadriennial (QQ) and a less energetic quasi-biennial (QB) one; their respective periods are close to the observed ones (Figs. 5a,b). The periods of these modes, in observed data as well as model simulations, exhibit an interdecadal variation, as revealed by the wavelet spectra of the corresponding time series (Figs. 5c,d).

The predictive potential of our linear and quadratic inverse models was examined in section 4c using both in-sample hindcasts and cross-validated hindcasts. The skill for 6-month lead in Fig. 6a is comparable to the cross-validated skill of other statistical models (Xue et al. 2000; Johnson et al. 2000b). The quadratic model outperforms its linear counterpart in predicting the magnitude of extreme events (Fig. 6). In particular, the correct magnitude of the 1997/98 El Niño event is predicted 6 months in advance, with a probability of about 20% for an in-sample hindcast mode. In contrast, the probability of such a prediction using the linear model is negligible (Fig. 7b). The ability of the quadratic model to forecast large positive SST anomalies is consistent with the non-Gaussian, skewed aspect of its simulated-data histograms. This advantage of the quadratic-model forecast over the linear one is less pronounced for other extreme events.

The SSA reconstruction of the two major oscillatory ENSO modes suggests that the 1997/98 event is characterized by a particularly exact phase alignment of the QQ and QB modes (Fig. 7c). The dynamics of this event might therefore be slightly different from that of its previous counterparts, for which the rising phases of the two modes did not coincide as precisely. In contrast to extreme SST anomalies, small-amplitude anomalies are predicted equally well by the linear and quadratic 2-L models, as expected. Therefore, when averaged over the whole time series, the predictive skills of the linear and quadratic models are very close.

In summary, our 2-L quadratic model is clearly a better fit to the data in terms of non-Gaussian features (Figs. 3, 4), but its advantages are less clear in terms of cross-validated hindcast skill (Fig. 6): the anomaly correlations are not as good, while the rms errors in extreme-event prediction show a slight improvement.

The seasonal distribution of the cross-validated forecast skill for the 2-L quadratic and linear models (Fig. 8) exhibit a distinct “spring barrier.” This spring barrier in prediction skill is a feature of all statistical and dynamical models used for ENSO forecasts (Latif et al. 1998). Balmaseda et al. (1995) connected the interdecadal modulations in the sharpness of the spring barrier to the degree of phase locking between ENSO and the annual cycle. Jin et al. (1994, 1996) found that in an unstable, nonlinear model based on that of Zebiak and Cane (1987), phase locking between the ENSO mode and the annual cycle does occur and produces several ENSO features not reproduced by linear models (see also Ghil and Robertson 2000; Tziperman et al. 1994). Blumenthal (1991) has related the existence of the spring barrier to a seasonal dependence of nonnormal error growth in a coupled ocean-atmosphere model. Thompson and Battisti (2001) concluded that the presence of an annual cycle in linear models is also important for the seasonal cycle in ENSO variance and the presence of a spring barrier in such models.

The spring barrier of cross-validated hindcast skill (Fig. 8) and lagged autocovariance (Fig. 9) in our models is consistent with the seasonal cycle of observed and simulated ENSO variance: for target months from the late spring to midsummer, during which the observed SST anomalies are weak, the signal-to-noise ratio is small, and the models’ hindcast skill deteriorates.

To better understand the seasonal dependence of both simulation and prediction results in our models, stability analyses of our 2-L linear model, as well as of a linearized quadratic-model version, were carried out for each calendar month. These analyses show that the two oscillatory modes found in the simulations and observations are associated with the models’ most robust linear eigenmodes. The QQ mode is the least-damped one in boreal winter, but cannot be identified in summer, while the QB mode is present throughout the whole year, but is strongly damped (Fig. 10).

Floquet analysis provides more complete and self-consistent information about the seasonal evolution of the oscillatory modes. This analysis shows that the least-damped oscillatory mode, which is robust in both linear and nonlinear models, has a spatial pattern and frequency similar to the simulated QQ mode (Fig. 11). This ENSO mode of the Floquet analysis has its maximum amplitude in boreal winter, while being strongly damped in summer. Using SVD decomposition of the propagator matrices (Fig. 12), we found that non-normal error growth helps explain the seasonal modulation of this ENSO mode. Out SVD results complement those of Thompson and Battisti (2000), who developed a much more detailed, but purely linear stochastic-dynamical model by linearizing the Zebiak and Cane (1987) intermediate coupled model and driving it with additive white-noise forcing. Non-normal growth in our model peaks in February, thus leading to a higher variance of the ENSO mode in winter time, while Thompson and Battisti (2001) obtained maximum growth for a May start. It is left for future research to explain how the dynamic nonlinearities of our quadratic model cause the preferred growth of warm events.

Based on combined statistical and hindcast measures, we conclude that our quadratic 2-L model with multiplicative seasonal forcing provides a good dynamical representation of the ENSO phenomenon and some improvement in the forecasting of extreme events. The physical and dynamical reasons for this competitive performance of such a simple, purely data-based model require further study.

## Acknowledgments

It is a pleasure to thank Lisa Goddard of the International Research Institute for Climate Prediction for providing results on the cross-validated hindcast skill of the Constructed Analog Model, which allowed us to conclude that the hindcast skill of our 2-L quadratic model is quite competitive over the long run. Two anonymous reviewers gave us excellent and constructive advice. This research was supported by NSF Grant ATM-0081321 (MG and DK), as well as NSF Grant OCE-02-221066 and DOE Grant 98ER6215 (SK and AWR).

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**.**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

## Footnotes

* Current affiliation: International Research Institute for Climate Prediction, Palisades, New York

+ Permanent affiliation: Départment Terre–Atmosphère–Océan and Laboratoire de Météorologie Dynamique du CNRS/IPSL, École Normale Supérieure, Paris, France

*Corresponding author address:* Dr. Dmitri Kondrashov, Department of Atmospheric and Oceanic Sciences, University of California, Los Angeles, 405 Hilgard Ave., Los Angeles, CA 90095-1565. Email: dkondras@atmos.ucla.edu