A multivariate empirical model is used to show that predictability of the dominant patterns of tropical and North Pacific oceanic variability, El Niño–Southern Oscillation (ENSO), and the Pacific decadal oscillation (PDO), is mostly limited to little more than a year, despite the presence of spectral peaks on decadal time scales. The model used is a linear inverse model (LIM) derived from the observed simultaneous and 1-yr lag correlation statistics of July–June-averaged SST from the Hadley Centre Global Sea Ice and Sea Surface Temperature (HadISST) dataset for the years 1900–2002. The model accurately reproduces the power spectra of the data, including interannual and interdecadal spectral peaks that are significant relative to univariate red noise. Eigenanalysis of the linear dynamical operator yields propagating eigenmodes that correspond to these peaks but have very short decay times and, thus, limited predictability.
Longer-term predictability does exist, however, due to two stationary eigenmodes that are more weakly damped. These eigenmodes do not strongly correspond to the canonical ENSO and PDO patterns. Instead, one is similar to the 1900–2002 trend and might represent anthropogenic effects, while the second represents multidecadal fluctuations of a pattern that potentially represents natural decadal variability; however, neither attribution can be made unambiguously with the analysis presented in this paper. Predictability of these two stationary eigenmodes is significantly enhanced by tropical–North Pacific coupling. Neither stationary eigenmode is well captured in the control run of any coupled GCM in the CMIP-3 project of the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4), perhaps because in all of the GCMs tropical SST decadal variability is too weak and North Pacific SSTs are too independent of the Tropics.
A key implication of this analysis is that the PDO may represent not a single physical mode but rather the sum of several phenomena, each of which represents a different red noise with its own autocorrelation time scale and spatial pattern. The sum of these red noises can give rise to apparent PDO “regime shifts” and seeming characteristics of a long memory process. Such shifts are not predictable beyond the time scale of the most rapidly decorrelating noise, less than two years, although the expected duration of regimes may be determined from the relative amplitudes of different eigenmodes.
Predictability of the atmosphere on annual to decadal time scales comes primarily from the ocean. Observations of spectral peaks on these time scales in both the North Pacific and tropical Indo–Pacific Oceans, particularly from long-term proxy data (Evans et al. 2001; Biondi et al. 2001; D’Arrigo et al. 2001; Gedalof et al. 2002; D’Arrigo et al. 2005), as well as modeling studies of oceanic dynamics acting on multiyear time scales (Venzke et al. 2000; Schneider and Miller 2001; Schneider et al. 2002; Scott and Qiu 2003), suggest that some decadal (or at least greater than interannual) oceanic predictability may exist. In addition, there has been much interest in so-called regime shifts in the climate system, conspicuously rapid transitions between relatively stable atmospheric and oceanic states that appear to occur roughly every few decades (e.g., Trenberth and Hurrell 1994; Hare and Mantua 2000) and perhaps drive similar changes in marine ecosystems (e.g., Miller et al. 1994; Mantua et al. 1997; Benson and Trites 2002; Miller et al. 2004).
A key unresolved question has been how much North Pacific sea surface temperature (SST) predictability is affected by mechanisms of decadal variability “internal” to the extratropics versus mechanisms driven from and/or coupled to the Tropics. Many modeling studies consider the North Pacific in isolation, where extratropical atmospheric variability that is temporally incoherent but spatially coherent drives both a local SST response due to anomalous surface fluxes and Ekman advection and a remote SST response in the Kuroshio–Oyashio Extension (KOE) region1 due to oceanic Rossby waves. Multidecadal spectral peaks in this view are then due to stochastically forced spatial resonance and/or to coupled air–sea feedback (e.g., Frankignoul et al. 1997; Jin 1997; Saravanan and McWilliams 1998; Weng and Neelin 1999; Seager et al. 2001; Schneider et al. 2002). Observational analyses have identified patterns that statistically explain much SST variability (e.g., Deser and Blackmon 1995; Zhang et al. 1997; Mantua et al. 1997; Livezey and Smith 1999; Tourre et al. 1999; Deser et al. 2004; An and Wang 2005), finding some “modes” of North Pacific SST variability that appear related to the Tropics and some that do not. In general, however, limitations of the statistical analyses make dynamical interpretation unclear, especially concerning disentangling tropical influences from purely internal North Pacific dynamics.
That there is at least some tropical influence upon North Pacific decadal variability seems clear (Trenberth and Hurrell 1994; Zhang et al. 1997; Evans et al. 2001; Newman et al. 2003b; Seager et al. 2004). For leads of a few months, the dominant mode of tropical SST variability, El Niño–Southern Oscillation (ENSO), is highly correlated year-round with the dominant mode of North Pacific variability, the Pacific decadal oscillation (PDO), as a result of the atmospheric bridge (Alexander et al. 2002). Because of wintertime reemergence (Alexander et al. 1999), North Pacific SSTs also have year-to-year memory. Thus, a simple model of the PDO–ENSO relationship can be constructed (Newman et al. 2003b, hereafter NCA) showing that, just as the ocean acts to redden atmospheric noise (Frankignoul and Hasselmann 1977), it also acts to redden the ENSO signal. That is, the observed spatial pattern of Pacific SST decadal variability, with relatively higher amplitude in the extratropics than in the Tropics, should be at least partly a consequence of a reddened ENSO response (NCA).
Although useful conceptually, the NCA model is limited by its simplicity. For example, the univariate approach restricts representation of tropical and North Pacific SST anomalies to a single pattern each. However, North Pacific variability is not equally sensitive to all parts of the Tropics (e.g., Alexander et al. 2002; Barsugli and Sardeshmukh 2002), and decadal variability within the Tropics does not coincide precisely with interannual variability (Zhang et al. 1997; Deser et al. 2004). Decadal variability in the North Pacific is not represented by the PDO alone (Deser and Blackmon 1995; Nakamura et al. 1997; Bond et al. 2003) and interactions between different parts of the basin are important on decadal time scales (Schneider and Cornuelle 2005). Thus, a multivariate analysis is clearly preferable.
Another key limitation of the NCA model is that it assumes a perfect forecast of ENSO, a restriction that would be preferable to relax, especially since it is possible that the North Pacific is not merely forced by the Tropics but may force the Tropics and/or be coupled to it (Gu and Philander 1997; Barnett et al. 1999; Schneider et al. 1999; Vimont et al. 2001, 2003; Solomon et al. 2003). Thus, to investigate variability and predictability of the North Pacific, the Tropics must be included as part of the system, not external to it.
To extend the NCA analysis, we will apply linear inverse modeling (LIM) to a state vector constructed from tropical and North Pacific SSTs. LIM attempts to empirically extract the linear dynamical system from simultaneous and time-lag covariance statistics of the system variables, and has been successfully used in geophysical contexts ranging from subseasonal atmospheric variability (Winkler et al. 2001; Newman et al. 2003a) to seasonal ENSO dynamics and prediction (Penland and Matrosova 1994; Penland and Sardeshmukh 1995; Penland 1996). LIM may be considered as a multivariate analog to the univariate red noise null hypothesis for SST variability first proposed by Frankignoul and Hasselmann (1977). Unlike its univariate counterpart, however, multivariate red noise can have transient anomaly growth and both stationary and propagating anomalies, and thus spectral peaks (e.g., Penland and Sardeshmukh 1995). Also, predictability estimates from LIM are straightforward and can be understood from fundamental aspects of the linear system (Newman et al. 2003a; Chang et al. 2004).
The outline of the paper is as follows. Multivariate red noise and linear inverse modeling are discussed in section 2, and section 3 describes details of the data and model. In section 4, the LIM constructed from 102 years of SST data is shown to reproduce the power spectra of the leading principal components (PCs), and it is also found that for forecast leads greater than 1 yr most LIM forecast skill results from two stationary eigenmodes of the linear operator. Effects of tropical–extratropical coupling are investigated in section 5, and our analysis is applied to the output of 18 coupled GCMs in section 6, where it is suggested that the GCMs significantly underestimate tropical forcing of the North Pacific on both interannual and interdecadal time scales. Concluding remarks are in section 7.
2. Multivariate red noise and linear inverse modeling
a. A multivariate null hypothesis
A standard approach to analyzing long-term datasets is to look for spectral “peaks” above a univariate red noise background. That is, one assumes a null hypothesis for a time series x:
where l < 0, −1/l is the decorrelation time scale and ξ represents white noise. Predictability of x is limited by l. Such a simple null hypothesis is often hard to beat; for example, monthly time series of the leading North Pacific pattern of both SST (Pierce 2001; Rudnick and Davis 2003) and sea level pressure (Percival et al. 2001) cannot be distinguished from univariate red noise at the 95% confidence level. In addition, from an SST prediction standpoint, atmospheric variability might as well be random: since the atmosphere varies so rapidly, it cannot be predicted on the much longer dynamical time scales of the ocean. This natural time-scale separation justifies the simple approximation (1) in which the ocean integrates forcing by weather approximated as white noise (Hasselmann 1976; Frankignoul and Hasselmann 1977), a picture perhaps only slightly modified by air–sea coupling (Barsugli and Battisti 1998).
The concept underlying this approximation can be usefully extended to the more general case of multivariate nonlinear dynamics. That is, a time-scale separation between “slow” dynamics and “fast,” or rapidly varying, nonlinearities can be shown to allow the approximation of a multivariate nonlinear tendency equation for a state vector x as
the sum of stable and predictable linear dynamics plus unpredictable white noise ξ (e.g., Papanicolaou and Kohler 1974; Hasselmann 1976; see also Penland 1996). [Note that for the problem considered in this paper, noise represents not only atmospheric variability but more generally any other rapidly decorrelating nonlinearity.] Put simply, variability is modeled as multivariate red noise. However, since the linear operator 𝗟 is a matrix its dynamics are represented by the eigenanalysis 𝗟uj = uj λj, where uj are the eigenmodes and λj are the corresponding eigenvalues that may be complex. This results in two critical differences between univariate and multivariate red noise. First, complex eigenvalues correspond to propagating eigenmodes with defined periods. Second, although all the eigenvalues are damped (i.e., they have negative real parts), in most geophysical systems of interest 𝗟 is nonnormal, so the eigenvectors are not orthogonal and thus transient anomaly growth is possible (e.g., Farrell 1988). Since 𝗟 is stable, an energy balance relationship called a fluctuation–dissipation relationship can be derived from (2):
where 𝗖0 = 〈x(0)x(0)T〉 is the zero-lag covariance matrix and 𝗤 is the noise covariance matrix, which can have nonzero off-diagonal elements (Penland and Matrosova 1994).
b. Linear inverse modeling
Obviously, not all nonlinear dynamical systems can be usefully represented by (2) (e.g., Penland 1989). Two approaches to determine if (2) is a good approximation are the “forward” and “inverse” methods. For the forward method, one might try to rewrite the physical equations in the form of (2). Such a derivation is likely to be extremely difficult for the full climate system. Alternatively, from (2),
where 𝗖τ = 〈x(τ)x(0)T〉 is a time-lag covariance matrix at some lag τ. This suggests an inverse method, in which (given enough good data) 𝗟, as well as 𝗤, can be estimated from observed 𝗖τ and 𝗖0 for some specified lag τ = τo, as described for example in Penland and Sardeshmukh (1995). An effectively linear, stochastically forced model of a system thus constructed is called a linear inverse model (Penland 1989; Penland and Ghil 1993; Penland 1996; DelSole and Hou 1999; Johnson et al. 2000; Winkler et al. 2001; Newman et al. 2003a). In principle, 𝗟 obtained through LIM should be identical to 𝗟 obtained from the forward method.
Whether the LIM is a good representation of the data is tested a posteriori, as described in the above-cited references. Here we stress one important point. If the dynamics of x are effectively linear, it should not matter what lag τo is used to determine 𝗟. That is, 𝗖τ1 = exp(𝗟τ1)𝗖0 and 𝗖τ2 = exp(𝗟τ2)𝗖0. The simplest test of LIM is to determine 𝗟 from many different lags and then compare each 𝗟. Unfortunately, if the “true” 𝗟 has some relatively high-frequency eigenmodes, then lags τ that are about an eigenmode half-period cannot be used to determine 𝗟, even though 𝗚(τ) can be determined. This “Nyquist problem” (Penland and Sardeshmukh 1995) occurs for all linear dynamical systems, so it does not necessarily invalidate the assumption of linearity. An alternative test is to determine if 𝗟, determined at one shorter lag τo, reproduces the observed autocorrelation of x at all longer lags from (4) (Winkler et al. 2001). In this paper, we will employ the related calculation wherein observed power spectra of x are compared to those predicted by LIM.
c. Estimating predictability from LIM
LIM forecast errors e(t + τ) = x(t + τ) − x̂(t + τ) are random and have covariance 𝗘(τ) = 𝗖0 − 𝗚(τ)𝗖0𝗚T(τ). Forecast skill measured by the average anomaly correlation ρ(τ) between forecast and verification anomalies is a function of S, the forecast signal-to-noise ratio, as
(Sardeshmukh et al. 2000). The LIM assumes noise is independent of the state; so on average, higher forecast skill is directly related to stronger predictable signal (Newman et al. 2003a). Therefore, long-range forecasts have the highest skill for relatively large initial amplitude in the least-damped eigenmodes of 𝗟.
3. Model details and data
a. SST data and EOF analysis
The dataset used was essentially the same as used by NCA. SSTs were from the Hadley Sea Ice and Sea Surface Temperature analysis (HadISST: Rayner et al. 2003) for the years 1900–2002. Similar calculations with Kaplan SST data (Kaplan et al. 1998; not shown) gave qualitatively similar results. The data were averaged into 5° × 5° grid boxes. Data were averaged into annual means extending from July through June of the following year. Anomalies were determined by removing the climatological annual mean. Empirical orthogonal functions (EOFs) were determined for the 1950/51–2001/02 period, and then anomalies were projected onto these patterns to produce 102-yr time series for each EOF. NCA determined EOFs from monthly data and then took a July–June mean of the resulting time series, but this made very little difference in our analysis. Prior to 1950, the general paucity of data may make it difficult to determine the dynamics on seasonal time scales, but HadISST annual means for that period appear dynamically relevant (Rayner et al. 2003). NCA showed that using a July–June mean might avoid some seasonality issues for tropical variability, North Pacific variability, and the tropical–North Pacific relationship. However, seasonality is likely still relevant to decadal variability (Vimont 2005).
EOFs were determined separately for the Tropics (20°S–20°N, 60°E–60°W) and the North Pacific (21°–60°N). The three leading EOFs for each field are shown in Fig. 1. Annual averages reduce the number of degrees of freedom, so few EOFs are needed to explain a significant fraction of variance of either field. The leading EOFs dominate the variability of their respective fields and are typically named “ENSO” and the “PDO.”
b. Construction of LIM
The SST state vector is
where xN is anomalous North Pacific SST and xT is anomalous tropical SST, so that (2) becomes
Here ξN and ξT are white noise forcings of xN and xT, respectively. Note that separating tropical and North Pacific SST explicitly in x allows diagnosis of tropical–North Pacific interactions through 𝗟NT and 𝗟TN (Newman et al. 2000; Winkler et al. 2001). Also, since x represents annual means, the noise can include variability that, while not predictable over the course of an entire year, could be predictable for a few months.
Defining the state vector solely in terms of SST is perhaps necessary given data limitations for other atmospheric and oceanic circulation variables. Certainly, other variables related to important physical processes such as surface fluxes and oceanic advection are important to SST evolution. The inverse model does, however, implicitly include the effects of all other variables linearly related to SST. This is an important distinction from a forward dynamical model in which the evolution of the state vector is governed only by the explicitly represented interactions among its components.
The leading seven EOFs of anomalous tropical SST and the leading three EOFs of anomalous North Pacific SST were retained for the model. The time-varying coefficients of these EOFs (i.e., the PCs) define a 10-component state vector x. The retained tropical SST EOFs account for 91% of tropical variance, while the North Pacific SST EOFs capture about 80% of the variance in the central and western North Pacific, though only about 64% of the domain-integrated variance. A lag of τo = 1 yr was used to determine 𝗟. Constructing a LIM from longer lags and/or more EOFs produces qualitatively similar results for 𝗚, but results in the Nyquist problem for 𝗟 discussed in section 2.
Finally, the LIM must be tested on data independent of that used to determine 𝗟. Estimates of 𝗟 and of forecast skill were cross-validated as follows. We subsampled the data record by removing one decade, computed 𝗟 via (3) for the remaining years, and then generated forecasts for the independent years. This procedure was repeated for each year. All measures of forecast skill in this study are based upon these “jackknifed” forecasts.
a. Forecast skill
Figure 2 shows forecast skill ρ(τ) for forecast leads of 1 and 2 yr. Here, forecasts are compared with the complete (i.e., untruncated in EOF space) gridded observations. Regions of little skill (ρ < 0.2) in the North Pacific coincide with regions where relatively less variance is explained by the leading three EOFs. Additional EOFs in xN increase skill to be no worse than about ρ = 0.3 in the northeast Pacific for a 1-yr forecast, with higher-skill regions essentially unchanged.
For 2-yr forecasts, highest LIM skill lies primarily outside the ENSO region. However, persistence skill is also high, especially in the far west Pacific and Indian Ocean, so differences between LIM and persistence skill (bottom two panels in Fig. 2) are often small. One notable exception is the tropical Pacific, where persistence skill is negative due to the pronounced interannual cycle of ENSO. LIM skill in the western subtropical Pacific also appears better than persistence. Additional EOFs greatly reduce, but do not quite eliminate, the large negative difference in the Northeast Pacific in the bottom panels.
Examining skill by EOF (Fig. 3) shows good PDO and ENSO 1-yr forecast skill. In particular, inspection of the forecasts (not shown) finds that all notable transitions into positive and negative ENSO states are captured, as is possible for multivariate, but not for univariate, red noise. PDO 1-yr skill here (ρ = 0.64) may be closer to a realistic estimate of skill than PDO skill in NCA (ρ = 0.74) since ENSO skill is no longer perfect. For both the PDO and ENSO, however, skill drops significantly for 2-yr forecasts.
b. Power spectra
As discussed in section 2b, if (2) is a good model of x, the LIM should reproduce the entire power spectra of x. As a test, (2) is integrated forward for 100 000 years, using the method described in Penland and Matrosova (1994). The noise is generated by ξ = Σjqjηjrj(t), where qj and (ηj)2 are the eigenvectors and eigenvalues, respectively, of 𝗤, and rj(t) are independent Gaussian white noises with unit variance. One eigenvalue of 𝗤 is negative with small magnitude, representing insufficient sampling of the most strongly damped eigenmode of 𝗟 (below). Reducing the projection of the noise upon this eigenmode by one-third results in a positive definite 𝗤. Note that any truncation of 𝗤 means that the LIM will not exactly replicate 𝗖0.
The resulting “data” are separated into one thousand 100-yr time series. The observed spectra and the ensemble mean of the model spectra, determined using the Thomson multitaper method (Thomson 1982; Mann and Lees 1996), for the three leading PCs of both tropical and North Pacific SST are shown in Figs. 4 and 5. Also, with 1000 realizations, the 95% confidence interval, shown with gray shading, can be determined in a Monte Carlo sense.
The LIM reproduces the main features of the observed power spectra for all the PCs (including the remaining tropical SST PCs not shown) on all time scales. Obviously, the mean LIM spectra are much smoother than observed, due to the relatively few degrees of freedom in the truncated EOF space. On the other hand, the irregularity of the observed spectra may simply be due to sampling since no statistically significant peaks exist in the observed spectra relative to multivariate red noise. The confidence intervals reflect substantial variation in the spectra of each 100-yr segment from the model run. Each segment can have pronounced peaks not always coincident with the observed peaks, as exhibited by the spectra of two different 100-yr segments also in Figs. 4 and 5. That is, this multivariate linear system implies that, if good SST data for the last few hundred years were available, we would find interannual and interdecadal spectral peaks for past centuries to be centered at somewhat different frequencies and have different amplitudes and/or widths compared to the twentieth century, without any change in the underlying dynamics on interannual and interdecadal time scales. This point has been emphasized for ENSO variability by Penland and Matrosova (1994), who constructed a LIM of 3-month running mean tropical IndoPacific SST that reproduced not only the spectra but also the case-to-case evolution of both warm and cold ENSO events.
Additionally, while for the entire model run the correlation of PDO with ENSO is 0.63 (observed r = 0.69), the correlation varies between 0.37 and 0.82 (95% confidence interval: 0.48–0.75) when computed from individual 100-yr segments. For 5-yr running means of the model output (i.e., filtering to remove interannual variability) the range is even greater: the correlation for the full run is 0.64 (observations is 0.73) but for 100-yr segments the 95% confidence interval is 0.33–0.85. Similarly, while the observed PDO–ENSO correlation varies from 0.77 to 0.67 for the two halves of the data record, correlation within individual 50-yr segments of the model runs varies between 0.26 and 0.87 (95% confidence interval: 0.42–0.79). Again, since 𝗟 is constant, this variation is entirely due to noise in this multivariate system, suggesting that large changes in correlations between climate indices also do not necessarily reflect any change in the underlying dynamics, even on decadal time scales. Obviously, changes in dynamics on millennial time scales (e.g., Tudhope et al. 2001) are not ruled out here.
c. Eigenanalysis of the linear dynamical operator
To understand departures of both observed and LIM spectra from univariate red noise, we next perform an eigenanalysis of 𝗟. First, a caveat. It is always tempting to separate variability into a set of modes. However, variability results from the sum of contributions of the complete set of eigenmodes of 𝗟; that is,
where αj(t) is the projection time series for the jth eigenmode. Unlike techniques that order modes by decreasing amount of variance (or covariance or correlation) explained, the least damped eigenmode is not required to explain the most variance. In fact, eigenmodes are generally not orthogonal (since 𝗟 is generally not self-adjoint) and thus variance cannot be partitioned into individual modes. Eigenmodes are useful to the extent they provide insight into the dynamics represented by 𝗟 and hence into the predictability.
The eigenvalues and e-folding times (EFTs) of 𝗟 are listed in Table 1, and leading eigenmodes [also sometimes called principal oscillation patterns (POPs): Hasselmann 1988; Penland 1989] are shown in Fig. 6. Eigenmode time series in Fig. 7 are determined by projecting the data onto the corresponding (biorthogonal) adjoint vectors vj; that is, vj are the eigenmodes of the adjoint operator 𝗟† defined under the L2 norm, so that ui · vj = δij and thus αj(t) = xv*j (e.g., Strang 1988; Farrell 1988). The two leading propagating eigenmodes (σi ≠ 0) are reminiscent of a “decadal ENSO” pattern (top panels of Fig. 6, u3) and an “interannual ENSO” pattern (middle panels of Fig. 6, u4). Both of these eigenmodes are needed to reproduce ENSO: α4 (α3) is correlated with the leading tropical PC at 0.69 (0.43) (see also Penland and Sardeshmukh 1995). Phase propagation is such that the tropical SST anomaly leads the North Pacific anomaly, by about 2 yr in the decadal ENSO (not shown) and by about 1 yr in the interannual ENSO. As is typical of studies filtering Pacific SSTs into interannual and interdecadal bands (e.g., Zhang et al. 1997), the tropical SST portion of u3 has broader latitudinal extent than for u4. The North Pacific anomaly is also relatively stronger, consistent with the “reddened ENSO” null hypothesis (NCA). In addition, u3 has maximum tropical amplitude nearer to the date line than u4, which could reflect not only greater decadal variability (e.g., Hazeleger et al. 2005) but also greater extratropical sensitivity to SST anomalies in this location (Alexander et al. 2002; Barsugli and Sardeshmukh 2002). One might be tempted to relate the cosine and sine phases of u3. However, the EFT is so short that in the presence of noise not even a quarter cycle can likely exist; consequently, neither propagating eigenmode oscillates with a well-defined period (Fig. 7).
The two leading eigenmodes (bottom panels of Fig. 6) are stationary and have considerably longer EFTs than any other eigenmode. Unsurprisingly, these eigenmodes also display the greatest persistence. (In contrast, the amplitude of the most damped eigenmode, also in Fig. 7, changes sign almost yearly.) The leading eigenmode, u1, looks very similar to the 1900–2002 trend pattern (not shown), which also shows pronounced warming along the western and northern boundaries of the Pacific Ocean and in the Indian Ocean, with little amplitude in the ENSO and PDO regions. In fact, α1(t) is highly correlated (r = 0.86) with July–June global mean temperature determined from the monthly NASA Goddard Institute for Space Studies (GISS) Surface Temperature Analysis (GISTEMP) dataset (available online at http://data.giss.nasa.gov/gistemp/; Hansen et al. 1996, 1999, 2001; Reynolds et al. 2002; green line in Fig. 7); note that amplitude of both time series decreases from the mid-1940s to the mid-1960s. The tropical portion of u1 is also very similar to the least damped eigenmode from a LIM constructed of 3-month running mean tropical SSTs during 1950–2002. Penland and Matrosova (2006) suggest this eigenmode may represent tropical variability separate from canonical ENSO evolution that nevertheless impacts ENSO predictability, and found that its associated time series peaked in the mid-1980s. However, including the North Pacific results in a time series that continues to increase beyond this time (Fig. 7). The observed u1 trend is greater than almost 99% of the u1 trends in all 102-yr segments of the LIM model run.
The second eigenmode, u2, has strong amplitude in the North Pacific, with the maximum located west of the PDO maximum. The tropical amplitude is relatively weaker, although there is an equatorial maximum at the date line, off-equatorial maxima in the eastern Pacific, and again a relatively large Indian Ocean anomaly. Similar features have been found by previous decadal variability studies. For example, u2 resembles the “global signal” found by Livezey and Smith (1999) in a rotated canonical correlation analysis of SST and U.S. surface temperatures for the 1950–92 period. Livezey and Smith suggested that this pattern reflected a trend possibly related to global warming, and as it is quite similar to the post-1950 trend pattern, it might be expected to dominate their analysis. The tropical portion of u2 also resembles the pattern identified by Deser et al. (2004) as related to an interdecadal signal, or “epoch difference,” in the Aleutian low, although their North Pacific anomaly is farther east than in u2, perhaps because their epoch difference pattern also appears to include a contribution from u3. Their analysis of 1900–98 data suggested that their pattern did not represent a long-term trend but rather had undergone a few changes of sign in the twentieth century; a successor study (D’Arrigo et al. 2005) found other possible sign reversals of the pattern before 1900. Almost half of the 102-yr segments of the model run have larger u2 trends than observed.
The above results are qualitatively similar for different EOF truncations and different dataset periods. There are some quantitative changes in the eigenvalues, however. For example, removing different decades from the dataset (as in the cross validation) makes the EFT for the leading (second) eigenmode vary between 9 and 13 (4 and 7) yr. Likewise, periods of propagating eigenmodes vary, between about 20 and 40 yr for u3 and between about 4.5 and 7 yr for u4. Using the Kaplan dataset instead produces eigenvalues that also fall in these ranges, although all eigenmodes are more damped than their HadISST counterparts. Common to all cases, however, are two leading stationary eigenmodes plus an overdamped decadal ENSO eigenmode clearly distinct from an interannual ENSO eigenmode, all with the spatial characteristics described above.
Stratifying forecast skill by eigenmode rather than by EOF clarifies the source of predictability. Results for forecast leads of 1 and 2 yr are shown in Fig. 8, compared to predicted skill determined from the LIM using (6). Now apparent random variation of skill by PC (Fig. 3) is replaced by a strong relationship between skill and EFT. For forecast leads beyond 1 yr, useful skill resides primarily in the two leading eigenmodes, also clearly reflected in the forecast skill maps (Fig. 2). In fact, skill for u2 is above 0.5 for forecast leads up to 6 yr. Note that little long-range skill is associated with u3, despite its long period, because its EFT is so short.
In eigenmode space, (2) becomes a set of uncoupled univariate equations for each eigenmode projection time series,
where ξ̂j is white noise in eigenmode space. Another test of the LIM, and of the eigenmodes themselves, is to determine whether ξ̂j truly represents white noise. We find that this is indeed the case: the ξ̂j, calculated as residuals from (10), are normally distributed and almost uncorrelated from year to year. As an immediate consequence, each time series αi(t) can be considered to be (real or complex) univariate red noise.
d. Reconstructing the PDO as a sum of univariate red noises
No single eigenmode can be identified as “the” PDO. That is, no individual time series αj(t) correlates with the PDO at a value higher than 0.47. Yet the LIM reproduces the entire PDO spectrum, suggesting that the PDO represents not a single mode but rather a superposition of different modes acting on different spatial and temporal scales. This is illustrated in Fig. 9 by comparing the observed PDO time series P(t) to a reconstruction from the sum of just three eigenmodes, P̃(t) = [u2α2(t) + u3α3(t) + u4α4(t)]i, where i is the index corresponding to the PDO PC. The correlation of P̃ and P is 0.8; the difference d = P − P̃ is mostly noise (the decorrelation time scale of d is less than 1 yr) that projects on the most strongly damped eigenmodes. Interestingly, including u2, which has large amplitude in the PDO region, is important even though the correlation of α2 and the PDO is 0.2, because without it P̃ is correlated with P at only 0.65. Additionally, reconstruction using just u2 and u3 gives a correlation of 0.83 when both P̃ and P are filtered with 5-yr running means.
Figure 9 also notes years commonly associated with “regime shifts” in the PDO (e.g., Hare and Mantua 2000; Deser et al. 2004): P̃ clearly reproduces these shifts, but here they represent not nonlinear state changes but rather random changes due to the summation of these univariate red noises. Also, these shifts arise from neither constructive interference nor phase locking of multidecadal oscillations of differing periods (e.g., Minobe 1999). Since the eigenmodes that contribute to the PDO are nonorthogonal, a white noise “kick” can perturb all three modes to some degree. On the other hand, the regime shifts do not seem to correspond to consistent changes of the three eigenmodes. For example, while the shift in the early 1940s occurs as u2 begins a long period of negative values (after a period of weakly positive to neutral values), the mid-1970s shift occurs about a decade after u2 changed sign. Some authors have also suggested a shift occurred in 1998, but as of mid-2002 the PDO has had mostly positive values (not shown). This possibly transitory excursion of negative PDO values is consistent with the strong negative excursion of u3, which due to its very short EFT is unlikely to be long lived.
5. Impact of tropical–North Pacific coupling
As noted in section 3, (8) allows diagnosis of the impact of tropical–North Pacific coupling upon SST variability. This is done by first repeating the model run of section 4b, executing additional 100 000-yr runs in which either 1) the Tropics and North Pacific are uncoupled (𝗟TN = 𝗟NT = 0), and noise in the Tropics and North Pacific is uncorrelated (〈ξNξTT〉 = 0) (“fully uncoupled”); 2) the Tropics and North Pacific are uncoupled (𝗟TN = 𝗟NT = 0), but the original correlated noise eigenmodes are used; that is, noise fluctuates with coherent tropical–North Pacific patterns (“partially uncoupled”); 3) the Tropics do not force the North Pacific (𝗟TN = 0) and 〈ξNξTT〉 = 0); or 4) the North Pacific does not force the Tropics (𝗟NT = 0 and 〈ξNξTT〉 = 0). Runs 3 and 4 are repeated with the tropical–North Pacific correlated noise included. The fully uncoupled model (green lines in Figs. 4 and 5) results in two independent dynamical systems:
Note that 𝗟NN is different than the linear operator (i.e., L̂NN) that would be obtained from the LIM of just xN alone, since the latter would involve actual evolution of North Pacific SST, which includes tropical forcing. That is, we are trying to remove the effects of xN upon xT and vice versa so that (11) represents only internal North Pacific dynamics and (12) represents only internal tropical dynamics. As an important caveat, “coupling” here refers to the relationship of xN with xT, without differentiating either which parts of the state vectors are most important or whether the connections are through the atmosphere and/or through the ocean.
PDO–ENSO correlations for each run, for both annual means and 5-yr running means, are compared with correlations from the full LIM and from observations in Fig. 10a. Coherence between the PDO and ENSO (Fig. 10b) shows no interannual band PDO–ENSO relationship when the Tropics do not force the North Pacific. Conversely, there is no predictable effect of the North Pacific upon the Tropics on interannual time scales, although there is an unpredictable effect that occurs when noises in the Tropics and North Pacific are spatially correlated, a result consistent with seasonal footprinting (Vimont et al. 2001, 2003; Seager et al. 2004). On decadal time scales, the PDO–ENSO relationship is dependent on both tropical and North Pacific forcing.
How coupling affects 𝗟 can be understood through eigenanalysis of the uncoupled dynamical system (11) and (12). Eigenanalysis of 𝗟NN results in a pair of propagating eigenmodes with a 12-yr period and 5-yr EFT, and one strongly damped (0.9-yr EFT) stationary eigenmode. The propagating eigenmode is consistent with behavior found in many models of North Pacific variability: an initial SST response to anomalous surface forcing in the central North Pacific (cf. Fig. 11a) is followed by a remote response in the KOE region (cf. Fig. 11b) due to westward propagating (cf. bottom panels in Fig. 11) oceanic Rossby waves, with subsurface temperature anomalies brought to the surface in the KOE by mixing and/or gyre adjustment (e.g., Qiu 2000; Xie et al. 2000; Seager et al. 2001; Schneider et al. 2002). The EFT is only long enough to follow a Rossby wave to the KOE, consistent with earlier studies (Seager et al. 2001; Schneider and Miller 2001) suggesting that knowledge of wind stress forcing over the North Pacific could result in predictability of a few years in the western North Pacific.
Furthermore, the North Pacific power spectra resulting from the fully uncoupled run (Fig. 5) show much more North Pacific decadal variability than for either the full LIM or observations, with the peak centered slightly less than the uncoupled eigenmode period because of the shorter EFT. Thus, while 𝗟NN is consistent with previous theoretical and simple modeling studies of the North Pacific, the full dynamical operator is needed to reproduce observations; that is, consideration of internal North Pacific dynamics without tropical interactions appears insufficient to understand North Pacific variability.
While the leading tropical PC is only weakly affected by the North Pacific, coupling enhances decadal variability of the remaining tropical PCs (Fig. 4). This is consistent with the eigenmodes of 𝗟TT (not shown): the leading propagating eigenmode looks very similar to the tropical portion of u4 (Figs. 6c,d) with only a minor increase in period and EFT, and the other 𝗟TT eigenmodes also have counterparts in the eigenmodes of the full operator 𝗟 but are relatively more damped.
The leading coupled eigenmode u1 has no clear counterpart in the uncoupled eigenmodes, has a considerably longer EFT than any eigenmode of either 𝗟TT or 𝗟NN, and has a tropical pattern that looks nothing like any eigenmode of 𝗟TT. Similarly, while tropical and North Pacific counterparts to u2 exist in the uncoupled eigenmodes, their EFTs are shorter, especially for the North Pacific. Furthermore, maxima of multidecadal variance for both the Tropics (PC2) and North Pacific (PC2 and PC3) do not exist for the uncoupled system. This suggests that these eigenmodes and spectral peaks reflect a fundamentally coupled process.
6. Comparison to coupled GCMs
How well the dynamical system of observed Pacific SST described in this paper is replicated in coupled atmosphere–ocean GCMs is investigated for output from the World Climate Research Programme’s (WCRP’s) Coupled Model Intercomparison Project phase 3 (CMIP-3) multimodel dataset used for the current Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4). (The models, along with the lengths of the control runs used, are listed in Table 2; further details are available online at http://www-pcmdi.llnl.gov/projects/cmip/index.php.) As with the HadISST data, the GCM datasets were averaged, and/or interpolated where necessary, to the same grid and July–June annual means. Anomalies were then determined by removing the model climatology. For consistent comparison and because it is the real system these sophisticated models are trying to simulate, the model data were projected onto the observed EOFs to produce PC time series in each model. Earlier analyses of CMIP model data (AchtuaRao and Sperber 2002, 2006) that tended to claim somewhat better results for ENSO simulation were focused on an index of Niño-3, a region in which model simulations are perhaps least erroneous (not shown).
It is occasionally suggested that GCMs may appear different than the observed system because they represent different realizations of the climate state, not necessarily different physics. If we interpret the LIM confidence intervals as showing the possible range of realizations, however, then the spectra of all of the models (not shown) are statistically significantly different from the observed system. Differences between GCM and observed variability are also readily apparent by comparing the variance of tropical (Fig. 12) and North Pacific (Fig. 13) PCs, for both annual and 5-yr running means. Many of the models have an ENSO spectral peak whose period is too short, a well-known coupled GCM problem; not surprisingly, these models also have relatively too much of their variance on interannual time scales. In fact, virtually all the GCMs have tropical decadal variability that is less than observed, particularly for PCs 2 and 3. On the other hand, there is no apparent systematic error in the variance of the North Pacific PCs; most of the GCMs lie outside the LIM confidence intervals on annual time scales but are within the intervals on decadal time scales. Perhaps most important, all the GCMs consistently underestimate the PDO–ENSO correlation on both annual and decadal time scales, as is shown in Fig. 14; notably, for annual means none of the models are within the LIM 95% confidence interval.
The leading eigenmodes likewise are inadequately simulated. As an example, the GCM data are projected onto these eigenmodes. Decorrelation time scales, the time the autocorrelation function has decreased to 1/e, are then determined for the resulting time series; note that for the propagating modes, the decorrelation includes the effect of propagation as well. These time scales are compared with the observed decorrelation time scales in Fig. 15. The leading eigenmodes show much less persistence in the GCMs than is observed. This is true even if the observed time series are detrended, which shortens the decorrelation time scale of u1 to about 3 yr while leaving the other decorrelation time scales unchanged.
This combination of underestimating both tropical decadal variability and ENSO–PDO correlation yet having more reasonable amplitudes of North Pacific decadal variability, along with low persistence of the leading eigenmodes, suggests that North Pacific SSTs in these GCMs may be too independent of the Tropics.
7. Concluding remarks
A multivariate red noise model, empirically constructed from a 1-yr lag, can explain the observed power spectra of annual mean tropical and North Pacific SST on both interannual and interdecadal time scales. Spectral peaks on decadal time scales that may represent slow physical oceanic processes also correspond to strongly damped propagating eigenmodes, so predictability on these time scales is low (Scott 2003 makes a related point). Forcing of the North Pacific by the Tropics, and also forcing of the Tropics by the North Pacific, is a critical element of Pacific decadal variability and predictability but appears poorly represented in coupled GCMs. The range of these GCM simulations appears outside what might be expected from a series of “real world” realizations despite the fact that substantial variations in both the power spectra of, and the correlations between, the Tropics and North Pacific are possible even if the dynamics of annual-mean SST are fixed and multilinear. Thus, previous studies employing coupled GCMs likely underestimated the tropical contribution to North Pacific decadal variability and predictability.
Virtually all long-range LIM skill comes from two stationary eigenmodes that have the longest e-folding times. Anomaly correlation skill of both eigenmodes is greater than 0.5 for up to 6 yr. The time series of the leading eigenmode has a 100-yr trend that is unlikely, but not impossible, to have occurred by chance in the multivariate linear system. The second eigenmode might represent natural decadal variability; it has a pattern somewhat similar to the multidecadal signal found by Deser et al. (2004; see also D’Arrigo et al. 2005), although their result also appears to include a contribution from the less persistent “decadal ENSO” eigenmode. Note, however, that the second eigenmode does not propagate with a multidecadal period, but instead has a sufficiently long EFT that it varies on a decadal time scale.2 Both modes appear to have persistence and thus predictability significantly enhanced by tropical–North Pacific coupling, and are poorly simulated by the coupled GCM control runs examined here. Whether one or both eigenmodes represent anthropogenic forcing (as suggested by Livezey and Smith 1999), other external radiative forcing, natural decadal variability that these GCMs cannot capture, or some combination of the three, remains undetermined by our analysis. Certainly, the time series of the leading eigenmode has a strong correspondence to the time series of the global mean temperature, whose trend is often associated with anthropogenic effects. Of course, if the two least damped eigenmodes include externally forced variability, they should not be fully simulated by the IPCC control runs; definitive assessment awaits repetition of this analysis upon coupled GCMs forced with historical changes in radiative forcing. These eigenmodes might also be affected by nonstationarity artificially introduced by the SST reconstruction technique applied to differing amounts and quality of input data, but it is encouraging that the HadISST bias correction has a different spatial pattern than either eigenmode (Folland and Parker 1995), that in the Tropics the leading eigenmode has a different pattern than that of observation locations (Penland and Matrosova 2006), and that both eigenmodes also result from LIMs of post-1950 SST datasets, either annual means (not shown) or 3-month running means (Penland and Matrosova 2006).
Much of the interest in the PDO has come from its apparent relationship with regime shifts throughout the climate system and possibly in the marine ecosystem as well. There has been related interest in whether the PDO might represent a “long-memory” process or 1/f noise, with much more regimelike behavior than exists in univariate red noise (Percival et al. 2001; Rudnick and Davis 2003; Overland et al. 2006; see also Fraedrich et al. 2004). Such behavior might be characteristic, for example, of the slow manifold of a nonlinear dynamical system. Also, nonlinear models have been proposed to account for the decadal variability of ENSO (e.g., Tziperman et al. 1995; Timmermann and Jin 2002; Karspeck et al. 2004), and it might be expected that, at a minimum, regime shifts in the Tropics would produce regime shifts in the North Pacific (Seager et al. 2004). However, it is well known that long-memory behavior can also exist in processes that are the sum of independent red noises (Beran 1994). Moreover, a multivariate linear system representing advection and diffusion, driven by Gaussian white noise, can produce 1/f noise (Milotti 1995). Our results suggest that, on the annual time scale, the PDO represents not a single phenomenon but rather the sum of several different basin-scale processes (see also Livezey and Smith 1999; Schneider and Cornuelle 2005). Each of these processes is represented by different spatial patterns whose temporal evolution can be separately modeled as red noise with differing autocorrelation time scales. Then, apparent PDO “regime shifts” are best understood not as sudden, yet potentially predictable, nonlinear changes between relatively stable climate states, but rather as random occurrences in a multivariate linear system driven by noise. Importantly, in this view regime shifts are not predictable beyond the time scale of the most rapidly decorrelating red noise, less than 2 yr. On the other hand, once a regime shift has occurred, the expected duration of the regime may be predicted, depending upon the relative contributions of different eigenmodes. Based on this model, it appears unlikely that 1998 marked the beginning of a long period of negative PDO values.
This dynamical model of annual-mean SST may constrain possible physical models of coupled atmosphere–ocean Pacific decadal variability. For example, predictable effects of the North Pacific upon the Tropics are associated with only stationary modes, as opposed to propagating modes where, say, North Pacific SST leads tropical SST. This result seems more consistent with atmospheric rather than oceanic teleconnections between the North Pacific and the Tropics since it suggests a rapid (less than a year) connection between the two regions. Also, a theory that produces fairly regular cycling of anomalies on decadal time scales could be inconsistent with the short EFT of u3. To the extent that coupling appears important on decadal time scales, a physical model that considers just the North Pacific in isolation, or just the Tropics, will probably not explain observed variability. As a caveat, note that the LIM does not imply that realistic physical models must be actually linear, but rather be effectively linear, that is, with little nonlinear predictability on annual time scales. Ultimately, in the LIM framework further questions about important physical processes, as well as gauging the impact of predictable nonlinearities on seasonal time scales, will require explicitly including both atmospheric and subsurface ocean data in the state vector, using shorter time averaging and including seasonality, issues that are being considered in our current research. Surface heat fluxes may include state-dependent, or multiplicative, noise (Sura et al. 2005, 2006), which could also affect our results.
Finally, decadal SST predictability does not necessarily equal decadal atmospheric predictability. Previous coupled GCM studies found some decadal predictability over ocean but also considerably lower predictability over land (e.g., Grotzner et al. 1999; Pohlmann et al. 2004). Similarly, the fraction of variance on decadal time scales is much higher for SST than for surface pressure over the North Pacific (NCA). In fact, decadal signals in Pacific SST may be best used to forecast climate integrators, variables such as soil moisture and snowpack that tend to integrate anomalous climate forcing, which may have a stronger relationship to North Pacific SST than instantaneous atmospheric variables such as surface temperature (NCA). Alternatively, since operational seasonal North American temperature forecasts have some skill using the previous 10-year mean (Huang et al. 1994), small additional skill may be gained by instead using a bivariate predictor based on the leading two eigenmodes of 𝗟.
Conversations with many coworkers at the lab formerly known as CDC greatly advanced the development of this study, particularly Mike Alexander, Gil Compo, Cécile Penland, Prashant Sardeshmukh, Amy Solomon, and Cathy Smith. Comments by Rob Scott and Richard Seager improved the manuscript. Jamie Scott assisted with obtaining the IPCC model dataset. We acknowledge the modeling groups for making their simulations available for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting and archiving the CMIP-3 model output, and the WCRP’s Working Group on Coupled Modelling (WGCM) for organizing the model data analysis activity. The WCRP CMIP-3 multimodel dataset is supported by the Office of Science, U.S. Department of Energy. This work was partially supported by a grant from NOAA CLIVAR/Pacific.
Corresponding author address: Dr. Matthew Newman, Physical Sciences Division, NOAA/Earth System Research Laboratory, 325 Broadway, R/PSD1, Boulder, CO 80305-3328. Email: email@example.com
The ideal name for this mode would then of course be the Pacific multidecadal fluctuation (PMF).